Facts tables play a very
dynamic role in the data warehouses or data marts because they consist of the
measurements, metrics or facts of a business process e.g., sales revenue by
month by product.
Example of Fact Table
In
the schema above, we have fact table FactSales that has a grain which
gives us a number of units sold by date, by store, by customer and by
product.
All
other tables such as DimDate, DimStore, DimCustomer and
DimProduct
are dimensions tables. This schema is known as star schema.
In
this fact table, each entry represents the sale of a specific product on a
specific day to a specific customer in accordance with a specific promotion
at a specific store. The business measurements captured are the value of the
sale, the cost to the store, and the quantity sold.
|
In a very simple way, we
can say that a fact table is nothing but might be contained business sales
events such as cash register transactions or the contributions and expenditures
of a nonprofit organization. Fact tables usually contain large numbers of rows,
sometimes in the hundreds of millions of records when they contain one or more
years of history for a large organization.
In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called "Factless fact tables", or "Junction tables". The "Factless fact tables" can for example be used for modeling many-to-many relationships or capture events
Important
Features of Fact Tables
- They placed at the center of a star schema or a snowflake schema surrounded by dimension tables. Where multiple fact tables are used, these are arranged as a fact constellation schema. A fact table typically has two types of columns: those that contain facts and those that are a foreign key to dimension tables.
- The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the content of the data warehouse and store different types of measures like additive, non-additive, and semi additive measures.
- Mostly fact tables contain numerical data (facts) that can be summarized to provide information about the history of the operation of the organization.
- Each fact table also includes a multipart index that contains as foreign keys the primary keys of related dimension tables, which contain the attributes of the fact records.
- Fact tables should not contain descriptive information or any data other than the numerical measurement fields and the index fields that relate the facts to corresponding entries in the dimension tables.
- Fact tables provide the (usually) additive measures that act as independent variables by which dimensional attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the most atomic level by which the facts may be defined.
- Additive measures allow summary information to be obtained by adding various quantities of the measure, such as the sales of a specific item at a group of stores for a particular time period. Non-additive measures such as inventory quantity-on-hand values can also be used in fact tables, but different summarization techniques must then be used.
Measure
types
Fact table can store
different types of measures such as given below-
- Additive Measure– As it name implied, additive measures are measures which can be added across all dimensions.
- Non-additive Measure– different from additive measures, non-additive measures are measures that cannot be added across all dimensions.
- Semi additive Measure– semi additive measures are measure that can be added across only some dimensions and not across other.
A fact table might contain
either detail level facts or facts that have been aggregated (fact tables that
contain aggregated facts are often instead called summary tables).
Special care must be taken
when handling ratios and percentage. One good design rule is to never store percentages or ratios in
fact tables but only calculate these in the data access tool. Thus only store
the numerator and denominator in the fact table, which then can be aggregated
and the aggregated stored values can then be used for calculating the ratio or
percentage in the data access tool.
Types
of fact tables
All fact tables are
categorized by four most basic measurement events:
Transactional
– Transactional fact
table is the most basic one that each grain associated with it indicated as
“one row per line in a transaction”, e.g., every line item appears on an
invoice. Transaction fact table stores data of the most detailed level
therefore it has high number of dimensions associated with.
Periodic
snapshots – A Periodic
snapshots fact table stores data that is a snapshot in a period of time. The
source data of periodic snapshots fact table is data from a transaction fact
table where you choose period to get the output. A periodic snapshot table is
dependent on the transactional table, as it needs the detailed data held in the
transactional fact table in order to deliver the chosen performance output.
Accumulating
snapshots – The accumulating
snapshots fact table describes activity of a business process that has clear
beginning and end. This type of fact table therefore has multiple date columns
to represent milestones in the process. A good example of accumulating
snapshots fact table is processing of a material. As steps towards handling the
material are finished, the corresponding record in the accumulating snapshots
fact table get updated.
Temporal
snapshots - By applying
Temporal Database theory and modelling techniques the Temporal Snapshot Fact
Table allows to have the equivalent of daily snapshots without really having
daily snapshots. It introduces the concept of Time Intervals into a fact table,
allowing to save a lot of space, optimizing performances while allowing the end
user to have the logical equivalent of the "picture of the moment" he
is interested in.
Aggregation
in Fact Tables
Aggregation is the process
of calculating summary data from detail records. It is often tempting to reduce
the size of fact tables by aggregating data into summary records when the fact
table is created. However, when data is summarized in the fact table, detailed
information is no longer directly available to the analyst. If detailed
information is needed, the detail rows that were summarized will have to be
identified and located, possibly in the source system that provided the data.
Fact table data should be maintained at the finest granularity possible.
Aggregating data in the fact table should only be done after considering the
consequences.
Designing
fact table steps
There are very simple four
steps to design a fact table described by Kimball:
Identify
business process to model
– The first step is to decide what business process to model by gathering and
understanding business needs and available data such as Identify a business
process for analysis (like sales)
Identify
measures of facts
– by declaring a grain means describing exactly what a fact table record
represents
Identify
dimensions for facts – once grain of fact table is stated clearly, it is time to
determine dimensions for the fact table.
List
the columns that describe each dimension (Identify facts) – identify carefully which facts will
appear in the fact table.
Please visit to learn more on -
- Collaboration of OLTP and OLAP systems.
- Major differences between OLTP and OLAP.
- Data Warehouse
- Data Warehouse - Multidimensional Cube
- Data Warehouse - Multidimensional Cube Types
- Data Warehouse - Architecture and Multidimensional Model.
- Data Warehouse - Dimension tables.
- Data Warehouse - Fact tables.
- Data Warehouse - Conceptual Modeling.
- Data Warehouse - Star schema.
- Data Warehouse - Snowflake schema.
- Data Warehouse - Fact constellations.
- Data Warehouse - OLAP Servers.
nice
ReplyDeleteA data warehouse is an enterprise strategy that aims to overcome the common problem of data silos, or isolated pockets of data, which are inaccessible to other parts of the enterprise and not well integrated.seo dashboard
ReplyDelete