Additive, non-additive and semi additive facts in data warehouse fact table

In a data warehouse fact table, fact can be classified as additive, semi additive or non-additive based on their characteristics. Additive facts are measures that can be aggregated by ALL or ANY of the fact table dimensions, on the other hand non-additive facts are measures that cannot be aggregated over any dimensions, or it may be inappropriate to do so. Semi additive facts are bit complex, it only provides meaningful information when aggregated over some dimensions and require different aggregation types such as AVG, MAX, MIN for other dimensions.

The following example of a website stats table and dimensions tables is used to explain the additive and non-additive facts in detail.  This table contains metrics such as pageviews and unique visitors to the website and sliced with dimensions such as date, geo location and web page of the site.


Additive:
The aggregation of pageviews provides meaningful information when aggregated across any dimensions, making it considered an additive fact. For example, it can be aggregated by GeoId, which will give insights about site performance by cities/countries, etc. Similarly, it can be aggregated by PageId, providing insights about performance of pages within the website. In any combination, it can be aggregated and still provide meaningful information.

Non additive:
However, attempting to aggregate the “Unique User” metric would be inappropriate, as the same person who viewed “PageId 100” might also have visited “PageId 200”. Aggregating this metric could lead to inaccurate results as there is chance for double counting. Therefore, the ‘Unique User’ metric is classified as non-additive.

Semi additive:

The population of each country is loaded once every year into the fact table, as shown in the above picture. The population cannot be summed across countries. Adding the population by country, for example, Singapore will result in 10,010,000 (5,000,000 + 5,010,000), which is incorrect. However, it can be added across time periods to get the total population for a specific year.

Consider another example of a fact table that periodically records employee head count for each department (Refer to the below picture for sample data). Aggregating employee headcount by department may misleading, for example adding headcount for IT department would produce 850 which is incorrect. on the other hand, headcount in Q2 of 2018 would be 730 which is meaningful insight.

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: