So what's an activity schema anyway?
The fundamental idea behind the activity schema is that a single table can model any data required for analysis.
This means all data is in the same, consistent structure - a series of events (called activities) done by an entity (called a customer) over time.
Any source data in a data warehouse can be easily modeled into this format with some simple SQL. Because the format is consistent, it requires a lot less upfront work and though than a star schema approach.
How is an activity schema better than a star schema?
The special structure has a few very cool attributes
- no need for foreign keys or explicit joins - any model can be joined to any other using relationships in time
- automated analyses - a consistent structure allows software tools to automatically generate queries and analyses that can work on anyone's data
The diagram below shows how an activity schema is used: one table, queried using relationships in time, powers all the queries you need to build out your metrics, dashboards, etc.
The overall claim is that adopting an activity schema in your data warehouse makes data modeling and analysis substantially faster compared to dimensional modeling with a star schema.
The entire topic is somewhat complex, but Ahmed's talk does a great job of diving into how the activity schema came about and how it can be used. He also shows that it can actually answer any possible data question.
Check out the full video for yourself!