Slowly changing dimensions (SCD)
Dimension attributes change over time. SCD is how a warehouse handles that: Type 1 overwrites, Type 2 keeps history as versioned rows, Type 3 keeps the previous value.
A slowly changing dimension is a dimension table1 whose descriptive attributes change over time — a customer moves city, a product is recategorized, a salesperson changes region — and SCD is the set of patterns for how a warehouse records those changes. The question each pattern answers is simple but consequential: when an attribute changes, do you keep the old value, and if so, how?
🔗 Learn more — 1 Star schema and dimensional modeling
This matters because of how dimensions are used. A fact table records events (an order, a payment); dimension tables give them context (who, what, where). If a customer's region changes, the way you store that change decides whether last year's sales still roll up under the region they were actually made in — or get silently rewritten to the new one.
The three common types
- Type 1 — overwrite. Just update the attribute in place; no history. Yesterday's value is gone. Simple and cheap, correct when the old value has no analytical meaning (fixing a typo, a misspelled name). The danger is using it where history does matter — it quietly rewrites the past.
- Type 2 — add a new row. The workhorse. When an attribute changes, you insert a new version of the dimension row with the new value and mark validity: an effective-from / effective-to date pair, or a "current" flag, plus a surrogate key so facts can point at the exact version that was current when the event happened. This preserves full history — last year's sales stay attributed to the region they were really made in. The cost is that the dimension grows with every change, and queries must join on the version valid at the fact's date.
- Type 3 — keep the previous value. Add a column for the prior value (
previous_regionalongsidecurrent_region). Cheap, but remembers only one step of history. Rare; useful when you specifically want "current vs immediately-prior" and nothing deeper.
There are higher-numbered hybrids, but Types 1 and 2 cover the overwhelming majority of real cases.
Choosing one
The decision is entirely about whether history has analytical value:
- The old value is noise or a correction → Type 1.
- You must be able to report the world as it was at the time of each event → Type 2.
- You only ever need current-versus-previous → Type 3.
Type 2 is the default for anything where time-accurate reporting matters, which in a warehouse is most things — but it is not free. Every tracked attribute that changes spawns rows, dimensions balloon, and every query has to be written to pick the correct version. The common mistake is reflexively Type-2-ing every attribute; track history where it changes a number someone will ask about, and overwrite the rest. The right SCD type is a per-attribute decision, not a per-table one.