Dimensional Model - Deepankar-Singh-Pawar/LearningNotes GitHub Wiki
Dimensional Modeling: Types of Dimensions and Facts
Types of Dimensions
1. Conformed Dimensions
- Definition: Shared dimensions used across multiple fact tables or data marts, ensuring consistent analysis.
- Example: A
Customer
dimension shared between Sales
and Support
fact tables, with attributes like Customer ID
, Name
, and Region
.
- Implementation: Maintain the same structure and keys in all associated data marts.
2. Role-Playing Dimensions
- Definition: A single dimension table that plays multiple roles within a fact table.
- Example: A
Date
dimension used as Order Date
, Ship Date
, and Delivery Date
.
- Implementation: Use aliases or multiple joins in SQL to reference the same table with different roles.
3. Slowly Changing Dimensions (SCD)
- Definition: Handle changes to dimension attributes over time.
- Types:
- Type 1: Overwrite old data (e.g., correcting a typo in
Customer Address
).
- Type 2: Create a new row for each change with effective dates (e.g., tracking
Customer Address
history).
- Type 3: Add new columns to store both current and historical values (e.g.,
Current Region
and Previous Region
).
- Implementation: Choose SCD type based on the business requirement for historical tracking.
4. Junk Dimensions
- Definition: Combine unrelated low-cardinality attributes into a single dimension to simplify the model.
- Example: A
Flag Dimension
for attributes like IsPromotional
and IsReturned
.
- Implementation: Create a single table with all combinations of these attributes, reducing clutter in the fact table.
5. Degenerate Dimensions
- Definition: A dimension stored as an attribute within the fact table, not as a separate dimension table.
- Example:
Order Number
or Invoice ID
in a Sales
fact table.
- Implementation: Store as a column in the fact table since it doesn’t have additional descriptive attributes.
6. Hierarchical Dimensions
- Definition: Capture hierarchical relationships, either in a single table or as normalized tables.
- Example: A
Geography
dimension with Country → State → City
.
- Implementation: Include hierarchical levels as separate columns or use snowflake schema for normalization.
7. Mini Dimensions
- Definition: Handle rapidly changing attributes by splitting them into smaller dimensions.
- Example: A
Customer Demographics
mini-dimension for attributes like Age Group
and Income Level
.
- Implementation: Create a separate dimension table and use a surrogate key to link it to the fact table.
8. Shrunken Dimensions
- Definition: A subset of attributes from a larger dimension, often used for aggregated facts.
- Example: A
Time
dimension with only Month
and Year
levels for a monthly sales summary.
- Implementation: Create a smaller version of the original dimension to improve performance.
9. Inferred Dimensions
- Definition: Placeholder dimension rows created during ETL when complete data is unavailable.
- Example: A
Customer
dimension row with only a Customer ID
and default values for other attributes.
- Implementation: Update the inferred row later when full data is available.
10. Role-Constrained Dimensions
- Definition: A single dimension filtered for specific roles.
- Example: A
Product
dimension filtered for Current Products
and Discontinued Products
.
- Implementation: Apply filters or constraints in reporting tools or views.
Types of Facts
1. Additive Facts
- Definition: Can be summed across all dimensions.
- Example:
Sales Revenue
or Units Sold
.
- Implementation: Use for measures that aggregate easily across dimensions.
2. Semi-Additive Facts
- Definition: Can be summed across some dimensions but not all.
- Example:
Account Balance
(can be summed by accounts but not across time).
- Implementation: Use business rules to define aggregation behavior.
3. Non-Additive Facts
- Definition: Cannot be summed across any dimension.
- Example: Ratios like
Profit Margin
or percentages.
- Implementation: Calculate non-additive facts at query time using derived metrics.
4. Factless Fact Tables
- Definition: Contain no measures, only foreign keys to dimensions to track events or relationships.
- Example: A
Student Attendance
fact table to record which students attended which classes.
- Implementation: Use for tracking many-to-many relationships or events.
5. Cumulative Snapshot Facts
- Definition: Capture measures that accumulate over time.
- Example:
Total Revenue
or Total Inventory
.
- Implementation: Update the snapshot at regular intervals.
6. Transactional Facts
- Definition: Record detailed, granular metrics about individual transactions.
- Example:
Sales Transaction
capturing Price
, Quantity
, and Discount
.
- Implementation: Use for fine-grained analysis and operational reporting.
7. Periodic Snapshot Facts
- Definition: Record metrics at regular intervals to analyze trends.
- Example:
Daily Account Balances
.
- Implementation: ETL processes should create snapshots at consistent time intervals.
8. Derived Facts
- Definition: Facts calculated from existing measures.
- Example:
Profit Margin
calculated as (Revenue - Cost) / Revenue
.
- Implementation: Store derived facts in reporting tools or materialize them in views.
Summary Table
Dimensions
Dimension Type |
Example |
Use Case |
Conformed Dimensions |
Customer, Product |
Shared across data marts |
Role-Playing Dimensions |
Date (Order Date, Ship Date) |
Multiple roles in a fact table |
Slowly Changing Dimensions |
Customer Address |
Track attribute changes over time |
Junk Dimensions |
Flags (IsReturned, IsPromotional) |
Simplify low-cardinality attributes |
Degenerate Dimensions |
Order Number |
Operational identifiers |
Hierarchical Dimensions |
Geography (Country, State, City) |
Analyze hierarchical relationships |
Mini Dimensions |
Customer Demographics |
Rapidly changing attributes |
Shrunken Dimensions |
Time (Year, Month) |
Aggregated fact tables |
Inferred Dimensions |
Placeholder Customers |
Missing data in ETL |
Facts
Fact Type |
Example |
Use Case |
Additive Facts |
Sales Revenue |
Summable across dimensions |
Semi-Additive Facts |
Account Balance |
Summable across some dimensions |
Non-Additive Facts |
Profit Margin |
Ratios and percentages |
Factless Fact Tables |
Student Attendance |
Track events or relationships |
Cumulative Snapshot Facts |
Total Revenue |
Track metrics over time |
Transactional Facts |
Sales Transactions |
Granular transaction-level analysis |
Periodic Snapshot Facts |
Daily Account Balances |
Time-series trend analysis |
Derived Facts |
Profit Margin |
Calculated measures |
Would you like further examples or diagrams to visualize these concepts?