Data Warehouse Terminologies - sachit914/datawarehouse GitHub Wiki
Data Warehouse Terminologies
ETL Vs ELT
ETL
Extract Transform Load (ETL) Used in On premise BI / data warehouse Solution
ELT
Extract Transform Load (ETL) Used in On premise BI / data warehouse Solution
- We extract data and dump raw data into data Lake then using etl tool like glue and use spark for data transformation
Stagging Vs ODS
staging
- A landing zone, is an intermediate storage area used for data processing during the extract, transform and load (ETL) process.
- A landing zone, is an intermediate storage area used for data processing during the extract, transform and load (ETL) process.
ODS- Operational Data Source
- An Operational Data Store (ODS) is a central database that collects and stores the most up-to-date data from multiple operational (transactional) systems. It is primarily used for operational reporting and allows businesses to generate quick reports and perform analysis on real-time data without affecting the core systems.
- Real-Time Data: Unlike a data warehouse, which usually stores historical data, an ODS provides a snapshot of the most recent data. This means it is regularly updated to reflect the current state of operational systems, like customer transactions or orders.
- Separate from Core Systems
- Real-Time Data
OLTP VS OLAP
OLTP
- OLTP: Online transaction processing (OLTP) captures, stores, and processes data from transactions in real time.
OLAP
- OLAP: Online analytical processing (OLAP) uses complex queries to analyze aggregated historical data from OLTP systems.
- In OLAP data will be in multi deminesional model
Fact Vs Dimension
Fact
- Fact: Represents what you want to analyze.
- Fact table Contains Measures along with Foreign Key reference of Dimension Tables.
- Usually Transaction tables from OLTP System
Dimension Table
- Dimension: Describes by which criteria you want to analyze the facts.
- A dimension is a Business entity and have it's own attributes and this table is related to fact.
- Usually Reference / Lookup tables in OLTP System
Star Schema Vs Snow Flake Schema
star schema
Snow Flake Schema
Types of Dimensions
- Confirmed Dimension
- Role-Playing Dimension
- Junk Dimension
- Degenerate Dimension
- Slowly Changing Dimension