Build - davidkhala/data-warehouse GitHub Wiki
General data warehouse load task
- Ingest the new data to be loaded into a data lake, applying pre-load cleansing or transformations as required.
- Load the data from files into staging tables in the relational data warehouse.
- Load the dimension tables from the dimension data in the staging tables, updating existing rows or inserting new rows and generating surrogate key values as necessary.
- Load the fact tables from the fact data in the staging tables, looking up the appropriate surrogate keys for related dimensions.
- Perform post-load optimization by updating indexes and table distribution statistics.
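
The steps above can be sketched end to end as follows. This is a minimal illustration only, assuming an in-memory SQLite database as a stand-in for the relational data warehouse and a single customer dimension. The table names (`stg_sales`, `dim_customer`, `fact_sales`), the column names, and the helpers `cleanse` and `load_warehouse` are hypothetical and do not come from this wiki. A real warehouse would typically use its own bulk-load and statistics commands instead of `executemany` and `ANALYZE`.

```python
import sqlite3


def cleanse(raw_rows):
    """Pre-load cleansing: trim text, drop rows without a business key, cast amounts."""
    cleaned = []
    for customer_id, name, amount in raw_rows:
        customer_id = customer_id.strip()
        if not customer_id:
            continue  # a row without a business key cannot be matched to a dimension
        cleaned.append((customer_id, name.strip(), float(amount)))
    return cleaned


def load_warehouse(conn, rows):
    """Load cleansed (customer_id, customer_name, amount) rows via staging into a star schema."""
    cur = conn.cursor()
    # Hypothetical schema: one staging table, one dimension, one fact table.
    cur.executescript("""
        CREATE TABLE IF NOT EXISTS stg_sales (
            customer_id TEXT, customer_name TEXT, amount REAL);
        CREATE TABLE IF NOT EXISTS dim_customer (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            customer_id TEXT UNIQUE,                         -- business key
            customer_name TEXT);
        CREATE TABLE IF NOT EXISTS fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL);
        CREATE INDEX IF NOT EXISTS ix_fact_sales_customer
            ON fact_sales(customer_key);
        DELETE FROM stg_sales;  -- staging only holds the current batch
    """)

    # Load the batch into the staging table.
    cur.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)", rows)

    # Dimension load, part 1: overwrite attributes of rows that already exist.
    cur.execute("""
        UPDATE dim_customer
        SET customer_name = (SELECT MAX(s.customer_name) FROM stg_sales s
                             WHERE s.customer_id = dim_customer.customer_id)
        WHERE customer_id IN (SELECT customer_id FROM stg_sales)
    """)
    # Dimension load, part 2: insert new rows; the surrogate key is generated.
    cur.execute("""
        INSERT INTO dim_customer (customer_id, customer_name)
        SELECT s.customer_id, MAX(s.customer_name)
        FROM stg_sales s
        WHERE s.customer_id NOT IN (SELECT customer_id FROM dim_customer)
        GROUP BY s.customer_id
    """)

    # Fact load: look up the surrogate key through the business key.
    cur.execute("""
        INSERT INTO fact_sales (customer_key, amount)
        SELECT d.customer_key, s.amount
        FROM stg_sales s
        JOIN dim_customer d ON d.customer_id = s.customer_id
    """)

    # Post-load optimization: refresh the statistics used by the query planner.
    cur.execute("ANALYZE")
    conn.commit()
```

Calling `load_warehouse(conn, cleanse(batch))` once per batch reuses the surrogate keys of existing dimension rows, so facts from later batches attach to the same `customer_key`. The update step here overwrites attributes in place; keeping history would need a different dimension-loading strategy.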