Home - davidkhala/ETL GitHub Wiki
Welcome to the ETL wiki!
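Data science and data analytics attract the most attention, but in practice most of the time goes into data cleaning. In many projects, up to 90% of the time is spent understanding and restructuring the data. Some call this data engineering or data wrangling, but traditionally it is known as ETL.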
- Data ingestion/extract is about moving raw data from various sources into a central repository.
- Data loading involves taking the transformed or processed data and loading it into the final storage destination for analysis and reporting.
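
To make the stages above concrete, here is a minimal sketch of an extract, transform, and load pass using only the Python standard library. Everything in it is an assumption for illustration: the `RAW_CSV` sample, the `payments` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are hypothetical and do not come from this repo. An in-memory SQLite database stands in for the final storage destination.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real source could be a file, an API, or another database.
RAW_CSV = """id,name,amount
1, Alice ,10.5
2,Bob,
3,  carol,7
"""


def extract(raw_text):
    """Extract: read raw rows from the source without changing them."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim whitespace, normalize names, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records instead of loading bad data
        cleaned.append((int(row["id"]), row["name"].strip().title(), float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw_text, conn):
    """Run the three stages in order and return what ended up in the table."""
    load(transform(extract(raw_text)), conn)
    return conn.execute("SELECT id, name, amount FROM payments ORDER BY id").fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
```

Keeping each stage as its own function means the cleaning rules can be tested without touching the source or the destination. A real pipeline would likely add logging, schema validation, and incremental loading.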
Any data migration endeavor is incomplete without also moving the tooling that processes, transforms and loads the data into the warehouse.
Refs
Related repos