Architecture: Data Lake - davidkhala/gcp-collection GitHub Wiki
Lifecycle
1. Ingest
Batch
- Storage Transfer Service
- BigQuery Data Transfer Service
- Transfer Appliance
Streaming
- Pub/Sub
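As a rough illustration of the streaming path, the sketch below publishes one JSON event to a Pub/Sub topic with the `google-cloud-pubsub` Python client. It assumes the library is installed, application default credentials are configured, and the topic already exists. The helper names (`encode_event`, `publish_event`) and the `source` attribute are hypothetical, chosen only for this example.

```python
import json


def encode_event(event: dict) -> bytes:
    """Serialize an event to UTF-8 JSON bytes (Pub/Sub payloads are bytes)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def publish_event(project_id: str, topic_id: str, event: dict) -> str:
    """Publish one event and return the server-assigned message ID."""
    # Imported lazily so encode_event stays usable without the client library.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    # Extra keyword arguments are sent as string message attributes.
    # "source" is an illustrative attribute name, not a required one.
    future = publisher.publish(topic_path, data=encode_event(event), source="example")
    # Block until the service acknowledges the publish.
    return future.result()
```

A call might look like `publish_event("my-project", "raw-events", {"sensor": "t1", "value": 21.5})`, where both the project and topic IDs are placeholders. Downstream, a Dataflow job or a subscription would typically land these messages in the Store stage.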
2. Store
Storage decision tree

3. Process and Analyze
Data cleansing and normalization
- Cloud Dataprep Data Harvest
- Dataplex ETL
- Dataflow and Cloud Data Fusion for data absorption
Warehouse
- Dataproc and BigQuery
4. Explore and Visualize
- Datalab, superseded by Vertex AI Workbench
- Looker & Looker Studio
- Data Catalog in Dataplex
  - predicate-based search experience for metadata associated with a data entry
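
To make "predicate-based search" concrete, here is a toy in-memory model of the idea. It is not the Data Catalog API and covers only a small, simplified subset of query forms: `key=value` for an exact match, `key:value` for a substring match, and a bare word matched against the entry name. The `Entry` class, its fields, and the sample catalog are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical metadata record for one data entry."""
    name: str
    type: str
    system: str
    columns: list = field(default_factory=list)


def parse_query(query: str):
    """Split a query into (key, operator, value) predicates."""
    predicates = []
    for token in query.split():
        # Use whichever of "=" or ":" appears first in the token.
        positions = [(token.find(op), op) for op in ("=", ":") if op in token]
        if positions:
            _, op = min(positions)
            key, value = token.split(op, 1)
        else:
            # A bare word is treated as a substring match on the name.
            key, op, value = "name", ":", token
        predicates.append((key.lower(), op, value.lower()))
    return predicates


def matches(entry: Entry, predicate) -> bool:
    """Check one predicate against one entry (case-insensitive)."""
    key, op, value = predicate
    if key == "column":
        candidates = [c.lower() for c in entry.columns]
    elif key in ("name", "type", "system"):
        candidates = [getattr(entry, key).lower()]
    else:
        return False  # unknown keys match nothing in this sketch
    if op == "=":
        return value in candidates
    return any(value in c for c in candidates)


def search(entries, query: str):
    """Return entries that satisfy every predicate in the query."""
    predicates = parse_query(query)
    return [e for e in entries if all(matches(e, p) for p in predicates)]


# Sample catalog; all names are made up for illustration.
catalog = [
    Entry("orders", "table", "bigquery", ["order_id", "customer_id"]),
    Entry("clickstream", "data_stream", "cloud_pubsub"),
    Entry("customers", "table", "bigquery", ["customer_id", "email"]),
]

tables_with_customer = search(catalog, "type=table column:customer")
```

Here `tables_with_customer` holds the `orders` and `customers` entries, since both are tables with a column containing "customer". Predicates are combined with a logical AND, which keeps the sketch short; a real catalog search would also support OR, negation, and tag-based predicates.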