data lake - bobbae/gcp GitHub Wiki
A data lake is a centralized repository designed to store, process, and secure large amounts of structured, semistructured, and unstructured data. It can store data in its native format and process any variety of it, regardless of size.
https://cloud.google.com/learn/what-is-a-data-lake
https://www.guru99.com/data-lake-architecture.html
https://www.guru99.com/data-lake-vs-data-warehouse.html
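To make the "native format" idea concrete, here is a minimal sketch of a raw landing zone, using a local temporary directory as a stand-in for object storage. The `land_raw` helper, the `raw/<source>/dt=<date>/` layout, and the sample files are all hypothetical and chosen only for illustration; a real lake on Google Cloud would more likely write to a Cloud Storage bucket.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def land_raw(lake_root: Path, source: str, filename: str, payload: bytes, day: date) -> Path:
    """Store payload unchanged under <lake_root>/raw/<source>/dt=<day>/<filename>."""
    target_dir = lake_root / "raw" / source / f"dt={day.isoformat()}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    # The bytes are written as-is: no parsing, no schema, no conversion.
    target.write_bytes(payload)
    return target


def demo() -> list:
    """Land one structured, one semistructured, and one unstructured file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        day = date(2024, 1, 15)  # arbitrary example ingest date
        land_raw(root, "orders", "orders.csv", b"id,amount\n1,9.50\n", day)
        event = json.dumps({"user": 1, "page": "/"}).encode()
        land_raw(root, "clicks", "events.json", event, day)
        land_raw(root, "support", "call.txt", b"Customer asked about refunds.", day)
        # Return the stored paths relative to the lake root, sorted for stable output.
        return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in demo():
        print(path)
```

The point of the sketch is that all three kinds of data sit side by side in their original formats, and any structure is applied later by whatever engine reads them.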
https://cloud.google.com/biglake
BigLake extends BigQuery to multi-cloud data lakes and open formats such as Parquet and ORC, with fine-grained security controls.
https://medium.com/google-cloud/gcp-biglake-introduction-570fb88be132
https://cloud.google.com/blog/products/data-analytics/announcing-apache-iceberg-support-for-biglake/
https://docs.databricks.com/delta/index.html
https://medium.com/google-cloud/delta-tables-with-dataproc-jupyter-and-bigquery-ea2509ca9e0f
https://www.guru99.com/data-mining-tutorial.html
https://www.guru99.com/datastage-tutorial.html
https://www.guru99.com/what-is-data-reconciliation.html
https://cloud.google.com/solutions/data-lake
A data warehouse is a blend of technologies and components for the strategic use of data. It collects and manages data from varied sources to provide meaningful business insights, and it stores large amounts of information electronically for query and analysis rather than for transaction processing. Data warehousing is the process of transforming data into information.
https://www.guru99.com/data-lake-vs-data-warehouse.html
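As a rough illustration of the warehouse description above, the following sketch collects records from two differently formatted sources, loads them into one typed table, and runs an analytical query. It uses Python's built-in `sqlite3` as a stand-in for a real warehouse such as BigQuery; the `sales` table, the sample data, and the function names are assumptions made for the example.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources delivering the same kind of record in different formats.
CSV_ORDERS = "region,amount\nEMEA,120.0\nAPAC,80.5\n"
JSON_ORDERS = '[{"region": "EMEA", "amount": 30.0}, {"region": "AMER", "amount": 55.25}]'


def load_warehouse() -> sqlite3.Connection:
    """Collect rows from both sources into a single table with a fixed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT NOT NULL, amount REAL NOT NULL)")
    rows = [(r["region"], float(r["amount"])) for r in csv.DictReader(io.StringIO(CSV_ORDERS))]
    rows += [(r["region"], float(r["amount"])) for r in json.loads(JSON_ORDERS)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def revenue_by_region(conn: sqlite3.Connection) -> list:
    """Analytical query: total sales amount per region."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for region, total in revenue_by_region(load_warehouse()):
        print(region, total)
```

Unlike the raw files in a data lake, the data here is shaped to a schema before it is stored, which is what makes the aggregate query straightforward.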
Dataplex is an intelligent data fabric that provides unified analytics and data management across your data lakes, data warehouses, and data marts.
https://github.com/treeverse/lakeFS
https://tech.groww.in/building-a-data-lake-on-google-cloud-platform-98634fa3d66f
https://jomach.medium.com/how-we-build-a-cloud-data-lake-using-elt-instead-of-etl-c05c076001e0