Data lake - bobbae/gcp GitHub Wiki

A data lake is a centralized repository designed to store, process, and secure large amounts of structured, semistructured, and unstructured data. It can store data in its native format and process any variety of it, without regard to size limits.

https://cloud.google.com/learn/what-is-a-data-lake

https://www.guru99.com/data-lake-architecture.html
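
As a rough illustration of "store data in its native format", the sketch below writes a structured CSV file, a semistructured JSON document, and an unstructured binary blob into one directory tree without converting any of them. It is a minimal local sketch, not how any particular cloud product works: the `raw` zone name, the file names, and the helper functions `ingest`, `read_csv_rows`, and `build_demo_lake` are all hypothetical, and a temporary directory stands in for object storage.

```python
import csv
import json
import tempfile
from pathlib import Path


def ingest(lake_root: Path, zone: str, name: str, payload: bytes) -> Path:
    """Write the payload unchanged under <lake_root>/<zone>/<name>."""
    target = lake_root / zone / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)  # no parsing or conversion at write time
    return target


def read_csv_rows(path: Path) -> list[dict]:
    """Interpret a stored CSV file only when it is read."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def build_demo_lake() -> tuple[Path, list[Path]]:
    """Create a throwaway lake holding three differently shaped objects."""
    # Assumption: a temp directory stands in for a bucket or object store.
    lake_root = Path(tempfile.mkdtemp(prefix="demo_lake_"))
    event = json.dumps({"user": "a", "tags": ["x", "y"]}).encode()
    stored = [
        # Structured: tabular rows with a header.
        ingest(lake_root, "raw", "orders.csv", b"order_id,amount\n1,9.50\n2,20.00\n"),
        # Semistructured: nested JSON with no fixed schema.
        ingest(lake_root, "raw", "events.json", event),
        # Unstructured: opaque bytes (here, the first four bytes of a PNG header).
        ingest(lake_root, "raw", "logo.bin", bytes([0x89, 0x50, 0x4E, 0x47])),
    ]
    return lake_root, stored
```

Calling `build_demo_lake()` and then `read_csv_rows(lake_root / "raw" / "orders.csv")` parses the CSV on demand, while the JSON and binary objects stay byte-for-byte as they were ingested.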

BigLake

https://cloud.google.com/biglake

BigLake extends BigQuery to multi-cloud data lakes and open formats such as Parquet and ORC, with fine-grained security controls.

https://cloud.google.com/blog/products/data-analytics/unifying-data-lakes-and-data-warehouses-across-clouds-with-biglake

https://medium.com/google-cloud/gcp-biglake-introduction-570fb88be132

Apache Iceberg

https://cloud.google.com/blog/products/data-analytics/announcing-apache-iceberg-support-for-biglake/

Delta Lake

https://delta.io/

https://docs.databricks.com/delta/index.html

https://medium.com/google-cloud/delta-tables-with-dataproc-jupyter-and-bigquery-ea2509ca9e0f

Data Mining

https://www.guru99.com/data-mining-tutorial.html

DataStage

https://www.guru99.com/datastage-tutorial.html

Data reconciliation

https://www.guru99.com/what-is-data-reconciliation.html

Data lake modernization

https://cloud.google.com/solutions/data-lake

Data lake vs data warehouse

A data warehouse is a blend of technologies and components for the strategic use of data. It collects and manages data from varied sources to provide meaningful business insights, and it stores large amounts of information electronically in a form designed for query and analysis rather than transaction processing. Building one involves transforming data into information. A data lake, by contrast, keeps data in its native format.

https://www.guru99.com/data-lake-vs-data-warehouse.html
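
One common way to picture the difference is schema-on-write versus schema-on-read. The sketch below is only an illustration under stated assumptions: an in-memory SQLite database stands in for a warehouse, a list of raw JSON strings stands in for lake storage, and the `orders` table, the sample records, and the function names are all hypothetical.

```python
import json
import sqlite3

# Raw records as they might arrive; the second one lacks the optional "note" field.
RAW_EVENTS = [
    '{"order_id": 1, "amount": 9.5, "note": "gift"}',
    '{"order_id": 2, "amount": 20.0}',
]


def load_warehouse(raw_lines):
    """Schema-on-write: parse and shape records to a fixed table before storing."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
    )
    records = [json.loads(line) for line in raw_lines]
    # Fields outside the table schema (such as "note") are dropped at load time.
    conn.executemany(
        "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
        [(r["order_id"], r["amount"]) for r in records],
    )
    conn.commit()
    return conn


def total_from_warehouse(conn):
    """Query the already structured table."""
    return conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]


def total_from_lake(raw_lines):
    """Schema-on-read: raw lines stay as-is and are parsed only when queried."""
    return sum(json.loads(line)["amount"] for line in raw_lines)
```

Both `total_from_warehouse(load_warehouse(RAW_EVENTS))` and `total_from_lake(RAW_EVENTS)` compute the same total. The difference is where the parsing cost and the schema decision fall: at load time for the warehouse, at query time for the lake.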

Dataplex

Dataplex is an intelligent data fabric that provides unified analytics and data management across your data lakes, data warehouses, and data marts.

lakeFS

https://github.com/treeverse/lakeFS

Examples

Building Data Lake on GCP

https://tech.groww.in/building-a-data-lake-on-google-cloud-platform-98634fa3d66f

Building a Data lake using ELT

https://jomach.medium.com/how-we-build-a-cloud-data-lake-using-elt-instead-of-etl-c05c076001e0