stream replication peerdb fivetran airbyte pgstream
Modern cloud data warehouses, such as BigQuery, can handle and transform vast amounts of data using just SQL. This capability has led many teams to switch from traditional extract, transform, load (ETL) to extract, load, transform (ELT), where data is first extracted, then loaded, and finally transformed inside the warehouse itself. This shift means the data warehouse, which is optimized for such workloads, executes the transformations instead of specialized ETL tools.
These tools focus on connecting to various data sources and extracting data without meddling with its structure; the transformation responsibility is handed over to the data warehouse. Tools such as dbt, Dataform, and SQLMesh provide frameworks to help organize and execute those transformations, but the heavy lifting – the actual data processing – is done by the warehouse itself. A sketch of the pattern is shown below.
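A minimal sketch of the ELT pattern using the BigQuery Python client: raw data is loaded as-is into a staging table, and the transformation is a plain SQL statement executed by the warehouse (tools like dbt essentially organize and template this second step). The project, dataset, table names, and GCS path are hypothetical placeholders.

```python
# ELT sketch: load raw rows untouched, then let the warehouse run the SQL transform.
from google.cloud import bigquery

client = bigquery.Client()

# 1. Extract + Load: copy raw CSV files into a staging table without reshaping them.
load_job = client.load_table_from_uri(
    "gs://example-bucket/raw/orders-*.csv",     # hypothetical source files
    "my_project.staging.raw_orders",            # hypothetical staging table
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        autodetect=True,        # let BigQuery infer the raw schema
        skip_leading_rows=1,
    ),
)
load_job.result()  # wait for the load to finish

# 2. Transform: the warehouse itself executes the SQL (the "T" in ELT).
transform_sql = """
CREATE OR REPLACE TABLE my_project.analytics.daily_revenue AS
SELECT DATE(order_ts) AS order_date,
       SUM(amount)    AS revenue
FROM my_project.staging.raw_orders
GROUP BY order_date
"""
client.query(transform_sql).result()
```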
pgcapture
cloudquery
pgstream - PostgreSQL replication with DDL changes
- https://github.com/xataio/pgstream
- https://xata.io/blog/pgstream-postgres-replication-schema-changes
Replicating Postgres data and schema changes to an Elasticsearch-compatible store, with special handling of field IDs to minimise re-indexing caused by column renames.
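For context, a minimal sketch of the mechanism CDC tools like pgstream build on: consuming a Postgres logical replication stream. This is not pgstream's own API – it uses psycopg2 with the wal2json output plugin, and the DSN and slot name are hypothetical placeholders. Note that plain logical replication does not emit DDL; pgstream's schema-change support is layered on top of this.

```python
import psycopg2
from psycopg2.extras import LogicalReplicationConnection

# Connect with a replication-capable connection (hypothetical DSN).
conn = psycopg2.connect(
    "dbname=appdb user=replicator",
    connection_factory=LogicalReplicationConnection,
)
cur = conn.cursor()

# Create a slot once; wal2json emits each change as a JSON document.
cur.create_replication_slot("demo_slot", output_plugin="wal2json")

cur.start_replication(slot_name="demo_slot", decode=True)

def consume(msg):
    # Each payload describes INSERT/UPDATE/DELETE changes as JSON;
    # a real pipeline would transform it and ship it to the target store.
    print(msg.payload)
    # Acknowledge so Postgres can recycle WAL up to this point.
    msg.cursor.send_feedback(flush_lsn=msg.data_start)

cur.consume_stream(consume)   # blocks, invoking consume() for every change
```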
PeerDB
Airbyte
- https://airbyte.com/
- https://github.com/PacktPublishing/Fundamentals-of-Analytics-Engineering/blob/main/chapter_8/guides/setting_up_airbyte_cloud.md