cdc stream replication peerdb fivetan airbyte pgstream - ghdrako/doc_snipets GitHub Wiki
Modern cloud data warehouses, such as BigQuery, can handle and transform vast amounts of data using just SQL. This capability has led many to switch from traditional extract, transform, load (ETL) to extract, load, transform (ELT), where data is first extracted, then loaded, and finally transformed within the warehouse itself. This change means the data warehouse, which is optimized for such tasks, executes the transformation workload instead of specialized ETL tools.
ELT tools focus on connecting to various data sources and extracting data without altering its structure. The transformation responsibility is handed over to the data warehouse. To assist with this, tools such as dbt, Dataform, and SQLMesh offer frameworks that help organize and execute data transformations. However, the heavy lifting (the actual data processing) is done by the data warehouse itself.
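The ELT flow described above can be sketched in a few lines. This is a minimal illustration only: an in-memory SQLite database stands in for a warehouse such as BigQuery, and the table names, sample rows, and function names (`load`, `transform`, `run_elt`) are assumptions made for the example.

```python
import sqlite3

# Hypothetical raw rows, as an extract step might deliver them (structure untouched).
raw_orders = [
    ("o1", "alice", "19.90"),
    ("o2", "bob", "5.00"),
    ("o3", "alice", "10.10"),
]


def load(conn, rows):
    """Load step: store the extracted rows as-is in a raw table."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform step: plain SQL executed by the warehouse engine itself."""
    conn.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT customer, ROUND(SUM(CAST(amount AS REAL)), 2) AS total
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_elt():
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    load(conn, raw_orders)
    transform(conn)
    return conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
```

Note that the type cast and aggregation happen only in the SQL of the transform step; the load step copies the source rows unchanged, which is the division of labour the ELT approach relies on.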
- https://blog.sqlterritory.com/2018/12/11/5-ways-to-track-database-schema-changes-part-4-ddl-trigger/
pgcapture - A scalable Netflix DBLog implementation for PostgreSQL
cloudquery
pgstream - PostgreSQL replication with DDL changes
- https://github.com/xataio/pgstream
- https://xata.io/blog/pgstream-postgres-replication-schema-changes
Replicating Postgres data and schema changes to an Elasticsearch compatible store, with special handling of field IDs to minimise re-indexing caused by column renames.
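The field ID idea can be illustrated with a small sketch. This is not pgstream's actual implementation or ID scheme; the `FieldIdMapper` class and its ID format are hypothetical and only show why keying indexed documents by a stable ID, rather than by column name, avoids re-indexing after a rename.

```python
class FieldIdMapper:
    """Maps (table, column) to a stable ID that survives column renames."""

    def __init__(self):
        self._ids = {}  # (table, column name) -> stable field ID
        self._next = 1

    def add_column(self, table, column):
        """Return the ID for a column, assigning a new one on first sight."""
        key = (table, column)
        if key not in self._ids:
            self._ids[key] = f"{table}_f{self._next}"
            self._next += 1
        return self._ids[key]

    def rename_column(self, table, old, new):
        # Only the name -> ID mapping changes; the ID itself stays the same.
        self._ids[(table, new)] = self._ids.pop((table, old))

    def to_document(self, table, row):
        """Key the indexed document by stable IDs rather than column names."""
        return {self.add_column(table, col): val for col, val in row.items()}


mapper = FieldIdMapper()
before = mapper.to_document("users", {"name": "Ann"})
mapper.rename_column("users", "name", "full_name")
after = mapper.to_document("users", {"full_name": "Ann"})
```

Here `before` and `after` are the same document, so documents indexed before the rename remain valid and nothing has to be rewritten in the search store.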
PeerDB
Airbyte
- https://airbyte.com/
- https://github.com/PacktPublishing/Fundamentals-of-Analytics-Engineering/blob/main/chapter_8/guides/setting_up_airbyte_cloud.md
Fivetran
PgSync - Replicating to Elasticsearch
Pg-capture
- https://pg-capture.onrender.com/
Kuvasz
redpanda
sequin
- https://github.com/sequinstream/sequin
- Postgres change data capture to streams and queues like Kafka, SQS, HTTP endpoints, and more
pg_flo
Key Features
- Real-time Data Streaming - Capture inserts, updates, deletes, and DDL changes in near real-time
- Fast Initial Loads - Parallel copy of existing data with automatic follow-up continuous replication
- Powerful Transformations - Filter and transform data on-the-fly (see rules)
- Flexible Routing - Route to different tables and remap columns (see routing)
- Production Ready - Supports resumable streaming, DDL tracking, and more
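The filtering, transformation, and routing features listed above can be sketched conceptually as follows. This does not reflect pg_flo's real configuration format or API; the event shape, the rule and route dictionaries, and the `apply_rules` function are all hypothetical and only illustrate the idea of processing a change event on the fly.

```python
def apply_rules(event, rules, routes):
    """Return the routed event, or None if a filter rule drops it."""
    table = event["table"]
    row = dict(event["row"])  # copy so the original event is not mutated

    for rule in rules.get(table, []):
        if rule["type"] == "filter" and not rule["keep"](row):
            return None
        if rule["type"] == "transform":
            row[rule["column"]] = rule["fn"](row[rule["column"]])

    route = routes.get(table, {})
    columns = route.get("columns", {})  # source column -> target column
    return {
        "op": event["op"],
        "table": route.get("target", table),
        "row": {columns.get(col, col): val for col, val in row.items()},
    }


# Hypothetical rules: keep only active users and lowercase their email.
rules = {
    "users": [
        {"type": "filter", "keep": lambda r: r["active"]},
        {"type": "transform", "column": "email", "fn": str.lower},
    ]
}
# Hypothetical routing: write to another table and remap one column.
routes = {"users": {"target": "customers", "columns": {"email": "contact_email"}}}

event = {
    "op": "INSERT",
    "table": "users",
    "row": {"id": 1, "email": "A@Example.com", "active": True},
}
routed = apply_rules(event, rules, routes)
```

Applying filters before routing means dropped events never reach the destination, and tables without any rule or route pass through unchanged.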