stream replication debezium - ghdrako/doc_snipets GitHub Wiki
Debezium
Documentation
- https://debezium.io/documentation/reference/stable/index.html
 - https://debezium.io/documentation/reference/stable/operations/debezium-server.html
 - https://debezium.io/documentation/reference/stable/integrations/serdes.html
 
Examples (composer)
Tutorials
- https://www.baeldung.com/debezium-intro
 - https://www.mastertheboss.com/jboss-frameworks/debezium/getting-started-with-debezium/
 
Using PubSub
Full refresh:
Postgres
- https://debezium.io/documentation/reference/1.4/connectors/postgresql.html
 - https://debezium.io/documentation/reference/stable/postgres-plugins.html
 - https://debezium.io/documentation/reference/stable/connectors/postgresql.html
 
Debezium’s supported logical decoding plug-ins,
Debezium sink options:
Debezium provides a ready-to-use application (Debezium Server) that streams change events from a source database to messaging infrastructure such as:
- Amazon Kinesis,
 - Google Cloud Pub/Sub,
 - Apache Pulsar or
 - Redis (Stream)
 
Examples
- https://github.com/rbiedrawa/cdc-postgres
 - https://github.com/debezium/debezium-examples/blob/main/tutorial/docker-compose-postgres.yaml
 - https://medium.com/@alexander.murylev/kafka-connect-debezium-mysql-source-sink-replication-pipeline-fb4d7e9df790
 
Architecture
Debezium deployment options:
- Kafka Connect
 - Debezium Embedded Engine and
 - Debezium Server
 
Kafka Connect is a framework and runtime for implementing and operating:
- Source connectors such as Debezium that send records into Kafka
 - Sink connectors that propagate records from Kafka topics to other systems
 
Kafka Connect runs as a separate service alongside the Kafka broker.
Debezium Server (using Pub-Sub)
- https://debezium.io/documentation/reference/operations/debezium-server.html
 - https://github.com/debezium/debezium-examples/tree/main/kinesis
 - https://github.com/debezium/debezium-examples/tree/main/debezium-server/debezium-server-sink-pubsub
 - https://infinitelambda.com/postgres-cdc-debezium-google-pubsub/
 - https://stackoverflow.com/questions/72614792/how-to-apply-debezium-cdc-events-from-pub-sub-onto-a-database
 - https://medium.com/google-cloud/change-data-capture-with-debezium-server-on-gke-from-cloudsql-for-postgresql-to-pub-sub-d1c0b92baa98
 - https://blog.devgenius.io/realtime-change-data-capture-to-bigquery-no-kafka-no-dataflow-fb4a6994441b
 - https://cloud.google.com/dataflow/docs/guides/templates/provided/mysql-change-data-capture-to-bigquery
 
The Debezium server is a configurable, ready-to-use application that streams change events from a source database to a variety of messaging infrastructures.
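As a rough illustration of what "configurable" means here: Debezium Server reads an application.properties file in which source connector settings are prefixed with debezium.source. and sink settings with debezium.sink. The sketch below renders such a file for a PostgreSQL source and a Pub/Sub sink. The host, credentials, offset file path, and the project id my-gcp-project are placeholder assumptions, and the helper render_properties is hypothetical, not part of Debezium.

```python
# Source connector settings. All values are illustrative placeholders.
SOURCE = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "offset.storage.file.filename": "data/offsets.dat",  # assumed path
    "offset.flush.interval.ms": "0",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "postgres",
    "topic.prefix": "dedp",
}

# Sink settings: Google Cloud Pub/Sub, with a hypothetical project id.
SINK = {
    "type": "pubsub",
    "pubsub.project.id": "my-gcp-project",
}


def render_properties(source: dict, sink: dict) -> str:
    """Render Debezium Server style properties text.

    Sink keys get the 'debezium.sink.' prefix and source keys the
    'debezium.source.' prefix, one key=value pair per line.
    """
    lines = [f"debezium.sink.{key}={value}" for key, value in sink.items()]
    lines += [f"debezium.source.{key}={value}" for key, value in source.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file content; redirect it to conf/application.properties if desired.
    print(render_properties(SOURCE, SINK), end="")
```

Switching to another sink should then mostly be a matter of changing the debezium.sink.* entries, while the source block stays the same.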
Debezium Engine
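A library that you embed into your own Java application. Change events are delivered directly to the application's code, so no Kafka Connect cluster is required.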
Example
Debezium Kafka Connect configuration for PostgreSQL
{
  "name": "visits-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "postgres",
    "database.server.name": "dbserver1",
    "schema.include.list": "dedp_schema",
    "topic.prefix": "dedp"
  }
}
The configuration defines the connection parameters, the schemas to capture (schema.include.list), and the prefix used for the topic created for each captured table (topic.prefix). As a result, if there is a dedp_schema.events table, the connector writes all of its changes to the dedp.dedp_schema.events topic. Note that in Debezium 2.0 and later, topic.prefix replaces the older database.server.name property.
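A minimal sketch of how this configuration might be used, in Python: it derives the topic name from the prefix, schema, and table as described above, and builds the HTTP request that registers the connector through the Kafka Connect REST API (POST /connectors). The URL http://localhost:8083 is an assumption (8083 is Kafka Connect's default REST port), and the helpers topic_for and build_registration_request are hypothetical names. The request is only built, not sent.

```python
import json
import urllib.request

# Connector definition from the example above.
CONNECTOR = {
    "name": "visits-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "postgres",
        "database.server.name": "dbserver1",
        "schema.include.list": "dedp_schema",
        "topic.prefix": "dedp",
    },
}

# Assumption: Kafka Connect's REST API is reachable at this address.
CONNECT_URL = "http://localhost:8083"


def topic_for(config: dict, schema: str, table: str) -> str:
    """Return the topic name for a table: <topic.prefix>.<schema>.<table>."""
    return f"{config['topic.prefix']}.{schema}.{table}"


def build_registration_request(
    connector: dict, connect_url: str = CONNECT_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST /connectors registration request."""
    body = json.dumps(connector).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    print(topic_for(CONNECTOR["config"], "dedp_schema", "events"))
    request = build_registration_request(CONNECTOR)
    print(request.get_method(), request.full_url)
    # To register against a running Kafka Connect instance, send the request:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Building the request separately from sending it keeps the sketch runnable without a Kafka Connect instance; the commented lines show where the actual call would go.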