postgres logical terminology replication slots - ghdrako/doc_snipets GitHub Wiki
- https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/change-data-capture-in-postgres-how-to-use-logical-decoding-and/ba-p/1396421
- https://www.postgresql.org/docs/current/logicaldecoding-explanation.html
- https://www.postgresql.org/docs/current/logicaldecoding-walsender.html
- https://www.postgresql.org/docs/current/protocol-replication.html
- https://www.postgresql.fastware.com/blog/inside-logical-replication-in-postgresql
Subscription
It defines the connection to another database and set of publications (one or more) to which it wants to subscribe.A subscriber node may have multiple subscriptions. It is possible to define multiple subscriptions between a single publisher-subscriber pair.
Each subscription will receive changes via one replication slot. Additional replication slots may be required for the initial synchronization of pre-existing table data, which will be dropped at the end of data synchronization.
When a subscription is created, the subscription information will be added to the pg_subscription
catalog table
SELECT oid, subdbid, subname, subconninfo, subpublications FROM pg_subscription;
The subscriber will connect to the publisher and get the list of tables that the publisher is publishing.
The subscriber connects to the publisher and creates a replication slot, whose information is available in pg_replication_slots
:
SELECT slot_name, plugin, type, datoid, database, temporary, active,
active_pid, restart_lsn, confrm_flush_lsn FROM pg_replication_slots;
Subscribers add the subscription stats information to pg_stat_subscription
SELECT subid, subname, received_lsn FROM pg_stat_subscription;
The initial part of the CREATE SUBSCRIPTION command will be completed and returned to the user. The remaining work will be done in the background by the replication launcher, walsender, apply worker, and tablesync worker after the CREATE SUBSCRIPTION command is completed.
Replication slot
A replication slot ensures that the publisher will retain the WAL logs that are needed by the replicas even when they are disconnected from the subscriber.
As mentioned earlier, each (active) subscription receives changes from a replication slot on the remote (publishing) side.
Additional table synchronization slots are normally transient, created internally to perform initial table synchronization, and dropped automatically when they are no longer needed.
Normally, the remote replication slot is created automatically when the subscription is created during CREATE SUBSCRIPTION, and it is dropped automatically when the subscription is dropped during DROP SUBSCRIPTION.
Replication slots provide an automated way to ensure that the primary does not remove WAL segments until they have been received by all standbys.
Logical decoding outputs data changes as a stream. That stream is called a logical replication slot.
A slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server.
A logical slot will emit each change just once in normal operation. The current position of each slot is persisted only at checkpoint, so in the case of a crash the slot may return to an earlier LSN, which will then cause recent changes to be sent again when the server restarts. Logical decoding clients are responsible for avoiding ill effects from handling the same message more than once. Clients may wish to record the last LSN they saw when decoding and skip over any repeated data or (when using the replication protocol) request that decoding start from that LSN rather than letting the server determine the start point. The Replication Progress Tracking feature is designed for this purpose, refer to replication origins.
Multiple independent slots may exist for a single database. Each slot has its own state, allowing different consumers to receive changes from different points in the database change stream.
A logical replication slot knows nothing about the state of the receiver(s). Only one receiver may consume changes from a slot at any given time.They will prevent removal of required resources even when there is no connection using them. This consumes storage because neither required WAL nor required rows from the system catalogs can be removed by VACUUM as long as they are required by a replication slot.
- Each slot has one output plugin (you choose which).
- Each slot provides changes from only one database.
- But a single database can have multiple slots.
- Each data change is normally emitted once per slot.
- However, if the Postgres instance restarts, a slot may re-emit changes. Your consumer needs to handle that situation.
- An unconsumed slot is a threat to your Postgres instance’s availability. If a slot’s stream isn’t being consumed, Postgres will hold on to all the WAL files for those unconsumed changes. This can lead to storage full or transaction ID wraparound.
- Postgres has a table called pg_replication_slots that tracks the state of all replication slots.
- consumer is any application that can connect to Postgres and ingest the logical decoding stream in ex pg_recvlogical
Replication sets (pglogical)
Replication sets provide a mechanism to control which tables in the database will be replicated and which actions on those tables will be replicated.
Each replicated set can specify individually if INSERTs, UPDATEs, DELETEs and TRUNCATEs on the set are replicated. Every table can be in multiple replication sets and every subscriber can subscribe to multiple replication sets as well. The resulting set of tables and actions replicated is the union of the sets the table is in. The tables are not replicated until they are added into a replication set.
There are three preexisting replication sets named "default", "default_insert_only" and "ddl_sql". The "default" replication set is defined to replicate all changes to tables in it. The "default_insert_only" only replicates INSERTs and is meant for tables that don't have primary key (see Limitations section for details). The "ddl_sql" replication set is defined to replicate schema changes specified by pglogical.replicate_ddl_command