postgres logical replication parameters tables - ghdrako/doc_snipets GitHub Wiki
- https://www.postgresql.org/docs/current/logical-replication.html
- https://www.pgedge.com/blog/logical-replication-evolution-in-chronological-order-clustering-solution-built-around-logical-replication
wal_level='logical'
max_wal_senders=10 # greater than number of subscribers (or replicas)
max_replication_slots=10 # greater than number of subscribers (or replicas)
max_worker_processes=10 # greater than number of subscribers (or replicas)
max_logical_replication_workers # greater than number of subscribers (or replicas)
max_sync_workers_per_subscription # depends on number of tables being replicated
max_worker_processes > max_logical_replication_workers > max_sync_workers_per_subscription
Aditional parameter:
-
wal_sender_timeout (default 1min) if replication timeout problem increase to higher value in example 5 min
-
wal_receiver_timeout, (default 1 min) if replication timeout problem increase to higher value in example 5 min
-
wal_receiver_status_interval
-
wal_retrieve_retry_interval
-
max_replication_slots
sets the maximum number of replication slots that can be created on the server. A replication slot is a named, persistent reservation for a replication connection to send WAL data to a replica. -
max_wal_senders sets
the maximum number of simultaneously connected WAL sender processes. WAL sender processes are used to stream the WAL from the primary server to the replica. -
wal_sender_timeout
sets the maximum time, in milliseconds, that a WAL sender waits for a response from the replica before giving up and reconnecting. -
wal_receiver_timeout sets
the maximum time, in milliseconds, that a replica waits for WAL data from the primary database before timing out. -
log_replication_commands
, when set to on, runs the replication-related SQL statements.
wal_level='logical'
All changes made to the database are written to the WAL in a format that can be read and applied to a replica.Setting wal_level to logical can increase the amount of data written to the WAL.
max_wal_senders
Must be always greater than the number of replicas.
max_replication_slots
A replication slot is a named, persistent reservation for a replication connection to send WAL data to a replica.
In general, all the data changes occurring on the tables are written to WAL files in pg_xlog / pg_wal which are termed as WAL records. Wal sender process would pick-up those WAL records (belonging to the tables being replicated) and sends across to the replicas and the wal_receiver process on the replica site would apply those changes at the subscriber node.
The WAL files are removed from the pg_xlog/pg_wal location whenever checkpoint occurs. If the WAL files are removed even before the changes are applied to the subscriber node, then, replication would break and lag behind. In-case subscriber node lags behind, a replication slot would ensure all the WAL files needed for the subscriber to get in sync with the provider are retained. It is recommended to configure one replication slot to each subscriber node.
max_worker_processes
max_sync_workers_per_subscription
default:2
Controls the level of parallelism of the initial data copy during the subscription initialization or when new tables are added. The synchronization workers workers used per subscription are taken from the pool of apply workers defined by max_logical_replication_workers.Currently, there can be only one synchronization worker per table.
Note that max_sync_workers_per_subscription may exceed max_logical_replication_workers without causing an error or warning.
max_logical_replication_workers
default: 4
Specifies maximum number of workers to replicate table data between publisher and subscriber. It includes leader apply workers, parallel apply workers, and table synchronization workers.
Copying replication slots (Postgres 12)
The features allow the replication slots to be copied using pg_copy_physical_replication_slot()
and pg_copy_logical_replication_slot()
functions. The logical slot starts from the same LSN as the source logical slot.
logical_decoding_work_mem (PostgreSQL 13)
This parameter to specify the amount of memory allocated to the WAL sender for saving changes in memory before spilling it to the disk. You can increase the parameter value to keep more changes in memory and reduce disk writes, or decrease the value to reduce the memory usage of WAL sender. The default value of this parameter is 64 MB, and it doesn’t require a database server restart, but the configuration file needs to be reloaded if it is modified. Each subscription spawns a WAL sender process on the publisher node to process the changes from publisher to subscriber. The size of the WAL sender process determines the amount of changes to keep in memory in publisher before spilling it to the disk.
Partitioned table support (PostgreSQL 13)
The publish_via_partition_root
option (used when creating a publication) controls whether the changes to a partition contained in the publication will be published using the identity and schema of the parent table, rather than that of the partitions that are actually changed.
Parallel Apply(PostgreSQL 16)
Specify the parallel streaming option while creating the subscriber. The max_parallel_apply_workers_per_subscription
parameter controls the maximum number of parallel workers per subscription.
Support binary mode for COPY (PostgreSQL 16)
Before text mode was only supported for the initial table copy, and binary transfer mode was only supported for the replication of changes. With PostgreSQL 16, you can set binary=true
while creating the subscription to perform the initial data copy in binary mode (much faster then text mode). This option is only supported if both publisher and subscriber are on PostgreSQL 16.
max_slot_wal_keep_size (integer) (Postgres 13)
The max_slot_wal_keep_size parameter can be set in the postgresql.conf file or on the command line. It determines the number of WAL files required by replication_slots to be kept in the pg_log directory; any replication slots exceeding the specified value are marked invalid.
Monitoring
Publisher
SELECT application_name, wait_event_type, wait_event, query, backend_type FROM pg_stat_activity WHERE state='active';
Subscriber
- https://www.postgresql.org/docs/current/progress-reporting.html#COPY-PROGRESS-REPORTING
- monitor the progress of the initial data copy
SELECT * FROM pg_stat_progress_copy;
Subscription parameters
binary (default false)
if set to true - subscription will request the publisher to send the data in binary format (as opposed to text). Binary format can be faster than the text format, but it is less portable across machine architectures and PostgreSQL versions. Binary format is very data type specific; for example, it will not allow copying from a smallint column to an integer column, even though that would work fine in text format. Even when this option is enabled, only data types having binary send and receive functions will be transferred in binary. Note that the initial synchronization requires all data types to have binary send and receive functions, otherwise the synchronization will fail (see CREATE TYPE for more about send/receive functions).
If the publisher is a PostgreSQL version before 16, then any initial table synchronization will use text format even if binary = true.
Even when this option is enabled, only data types that have binary send and receive functions will be transferred in binary. When doing cross-version replication, if the subscriber lacks a binary receive function for the type, the data transfer will fail, and this option can't be used.
CREATE SUBSCRIPTION mysub CONNECTION 'dbname=mydb host=primary_host user=myuser password=mypassword' PUBLICATION pub1 WITH (binary = true);
copy_data (default true)
Specifies whether to copy pre-existing data in the publications that are being subscribed to when the replication starts. If the publications contain WHERE clauses, it will affect what data is copied.
CREATE SUBSCRIPTION sub1
CONNECTION 'dbname=mydb host=primary_host user=myuser password=mypassword'
PUBLICATION pub1 WITH (copy_data=false);
Row filtering
CREATE PUBLICATION pub3b FOR TABLE t3 WHERE (e > 5);
streaming
Specifies whether to enable streaming of in-progress transactions for this subscription. The default value is off, meaning all transactions are fully decoded on the publisher and only then sent to the subscriber as a whole.
If set to on, the incoming changes are written to temporary files and then applied only after the transaction is committed on the publisher and received by the subscriber.
If set to parallel, incoming changes are directly applied via one of the parallel apply workers, if available. If no parallel apply worker is free to handle streaming transactions then the changes are written to temporary files and applied after the transaction is committed. Note that if an error happens in a parallel apply worker, the finish LSN of the remote transaction might not be reported in the server log.
CREATE SUBSCRIPTION sub CONNECTION 'dbname=mydb host=primary_host user=myuser password=mypassword' PUBLICATION pub WITH (copy_data = off, streaming=parallel, synchronous_commit=remote_apply);