postgres physical replication - ghdrako/doc_snipets GitHub Wiki
- https://mindhub365.com/sql/postgresql-streaming-replication-wal-what-it-is-and-how-to-configure-one/
In PostgreSQL, there are two kinds of physical replication techniques:
- Asynchronous replication: In asynchronous replication, the primary server (source) sends a continuous flow of data to the secondary one (target) without waiting for any acknowledgment from the target. This type of copying has the advantage of speed, but it carries a greater risk of data loss because the received data is never acknowledged.
- Synchronous replication: In synchronous replication, the source sends the data to the target (the second server), which acknowledges that the changes have been correctly written; only once this acknowledgment arrives is the transfer considered complete.
Summary of the key information about WAL segments:
- The WAL segment size is 16 MB by default (it can only be changed at initdb time, via --wal-segsize).
- By default, WAL files are deleted as soon as they are older than the latest checkpoint.
- We can make the primary retain extra WAL segments using wal_keep_segments (replaced by wal_keep_size, expressed in megabytes, since PostgreSQL 13).
- WAL segments are stored in the pg_wal directory.
The wal_level directive
The wal_level directive sets how much information is written to WAL segments. With the minimal value, only the information needed for crash recovery is written; it supports neither archiving nor replication. The default value, wal_level = replica, writes enough information to support WAL archiving and physical replication, while wal_level = logical additionally records what is needed for logical replication.
Streaming replication
The idea behind streaming replication is to copy the WAL files from the primary server to another (replica) server.
Streaming replication works by continuously transferring Write-Ahead Log (WAL) data from the primary server to the standby server in real-time, keeping the standby's database nearly identical to the primary.
The replica server will be in a state of continuous recovery, and it continuously executes the WAL that is passed by the primary machine; this way, the replica machine binarily replicates the data of the primary machine through the WAL.
In a streaming replication context, a communication channel will be open between the replica and primary, and the primary will send the WAL segments through it. The replica server will receive the WAL segments and rerun them, remaining in a permanent recovery state.
Asynchronous replication
On the primary server
- Modify listen_addresses so that PostgreSQL listens on the network. If we set listen_addresses = '*', PostgreSQL will listen on any IP address; otherwise, we can specify a comma-separated list of addresses. This change requires a restart of the PostgreSQL service.
- Create a new user that is able to perform replication:
CREATE role replicarole WITH REPLICATION ENCRYPTED PASSWORD 'SuperSecret' LOGIN;
- Modify the pg_hba.conf file so that the replica machine can reach the primary with the user replicarole:
host replication replicarole 192.168.122.11/32 scram-sha-256
- To make this configuration active, we need to run a reload of the PostgreSQL server.
select pg_reload_conf();
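Taken together, the primary-side changes amount to these two configuration fragments (the addresses and user name are the example values from the steps above; adjust them to your environment):

```ini
# postgresql.conf -- requires a service restart
listen_addresses = '*'          # or a comma-separated list of addresses
wal_level = replica             # the default; required for physical replication

# pg_hba.conf -- requires only a reload (SELECT pg_reload_conf();)
host replication replicarole 192.168.122.11/32 scram-sha-256
```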
- On the replica server, we have to turn off the PostgreSQL service, destroy the PGDATA directory, and remake it – this time, empty and with the right permissions. To do this, we can use these statements:
root@pg1:/# systemctl stop postgresql
root@pg1:/# cd /var/lib/postgresql/16/
root@pg1:/# rm -rf main
root@pg1:/# mkdir main
root@pg1:/# chown postgres:postgres main
root@pg1:/# chmod 0700 main
The wal_keep_segments option
The postgresql.conf directive that tells PostgreSQL how many WAL segments to keep on disk is called wal_keep_segments (since PostgreSQL 13 it has been replaced by wal_keep_size, expressed in megabytes). By default, it is set to zero, because a replica is not set up by the PostgreSQL installation process; this means that PostgreSQL will not store any extra WAL segments as a buffer, so if the replica (standby) machine goes down, it may no longer be able to realign itself when it comes back up. This happens because, in the time it takes the replica to get back up, the primary machine may have produced and deleted new WAL segments.

The first way to overcome this problem is to set wal_keep_segments to a value greater than zero in postgresql.conf. For example, wal_keep_segments = 100 means that at least 100 WAL segment files will be present in the pg_wal folder, for a total of 100 * 16 MB = 1.6 GB of occupied disk space. The primary always keeps these extra WAL segments, and if the replica goes down, it will only be able to realign itself, once back up, if the primary has produced fewer than wal_keep_segments new segments in the meantime. This is a static solution: the buffer covers only the time the primary takes to produce wal_keep_segments segments, and the occupied disk space is always wal_keep_segments * 16 MB, even when it is no longer necessary to keep those segments on the primary (because the replica has already replayed them).
The advantage of this solution is that disk usage on the primary is bounded: if the network or the replica goes down, PostgreSQL uses at most wal_keep_segments * 16 MB of extra disk space instead of accumulating WAL without limit. So, if we do not have much disk space, we can use this solution, keeping in mind that if the replica falls behind by more than wal_keep_segments segments, it will no longer be able to synchronize and we will have to rebuild it.
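The sizing arithmetic above is easy to check; a quick sketch in plain shell (no PostgreSQL required):

```shell
# Extra disk space reserved by wal_keep_segments = 100 with the
# default 16 MB segment size: 100 * 16 = 1600 MB, i.e. about 1.6 GB.
echo "$((100 * 16)) MB"
```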
The slot way
In PostgreSQL, there is another approach that can be used to solve the problem of storing WAL segments: the slot technique. Through a replication slot, we tell PostgreSQL to keep all the WAL segments on the primary until they have been transferred to the replica servers. In this way, we get dynamic, variable, and fully automated management of the number of WAL segments that the primary must keep as a buffer, which is a very easy way to manage our physical replicas. Keep in mind that if a replica with a slot stays down, WAL accumulates on the primary without limit unless a cap such as max_slot_wal_keep_size (PostgreSQL 13+) is set.
Create a new slot:
SELECT * FROM pg_create_physical_replication_slot('master');
Drop a slot:
select pg_drop_replication_slot('master');
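Once a slot exists, we can check on the primary how much WAL it is forcing PostgreSQL to retain. A sketch query; the wal_status column assumes PostgreSQL 13 or later:

```sql
-- Run on the primary: how much WAL is being held back for each slot.
SELECT slot_name, active, wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```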
Asynchronous replication
By default, in PostgreSQL, physical replication is asynchronous.
The replica server will now have the PostgreSQL service turned off and the PGDATA data folder created, empty, and with the right permissions granted:
- Go inside the PGDATA directory as the postgres system user:
root@pg2:# su - postgres
postgres@pg2:~$ cd /var/lib/postgresql/16/main
- Run the pg_basebackup command. It performs a base backup from the primary machine to the replica machine and prepares the replica to receive and replay the WAL segments it is sent, leaving the replica server in a state of permanent recovery:
postgres@pg2:~/16/main$ pg_basebackup -h 192.168.122.10 -U replicarole -p 5432 -D /var/lib/postgresql/16/main -Fp -Xs -P -R -S master
- If pg_basebackup does not start quickly, it is waiting for a checkpoint on the primary; to speed the operation up, we can connect to the primary server and execute:
postgres=# CHECKPOINT;
- -h: The host of the primary server that the replica should connect to.
- -U: This is the user created on the primary server used for replication.
- -p: This is the port where the primary server listens.
- -D: This is the PGDATA value on the replica server.
- -Fp: This performs a backup on the replica, maintaining the same data structure present on the primary.
- -Xs: This opens a second connection to the primary server and starts the transfer of the WAL segments at the same time as the backup is performed.
- -P: This shows the progress of the backup.
- -S: This is the slot name created on the primary server.
- -R: This creates the standby.signal file and adds the connection settings to the postgresql.auto.conf file:
postgres@pg2:~/16/main$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replicarole password=SuperSecret channel_binding=disable host=192.168.122.10 port=5432 sslmode=disable sslcompression=0 sslcertmode=disable sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'master'
- Start the PostgreSQL service on the replica machine, and physical replication should work. As the root user, let's execute the following:
root@pg2:/var/lib/postgresql/16# systemctl start postgresql
Replica monitoring
- https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-REPLICATION-VIEW

PostgreSQL offers a view through which we can monitor the status of replicas in real time: pg_stat_replication. This view must be queried by connecting to the primary node.
\x
select * from pg_stat_replication ;
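The LSN columns of pg_stat_replication make it possible to estimate how far behind each standby is. A sketch; the replay_lag column (and its write/flush siblings) assumes PostgreSQL 10 or later:

```sql
-- Run on the primary: per-standby replication state and lag.
SELECT application_name, client_addr, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;
```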
Synchronous replication
In asynchronous replication, the primary server does not wait for the replica server to actually replicate the data. In synchronous replication, a COMMIT on the primary does not return until the synchronous standbys have confirmed the data (received, flushed, or applied, depending on synchronous_commit); after the commit returns, we are sure that the data is on the primary and on all synchronous replicas. When we want to achieve synchronous replication, it is good practice to have identical machines and a good network connection between them; otherwise, performance can suffer.
PostgreSQL settings
It is possible to change from asynchronous replication to synchronous replication.
Primary server
On the primary server, we have to check whether the synchronous_commit parameter is set to on; synchronous_commit = on is the default value on a new PostgreSQL installation. Then we add the synchronous_standby_names parameter, listing the names of all standby servers that will replicate the data synchronously. We can also use the '*' wildcard, indicating to PostgreSQL that each standby server can potentially be a synchronous replica.
synchronous_standby_names = 'pg2'
synchronous_commit = on
After this, we need to restart our server:
# systemctl restart postgresql
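synchronous_standby_names also accepts priority- and quorum-based lists (this syntax is available from PostgreSQL 9.6 and 10, respectively); a configuration sketch assuming two standbys whose application_names are pg2 and pg3:

```ini
# Priority-based: pg2 is the synchronous standby; pg3 takes over if pg2 fails.
synchronous_standby_names = 'FIRST 1 (pg2, pg3)'
# Quorum-based: any one of the two standbys must confirm each commit.
#synchronous_standby_names = 'ANY 1 (pg2, pg3)'
```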
Standby server
On the standby server, we have to add a parameter to the connection string to the primary so that the primary knows which standby each replication connection comes from. We need to edit the postgresql.auto.conf file; it is currently as follows:
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replicarole password=SuperSecret channel_binding=disable host=192.168.122.10 port=5432 sslmode=disable sslcompression=0 sslcertmode=disable sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any load_balance_hosts=disable'
primary_slot_name = 'master'
We need to change this to the following:
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replicarole password=SuperSecret channel_binding=disable host=192.168.122.10 port=5432 sslmode=disable sslcompression=0 sslcertmode=disable sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any load_balance_hosts=disable application_name=pg2'
primary_slot_name = 'master'
We have added the application_name=pg2 option.
After doing this, let’s restart the standby server.
Status
select * from pg_stat_replication;
The output shows that the primary and standby servers replicate synchronously when sync_state = sync.
Delayed replication
The delay occurs only on WAL records for transaction commits. Other records are replayed as quickly as possible, which is not a problem because MVCC visibility rules ensure their effects are not visible until the corresponding commit record is applied.
The delay occurs once the database in recovery has reached a consistent state, until the standby is promoted or triggered. After that the standby will end recovery without further waiting.
WAL records must be kept on the standby until they are ready to be applied. Therefore, longer delays will result in a greater accumulation of WAL files, increasing disk space requirements for the standby's pg_wal directory.
!!! Not suitable for synchronous replication: this setting affects synchronous replication when synchronous_commit is set to remote_apply, because every COMMIT will have to wait for the delayed apply.
recovery_min_apply_delay - defines how far the standby should lag behind the primary when applying changes. According to the documentation, the maximum value is 2147483647 ms, i.e. roughly 24.8 days (not 24 hours) - https://pgpedia.info/r/recovery_min_apply_delay.html
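The roughly-24-day cap follows from the parameter being stored as a signed 32-bit number of milliseconds; a quick shell check:

```shell
# Maximum recovery_min_apply_delay: 2147483647 ms (signed 32-bit int).
# Dividing by the milliseconds in a day (86400000) gives ~24.8 days.
echo "$((2147483647 / 86400000)) days"
```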
postgresql.conf:
recovery_min_apply_delay = 5000   # milliseconds when no unit is given, i.e. 5 s
recovery_min_apply_delay = '2h'
and we make a reload of the postgresql service on the replica server:
root@pg2:~# systemctl reload postgresql
Using a delay on the replica server means that WAL files are still downloaded from the primary server regularly, but they are applied with the delay specified by the recovery_min_apply_delay parameter.
Promoting a replica server to a primary
It is possible to promote the replica node to primary; to achieve this, on the replica node, as the postgres user, we execute this statement:
postgres@pg2:~$ pg_ctl promote -D /var/lib/postgresql/16/main
We can see that the replica is 5 seconds behind the primary because the time unit used by recovery_min_apply_delay is milliseconds (5000 ms = 5 s).