postgres logical replication decoding plugins wal2json pgoutput dblog - ghdrako/doc_snipets GitHub Wiki

logical_decoding_work_mem - parametr 64 MB - jasie nie mieści to spliting on disk - aby buforować niezatwierdzone tranzakcje - po rollback-u sa wyrzucane a po commicie zmiany sa emitowane. Specifies the maximum amount of memory to be used by logical decoding, before some of the decoded changes are written to local disk.

Output plugins transform the data from the write-ahead log's internal representation into the format the consumer of a replication slot desires.

An output plugins receives data from the WAL. The plugin then decides what information to keep and how to present that information to you.

For example, new row data is always sent from the WAL to the output plugin. However, wal2json chooses not to output new row data for an UPDATE if the table has no primary key. test_decoding, on the other hand, will publish that row. But test_decoding is not JSON formatted with name/value pairs.

test_decoding

test_decoding: Available on Postgres 9.4+. Though created to be just an example of an output plugin, test_decoding is still useful if your consumer supports it (e.g. Qlik replicate).

Decoderbufs

Pgoutput

Available since Postgres 10. pgoutput is used by Postgres to support logical replication, and is supported by some consumers for decoding (e.g. Debezium).

wal2json

obraz obraz

When a row is changed in a Postgres table, that change is recorded in the WAL. If logical decoding is enabled, the record of that change is passed to the output plugin. The output plugin changes that record from the WAL format to the plugin’s format (e.g. a JSON object). Then the reformatted change exits Postgres via a replication slot. Finally, there’s the consumer. A consumer is any application of your choice that connects to Postgres and receives the logical decoding output.

Logical decoding can only output information about DML (data manipulation) events in Postgres, that is INSERT, UPDATE, and DELETE. DDL (data definition) changes like CREATE TABLE, ALTER ROLE, and DROP INDEX are not emitted by logical decoding.

UPDATE AND DELETE is ased on REPLICA IDENTITY

 REPLICA IDENTITY { DEFAULT | USING INDEX index_name | FULL | NOTHING }

Changes can be consumed using the streaming protocol (logical replication slots) or by a special SQL API. To generate a message event per action (commit per row) To reduce database loads

  • format version 1 produces a JSON object per transaction. All of the new/old tuples are available in the JSON object. Also, there are options to include properties such as transaction timestamp, schema-qualified, data types, and transaction ids.

  • format version 2 produces a JSON object per tuple. Optional JSON object for beginning and end of transaction. Also, there are a variety of options to include properties.

postgressql.conf
wal_level = logical
max_replication_slots = 1
max_wal_senders = 1
pg_hba.conf
local  replication  myrep  <IP address>/0  trust
local  replication  myre                  trust

Read wal2json using pg_recvlogical

  • Create slot
$ pg_recvlogical -h sever -U user@server -d mydb --slot logical_slot --create-slot -P wal2json
  • Start slot and send changes as output to console
$ pg_recvlogical -d mydb --slot logical_slot --start -o pretty-print=1 -o add-msg-prefixes=wal2json -f -
  • Unnecessary changes can be filtered out ( --start add-tables=social.* )
pg_recvlogical -h ['host'] -d ['dbname'] -p ['port'] --slot ['name_slot] -U ['user'] --start add-tables=social.* -o include-types=0 
-o include-timestamp=true
$ pg_recvlogical -d mydb --slot logical_slot --create-slot -P wal2json
CREATE ROLE rep_user WITH LOGIN PASSWORD '******' REPLICATION;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO rep_user;

Monitorig slot space usage

SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn)) AS lag, active from pg_replication_slots WHERE slot_type='logical';

Drop slot

version 1
$ pg_recvlogical -d postgres --slot logical_slot --drop-slot
* pg_recvlogical -h server -U user@server -d mydb --slot logical_slot --drop-slot

Read from wal2json using sql

version 1
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'pretty-print', '1', 'add-msg-prefixes', 'wal2json');
SELECT 'stop' FROM pg_drop_replication_slot('test_slot');
version 2
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'format-version', '2', 'add-msg-prefixes', 'wal2json');
SELECT 'stop' FROM pg_drop_replication_slot('test_slot');

A word of caution

  • Big transactions issues ( more than 1GB of memory )
  • Wal2Json can not handle too big transaction unless the use of option write-in-chunks but the json is not well formed

Monitoring interfaces

  • pg_stat_replication
  • pg_replication_slots
  • pg_stat_activity

DBLog Framework

DBLog is a Java-based framework, able to capture changes in real-time and to take dumps. Dumps are taken in chunks so that they interleave with real-time events and don’t stall real-time event processing for an extended period of time. Dumps can be taken any time, via a provided API. This allows downstream consumers to capture the full database state initially or at a later time for repairs.

⚠️ **GitHub.com Fallback** ⚠️