PostgreSQL 변경 데이터 전송 - kayform/bwcontrol GitHub Wiki
"PostgreSQL 9.4부터 Replication을 위해 추가된기능":https://www.postgresql.org/docs/9.5/static/logicaldecoding.html
- WAL log로부터 변화된 데이터를 Decoding하여 Custum form data로 Streaming(e.g JSON format)
- Output Plugin 개발 필요(test_output plugin 기본제공)
PG Server로부터 changed data stream을 수신하는 utility "control PostgreSQL logical decoding streams receiver":https://www.postgresql.org/docs/9.5/static/app-pgrecvlogical.html
아래...
-
Logical Decoding 기능을 사용하기위해 옵션 조정
postgresql.conf
"wal_level=logical":https://www.postgresql.org/docs/9.5/static/runtime-config-wal.html "max_worker_processes=10 #???":https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES "max_replication_slots=10":https://www.postgresql.org/docs/9.5/static/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-SENDER "max_wal_senders=10":https://www.postgresql.org/docs/9.5/static/runtime-config-replication.html
-
Replication 접속 허용
pg_hba.conf
local replication postgres trust
-
PostgreSQL 기본 Output Plugin 설치
PostgreSQL Source directory에서 make; make install $PGHOME/lib/test_decoding.so
-
Restart PostgreSQL
-
"Logical Decoding Examples":https://www.postgresql.org/docs/9.5/static/logicaldecoding-example.html
Slot 생성 SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
Slot 삭제 SELECT pg_drop_replication_slot('regression_slot');
*** 상기 쿼리로 생성시 각각의 backend process 의 Cache에 생성됨(개발 적용시 수정 필요) Recommand - cursor.copy_expert("COPY (select pg_logical_slot_stream_relation(..)) TO STDOUT" ...)
-
pg_recvlogical 실행
ex) pg_recvlogical -d postgres -S regression_slot --start --file=./test.log
pg_recvlogical controls PostgreSQL logical decoding streams.
Usage:
pg_recvlogical [OPTION]...
Action to be performed:
--create-slot create a new replication slot (for the slot's name see --slot)
--drop-slot drop the replication slot (for the slot's name see --slot)
--start start streaming in a replication slot (for the slot's name see --slot)
Options:
-f, --file=FILE receive log into this file, - for stdout
-F --fsync-interval=SECS
time between fsyncs to the output file (default: 10)
--if-not-exists do not error if slot already exists when creating a slot
-I, --startpos=LSN where in an existing slot should the streaming start
-n, --no-loop do not loop on connection lost
-o, --option=NAME[=VALUE]
pass option NAME with optional value VALUE to the
output plugin
-P, --plugin=PLUGIN use output plugin PLUGIN (default: test_decoding)
-s, --status-interval=SECS
time between status packets sent to server (default: 10)
-S, --slot=SLOTNAME name of the logical replication slot
-v, --verbose output verbose messages
-V, --version output version information, then exit
-?, --help show this help, then exit
Connection options:
-d, --dbname=DBNAME database to connect to
-h, --host=HOSTNAME database server host or socket directory
-p, --port=PORT database server port number
-U, --username=NAME connect as specified database user
-w, --no-password never prompt for password
-W, --password force password prompt (should happen automatically)
- Run DML operation(insert, update, delete)
- Check the log (test.log)
-
테스트 환경
System : VirtualBox OS : CentOS7 64bit DB : PostgreSQL 9.3devel CPU : single core i5-5200U CPU @ 2.20GHz Memory :1024MB
-
Query : insert into emp_org select md5(random()::text), s from generate_series(1,1000000) s;
|Normal |Rep Slot only|Decode Send & Recv| |Time: 7636 ms|Time: 6551 ms|Time: 8229 ms | |Time: 4946 ms|Time: 8193 ms|Time: 8820 ms | |Time: 5673 ms|Time: 8229 ms|Time: 9952 ms | |Time: 5329 ms|Time: 6287 ms|Time: 9806 ms | |Time: 5516 ms|Time: 5218 ms|Time: 9691 ms | |Time: 5918 ms|Time: 5248 ms|Time: 10140 ms| |Time: 5640 ms|Time: 7444 ms|Time: 10881 ms| |Time: 5919 ms|Time: 5598 ms|Time: 9573 ms | |Time: 5782 ms|Time: 5588 ms|Time: 10815 ms| |Time: 7496 ms|Time: 6762 ms|Time: 9249 ms | |AVG : 5985 ms|AVG : 6511 ms|AVG : 9715 ms|
- Query : insert into emp select * from emp_org;
|Normal |Rep Slot only|Decode Send & Recv| |Time: 3024 ms|Time: 2154 ms|Time: 4884 ms | |Time: 3005 ms|Time: 3065 ms|Time: 6544 ms | |Time: 2947 ms|Time: 3186 ms|Time: 5652 ms | |Time: 2646 ms|Time: 2191 ms|Time: 6348 ms | |Time: 3264 ms|Time: 3067 ms|Time: 5607 ms | |Time: 4233 ms|Time: 3059 ms|Time: 5835 ms | |Time: 3371 ms|Time: 2172 ms|Time: 5726 ms | |Time: 4155 ms|Time: 2214 ms|Time: 5359 ms | |Time: 3558 ms|Time: 2234 ms|Time: 5740 ms | |Time: 3891 ms|Time: 3338 ms|Time: 4256 ms | |AVG : 3409 ms|AVG : 2668 ms|AVG : 5595 ms|
*결과
- Output plugin 개발 혹은 "추가 오픈소스 활용":http://222.110.153.159/projects/sw/wiki/PostgreSQL_%EB%B3%80%EA%B2%BD_%EB%8D%B0%EC%9D%B4%ED%84%B0_%EC%A0%84%EC%86%A1#추가
- 데이터 decode, transfer 에 대한 최적화 필요(pg_recvlogical 개선)
- 전송 데이터 개선(불필요한 meta data 등)
- Virtual machine이 아닌 플랫폼에서 추가 테스트 필요
- "현재 postgres incremental update 방법중 가장 효율적인 방법으로 평가":http://stackoverflow.com/questions/32836324/how-to-query-postgres-incremental-updates-since-a-specified-point-a-timestamp-o
- pglogical 검토 고려
- logical decoding
"https://www.postgresql.org/docs/9.5/static/logicaldecoding.html":https://www.postgresql.org/docs/9.5/static/logicaldecoding.html "https://www.pgcon.org/2014/schedule/attachments/326_logical-decoding-pgcon-2014-05-23.pdf":https://www.pgcon.org/2014/schedule/attachments/326_logical-decoding-pgcon-2014-05-23.pdf "https://tech.zalando.de/blog/streaming-huge-databases-using-logical-decoding/":https://tech.zalando.de/blog/streaming-huge-databases-using-logical-decoding/ "http://michael.otacoo.com/content/materials/20140919_pgopen_logirep.pdf":http://michael.otacoo.com/content/materials/20140919_pgopen_logirep.pdf "http://www.slideshare.net/8kdata/postgresql-logical-decoding":http://www.slideshare.net/8kdata/postgresql-logical-decoding "http://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/":http://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/
- pglogical(2cnQuadrant)
"https://2ndquadrant.com/en/resources/pglogical/":https://2ndquadrant.com/en/resources/pglogical/
##. 추가
-another open source projects using logical decoding
"wal2json":https://github.com/eulerto/wal2json - JSON output plugin for changeset extraction - "stackoverflow loop":http://stackoverflow.com/questions/32407764/its-possible-to-use-logical-decoding-to-replicate-a-single-table "pg_kafka":https://github.com/xstevens/pg_kafka - A PostgreSQL extension to produce messages to Apache Kafka. - not using logical decoding(trigger) - "stackoverflow loop":http://stackoverflow.com/questions/26231384/postgresql-logical-log-streaming-to-apache-kafka "bottledwater-pg":https://github.com/confluentinc/bottledwater-pg - Change data capture from PostgreSQL into Kafka - "blog":http://www.confluent.io/blog/bottled-water-real-time-integration-of-postgresql-and-kafka/ "decoder_raw":https://github.com/michaelpq/pg_plugins/tree/master/decoder_raw