U1.61 Ubuntu Quick Start (QS): Kubernetes PostgreSQL HA Cluster on premises - chempkovsky/CS2WPF-and-CS2XAMARIN GitHub Wiki
- First attempt
- Second attempt
- After restarting master pod
- Service resources
- Node Affinity
- Restarting master pod
- Using Service Resources
- Create Database
- SyncFailed Status
- Cluster manifest reference
- read the article Zalando SE: Postgres Operator
- we continue to work with the Kubernetes cluster prepared in the article
Click to show Kubernetes cluster info
yury@u2004s01:~$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
u2004s01 Ready control-plane,master 14m v1.23.2 192.168.100.61 <none> Ubuntu 20.04.3 LTS 5.4.0-91-generic docker://20.10.12
u2004s02 Ready <none> 9m42s v1.23.2 192.168.100.62 <none> Ubuntu 20.04.3 LTS 5.4.0-91-generic docker://20.10.12
u2004s03 Ready <none> 8m2s v1.23.2 192.168.100.63 <none> Ubuntu 20.04.3 LTS 5.4.0-91-generic docker://20.10.12
u2004s04 Ready <none> 7m v1.23.2 192.168.100.64 <none> Ubuntu 20.04.3 LTS 5.4.0-91-generic docker://20.10.12
yury@u2004s01:~$ kubectl get pods -n second-local-path-storage
NAME READY STATUS RESTARTS AGE
second-local-path-storage-local-path-provisioner-75958b75882bf8 1/1 Running 0 36s
- read the article Quickstart
- for u2004s01
git clone https://github.com/zalando/postgres-operator.git
cd postgres-operator
kubectl create -f manifests/configmap.yaml
kubectl create -f manifests/operator-service-account-rbac.yaml
kubectl create -f manifests/postgres-operator.yaml
kubectl create -f manifests/api-service.yaml
- as an alternative, everything above can be applied in a single kustomize-based step:
kubectl apply -k github.com/zalando/postgres-operator/manifests
yury@u2004s01:~/postgres-operator$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
postgres-operator-849dddc998-gbhcg 1/1 Running 0 4m5s 10.32.121.129 u2004s04 <none> <none>
Click to show the responses
yury@u2004s01:~$ git clone https://github.com/zalando/postgres-operator.git
Cloning into 'postgres-operator'...
remote: Enumerating objects: 23247, done.
remote: Counting objects: 100% (366/366), done.
remote: Compressing objects: 100% (226/226), done.
remote: Total 23247 (delta 221), reused 228 (delta 123), pack-reused 22881
Receiving objects: 100% (23247/23247), 8.82 MiB | 9.20 MiB/s, done.
Resolving deltas: 100% (16633/16633), done.
yury@u2004s01:~$ cd postgres-operator
yury@u2004s01:~/postgres-operator$ sudo nano manifests/configmap.yaml
[sudo] password for yury:
yury@u2004s01:~/postgres-operator$ kubectl create -f manifests/configmap.yaml
configmap/postgres-operator created
yury@u2004s01:~/postgres-operator$ kubectl create -f manifests/operator-service-account-rbac.yaml
serviceaccount/postgres-operator created
clusterrole.rbac.authorization.k8s.io/postgres-operator created
clusterrolebinding.rbac.authorization.k8s.io/postgres-operator created
clusterrole.rbac.authorization.k8s.io/postgres-pod created
yury@u2004s01:~/postgres-operator$ kubectl create -f manifests/postgres-operator.yaml
deployment.apps/postgres-operator created
yury@u2004s01:~/postgres-operator$ kubectl create -f manifests/api-service.yaml
service/postgres-operator created
yury@u2004s01:~/postgres-operator$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
postgres-operator-849dddc998-gbhcg 1/1 Running 0 4m5s 10.32.121.129 u2004s04 <none> <none>
- read the article Deploy the operator UI
- for u2004s01
kubectl apply -f ui/manifests/
Click to show the responses (WITH ERROR)
yury@u2004s01:~/postgres-operator$ kubectl apply -f ui/manifests/
deployment.apps/postgres-operator-ui created
ingress.networking.k8s.io/postgres-operator-ui created
service/postgres-operator-ui created
serviceaccount/postgres-operator-ui created
clusterrole.rbac.authorization.k8s.io/postgres-operator-ui created
clusterrolebinding.rbac.authorization.k8s.io/postgres-operator-ui created
error: unable to recognize "ui/manifests/kustomization.yaml": no matches for kind "Kustomization" in version "kustomize.config.k8s.io/v1beta1"
yury@u2004s01:~/postgres-operator$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
postgres-operator-849dddc998-gbhcg 1/1 Running 0 49m 10.32.121.129 u2004s04 <none> <none>
postgres-operator-ui-5889cfdc78-vx7zb 1/1 Running 0 20m 10.32.105.1 u2004s03 <none> <none>
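The Kustomization error deserves a note: `kubectl apply -f ui/manifests/` submits every file in the directory to the API server as a resource, including `kustomization.yaml`, which is not a server-side kind. A kustomize-aware apply would avoid this; a hedged sketch (assuming ui/manifests is a valid kustomization root):

```shell
# Render the directory through kustomize client-side and apply the result;
# kustomization.yaml is then consumed locally, not sent to the API server
kubectl apply -k ui/manifests/
```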
- for u2004s01
kubectl port-forward --address 0.0.0.0 svc/postgres-operator-ui 8081:80
- or
kubectl port-forward --address 0.0.0.0 pod/postgres-operator-ui-5889cfdc78-vx7zb 8081:80
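While the port-forward is running, reachability can be checked from a second shell before trying the browser; a small sketch (port 8081 as forwarded above):

```shell
# Print only the HTTP status code; 200 would mean the UI answers locally
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8081/
```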
- we could not connect to the pod using the browser
Click to show the log
operator_ui.spiloutils INFO Common Cluster Label: {"application":"spilo"}
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/handlers.py", line 934, in emit
self.socket.send(msg)
File "/usr/lib/python3.8/site-packages/gevent/_socketcommon.py", line 722, in send
return self._sock.send(data, flags)
File "/usr/lib/python3.8/site-packages/gevent/_socket3.py", line 55, in _dummy
raise OSError(EBADF, 'Bad file descriptor')
OSError: [Errno 9] Bad file descriptor
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/handlers.py", line 855, in _connect_unixsocket
self.socket.connect(address)
File "/usr/lib/python3.8/site-packages/gevent/_socketcommon.py", line 628, in connect
raise _SocketError(result, strerror(result))
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/handlers.py", line 937, in emit
self._connect_unixsocket(self.address)
File "/usr/lib/python3.8/logging/handlers.py", line 866, in _connect_unixsocket
self.socket.connect(address)
File "/usr/lib/python3.8/site-packages/gevent/_socketcommon.py", line 628, in connect
raise _SocketError(result, strerror(result))
FileNotFoundError: [Errno 2] No such file or directory
Call stack:
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/operator_ui/__main__.py", line 1, in <module>
from .main import main
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/operator_ui/main.py", line 43, in <module>
from .spiloutils import (
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/operator_ui/spiloutils.py", line 26, in <module>
logger.info("Common Cluster Label: {}".format(COMMON_CLUSTER_LABEL))
Message: 'Common Cluster Label: {"application":"spilo"}'
Arguments: ()
operator_ui.spiloutils INFO Common Pooler Label: {"application":"db-connection-pooler"}
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/handlers.py", line 934, in emit
self.socket.send(msg)
File "/usr/lib/python3.8/site-packages/gevent/_socketcommon.py", line 722, in send
return self._sock.send(data, flags)
File "/usr/lib/python3.8/site-packages/gevent/_socket3.py", line 55, in _dummy
raise OSError(EBADF, 'Bad file descriptor')
OSError: [Errno 9] Bad file descriptor
...
- for u2004s01
kubectl delete -f ui/manifests/
- read the article Create a Postgres cluster
- now we will use the manifests/minimal-postgres-manifest.yaml without changes
- for u2004s01
kubectl apply -f- <<EOF
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "14"
EOF
- we got the STATUS == CreateFailed
yury@u2004s01:~/postgres-operator$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
acid-minimal-cluster-0 1/1 Running 0 5m45s 10.32.105.3 u2004s03 <none> <none>
acid-minimal-cluster-1 1/1 Running 0 4m11s 10.32.121.131 u2004s04 <none> <none>
postgres-operator-849dddc998-gbhcg 1/1 Running 0 109m 10.32.121.129 u2004s04 <none> <none>
yury@u2004s01:~/postgres-operator$ kubectl get postgresql
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 1Gi 7m20s CreateFailed
yury@u2004s01:~/postgres-operator$ kubectl get svc -l application=spilo -L spilo-role
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SPILO-ROLE
acid-minimal-cluster ClusterIP 10.110.78.245 <none> 5432/TCP 8m4s master
acid-minimal-cluster-config ClusterIP None <none> <none> 6m26s
acid-minimal-cluster-repl ClusterIP 10.109.123.102 <none> 5432/TCP 8m4s replica
- first, we delete the cluster
- for u2004s01
kubectl delete -f- <<EOF
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "14"
EOF
- second, we try to create the cluster with volume.size=2Gi
- for u2004s01
kubectl create -f- <<EOF
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 2Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "14"
EOF
- Here is the result
- we got the STATUS == CreateFailed
yury@u2004s01:~/postgres-operator$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
acid-minimal-cluster-0 1/1 Running 0 33s 10.32.105.6 u2004s03 <none> <none>
acid-minimal-cluster-1 1/1 Running 0 20s 10.32.121.134 u2004s04 <none> <none>
postgres-operator-849dddc998-gbhcg 1/1 Running 0 115m 10.32.121.129 u2004s04 <none> <none>
yury@u2004s01:~/postgres-operator$ kubectl get postgresql
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 2Gi 9m3s CreateFailed
yury@u2004s01:~/postgres-operator$ kubectl get postgresql
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 2Gi 32m CreateFailed
Click to show the log of acid-minimal-cluster-0 : Containers : postgres : Logs
2022-01-23 20:52:53,076 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2022-01-23 20:52:55,091 - bootstrapping - INFO - Could not connect to 169.254.169.254, assuming local Docker setup
2022-01-23 20:52:55,094 - bootstrapping - INFO - No meta-data available for this provider
2022-01-23 20:52:55,095 - bootstrapping - INFO - Looks like your running local
2022-01-23 20:52:55,160 - bootstrapping - INFO - Configuring pam-oauth2
2022-01-23 20:52:55,161 - bootstrapping - INFO - Writing to file /etc/pam.d/postgresql
2022-01-23 20:52:55,161 - bootstrapping - INFO - Configuring certificate
2022-01-23 20:52:55,161 - bootstrapping - INFO - Generating ssl self-signed certificate
2022-01-23 20:52:55,414 - bootstrapping - INFO - Configuring wal-e
2022-01-23 20:52:55,414 - bootstrapping - INFO - Configuring standby-cluster
2022-01-23 20:52:55,414 - bootstrapping - INFO - Configuring pgbouncer
2022-01-23 20:52:55,415 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2022-01-23 20:52:55,415 - bootstrapping - INFO - Configuring crontab
2022-01-23 20:52:55,415 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2022-01-23 20:52:55,416 - bootstrapping - INFO - Configuring bootstrap
2022-01-23 20:52:55,416 - bootstrapping - INFO - Configuring pgqd
2022-01-23 20:52:55,417 - bootstrapping - INFO - Configuring patroni
2022-01-23 20:52:55,441 - bootstrapping - INFO - Writing to file /run/postgres.yml
2022-01-23 20:52:55,450 - bootstrapping - INFO - Configuring log
2022-01-23 20:52:57,092 INFO: Selected new K8s API server endpoint https://192.168.100.61:6443
2022-01-23 20:52:57,143 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-01-23 20:52:57,160 INFO: Lock owner: None; I am acid-minimal-cluster-0
2022-01-23 20:52:57,391 INFO: trying to bootstrap a new cluster
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /home/postgres/pgdata/pgroot/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/usr/lib/postgresql/14/bin/pg_ctl -D /home/postgres/pgdata/pgroot/data -l logfile start
2022-01-23 20:53:01,148 INFO: postmaster pid=80
/var/run/postgresql:5432 - no response
2022-01-23 20:53:01 UTC [80]: [1-1] 61edc02d.50 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2022-01-23 20:53:01 UTC [80]: [2-1] 61edc02d.50 0 LOG: pg_stat_kcache.linux_hz is set to 250000
2022-01-23 20:53:01 UTC [80]: [3-1] 61edc02d.50 0 LOG: redirecting log output to logging collector process
2022-01-23 20:53:01 UTC [80]: [4-1] 61edc02d.50 0 HINT: Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - accepting connections
/var/run/postgresql:5432 - accepting connections
2022-01-23 20:53:02,216 INFO: establishing a new patroni connection to the postgres cluster
2022-01-23 20:53:02,282 INFO: running post_bootstrap
DO
DO
DO
CREATE EXTENSION
NOTICE: version "1.1" of extension "pg_auth_mon" is already installed
ALTER EXTENSION
GRANT
CREATE EXTENSION
NOTICE: version "1.4" of extension "pg_cron" is already installed
ALTER EXTENSION
ALTER POLICY
REVOKE
GRANT
GRANT
ERROR: cannot change name of input parameter "job_name"
HINT: Use DROP FUNCTION cron.schedule(text,text,text) first.
REVOKE
GRANT
REVOKE
GRANT
REVOKE
GRANT
GRANT
CREATE EXTENSION
DO
CREATE TABLE
GRANT
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
RESET
SET
NOTICE: schema "zmon_utils" does not exist, skipping
DROP SCHEMA
DO
NOTICE: language "plpythonu" does not exist, skipping
DROP LANGUAGE
NOTICE: function plpython_call_handler() does not exist, skipping
DROP FUNCTION
NOTICE: function plpython_inline_handler(internal) does not exist, skipping
DROP FUNCTION
NOTICE: function plpython_validator(oid) does not exist, skipping
DROP FUNCTION
CREATE SCHEMA
GRANT
SET
CREATE TYPE
CREATE FUNCTION
CREATE FUNCTION
GRANT
You are now connected to database "postgres" as user "postgres".
CREATE SCHEMA
GRANT
SET
CREATE FUNCTION
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
GRANT
RESET
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
NOTICE: version "3.0" of extension "set_user" is already installed
ALTER EXTENSION
GRANT
GRANT
GRANT
CREATE SCHEMA
GRANT
GRANT
SET
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
RESET
You are now connected to database "template1" as user "postgres".
CREATE SCHEMA
GRANT
SET
CREATE FUNCTION
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
GRANT
RESET
CREATE EXTENSION
CREATE EXTENSION
CREATE EXTENSION
NOTICE: version "3.0" of extension "set_user" is already installed
ALTER EXTENSION
GRANT
GRANT
GRANT
CREATE SCHEMA
GRANT
GRANT
SET
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
RESET
2022-01-23 20:53:10,277 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2022-01-23 20:53:10,541 INFO: initialized a new cluster
2022-01-23 20:53:22,031 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:53:23,829 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:53:34,336 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:53:44,483 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:53:54,292 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:53:56.590 35 LOG Starting pgqd 3.3
2022-01-23 20:53:56.591 35 LOG auto-detecting dbs ...
2022-01-23 20:54:04,418 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:54:14,327 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 20:54:24,404 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
...
2022-01-23 21:34:27.791 35 LOG {ticks: 0, maint: 0, retry: 0}
2022-01-23 21:34:34,218 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 21:34:44,271 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
2022-01-23 21:34:54,220 INFO: no action. I am (acid-minimal-cluster-0) the leader with the lock
Click to show the log of acid-minimal-cluster-1 : Containers : postgres : Logs
2022-01-23 20:53:05,517 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2022-01-23 20:53:07,528 - bootstrapping - INFO - Could not connect to 169.254.169.254, assuming local Docker setup
2022-01-23 20:53:07,531 - bootstrapping - INFO - No meta-data available for this provider
2022-01-23 20:53:07,532 - bootstrapping - INFO - Looks like your running local
2022-01-23 20:53:07,639 - bootstrapping - INFO - Configuring log
2022-01-23 20:53:07,640 - bootstrapping - INFO - Configuring standby-cluster
2022-01-23 20:53:07,640 - bootstrapping - INFO - Configuring pgqd
2022-01-23 20:53:07,641 - bootstrapping - INFO - Configuring wal-e
2022-01-23 20:53:07,641 - bootstrapping - INFO - Configuring crontab
2022-01-23 20:53:07,642 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2022-01-23 20:53:07,643 - bootstrapping - INFO - Configuring pam-oauth2
2022-01-23 20:53:07,644 - bootstrapping - INFO - Writing to file /etc/pam.d/postgresql
2022-01-23 20:53:07,645 - bootstrapping - INFO - Configuring bootstrap
2022-01-23 20:53:07,645 - bootstrapping - INFO - Configuring patroni
2022-01-23 20:53:07,682 - bootstrapping - INFO - Writing to file /run/postgres.yml
2022-01-23 20:53:07,690 - bootstrapping - INFO - Configuring pgbouncer
2022-01-23 20:53:07,691 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2022-01-23 20:53:07,691 - bootstrapping - INFO - Configuring certificate
2022-01-23 20:53:07,691 - bootstrapping - INFO - Generating ssl self-signed certificate
2022-01-23 20:53:08,841 INFO: Selected new K8s API server endpoint https://192.168.100.61:6443
2022-01-23 20:53:08,897 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-01-23 20:53:08,903 INFO: Lock owner: None; I am acid-minimal-cluster-1
2022-01-23 20:53:09,103 INFO: waiting for leader to bootstrap
2022-01-23 20:53:10,424 INFO: Lock owner: acid-minimal-cluster-0; I am acid-minimal-cluster-1
2022-01-23 20:53:10,428 INFO: trying to bootstrap from leader 'acid-minimal-cluster-0'
2022-01-23 20:53:10,431 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-01-23 20:53:20,867 INFO: Lock owner: acid-minimal-cluster-0; I am acid-minimal-cluster-1
2022-01-23 20:53:21,035 INFO: bootstrap from leader 'acid-minimal-cluster-0' in progress
1024+0 records in
1024+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 0.0411527 s, 408 MB/s
2022-01-23 20:53:23,684 INFO: Lock owner: acid-minimal-cluster-0; I am acid-minimal-cluster-1
2022-01-23 20:53:23,685 INFO: bootstrap from leader 'acid-minimal-cluster-0' in progress
NOTICE: all required WAL segments have been archived
2022-01-23 20:53:25,265 INFO: replica has been created using basebackup_fast_xlog
2022-01-23 20:53:25,267 INFO: bootstrapped from leader 'acid-minimal-cluster-0'
2022-01-23 20:53:25,820 INFO: postmaster pid=94
/var/run/postgresql:5432 - no response
2022-01-23 20:53:25 UTC [94]: [1-1] 61edc045.5e 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2022-01-23 20:53:25 UTC [94]: [2-1] 61edc045.5e 0 LOG: pg_stat_kcache.linux_hz is set to 333333
2022-01-23 20:53:25 UTC [94]: [3-1] 61edc045.5e 0 LOG: redirecting log output to logging collector process
2022-01-23 20:53:25 UTC [94]: [4-1] 61edc045.5e 0 HINT: Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - rejecting connections
/var/run/postgresql:5432 - accepting connections
2022-01-23 20:53:27,920 INFO: Lock owner: acid-minimal-cluster-0; I am acid-minimal-cluster-1
2022-01-23 20:53:27,921 INFO: establishing a new patroni connection to the postgres cluster
2022-01-23 20:53:27,970 INFO: no action. I am a secondary (acid-minimal-cluster-1) and following a leader (acid-minimal-cluster-0)
2022-01-23 20:53:34,386 INFO: no action. I am a secondary (acid-minimal-cluster-1) and following a leader (acid-minimal-cluster-0)
...
2022-01-23 21:42:34,454 INFO: no action. I am a secondary (acid-minimal-cluster-1) and following a leader (acid-minimal-cluster-0)
2022-01-23 21:42:44,312 INFO: no action. I am a secondary (acid-minimal-cluster-1) and following a leader (acid-minimal-cluster-0)
- after a while we got STATUS == SyncFailed
yury@u2004s01:~/postgres-operator$ kubectl get postgresql
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 2Gi 58m SyncFailed
- our cluster has been deployed onto the nodes as follows
- u2004s03 (acid-minimal-cluster-0) - acid-minimal-cluster-0 had the master-role
- u2004s04 (acid-minimal-cluster-1) - acid-minimal-cluster-1 had the replica-role
- the u2004s03 machine has been shut down with the
sudo poweroff
-command - after a while the u2004s03 machine has been started again
- Here is the expected result
- acid-minimal-cluster-1 must become the master:
yury@u2004s01:~$ kubectl get pods -l application=spilo -L spilo-role
NAME READY STATUS RESTARTS AGE SPILO-ROLE
acid-minimal-cluster-0 1/1 Running 0 11h replica
acid-minimal-cluster-1 1/1 Running 0 12h master
- for u2004s01
yury@u2004s01:~$ kubectl get svc -l application=spilo -L spilo-role
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SPILO-ROLE
acid-minimal-cluster ClusterIP 10.104.184.254 <none> 5432/TCP 12h master
acid-minimal-cluster-config ClusterIP None <none> <none> 12h
acid-minimal-cluster-repl ClusterIP 10.105.27.187 <none> 5432/TCP 12h replica
- read the article Use taints, tolerations and node affinity for dedicated PostgreSQL nodes
- first, we delete the cluster
- for u2004s01
Click to show deletion script
kubectl delete -f- <<EOF
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 2Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "14"
EOF
- second, we try to create the cluster with nodeAffinity
- Note: the postgresql manifest does not use the usual affinity.nodeAffinity chain of the Pod spec. Instead, nodeAffinity is specified directly under spec.
- for u2004s01
kubectl apply -f- <<EOF
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 2Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "14"
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - u2004s02
          - u2004s03
EOF
Click to show the result
yury@u2004s01:~$ kubectl get pods -n default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
acid-minimal-cluster-0 1/1 Running 0 13m 10.32.105.14 u2004s03 <none> <none>
acid-minimal-cluster-1 1/1 Running 0 13m 10.32.27.195 u2004s02 <none> <none>
postgres-operator-849dddc998-gbhcg 1/1 Running 0 15h 10.32.121.129 u2004s04 <none> <none>
yury@u2004s01:~$ kubectl get pods -l application=spilo -L spilo-role -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES SPILO-ROLE
acid-minimal-cluster-0 1/1 Running 0 15m 10.32.105.14 u2004s03 <none> <none> master
acid-minimal-cluster-1 1/1 Running 0 15m 10.32.27.195 u2004s02 <none> <none> replica
yury@u2004s01:~$ kubectl get svc -l application=spilo -L spilo-role -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR SPILO-ROLE
acid-minimal-cluster ClusterIP 10.104.30.236 <none> 5432/TCP 16m <none> master
acid-minimal-cluster-config ClusterIP None <none> <none> 16m <none>
acid-minimal-cluster-repl ClusterIP 10.101.194.33 <none> 5432/TCP 16m application=spilo,cluster-name=acid-minimal-cluster,spilo-role=replica replica
yury@u2004s01:~$ kubectl get postgresql -o wide
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 2Gi 17m CreateFailed
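The pods landed on u2004s02 and u2004s03 only, so the affinity rule worked; the values listed under matchExpressions can be cross-checked against the actual node labels:

```shell
# Show the kubernetes.io/hostname label the nodeAffinity rule matches against
kubectl get nodes -L kubernetes.io/hostname
```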
- for u2004s03
- we do not use
sudo reboot
. Instead, we run
sudo poweroff
and wait for a while after the virtual machine shuts down. Then we turn on u2004s03 again.
- Cluster works as expected
- acid-minimal-cluster-1 now has master-role
Click to show the result
yury@u2004s01:~$ kubectl get pods -l application=spilo -L spilo-role -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES SPILO-ROLE
acid-minimal-cluster-0 1/1 Running 0 4m18s 10.32.105.16 u2004s03 <none> <none> replica
acid-minimal-cluster-1 1/1 Running 0 40m 10.32.27.195 u2004s02 <none> <none> master
- acid-minimal-cluster should be used for SQL-write operations
- acid-minimal-cluster-repl should be used for SQL-select operations
yury@u2004s01:~$ kubectl get svc -l application=spilo -L spilo-role
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SPILO-ROLE
acid-minimal-cluster ClusterIP 10.104.30.236 <none> 5432/TCP 48m master
acid-minimal-cluster-config ClusterIP None <none> <none> 47m
acid-minimal-cluster-repl ClusterIP 10.101.194.33 <none> 5432/TCP 48m replica
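Applications inside the cluster would connect through these services; the operator stores the generated role credentials in Kubernetes secrets. A hedged sketch, assuming the secret name follows the operator's documented pattern `<user>.<cluster>.credentials.postgresql.acid.zalan.do`:

```shell
# Decode the auto-generated password of the "postgres" system user
kubectl get secret postgres.acid-minimal-cluster.credentials.postgresql.acid.zalan.do \
  -o 'jsonpath={.data.password}' | base64 -d
```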
- read the article Connect to the Postgres cluster via psql
- instead, we log in to the master pod
kubectl exec --stdin --tty acid-minimal-cluster-1 -- /bin/bash
psql -U postgres
\l
SELECT datname FROM pg_database;
create database mydb;
\quit
exit
- Now we log in to the replica pod
kubectl exec --stdin --tty acid-minimal-cluster-0 -- /bin/bash
psql -U postgres
SELECT datname FROM pg_database;
\quit
exit
Click to show the responses
yury@u2004s01:~$ kubectl exec --stdin --tty acid-minimal-cluster-0 -- /bin/bash
____ _ _
/ ___| _ __ (_) | ___
\___ \| '_ \| | |/ _ \
___) | |_) | | | (_) |
|____/| .__/|_|_|\___/
|_|
This container is managed by runit, when stopping/starting services use sv
Examples:
sv stop cron
sv restart patroni
Current status: (sv status /etc/service/*)
run: /etc/service/patroni: (pid 27) 4047s
run: /etc/service/pgqd: (pid 28) 4047s
root@acid-minimal-cluster-0:/home/postgres# psql -U postgres
psql (14.0 (Ubuntu 14.0-1.pgdg18.04+1))
Type "help" for help.
postgres=# SELECT datname FROM pg_database;
datname
-----------
postgres
mydb
template1
template0
(4 rows)
postgres=# \quit
root@acid-minimal-cluster-0:/home/postgres# exit
exit
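The interactive sessions above can be condensed into one-shot checks; a sketch (pod names as above; `pg_is_in_recovery()` returns `t` on a streaming replica and `f` on the master):

```shell
# -A: unaligned output, -t: tuples only, -c: run a single command
kubectl exec acid-minimal-cluster-1 -- psql -U postgres -Atc 'SELECT pg_is_in_recovery();'  # expected: f (master)
kubectl exec acid-minimal-cluster-0 -- psql -U postgres -Atc 'SELECT pg_is_in_recovery();'  # expected: t (replica)
```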
- we tested replication and it works. But why is the STATUS == SyncFailed?
yury@u2004s01:~$ kubectl get postgresql -o wide
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
acid-minimal-cluster acid 14 2 2Gi 110m SyncFailed
Click to show the fragment of the operator log
time="2022-01-24T12:13:22Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:13:37Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:13:52Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:14:07Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:14:22Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:14:37Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:14:52Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:15:07Z" level=warning msg="could not connect to Postgres database: dial tcp: lookup acid-minimal-cluster.default.svc.cluster.local on 10.96.0.10:53: no such host" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:15:07Z" level=warning msg="error while syncing cluster state: could not sync roles: could not init db connection: could not init db connection: still failing after 8 retries" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2022-01-24T12:15:07Z" level=error msg="could not sync cluster: could not sync roles: could not init db connection: could not init db connection: still failing after 8 retries" cluster-name=default/acid-minimal-cluster pkg=controller worker=0
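The repeated `no such host` warnings point at in-cluster DNS rather than at Postgres itself; a hedged troubleshooting sketch (service name and namespace taken from the log above):

```shell
# Try to resolve the master service name from inside the cluster
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.34 -- \
  nslookup acid-minimal-cluster.default.svc.cluster.local
# Check the health of the cluster DNS pods
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
```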
- read the article manifest reference