gcp cloudsql upgrade migration import export pg_dump

When you perform an in-place major version upgrade, Cloud SQL retains your database settings, including your instance name, IP address, explicitly configured flag values, and user data. However, the default value of the system variables might change. For example, the default value of the password_encryption flag in PostgreSQL 13 and earlier is md5. When you upgrade to PostgreSQL 14, the default value of this flag changes to scram-sha-256.
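
For example, a quick way to verify the effective setting after upgrading to PostgreSQL 14+ and to re-hash an existing md5 password is simply to reset it (a minimal sketch; the connection parameters and the app_user role are placeholders):

# Check which hashing method new passwords will use
psql "host=INSTANCE_IP user=postgres dbname=postgres" -c "SHOW password_encryption;"

# Re-setting a password stores it with the current password_encryption method
psql "host=INSTANCE_IP user=postgres dbname=postgres" \
  -c "ALTER ROLE app_user WITH PASSWORD 'NEW_PASSWORD';"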

Upgrade in place

Cloud SQL publishes error logs to projects/PROJECT_ID/logs/cloudsql.googleapis.com%2Fpostgres-upgrade.log.
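
A minimal sketch for pulling these upgrade logs from the command line (PROJECT_ID is a placeholder):

gcloud logging read \
  'logName="projects/PROJECT_ID/logs/cloudsql.googleapis.com%2Fpostgres-upgrade.log"' \
  --project=PROJECT_ID \
  --limit=50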

  • Prepared transactions: the root cause of a major version upgrade failure can be the presence of prepared transactions in the source cluster.

Prepared transactions are part of PostgreSQL's two-phase commit (2PC) mechanism, and they must be resolved (either committed or rolled back) before performing the upgrade. The pg_upgrade utility will fail if there are any unresolved prepared transactions because they can cause data inconsistency during the migration.

Check for Prepared Transactions: you can use the following query to check if there are any unresolved prepared transactions:

SELECT * FROM pg_prepared_xacts;

Resolve Transactions: If prepared transactions are found, you need to either commit or roll them back:

To commit:

COMMIT PREPARED 'transaction_id';

To rollback:

ROLLBACK PREPARED 'transaction_id';

After resolving the prepared transactions, you should be able to proceed with the upgrade process.

  • Unlogged tables with logged indexes: discrepancies in the persistence characteristics between tables and their sequences for generated columns [1].

To mitigate this issue, unlogged sequences were introduced in PostgreSQL 15. To address it before a major version upgrade (MVU), either drop the unlogged tables associated with these generated columns (assuming they can be recreated without data loss) or temporarily convert them to logged (permanent) tables; they can be switched back with ALTER TABLE ... SET UNLOGGED after the upgrade.

ALTER TABLE <TABLE_NAME> SET LOGGED;

You can identify all such tables using the query below:

SELECT
    c.relname AS table_name,
    a.attname AS column_name,
    a.attidentity AS identity_type
FROM
    pg_catalog.pg_class c
    JOIN pg_catalog.pg_attribute a ON a.attrelid = c.oid
    JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE
    a.attidentity IN ('a', 'd')  -- 'a' for GENERATED ALWAYS, 'd' for GENERATED BY DEFAULT
    AND c.relkind = 'r'  -- Only consider ordinary tables
    AND c.relpersistence = 'u'  -- Include unlogged tables
ORDER BY
    c.relname, a.attname;

Check requirements

  1. Check the LC_COLLATE value for the template and postgres databases. The character set for each database must be en_US.UTF8.
SELECT datname, pg_encoding_to_char(encoding) AS encoding, datcollate, datctype 
FROM pg_database 
WHERE datname IN ('template1', 'postgres');
  2. Manage your read replicas. You cannot upgrade the primary instance while the instance is replicating to the read replicas.
SELECT * FROM pg_stat_replication;
SHOW cloudsql.replication;
gcloud sql instances list --filter="masterInstanceName=YOUR_PRIMARY_INSTANCE"
  3. The pglogical extension replication needs to be disabled
SELECT * FROM pg_extension WHERE extname = 'pglogical';
SELECT * FROM pglogical.subscription;
SELECT * FROM pglogical.show_subscription_status();
SELECT * FROM pglogical.node;
SELECT * FROM pglogical.replication_set_table;
SELECT pglogical.alter_subscription_disable(subscription_name := 'SUBSCRIPTION_NAME', immediate := true);
-- drop the remaining replication slots
SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;

After the upgrade, recreate the pglogical subscription:

SELECT pglogical.create_subscription(
    subscription_name := 'test_sub',
    provider_dsn := 'host=primary-ip port=5432 dbname=postgres user=replication_user password=replicapassword'
); 
SELECT * FROM pglogical.show_subscription_status('test_sub');
  4. Composite types may cause incompatibilities between major versions
  • List composite types created by users
SELECT n.nspname AS schema_name, t.typname AS type_name
FROM pg_type t
JOIN pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'c'
AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY schema_name, type_name;
  • Check whether any tables use composite types
SELECT c.relname AS table_name, a.attname AS column_name, t.typname AS composite_type
FROM pg_attribute a
JOIN pg_class c ON a.attrelid = c.oid
JOIN pg_type t ON a.atttypid = t.oid
JOIN pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'c'
AND c.relkind = 'r'  -- Only ordinary tables
AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_name, column_name;
  • Check whether any functions use composite types
SELECT p.proname AS function_name, t.typname AS composite_type, n.nspname AS schema_name
FROM pg_proc p
JOIN pg_type t ON p.prorettype = t.oid
JOIN pg_namespace n ON n.oid = t.typnamespace
WHERE t.typtype = 'c'
AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY function_name;

Other problems:

  • Failures can also occur due to a temporary loss of database connectivity during an out-of-memory (OOM) event.
  • Transient connectivity issues are another possible cause.
  • Some built-in composite types have OIDs that may change between PostgreSQL versions, so proceeding with the upgrade without dropping columns of those types could result in a composite type being applied to a column that is incompatible with the data it contains. User-defined composite type OIDs do not change during an upgrade, so columns using those types do not need to be removed (see the query sketch below).
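
A hedged companion to the queries above: the sketch below looks for columns whose type is a composite type defined in pg_catalog (i.e. a built-in one), since those are the candidates that may need to be dropped before the upgrade. It assumes psql access; the connection string is a placeholder.

psql "host=INSTANCE_IP user=postgres dbname=DATABASE_NAME" <<'SQL'
-- Columns whose type is a composite type owned by pg_catalog (built-in)
SELECT c.relname AS table_name, a.attname AS column_name, t.typname AS composite_type
FROM pg_attribute a
JOIN pg_class c ON a.attrelid = c.oid
JOIN pg_type t ON a.atttypid = t.oid
JOIN pg_namespace tn ON tn.oid = t.typnamespace
WHERE t.typtype = 'c'
  AND tn.nspname = 'pg_catalog'
  AND c.relkind = 'r'
  AND a.attnum > 0
  AND NOT a.attisdropped;
SQL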

Migration using export/import

Using pg_dump pg_dumpall pg_restore

pg_dump output formats usable with pg_restore: custom, directory

pg_dump \
-U USERNAME \
--format=custom \
--no-owner \
--no-acl \
DATABASE_NAME > DATABASE_NAME.dmp

Parallel export

pg_dump \
-U USERNAME \
--format=directory \
--no-owner \
--no-acl \
--jobs=<njobs> \
--file=<directory-to-export> \
DATABASE_NAME
# Generate a table-of-contents file and comment out the EXTENSION entries
# so that extensions are not re-created during the restore
pg_restore \
--list DATABASE_NAME.dmp | sed -E 's/(.* EXTENSION )/; \1/g' > DATABASE_NAME.toc
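
The edited TOC can then be fed back to pg_restore so the commented-out EXTENSION entries are skipped (a sketch; the host, user, and file names are placeholders):

pg_restore \
  -h TARGET_INSTANCE_IP -U USERNAME -d DATABASE_NAME \
  --no-owner --no-acl \
  --use-list=DATABASE_NAME.toc \
  DATABASE_NAME.dmp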

You can import a pg_dump export using gcloud sql import sql - described at https://cloud.google.com/sql/docs/postgres/import-export/import-export-dmp#external-server

Export data to multiple files in parallel from Cloud SQL for PostgreSQL

You can only use the directory output format to export data to multiple files in parallel. To export in parallel, use the -j NUM_CORES flag, where NUM_CORES is the number of cores on the source instance.

Using gcloud sql import/export

The export sql command does not export triggers or stored procedures, but does export views. To export triggers and/or stored procedures, use the pg_dump tool. (This is probably no longer accurate, because functions and procedures were present in the backup.)

With serverless export, Cloud SQL creates a separate, temporary instance to offload the export operation. Offloading the export allows databases on the primary instance to continue to serve queries and perform operations at the usual performance rate. When the data export is complete, the temporary instance is deleted automatically. This can be a good option for a one-time export of a large database. To perform a serverless export, use the Google Cloud Console, gcloud, or the REST API export functions with the offload flag.

gcloud sql export sql INSTANCE_NAME gs://BUCKET_NAME/sqldumpfile.gz \
--database=DATABASE_NAME \
--offload
Flag               Description
--async            Return immediately, without waiting for the operation in progress to complete.
--offload          Reduces strain on the source instance and allows other operations to be performed while the export is in progress.
--parallel         Perform a parallel export.
--threads=THREADS  Specifies the number of threads to use for the parallel export. Use with --parallel.

Export in parallel to multiple files - the number of files equals THREAD_NUMBER

gcloud sql export sql INSTANCE_NAME gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME \
  --offload \
  --parallel \
  --threads=THREAD_NUMBER \
  --database=DATABASE_NAME \
  --table=TABLE_EXPRESSION

The export sql command doesn't export triggers or stored procedures, but does export views. To export triggers or stored procedures, use a single thread for the export; this thread uses the pg_dump tool.

After the export completes, you should have files in a folder in the Cloud Storage bucket in the pg_dump directory format.

  • gcloud sql import sql

    • Only one long-running Cloud SQL import or export operation can run at a time on an instance.
    • Before importing a dump file, all the database users who own objects or were granted permissions on objects in the dumped database must exist in the target database. If they don't, the import operation fails to recreate the objects with the original ownership or permissions. Create the database users before importing (see the user-creation sketch after the flag list below).
    • PostgreSQL imports data in a single transaction. Therefore, if you cancel the import operation, then Cloud SQL doesn't persist data from the import.
gcloud sql import sql INSTANCE_NAME  gs://BUCKET_NAME/IMPORT_FILE_NAME \
--database=DATABASE_NAME
Flag               Description
--async            Return immediately, without waiting for the operation in progress to complete.
--parallel         Perform a parallel import.
--threads=THREADS  Specifies the number of threads to use for the parallel import. Use with --parallel.
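
A minimal sketch for creating the owning users on the target instance before the import (user names and passwords are placeholders):

# via gcloud
gcloud sql users create app_owner \
  --instance=INSTANCE_NAME \
  --password=CHANGE_ME

# or directly with psql on the target instance
psql "host=TARGET_INSTANCE_IP user=postgres dbname=postgres" \
  -c "CREATE ROLE app_owner WITH LOGIN PASSWORD 'CHANGE_ME';"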

For parallel import, the only available format is directory, so export the data using gcloud sql export sql with the --parallel option.

gcloud sql import sql INSTANCE_NAME gs://BUCKET_NAME/BUCKET_PATH/FOLDER_NAME \
  --parallel \
  --threads=THREAD_NUMBER \
  --database=DATABASE_NAME
  • If your instance has only one virtual CPU (vCPU), then you can't import or export multiple files in parallel. The number of vCPUs for your instance can't be smaller than the number of threads that you're using for the import or export operation, and the number of threads must be at least two.
  • The pg_dump utility can't chunk any tables that you export. Therefore, if you have one very large table, then it can become a bottleneck for the speed of the export operation.

Monitoring

$ gcloud sql operations list --instance=<instance_name> --project=<project_name>
NAME                                  TYPE             START                          END                            ERROR           STATUS
bcdeb673-7b4b-410b-9854-75b700000046  IMPORT           2024-12-30T09:59:27.023+00:00  T                              -               RUNNING
$ gcloud sql operations describe bcdeb673-7b4b-410b-9854-75b700000046

Cloud Logging Query

resource.type="cloudsql_database"
resource.labels.database_id="<project_name>:<instance_name>"
protoPayload.methodName="cloudsql.instances.import"
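
The same filter can be run from the command line (project and instance names are placeholders):

gcloud logging read \
  'resource.type="cloudsql_database" AND resource.labels.database_id="PROJECT_NAME:INSTANCE_NAME" AND protoPayload.methodName="cloudsql.instances.import"' \
  --project=PROJECT_NAME \
  --limit=20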

Migration

Migration using logical replication

  • Check for tables without a primary key
SELECT tab.table_schema, tab.table_name
    FROM information_schema.tables tab
    LEFT JOIN information_schema.table_constraints tco
              ON tab.table_schema = tco.table_schema
              AND tab.table_name = tco.table_name
              AND tco.constraint_type = 'PRIMARY KEY'
    WHERE tab.table_type = 'BASE TABLE'
          AND tab.table_schema NOT IN('pg_catalog', 'information_schema', 'pglogical')
          AND tco.constraint_name IS NULL
    ORDER BY table_schema, table_name;
  • Speeding up initialization - disable non-primary-key indexes in the replicated (target) database
UPDATE pg_index SET indisready = FALSE WHERE indexrelid IN (
      SELECT idx.indexrelid FROM pg_index idx
      INNER JOIN pg_class cls ON idx.indexrelid = cls.oid
      INNER JOIN pg_namespace nsp ON cls.relnamespace = nsp.oid
      WHERE nsp.nspname !~ '^pg_'
      AND nsp.nspname NOT IN ('information_schema', 'pglogical')
      AND idx.indisprimary IS FALSE
    );
  • Initialization - Migrate users and structure
pg_dumpall -h localhost -U myuser --globals-only -f pg_users_roles_backup.sql
pg_dump --file=/postgres_20_03_20_06-dump.sql --schema-only --dbname=postgres --username=postgres --host=127.0.0.1 --port=5432

# first make sure that the users and roles match the master - create them with the CREATE USER command or pg_dumpall
# note: pg_restore needs a dump in custom or directory format; for a plain SQL dump use psql -f instead
pg_restore --dbname=postgres --clean --username=postgres --host=127.0.0.1 --port=5454 /postgres10_33_35-dump.sql
  • Enable logical replication
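On Cloud SQL this is done with database flags rather than postgresql.conf; a sketch assuming the cloudsql.logical_decoding and cloudsql.enable_pglogical flags (INSTANCE_NAME is a placeholder; changing these flags restarts the instance):

gcloud sql instances patch INSTANCE_NAME \
  --database-flags=cloudsql.logical_decoding=on,cloudsql.enable_pglogical=on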

  • Wait for initialization to complete

SELECT * FROM pglogical.local_sync_status WHERE NOT sync_status = 'r';
  • After initialization, rebuild the previously disabled indexes
SELECT CONCAT('"', nsp.nspname, '"."', cls.relname, '"') FROM pg_index idx
      INNER JOIN pg_class cls ON idx.indexrelid = cls.oid
      INNER JOIN pg_namespace nsp ON cls.relnamespace = nsp.oid
      WHERE nsp.nspname !~ '^pg_'
      AND nsp.nspname NOT IN ('information_schema', 'pglogical')
      AND idx.indisprimary IS FALSE
      AND idx.indisready IS FALSE;
REINDEX INDEX CONCURRENTLY $index;
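
A sketch combining the two steps above into a loop (assumes psql connectivity to the replica; the connection parameters are placeholders, and index names containing spaces would need extra quoting):

CONN="host=REPLICA_IP user=postgres dbname=DATABASE_NAME"

# List the indexes that were marked not-ready before replication started
INDEXES=$(psql "$CONN" -At -c "
  SELECT quote_ident(nsp.nspname) || '.' || quote_ident(cls.relname)
  FROM pg_index idx
  JOIN pg_class cls ON idx.indexrelid = cls.oid
  JOIN pg_namespace nsp ON cls.relnamespace = nsp.oid
  WHERE nsp.nspname !~ '^pg_'
    AND nsp.nspname NOT IN ('information_schema', 'pglogical')
    AND NOT idx.indisprimary
    AND NOT idx.indisready;")

# Rebuild each of them; REINDEX CONCURRENTLY marks the index as ready again
for index in $INDEXES; do
  psql "$CONN" -c "REINDEX INDEX CONCURRENTLY ${index};"
done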
  • Switch to the new version

Automating export operations

Commands:

gcloud sql operations list \
--instance INSTANCE_NAME \
--limit 10
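
A sketch of a simple automation wrapper: start the export asynchronously, then wait for the operation to complete (instance, bucket, and database names are placeholders; it assumes the --async output is the operation resource, whose name --format='value(name)' extracts):

OPERATION=$(gcloud sql export sql INSTANCE_NAME \
  "gs://BUCKET_NAME/dump-$(date +%Y%m%d).gz" \
  --database=DATABASE_NAME \
  --offload \
  --async \
  --format='value(name)')

gcloud sql operations wait "${OPERATION}" --timeout=unlimited
gcloud sql operations describe "${OPERATION}"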