Updating staging data by restoring a backup - bennettoxford/openprescribing GitHub Wiki

Note: this process takes about 5 hours to run and the backup system takes a snapshot of the database cluster just before 11pm UTC every night. You don't really want the two processes to overlap, so you need to be careful about timings here.

1. Download backup file

First, go to the disk partition with lots of space and download yesterday's backup (amending the date below as appropriate):

cd /mnt/volume-fra1-02/db_backups/
gsutil cp "gs://ebmdatalab_db_backups/prescribing.2019_02_12.sql.gz" data-for-staging.sql.gz

This grabs a file of around 15GB but should only take a few minutes over the server's connection.

2. Create database to restore into

psql -U prescribing -c 'CREATE DATABASE staging_new;'

3. Double check that we're restoring into the correct database

The backup file comes with commands which automatically create and connect to the production database. We don't want to do this so we need to remove these commands. The below grep expression should do that, but before running anything we want to double check we're doing the right thing. So run the command below and inspect the first few stanzas of SQL:

zcat data-for-staging.sql.gz \
 | grep -vP '^(\\connect|CREATE DATABASE|ALTER DATABASE)' \
 | less

Once you get to the section where specific tables are being created you ought to be in safe territory, but for everything before this make sure you read and understand what it's doing. If you're unsure, stop!

4. Restore the data into the new database

Assuming the above looks good, we can proceed with the restore (in a screen/tmux session obviously). This takes about 5 hours.

screen -S database_restore
time zcat data-for-staging.sql.gz \
 | grep -vP '^(\\connect|CREATE DATABASE|ALTER DATABASE)' \
 | psql -U prescribing -d staging_new

5. Delete all user related data

We don't want data relating to user signups in our staging database. By emptying the user table with a CASCADE we automatically delete data from all the related tables as well.

psql -U prescribing staging_new -c 'TRUNCATE auth_user CASCADE;'

6. Swap out the old and new staging databases

Because Postgres won't let you rename the current database we need to connect to a database which is neither the old or new staging databases:

psql -U prescribing postgres

We then swap the databases:

BEGIN;
ALTER DATABASE prescribing_staging RENAME TO staging_old;
ALTER DATABASE staging_new RENAME TO prescribing_staging;
COMMIT;

7. Delete the old staging data and backup file

Assuming everything looks good with staging we can now drop the old staging database

DROP DATABASE staging_old;

And remove the old backup file:

rm data-for-staging.sql.gz