How to Restore From Backups - QualitativeDataRepository/TechnicalTeam GitHub Wiki


This page provides step-by-step instructions on how to restore application data from backups.


Available Backups

name | script | log file | includes | host
Drupal data | /opt/backup_scripts/drupal-data | /var/log/backup-drupal-data.log | /var/www/source_files /var/www/thumbs /var/www/drupal_data | qdr-prod
Dataverse files | /usr/bin/duply dataverse-files pre+bkp_post_purge --force | /var/log/backup-dataverse-files.log | /srv/glassfish/dataverse | qdr-prod
LDAP | /opt/backup_scripts/ldap | /var/log/backup-ldap.log | /etc/ldap /var/lib/ldap | qdr-prod
postgres | /opt/backup_scripts/postgres | /var/log/backup-postgres.log | full database dump | qdr-prod
MySQL | /opt/backup_scripts/mysql | /var/log/backup-mysql.log | drupal and mysql database dumps | qdr-prod
SOLR | /opt/backup_scripts/solr | /var/log/backup-solr.log | /opt/solr/* | qdr-prod
all (offsite) | /root/backup.sh | none | qdr-backups S3 bucket | 128.230.189.161 (QDR DR host)
dataverse (offsite) | /root/backup.sh | none | qdr-dataverse S3 bucket | 128.230.189.161 (QDR DR host)
full instance snapshot | AWS Backup | none | entire qdr-prod instance | AWS

Create On-Demand Backup

Before starting a disaster recovery exercise, we need to create a backup of Prod.

  1. In the AWS console, under Resources, navigate to 'Instances (running)'
  2. In the list of instances, check the box next to 'qdr-prod'
  3. Copy the instance ID (it is displayed below the list and looks like 'i-[character string]')
  4. Navigate back to the console panel
  5. Select AWS Backup
  6. From the left-hand panel, select 'Protected Resources'
  7. Click the 'Create on-demand backup' button in the upper right-hand corner (big orange button)

In the Create on-demand backup panel

  1. Select 'EC2' from the Resource Type dropdown
  2. Enter the qdr-prod instance ID that you copied in step 3 above
  3. Backup window = Create backup now
  4. Retention period = Days, 12
  5. Backup vault = production
  6. IAM role = Default role
  7. In the bottom right-hand corner, click the orange button that reads "Create on-demand backup"

The job will run, and a backup should be created in about 1 hour (likely less).

[Image: On-Demand Backup]
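
The console steps above can also be scripted. The following is a minimal sketch using `aws backup start-backup-job`, not a tested procedure for this account: the region, account ID, and instance ID are placeholders, and the IAM role ARN is assumed to be the AWS Backup default service role.

```shell
# Build the EC2 instance ARN that AWS Backup expects.
# Arguments: region, account id, instance id (all placeholders here).
instance_arn() {
  printf 'arn:aws:ec2:%s:%s:instance/%s\n' "$1" "$2" "$3"
}

# Start an on-demand backup into the 'production' vault, kept for 12 days.
# Arguments: instance ARN, IAM role ARN (assumed to be the default role,
# .../role/service-role/AWSBackupDefaultServiceRole).
start_on_demand_backup() {
  aws backup start-backup-job \
    --backup-vault-name production \
    --resource-arn "$1" \
    --iam-role-arn "$2" \
    --lifecycle DeleteAfterDays=12
}

# Example (hypothetical IDs):
#   arn=$(instance_arn us-east-1 123456789012 i-0123456789abcdef0)
#   start_on_demand_backup "$arn" \
#     arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole
```

The command prints a backup job ID, which can be checked with `aws backup describe-backup-job --backup-job-id <id>`.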

Restoring Production from AWS Backup

Steps to start restoration to a new instance

[Image: Restoring from AWS Backup]

Steps to add restored instance to the load balancer

  • Go to AWS EC2 and find the new instance in running instances (it should have no name)
  • Name the newly restored instance after the date of the restoration, in mmddyy form (e.g. qdr-prod-091222)
  • Go to Target Groups
  • Select prod target group
  • Click "Register targets"
  • Select newly restored instance
  • Click "Include as Pending below"
  • After a few moments, the newly restored instance will start receiving production traffic.

[Image: Adding restored instance to load balancer target group]
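
The naming and registration steps can be sketched with the AWS CLI as follows. This is an illustration under assumptions: the instance ID and target group ARN are placeholders, the name follows the qdr-prod-mmddyy pattern from the example above, and `date -d` requires GNU date.

```shell
# Name for a restored instance: qdr-prod-<mmddyy>.
# Optional argument: a date GNU `date -d` understands; defaults to today.
restored_name() {
  date -d "${1:-today}" +qdr-prod-%m%d%y
}

# Tag the restored instance with its name, then register it with a target group.
# Arguments: instance id, target group ARN (both placeholders).
register_restored_instance() {
  aws ec2 create-tags --resources "$1" \
    --tags "Key=Name,Value=$(restored_name)" &&
  aws elbv2 register-targets --target-group-arn "$2" --targets "Id=$1"
}

# Example (hypothetical values):
#   register_restored_instance i-0123456789abcdef0 "$PROD_TG_ARN"
```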

Turn off old instance

  • Before turning off the old instance, make sure that it has been removed from the prod2 and prod-annorep target groups
  • To turn off the instance, ssh to it and run sudo shutdown -hf now
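
Before shutting down, the target group membership can be checked from the command line. A minimal sketch, assuming placeholder target group ARNs and instance ID:

```shell
# Succeeds if the instance id (argument 1) appears in the whitespace-separated
# list of target ids read from stdin.
is_registered() {
  tr -s ' \t' '\n\n' | grep -Fxq "$1"
}

# List the target ids registered in a target group (ARN is a placeholder).
target_ids() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].Target.Id' --output text
}

# Example (hypothetical values):
#   if target_ids "$PROD2_TG_ARN" | is_registered i-0123456789abcdef0; then
#     echo "still registered; deregister before shutting down" >&2
#   fi
```

If the old instance is still listed, `aws elbv2 deregister-targets --target-group-arn <arn> --targets Id=<instance-id>` removes it.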

Restoring Individual Backups

Preparation - do this before you attempt any restoration

  • Make sure you have at least 20 GB of free disk space available before attempting to download and extract backups
df -h /backup/
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/vg2-lv0   98G   14G   84G  14% /backup
  • Create a temporary directory in /backup
mkdir -p /backup/restore-drill
cd /backup/restore-drill
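
The free-space check above can be automated so a drill stops early when the disk is too full. A small sketch; the function name is illustrative and the 20 GB threshold comes from the step above:

```shell
# Succeeds if the filesystem holding PATH has at least MIN_GB gigabytes free.
has_free_space() {
  path=$1 min_gb=$2
  # -P forces one output line per filesystem; column 4 is available KiB.
  avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example: refuse to continue when /backup has less than 20 GB free.
#   has_free_space /backup 20 || { echo "not enough space in /backup" >&2; exit 1; }
```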

Drupal data

  • script: /opt/backup_scripts/drupal-data
  • log file: /var/log/backup-drupal-data.log
  • includes: /var/www/source_files /var/www/thumbs /var/www/drupal_data
  • host: qdr-prod

Restoration steps

  • Find available backups in S3
aws s3 ls s3://qdr-backups/  | grep drupal-data
2020-12-04 06:06:23 7314702162 drupal-data-2020-12-04-06-00-01-UTC.tgz
2020-12-05 06:06:17 7314702162 drupal-data-2020-12-05-06-00-01-UTC.tgz
2020-12-06 06:06:15 7314702162 drupal-data-2020-12-06-06-00-01-UTC.tgz
2020-12-07 06:06:19 7314702162 drupal-data-2020-12-07-06-00-01-UTC.tgz
2020-12-08 06:06:24 7314702162 drupal-data-2020-12-08-06-00-01-UTC.tgz
  • Download the desired file
aws s3 cp s3://qdr-backups/drupal-data-2020-12-04-06-00-01-UTC.tgz .
  • Extract
tar xzf drupal-data-2020-12-04-06-00-01-UTC.tgz
  • At this point your working directory should contain the var directory. Run ls -l to verify.
ls -l
total 7143280
drwxr-xr-x 3 root root       4096 Dec  9 01:32 ./
drwxr-xr-x 8 root root       4096 Dec  9 01:30 ../
-rw-r--r-- 1 root root 7314702162 Dec  4 06:06 drupal-data-2020-12-04-06-00-01-UTC.tgz
drwxr-xr-x 3 root root       4096 Dec  9 01:32 var/
  • Restore what is needed using rsync
rsync -Pa var/www/drupal_data/ /var/www/drupal_data/
rsync -Pa var/www/source_files/ /var/www/source_files
rsync -Pa var/www/thumbs/ /var/www/thumbs/
  • Cleanup
cd /backup
rm -fR restore-drill/
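
When the most recent backup is wanted, the file name can be picked from the listing instead of copied by hand. A minimal sketch; `latest_backup` is a hypothetical helper that relies on the sortable UTC timestamp in the file names shown above:

```shell
# Print the newest object name for a backup prefix, reading `aws s3 ls`
# output on stdin. The prefix must be followed by "-<digit>", so a prefix
# such as "postgres" does not match "dev-postgres-...".
latest_backup() {
  awk -v p="$1" '$4 ~ ("^" p "-[0-9]") { print $4 }' | sort | tail -n 1
}

# Example:
#   file=$(aws s3 ls s3://qdr-backups/ | latest_backup drupal-data)
#   aws s3 cp "s3://qdr-backups/$file" .
```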

Dataverse files

  • script: /usr/bin/duply dataverse-files pre+bkp_post_purge --force
  • log file: /var/log/backup-dataverse-files.log
  • includes: /srv/glassfish/dataverse
  • host: qdr-prod

Restoration steps

  • Restore from S3 to disk
duply dataverse-files restore /backup/restore-drill
  • Extract
tar -xf dataverse.tar
  • Copy what is needed using rsync
rsync -Pa srv/glassfish/dataverse/ /srv/glassfish/dataverse
  • Cleanup
cd /backup
rm -fR restore-drill/
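
Because rsync overwrites live files, it can help to preview a restore first. The sketch below wraps the rsync call so it defaults to a dry run (`-n`); `sync_tree` and the `RSYNC` override are illustrative names, not part of the existing backup scripts.

```shell
# Command used for syncing; can be overridden, e.g. with a stub for testing.
RSYNC="${RSYNC:-rsync}"

# Sync the contents of SRC into DST. Defaults to a dry run (-n);
# pass --apply as the third argument to copy for real.
sync_tree() {
  flags=-Pan
  [ "${3:-}" = "--apply" ] && flags=-Pa
  "$RSYNC" "$flags" "${1%/}/" "${2%/}/"
}

# Example:
#   sync_tree srv/glassfish/dataverse /srv/glassfish/dataverse          # preview
#   sync_tree srv/glassfish/dataverse /srv/glassfish/dataverse --apply  # restore
```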

LDAP

  • script: /opt/backup_scripts/ldap
  • log file: /var/log/backup-ldap.log
  • includes: /etc/ldap /var/lib/ldap
  • host: qdr-prod

Restoration steps

  • List available files
aws s3 ls s3://qdr-backups | grep ldap
2020-12-04 06:00:05    3880605 ldap-2020-12-04-06-00-01-UTC.tgz
2020-12-05 06:00:05    3878541 ldap-2020-12-05-06-00-01-UTC.tgz
2020-12-06 06:00:05    3879433 ldap-2020-12-06-06-00-01-UTC.tgz
2020-12-07 06:00:06    3885071 ldap-2020-12-07-06-00-01-UTC.tgz
2020-12-08 06:00:05    3897521 ldap-2020-12-08-06-00-01-UTC.tgz
  • Download
aws s3 cp s3://qdr-backups/ldap-2020-12-04-06-00-01-UTC.tgz  .
download: s3://qdr-backups/ldap-2020-12-04-06-00-01-UTC.tgz to ./ldap-2020-12-04-06-00-01-UTC.tgz
  • Extract
tar xzf ldap-2020-12-04-06-00-01-UTC.tgz
  • At this point your working directory contains etc and var directories
ll
total 3808
drwx------ 4 root root    4096 Dec  9 01:55 ./
drwxr-xr-x 8 root root    4096 Dec  9 01:41 ../
drwxr-xr-x 3 root root    4096 Dec  9 01:55 etc/
-rw-r--r-- 1 root root 3880605 Dec  4 06:00 ldap-2020-12-04-06-00-01-UTC.tgz
drwxr-xr-x 3 root root    4096 Dec  9 01:55 var/
  • Stop SLAPD
service slapd stop
  • Restore files
rsync -Pav var/lib/ldap/ /var/lib/ldap
  • Restart SLAPD
service slapd start
  • Cleanup
cd /backup
rm -fR restore-drill/
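
Since slapd is stopped during the restore, it is worth confirming that the archive holds the expected directory before stopping the service. A minimal sketch; `archive_has_path` is a hypothetical helper:

```shell
# Succeeds if the gzip-compressed tar archive (argument 1) contains entries
# under the given directory (argument 2), with or without a leading "./".
archive_has_path() {
  tar tzf "$1" | grep -E "^(\./)?${2%/}/" > /dev/null
}

# Example: check the archive before stopping slapd.
#   archive_has_path ldap-2020-12-04-06-00-01-UTC.tgz var/lib/ldap || exit 1
```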

Postgres

  • script: /opt/backup_scripts/postgres
  • log file: /var/log/backup-postgres.log
  • includes: full database dump
  • host: qdr-prod, qdr-dev

Restoration steps

  • List available files
aws s3 ls s3://qdr-backups | grep postg
2020-12-09 02:27:02    4044209 dev-postgres-2020-12-09-02-26-59-UTC.tgz
2020-12-04 06:00:43  129074934 postgres-2020-12-04-06-00-01-UTC.tgz
2020-12-05 06:00:42  129221271 postgres-2020-12-05-06-00-01-UTC.tgz
2020-12-06 06:00:41  129381570 postgres-2020-12-06-06-00-01-UTC.tgz
2020-12-07 06:00:43  129521080 postgres-2020-12-07-06-00-01-UTC.tgz
2020-12-08 06:00:43  129677442 postgres-2020-12-08-06-00-01-UTC.tgz
  • Download
aws s3 cp s3://qdr-backups/postgres-2020-12-04-06-00-01-UTC.tgz .
  • Extract
tar xfz postgres-2020-12-04-06-00-01-UTC.tgz
  • Fix permissions
chown -R postgres:postgres backup/
  • Restore
psql -U dvnuser -d dvndb < backup/postgres/postgres-2020-12-04-06-00-01-UTC.sql
  • Cleanup
cd /backup
rm -fR restore-drill/
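
The restore above overwrites data in dvndb, so taking a safety dump of the current state first gives a way back if the restore goes wrong. A minimal sketch, assuming the same dvnuser and dvndb names used above; the file naming is illustrative and mirrors the timestamp style of the backup files.

```shell
# File name for a safety dump taken before overwriting a database.
# Argument: database name.
safety_dump_name() {
  printf 'pre-restore-%s-%s.sql\n' "$1" "$(date -u +%Y-%m-%d-%H-%M-%S-UTC)"
}

# Dump the current contents of dvndb into the working directory.
snapshot_before_restore() {
  pg_dump -U dvnuser -d dvndb -f "$(safety_dump_name dvndb)"
}

# Example: run inside /backup/restore-drill before the psql restore step.
#   snapshot_before_restore
```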

MySQL

  • script: /opt/backup_scripts/mysql
  • log file: /var/log/backup-mysql.log
  • includes: drupal and mysql database dumps (SQL)
  • host: qdr-prod

Restoration steps

  • List available files
aws s3 ls s3://qdr-backups | grep mysql
2020-12-04 06:00:24   25618817 mysql-2020-12-04-06-00-01-UTC.tgz
2020-12-05 06:00:23   25618814 mysql-2020-12-05-06-00-01-UTC.tgz
2020-12-06 06:00:23   25618801 mysql-2020-12-06-06-00-01-UTC.tgz
2020-12-07 06:00:23   25618837 mysql-2020-12-07-06-00-01-UTC.tgz
2020-12-08 06:00:22   25618833 mysql-2020-12-08-06-00-01-UTC.tgz
  • Download
aws s3 cp s3://qdr-backups/mysql-2020-12-04-06-00-01-UTC.tgz .
download: s3://qdr-backups/mysql-2020-12-04-06-00-01-UTC.tgz to ./mysql-2020-12-04-06-00-01-UTC.tgz
  • Extract
tar xzf mysql-2020-12-04-06-00-01-UTC.tgz
  • SQL dumps are located in the newly created backup directory
ll
total 25032
drwx------ 3 root root     4096 Dec  9 01:59 ./
drwxr-xr-x 8 root root     4096 Dec  9 01:41 ../
drwxr-xr-x 2 root root     4096 Dec  9 01:59 backup/
-rw-r--r-- 1 root root 25618817 Dec  4 06:00 mysql-2020-12-04-06-00-01-UTC.tgz
  • To restore the databases
mysql drupal < backup/mysql-drupal-2020-12-04-06-00-01-UTC.sql
mysql mysql < backup/mysql-mysql-2020-12-04-06-00-01-UTC.sql
  • Cleanup
cd /backup
rm -fR restore-drill/
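
Before loading a dump, a quick check can catch a truncated file. The sketch below assumes the dumps were produced by mysqldump with comments enabled, in which case the last line is a "-- Dump completed" trailer; that is an assumption about /opt/backup_scripts/mysql, not something confirmed here.

```shell
# Succeeds if the file is non-empty and its last line carries mysqldump's
# "-- Dump completed" trailer.
dump_is_complete() {
  [ -s "$1" ] && tail -n 1 "$1" | grep -q 'Dump completed'
}

# Example:
#   dump_is_complete backup/mysql-drupal-2020-12-04-06-00-01-UTC.sql || exit 1
```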

SOLR

  • script: /opt/backup_scripts/solr
  • log file: /var/log/backup-solr.log
  • includes: /opt/solr/*
  • host: qdr-prod

Restoration steps

  • List available files
aws s3 ls s3://qdr-backups | grep solr
2020-12-04 06:01:29  824530809 solr-2020-12-04-06-00-01-UTC.tgz
2020-12-05 06:01:28  823910244 solr-2020-12-05-06-00-01-UTC.tgz
2020-12-06 06:01:26  824068577 solr-2020-12-06-06-00-01-UTC.tgz
2020-12-07 06:01:28  823442222 solr-2020-12-07-06-00-01-UTC.tgz
2020-12-08 06:01:28  824170651 solr-2020-12-08-06-00-01-UTC.tgz
  • Download
aws s3 cp s3://qdr-backups/solr-2020-12-04-06-00-01-UTC.tgz .
  • Extract
tar xfz solr-2020-12-04-06-00-01-UTC.tgz
  • At this point your working directory should contain the opt directory
ll
total 805224
drwx------ 3 root root      4096 Dec  9 02:03 ./
drwxr-xr-x 8 root root      4096 Dec  9 01:41 ../
drwxr-xr-x 4 root root      4096 Dec  9 02:03 opt/
-rw-r--r-- 1 root root 824530809 Dec  4 06:01 solr-2020-12-04-06-00-01-UTC.tgz
  • Stop SOLR
service solr stop
  • Restore using rsync
rsync -Pav opt/solr-7.7.2/ /opt/solr-7.7.2/
  • Restart SOLR
service solr start
  • Cleanup
cd /backup
rm -fR restore-drill/
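
The stop, restore, and restart steps can be wrapped so the service is started again even when the copy fails. This is a sketch only; `with_service_stopped` and the `SERVICE` override are illustrative names.

```shell
# Command used to control services; can be overridden with a stub for testing.
SERVICE="${SERVICE:-service}"

# Stop a service, run a command, then start the service again even if the
# command failed. Returns the command's exit status.
with_service_stopped() {
  svc=$1; shift
  "$SERVICE" "$svc" stop || return 1
  rc=0
  "$@" || rc=$?
  "$SERVICE" "$svc" start || return 1
  return "$rc"
}

# Example:
#   with_service_stopped solr rsync -Pav opt/solr-7.7.2/ /opt/solr-7.7.2/
```

The same wrapper would fit the slapd restore in the LDAP section.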

After taking new prod out of maintenance

  • Restart payara removing /usr/local/payara-5.2021.8/glassfish/domains/dataverse/imq/instances/imqbroker/loc

Check the EC2 tags of newly restored instance

Verify that the instance has the following tags (key-value pairs):

  • Name: qdr-prod (or qdr-prod2, etc)
  • Environment: prod
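
The tag check can also be done from the command line. A minimal sketch using `aws ec2 describe-tags`; the instance ID is a placeholder and `has_tag` is a hypothetical helper that matches a key and value exactly.

```shell
# Print an instance's tags as "Key<TAB>Value" lines (instance id is a placeholder).
instance_tags() {
  aws ec2 describe-tags --filters "Name=resource-id,Values=$1" \
    --query 'Tags[].[Key,Value]' --output text
}

# Succeeds if stdin contains a line with exactly this key and value.
has_tag() {
  awk -F '\t' -v k="$1" -v v="$2" \
    '$1 == k && $2 == v { found = 1 } END { exit !found }'
}

# Example (hypothetical instance id):
#   instance_tags i-0123456789abcdef0 | has_tag Environment prod || echo "tag missing" >&2
```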

Retrieve new IP address

[Image: Where to find Private IP address]

Update DNS

[Image: Edit private DNS record]

AnnoRep Restore

  • Restart the Docker image for AnnoRep, or restart 'AR-Server'
  • Note (JM): the code is in /opt/annorep, and the server is currently started as the counter user with: java -jar ARServer-0.0.2-SNAPSHOT.jar --server.port=11111 --dataverse.url=https://data.qdr.syr.edu > arlog3.txt &
  • The new instance is added to the prod-annorep target group
  • The old instance is removed from the prod-annorep target group
  • Edit /srv/annorep/annorep-env to set the IP address in ARCORE_SERVER_URL to that of the new instance, then restart the AR client as the annorep user:
cd /srv/annorep
DEPLOY_TAG=<8char from commit on desired branch (currently main on prod)> ./deploy
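
The ARCORE_SERVER_URL edit can be scripted. The sketch below assumes annorep-env holds a line of the form ARCORE_SERVER_URL=<scheme>://<IPv4 address>..., which is a guess at the file format and should be checked against the real file first; `set_arcore_ip` is a hypothetical helper.

```shell
# Replace the IPv4 address on the ARCORE_SERVER_URL line of an env file.
# Arguments: env file path, new IP address. Other lines are left untouched.
set_arcore_ip() {
  env_file=$1 new_ip=$2
  tmp=$(mktemp) || return 1
  # Rewrite via a temp file, then overwrite in place so owner and mode are kept.
  sed -E "/^ARCORE_SERVER_URL=/ s#[0-9]{1,3}(\.[0-9]{1,3}){3}#${new_ip}#" \
    "$env_file" > "$tmp" && cat "$tmp" > "$env_file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example (hypothetical IP): run as the annorep user, then redeploy.
#   set_arcore_ip /srv/annorep/annorep-env 10.0.0.9
```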