Backup & restore - digibib/ls.ext GitHub Wiki

There are two databases that are vital to the system's integrity. They contain the "source of truth" for the state of the system: MySQL holds all the patron, item and circulation data, and Fuseki is the master for the knowledge graph representing the catalog and related metadata. Needless to say, backups of these should be performed regularly and stored in a safe place.

All databases in the system use docker volumes to persist data.
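
Exactly which volumes exist depends on the project/network prefix used by your docker-compose setup (the examples below use names like <networkname>_fuseki_data), but you can always list them with:

docker volume ls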

MySQL

Normally, we take backups by simply copying the data volume, but in the case of MySQL we found there is a risk of getting a corrupted backup, probably due to the large number of writes happening at any given time. So, to get a safe snapshot of the DB, use mysqldump:

docker exec -i koha_mysql_slave bash -c \
  'mysqldump -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE \
  --single-transaction --master-data=2 --flush-logs \
  --routines --triggers --events --hex-blob' \
  > /data/dumps/`date +%F`/koha_mysqldump_full.sql
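
Note that the output redirect runs on the host, so the dated dump directory must exist before the command is run, e.g.:

mkdir -p /data/dumps/`date +%F`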

At OPL we run the backup against a read-only slave, so as not to put additional load on the master DB.
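
Before dumping from the slave it can be useful to check that replication has caught up; a quick check, assuming the same container and credentials as above:

docker exec koha_mysql_slave bash -c \
  'mysql -uroot -p$MYSQL_PASSWORD -e "SHOW SLAVE STATUS\G"' | grep Seconds_Behind_Master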

Restore by piping the dump into MySQL:

docker exec -i koha_mysql bash -c 'mysql -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE' < /data/dumps/2017-03-14/koha_mysqldump_full.sql
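
To verify that the dump was imported, you can for instance list the tables in the restored database:

docker exec koha_mysql bash -c 'mysql -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE -e "SHOW TABLES"'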

Fuseki

Backup with

docker run --rm \
  -v <networkname>_fuseki_data:/from \
  -v $(pwd):/to \
  alpine ash -c "cd /from ; tar -cf /to/fusekidata.tar ."
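
To sanity-check the resulting archive, list its contents:

tar -tf fusekidata.tar | head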

Restore with

docker stop fuseki
docker run --rm \
  -v $(pwd):/from \
  -v <networkname>_fuseki_data:/to \
  alpine ash -c "cd /to ; tar -xf /from/fusekidata.tar" 

Replace <networkname> with the name of your docker-network.
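
When the restore is done, start Fuseki again:

docker start fuseki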

Others

Elasticsearch

Elasticsearch is the search index and does not contain any data that cannot be reconstructed from other sources, but performing a full reindex takes several hours, so you might consider taking backups.

Prepare backup area (only needed the first time):

docker exec elasticsearch curl -XPUT 'localhost:9200/_snapshot/search_backup' -d '
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/elasticsearch/data/backup",
    "compress": "true"
  }
}'
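
You can verify that the snapshot repository was registered with:

docker exec elasticsearch curl -XGET 'localhost:9200/_snapshot/search_backup'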

Take snapshot:

docker exec elasticsearch curl -XPUT 'localhost:9200/_snapshot/search_backup/snapshot_1?wait_for_completion=true'
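
The import step below expects a tarball of the backup area. A sketch of how to create one on the source server, mirroring the Fuseki commands above (again, replace <dockernetwork> with the name of your docker-network):

docker run --rm \
  -v <dockernetwork>_elasticsearch_data:/from \
  -v $(pwd):/to \
  alpine ash -c "cd /from/backup ; tar -cf /to/esdata.tar ."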

Import snapshot into the volume (if the snapshot is from a different server)

docker run --rm \
  -v $(pwd):/from \
  -v <dockernetwork>_elasticsearch_data:/to \
  alpine ash -c "cd /to/backup ; tar -xf /from/esdata.tar"

Close index for writes

docker exec elasticsearch curl -XPOST 'localhost:9200/search/_close'

Restore

docker exec elasticsearch curl -XPOST 'localhost:9200/_snapshot/search_backup/snapshot_1/_restore'

Open index for writes again

docker exec elasticsearch curl -XPOST 'localhost:9200/search/_open'
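
A quick way to confirm that the restored index is open and populated is to list the indices:

docker exec elasticsearch curl 'localhost:9200/_cat/indices?v'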

When we deploy upgrades that require a full reindex (due to changes in the Elasticsearch document mappings), we perform the indexing on a staging server and copy the data to our production server using the backup/restore commands shown above.

Koha Zebra index

TODO