Docker Recovery From Full Disk

This emergency recovery was triggered by issue #375.

The tzvolcano portal ran out of disk space. These are the relevant factors:

  • The m3.large instance creates a root disk of 8G. We typically bump this up to 20G for tzvolcano.
  • However, the m3.large instance type had created a root volume of type 'standard' (magnetic). For other instance types, the volumes page normally shows a volume type of 'gp2'.
  • It appears that you can't resize this type of volume in place (see the CLI check sketched below this list).
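
To confirm what kind of root volume an instance actually has, the console volumes page works, or the AWS CLI can be queried. This is a minimal sketch, with vol-xxxxxxxx as a placeholder for the real root volume ID ('standard' is the previous-generation magnetic type, which is presumably why an in-place resize wasn't an option here):

# Show the root volume's type, size, and availability zone (volume ID is a placeholder):
aws ec2 describe-volumes --volume-ids vol-xxxxxxxx \
  --query 'Volumes[].{ID:VolumeId,Type:VolumeType,Size:Size,AZ:AvailabilityZone}' \
  --output table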

I was able to log into tzvolcano, but I couldn't do anything with docker, due to the lack of disk space. To increase the disk space, I did the following from the EC2 console (here is guidance on changing storage types); an equivalent AWS CLI sketch follows the list:

  • Stop the tzvolcano instance.
  • Write down the availability zone of the instance (us-west-2b).
  • Write down the root device name for the instance (/dev/sda1).
  • Make a snapshot of the original volume.
  • Create a new volume from the snapshot. The new volume was 20G, on SSD (gp2) media, in the same availability zone as before.
  • Detach the old volume from the instance.
  • Attach the new volume to the instance, with the same disk device name.
  • Restart the instance.
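
For the record, the same volume swap could be scripted with the AWS CLI. This is a hedged sketch rather than what was actually run; the instance, volume, and snapshot IDs are placeholders:

# IDs below are placeholders for the real instance, old volume, snapshot, and new volume.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
# Snapshot the original root volume and wait for it to finish:
aws ec2 create-snapshot --volume-id vol-OLD --description "tzvolcano root before resize"
aws ec2 wait snapshot-completed --snapshot-ids snap-NEW
# Create a 20G gp2 (SSD) volume from the snapshot, in the same availability zone:
aws ec2 create-volume --snapshot-id snap-NEW --volume-type gp2 --size 20 --availability-zone us-west-2b
aws ec2 wait volume-available --volume-ids vol-NEW
# Swap the root volumes, keeping the same device name:
aws ec2 detach-volume --volume-id vol-OLD
aws ec2 attach-volume --volume-id vol-NEW --instance-id i-0123456789abcdef0 --device /dev/sda1
# Start the instance again:
aws ec2 start-instances --instance-ids i-0123456789abcdef0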

I ssh'ed back into tzvolcano. It now had plenty of free disk space. However, docker was completely hosed, apparently because the devicemapper thin pool had run out of free data blocks. Any docker command would produce a result such as:

[root@ip-172-31-45-116 ~]# docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
88286f41530e: Extracting [==================================================>]  1.99 MB/1.99 MB
failed to register layer: devmapper: Thin Pool has 47639 free data blocks which is less than minimum required 163840 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
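
If you want to see how close the thin pool is to its limit before it reaches that point, the devicemapper usage numbers are visible from the host. A diagnostic sketch, not part of the original recovery:

# Show the devicemapper data/metadata space numbers reported by the daemon:
docker info | grep -iA 12 'storage driver'
# Or ask device-mapper directly about the docker thin pool:
sudo dmsetup status | grep thin-pool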

The usual advice for this error is to completely remove docker and its storage. But this would wipe out the tzvolcano data. Also, I didn't want to lose the CHORDS image that was currently running, since it was tagged 'latest', and a fresh pull would have fetched a much newer image. To deal with these issues, the following was performed:

# Save the CHORDS image: 
docker save <image_id> > ~/chords.img
# Save the CHORDS volumes: 
tar -cvf ~/volumes.tar -C /var/lib/docker volumes
# Uninstall docker: 
service docker stop 
yum remove docker
# Get rid of docker artifacts: 
mv /var/lib/docker /var/lib/docker.save
# Reinstall docker:
yum install docker
# Restore the volumes: 
# (recreate /var/lib/docker in case the fresh install has not)
mkdir -p /var/lib/docker
cd /var/lib/docker
tar -xvf ~/volumes.tar
# Restart docker: 
service docker start
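
A few sanity checks are worth running at this point (my own sketch, not copied from the original session):

# Confirm the daemon is up and see which storage driver the fresh install chose:
docker info | grep -i 'storage driver'
# Confirm the named volumes came back from the tarball:
docker volume ls
# The old state is still in /var/lib/docker.save; it can be deleted later to reclaim space:
du -sh /var/lib/docker.save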

At this point, docker was functioning properly. I was able to pull the alpine image, and docker volume ls showed the CHORDS volumes were intact.

Finally, it was time to restore the CHORDS images:

# Pull the images
cd
docker-compose -p chords pull
# replace ncareol/chords:latest with the saved version
docker rmi ncareol/chords:latest
docker load < ~/chords.img
docker images
docker tag <image_id> ncareol/chords:latest
# Verify
docker images
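
Because the image was saved by ID, docker load brings it back untagged, which is why the manual docker tag step is needed. If you want to script that step, one option is to pick up the dangling image ID directly; a sketch that assumes the restored image is the only untagged one on the host:

# Grab the ID of the (only) untagged image and re-tag it as ncareol/chords:latest:
IMAGE_ID=$(docker images --filter dangling=true --quiet | head -n 1)
docker tag "$IMAGE_ID" ncareol/chords:latest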

After verifying that the images were correct, the portal was restarted:

docker-compose -p chords up -d
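
To confirm the portal actually came back, a couple of quick checks (a sketch; adjust the URL if CHORDS is not published on port 80):

# List the compose services and their state:
docker-compose -p chords ps
# Hit the portal locally; assumes it is published on port 80:
curl -I http://localhost/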

A critical takeaway from this exercise was that the named volumes could be saved and restored simply by copying the existing contents of /var/lib/docker/volumes into the new /var/lib/docker/volumes.
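
For future backups, a single named volume can also be archived without touching /var/lib/docker at all, by mounting it into a throwaway container. This is a general pattern, not something done during this recovery, and the volume name chords-mysql-data is a placeholder for whatever the CHORDS compose file actually creates:

# Back up one named volume to a tarball in the current directory (volume name is a placeholder):
docker run --rm -v chords-mysql-data:/source:ro -v "$(pwd)":/backup alpine \
  tar -czf /backup/chords-mysql-data.tar.gz -C /source .
# Restore it into an (empty) volume of the same name:
docker run --rm -v chords-mysql-data:/dest -v "$(pwd)":/backup alpine \
  tar -xzf /backup/chords-mysql-data.tar.gz -C /dest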
