backups vault - juancamilocc/virtual_resources GitHub Wiki

Automated Backups for Vault Server deployed with Raft storage

In this guide, you will learn how to set up automated backups for a Vault server running in a Kubernetes environment, storing the snapshots in cloud storage for different providers: AWS, GCP, Azure, and OCI.

Bash Script

The Bash script that automates the backup process is shown below.

NOTE: This script assumes that you have deployed a Vault Server in a vault namespace.

#!/bin/bash

SNAPSHOT_FILE="vault-snapshot-$(date +%F).snap"   # date-stamped snapshot name
VAULT_ADDR="vault.vault.svc.cluster.local:8200"   # Vault's in-cluster address (informational; commands run inside the pods)
BUCKET_NAME="backup-vault-juancamilocc"
CLOUD_PROVIDER=${1:-"aws"}                        # first CLI argument, defaults to "aws"

echo "Finding Vault leader pod..."
VAULT_PODS=$(kubectl get pods -n vault -l app.kubernetes.io/name=vault -o jsonpath='{.items[*].metadata.name}')

LEADER_POD=""
for POD in $VAULT_PODS; do
    # "vault status" needs no token and shows "HA Mode: active" only on the leader;
    # grepping "list-peers" would match the word "leader" from any pod and also requires a login
    if kubectl exec -n vault "$POD" -- vault status 2>/dev/null | grep -Eq 'HA Mode[[:space:]]+active'; then
        LEADER_POD="$POD"
        break
    fi
done

if [ -z "$LEADER_POD" ]; then
    echo "Error: No leader pod found. Exiting."
    exit 1
fi

echo "Leader pod found: $LEADER_POD"

echo "Logging into Vault..."
# Discard stdout so the token details are not echoed into the job logs
kubectl -n vault exec "$LEADER_POD" -- vault login "$VAULT_TOKEN" > /dev/null
if [ $? -ne 0 ]; then
    echo "Error: Vault login failed. Exiting."
    exit 1
fi

echo "Creating snapshot file..."
kubectl -n vault exec "$LEADER_POD" -- vault operator raft snapshot save "/tmp/$SNAPSHOT_FILE"
if [ $? -ne 0 ]; then
    echo "Error: Snapshot creation failed. Exiting."
    exit 1
fi

echo "Copying snapshot file from leader pod..."
kubectl cp "vault/$LEADER_POD:/tmp/$SNAPSHOT_FILE" "$SNAPSHOT_FILE"
if [ $? -ne 0 ]; then
    echo "Error: Failed to copy snapshot file. Exiting."
    exit 1
fi

echo "Uploading snapshot to bucket ($CLOUD_PROVIDER)..."

case "$CLOUD_PROVIDER" in
    aws)
        aws s3 cp "$SNAPSHOT_FILE" s3://$BUCKET_NAME/
        ;;
    gcp)
        gsutil cp "$SNAPSHOT_FILE" gs://$BUCKET_NAME/
        ;;
    azure)
        az storage blob upload --account-name <account-name> --container-name $BUCKET_NAME --file "$SNAPSHOT_FILE" --name "$(basename "$SNAPSHOT_FILE")"
        ;;
    oci)
        oci os object put --bucket-name "$BUCKET_NAME" --file "$SNAPSHOT_FILE" --name "$(basename "$SNAPSHOT_FILE")"
        ;;
    *)
        echo "Error: Unsupported cloud provider '$CLOUD_PROVIDER'. Exiting."
        exit 1
        ;;
esac

if [ $? -ne 0 ]; then
    echo "Error: Failed to upload snapshot. Exiting."
    exit 1
fi

echo "Backup completed successfully!"

Let's go through each step of the script.

First, we define the important variables: the snapshot file name with its date, the internal Vault address in Kubernetes, the bucket name where we will save the snapshot file, and the default cloud provider, in this case AWS.

SNAPSHOT_FILE="vault-snapshot-$(date +%F).snap"   # date-stamped snapshot name
VAULT_ADDR="vault.vault.svc.cluster.local:8200"   # Vault's in-cluster address (informational; commands run inside the pods)
BUCKET_NAME="backup-vault-juancamilocc"
CLOUD_PROVIDER=${1:-"aws"}                        # first CLI argument, defaults to "aws"
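As a quick sanity check, the two dynamic values can be previewed locally; nothing in this snippet touches the cluster:

```shell
# Preview the derived values without running a backup
SNAPSHOT_FILE="vault-snapshot-$(date +%F).snap"   # %F expands to YYYY-MM-DD
CLOUD_PROVIDER=${1:-"aws"}                        # first argument, or "aws" if none given
echo "$SNAPSHOT_FILE"
echo "$CLOUD_PROVIDER"
```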

Next, we get the Vault pods and identify the leader. We do this because Raft snapshots can only be taken from the leader node, and leadership can change dynamically due to unexpected pod restarts or failures.

echo "Finding Vault leader pod..."
VAULT_PODS=$(kubectl get pods -n vault -l app.kubernetes.io/name=vault -o jsonpath='{.items[*].metadata.name}')

LEADER_POD=""
for POD in $VAULT_PODS; do
    # "vault status" needs no token and shows "HA Mode: active" only on the leader;
    # grepping "list-peers" would match the word "leader" from any pod and also requires a login
    if kubectl exec -n vault "$POD" -- vault status 2>/dev/null | grep -Eq 'HA Mode[[:space:]]+active'; then
        LEADER_POD="$POD"
        break
    fi
done

if [ -z "$LEADER_POD" ]; then
    echo "Error: No leader pod found. Exiting."
    exit 1
fi

Next, we log into the Vault leader pod and create a snapshot.

echo "Logging into Vault..."
# Discard stdout so the token details are not echoed into the job logs
kubectl -n vault exec "$LEADER_POD" -- vault login "$VAULT_TOKEN" > /dev/null
if [ $? -ne 0 ]; then
    echo "Error: Vault login failed. Exiting."
    exit 1
fi

echo "Creating snapshot file..."
kubectl -n vault exec "$LEADER_POD" -- vault operator raft snapshot save "/tmp/$SNAPSHOT_FILE"
if [ $? -ne 0 ]; then
    echo "Error: Snapshot creation failed. Exiting."
    exit 1
fi

Finally, we copy the snapshot and upload it to the appropriate cloud provider's bucket.

echo "Copying snapshot file from leader pod..."
kubectl cp "vault/$LEADER_POD:/tmp/$SNAPSHOT_FILE" "$SNAPSHOT_FILE"
if [ $? -ne 0 ]; then
    echo "Error: Failed to copy snapshot file. Exiting."
    exit 1
fi

echo "Uploading snapshot to bucket ($CLOUD_PROVIDER)..."

case "$CLOUD_PROVIDER" in
    aws)
        aws s3 cp "$SNAPSHOT_FILE" s3://$BUCKET_NAME/
        ;;
    gcp)
        gsutil cp "$SNAPSHOT_FILE" gs://$BUCKET_NAME/
        ;;
    azure)
        az storage blob upload --account-name <account-name> --container-name $BUCKET_NAME --file "$SNAPSHOT_FILE" --name "$(basename "$SNAPSHOT_FILE")"
        ;;
    oci)
        oci os object put --bucket-name "$BUCKET_NAME" --file "$SNAPSHOT_FILE" --name "$(basename "$SNAPSHOT_FILE")"
        ;;
    *)
        echo "Error: Unsupported cloud provider '$CLOUD_PROVIDER'. Exiting."
        exit 1
        ;;
esac

if [ $? -ne 0 ]; then
    echo "Error: Failed to upload snapshot. Exiting."
    exit 1
fi

echo "Backup completed successfully!"

You can execute the script according to your cloud provider, as follows.

./backup-vault.sh       # For AWS S3
./backup-vault.sh gcp   # For Google Cloud Storage
./backup-vault.sh azure # For Azure Blob Storage
./backup-vault.sh oci   # For OCI Object Storage
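To confirm the upload landed, you can list the bucket with your provider's CLI. These commands are illustrative and assume the CLI is already authenticated; the Azure account name is a placeholder:

```shell
# Confirm the snapshot object exists (pick the line for your provider)
aws s3 ls "s3://backup-vault-juancamilocc/"                    # AWS
gsutil ls "gs://backup-vault-juancamilocc/"                    # GCP
az storage blob list --account-name <account-name> --container-name backup-vault-juancamilocc --output table   # Azure
oci os object list --bucket-name "backup-vault-juancamilocc"   # OCI
```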

The snapshot file will look similar to this.

Snapshot file

Dockerfile

Use the following Dockerfile. You must also uncomment specific lines depending on your cloud provider.

# Choose your cloud provider

# AWS Cloud
# FROM amazon/aws-cli

# GCP Cloud
# FROM google/cloud-sdk

# Azure Cloud
# FROM chainguard/az:latest-dev

# OCI Cloud
# FROM ghcr.io/oracle/oci-cli:latest 
# ENV OCI_CLI_SUPPRESS_FILE_PERMISSIONS_WARNING=True

USER root

COPY backup-vault.sh backup-vault.sh

# AWS Cloud
# RUN yum install -y unzip curl

# Azure or GCP Cloud
# RUN apt-get update && apt-get install -y curl

# OCI Cloud
# RUN dnf update -y && dnf install -y curl bash-completion

RUN curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" && \
    chmod +x kubectl && \
    mv kubectl /usr/local/bin/

ENTRYPOINT ["bash", "backup-vault.sh", "<YOUR_CLOUD>"]
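With the relevant lines uncommented, build and push the image. The registry and tag below are placeholders, so substitute your own:

```shell
# Build the backup image for one provider and push it to your registry (names are placeholders)
docker build -t registry.example.com/vault-backup:aws .
docker push registry.example.com/vault-backup:aws
```

The resulting image reference is what you will use in the CronJob manifests below.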

Vault configuration

In this section, we will create a token whose permissions are limited to taking snapshots.

Exec into the Vault pod and log in, as follows.

kubectl -n vault exec -it vault-0 -- sh
/ $ vault login
# Token (will be hidden): <root_token> 
# Success! You are now authenticated. The token information displayed below
# is already stored in the token helper. You do NOT need to run "vault login"
# again. Future Vault requests will automatically use this token.

Now, let's write a policy for the limited token.

vault policy write backup-policy - <<EOF
    # Allow creating and reading snapshots
    path "sys/storage/raft/snapshot" {
        capabilities = ["create", "read"]
    }

    # Allow restoring snapshots
    path "sys/storage/raft/snapshot/restore" {
        capabilities = ["create"]
    }

    # Explicitly deny all other paths (Vault is deny-by-default anyway)
    path "*" {
        capabilities = []
    }
EOF
# Success! Uploaded policy: backup-policy

And create a token based on the previous policy.

vault token create -policy="backup-policy" -ttl="0"
# Key                  Value
# ---                  -----
# token                hvs.CAESIAdIgTXydtk1nF7gqJXYNzSA-E3oIiDetED8G4oV3vUdGh4KHGh2cy5rMFUyeFdDdkl6U24zZFQza3VnbFVoSXU
# token_accessor       s0RsNTj6USHtOVEV3lyks4sX
# token_duration       768h
# token_renewable      true
# token_policies       ["backup-policy" "default"]
# identity_policies    []
# policies             ["backup-policy" "default"]

In this case the token is hvs.CAESIAdIgTXydtk1nF7gqJXYNzSA-E3oIiDetED8G4oV3vUdGh4KHGh2cy5rMFUyeFdDdkl6U24zZFQza3VnbFVoSXU.

Kubernetes Configuration

We need to define an RBAC configuration to allow a ServiceAccount to execute commands on the Vault pods.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: vault-admin
  namespace: vault

--- 

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: vault
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create", "get"]

--- 

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: vault
subjects:
  - kind: ServiceAccount
    name: vault-admin
    namespace: vault
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Define a ConfigMap to store the kubeconfig.

kube-config-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-configmap
  namespace: vault
data:
  CONFIG: |
    <here kubeconfig>
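The kubeconfig stored here must authenticate as the vault-admin ServiceAccount defined above. On Kubernetes v1.24+, one way to mint a token for it is shown below; the duration is an arbitrary example:

```shell
# Issue a time-bound token for the vault-admin ServiceAccount (Kubernetes v1.24+)
kubectl -n vault create token vault-admin --duration=8760h
```

Place the resulting token in the kubeconfig's user credentials.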

Define a secret for the Vault token created previously.

apiVersion: v1
kind: Secret
metadata:
  name: vault-token
  namespace: vault
data:
  VAULT_TOKEN: <encoded-base64 limited vault token>
type: Opaque
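To produce the VAULT_TOKEN value, base64-encode the limited token. A sketch, using printf so no trailing newline sneaks into the encoded value; the token below is a placeholder:

```shell
# Encode the limited token for the Secret's data field.
# printf (unlike echo) does not append a newline, which would corrupt the token.
VAULT_BACKUP_TOKEN="hvs.example-token"   # placeholder: use the token created above
printf '%s' "$VAULT_BACKUP_TOKEN" | base64
```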

In AWS Cloud

For the AWS config file, create a Secret manifest named aws-conf.yaml.

NOTE: Keep in mind that the AWS config file looks like this:

[default]
aws_access_key_id = <your_access_key_id>
aws_secret_access_key = <your_secret_access_key>
region = <region>

Encode the file in base64:

base64 <path-to-aws-config-file>
# base64 content

aws-conf.yaml

apiVersion: v1
kind: Secret
metadata:
  name: aws-config-secret
  namespace: vault
data:
  config: <encoded-base64 aws config-file>
type: Opaque

Then, we can set up the cronjob, as follows.

aws-backup-vault-cj.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-backup-cron
  namespace: vault
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vault-admin
          containers:
          - name: backup-vault
            image: <aws-backup-image> # This is defined in the Dockerfile section
            imagePullPolicy: Always
            env:
              - name: VAULT_TOKEN
                valueFrom:
                  secretKeyRef:
                    name: vault-token
                    key: VAULT_TOKEN
            volumeMounts:
            - name: aws-config-secret
              mountPath: /root/.aws/config
              subPath: config
            - name: kube-configmap
              mountPath: /root/.kube/
          restartPolicy: OnFailure
          volumes:
          - name: aws-config-secret
            secret:
              secretName: aws-config-secret
              items:
                - key: config
                  path: config
          - name: kube-configmap
            configMap:
              name: kube-configmap
              items:
                  - key: CONFIG
                    path: config
              defaultMode: 0600

Deploy all resources.

kubectl apply -f aws-conf.yaml
kubectl apply -f kube-config-cm.yaml
kubectl apply -f aws-backup-vault-cj.yaml
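Rather than waiting for the 23:00 schedule, you can trigger a one-off run to validate the setup. The job name is illustrative, and the same check works for every provider below:

```shell
# Trigger a manual run of the CronJob and follow its logs
kubectl -n vault create job --from=cronjob/vault-backup-cron vault-backup-manual
kubectl -n vault logs -f job/vault-backup-manual
```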

In GCP Cloud

For the GCP config file, create a Secret manifest named gcp-conf.yaml.

NOTE: Keep in mind that the GCP service account file looks like this:

{
  "type": "service_account",
  "project_id": "PROJECT_ID",
  "private_key_id": "KEY_ID",
  "private_key": "-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n",
  "client_email": "SERVICE_ACCOUNT_EMAIL",
  "client_id": "CLIENT_ID",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/SERVICE_ACCOUNT_EMAIL"
}

Encode the file in base64:

base64 <path-to-gcp-config-file>
# base64 content

gcp-conf.yaml

apiVersion: v1
kind: Secret
metadata:
  name: gcp-config-secret
  namespace: vault
data:
  service-account.json: <encoded-base64 gcp-config-file>
type: Opaque

Then, we can set up the cronjob, as follows.

gcp-backup-vault-cj.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-backup-cron
  namespace: vault
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vault-admin
          containers:
          - name: backup-vault
            image: <gcp-backup-image> # This is defined in the Dockerfile section
            imagePullPolicy: Always
            env:
              - name: VAULT_TOKEN
                valueFrom:
                  secretKeyRef:
                    name: vault-token
                    key: VAULT_TOKEN
            volumeMounts:
            - name: gcp-config-secret
              mountPath: /root/.config/gcloud/application_default_credentials.json
              subPath: service-account.json
            - name: kube-configmap
              mountPath: /root/.kube/
          restartPolicy: OnFailure
          volumes:
          - name: gcp-config-secret
            secret:
              secretName: gcp-config-secret
              items:
                - key: service-account.json
                  path: service-account.json
          - name: kube-configmap
            configMap:
              name: kube-configmap
              items:
                - key: CONFIG
                  path: config
              defaultMode: 0600

Deploy all resources.

kubectl apply -f gcp-conf.yaml
kubectl apply -f kube-config-cm.yaml
kubectl apply -f gcp-backup-vault-cj.yaml

In Azure Cloud

For the Azure config file, create a Secret manifest named az-conf.yaml.

NOTE: Keep in mind that the Azure config file looks like this:

{
  "clientId": "CLIENT_ID",
  "clientSecret": "CLIENT_SECRET",
  "tenantId": "TENANT_ID",
  "subscriptionId": "SUBSCRIPTION_ID"
}

Encode the file in base64:

base64 <path-to-azure-config-file>
# base64 content

az-conf.yaml

apiVersion: v1
kind: Secret
metadata:
  name: azure-config-secret
  namespace: vault
data:
  azure-config.json: <encoded-base64 azure-config-file>
type: Opaque

Then, we can set up the cronjob, as follows.

az-backup-vault-cj.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-backup-cron
  namespace: vault
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vault-admin
          containers:
          - name: backup-vault
            image: <az-backup-image> # This is defined in the Dockerfile section
            imagePullPolicy: Always
            env:
              - name: VAULT_TOKEN
                valueFrom:
                  secretKeyRef:
                    name: vault-token
                    key: VAULT_TOKEN
            volumeMounts:
            - name: azure-config-secret
              mountPath: /root/.azure/azure.json
              subPath: azure-config.json
            - name: kube-configmap
              mountPath: /root/.kube/
          restartPolicy: OnFailure
          volumes:
          - name: azure-config-secret
            secret:
              secretName: azure-config-secret
              items:
                - key: azure-config.json
                  path: azure-config.json
          - name: kube-configmap
            configMap:
              name: kube-configmap
              items:
                - key: CONFIG
                  path: config
              defaultMode: 0600

Deploy all resources.

kubectl apply -f az-conf.yaml
kubectl apply -f kube-config-cm.yaml
kubectl apply -f az-backup-vault-cj.yaml

In OCI Cloud

For the OCI config file, create a Secret manifest named oci-config.yaml.

NOTE: Keep in mind that the OCI config file looks like this:

[DEFAULT]
user=<userid>
fingerprint=<fingerprint>
key_file=/root/.oci/oci_api_key.pem
tenancy=<tenancy>
region=<region>

Encode the file in base64:

base64 <path-to-oci-config-file>
# base64 content

oci-config.yaml

apiVersion: v1
kind: Secret
metadata:
  name: oci-conf
  namespace: vault
data:
  oci_config: <encoded-base64 oci-conf-file>
type: Opaque

For the OCI key secret, create oci-key-secret.yaml.

apiVersion: v1
kind: Secret
metadata:
  name: oci-key
  namespace: vault
data:
  oci_api_key.pem: <base64-encoded key-secret>
type: Opaque

Then, we can set up the cronjob, as follows.

oci-backup-vault-cj.yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-backup-cron
  namespace: vault
spec:
  schedule: "0 23 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vault-admin
          containers:
          - name: backup-vault
            image: <oci-backup-image> # This is defined in the Dockerfile section
            imagePullPolicy: Always
            env:
              - name: VAULT_TOKEN
                valueFrom:
                  secretKeyRef:
                    name: vault-token
                    key: VAULT_TOKEN
            volumeMounts:
            - name: oci-key
              mountPath: /root/.oci/oci_api_key.pem
              subPath: oci_api_key.pem
            - name: oci-conf
              mountPath: /root/.oci/config
              subPath: config
            - name: kube-configmap
              mountPath: /root/.kube/
          restartPolicy: OnFailure
          volumes:
          - name: oci-key
            secret:
              secretName: oci-key
              items:
                  - key: oci_api_key.pem
                    path: oci_api_key.pem        
              defaultMode: 0400  # read-only for the owner, appropriate for a private key
          - name: oci-conf
            secret:
              secretName: oci-conf
              items:
                  - key: oci_config
                    path: config
          - name: kube-configmap
            configMap:
              name: kube-configmap
              items:
                  - key: CONFIG
                    path: config
              defaultMode: 0600

Deploy all resources.

kubectl apply -f oci-config.yaml
kubectl apply -f oci-key-secret.yaml
kubectl apply -f kube-config-cm.yaml
kubectl apply -f oci-backup-vault-cj.yaml

Finally, we can verify the snapshot file in the bucket, as follows.

Snapshot vault in bucket

Restore snapshot in a new Vault Server

In this case, you need to download the snapshot stored in the bucket and follow these steps:

# Copy the snapshot from your local machine to the Vault pod
kubectl cp snapshot.snap vault/vault-0:/tmp/snapshot.snap

# Restore the snapshot
kubectl -n vault exec vault-0 -- vault operator raft snapshot restore /tmp/snapshot.snap

You will most likely receive an error message stating that the operation is not possible due to mismatched keys. In this case, simply add the -force flag. Don't worry: Vault will restore the data without issues, since this is a new Vault server.

kubectl -n vault exec vault-0 -- vault operator raft snapshot restore -force /tmp/snapshot.snap
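After restoring, a couple of sanity checks help confirm the cluster picked up the data; both assume you are logged in with a token allowed to read these endpoints:

```shell
# Verify the raft cluster is healthy and the restored data is visible
kubectl -n vault exec vault-0 -- vault operator raft list-peers
kubectl -n vault exec vault-0 -- vault secrets list
```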

Conclusions

Automating Vault server backups with Raft storage in a Kubernetes environment ensures data security and disaster recovery preparedness. This guide demonstrated a Bash script to create snapshots from the Vault leader pod and store them in a cloud provider of choice, enhancing resilience and compliance.
