Backup Routine - intelligent002/kafka-backup-offline GitHub Wiki
Step-by-step routine used by the tool to perform full cluster backups.
Key Features
- Automatic nightly/weekly backups via cronjob
  - You can schedule automated backups using Linux crontab on the management node (typically node-00).
  - Example: To run the backup every day at 01:00 AM, add the following line using crontab -e:
    0 1 * * * /data/KBO/kafka-backup-offline.sh cluster_backup
  - This command will invoke the full backup process according to the default configuration.
  - Ensure that the script is executable and the user has sufficient permissions.
- Manual backup triggers via the GUI
- Retention policy with automatic rotation of old backups
- Structured "zip-of-zips" archive format for multi-node environments
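The cron setup described in the first feature can be sketched as a small helper that builds the crontab entry and installs it only once. This is a minimal sketch, not part of the tool: the function names `cron_entry` and `install_cron_entry` are hypothetical, while the script path and the `cluster_backup` subcommand are taken from the example above.

```shell
# Path taken from the example above; adjust it to your installation.
KBO_SCRIPT="/data/KBO/kafka-backup-offline.sh"

# Print a crontab entry that runs the full backup daily at HOUR:MINUTE.
# Arguments: minute, hour (hypothetical helper, not part of the tool).
cron_entry() {
    printf '%s %s * * * %s cluster_backup\n' "$1" "$2" "$KBO_SCRIPT"
}

# Append the entry to the current user's crontab unless it is already there.
install_cron_entry() {
    local entry="$1" current
    # Refuse to schedule a script that cannot be executed by this user.
    if [ ! -x "$KBO_SCRIPT" ]; then
        echo "not executable: $KBO_SCRIPT" >&2
        return 1
    fi
    # "crontab -l" fails when no crontab exists yet; treat that as empty.
    current="$(crontab -l 2>/dev/null || true)"
    if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        return 0
    fi
    if [ -n "$current" ]; then
        printf '%s\n%s\n' "$current" "$entry" | crontab -
    else
        printf '%s\n' "$entry" | crontab -
    fi
}

# Usage (run on the management node):
#   install_cron_entry "$(cron_entry 0 1)"
```

Checking for an existing identical line keeps the helper safe to re-run, so repeated provisioning does not schedule the backup twice.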
Storage Details
Backups are stored on node-00 by default, on a drive mounted at /backup/.
Within this drive:
- All cold backups are stored under /backup/cold/
- Subfolders include:
  - /backup/cold/data/
  - /backup/cold/certificates/
  - /backup/cold/credentials/
  - /backup/cold/configs/
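To make the layout concrete, the following sketch creates the four cold-backup subfolders under a given root. The function names are hypothetical, and whether the tool creates these folders itself is not stated here, so treat this as an illustration of the directory structure only. The mount check relies on the `mountpoint` utility from util-linux, which is an assumption about the host.

```shell
# Create the cold-backup subfolders under ROOT (default: /backup).
# Hypothetical helper; the folder names are the ones listed above.
init_cold_layout() {
    local root="${1:-/backup}" sub
    for sub in data certificates credentials configs; do
        mkdir -p "$root/cold/$sub"
    done
}

# Succeed only if ROOT is an actual mount point, so backups do not
# silently land on the root filesystem when the drive is missing.
# Assumes the util-linux "mountpoint" command is installed.
check_backup_mount() {
    mountpoint -q "${1:-/backup}"
}

# Usage:
#   check_backup_mount /backup && init_cold_layout /backup
```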
For certificates:
- Rotated backups are stored in /backup/cold/certificates/rotated/YYYY/MM/DD/
- Pinned (non-rotated) backups are stored in /backup/cold/certificates/pinned/
- Example file: 2025-04-08---01-14-02---credentials.xz, containing a zip-of-zips in which each node's certificates are stored in a separate archive inside the main one
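The naming scheme and the dated folder structure above can be expressed as two small helpers. This is a sketch under assumptions: the function names are hypothetical, and the pattern is inferred from the single example file name shown, not from the tool's source.

```shell
# Build an archive name such as 2025-04-08---01-14-02---credentials.xz.
# Arguments: date (YYYY-MM-DD), time (HH-MM-SS), component label.
backup_file_name() {
    printf '%s---%s---%s.xz\n' "$1" "$2" "$3"
}

# Map a date (YYYY-MM-DD) to the rotated certificates directory.
# Optional second argument: backup root (default: /backup).
rotated_cert_dir() {
    local root="${2:-/backup}" datepath
    # Turn 2025-04-08 into 2025/04/08.
    datepath="$(printf '%s' "$1" | sed 's|-|/|g')"
    printf '%s/cold/certificates/rotated/%s/\n' "$root" "$datepath"
}

# Usage with the current time:
#   d="$(date +%Y-%m-%d)"; t="$(date +%H-%M-%S)"
#   echo "$(rotated_cert_dir "$d")$(backup_file_name "$d" "$t" certificates)"
```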
Rotation & Retention
- Rotation refers to automatic deletion of older backups beyond the retention window
- The tool ensures only the most recent backups are retained to conserve disk space
- Default retention policy is:
    # Number of days to retain certificate backups.
    retention_policy_certificates: 365
    # Number of days to retain credentials backups.
    retention_policy_credentials: 365
    # Number of days to retain config backups.
    retention_policy_configs: 365
    # Number of days to retain data backups.
    retention_policy_data: 7
- Users can manually pin specific backups to prevent them from being deleted by moving them into the pinned/ directory
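Rotation and pinning as described above could look roughly like the sketch below. It is not the tool's actual implementation: it assumes rotation is decided by file modification time, and the function names `prune_rotated` and `pin_backup` are hypothetical.

```shell
# Delete files under DIR that are older than DAYS days.
# find counts age in whole 24-hour periods, so "+7" matches files
# that are at least 8 full days old.
prune_rotated() {
    local dir="$1" days="$2"
    [ -d "$dir" ] || return 0
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Move a backup into the pinned directory so pruning never sees it.
# PINNED_DIR must live outside the rotated tree (as pinned/ does above).
pin_backup() {
    local file="$1" pinned_dir="$2"
    mkdir -p "$pinned_dir"
    mv -- "$file" "$pinned_dir/"
}

# Usage, mirroring the default retention values:
#   prune_rotated /backup/cold/certificates/rotated 365
#   pin_backup /backup/cold/certificates/rotated/2025/04/08/<file> \
#              /backup/cold/certificates/pinned
```

Because pinned backups sit in a sibling directory of rotated/, pruning the rotated tree cannot touch them regardless of their age.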
Workflow
- Controlled shutdown of the Kafka cluster (from the last towards the first node, one by one)
- Perform zip of component data on all nodes
- Collect zip of component data from all nodes to local drive on node-00
- Consolidation of zip files into a "zip of zips" on node-00
- Restart Kafka cluster using the controlled startup sequence (from first towards the last node, one by one)
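The ordering of the workflow steps can be illustrated with a dry-run sketch that only prints the planned actions. Everything here is hypothetical: the node names, the `NODES` variable, and both function names are placeholders, and the real tool performs the stop, archive, collect, and start operations through its own mechanisms.

```shell
# Hypothetical node list; the real tool reads nodes from its own configuration.
NODES="node-01 node-02 node-03"

# Print the given words in reverse order, one per line.
reverse_words() {
    local out="" w
    for w in "$@"; do
        out="$w${out:+ $out}"
    done
    # Unquoted on purpose: split the string back into one word per line.
    printf '%s\n' $out
}

# Print the backup plan in execution order without running anything.
backup_plan() {
    local n
    # Shutdown goes from the last node towards the first.
    for n in $(reverse_words $NODES); do echo "stop kafka on $n"; done
    for n in $NODES; do echo "archive component data on $n"; done
    for n in $NODES; do echo "collect archive from $n to node-00"; done
    echo "consolidate archives into zip-of-zips on node-00"
    # Startup goes from the first node towards the last.
    for n in $NODES; do echo "start kafka on $n"; done
}

# Usage:
#   backup_plan
```

Printing the plan first makes it easy to confirm that shutdown runs in reverse node order and startup in forward order before any real commands are wired in.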