cmd dr - nself-org/cli GitHub Wiki
Disaster recovery operations: drill, promote-standby, reconfigure-dns, rollback, fence.
nself dr <subcommand> [flags]
nself dr runs disaster recovery procedures. It covers planned drills, promoting a warm standby to primary, rewriting DNS records to a new primary IP, rolling back a promotion, and fencing the cluster (Redis read-only flag) to prevent split-brain writes during failover.
dr drill is the primary verification command. In v1.0.9, only --scenario cold-start is supported. Cold-start provisions a Hetzner VM via hcloud, restores the latest backup via ssh, runs the full smoke-query catalog, records RTO, and writes a dated report to ~/.claude/backups/nself-staging/dr/. --install-cron installs the monthly drill systemd timer (nself-dr-drill.timer).
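The drill records RTO itself, but an independent wall-clock measurement can be a useful cross-check. The following is a minimal sketch, not nself's own implementation: `time_stage` is a hypothetical helper that wraps any command and prints the elapsed seconds.

```shell
#!/bin/sh
# Hypothetical helper (not part of nself): runs the given command and prints
# the elapsed wall-clock seconds on stdout. The command's own stdout is sent
# to stderr so that only the number is captured by command substitution.
time_stage() {
  start="$(date +%s)"
  "$@" >&2
  status=$?
  end="$(date +%s)"
  echo "$((end - start))"
  return "$status"
}
```

A possible use is `rto_seconds="$(time_stage sudo nself dr drill --now)"`, which keeps the exit status of the drill and leaves the measured duration in `rto_seconds`.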
v1.0.9 scenario support:

- cold-start, fully implemented; provisions VM, restores, verifies, records RTO
- region-failover, NOT supported in v1.0.9 (single-region by design); returns a deprecation error directing to v1.1.0 and the DR runbook
- data-corruption, NOT supported in v1.0.9 (PITR via pgbackrest is planned for v1.1.0); returns a deprecation error
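
Automation that wraps the drill may want to reject unsupported scenarios before invoking the CLI. The sketch below mirrors the v1.0.9 support list above; `scenario_supported` and `run_drill` are hypothetical wrapper names, and only the `--scenario` and `--dry-run` flags come from this page.

```shell
#!/bin/sh
# Hypothetical wrapper: encodes the v1.0.9 scenario support list.
# Returns 0 if supported, 1 if planned for v1.1.0, 2 if the name is unknown.
scenario_supported() {
  case "$1" in
    cold-start) return 0 ;;
    region-failover|data-corruption) return 1 ;;
    *) return 2 ;;
  esac
}

# Previews a drill only when the scenario is supported in v1.0.9.
run_drill() {
  scenario="${1:-cold-start}"
  if ! scenario_supported "$scenario"; then
    echo "scenario '$scenario' is not available in v1.0.9" >&2
    return 1
  fi
  nself dr drill --scenario "$scenario" --dry-run
}
```

Checking up front avoids depending on the exact wording of the error the CLI returns for the unsupported scenarios.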
dr promote-standby requires production confirmation unless --yes is passed. dr reconfigure-dns --ip <new-ip> updates DNS to point traffic at the new primary. dr rollback demotes the promoted standby and resyncs from the original primary. dr fence sets a read_only=true flag in Redis that the application layer must honor.
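
Because the fence is advisory, every writer has to check it. The following sketch shows one way a script could honor the flag. It assumes the flag is stored as a plain Redis string key named `read_only` with the value `true`; the exact key name and encoding are not specified on this page, and `is_fenced` and `guarded_write` are hypothetical helpers.

```shell
#!/bin/sh
# ASSUMPTION: the fence is a plain string key called "read_only".
# Override FENCE_KEY if the real key differs.
FENCE_KEY="${FENCE_KEY:-read_only}"

# Default reader: asks Redis for the flag value.
read_fence_flag() {
  redis-cli GET "$FENCE_KEY"
}

# Succeeds only when the flag reads exactly "true".
# FENCE_READER can name another command (useful for testing).
is_fenced() {
  flag="$(${FENCE_READER:-read_fence_flag})"
  [ "$flag" = "true" ]
}

# Runs the given command unless the cluster is fenced.
guarded_write() {
  if is_fenced; then
    echo "refusing write: cluster is fenced" >&2
    return 1
  fi
  "$@"
}
```

Note that this sketch fails open: if Redis is unreachable, the flag does not read as `true` and the write proceeds. A stricter setup might prefer to refuse writes whenever the flag cannot be read.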

| Name | Description |
|---|---|
| drill | Execute a DR drill |
| promote-standby | Promote warm standby to primary |
| reconfigure-dns | Update DNS records to point to new primary |
| rollback | Demote promoted standby and resync from original primary |
| fence | Set read-only flag in Redis for split-brain prevention |

dr drill flags:

| Flag | Default | Description |
|---|---|---|
| --scenario | cold-start | Drill scenario. cold-start is the only scenario supported in v1.0.9; region-failover and data-corruption are planned for v1.1.0 |
| --dry-run | false | Preview only |
| --now | false | Run a full provision-restore-smoke-destroy drill immediately |
| --install-cron | false | Install the monthly drill systemd timer |
| --schedule | monthly | Drill cadence for --install-cron (only monthly supported) |
| --hetzner-project | "" | Hetzner project name used for drill VM |
| --vm-type | cx22 | Hetzner server type for drill VM |
| --ssh-key | /root/.config/nself/dr-key.pub | Public SSH key path injected into drill VM |
| --region | fsn1 | Hetzner location for drill VM |
| --render-cloud-init | false | Print the cloud-init user-data template and exit |
| --render-alerts | false | Print the Prometheus DRDrillFailed rule and exit |

dr promote-standby flags:

| Flag | Default | Description |
|---|---|---|
| --region | "" | Target region for promotion |
| --yes | false | Skip confirmation |

dr reconfigure-dns flags:

| Flag | Default | Description |
|---|---|---|
| --ip | "" | New primary IP address (required) |

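
Since `--ip` is required and a mistyped address would redirect traffic, a wrapper can sanity-check the value first. This is a sketch under the assumption that the new primary is addressed by IPv4; whether the command also accepts IPv6 is not stated here. `is_ipv4` and `reconfigure_dns_checked` are hypothetical helpers.

```shell
#!/bin/sh
# Returns 0 if $1 is a dotted-quad IPv4 address with each octet in 0-255.
is_ipv4() {
  case "$1" in
    ""|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  old_ifs="$IFS"
  IFS=.
  set -- $1          # split the address on dots into positional parameters
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Calls nself only when the address passes the check; returns 2 otherwise.
reconfigure_dns_checked() {
  if ! is_ipv4 "$1"; then
    echo "not a valid IPv4 address: $1" >&2
    return 2
  fi
  nself dr reconfigure-dns --ip "$1"
}
```

For example, `reconfigure_dns_checked 5.75.235.42` would pass the check and invoke the CLI, while `reconfigure_dns_checked 5.75.235` stops before any DNS change is attempted.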
# Dry-run a cold-start drill
nself dr drill --dry-run
# Run an actual cold-start drill now
sudo nself dr drill --now
# Install the monthly drill timer on the ops host
sudo nself dr drill --install-cron --hetzner-project camarata
# Print the cloud-init template that would be used
nself dr drill --render-cloud-init
# Promote the warm standby to primary in eu-west
nself dr promote-standby --region eu-west --yes
# Point DNS at the new primary IP
nself dr reconfigure-dns --ip 5.75.235.42
# Fence writes during a manual failover
nself dr fence

- cmd-backup, backup operations
- cmd-promote, environment promotion
- cmd-watchdog, self-healing watchdog
- Commands, full command index