Production Database Migration Guide - sgajbi/portfolio-analytics-system GitHub Wiki

Overview

Database migrations in production must be safe, tested, and reversible. This guide outlines:

Pre-checks before migration.
Applying migrations with minimal downtime.
Rollback and recovery strategy.

Migration Philosophy

Forward-only migrations in production (downgrades are for dev/staging only).
Zero downtime goal — migrations should not block event processing for long periods.
Pre-tested migrations — all changes must be validated in staging with production-like data before live deployment.

1️⃣ Pre-Migration Checklist

Code & Migration Readiness

Confirm migration script exists in alembic/versions/ with descriptive message.
Validate migration correctness by running in staging.
Check models match DB schema after applying migration.

Data Safety

Ensure database backups are taken:

pg_dump -Fc -h <prod-host> -U <user> <db-name> > backup_before_migration.dump

Communication

Notify:
- DevOps team
- Application owners
- Operations team (Ops will monitor during migration)
Schedule maintenance window if downtime is expected.

2️⃣ Applying the Migration in Production

Step 1: Put Services in Safe Mode

For schema changes that affect write operations:
- Pause Kafka consumers (to avoid writes during migration).
- Disable CI/CD auto-deploy temporarily.

Step 2: Run Migration

Execute Alembic upgrade:

alembic upgrade head

Step 3: Validate Schema

Check table definitions:

\d processed_events;

Confirm new columns/indexes exist.
Check existing records for integrity.

Step 4: Resume Services

Restart Kafka consumers.
Re-enable CI/CD deployments.

3️⃣ Post-Migration Verification

Functional Validation

Trigger a test transaction in ingestion.
Confirm correlation ID and idempotent processing still work.
Query API to confirm results:

curl -H "X-Correlation-ID: TEST:uuid" https://<api>/portfolios/1001/positions

Operational Validation

Monitor logs in Splunk for errors.
Check Kafka consumer lag.
Verify no increase in DB error rates.

4️⃣ Rollback Strategy

⚠ Direct downgrade is discouraged in production. Instead:

Apply forward-only fix migration if rollback is needed.
If data corruption occurs:
- Restore from backup.
- Redeploy previous stable application version.

Best Practices

Group non-breaking changes (adding columns, indexes) separately from breaking changes (renaming/dropping columns).
Use default values + nullable columns for safe schema evolution.
Large data changes (backfills) should be handled via batch jobs, not Alembic migrations.

Production Database Migration Guide - sgajbi/portfolio-analytics-system GitHub Wiki

Overview

Migration Philosophy

1️⃣ Pre-Migration Checklist

Code & Migration Readiness

Data Safety

Communication

2️⃣ Applying the Migration in Production

Step 1: Put Services in Safe Mode

Step 2: Run Migration

Step 3: Validate Schema

Step 4: Resume Services

3️⃣ Post-Migration Verification

Functional Validation

Operational Validation

4️⃣ Rollback Strategy

Best Practices

⚠️ **GitHub.com Fallback** ⚠️

⚠️ GitHub.com Fallback ⚠️