Schema Change Guide - fedjo/CEI-InOE GitHub Wiki
Schema Change Guide
This guide explains how to handle database schema changes and their impact on the CEI-InOE services.
Overview
The database schema is the source of truth for data structures. Changes flow through:
Migration → PostgreSQL → Services (Ingestor + API)
Types of Schema Changes
| Change Type | Complexity | Downtime? |
|---|---|---|
| Add nullable column | Low | No |
| Add required column | Medium | No* |
| Rename column | High | No* |
| Drop column | Medium | No* |
| Add table | Medium | No |
| Add view | Low | No |
| Modify view | Low | No |
* With proper coordination
Example Scenarios
Scenario 1: Add New Nullable Column
Requirement: Add quality_score to dairy production.
Step 1: Create Migration
-- db/migrations/V010__dairy_quality_score.sql
ALTER TABLE dairy_production
ADD COLUMN quality_score DECIMAL(4,2);
COMMENT ON COLUMN dairy_production.quality_score IS
'Calculated milk quality score (0-100)';
Step 2: Update Ingestor (if field comes from source)
# ingestor/app/models.py
class DairyProductionRecord(BaseModel):
# ...existing fields...
quality_score: Optional[float] = None # New field
# ingestor/app/mappings/dairy_production.yaml
columns:
# ...existing mappings...
quality_score:
source: "Quality Score"
type: float
nullable: true
Step 3: Update API
# api/app/schemas/dairy.py
class DairyProductionRecord(BaseModel):
# ...existing fields...
quality_score: float | None = None # New field
# api/app/db/queries/dairy.py
# Add quality_score to SELECT statement
Step 4: Deploy
# 1. Apply migration
docker-compose run flyway migrate
# 2. Deploy API (can read existing data, new field will be null)
docker-compose up -d api
# 3. Deploy ingestor (will start writing new field)
docker-compose up -d ingestor
Impact:
- ✅ API works before ingestor deploys (returns
nullfor new field) - ✅ No downtime required
- ✅ Backward compatible
Scenario 2: Rename Column
Requirement: Rename energy_kwh to kwh in energy tables.
Step 1: Migration (Transition Period)
-- db/migrations/V010__energy_rename_kwh.sql
-- Add new column with trigger to keep both in sync
ALTER TABLE fact_energy_hourly
ADD COLUMN kwh FLOAT;
UPDATE fact_energy_hourly
SET kwh = energy_kwh;
-- Create sync trigger
CREATE OR REPLACE FUNCTION sync_energy_kwh()
RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
IF NEW.kwh IS NULL AND NEW.energy_kwh IS NOT NULL THEN
NEW.kwh := NEW.energy_kwh;
ELSIF NEW.energy_kwh IS NULL AND NEW.kwh IS NOT NULL THEN
NEW.energy_kwh := NEW.kwh;
END IF;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER tr_sync_energy_kwh
BEFORE INSERT OR UPDATE ON fact_energy_hourly
FOR EACH ROW EXECUTE FUNCTION sync_energy_kwh();
Step 2: Update API to use new column name
# api/app/db/queries/energy.py
# Change: energy_kwh as kwh → kwh
Step 3: Update Ingestor to write to new column
# ingestor/app/...
# Update to write to 'kwh' column
Step 4: Cleanup migration (after all services updated)
-- db/migrations/V011__energy_drop_old_kwh.sql
DROP TRIGGER tr_sync_energy_kwh ON fact_energy_hourly;
DROP FUNCTION sync_energy_kwh();
ALTER TABLE fact_energy_hourly DROP COLUMN energy_kwh;
Impact:
- ⚠️ Requires two-phase deployment
- ✅ Zero downtime with transition period
- ✅ Can rollback during transition
Scenario 3: Add New Table
Requirement: Add water consumption tracking.
Step 1: Migration
-- db/migrations/V010__water_consumption.sql
CREATE TABLE water_consumption (
id SERIAL PRIMARY KEY,
device_id INTEGER REFERENCES generic_device(id),
ts TIMESTAMP NOT NULL,
liters DECIMAL(12,2),
source_file UUID REFERENCES ingest_file(file_id),
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_water_consumption_ts ON water_consumption(ts);
CREATE INDEX idx_water_consumption_device ON water_consumption(device_id);
Step 2: Add Ingestor Components
ingestor/app/
├── models.py # Add WaterConsumptionRecord
├── mappings/
│ └── water_consumption.yaml # New mapping file
└── dao/
└── (extend data_dao or create water_dao)
Step 3: Add API Components
api/app/
├── schemas/
│ └── water.py # New response schemas
├── db/queries/
│ └── water.py # New query functions
└── routers/
└── water.py # New router
# api/app/main.py
from app.routers import water
app.include_router(
water.router,
prefix="/api/v1/water",
tags=["Water"]
)
Step 4: Deploy
# 1. Apply migration
docker-compose run flyway migrate
# 2. Deploy API (endpoints ready, no data yet)
docker-compose up -d api
# 3. Deploy ingestor (starts collecting data)
docker-compose up -d ingestor
Scenario 4: Add Aggregation View
Requirement: Add monthly energy summary view.
Step 1: Migration only
-- db/migrations/V010__energy_monthly_view.sql
CREATE OR REPLACE VIEW v_energy_monthly AS
SELECT
DATE_TRUNC('month', ts)::date AS month,
device_id,
SUM(energy_kwh) AS total_kwh,
COUNT(*) AS reading_count
FROM fact_energy_hourly
GROUP BY DATE_TRUNC('month', ts), device_id
ORDER BY month DESC;
Step 2: Add API endpoint (optional query function)
# api/app/db/queries/energy.py
def get_monthly_summary(limit: int = 12):
return execute_query(
"SELECT * FROM v_energy_monthly LIMIT %(limit)s",
{"limit": limit}
)
Impact:
- ✅ No ingestor changes needed
- ✅ Grafana can also query the view
- ✅ Simple deployment
Change Impact Matrix
| Change | Ingestor | API | Grafana |
|---|---|---|---|
| Add nullable column | Update if ingesting | Update if exposing | May need update |
| Add required column | Must update | Must update | Must update |
| Rename column | Update mappings | Update queries | Update queries |
| Add table | Add model+DAO | Add router+schema | Add dashboard |
| Add view | None | Can query | Can query |
| Drop column | Remove from model | Remove from schema | Remove from queries |
Deployment Order
Safe Order (No Downtime)
1. Apply database migration
└── Schema changes are live
2. Deploy API service
└── Reads are safe (new columns return null)
└── Old endpoints still work
3. Deploy Ingestor service
└── Writes now include new columns
Why This Order?
- API first: Read operations are inherently safe
- New columns default to null: Existing API code handles missing data
- Ingestor last: Ensures writes are compatible with new schema
Rollback Procedures
Column Addition Rollback
-- If something goes wrong after adding a column
ALTER TABLE table_name DROP COLUMN column_name;
Complex Change Rollback
For changes with triggers/functions:
-- Store rollback script
-- db/migrations/V010__change_ROLLBACK.sql (not applied automatically)
DROP TRIGGER IF EXISTS trigger_name ON table_name;
DROP FUNCTION IF EXISTS function_name();
ALTER TABLE table_name DROP COLUMN IF EXISTS new_column;
Pre-Change Checklist
- Document the change requirement
- Identify affected services (Ingestor, API, Grafana)
- Write migration SQL
- Test migration on local database
- Update Ingestor models/mappings (if applicable)
- Update API schemas/queries (if applicable)
- Run tests for both services
- Plan deployment order
- Prepare rollback procedure
Post-Change Checklist
- Migration applied successfully
- API health check passes
- API endpoints return expected data
- Ingestor starts without errors
- New data is being ingested correctly
- Grafana dashboards work (if affected)
- Monitor for errors in logs