Post-Deployment
This guide covers verification, testing, and next steps after deploying TMI.
Overview
After deploying all TMI components, complete the following steps:
- Verify that all services are running
- Perform smoke tests
- Test critical user flows
- Set up monitoring and alerts
- Document your deployment
- Train users (optional)
Deployment Verification
Check All Services
Verify each component is running and accessible.
Web Application
# Check if web app is accessible
curl -I https://tmi.example.com
# Expected: HTTP/2 200 (HTTP/2 status lines carry no "OK" reason phrase)
# Verify static assets load
curl -I https://tmi.example.com/main.js
# Expected: HTTP/2 200 with Cache-Control headers
In browser:
- Navigate to https://tmi.example.com
- Verify page loads without errors
- Open DevTools Console - should have no errors
- Check Network tab - all resources should load (200 OK)
API Server
# Health check (root endpoint serves as health/info endpoint)
curl https://api.tmi.example.com/
# Expected response (JSON when called without Accept: text/html):
{
"status": {
"code": "ok",
"time": "2025-11-12T00:00:00Z"
},
"service": {
"name": "TMI",
"build": "1.3.2-abc1234 (production)"
},
"api": {
"specification": "https://github.com/ericfitz/tmi/blob/main/api-schema/tmi-openapi.json",
"version": "1.3.2"
}
}
# Note: status.code is "ok", "degraded", or "error".
# When "degraded", a "health" object with database/redis details is included.
# An optional "operator" object (name, contact) may also appear if configured.
# Check OAuth providers
curl https://api.tmi.example.com/oauth2/providers
# Expected response with configured providers
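A monitoring script may want to act on status.code rather than only on the HTTP status. The following is a minimal sketch, not part of TMI: health_status is a hypothetical helper that reads the health JSON from stdin and maps the three documented values to exit codes. It assumes the key "code" appears only once in the response, inside the status object as in the sample above; jq would be a more robust choice where it is installed.

```shell
# Hypothetical helper: print status.code from the health JSON on stdin.
# Exit code: 0 for "ok", 1 for "degraded", 2 for anything else.
health_status() {
  local code
  # Pull out the first "code": "<value>" pair, then keep only the value.
  code=$(grep -o '"code"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/')
  case "$code" in
    ok)       echo "ok";       return 0 ;;
    degraded) echo "degraded"; return 1 ;;
    *)        echo "error";    return 2 ;;
  esac
}

# Usage (requires network access to the API):
#   curl -fsS https://api.tmi.example.com/ | health_status
```

Treating "degraded" as a distinct exit code lets a scheduler warn without paging, which matches the note that a degraded response carries extra database/redis detail.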
PostgreSQL
# Connection test
psql -h postgres-host -U tmi_user -d tmi -c "SELECT version();"
# Check tables exist
psql -h postgres-host -U tmi_user -d tmi -c "\dt"
# Expected: List of TMI tables (users, threat_models, diagrams, etc.)
# Check for data
psql -h postgres-host -U tmi_user -d tmi -c "SELECT count(*) FROM users;"
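The table check above can be automated rather than read by eye. This is a sketch under assumptions: missing_tables is a hypothetical helper, the expected table names are the three mentioned above (the real schema has more), and the tables are assumed to live in the public schema.

```shell
# Hypothetical helper: read table names from stdin (one per line) and report
# every expected table (given as arguments) that is absent.
missing_tables() {
  local present t rc=0
  present=$(cat)
  for t in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string.
    if ! printf '%s\n' "$present" | grep -qxF "$t"; then
      echo "missing table: $t"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (psql -A: unaligned output, -t: rows only):
#   psql -h postgres-host -U tmi_user -d tmi -At \
#     -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'" \
#     | missing_tables users threat_models diagrams
```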
Redis
# Connection test
redis-cli -h redis-host -p 6379 -a password ping
# Expected: PONG
# Check memory usage
redis-cli -h redis-host -a password info memory | grep used_memory_human
# Check if keys exist
redis-cli -h redis-host -a password DBSIZE
Smoke Tests
Perform basic functionality tests to verify the system works correctly.
Test 1: User Registration via OAuth
- Navigate to app: https://tmi.example.com
- Click "Login" or appropriate login button
- Select OAuth provider (Google, GitHub, or Microsoft)
- Authenticate with provider
- Verify redirect back to application
- Check authentication state:
- User should be logged in
- User name/email should display
- No console errors
Expected result: User successfully logs in and sees the authenticated view.
Test 2: Create Threat Model
After logging in:
- Click "New Threat Model" or similar button
- Fill in details:
- Name: "Test Threat Model"
- Description: "Smoke test"
- Framework: STRIDE (or other)
- Submit
- Verify creation:
- Threat model appears in list
- Can navigate to threat model detail page
- No errors in console
Expected result: Threat model is created and accessible.
Test 3: Create Data Flow Diagram
From threat model detail page:
- Click "New Diagram" or similar
- Name: "Test Diagram"
- Open diagram editor
- Add a few shapes:
- External entity
- Process
- Data store
- Save
- Verify:
- Diagram appears in list
- Can reopen diagram
- Shapes are persisted
Expected result: Diagram is created and editable.
Test 4: Real-Time Collaboration
Test WebSocket functionality (requires two browsers/users):
User 1:
- Open threat model
- Open diagram in edit mode
- Add a shape
User 2:
- Open same diagram
- Verify shape appears in real-time
Expected result: Changes appear instantly for both users.
Test 5: Create Threat
From threat model page:
- Click "Add Threat"
- Fill in details:
- Title: "Test Threat"
- Description: "Testing threat creation"
- Category: Spoofing (or other)
- Severity: High
- Save
- Verify:
- Threat appears in list
- Can view threat details
- Can edit threat
Expected result: Threat is created and manageable.
Integration Tests
Test integration between components.
OAuth Token Exchange
# This test simulates the OAuth flow programmatically
# Step 1: Get OAuth authorization URL
curl https://api.tmi.example.com/oauth2/providers
# Step 2: Manually authenticate and get authorization code
# (Done in browser, copy code from callback URL)
# Step 3: Exchange code for tokens
curl -X POST "https://api.tmi.example.com/oauth2/token?idp=google" \
-H "Content-Type: application/json" \
-d '{
"code": "AUTHORIZATION_CODE",
"state": "RANDOM_STATE",
"redirect_uri": "https://tmi.example.com/oauth2/callback"
}'
# Expected: JWT access and refresh tokens
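To confirm that the returned access token looks sane, its claims can be inspected locally. The sketch below assumes the token is a standard three-part JWT (header.payload.signature) and that base64 accepts -d, as GNU coreutils does. jwt_payload is a hypothetical helper; it only decodes the payload and does not verify the signature.

```shell
# Hypothetical helper: print the decoded JSON payload of a JWT.
# Decoding only; the signature is NOT verified.
jwt_payload() {
  local p
  p="${1#*.}"      # drop the header segment
  p="${p%%.*}"     # drop the signature segment
  # Convert base64url to standard base64 and restore the padding.
  p=$(printf '%s' "$p" | tr '_-' '/+')
  case $(( ${#p} % 4 )) in
    2) p="$p==" ;;
    3) p="$p=" ;;
  esac
  printf '%s' "$p" | base64 -d
}

# Usage:
#   jwt_payload "$ACCESS_TOKEN"
```

The exp claim in the output, if present, is a Unix timestamp and can be compared against date +%s when testing token expiration.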
API Authentication
# Test authenticated API call
ACCESS_TOKEN="your-access-token-from-above"
curl https://api.tmi.example.com/threat_models \
-H "Authorization: Bearer $ACCESS_TOKEN"
# Expected: List of threat models (or empty array)
Database Connectivity
# Verify server can connect to database
# Check server logs for successful connection
# Docker deployment:
docker logs tmi-api-prod 2>&1 | grep -i database
# Or systemd deployment:
journalctl -u tmi -n 100 | grep -i database
# Should show successful connection and migration status
Redis Caching
# Check if cache is working
# After using application, check Redis for cached data
redis-cli -h redis-host -a password --scan --pattern "cache:*"
# Should show various cache keys
Performance Testing
Load Test API Endpoints
Use ab (Apache Bench) or similar:
# Test root/health endpoint
ab -n 1000 -c 10 https://api.tmi.example.com/
# Expected:
# - All requests succeed (200 OK)
# - Average response time < 100ms
# - No errors
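Where ab is not available, a rough average latency can be collected with curl alone. This is a minimal sketch, not a substitute for a real load test: requests are sequential, and sample_latency and average_ms are hypothetical helper names. The curl write-out variable time_total reports seconds per request.

```shell
# Hypothetical helper: issue N sequential requests and print each total time (seconds).
sample_latency() {
  local url="$1" n="${2:-20}" i=0
  while [ "$i" -lt "$n" ]; do
    curl -o /dev/null -s -w '%{time_total}\n' "$url"
    i=$((i + 1))
  done
}

# Hypothetical helper: average a column of seconds from stdin, printed in whole milliseconds.
average_ms() {
  awk '{ sum += $1; n++ }
       END { if (n == 0) exit 1; printf "%.0f\n", sum / n * 1000 }'
}

# Usage (compare the result with the 100 ms target above):
#   sample_latency https://api.tmi.example.com/ 20 | average_ms
```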
Test Database Query Performance
-- Check slow queries (requires the pg_stat_statements extension; times are in milliseconds)
SELECT
query,
calls,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 10
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Expected: Few or no slow queries
Test Redis Performance
# Measure Redis latency (samples continuously; stop with Ctrl-C)
redis-cli -h redis-host -a password --latency-history
# Expected:
# - Average latency < 1ms
# - No significant spikes
Security Verification
SSL/TLS Configuration
# Check SSL certificate
echo | openssl s_client -servername tmi.example.com \
-connect tmi.example.com:443 2>/dev/null | \
openssl x509 -noout -dates
# Verify:
# - Certificate is valid (not expired)
# - Valid for your domain
# Check SSL configuration
curl -sI https://tmi.example.com | grep -i "strict-transport-security"
# Expected: HSTS header present
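The expiry date printed by openssl can be turned into a number of days remaining, which is easier to alert on. The sketch below assumes GNU date (the -d option); BSD and macOS date use different flags. days_until is a hypothetical helper name.

```shell
# Hypothetical helper: print whole days from now until the given date
# (negative if the date is in the past). Requires GNU date.
days_until() {
  local target now
  target=$(date -d "$1" +%s) || return 1
  now=$(date +%s)
  echo $(( (target - now) / 86400 ))
}

# Usage:
#   end=$(echo | openssl s_client -servername tmi.example.com \
#           -connect tmi.example.com:443 2>/dev/null \
#         | openssl x509 -noout -enddate | cut -d= -f2)
#   days_until "$end"
```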
Security Headers
# Check security headers
curl -I https://tmi.example.com
# Should include:
# - Strict-Transport-Security
# - X-Frame-Options
# - X-Content-Type-Options
# - X-XSS-Protection (legacy header; most current browsers ignore it)
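The header checklist above can be scripted so that a missing header is reported explicitly. This is a sketch: missing_security_headers is a hypothetical helper, and the header list simply mirrors the checklist, so adjust it to your own policy.

```shell
# Hypothetical helper: read response headers from stdin and report which of the
# checklist headers are absent. Returns 1 if any are missing.
missing_security_headers() {
  local headers h rc=0
  headers=$(cat)
  for h in Strict-Transport-Security X-Frame-Options \
           X-Content-Type-Options X-XSS-Protection; do
    # Header names are case-insensitive (HTTP/2 sends them in lowercase).
    if ! printf '%s\n' "$headers" | grep -qi "^$h:"; then
      echo "missing: $h"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   curl -sI https://tmi.example.com | missing_security_headers
```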
OAuth Security
Verify OAuth configuration:
- Redirect URI validation: Try using wrong redirect URI - should fail
- State parameter validation: Try reusing state - should fail
- Token expiration: Wait for token to expire, verify refresh works
- Logout: Verify tokens are invalidated after logout
Database Security
# Verify database is not publicly accessible
nmap -p 5432 your-public-ip
# Expected: Port 5432 should be filtered or closed
# Try connection from unauthorized host
psql -h postgres-host -U tmi_user -d tmi
# Expected: Connection should be rejected if not from allowed IP
Monitoring Setup
Application Monitoring
Set up health check monitoring:
# Create health check script
cat > /usr/local/bin/tmi-health-check.sh << 'EOF'
#!/bin/bash
ERRORS=0
# Check web app
if ! curl -f -s -o /dev/null https://tmi.example.com; then
echo "Web app failed"
ERRORS=$((ERRORS + 1))
fi
# Check API (root endpoint is the health/info endpoint)
if ! curl -f -s -o /dev/null https://api.tmi.example.com/; then
echo "API failed"
ERRORS=$((ERRORS + 1))
fi
# Check database
if ! psql -h postgres-host -U tmi_user -d tmi -c "SELECT 1;" > /dev/null 2>&1; then
echo "Database failed"
ERRORS=$((ERRORS + 1))
fi
# Check Redis
if ! redis-cli -h redis-host -a password ping > /dev/null 2>&1; then
echo "Redis failed"
ERRORS=$((ERRORS + 1))
fi
if [ $ERRORS -gt 0 ]; then
exit 1
else
echo "All services healthy"
exit 0
fi
EOF
chmod +x /usr/local/bin/tmi-health-check.sh
Schedule with cron:
# Add to crontab - run every 5 minutes (replace the placeholder address with your alert recipient)
*/5 * * * * /usr/local/bin/tmi-health-check.sh || mail -s "TMI Health Check Failed" admin@example.com
Log Aggregation
Set up log collection:
# Check logs are being generated
# Docker deployment:
docker logs -f tmi-api-prod
# Or systemd deployment:
sudo journalctl -u tmi -f
# Web server:
sudo tail -f /var/log/nginx/access.log
# Consider setting up centralized logging:
# - ELK Stack (Elasticsearch, Logstash, Kibana)
# - Loki + Grafana
# - Splunk
# - Datadog
Metrics Collection
Consider deploying metrics collection:
- Prometheus: Collect metrics from TMI server
- Grafana: Visualize metrics with dashboards
- Node Exporter: System metrics
- PostgreSQL Exporter: Database metrics
- Redis Exporter: Cache metrics
Alerting
Set up alerts for critical events:
- API server down
- Database connection failed
- Redis unavailable
- High error rate
- Slow response times
- Certificate expiring soon
- Disk space low
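As an illustration of one alert from the list, low disk space can be checked with df. This is a minimal sketch assuming a POSIX df (the -P option); disk_usage_pct and disk_alert are hypothetical helpers, and the 90% default limit is an arbitrary example value.

```shell
# Hypothetical helper: print the used-space percentage (no % sign) for a path.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: print a message and return 1 when usage reaches the limit.
disk_alert() {
  local path="${1:-/}" limit="${2:-90}" used
  used=$(disk_usage_pct "$path") || return 2
  if [ "$used" -ge "$limit" ]; then
    echo "Disk usage on $path is ${used}% (limit ${limit}%)"
    return 1
  fi
}

# Usage (for example from cron, alongside the health check script):
#   disk_alert /var/lib/postgresql 90
```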
Backup Verification
Test Database Backups
# Perform manual backup
pg_dump -h postgres-host -U tmi_user -d tmi -Fc \
-f /backup/tmi_test_$(date +%Y%m%d).dump
# Verify backup file exists and has content
ls -lh /backup/tmi_test_*.dump
# Test restore to temporary database
createdb -h postgres-host -U postgres tmi_test_restore
pg_restore -h postgres-host -U tmi_user \
-d tmi_test_restore /backup/tmi_test_*.dump
# Verify data
psql -h postgres-host -U tmi_user -d tmi_test_restore \
-c "SELECT count(*) FROM threat_models;"
# Clean up
dropdb -h postgres-host -U postgres tmi_test_restore
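Once scheduled backups are in place, a quick freshness check helps confirm they are still running. The sketch below assumes backups are written as *.dump files directly inside one directory and uses GNU find options; backup_is_fresh is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if DIR contains at least one non-empty *.dump file
# modified within the last MAX_AGE_MIN minutes (default 1440, i.e. 24 hours).
backup_is_fresh() {
  local dir="$1" max_age_min="${2:-1440}" found
  found=$(find "$dir" -maxdepth 1 -type f -name '*.dump' \
            -size +0 -mmin "-$max_age_min" -print -quit)
  [ -n "$found" ]
}

# Usage:
#   backup_is_fresh /backup || echo "No recent database backup found"
```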
Test Redis Backups
# Trigger Redis save
redis-cli -h redis-host -a password BGSAVE
# Check RDB file exists
ls -lh /var/lib/redis/dump.rdb
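# Verify AOF if enabled. Redis 7+ stores AOF files in an appendonlydir/ directory;
# earlier versions write a single appendonly.aof file.
ls -lh /var/lib/redis/appendonlydir/ 2>/dev/null || ls -lh /var/lib/redis/appendonly.aof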
Documentation
Document your deployment for future reference and team members.
Deployment Documentation
Create /opt/tmi/DEPLOYMENT_INFO.md:
# TMI Deployment Information
**Deployment Date**: 2025-11-12
**Deployed By**: Your Name
**Environment**: Production
## URLs
- Web Application: https://tmi.example.com
- API Server: https://api.tmi.example.com
- Admin Contact: admin@example.com
## Infrastructure
- Web Server: nginx on server1.example.com (203.0.113.10)
- API Server: TMI v1.0.0 on server2.example.com (203.0.113.11)
- Database: PostgreSQL 15 on db.example.com (internal)
- Cache: Redis 7 on cache.example.com (internal)
## Configuration
- Config File: config-production.yml (mounted at /etc/tmi/ in Docker)
- Environment Variables: TMI_ prefix (e.g., TMI_DATABASE_URL, TMI_REDIS_URL, TMI_JWT_SECRET)
- Environment: Production
- OAuth Providers: Google, GitHub
- JWT Expiration: Configurable via TMI_JWT_EXPIRATION_SECONDS
## Backups
- Database: Daily at 2 AM to /backup/postgresql/
- Retention: 7 days
- Redis: RDB + AOF persistence enabled
## Monitoring
- Health Checks: Every 5 minutes via cron
- Logs: docker logs tmi-api-prod, /var/log/nginx/
- Alerts: Email to admin@example.com
## Maintenance
- SSL Certificates: Auto-renewal via Let's Encrypt
- Updates: Manual, test in staging first
- Contact: admin@example.com for issues
Runbook
Create an operational runbook for common tasks:
# TMI Operations Runbook
## Starting Services (Docker Compose)
docker compose -f docker-compose.prod.yml up -d
### Or individually (dependencies first):
docker start tmi-postgres-prod
docker start tmi-redis-prod
docker start tmi-api-prod
## Stopping Services
docker compose -f docker-compose.prod.yml down
# Or individually (reverse order of starting):
docker stop tmi-api-prod
docker stop tmi-redis-prod
docker stop tmi-postgres-prod
## Starting Services (systemd, if applicable)
### Web Application (nginx)
sudo systemctl start nginx
### API Server
sudo systemctl start tmi
### Database
sudo systemctl start postgresql
### Redis
sudo systemctl start redis
## Restarting After Configuration Change
1. Edit config: config-production.yml (or update TMI_ environment variables)
2. Restart: docker restart tmi-api-prod (or sudo systemctl restart tmi)
3. Verify: curl https://api.tmi.example.com/
## Viewing Logs
- API (Docker): docker logs -f tmi-api-prod
- API (systemd): sudo journalctl -u tmi -f
- Web: sudo tail -f /var/log/nginx/access.log
- Database: docker logs -f tmi-postgres-prod
- Redis: docker logs -f tmi-redis-prod
## Common Issues
See Troubleshooting section below.
User Training
If deploying for a team, consider:
- User Documentation: Link to the wiki Home page
- Training Session: Walk through creating first threat model
- Demo Video: Record tutorial for async learning
- Support Channel: Set up Slack/Teams channel for questions
- Admin Documentation: Document admin tasks
Next Steps
After successful deployment:
Immediate (Day 1)
- Monitor logs for errors
- Watch for failed authentication attempts
- Verify backups are running
- Confirm that health check alerts are working
Short Term (Week 1)
- Review security logs
- Monitor performance metrics
- Gather user feedback
- Document any issues encountered
- Fine-tune performance if needed
Ongoing
- Regular security updates
- Monitor resource usage
- Review and rotate logs
- Test backup restores monthly
- Update documentation as needed
- Plan for scaling if usage grows
Common Post-Deployment Issues
High Memory Usage
Symptom: Server running out of memory.
Solutions:
- Check Redis memory usage: redis-cli info memory
- Review PostgreSQL connection count
- Check for memory leaks in TMI server logs
- Adjust cache TTL settings
- Scale up server resources
Slow Response Times
Symptom: Application feels sluggish.
Solutions:
- Check database query performance
- Review Redis cache hit rate
- Enable query logging to find slow queries
- Optimize database indexes
- Consider connection pooling
- Add more server instances
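The Redis cache hit rate mentioned in the list can be derived from the keyspace_hits and keyspace_misses counters in the INFO stats output. This is a sketch; redis_hit_rate is a hypothetical helper, and the counters cover the whole Redis instance since its last restart, not only TMI cache keys.

```shell
# Hypothetical helper: read `redis-cli info stats` output from stdin and print
# the hit rate as a percentage, or "n/a" when there have been no lookups.
redis_hit_rate() {
  # INFO output uses CRLF line endings, so strip carriage returns first.
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", hits * 100 / total
    }'
}

# Usage:
#   redis-cli -h redis-host -a password info stats | redis_hit_rate
```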
Authentication Failures
Symptom: Users can't log in.
Solutions:
- Verify OAuth provider credentials
- Check OAuth callback URLs match
- Review TMI server logs for token exchange errors
- Verify TMI_JWT_SECRET is set correctly
- Check OAuth provider status pages
WebSocket Disconnections
Symptom: Real-time collaboration doesn't work.
Solutions:
- Check reverse proxy WebSocket configuration
- Verify WebSocket timeout settings
- Review CORS configuration
- Check for network issues
- Verify allowed origins configuration
Rollback Plan
If the deployment has critical issues:
Rollback Steps
- Stop the current TMI server:
# Docker deployment:
docker stop tmi-api-prod
# Or systemd deployment:
sudo systemctl stop tmi
- Restore previous version:
# Docker deployment - roll back to previous image tag:
docker compose -f docker-compose.prod.yml down
# Edit docker-compose.prod.yml to use previous VERSION, then:
docker compose -f docker-compose.prod.yml up -d
# Or systemd deployment:
sudo cp /opt/tmi/tmiserver.backup /opt/tmi/tmiserver
sudo cp /etc/tmi/config-production.yml.backup /etc/tmi/config-production.yml
- Restore database (if schema changed):
psql -h postgres-host -U tmi_user -d tmi < /backup/pre_deployment_backup.sql
- Start previous version (systemd only; Docker restarts via compose above):
sudo systemctl start tmi
- Verify rollback:
curl https://api.tmi.example.com/
- Notify users of temporary issues and rollback
Success Criteria
Your deployment is successful when:
- All services are running and accessible
- Users can log in via OAuth
- Users can create threat models
- Users can create and edit diagrams
- Real-time collaboration works
- No errors in logs (or only expected warnings)
- Health checks pass consistently
- Backups are running automatically
- Monitoring and alerts are active
- SSL certificates are valid
- Performance is acceptable
- Security scan shows no critical issues
Getting Help
If you encounter issues:
- Check documentation (see Related Pages)
- Review Logs:
- Application logs
- Web server logs
- Database logs
- System logs
- Community Support:
- GitHub Issues: tmi
- GitHub Discussions: Share deployment experiences
- Professional Support:
- Contact maintainers for enterprise support
- Consider hiring a DevOps consultant for complex deployments
Congratulations
You have successfully deployed TMI. Your team can now:
- Create threat models collaboratively
- Build data flow diagrams in real-time
- Document threats systematically
- Integrate with issue tracking systems
- Export and share threat models
Related Pages
- Planning-Your-Deployment -- Pre-deployment planning and decision matrix
- Component-Integration -- Connecting TMI components
- Security-Best-Practices -- Security guidelines
- Monitoring-and-Health -- Ongoing monitoring procedures
- Common-Issues -- Frequently encountered problems
- Performance-and-Scaling -- Performance tuning and scaling strategies