API Gateway Run Book: Operations Guide
This run book provides operational procedures for managing the WizDevTech API Gateway application, a Spring Boot application that provides caching capabilities for API responses and Server-Side Rendering (SSR) content.
- API Gateway: Main Spring Boot application (GatewayApplication.java)
- Controllers: REST endpoints for API and health checks
- Caching Layer: Dual caching system (in-memory + database)
- Scheduled Tasks: Automatic cache cleanup
- Database: H2 in-memory (test) or configurable datasource
- Product API with intelligent caching
- SSR fallback mechanism with HTML caching
- Automatic cache expiration and cleanup
- Health monitoring endpoints
- Test utilities for cache verification
- Java 17+
- Maven/Gradle
- Database connection (configured in application.yml)
# Standard startup
java -jar gateway-application.jar
# With specific profile
java -jar gateway-application.jar --spring.profiles.active=test
# Development mode
./mvnw spring-boot:run
- Check application logs for successful startup
- Verify endpoints:
curl http://localhost:8080/health
curl http://localhost:8080/api/test
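The two verification requests can be wrapped in a small check that compares each response body with the text this run book documents for the endpoints. The following is a minimal sketch, not part of the application: `check_gateway` and `BASE_URL` are hypothetical names, and it assumes the endpoints return the documented messages as plain text without surrounding quotes.

```shell
# Hypothetical helper: compare each endpoint's body with the documented response.
BASE_URL="${BASE_URL:-http://localhost:8080}"   # assumed default, taken from the curl examples

check_gateway() {
  # -s hides progress output, -f makes curl return non-zero on HTTP 4xx/5xx
  health="$(curl -sf "$BASE_URL/health")" || { echo "health: request failed"; return 1; }
  if [ "$health" != "DB connection successful!" ]; then
    echo "health: unexpected response: $health"
    return 1
  fi

  api="$(curl -sf "$BASE_URL/api/test")" || { echo "api: request failed"; return 1; }
  case "$api" in
    "API is working!"*) echo "ok" ;;               # the timestamp suffix varies, so match the prefix only
    *) echo "api: unexpected response: $api"; return 1 ;;
  esac
}
```

Calling `check_gateway` after startup prints `ok` and returns 0 only when both endpoints answer as documented, which makes it usable in a deployment script.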
- Endpoint: GET /health
- Purpose: Database connectivity check
- Expected Response: "DB connection successful!"
- Error Response: "DB connection failed: <error message>"
- Endpoint: GET /api/test
- Purpose: API service verification
- Expected Response: "API is working! Current time: <timestamp>"
# Successful cache operations
"Cache hit for product id: {id}"
"Cache miss for product id: {id}"
"Cache hit for SSR route: {route}"
# Error conditions
"Error in product endpoint: {error}"
"Error in fallback endpoint: {error}"
"DB connection failed: {error}"
# Scheduled cleanup
"Cleaned expired cache entries."
- INFO: Normal operations, cache hits/misses
- WARN: Non-critical issues
- ERROR: Application errors, endpoint failures
- DEBUG: Detailed operation traces
- Table: api_responses
- Purpose: Cache API responses with TTL
- TTL: 1 hour (3600 seconds)
- Cleanup: Hourly via scheduler
- Table: ssr_cache
- Purpose: Cache pre-rendered HTML
- TTL: Configurable per route
- Cleanup: Manual or application-level
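The interaction between the 1 hour TTL and the hourly cleanup of api_responses is easy to misread: an entry stops being valid after 3600 seconds, but the row is only deleted at the next cleanup run. The sketch below works through that arithmetic. The names `is_expired`, `API_RESPONSE_TTL` and `CLEANUP_INTERVAL` are illustrative, and treating an age of exactly 3600 seconds as expired is an assumption, not a statement about the application's comparison.

```shell
API_RESPONSE_TTL=3600      # seconds, the 1 hour TTL stated for api_responses
CLEANUP_INTERVAL=3600      # seconds, the hourly scheduler

# is_expired CREATED_EPOCH [NOW_EPOCH]
# Succeeds (exit status 0) when the entry's age has reached the TTL.
# Counting "age == TTL" as expired is an assumption of this sketch.
is_expired() {
  created="$1"
  now="${2:-$(date +%s)}"   # default to the current time
  [ $(( now - created )) -ge "$API_RESPONSE_TTL" ]
}

# Worst case: a row expires just after a cleanup run and waits a full interval.
max_row_age=$(( API_RESPONSE_TTL + CLEANUP_INTERVAL ))
echo "a row can remain in the table for up to ${max_row_age}s"
```

When inspecting the table, rows older than one hour are therefore not necessarily a fault, provided they are younger than roughly two hours.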
# Test cache with specific key
curl "http://localhost:8080/cache-test?key=test-123"
# Verify cache status
curl "http://localhost:8080/verify-cache?key=test-123"
# Manual cleanup of test data
curl -X DELETE http://localhost:8080/clear-test-data
Monitor cache performance through logs:
- Hit/miss ratios per endpoint
- Cache entry counts
- Cleanup operation results
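A hit ratio can be estimated directly from the log messages listed in this run book. The following is a rough sketch: `cache_hit_ratio` is a hypothetical helper, and it assumes the log file contains the "Cache hit for" and "Cache miss for" messages verbatim. Because no miss message is documented for SSR routes, the result is most meaningful for the product endpoint.

```shell
# cache_hit_ratio LOGFILE: count documented hit/miss messages and print the ratio
cache_hit_ratio() {
  awk '
    /Cache hit for/  { hits++ }
    /Cache miss for/ { misses++ }
    END {
      total = hits + misses
      if (total == 0) { print "no cache events"; exit }
      printf "hits=%d misses=%d ratio=%.1f%%\n", hits, misses, 100 * hits / total
    }
  ' "$1"
}
```

The log file path depends on how logging is configured for the deployment, so it is passed in as an argument rather than hard-coded.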
# Get product (creates cache entry)
curl http://localhost:8080/api/products/123
# Subsequent calls use cache
curl http://localhost:8080/api/products/123
# Access SSR route (returns cached or 404)
curl http://localhost:8080/api/fallback/homepage
curl http://localhost:8080/api/fallback/category/electronics
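To check that a repeated product request is actually served from the cache, the two calls can be timed with curl's `--write-out` variable `time_total`. This is a sketch with hypothetical helper names (`time_request`, `compare_cache_timing`); a faster second request is only an indication, and the "Cache hit for product id" log line remains the authoritative signal.

```shell
# time_request URL: print the total request time in seconds as measured by curl
time_request() {
  curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# compare_cache_timing URL: request the same URL twice and print both timings
compare_cache_timing() {
  first="$(time_request "$1")"    # expected to be a cache miss on a cold cache
  second="$(time_request "$1")"   # expected to be a cache hit
  echo "first=${first}s second=${second}s"
}

# Example (not executed here):
# compare_cache_timing http://localhost:8080/api/products/123
```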
- Automatic: Scheduled cleanup runs every hour
- Manual: Use test endpoints for verification
- Database: Query tables directly if needed
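For the direct database option, row counts of the two cache tables are a quick first look that needs no knowledge of column names. The sketch below assumes a PostgreSQL datasource configured through the SPRING_DATASOURCE_* variables shown in this run book and a `psql` client on the operator's machine. `jdbc_to_libpq` and `count_cache_rows` are hypothetical helpers, and stripping the `jdbc:` prefix only works for simple URLs without JDBC-specific parameters.

```shell
# jdbc_to_libpq JDBC_URL: strip the "jdbc:" prefix so psql can use the URL
jdbc_to_libpq() {
  printf '%s\n' "${1#jdbc:}"
}

# count_cache_rows: print the row count of each cache table
count_cache_rows() {
  uri="$(jdbc_to_libpq "$SPRING_DATASOURCE_URL")"
  for table in api_responses ssr_cache; do
    # -A unaligned output, -t rows only, -c run a single command
    n="$(PGUSER="$SPRING_DATASOURCE_USERNAME" PGPASSWORD="$SPRING_DATASOURCE_PASSWORD" \
         psql -At -c "SELECT COUNT(*) FROM $table;" "$uri")" || return 1
    echo "$table: $n"
  done
}
```

With the H2 in-memory database of the test profile this approach does not apply, since the data lives inside the application process.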
Symptoms: All requests bypass cache
Investigation:
# Check cache entries in database
# Verify timestamps and expiry
# Check logs for cache operations
Resolution:
- Ensure database connectivity
- Verify system time synchronization
- Check transaction isolation levels
Symptoms: Health endpoint returns connection errors
Investigation:
- Check application.yml configuration
- Verify database service status
- Check connection pool settings
Resolution:
- Restart application
- Verify database credentials
- Check network connectivity
Symptoms: Application becomes sluggish
Investigation:
- Monitor cache table sizes
- Check for memory leaks
- Review cache expiration settings
Resolution:
- Reduce cache TTL
- Increase cleanup frequency
- Tune JVM memory settings
Symptoms: Fallback endpoints return 404
Investigation:
- Check route normalization (leading slash)
- Verify expiry dates
- Review SSR cache population
Resolution:
- Ensure routes are properly formatted
- Check cache population process
- Verify expiry logic
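When investigating 404 responses from the fallback endpoint, it helps to rule out slash problems on the client side first. The sketch below shows one possible normalization (exactly one leading slash) and prints the HTTP status for the resulting URL. `normalize_route`, `fallback_status` and `BASE_URL` are hypothetical, and the gateway's own normalization may differ from this one.

```shell
BASE_URL="${BASE_URL:-http://localhost:8080}"   # assumed default, taken from the curl examples

# normalize_route ROUTE: ensure exactly one leading slash (illustrative rule only)
normalize_route() {
  route="$1"
  # strip every leading slash, then add a single one back
  while [ "${route#/}" != "$route" ]; do route="${route#/}"; done
  printf '/%s\n' "$route"
}

# fallback_status ROUTE: print the HTTP status code of the fallback endpoint
fallback_status() {
  path="$(normalize_route "$1")"
  curl -s -o /dev/null -w '%{http_code}\n' "$BASE_URL/api/fallback$path"
}

# Example (not executed here):
# fallback_status category/electronics
```

If a route returns 404 regardless of how its slashes are written, the cause is more likely an expired or missing ssr_cache entry than the route format.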
- Response Times
  - Cache hits vs. misses
  - Database query times
  - Overall endpoint latency
- Cache Efficiency
  - Hit ratio percentage
  - Cache size and growth
  - Expiration rates
- System Resources
  - JVM memory usage
  - Database connections
  - CPU utilization
- Error Rates
  - 5xx error frequency
  - Database failures
  - Timeout occurrences
- Check health endpoints
- Review error logs
- Monitor cache performance
- Analyze cache hit ratios
- Review system resource usage
- Check for expired entries
- Performance trend analysis
- Cache strategy review
- System resource optimization
# Database configuration
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/gateway
SPRING_DATASOURCE_USERNAME=gateway_user
SPRING_DATASOURCE_PASSWORD=secret
# Logging levels
LOGGING_LEVEL_COM_WIZDEVTECH=DEBUG
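Spring Boot maps these environment variables onto the corresponding properties (for example SPRING_DATASOURCE_URL onto spring.datasource.url), so a missing variable typically shows up only as a startup or connection failure. A pre-flight check such as the following sketch can catch that earlier; `require_env` is a hypothetical helper, not something shipped with the application.

```shell
# require_env NAME...: report every listed variable that is unset or empty
require_env() {
  missing=0
  for name in "$@"; do
    # eval performs the indirect lookup; pass only fixed, trusted variable names
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not executed here):
# require_env SPRING_DATASOURCE_URL SPRING_DATASOURCE_USERNAME SPRING_DATASOURCE_PASSWORD \
#   && java -jar gateway-application.jar
```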
- test: In-memory H2 database
- production: Production database settings
- development: Debug logging enabled
- Check system resources (CPU, memory)
- Review recent log entries
- Restart application gracefully
- If needed, perform hard restart
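The graceful-then-hard restart sequence for an unresponsive application can be sketched as follows. SIGTERM lets the JVM run its shutdown hooks, which Spring Boot uses to close the application context; SIGKILL is the fallback. `stop_gateway` is a hypothetical helper, the 30 second default is an arbitrary assumption, and deployments managed by a service manager should use that manager's stop command instead.

```shell
# stop_gateway PID [TIMEOUT_SECONDS]: send SIGTERM, wait, then SIGKILL if still running
stop_gateway() {
  pid="$1"
  timeout="${2:-30}"   # assumed default grace period in seconds

  kill -TERM "$pid" 2>/dev/null || { echo "not running"; return 0; }

  while [ "$timeout" -gt 0 ]; do
    # kill -0 sends no signal; it only tests whether the process still exists
    kill -0 "$pid" 2>/dev/null || { echo "stopped gracefully"; return 0; }
    sleep 1
    timeout=$(( timeout - 1 ))
  done

  kill -KILL "$pid" 2>/dev/null
  echo "killed after timeout"
}

# Example (not executed here), assuming the standard startup command was used:
# stop_gateway "$(pgrep -f gateway-application.jar)" 30 && java -jar gateway-application.jar
```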
- Verify database service status
- Check connection pool health
- Clear problematic cache entries
- Restart with clean cache if necessary
- Stop cache cleanup scheduler
- Analyze problematic entries
- Clear affected cache tables
- Restart application
- Level 1: Application restart
- Level 2: Configuration review
- Level 3: Developer/architect consultation
- API Documentation: [Internal Wiki]
- Architecture Diagrams: [Design Documents]
- Deployment Guides: [DevOps Runbooks]
- Application Version: [From build.gradle/pom.xml]
- Java Version: 17+
- Spring Boot Version: 3.x
- Last Updated: [Date]