API Gateway Run Book ‐ Operations Guide - Wiz-DevTech/prettygirllz GitHub Wiki

Overview

This run book provides operational procedures for managing the WizDevTech API Gateway application, a Spring Boot application that provides caching capabilities for API responses and Server-Side Rendering (SSR) content.

System Architecture

Components

  • API Gateway: Main Spring Boot application (GatewayApplication.java)
  • Controllers: REST endpoints for API and health checks
  • Caching Layer: Dual caching system (in-memory + database)
  • Scheduled Tasks: Automatic cache cleanup
  • Database: H2 in-memory (test) or configurable datasource

Key Features

  • Product API with intelligent caching
  • SSR fallback mechanism with HTML caching
  • Automatic cache expiration and cleanup
  • Health monitoring endpoints
  • Test utilities for cache verification

Application Startup

Prerequisites

  • Java 17+
  • Maven/Gradle
  • Database connection (configured in application.yml)

Starting the Application

# Standard startup
java -jar gateway-application.jar

# With specific profile
java -jar gateway-application.jar --spring.profiles.active=test

# Development mode
./mvnw spring-boot:run

Startup Verification

  1. Check application logs for successful startup
  2. Verify endpoints:
    curl http://localhost:8080/health
    curl http://localhost:8080/api/test
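Startup can take a few seconds, so a single curl right after launch may fail even when the application is fine. The following is a minimal sketch of a retry loop for the verification step. The function name `wait_until_ready` and the attempt and delay values are illustrative, not part of the gateway. Note that `curl --fail` only checks the HTTP status; whether `/health` returns a non-2xx status on a database failure is not stated in this guide.

```bash
# Run a probe command repeatedly until it succeeds or attempts run out.
# Usage: wait_until_ready <max_attempts> <delay_seconds> <probe command...>
wait_until_ready() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $attempt attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    [ "$attempt" -lt "$max_attempts" ] && sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "not ready after $max_attempts attempt(s)" >&2
  return 1
}

# Example: poll the health endpoint up to 30 times, 2 seconds apart.
# wait_until_ready 30 2 curl --fail --silent http://localhost:8080/health
```

Passing the probe as a command keeps the loop reusable for `/api/test` or any other check.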

Monitoring & Health Checks

Health Endpoints

System Health

  • Endpoint: GET /health
  • Purpose: Database connectivity check
  • Expected Response: "DB connection successful!"
  • Error Response: "DB connection failed: <error message>"

Application Health

  • Endpoint: GET /api/test
  • Purpose: API service verification
  • Expected Response: "API is working! Current time: <timestamp>"
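For scripted monitoring, the `/health` response body can be mapped to a status. This sketch matches only the two response prefixes documented above and treats anything else (empty body, proxy error page) as unknown. The function name `classify_health` and the exit codes are assumptions made for illustration.

```bash
# Map the /health response body to a status word and an exit code.
# Usage: classify_health "<response body>"
classify_health() {
  local body="$1"
  case "$body" in
    "DB connection successful!"*) echo "healthy";   return 0 ;;
    "DB connection failed:"*)     echo "unhealthy"; return 1 ;;
    *)                            echo "unknown";   return 2 ;;
  esac
}

# Example against a running gateway:
# classify_health "$(curl --silent --max-time 5 http://localhost:8080/health)"
```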

Log Monitoring

Key Log Patterns to Monitor

# Successful cache operations
"Cache hit for product id: {id}"
"Cache miss for product id: {id}"
"Cache hit for SSR route: {route}"

# Error conditions
"Error in product endpoint: {error}"
"Error in fallback endpoint: {error}"
"DB connection failed: {error}"

# Scheduled cleanup
"Cleaned expired cache entries."

Log Levels

  • INFO: Normal operations, cache hits/misses
  • WARN: Non-critical issues
  • ERROR: Application errors, endpoint failures
  • DEBUG: Detailed operation traces
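The error patterns listed under Key Log Patterns can be counted with a single `grep`. This is a sketch only: the log file location is not given in this guide, so the path is passed in by the caller, and the function name `count_error_lines` is hypothetical.

```bash
# Count log lines that match the documented error patterns.
# Prints the count; grep's exit status is 0 if any line matched, 1 if none.
# Usage: count_error_lines <logfile>
count_error_lines() {
  local logfile="$1"
  grep -c -E 'Error in (product|fallback) endpoint:|DB connection failed:' "$logfile"
}

# Example (path is an assumption):
# count_error_lines /var/log/gateway/application.log
```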

Cache Management

Cache Types

API Response Cache

  • Table: api_responses
  • Purpose: Cache API responses with TTL
  • TTL: 1 hour (3600 seconds)
  • Cleanup: Hourly via scheduler

SSR Content Cache

  • Table: ssr_cache
  • Purpose: Cache pre-rendered HTML
  • TTL: Configurable per route
  • Cleanup: Manual or application-level
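When inspecting an `api_responses` entry by hand, the one-hour TTL reduces to simple arithmetic on epoch seconds. The sketch below assumes the entry's creation time is available as an epoch timestamp and that an entry counts as expired once its age reaches the TTL; the actual column names and boundary condition used by the application are not documented here.

```bash
API_CACHE_TTL_SECONDS=3600   # 1 hour, the api_responses TTL stated above

# Succeeds (exit 0) when an entry created at <created_epoch> is past its TTL.
# Usage: is_expired <created_epoch> [now_epoch] [ttl_seconds]
is_expired() {
  local created="$1" now="${2:-$(date +%s)}" ttl="${3:-$API_CACHE_TTL_SECONDS}"
  [ $((now - created)) -ge "$ttl" ]
}

# Example: an entry created 2 hours ago is expired.
# is_expired "$(( $(date +%s) - 7200 ))" && echo "expired"
```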

Cache Operations

Cache Verification

# Test cache with specific key
curl "http://localhost:8080/cache-test?key=test-123"

# Verify cache status
curl "http://localhost:8080/verify-cache?key=test-123"

Cache Cleanup

# Manual cleanup of test data
curl -X DELETE http://localhost:8080/clear-test-data

Cache Statistics

Monitor cache performance through logs:

  • Hit/miss ratios per endpoint
  • Cache entry counts
  • Cleanup operation results
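A rough product-cache hit ratio can be derived from the "Cache hit" and "Cache miss" log lines shown under Key Log Patterns. This is a sketch that assumes one log line per lookup and a log file path supplied by the caller; `cache_hit_ratio` is a hypothetical helper name.

```bash
# Print the product cache hit ratio as a percentage with one decimal place,
# or "n/a" when the log contains no product cache lookups.
# Usage: cache_hit_ratio <logfile>
cache_hit_ratio() {
  local logfile="$1"
  awk '
    /Cache hit for product id:/  { hits++ }
    /Cache miss for product id:/ { misses++ }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hits / total
    }
  ' "$logfile"
}
```

SSR lookups are deliberately excluded because only the "Cache hit for SSR route" message is documented, so a miss count for SSR routes cannot be derived from the logs alone.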

Common Operations

1. Product API Operations

# Get product (creates cache entry)
curl http://localhost:8080/api/products/123

# Subsequent calls use cache
curl http://localhost:8080/api/products/123

2. SSR Fallback Operations

# Access an SSR route (returns cached HTML or 404)
curl http://localhost:8080/api/fallback/homepage
curl http://localhost:8080/api/fallback/category/electronics

3. Cache Maintenance

  • Automatic: Scheduled cleanup runs every hour
  • Manual: Use test endpoints for verification
  • Database: Query tables directly if needed

Troubleshooting Guide

Common Issues

1. Cache Not Working

Symptoms: All requests bypass cache

Investigation:

  • Check cache entries in the database
  • Verify timestamps and expiry
  • Check logs for cache operations

Resolution:

  • Ensure database connectivity
  • Verify system time synchronization
  • Check transaction isolation levels

2. Database Connection Failures

Symptoms: Health endpoint returns connection errors

Investigation:

  • Check application.yml configuration
  • Verify database service status
  • Check connection pool settings

Resolution:

  • Restart application
  • Verify database credentials
  • Check network connectivity
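To check network connectivity, it helps to pull the host and port out of the configured JDBC URL first. The sketch below handles URLs of the form `jdbc:<driver>://host:port/db`, such as the PostgreSQL example in this guide, and falls back to PostgreSQL's default port 5432 when none is given. It does not apply to H2 in-memory URLs. The function name `jdbc_host_port` is illustrative.

```bash
# Print "host port" for a JDBC URL like jdbc:postgresql://host:port/db
# Usage: jdbc_host_port <jdbc_url>
jdbc_host_port() {
  local authority host port
  authority="${1#*://}"          # drop the "jdbc:postgresql://" prefix
  authority="${authority%%/*}"   # drop "/gateway" and anything after it
  host="${authority%%:*}"
  case "$authority" in
    *:*) port="${authority##*:}" ;;
    *)   port=5432 ;;            # PostgreSQL's default port
  esac
  echo "$host $port"
}

# Example: probe TCP reachability using bash's /dev/tcp redirection.
# set -- $(jdbc_host_port "$SPRING_DATASOURCE_URL")
# timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" && echo "reachable"
```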

3. High Memory Usage

Symptoms: Application becomes sluggish

Investigation:

  • Monitor cache table sizes
  • Check for memory leaks
  • Review cache expiration settings

Resolution:

  • Reduce cache TTL
  • Increase cleanup frequency
  • Tune JVM memory settings

4. SSR Cache Issues

Symptoms: Fallback endpoints return 404

Investigation:

  • Check route normalization (leading slash)
  • Verify expiry dates
  • Review SSR cache population

Resolution:

  • Ensure routes are properly formatted
  • Check cache population process
  • Verify expiry logic
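A mismatch between `homepage` and `/homepage` is an easy way to miss an SSR cache entry. The helper below is a sketch of one possible normalization, assuming routes are stored with exactly one leading slash; this guide does not say which form the gateway actually stores, so confirm against the `ssr_cache` table before relying on it. `normalize_route` is a hypothetical name.

```bash
# Normalize a route so it starts with exactly one leading slash.
# Usage: normalize_route <route>
normalize_route() {
  local route="$1"
  # Strip every leading slash, then add exactly one back.
  while [ "${route#/}" != "$route" ]; do
    route="${route#/}"
  done
  echo "/$route"
}

# Example: compare the normalized form with the route column in ssr_cache.
# normalize_route "category/electronics"
```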

Performance Monitoring

Key Metrics to Track

  1. Response Times

    • Cache hits vs. misses
    • Database query times
    • Overall endpoint latency
  2. Cache Efficiency

    • Hit ratio percentage
    • Cache size and growth
    • Expiration rates
  3. System Resources

    • JVM memory usage
    • Database connections
    • CPU utilization
  4. Error Rates

    • 5xx error frequency
    • Database failures
    • Timeout occurrences

Maintenance Procedures

Daily Tasks

  • Check health endpoints
  • Review error logs
  • Monitor cache performance

Weekly Tasks

  • Analyze cache hit ratios
  • Review system resource usage
  • Check for expired entries

Monthly Tasks

  • Performance trend analysis
  • Cache strategy review
  • System resource optimization

Configuration Management

Environment Variables

# Database configuration
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/gateway
SPRING_DATASOURCE_USERNAME=gateway_user
SPRING_DATASOURCE_PASSWORD=secret

# Logging levels
LOGGING_LEVEL_COM_WIZDEVTECH=DEBUG

Profile-based Configuration

  • test: In-memory H2 database
  • production: Production database settings
  • development: Debug logging enabled

Emergency Procedures

Application Not Responding

  1. Check system resources (CPU, memory)
  2. Review recent log entries
  3. Restart application gracefully
  4. If needed, perform hard restart
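Steps 3 and 4 can be sketched as a graceful stop with a forced fallback. SIGTERM gives the JVM a chance to run its shutdown hooks, and SIGKILL is sent only if the process is still alive after the timeout. The function name `stop_process`, the 30-second default, and the way the PID is located are assumptions; adapt them to how the gateway is actually deployed (systemd, container, and so on).

```bash
# Stop a process gracefully, escalating to SIGKILL after <timeout> seconds.
# Usage: stop_process <pid> [timeout_seconds]
stop_process() {
  local pid="$1" timeout="${2:-30}" waited=0
  # SIGTERM lets the JVM run its shutdown hooks before exiting.
  kill -TERM "$pid" 2>/dev/null || return 0   # nothing to stop
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "pid $pid still running after ${timeout}s; sending SIGKILL" >&2
      kill -KILL "$pid" 2>/dev/null
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example: stop the gateway, then start it again.
# stop_process "$(pgrep -f gateway-application.jar)" 30
# java -jar gateway-application.jar
```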

Database Issues

  1. Verify database service status
  2. Check connection pool health
  3. Clear problematic cache entries
  4. Restart with clean cache if necessary

Cache Corruption

  1. Stop cache cleanup scheduler
  2. Analyze problematic entries
  3. Clear affected cache tables
  4. Restart application

Contact Information

Escalation Path

  1. Level 1: Application restart
  2. Level 2: Configuration review
  3. Level 3: Developer/architect consultation

Documentation References

  • API Documentation: [Internal Wiki]
  • Architecture Diagrams: [Design Documents]
  • Deployment Guides: [DevOps Runbooks]

Version Information

  • Application Version: [From build.gradle/pom.xml]
  • Java Version: 17+
  • Spring Boot Version: 3.x
  • Last Updated: [Date]