Observability & Logging - sgajbi/portfolio-analytics-system GitHub Wiki

Overview

The Portfolio Analytics System provides observability features so that every event, API call, and calculation can be traced end to end.

Recent enhancements provide:

  • Correlation ID propagation across all services.
  • Standardized logging for consistent debugging.
  • Database-level traceability via processed_events.

Correlation ID Standards

Format

<svc-shortname>:<uuid>

Examples:

  • Ingestion → ING:550e8400-e29b-41d4-a716-446655440000
  • Cost Calculator → COST:550e8400-e29b-41d4-a716-446655440000 (same UUID, inherited from Ingestion)

Service Shortnames

  • Ingestion → ING
  • Persistence → PST
  • Cost Calculator → COST
  • Cashflow Calculator → CFLOW
  • Position Calculator → POS
  • Valuation Calculator → VAL
  • Performance Calculator → PERF
  • API Service → QRY
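
The format and shortnames above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the names `SERVICE_SHORTNAMES`, `new_correlation_id`, and `parse_correlation_id` are hypothetical, and the use of a random (version 4) UUID is an assumption.

```python
from __future__ import annotations

import uuid

# Shortnames copied from the list above.
SERVICE_SHORTNAMES = {
    "Ingestion": "ING",
    "Persistence": "PST",
    "Cost Calculator": "COST",
    "Cashflow Calculator": "CFLOW",
    "Position Calculator": "POS",
    "Valuation Calculator": "VAL",
    "Performance Calculator": "PERF",
    "API Service": "QRY",
}


def new_correlation_id(service: str) -> str:
    """Build '<svc-shortname>:<uuid>' for the given service (hypothetical helper)."""
    return f"{SERVICE_SHORTNAMES[service]}:{uuid.uuid4()}"


def parse_correlation_id(corr_id: str) -> tuple[str, uuid.UUID]:
    """Split a correlation ID into (shortname, UUID); raise ValueError if malformed."""
    prefix, _, raw_uuid = corr_id.partition(":")
    if prefix not in SERVICE_SHORTNAMES.values():
        raise ValueError(f"unknown service shortname: {prefix!r}")
    return prefix, uuid.UUID(raw_uuid)  # uuid.UUID raises ValueError on bad input
```

Parsing the ID into its two parts makes it possible to compare the UUID portion across services even when the shortname prefix differs.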

Propagation Rules

  • Generated by the Ingestion Service if none is supplied.
  • Passed to downstream services in Kafka message headers.
  • Logged by every service.
  • Returned in API responses as X-Correlation-ID.
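
As a rough sketch of the propagation rules above, the snippet below reads a correlation ID from message headers, generates one if it is missing, and builds headers for the next hop. It assumes headers arrive as a list of `(key, bytes)` tuples, which is how common Python Kafka clients expose them. The header name `correlation_id` and all function names are hypothetical, and the sketch forwards the ID unchanged.

```python
import uuid
from contextvars import ContextVar
from typing import List, Optional, Tuple

Headers = List[Tuple[str, bytes]]

HEADER_NAME = "correlation_id"  # hypothetical header key, not confirmed by the source

# Holds the ID for the message currently being handled.
correlation_id_var: ContextVar[Optional[str]] = ContextVar("correlation_id", default=None)


def extract_correlation_id(headers: Optional[Headers]) -> Optional[str]:
    """Return the correlation ID found in the headers, or None if absent."""
    for key, value in headers or []:
        if key == HEADER_NAME and value:
            return value.decode("utf-8")
    return None


def ensure_correlation_id(headers: Optional[Headers], shortname: str = "ING") -> str:
    """Reuse the incoming ID if present, otherwise generate one; store it in context."""
    corr_id = extract_correlation_id(headers) or f"{shortname}:{uuid.uuid4()}"
    correlation_id_var.set(corr_id)
    return corr_id


def outgoing_headers(corr_id: str) -> Headers:
    """Build the headers to attach to messages sent downstream."""
    return [(HEADER_NAME, corr_id.encode("utf-8"))]
```

Storing the ID in a `ContextVar` keeps it available to any code running in the same task or thread, including the logger, without passing it through every function call.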

Logging Standards

Format

All logs follow the same structure:

<Timestamp> [<LogLevel>] [corr_id=<CorrelationID>] <ServiceName> - <Message>

Example:

2025-08-01 12:34:56 [INFO] [corr_id=VAL:123e4567-e89b-12d3-a456-426614174000] ValuationCalculator - Valuation updated for Portfolio 1001, AAPL

Implementation

  • Shared logger in portfolio_common.logging_utils
  • Injects correlation_id from contextvars
  • Applied in all services
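
One way such a shared logger could be built is with a `logging.Filter` that copies the current correlation ID from a `ContextVar` onto each record. The sketch below is an assumption about the approach, not the actual contents of `portfolio_common.logging_utils`; `CorrelationIdFilter`, `get_logger`, and `correlation_id_var` are hypothetical names.

```python
import logging
from contextvars import ContextVar

# "-" is used when no correlation ID has been set for the current context.
correlation_id_var: ContextVar[str] = ContextVar("correlation_id", default="-")

# Mirrors: <Timestamp> [<LogLevel>] [corr_id=<CorrelationID>] <ServiceName> - <Message>
LOG_FORMAT = "%(asctime)s [%(levelname)s] [corr_id=%(corr_id)s] %(name)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


class CorrelationIdFilter(logging.Filter):
    """Attach the context's correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.corr_id = correlation_id_var.get()
        return True  # never drop records, only annotate them


def get_logger(service_name: str, stream=None) -> logging.Logger:
    """Return a logger named after the service, configured once."""
    logger = logging.getLogger(service_name)
    if not logger.handlers:
        handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
        handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
        handler.addFilter(CorrelationIdFilter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

A service would call `correlation_id_var.set(...)` when it starts handling a message, then log normally through `get_logger("ValuationCalculator")`.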

Database Traceability

processed_events Table

  • Stores (event_id, service_name, correlation_id, processed_at)

  • Used to:

    • Prevent duplicate processing (idempotency)
    • Trace processing of a specific event across services

Example query:

SELECT * 
FROM processed_events
WHERE correlation_id = 'ING:550e8400-e29b-41d4-a716-446655440000';
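
The idempotency check and the trace query can be illustrated with a small self-contained sketch. It uses an in-memory SQLite database purely so the example runs anywhere. The column types, the `(event_id, service_name)` uniqueness key, and the function names are assumptions rather than the system's actual schema. In PostgreSQL, the equivalent of `INSERT OR IGNORE` is `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS processed_events (
    event_id       TEXT NOT NULL,
    service_name   TEXT NOT NULL,
    correlation_id TEXT,
    processed_at   TEXT NOT NULL,
    PRIMARY KEY (event_id, service_name)  -- assumed uniqueness key
);
"""


def mark_processed(conn, event_id: str, service_name: str, correlation_id: str) -> bool:
    """Record the event; return True on first sight, False if already processed."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_events VALUES (?, ?, ?, ?)",
        (event_id, service_name, correlation_id,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows inserted means the key already existed


def trace(conn, correlation_id: str):
    """List (service_name, processed_at) rows recorded under a correlation ID."""
    return conn.execute(
        "SELECT service_name, processed_at FROM processed_events "
        "WHERE correlation_id = ? ORDER BY processed_at, rowid",
        (correlation_id,),
    ).fetchall()
```

A consumer would call `mark_processed` before doing any work and skip the message when it returns `False`.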

Operational Debugging

Tracing an Event

  1. Identify the correlation ID from an API response or a log line.

  2. Search Splunk/ELK:

    corr_id=ING:550e8400-e29b-41d4-a716-446655440000
    
  3. Query processed_events to see processing status.

  4. Identify the last processed service for further investigation.


Testing Observability

  • Integration tests confirm that the correlation ID:

    • Is generated if missing.
    • Is propagated to all downstream services.
  • Log format is validated in service startup tests.
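
A log-format check of the kind described above could be as simple as matching each line against a regular expression derived from the documented structure. The pattern and the `parse_log_line` helper below are a hypothetical sketch, assuming the timestamp layout and uppercase shortname shown in the logging example. They are not the project's actual test code.

```python
import re
from typing import Dict, Optional

# <Timestamp> [<LogLevel>] [corr_id=<CorrelationID>] <ServiceName> - <Message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] "
    r"\[corr_id=(?P<corr_id>[A-Z]+:[0-9a-fA-F-]{36})\] "
    r"(?P<service>\S+) - (?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of a well-formed log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None
```

A test can then assert that `parse_log_line` returns a result for every line a service emits and that the `corr_id` field equals the ID supplied to the test.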
