Features - sgajbi/portfolio-analytics-system GitHub Wiki
The Portfolio Analytics System provides a production-grade analytics platform designed for accuracy, scalability, and resilience in wealth management.
Recent enhancements have added:
- Idempotent event processing
- Correlation ID propagation
- Standardized logging
- End-to-end observability
The system's entry points are designed for robust and flexible integration.
- Unified Write & Read APIs (CQRS): Separate ingestion (`ingestion-service`) and query (`query-service`) APIs ensure calculation-heavy queries do not impact write performance.
- Bulk Data Ingestion: All ingestion endpoints support batch operations for high-volume loads.
- Multi-Asset Support: Transactions, instruments, market prices, and FX rates are all supported.
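As a minimal illustration of the bulk ingestion described above, a single request can carry a whole batch of transactions. The endpoint path and field names below are assumptions for illustration, not the service's actual contract:

```python
import json

# Hypothetical bulk endpoint of the ingestion-service (illustrative only).
INGEST_URL = "http://localhost:8000/transactions/bulk"

# One batch payload instead of one request per transaction.
batch = {
    "transactions": [
        {"portfolio_id": "P-100", "instrument_id": "AAPL", "quantity": 10,
         "price": 150.0, "trade_date": "2024-01-02"},
        {"portfolio_id": "P-100", "instrument_id": "MSFT", "quantity": 5,
         "price": 390.0, "trade_date": "2024-01-02"},
    ]
}

body = json.dumps(batch)  # the whole batch travels in a single POST body
```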
The core of the system is a resilient, asynchronous Kafka-based pipeline.
- Decoupled Microservices Architecture: Independent services for each calculation stage (Cost, Cashflow, Position, Valuation, Performance).
- Asynchronous High-Throughput Processing: Kafka-based message flow ensures ingestion stays fast and responsive.
- Guaranteed Processing: Dead-Letter Queue (DLQ) for failed messages ensures no event loss.
- Idempotent Processing: All calculators track processed events in the `processed_events` table, preventing duplicate calculations.
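The DLQ and idempotency guarantees above can be sketched together. This is a simplified stand-in: a Python set replaces the `processed_events` table and a list replaces the dead-letter topic, and the event shape and handler are illustrative:

```python
# In-memory stand-ins: the real system uses Kafka topics and a
# processed_events database table.
processed_events = set()   # event IDs already handled
dead_letter_queue = []     # failed messages preserved for replay

def consume(event, calculate):
    """Process one event at most once; route failures to the DLQ."""
    if event["id"] in processed_events:
        return "skipped"            # duplicate delivery: no recalculation
    try:
        calculate(event)
    except Exception as exc:
        dead_letter_queue.append({"event": event, "error": str(exc)})
        return "dead-lettered"      # event is kept, never silently lost
    processed_events.add(event["id"])
    return "processed"

results = [
    consume({"id": "evt-1"}, lambda e: None),   # first delivery
    consume({"id": "evt-1"}, lambda e: None),   # duplicate delivery
    consume({"id": "evt-2"}, lambda e: 1 / 0),  # failing calculation
]
# results == ["processed", "skipped", "dead-lettered"]
```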
These services produce accurate portfolio analytics in near real-time.
- Advanced Cost Basis Calculation: Calculates net cost and realized gains/losses with configurable accounting methods (FIFO, Average Cost).
- Automated Cashflow Calculation: Configurable classification (income, outflow, fees) and timing (BOD/EOD).
- Historical Position Keeping: Maintains full date-accurate position history.
- Mark-to-Market Valuation: Calculates daily valuations and unrealized PnL for each position.
- Portfolio Performance Metrics: Calculates Time-Weighted Return (TWR), Money-Weighted Return (MWR), and contribution analytics.
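As a worked illustration of the TWR metric above: sub-period returns are geometrically linked, which is the standard definition (how the system segments periods around cashflows is not shown here):

```python
def time_weighted_return(subperiod_returns):
    """Geometrically link sub-period returns: TWR = prod(1 + r_i) - 1.

    Each r_i is the return of a period between external cashflows, so the
    result is unaffected by the size or timing of client flows.
    """
    linked = 1.0
    for r in subperiod_returns:
        linked *= 1.0 + r
    return linked - 1.0

# Three sub-periods returning +2%, -1%, +3%:
twr = time_weighted_return([0.02, -0.01, 0.03])
# twr ≈ 0.040094 (about 4.01%)
```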
All data is stored securely and queried efficiently.
- Relational & Auditable Data Store: PostgreSQL schema managed by Alembic migrations for consistent deployments.
- Correlation ID Traceability in DB: The `processed_events` table stores correlation IDs for operational debugging.
- Flexible Query API: Query endpoints return latest positions, full position histories, transactions with cashflows, valuations, and performance metrics.
- Correlation ID in API Responses: All API responses include the `X-Correlation-ID` header for log tracing.
The system is built for production supportability.
- Correlation ID Propagation:
  - Generated in the ingestion service (`<svc-shortname>:<uuid>` format).
  - Propagated across Kafka topics and services.
  - Returned in API responses.
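A minimal sketch of generating and propagating such an ID, assuming `"ing"` as the ingestion service's shortname and the kafka-python convention of `(key, bytes)` header pairs (both are illustrative assumptions):

```python
import uuid

def new_correlation_id(svc_shortname: str) -> str:
    """Build a correlation ID in the <svc-shortname>:<uuid> format."""
    return f"{svc_shortname}:{uuid.uuid4()}"

# "ing" as the ingestion service's shortname is an assumption for illustration.
corr_id = new_correlation_id("ing")

# Attach as a Kafka message header (kafka-python style: list of (key, bytes)
# pairs) so downstream services can read and reuse the same ID.
kafka_headers = [("X-Correlation-ID", corr_id.encode("utf-8"))]
```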
- Standard Logging:
  - Shared logging utility in all services.
  - Consistent log format: `[LEVEL] [corr_id=PREFIX:uuid] Service - Message`
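The log format above can be reproduced with the standard library; this is a sketch, not the shared utility itself, and the service name and correlation ID values are illustrative:

```python
import logging

# Formatter matching: [LEVEL] [corr_id=PREFIX:uuid] Service - Message
formatter = logging.Formatter(
    "[%(levelname)s] [corr_id=%(corr_id)s] %(name)s - %(message)s"
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)

logger = logging.getLogger("position-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# A LoggerAdapter injects the correlation ID into every record.
log = logging.LoggerAdapter(
    logger, {"corr_id": "ing:123e4567-e89b-12d3-a456-426614174000"}
)
log.info("Position recalculated")
# prints a line like: [INFO] [corr_id=ing:...] position-service - Position recalculated
```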
- Centralized Logging & Monitoring:
  - Logs aggregated in Splunk/ELK.
  - Metrics captured in Prometheus/Grafana.
- Operational Debugging:
  - Events traceable via correlation ID in logs and database.
  - Database queries link event status across all services.
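The cross-service lookup described above can be sketched in miniature. The rows and column names below are hypothetical stand-ins for a `processed_events`-style table, not the actual schema:

```python
# Hypothetical rows from a processed_events-style table (illustrative schema).
rows = [
    {"service": "cost-calculator",    "correlation_id": "ing:abc", "status": "processed"},
    {"service": "position-service",   "correlation_id": "ing:abc", "status": "processed"},
    {"service": "valuation-service",  "correlation_id": "ing:abc", "status": "failed"},
    {"service": "cost-calculator",    "correlation_id": "ing:def", "status": "processed"},
]

def trace(correlation_id, rows):
    """Return per-service status for one event, mimicking the debugging query."""
    return {r["service"]: r["status"] for r in rows
            if r["correlation_id"] == correlation_id}

status = trace("ing:abc", rows)
# status == {"cost-calculator": "processed",
#            "position-service": "processed",
#            "valuation-service": "failed"}
```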