Outbox Events - sgajbi/portfolio-analytics-system GitHub Wiki

Outbox Events Pattern

Overview

The Outbox Events pattern ensures reliable, atomic event publishing in an event-driven architecture. It decouples business-logic persistence from Kafka/event bus delivery, eliminating data loss and guaranteeing at-least-once event delivery even if a service or process crashes mid-flow. (Note that the pattern by itself provides at-least-once, not exactly-once, semantics: consumers must be idempotent to tolerate occasional duplicate deliveries.)


Why Use the Outbox Pattern?

  • Prevents event loss when a service crashes or the network fails after the database write but before the event is published.
  • Ensures atomicity between database changes and event emission.
  • Simplifies retries and enables auditability of all published events.

How It Works in This System

  1. Service handles a business event (e.g., new transaction processed).
  2. Within the same DB transaction:
    • Data is persisted (e.g., to transactions or positions tables).
    • A corresponding outbox event is inserted into the outbox_events table, describing what should be published.
  3. A background dispatcher process (or thread) periodically polls the outbox table for unprocessed events:
    • Attempts to publish the event to Kafka.
    • On success, marks the outbox event's status as "sent" and records processed_at.
    • On failure, increases a retry counter and leaves the event for later retry.
    • All attempts are logged with the correlation_id for traceability.
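The key to step 2 is that the business row and the outbox row are written in a single database transaction. A minimal sketch of that write, using sqlite3 and a simplified stand-in schema for illustration (the real system uses its own database and table definitions), looks like this:

```python
import json
import sqlite3
import uuid

# Simplified stand-in schema for illustration; the production tables have more columns.
SCHEMA = """
CREATE TABLE transactions (id TEXT PRIMARY KEY, amount REAL);
CREATE TABLE outbox_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    aggregate_type TEXT, aggregate_id TEXT, event_type TEXT,
    payload TEXT, topic TEXT, status TEXT DEFAULT 'pending',
    correlation_id TEXT, retry_count INTEGER DEFAULT 0);
"""

def persist_transaction_with_outbox(conn: sqlite3.Connection,
                                    txn: dict, correlation_id: str) -> None:
    """Insert the business row and its outbox event in ONE transaction."""
    with conn:  # both inserts commit together, or neither does
        conn.execute("INSERT INTO transactions (id, amount) VALUES (?, ?)",
                     (txn["id"], txn["amount"]))
        conn.execute(
            "INSERT INTO outbox_events (aggregate_type, aggregate_id, event_type,"
            " payload, topic, correlation_id)"
            " VALUES ('Transaction', ?, 'transaction_persisted',"
            " ?, 'raw_transactions_completed', ?)",
            (txn["id"], json.dumps(txn), correlation_id))

# Demo: persist one transaction and confirm its outbox event is queued.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
persist_transaction_with_outbox(conn, {"id": "T-1", "amount": 100.0},
                                str(uuid.uuid4()))
pending = conn.execute("SELECT status FROM outbox_events").fetchone()[0]
print(pending)  # prints "pending"
```

Because both inserts share one transaction, a crash between them rolls back both: there is never a persisted transaction without its event, or an event without its data.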

Outbox Events Table Schema

| Column | Type | Purpose |
| --- | --- | --- |
| id | int | Primary key, autoincrement |
| aggregate_type | string | Entity type (e.g., "Transaction") |
| aggregate_id | string | Entity identifier |
| event_type | string | E.g., "transaction_persisted" |
| payload | JSON | Full event data payload |
| topic | string | Kafka topic to publish to |
| status | string | pending, sent, failed |
| correlation_id | string | Traces the event across services |
| retry_count | int | Number of publish attempts |
| last_attempted_at | datetime | Timestamp of last attempt |
| created_at | datetime | When the event was added |
| processed_at | datetime | When the event was successfully published |
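As a rough sketch, the table above maps to DDL along these lines (sqlite-flavored here for illustration; the production database and exact column types are assumptions):

```python
import sqlite3

# sqlite-flavored sketch of the outbox_events schema described above
DDL = """
CREATE TABLE outbox_events (
    id                INTEGER PRIMARY KEY AUTOINCREMENT,
    aggregate_type    TEXT NOT NULL,                    -- e.g. 'Transaction'
    aggregate_id      TEXT NOT NULL,
    event_type        TEXT NOT NULL,                    -- e.g. 'transaction_persisted'
    payload           TEXT NOT NULL,                    -- JSON event body
    topic             TEXT NOT NULL,                    -- Kafka topic to publish to
    status            TEXT NOT NULL DEFAULT 'pending',  -- pending | sent | failed
    correlation_id    TEXT,
    retry_count       INTEGER NOT NULL DEFAULT 0,
    last_attempted_at TEXT,
    created_at        TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    processed_at      TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
columns = [row[1] for row in conn.execute("PRAGMA table_info(outbox_events)")]
print(columns)
```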

Outbox Dispatcher

  • Component: outbox_dispatcher.py in libs/portfolio-common
  • Logic:
    1. Polls for events with status='pending'
    2. Publishes to Kafka topic
    3. Updates status to sent and sets processed_at
    4. If publishing fails, increments retry_count and sets status='failed' if over retry limit
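A stripped-down version of that polling loop could look like this. It is a sketch, not the real `outbox_dispatcher.py`: sqlite and a stubbed `publish` callable stand in for the actual database session and Kafka producer, and `MAX_RETRIES` is an assumed configuration value:

```python
import datetime
import sqlite3

MAX_RETRIES = 5  # assumption: the real limit is configuration-driven

def dispatch_pending(conn, publish):
    """One polling pass: try to publish every 'pending' outbox event.

    `publish(topic, payload)` stands in for the Kafka producer and is
    expected to raise on failure.
    """
    rows = conn.execute(
        "SELECT id, topic, payload, retry_count FROM outbox_events "
        "WHERE status = 'pending' ORDER BY id").fetchall()
    for event_id, topic, payload, retry_count in rows:
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        try:
            publish(topic, payload)
            conn.execute("UPDATE outbox_events SET status='sent', "
                         "processed_at=?, last_attempted_at=? WHERE id=?",
                         (now, now, event_id))
        except Exception:
            retry_count += 1
            status = "failed" if retry_count >= MAX_RETRIES else "pending"
            conn.execute("UPDATE outbox_events SET retry_count=?, status=?, "
                         "last_attempted_at=? WHERE id=?",
                         (retry_count, status, now, event_id))
        conn.commit()

# Demo: one event, a broker that fails once, then a healthy broker.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outbox_events (id INTEGER PRIMARY KEY, topic TEXT, "
             "payload TEXT, status TEXT, retry_count INTEGER, "
             "last_attempted_at TEXT, processed_at TEXT)")
conn.execute("INSERT INTO outbox_events VALUES (1, 'raw_transactions_completed', "
             "'{}', 'pending', 0, NULL, NULL)")

def broken_broker(topic, payload):
    raise ConnectionError("Kafka unreachable")

dispatch_pending(conn, broken_broker)
status_after_failure, retries = conn.execute(
    "SELECT status, retry_count FROM outbox_events WHERE id=1").fetchone()

dispatch_pending(conn, lambda topic, payload: None)  # healthy broker
status_after_success = conn.execute(
    "SELECT status FROM outbox_events WHERE id=1").fetchone()[0]
```

Note that updating the row only after a successful publish means a crash between publish and update can re-send the event on the next pass; this is the at-least-once guarantee mentioned above.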

Example Flow

  1. Transaction processed:
    • Write to transactions table
    • Add an outbox_event for transaction_persisted
  2. Outbox dispatcher picks up and publishes the event to raw_transactions_completed Kafka topic.
  3. If Kafka is down, dispatcher retries automatically using the retry logic.

Operations & Troubleshooting

  • Auditability: The outbox table provides a full audit trail of all events and their delivery status.
  • Replay: If an event is stuck in failed status, operators can reset it to pending for reprocessing.
  • Backoff/alerts: Dispatcher can be extended to provide alerts on repeated failures or high queue length.
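The replay step above amounts to a status flip on the outbox table. An operator script along these lines would requeue failed events (column names follow the schema above; resetting retry_count to zero is an assumption about the desired behavior):

```python
import sqlite3

def requeue_failed(conn: sqlite3.Connection) -> int:
    """Reset all 'failed' outbox events to 'pending' so the dispatcher
    picks them up again on its next polling pass."""
    with conn:
        cur = conn.execute(
            "UPDATE outbox_events SET status='pending', retry_count=0 "
            "WHERE status='failed'")
    return cur.rowcount  # number of events requeued

# Demo: one event stuck in 'failed' after exhausting its retries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outbox_events (id INTEGER PRIMARY KEY, "
             "status TEXT, retry_count INTEGER)")
conn.execute("INSERT INTO outbox_events VALUES (1, 'failed', 5)")
requeued = requeue_failed(conn)
print(requeued)  # prints 1
```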

Related Pages