# Outbox Events Pattern

## Overview
The Outbox Events pattern ensures reliable, atomic event publishing in an event-driven architecture. It decouples persistence of business data from delivery to Kafka (or another event bus), eliminating data loss and guaranteeing at-least-once delivery even if a service or process crashes.
## Why Use the Outbox Pattern?
- Prevents event loss when a service crashes or a network failure occurs after the database write but before the event is published.
- Ensures atomicity between database changes and event emission.
- Simplifies retries and enables auditability of all published events.
## How It Works in This System

- A service handles a business event (e.g., a new transaction is processed).
- Within the same DB transaction (see the sketch after this list):
  - The business data is persisted (e.g., to the `transactions` or `positions` tables).
  - A corresponding outbox event is inserted into the `outbox_events` table, describing what should be published.
- A background dispatcher process (or thread) periodically polls the outbox table for unprocessed events:
  - Attempts to publish the event to Kafka.
  - On success, marks the outbox event as processed.
  - On failure, increments a retry counter and leaves the event for a later retry.
- All attempts are logged with the `correlation_id` for traceability.
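To make the atomic write concrete, here is a minimal sketch using SQLAlchemy Core: both inserts run inside one transaction, so the outbox row is committed only if the business row is. The connection string, the `transactions` column names, and the payload shape are assumptions for illustration; only the `outbox_events` columns and the topic name come from this page.

```python
# Minimal sketch of the atomic write: business row + outbox row in one DB transaction.
# DSN, transactions-table columns, and payload shape are illustrative assumptions.
import json
from datetime import datetime

from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@localhost/portfolio")  # placeholder DSN


def persist_transaction_with_outbox(txn: dict, correlation_id: str) -> None:
    # engine.begin() opens a transaction that commits on success and rolls back on error,
    # so the business data and the outbox event are persisted atomically.
    with engine.begin() as conn:
        conn.execute(
            text("INSERT INTO transactions (transaction_id, payload) VALUES (:id, :payload)"),
            {"id": txn["transaction_id"], "payload": json.dumps(txn)},
        )
        conn.execute(
            text(
                "INSERT INTO outbox_events "
                "(aggregate_type, aggregate_id, event_type, payload, topic, "
                " status, correlation_id, retry_count, created_at) "
                "VALUES (:agg_type, :agg_id, :event_type, :payload, :topic, "
                "        'pending', :corr_id, 0, :created_at)"
            ),
            {
                "agg_type": "Transaction",
                "agg_id": txn["transaction_id"],
                "event_type": "transaction_persisted",
                "payload": json.dumps(txn),
                "topic": "raw_transactions_completed",
                "corr_id": correlation_id,
                "created_at": datetime.utcnow(),
            },
        )
```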
## Outbox Events Table Schema

| Column | Type | Purpose |
|---|---|---|
| id | int | PK, autoincrement |
| aggregate_type | string | Entity type (e.g., "Transaction") |
| aggregate_id | string | Entity identifier |
| event_type | string | E.g., "transaction_persisted" |
| payload | JSON | Full event data payload |
| topic | string | Kafka topic to publish to |
| status | string | `pending`, `sent`, `failed` |
| correlation_id | string | Traces the event across services |
| retry_count | int | Number of publish attempts |
| last_attempted_at | datetime | Timestamp of the last publish attempt |
| created_at | datetime | When the event was added |
| processed_at | datetime | When the event was successfully published |
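As an illustration of how this table might be declared in code, here is a SQLAlchemy model sketch mirroring the schema above; the class name, nullability, and defaults are assumptions, not the repository's actual definition.

```python
# Illustrative SQLAlchemy model mirroring the outbox_events schema above.
# Class name, nullability, and defaults are assumptions, not the repo's actual code.
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class OutboxEvent(Base):
    __tablename__ = "outbox_events"

    id = Column(Integer, primary_key=True, autoincrement=True)
    aggregate_type = Column(String, nullable=False)        # e.g. "Transaction"
    aggregate_id = Column(String, nullable=False)           # entity identifier
    event_type = Column(String, nullable=False)              # e.g. "transaction_persisted"
    payload = Column(JSON, nullable=False)                    # full event data payload
    topic = Column(String, nullable=False)                     # Kafka topic to publish to
    status = Column(String, default="pending")                  # pending / sent / failed
    correlation_id = Column(String)                               # trace the event across services
    retry_count = Column(Integer, default=0)                       # number of publish attempts
    last_attempted_at = Column(DateTime)                             # timestamp of last attempt
    created_at = Column(DateTime, default=datetime.utcnow)            # when the event was added
    processed_at = Column(DateTime)                                     # set when successfully published
```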
## Outbox Dispatcher

- Component: `outbox_dispatcher.py` in `libs/portfolio-common`
- Logic (sketched below):
  - Polls for events with `status='pending'`
  - Publishes each event to its Kafka topic
  - On success, updates the status to `sent` and sets `processed_at`
  - If publishing fails, increments `retry_count` and sets `status='failed'` once the retry limit is exceeded
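A rough sketch of that poll, publish, and mark cycle is shown below. The Kafka producer configuration, batch size, poll interval, and retry limit are all assumptions; the actual logic lives in `outbox_dispatcher.py` in `libs/portfolio-common`.

```python
# Rough sketch of the dispatcher's poll -> publish -> mark cycle.
# Producer config, batch size, poll interval, and retry limit are assumptions.
import json
import time
from datetime import datetime

from confluent_kafka import Producer
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@localhost/portfolio")  # placeholder DSN
producer = Producer({"bootstrap.servers": "localhost:9092"})                   # placeholder brokers
MAX_RETRIES = 5            # assumed retry limit
POLL_INTERVAL_SECONDS = 2  # assumed poll interval


def dispatch_pending_events() -> None:
    with engine.begin() as conn:
        pending = conn.execute(
            text(
                "SELECT id, topic, payload, retry_count FROM outbox_events "
                "WHERE status = 'pending' ORDER BY id LIMIT 100"
            )
        ).mappings().all()
        for event in pending:
            try:
                producer.produce(event["topic"], json.dumps(event["payload"]).encode("utf-8"))
                producer.flush()
                conn.execute(
                    text("UPDATE outbox_events SET status = 'sent', processed_at = :now WHERE id = :id"),
                    {"now": datetime.utcnow(), "id": event["id"]},
                )
            except Exception:
                # Leave as 'pending' for another attempt, or mark 'failed' past the retry limit.
                new_status = "failed" if event["retry_count"] + 1 >= MAX_RETRIES else "pending"
                conn.execute(
                    text(
                        "UPDATE outbox_events SET retry_count = retry_count + 1, "
                        "last_attempted_at = :now, status = :status WHERE id = :id"
                    ),
                    {"now": datetime.utcnow(), "status": new_status, "id": event["id"]},
                )


if __name__ == "__main__":
    while True:  # background polling loop
        dispatch_pending_events()
        time.sleep(POLL_INTERVAL_SECONDS)
```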
## Example Flow

- A transaction is processed:
  - Write to the `transactions` table.
  - Add an `outbox_event` for `transaction_persisted`.
- The outbox dispatcher picks up the event and publishes it to the `raw_transactions_completed` Kafka topic.
- If Kafka is down, the dispatcher retries automatically using the retry logic above.
## Operations & Troubleshooting
- Auditability: The outbox table provides a full audit trail of all events and their delivery status.
- Replay: If an event is stuck in `failed` status, operators can reset it to `pending` for reprocessing (a sketch of such a reset follows below).
- Backoff/alerts: The dispatcher can be extended to alert on repeated failures or a high queue length.