How do you retry failed transactions using message queues? - rnakidi/dsa GitHub Wiki

“How do you retry failed transactions using message queues?”

This is a common pattern for handling transient errors. Let's walk through it using payment processing as an example.

The general approach to implementing a retry mechanism using message queues has 3 main parts:

✅ Main Queue: This is where new payment transactions are queued.

✅ Dead Letter Queue (DLQ): A separate queue for messages that failed processing multiple times.

✅ Retry Queue: This is where retries are scheduled with delays. This queue is optional, as you can also use the main queue for it.
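To make the three parts concrete, here is a minimal in-memory sketch. It assumes Python's `collections.deque` as a stand-in for real broker queues, and the `Message` class with its fields (`payment_id`, `amount_cents`, `retry_count`) is hypothetical, chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    payment_id: str       # hypothetical identifier for the payment
    amount_cents: int     # hypothetical payload field
    retry_count: int = 0  # metadata the consumer checks when processing fails


# In-memory stand-ins for broker queues (assumption: a real system
# would use queues provided by a message broker instead of deques).
main_queue = deque()         # new payment transactions
retry_queue = deque()        # optional: retries waiting out their delay
dead_letter_queue = deque()  # messages that exhausted their retries

main_queue.append(Message(payment_id="pay-1", amount_cents=2500))
```

Keeping the retry count on the message itself means any consumer that picks it up can decide what to do next without shared state.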

Here's how the process works:

[1] The consumer or payment processor picks up a message from the main queue. It attempts to process the payment transaction.

[2] If processing fails, it checks the retry count, which is often stored in the message metadata.

[3] If retry count < max retries, increment the count and re-queue the message.

[4] If retry count ≥ max retries, move the message to the DLQ.

[5] For retries, you can either re-queue directly to the main queue with a delay or use a separate retry queue with a time-based trigger.

[6] Lastly, monitor the DLQ for messages that have exhausted retry attempts. Implement a process for dealing with them.
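The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production consumer: the queues are in-memory deques, messages are plain dicts, and `MAX_RETRIES`, `TransientError`, `handle`, and `drain` are hypothetical names. The re-queue happens immediately here, where a real broker would apply a delay.

```python
from collections import deque

MAX_RETRIES = 3  # assumed limit; tune it for your system


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def handle(message, process, main_queue, dlq):
    """Process one message; on failure, re-queue it or move it to the DLQ."""
    try:
        process(message["payload"])
        return "processed"
    except TransientError:
        if message["retry_count"] < MAX_RETRIES:
            message["retry_count"] += 1
            main_queue.append(message)  # a real broker would add a delay here
            return "requeued"
        dlq.append(message)  # retries exhausted
        return "dead-lettered"


def drain(main_queue, dlq, process):
    """Consume messages until the main queue is empty."""
    while main_queue:
        handle(main_queue.popleft(), process, main_queue, dlq)


def always_fails(payload):
    """Simulated processor that fails with a transient error every time."""
    raise TransientError("gateway timeout")


main_queue = deque([{"payload": {"payment_id": "pay-1"}, "retry_count": 0}])
dlq = deque()
drain(main_queue, dlq, always_fails)
```

With a processor that always fails, the message is attempted once plus `MAX_RETRIES` retries and then lands in `dlq`, where step [6] would pick it up for inspection.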

Some best practices to keep in mind while following this pattern:

👉 Exponential Backoff: Increase the delay between retries exponentially to avoid overwhelming the system.

👉 Idempotency: Ensure that the payment processor can safely retry a payment without side effects such as charging the customer twice.

👉 Message TTL: Set an overall TTL for messages to stop very old transactions from being processed.

👉 Retry Limits: Set a value for the maximum number of retries.

👉 Error Types: Distinguish between transient errors (can be retried) and permanent errors (sent directly to the DLQ).
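A few of these practices fit in a short sketch. Everything here is an assumption made for illustration: the constants (`BASE_DELAY_S`, `MAX_DELAY_S`, `MESSAGE_TTL_S`), the error classes, and the helper names are hypothetical, the delay cap is an addition not mentioned above, and the in-memory set stands in for a durable store of processed payment IDs.

```python
import time

BASE_DELAY_S = 1.0      # assumed delay before the first retry
MAX_DELAY_S = 60.0      # assumed cap so delays stay bounded
MESSAGE_TTL_S = 3600.0  # assumed overall message lifetime


class TransientError(Exception):
    """Hypothetical: worth retrying (for example, a timeout)."""


class PermanentError(Exception):
    """Hypothetical: retrying will not help (for example, invalid card)."""


def backoff_delay(retry_count):
    """Exponential backoff: 1s, 2s, 4s, ... capped at MAX_DELAY_S."""
    return min(BASE_DELAY_S * (2 ** retry_count), MAX_DELAY_S)


def should_retry(error):
    """Only transient errors are retried; anything else goes to the DLQ."""
    return isinstance(error, TransientError)


def is_expired(created_at, now=None):
    """True once the message is older than the overall TTL."""
    now = time.time() if now is None else now
    return now - created_at > MESSAGE_TTL_S


_processed_ids = set()  # stand-in for a durable store


def charge_once(payment_id, charge):
    """Idempotent wrapper: run `charge` at most once per payment_id."""
    if payment_id in _processed_ids:
        return False  # already charged; a retry becomes a no-op
    charge(payment_id)
    _processed_ids.add(payment_id)
    return True
```

A consumer would typically check `is_expired` before processing, use `should_retry` to choose between re-queueing and the DLQ, and pass `backoff_delay(retry_count)` as the delay when re-queueing.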


Source/Credit: https://www.linkedin.com/posts/saurabh-dashora_if-you-are-working-with-an-event-driven-system-activity-7269599650444173313-8uNb?utm_source=share&utm_medium=member_desktop