Reliability - rselk/sidekiq GitHub Wiki

There are two aspects of reliability with Sidekiq and Redis:

  1. pushing jobs to Redis with the client
  2. fetching jobs from Redis with the server

Setup

TL;DR: To use the reliability features in Sidekiq Pro, add this to your initializer:

require 'sidekiq/pro/reliable_push'
Sidekiq.configure_server do |config|
  require 'sidekiq/pro/reliable_fetch'
end

You will also need to pass -i with a unique process index on your sidekiq command line. Read on for more detail.

Server

Sidekiq uses BRPOP to pop a job off the queue in Redis. This is simple and very efficient, but it has one drawback: once popped, the job no longer exists in Redis. If Sidekiq crashes while processing that job, it is lost forever. This is not a problem for many applications, but some businesses need absolute reliability when processing jobs.

Sidekiq Pro offers an alternative strategy for job processing using Redis' RPOPLPUSH command which ensures that a crash will not result in lost jobs. To enable "reliable fetch" you must tag each process on a machine with a unique index and require the strategy:

Start Sidekiq with a unique index for each process on the machine:

sidekiq -e production -i 0
sidekiq -e production -i 1
sidekiq -e production -i 2

Require the reliable fetch code:

Sidekiq.configure_server do |config|
  # This needs to be within the configure_server block
  require 'sidekiq/pro/reliable_fetch'
end

When Sidekiq starts, you should see ReliableFetch activated:

INFO: Booting Sidekiq 2.6.2 with Redis at redis://localhost:6379/0
INFO: Running in ruby 1.9.3p327 (2012-11-10 revision 37606) [x86_64-darwin11.4.2]
INFO: Sidekiq Pro 0.9.0, commercially licensed.  Thanks for your support!
INFO: ReliableFetch activated
INFO: Starting processing, hit Ctrl-C to stop
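
To make the difference between the two fetch strategies concrete, here is a minimal in-memory sketch. Plain Ruby Arrays stand in for Redis lists, and the helper names (reliable_fetch, acknowledge, recover) are hypothetical. This illustrates the RPOPLPUSH idea only; it is not Sidekiq Pro's actual implementation.

```ruby
# In-memory model of the two fetch strategies.
# Plain Arrays stand in for Redis lists; all names here are illustrative.

# BRPOP-style fetch: the job leaves the queue as soon as it is popped,
# so a crash during processing loses it.
def basic_fetch(queue)
  queue.pop
end

# RPOPLPUSH-style fetch: pop from the tail of the queue and push onto the
# head of a working list, so a copy survives while the job is processed.
def reliable_fetch(queue, working)
  job = queue.pop
  working.unshift(job) unless job.nil?
  job
end

# Called once a job finishes successfully: drop it from the working list.
def acknowledge(working, job)
  index = working.index(job)
  working.delete_at(index) if index
end

# Called on startup: move anything left behind by a crash back to the queue.
def recover(queue, working)
  queue.push(working.pop) until working.empty?
end

queue = ["job-a", "job-b"]
working = []

job = reliable_fetch(queue, working) # "job-b" now sits in the working list
# ... imagine the process crashes here, before acknowledge runs ...
recover(queue, working)              # "job-b" is back on the queue
```

In this model the working list belongs to a single process, which is one plausible reason each process needs its own index: two processes sharing a working list could not tell whose unfinished jobs they were recovering.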

Heroku

Use $DYNO and some bash trickery to set a unique index for each worker process in your Procfile:

# DYNO will be set to worker.1, worker.2, etc for each worker process
worker: bundle exec sidekiq -e production -i ${DYNO:-1}

Cloud66

Cloud66 has implemented an option similar to Heroku's. Use {{UNIQUE_INT}} to assign a unique integer to the process. This integer should be unique across processes, so multiple processes won't clash, but it is not guaranteed to be unique across servers:

worker: bundle exec sidekiq -e production -i {{UNIQUE_INT}}

Pausing Queues

Reliable Fetch also allows you to pause/unpause processing on a queue. See Pro API.

Fetch algorithms

Reliable fetch supports the same two fetch algorithms as Sidekiq's basic fetch: strict priority and weighted random.

Strict queue ordering algorithm

sidekiq -e production -i 0 -q critical -q default -q bulk

Beware that strict prioritization can lead to starvation: bulk jobs will only be processed once the critical and default queues are empty. You can reverse the priorities for some processes to ensure every queue gets processed:

sidekiq -e production -i 0 -q critical -q default -q bulk
sidekiq -e production -i 1 -q bulk -q default -q critical
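
The strict ordering described above can be sketched as a loop that takes from the first non-empty queue in the listed order. The strict_fetch helper and the in-memory queues Hash are assumptions made for illustration; they are not part of Sidekiq.

```ruby
# Strict ordering: always take from the first non-empty queue in `order`.
# `queues` is a hypothetical Hash of queue name => Array of pending jobs.
def strict_fetch(queues, order)
  order.each do |name|
    job = queues.fetch(name, []).shift
    return [name, job] if job
  end
  nil # every queue was empty
end

queues = { "critical" => [], "default" => ["d1"], "bulk" => ["b1", "b2"] }

# Process 0 prefers critical, so bulk waits until default is drained.
strict_fetch(queues, %w[critical default bulk]) # => ["default", "d1"]

# Process 1 uses the reversed order, so bulk jobs are picked up first.
strict_fetch(queues, %w[bulk default critical]) # => ["bulk", "b1"]
```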

Weighted random algorithm

sidekiq -e production -i 0 -q critical,3 -q default,2 -q bulk,1

When using weighted queues, Sidekiq randomly chooses a queue to check, without blocking, using a weighted random choice. For example, with the command above, Sidekiq samples from the array ["critical", "critical", "critical", "default", "default", "bulk"].
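
As a rough sketch of that sampling step, the weights can be expanded into an array with one entry per unit of weight, and a queue drawn uniformly from it. The weighted_pool and pick_queue helpers are hypothetical names used for illustration, not Sidekiq's API.

```ruby
# Expand { name => weight } into an array with `weight` copies of each name.
def weighted_pool(weights)
  weights.flat_map { |name, weight| [name] * weight }
end

# Draw one queue name uniformly from the pool; a queue with weight 3 is
# three times as likely to be chosen as a queue with weight 1.
def pick_queue(pool, rng = Random.new)
  pool.sample(random: rng)
end

pool = weighted_pool("critical" => 3, "default" => 2, "bulk" => 1)
# pool is ["critical", "critical", "critical", "default", "default", "bulk"]

queue_to_check = pick_queue(pool)
```

Because every queue has a non-zero chance of being picked on each fetch, weighted queues avoid the starvation that strict ordering can cause.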

Client

When the Sidekiq client pushes a job to Redis, it simply assumes the network call will work. There is no error handling, so any exception will bubble up into your app and cause a 500 error. The Sidekiq Pro client offers additional reliability by enqueueing the job locally and delivering it once the network connection is re-established.

There are a few limitations:

  • the local queue is non-persistent so if the client process is restarted, the jobs are lost.
  • the local queue doesn't work with Batches so any Redis network issues when creating a batch will still cause an exception and fail.
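
The buffering behaviour and its first limitation can be illustrated with a small in-memory sketch. BufferingClient and FlakyTransport are hypothetical classes written for this example; Sidekiq Pro's real client may buffer and drain differently.

```ruby
# A client that buffers jobs in process memory when delivery fails.
class BufferingClient
  attr_reader :pending

  # `transport` is any object whose #push(job) raises on network failure.
  def initialize(transport)
    @transport = transport
    @pending = [] # lives only in this process, so a restart loses it
  end

  # Queue the job locally, then try to deliver everything that is waiting.
  def push(job)
    @pending << job
    flush
  end

  # Deliver buffered jobs oldest first. Returns true if the buffer was
  # drained, false if delivery failed and jobs remain buffered.
  def flush
    until @pending.empty?
      @transport.push(@pending.first)
      @pending.shift # remove only after a successful delivery
    end
    true
  rescue StandardError
    false
  end
end

# Stand-in for the Redis connection, with a switch to simulate an outage.
class FlakyTransport
  attr_reader :delivered
  attr_accessor :up

  def initialize
    @delivered = []
    @up = true
  end

  def push(job)
    raise IOError, "connection refused" unless @up
    @delivered << job
  end
end

transport = FlakyTransport.new
client = BufferingClient.new(transport)

transport.up = false
client.push("job-1") # delivery fails; the job stays in client.pending

transport.up = true
client.push("job-2") # both jobs are delivered, in order
```

The application never sees the exception in this sketch, but the buffered jobs exist only in the client's memory until the next successful push.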

You can activate "reliable push" in your sidekiq initializer:

# This should not go in a Sidekiq.configure_{client,server} block
# since it should always be activated no matter which environment.
require 'sidekiq/pro/reliable_push'