Sidekiq Batch Jobs & Workflows
This guide extends the throughput work in Sidekiq Performance Tuning and the broader Backend Frameworks & Worker Scaling material into multi-job coordination — running hundreds of jobs in parallel and reliably knowing when they have all finished.
A single Sidekiq job is easy. The hard part is the workflow: fan out a thousand image-resize jobs and then send one "your gallery is ready" email only after every resize succeeds. The naive approaches all break. A hand-rolled completion counter that reads a value and writes it back (GET then SET) races under concurrency, and even an atomic INCR drifts when a retried job counts itself twice. Polling for "is the queue empty" gives false positives the instant a worker picks up the last job, because that job is still running. The symptom is a completion callback that fires too early (before all children are done) or never (because a counter was lost on a retry). This page shows the reliable patterns: Sidekiq::Batch from Sidekiq Pro for fan-out/fan-in with on(:success) and on(:complete) callbacks, an open-source equivalent for teams without Pro, batch progress tracking, and chaining dependent jobs into multi-stage pipelines.
Prerequisites
- A working Sidekiq install (Sidekiq 7.x assumed) with Redis configured. See Tuning the Sidekiq Redis connection pool — batches push many jobs at once and lean on the client pool.
- For native batches: a Sidekiq Pro license and the sidekiq-pro gem. For the open-source path: the gush gem or a small custom coordinator (both shown).
- Idempotent child jobs: batch jobs retry like any other, so a child may run more than once.
- Enough worker concurrency to actually run the fan-out in parallel; a batch of 1,000 jobs against a single thread is just a slow loop.
Step 1 — Fan out work with Sidekiq::Batch (Pro)
Sidekiq::Batch groups jobs pushed inside its jobs block and tracks their collective completion in Redis. Define the batch, attach a callback class, then enqueue children inside the block.
# app/services/gallery_processor.rb
class GalleryProcessor
  def self.run(gallery_id, image_ids)
    batch = Sidekiq::Batch.new
    batch.description = "Resize gallery #{gallery_id}"
    batch.on(:success, GalleryCallbacks, gallery_id: gallery_id)
    batch.jobs do
      image_ids.each { |id| ResizeImageJob.perform_async(id) } # fan-out
    end
    batch.bid # the batch id; persist this to query progress later
  end
end
Every perform_async issued inside batch.jobs is registered as a child of the batch. Sidekiq increments and decrements the batch's pending counter atomically in Redis as children complete, sidestepping the race conditions of a hand-rolled counter.
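To make the "decrement and check in one atomic step" idea concrete, here is a small in-memory sketch. It is not Sidekiq Pro's implementation (Pro keeps this state in Redis); the `PendingCounter` class and its method names are hypothetical and exist only to illustrate why exactly one finisher sees the counter reach zero.

```ruby
# Illustrative in-memory model of a batch's pending counter.
# NOT Sidekiq Pro's implementation; Pro keeps this state in Redis.
class PendingCounter
  def initialize(total, &on_zero)
    @pending = total
    @on_zero = on_zero
    @lock = Mutex.new
  end

  # Called once per finished child. The decrement and the zero check happen
  # under one lock, so exactly one caller observes the transition to zero.
  def child_done
    fire = @lock.synchronize do
      @pending -= 1
      @pending.zero?
    end
    @on_zero.call if fire
  end
end

fired = 0
counter = PendingCounter.new(100) { fired += 1 }
100.times.map { Thread.new { counter.child_done } }.each(&:join)
puts "callback fired #{fired} time(s)" # prints "callback fired 1 time(s)"
```

If the decrement and the check were two separate steps, two threads could both read zero (or neither could), which is the early-or-never callback symptom described in the introduction.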
Step 2 — React with on(:success) and on(:complete) callbacks
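The two callback events differ in a critical way. :complete fires once every child has run at least once, whether it succeeded or failed; failed children may still be waiting on a retry or may have failed permanently. :success fires only when every child has succeeded, which can be well after :complete if some children needed retries, and never if a child fails permanently. Use :success for "all-or-nothing" workflows and :complete for "we're done regardless" cleanup. Each event is registered with its own batch.on call; Step 1 registers only :success, so add batch.on(:complete, GalleryCallbacks, gallery_id: gallery_id) there if you want on_complete below to run.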
# app/callbacks/gallery_callbacks.rb
class GalleryCallbacks
  def on_success(status, options)
    # every child succeeded, so it is safe to publish the result
    GalleryMailer.ready(options["gallery_id"]).deliver_later
  end

  def on_complete(status, options)
    # every child has run; some may have failed
    if status.failures > 0
      AdminAlert.batch_partial_failure(options["gallery_id"], status.failures)
    end
  end
end
The status object carries total, pending, and failures counts, so a single callback can branch on whether the workflow was fully or partially successful.
Step 3 — Nest batches for fan-out/fan-in stages
Real workflows have stages: process all items, then aggregate. You can open a new batch inside a parent batch's success callback, so the aggregation stage starts only after the processing stage fully succeeds. This is fan-in.
# app/callbacks/gallery_callbacks.rb
class GalleryCallbacks
  def on_success(status, options)
    # processing stage done -> start the aggregation stage as a follow-on batch
    aggregate = Sidekiq::Batch.new
    aggregate.on(:success, FinalizeCallbacks, gallery_id: options["gallery_id"])
    aggregate.jobs do
      BuildGalleryManifestJob.perform_async(options["gallery_id"])
      GenerateThumbnailSpriteJob.perform_async(options["gallery_id"])
    end
  end
end
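Nesting gives you a dependency graph without polling: each stage's success callback is the trigger for the next. The callback is enqueued when every child in the stage has succeeded, but it runs as an ordinary Sidekiq job and can be retried if it raises, so keep callback bodies idempotent rather than assuming exactly-once execution.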
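The stage gate itself can be sketched without Sidekiq at all. The following is a simplified, single-process illustration: `run_stages` and the stage names are hypothetical, jobs are plain lambdas, and there are no retries. It shows only the ordering rule, namely that a stage starts only after every job in the previous stage has succeeded.

```ruby
# Illustrative stage runner: names and structure are hypothetical.
# stages is an ordered list of [name, jobs]; each job is a callable.
def run_stages(stages)
  completed = []
  stages.each do |name, jobs|
    # On real workers these would run in parallel; the gate is what matters.
    jobs.each(&:call)
    completed << name # reached only if every job in the stage returned
  end
  completed
rescue StandardError
  completed # a failed stage stops the pipeline; later stages never start
end

log = []
stages = [
  [:resize,    [-> { log << "resize 1" }, -> { log << "resize 2" }]],
  [:aggregate, [-> { raise "manifest failed" }]],
  [:finalize,  [-> { log << "finalize" }]],
]
result = run_stages(stages)
puts result.inspect # prints [:resize]
```

Because the aggregate stage raised, the finalize stage never ran, which mirrors an on(:success) callback that never fires when a child fails permanently.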
Step 4 — Open-source alternative without Sidekiq Pro
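Without Pro, use the gush gem, which models workflows as an explicit DAG and runs each node through ActiveJob, with Sidekiq as the queue adapter. Declare jobs and their dependencies; gush handles fan-out/fan-in ordering. Job classes used in a workflow subclass Gush::Job rather than including Sidekiq::Job.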
# app/workflows/gallery_workflow.rb
class GalleryWorkflow < Gush::Workflow
  def configure(gallery_id, image_ids)
    resize_jobs = image_ids.map do |id|
      run ResizeImageJob, params: { image_id: id } # fan-out, run in parallel
    end
    # manifest runs only after ALL resize jobs finish (fan-in)
    run BuildManifestJob, params: { gallery_id: gallery_id }, after: resize_jobs
  end
end

# kick it off
flow = GalleryWorkflow.create(gallery_id, image_ids)
flow.start!
If you want zero extra dependencies, a minimal coordinator works for simple fan-in: seed a Redis counter with the number of children before enqueuing them, have each child DECR it (DECR is atomic), and trigger the finalizer when it hits zero.
# app/jobs/resize_image_job.rb
class ResizeImageJob
  include Sidekiq::Job

  def perform(image_id, batch_key)
    ImageResizer.call(image_id)
    # atomic decrement avoids the read-modify-write race
    remaining = Sidekiq.redis { |r| r.decr(batch_key) }
    FinalizeGalleryJob.perform_async(batch_key) if remaining.zero?
  end
end
The DECR approach is reliable for counting but lacks Pro's failure accounting — a child that exhausts retries never decrements, so pair it with a dead-letter strategy so a stuck child doesn't wedge the finalizer forever.
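The coordinator's full life cycle (seed, decrement, finalize) can be sketched as follows. This is an in-memory illustration under stated assumptions: `FakeRedis` stands in for the two Redis commands involved, and `FanInCoordinator` with its method names is hypothetical. In a real job you would call the decrement from the success path and also from Sidekiq's `sidekiq_retries_exhausted` hook so a dead child still releases its slot.

```ruby
# Minimal in-memory stand-in for the two Redis commands used (SET and DECR).
class FakeRedis
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def set(key, value)
    @lock.synchronize { @data[key] = value.to_i }
  end

  # Returns the value after decrementing, like Redis DECR.
  def decr(key)
    @lock.synchronize { @data[key] = @data.fetch(key, 0) - 1 }
  end
end

# Hypothetical coordinator; the finalizer block plays FinalizeGalleryJob's role.
class FanInCoordinator
  def initialize(redis, &finalizer)
    @redis = redis
    @finalizer = finalizer
  end

  # Seed the counter BEFORE any child is enqueued, otherwise a fast child
  # could decrement a key that does not exist yet.
  def start(batch_key, child_count)
    @redis.set(batch_key, child_count)
  end

  # Call from a child's success path and from its retries-exhausted path.
  def child_finished(batch_key)
    remaining = @redis.decr(batch_key)
    @finalizer.call(batch_key) if remaining.zero?
    remaining
  end
end

finalized = []
coord = FanInCoordinator.new(FakeRedis.new) { |key| finalized << key }
coord.start("gallery:42:remaining", 3)
3.times { coord.child_finished("gallery:42:remaining") }
puts finalized.inspect # prints ["gallery:42:remaining"]
```

The finalizer runs once, on the third decrement. A child that decrements twice (for example after a retry) would break this count, which is why the children must be idempotent with respect to the counter.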
Step 5 — Track batch progress
For a progress bar or status endpoint, query the batch status by its id. Pro exposes Sidekiq::Batch::Status; the custom path reads the Redis counter.
# app/controllers/batches_controller.rb
def show
  status = Sidekiq::Batch::Status.new(params[:bid])
  total = status.total.to_i
  done = total - status.pending.to_i
  render json: {
    total: total,
    pending: status.pending,
    failures: status.failures,
    complete: status.complete?,
    # guard against total == 0 (unknown or expired batch) to avoid NaN
    percent: total.zero? ? 0.0 : (done * 100.0 / total).round(1),
  }
end
Persist the bid returned in Step 1 alongside the owning record (the gallery row) so the UI can look up progress without scanning Redis.
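The percentage arithmetic is worth isolating so it can be unit-tested without Redis. A minimal sketch, assuming a hypothetical helper named `percent_complete` that takes the total and pending counts as plain integers:

```ruby
# Hypothetical helper: percent of children that are no longer pending.
# Returns 0.0 for an empty or unknown batch instead of dividing by zero.
def percent_complete(total, pending)
  return 0.0 if total.to_i <= 0

  done = (total - pending).clamp(0, total)
  (done * 100.0 / total).round(1)
end

puts percent_complete(1000, 250) # prints 75.0
puts percent_complete(3, 1)      # prints 66.7
puts percent_complete(0, 0)      # prints 0.0
```

Clamping keeps the result within 0 to 100 even if the two counters are read at slightly different moments.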
Step 6 — Chain dependent jobs
When stage B simply needs to run after stage A and there is no fan-out, you do not need a batch — just enqueue the next job from the end of the first. Keep the chain explicit so retries of stage A re-trigger stage B correctly.
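# app/jobs/import_job.rb
class ImportJob
  include Sidekiq::Job

  def perform(file_id)
    rows = Importer.call(file_id)
    # enqueue the next stage only after this one's work is committed;
    # job arguments must be plain JSON types, so rows is assumed to be one
    # (for example an Integer count), not a collection of model objects
    ValidateImportJob.perform_async(file_id, rows)
  end
end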
For chains with retries, make each link idempotent so a redelivered ImportJob does not enqueue duplicate ValidateImportJob runs — use a unique job key or a guard on already-imported state.
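One way to picture the "unique job key" guard is a claim store keyed by job name and arguments. The sketch below is in-memory and hypothetical throughout (`job_key`, `ChainGuard`, and the key format are illustrative names); in production a shared store, such as Redis SET with NX and an expiry, would play the role of the `Set`.

```ruby
require "digest"
require "json"
require "set"

# Hypothetical helper: derives a stable key from the job name and its arguments.
def job_key(job_name, args)
  digest = Digest::SHA256.hexdigest(JSON.generate(args))
  "chain:#{job_name}:#{digest[0, 16]}"
end

# In-memory stand-in for a shared claim store.
class ChainGuard
  def initialize
    @claimed = Set.new
    @lock = Mutex.new
  end

  # Yields only for the first caller with a given job and argument list.
  # Returns true if the block ran, false if the key was already claimed.
  def enqueue_once(job_name, args)
    key = job_key(job_name, args)
    first = @lock.synchronize { !@claimed.add?(key).nil? }
    yield(*args) if first
    first
  end
end

enqueued = []
guard = ChainGuard.new
2.times { guard.enqueue_once("ValidateImportJob", [42]) { |file_id| enqueued << file_id } }
puts enqueued.inspect # prints [42]
```

A redelivered first stage calls `enqueue_once` again with the same arguments, finds the key claimed, and skips the duplicate enqueue.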
Verification
Confirm the workflow coordinates correctly before relying on it.
Watch the batch drain and the callback fire in the Sidekiq logs:
# run the worker in the foreground and watch its log;
# expect child job completions, then the on_success callback line
bundle exec sidekiq -C config/sidekiq.yml
Assert the success callback fires exactly once after all children, in a test:
# spec/services/gallery_processor_spec.rb
it "publishes only after every child succeeds" do
  # stub the mail object so deliver_later has a receiver
  mail = double("mail", deliver_later: nil)
  Sidekiq::Testing.inline! do
    # callback fires once, not per child
    expect(GalleryMailer).to receive(:ready).once.and_return(mail)
    GalleryProcessor.run(gallery.id, [1, 2, 3])
  end
end
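Callback classes are plain Ruby objects, so their branching can also be exercised directly with a stand-in status, independent of how the batch itself is driven in tests. The sketch below is illustrative: `FakeStatus` and `RecordingCallbacks` are hypothetical names, and the callback records alerts in memory instead of calling AdminAlert.

```ruby
# Stand-in for the status object, limited to the counters this guide uses.
FakeStatus = Struct.new(:total, :pending, :failures, keyword_init: true)

# Hypothetical callback that records alerts in memory.
class RecordingCallbacks
  attr_reader :alerts

  def initialize(alerts = [])
    @alerts = alerts
  end

  def on_complete(status, options)
    return unless status.failures > 0

    @alerts << { gallery_id: options["gallery_id"], failures: status.failures }
  end
end

callbacks = RecordingCallbacks.new
# clean run: no alert expected
callbacks.on_complete(FakeStatus.new(total: 3, pending: 0, failures: 0), { "gallery_id" => 7 })
# one failed child: one alert expected
callbacks.on_complete(FakeStatus.new(total: 3, pending: 1, failures: 1), { "gallery_id" => 8 })
puts callbacks.alerts.length # prints 1
```

Note the string key in the options hash: options round-trip through JSON, so the callback reads "gallery_id", not :gallery_id.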
Inspect live batch state from the console:
# rails console
status = Sidekiq::Batch::Status.new(bid)
puts "#{status.pending} of #{status.total} pending, #{status.failures} failed"
Gotchas & edge cases
- on(:complete) fires even with failures. Using it as the "everything worked" trigger publishes results when some children failed permanently. Use on(:success) for all-or-nothing workflows.
- Children must be idempotent. A batch child retries like any Sidekiq job; a retried resize that runs twice must not corrupt state or double-decrement a custom counter.
- A custom DECR counter never recovers from a dead child. If a child exhausts retries it never decrements, so the finalizer never fires. Decrement in an exhausted-retries handler too, or use Pro's failure-aware batches.
- Pushing thousands of children in one batch.jobs block bursts the client Redis pool. Enqueue in chunks and ensure the client pool is sized for the burst; see Tuning the Sidekiq Redis connection pool.
- Batch metadata expires. Sidekiq Pro batches have a Redis TTL (default ~30 days). Querying progress for an old bid returns an empty status rather than live data, and depending on the Pro version the lookup may instead raise Sidekiq::Batch::NoSuchBatch; handle both cases in status endpoints.
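The chunking advice above can be sketched as a small helper. `bulk_arg_chunks` is a hypothetical name and the chunk size of 500 is an arbitrary starting point, not a recommendation from Sidekiq; the helper only shapes the arguments and does not talk to Redis.

```ruby
# Splits ids into chunks of at most chunk_size jobs. Each inner element is
# one job's argument array, which is the shape Sidekiq's perform_bulk accepts.
def bulk_arg_chunks(ids, chunk_size: 500)
  ids.each_slice(chunk_size).map { |slice| slice.map { |id| [id] } }
end

chunks = bulk_arg_chunks((1..1200).to_a, chunk_size: 500)
puts chunks.map(&:size).inspect # prints [500, 500, 200]
```

Inside a batch.jobs block, each chunk could then be pushed with something like `chunks.each { |args| ResizeImageJob.perform_bulk(args) }`, so the client sends a few bulk pushes instead of over a thousand individual ones.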
Related
- Sidekiq Performance Tuning — concurrency and Redis tuning that determine how fast a batch fans out.
- Tuning the Sidekiq Redis connection pool — sibling guide; batches stress the client pool when enqueuing many children.
- Sidekiq middleware for job prioritization — sibling guide for routing exhausted batch children and prioritizing stages.
- Backend Frameworks & Worker Scaling — scaling the worker fleet that executes fan-out workloads.