Propagating trace context through Celery tasks

This page is part of the distributed tracing for async jobs series, within the broader topic of observability for job queues. It shows how to keep a single trace intact as work crosses from a web request into a Celery worker.

When a request enqueues a task, the trace usually dies at the broker. The producer span ends, the worker starts a fresh, unrelated trace, and Jaeger shows two disconnected fragments instead of one end-to-end timeline, so you cannot tell that the slow API call and the slow background job belong to the same user action. The fix is to carry the W3C traceparent from producer to consumer through Celery's task headers so that the worker's spans join the producer's trace. This guide does that with OpenTelemetry's Celery instrumentation, then verifies a single connected trace in Jaeger.

Prerequisites

  • A Celery 5.x app with a broker. If you need one, see setting up Celery with a Redis broker.
  • A Jaeger instance (all-in-one is fine) reachable via OTLP on port 4317.
  • Python 3.9+ and the OpenTelemetry packages below.
  • Both the producer process (e.g. your web app) and the worker process instrumented. Context only flows if both ends speak the same propagation format.

pip install opentelemetry-api opentelemetry-sdk \
  opentelemetry-instrumentation-celery \
  opentelemetry-exporter-otlp-proto-grpc

Step 1 — Configure a shared tracer provider

Both producer and worker need the same tracer setup, differing only in service name: a resource carrying service.name, an OTLP exporter pointed at Jaeger, and the W3C trace-context propagator (OpenTelemetry's default). Put this in a module both processes import.

# tracing.py — shared OpenTelemetry bootstrap
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

def init_tracing(service_name: str) -> None:
    provider = TracerProvider(
        resource=Resource.create({"service.name": service_name})
    )
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint="http://jaeger:4317"))
    )
    trace.set_tracer_provider(provider)
    # W3C traceparent is the default propagator; no extra config needed.

Use distinct service.name values (e.g. web-api and celery-worker) so Jaeger shows which side each span came from while keeping them in one trace.

Step 2 — Instrument Celery on both sides

CeleryInstrumentor automatically injects traceparent into task headers when a task is published and extracts it when a worker picks the task up. Call it in both the producer and the worker; the worker uses the worker_process_init signal so each prefork child instruments itself.

# celery_app.py — worker and producer share this app
from celery import Celery
from celery.signals import worker_process_init
from opentelemetry.instrumentation.celery import CeleryInstrumentor
from tracing import init_tracing

app = Celery("tasks", broker="redis://redis:6379/0")

@worker_process_init.connect(weak=False)
def _init_worker_tracing(**_):
    init_tracing("celery-worker")
    CeleryInstrumentor().instrument()

@app.task
def resize_image(image_id: str):
    # Spans created here nest under the worker's task span, which continues the producer's trace
    return f"resized {image_id}"

# producer.py: the web app or script that enqueues work
from opentelemetry import trace
from celery_app import resize_image
from tracing import init_tracing
from opentelemetry.instrumentation.celery import CeleryInstrumentor

init_tracing("web-api")
CeleryInstrumentor().instrument()
tracer = trace.get_tracer(__name__)

def handle_request(image_id: str):
    with tracer.start_as_current_span("handle_upload"):
        # .delay() now injects traceparent into the task headers
        resize_image.delay(image_id)

Instrument the producer too. If only the worker is instrumented, there is no parent context to carry and you still get disconnected traces.

Step 3 — Confirm the traceparent rides in task headers

Before trusting Jaeger, prove the header is actually on the message. Inspect headers from inside a task or with a before_task_publish signal.

# debug_headers.py — log the injected traceparent on publish
from celery.signals import before_task_publish

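@before_task_publish.connect
def _log_traceparent(headers=None, **_):
    # Import this module after CeleryInstrumentor().instrument() has run; a handler
    # connected earlier may fire before the header has been injected.
    # OpenTelemetry stores W3C context under the 'traceparent' header key.
    print("traceparent on publish:", (headers or {}).get("traceparent"))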

A line like traceparent on publish: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 confirms injection. The value has four dash-separated fields: version, trace ID, parent span ID, and flags. The second field (32 hex characters) is the trace ID that must match on the worker side.
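
If you want to compare trace IDs programmatically rather than by eye, a small parser is enough. The sketch below uses only the standard library and assumes the version 00 layout of the W3C header; the names TraceParent and parse_traceparent are made up for this example and are not part of OpenTelemetry or Celery.

```python
import re
from typing import NamedTuple

# version-traceid-parentid-flags, lowercase hex (W3C Trace Context, version 00).
# Simplified: it does not reject the all-zero trace or parent IDs the spec forbids.
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> TraceParent:
    """Split a traceparent header into its four fields."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid traceparent: {value!r}")
    flags = int(match["flags"], 16)
    # Bit 0 of the flags byte is the 'sampled' flag.
    return TraceParent(
        match["version"], match["trace_id"], match["parent_id"], bool(flags & 0x01)
    )


tp = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(tp.trace_id)  # 4bf92f3577b34da6a3ce929d0e0e4736
```

Logging tp.trace_id on both the publish side and inside the task gives you two strings that should be identical when propagation works.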

Step 4 — Add manual child spans inside the task (optional)

The instrumentation creates one span per task run. For deeper visibility, open child spans around the expensive parts of the task; they nest under the worker span automatically because the context was extracted.

# inside resize_image, add sub-spans for the slow phases
# (_download, _transform, and _upload stand in for your own helper functions)
from opentelemetry import trace
tracer = trace.get_tracer(__name__)

@app.task
def resize_image(image_id: str):
    with tracer.start_as_current_span("download"):
        data = _download(image_id)
    with tracer.start_as_current_span("transform") as span:
        span.set_attribute("image.bytes", len(data))
        out = _transform(data)
    with tracer.start_as_current_span("upload"):
        _upload(image_id, out)
    return image_id

These attributes and sub-spans appear nested under the task span in Jaeger's trace view, so you can see whether download, transform, or upload dominates.

Step 5 — Verify a single connected trace in Jaeger

Run both processes, trigger a request, and confirm one trace spans both services. The trace ID logged in Step 3 is your lookup key.

# Start a worker (separate terminal)
celery -A celery_app worker --loglevel=info --concurrency=2

# Trigger the producer
python -c "from producer import handle_request; handle_request('img-42')"

Open Jaeger UI (http://jaeger:16686), select service web-api, and find the trace. It should contain a handle_upload span (service web-api) with the Celery publish span beneath it, and the resize_image run span (service celery-worker) plus your download/transform/upload children — all under one trace ID.

# Or query the Jaeger API by trace ID (substitute the ID logged in Step 3) and assert both services appear
curl -s "http://jaeger:16686/api/traces/4bf92f3577b34da6a3ce929d0e0e4736" \
  | python -c "import sys,json; d=json.load(sys.stdin); \
print(sorted({p['serviceName'] \
for t in d['data'] for p in t['processes'].values()}))"
# Expect: ['celery-worker', 'web-api']

Seeing both web-api and celery-worker under the same trace ID is the definitive proof that context propagated.
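
The same check can live in a script or smoke test. The sketch below assumes the Jaeger query API returns each trace with a processes map keyed by process ID, each entry carrying a serviceName; that endpoint is used by the Jaeger UI and its shape may differ between versions, so treat the field names as assumptions to verify against your instance. The helper names services_in_trace and fetch_trace are illustrative.

```python
import json
from urllib.request import urlopen


def services_in_trace(payload: dict) -> list:
    """Return the sorted service names that contributed spans to the trace(s)."""
    return sorted(
        {
            process["serviceName"]
            for trace in payload.get("data", [])
            for process in trace.get("processes", {}).values()
        }
    )


def fetch_trace(base_url: str, trace_id: str) -> dict:
    """Fetch one trace as JSON from the Jaeger query service (not called below)."""
    with urlopen(f"{base_url}/api/traces/{trace_id}") as resp:
        return json.load(resp)


# Hand-written sample in the assumed response shape, so the sketch runs offline.
sample = {
    "data": [
        {
            "traceID": "4bf92f3577b34da6a3ce929d0e0e4736",
            "spans": [{"processID": "p1"}, {"processID": "p2"}],
            "processes": {
                "p1": {"serviceName": "web-api"},
                "p2": {"serviceName": "celery-worker"},
            },
        }
    ]
}
print(services_in_trace(sample))  # ['celery-worker', 'web-api']
```

Against a live instance you would call services_in_trace(fetch_trace("http://jaeger:16686", trace_id)) and assert that the result contains both service names.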

Gotchas and edge cases

Producer not instrumented. If only the worker calls CeleryInstrumentor().instrument(), there is no traceparent to inject and every task starts a new root trace. Instrument the producer process and ensure a span is active when .delay() is called.

Prefork children lose instrumentation. Instrumenting only in the parent before fork can leave children uninstrumented or with a broken exporter. Always (re)instrument inside worker_process_init with weak=False, as shown, so each child sets up its own tracer and exporter.

apply_async(headers=...) overwrites traceparent. If your code sets custom task headers, make sure you merge rather than replace — passing a fresh headers dict can clobber the injected traceparent and silently break propagation. Add your keys to the existing headers instead.
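
One way to make "merge rather than replace" hard to get wrong is to route custom headers through a small helper that refuses to touch the trace-context keys. This is a sketch only: merge_task_headers is a hypothetical function, not a Celery or OpenTelemetry API, and where you call it depends on how your code builds its headers.

```python
from typing import Mapping, Optional

# Header keys owned by W3C trace-context propagation.
_RESERVED = {"traceparent", "tracestate"}


def merge_task_headers(
    existing: Optional[Mapping[str, str]], extra: Optional[Mapping[str, str]]
) -> dict:
    """Copy `existing`, then add keys from `extra` without overwriting trace context."""
    merged = dict(existing or {})
    for key, value in (extra or {}).items():
        if key.lower() in _RESERVED:
            continue  # never let application code clobber propagation headers
        merged[key] = value
    return merged


existing = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
custom = {"tenant": "acme", "traceparent": "stale-value"}
print(merge_task_headers(existing, custom))
```

The stale traceparent in custom is dropped and the tenant key is added, so the injected context survives alongside the application's own metadata.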

Eager mode hides propagation bugs. With task_always_eager=True (common in tests), tasks run in-process and stay in the same trace regardless of header propagation, so tracing "works" in tests but breaks in production. Validate against a real worker and broker, not eager mode.
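
A propagation test can guard against this by failing fast when eager mode is on. The sketch below is an assumption about how you might structure such a guard: assert_real_worker is a made-up helper, and SimpleNamespace objects stand in for app.conf so the example runs without Celery installed.

```python
from types import SimpleNamespace


def assert_real_worker(conf) -> None:
    """Raise if the given Celery config would run tasks in-process."""
    if getattr(conf, "task_always_eager", False):
        raise RuntimeError(
            "task_always_eager is set; header propagation is not exercised in eager mode"
        )


# Stand-ins for app.conf in an eager test setup and a real-worker setup.
eager_conf = SimpleNamespace(task_always_eager=True)
real_conf = SimpleNamespace(task_always_eager=False)

assert_real_worker(real_conf)  # passes silently
try:
    assert_real_worker(eager_conf)
except RuntimeError as exc:
    print("blocked:", exc)
```

In a real test you would pass app.conf from celery_app and call the guard at the start of any test that asserts on trace IDs.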

FAQ

Does this work with RabbitMQ as well as Redis? Yes. Propagation rides in the task message headers, which both brokers carry, so the same CeleryInstrumentor setup works unchanged whether you run Redis or RabbitMQ as the broker.

Can I export to an OpenTelemetry Collector instead of directly to Jaeger? Yes, and it is the recommended production setup. Point the OTLPSpanExporter at the Collector's OTLP endpoint and let the Collector fan out to Jaeger, Tempo, or another backend; the instrumentation code does not change.
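
To switch between Jaeger and a Collector without editing tracing.py, one option is to resolve the endpoint from the environment. The sketch below reads OTEL_EXPORTER_OTLP_ENDPOINT, the variable name defined by the OpenTelemetry specification, and falls back to the Jaeger address used in Step 1; resolve_otlp_endpoint and the otel-collector hostname are assumptions for illustration.

```python
import os
from typing import Mapping, Optional

# Fallback matches the endpoint hard-coded in Step 1.
DEFAULT_OTLP_ENDPOINT = "http://jaeger:4317"


def resolve_otlp_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OTLP endpoint from the environment, or the Jaeger default."""
    env = os.environ if env is None else env
    return env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or DEFAULT_OTLP_ENDPOINT


# 'otel-collector' is a hypothetical hostname for a Collector deployment.
print(resolve_otlp_endpoint({"OTEL_EXPORTER_OTLP_ENDPOINT": "http://otel-collector:4317"}))
print(resolve_otlp_endpoint({}))
```

In init_tracing you would then pass endpoint=resolve_otlp_endpoint() to OTLPSpanExporter. The exporter can also pick up this variable itself when no endpoint argument is given, so the helper mainly serves to keep an explicit default.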
