If you have worked with machine learning or large-scale data pipelines, chances are you’ve used some sort of queueing system.
Queues let services talk to each other asynchronously: you send off work, don’t wait around, and let another system pick it up when ready. This is essential when your tasks aren’t instant — think long-running model training jobs, batch ETL pipelines, or even processing requests for LLMs that take minutes per query.
So why am I writing this? I recently migrated a production queueing setup to RabbitMQ, ran into a bunch of bugs, and found that documentation was thin on the trickier parts. After a fair bit of trial and error, I thought it’d be worth sharing what I learned.
Hope you will find this useful!
A quick primer: queues vs request-response model
Microservices typically communicate in two styles — the classic request–response model, or the more flexible queue-based model.
Imagine ordering pizza. In a request–response model, you tell the waiter your order and then wait. He disappears, and thirty minutes later your pizza shows up — but you’ve been left in the dark the whole time.
In a queue-based model, the waiter repeats your order, gives you a number, and drops it into the kitchen’s queue. Now you know it’s being handled, and you’re free to do something else till the chef gets to it.
That’s the difference: request–response keeps you blocked until the work is done, while queues confirm right away and let the work happen in the background.
What is RabbitMQ?
RabbitMQ is a popular open-source message broker that ensures messages are reliably delivered from producers (senders) to consumers (receivers). First released in 2007 and written in Erlang, it implements AMQP (Advanced Message Queuing Protocol), an open standard for structuring, routing, and acknowledging messages.
Think of it like a post office for distributed systems: applications drop off messages, RabbitMQ sorts them into queues, and consumers pick them up when ready.
A common pairing in the Python world is Celery + RabbitMQ: RabbitMQ brokers the tasks, while Celery workers execute them in the background.
In containerised setups, RabbitMQ typically runs in its own container, while Celery workers run in separate containers that you can scale independently.
How it works at a high level
Your app wants to run some work asynchronously. Since this task might take a while, you don’t want the app to sit idle waiting. Instead, it creates a message describing the task and sends it to RabbitMQ.
- Exchange: This lives inside RabbitMQ. It doesn’t store messages but just decides where each message should go based on rules you set (routing keys and bindings). Producers publish messages to an exchange, which acts as a routing intermediary.
- Queues: They’re like mailboxes. Once the exchange decides which queue(s) a message should go to, it sits there until it’s picked up.
- Consumer: The service that reads and processes messages from a queue. In a Celery setup, the Celery worker is the consumer — it pulls tasks off the queue and does the actual work.
Once the message is routed into a queue, the RabbitMQ broker pushes it out to a consumer (if one is available) over a TCP connection.
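To make this flow concrete, here’s a minimal sketch using the pika client. The exchange, queue, and payload names are made up for illustration, and the producer and consumer would normally live in separate processes:

```python
import pika

# Connect to a local RabbitMQ broker (adjust host/credentials for your setup)
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare an exchange and a queue, and bind them together
channel.exchange_declare(exchange="tasks", exchange_type="direct")
channel.queue_declare(queue="training_jobs", durable=True)
channel.queue_bind(queue="training_jobs", exchange="tasks", routing_key="train")

# Producer side: publish a message describing the work
channel.basic_publish(
    exchange="tasks",
    routing_key="train",
    body=b'{"model": "resnet50", "epochs": 10}',
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# Consumer side (normally a separate process): the broker pushes messages here
def handle(ch, method, properties, body):
    print("Processing:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # confirm once done

channel.basic_consume(queue="training_jobs", on_message_callback=handle)
channel.start_consuming()  # blocks and waits for deliveries
```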
Core components in RabbitMQ
1. Routing and binding keys
Routing and binding keys work together to decide where a message ends up.
- A routing key is attached to a message by the producer.
- A binding key is the rule a queue declares when it connects (binds) to an exchange.
A binding defines the link between an exchange and a queue.
When a message is sent, the exchange looks at the message’s routing key. If that routing key matches the binding key of a queue, the message is delivered to that queue.
A message can only have one routing key.
A queue can have one or multiple binding keys, meaning it can listen for several different routing keys or patterns.
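As a quick sketch (exchange, queue, and key names are illustrative), here is a queue declaring two binding keys on one exchange, while each published message still carries exactly one routing key:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="notifications", exchange_type="direct")
channel.queue_declare(queue="alerts", durable=True)

# One queue, two binding keys: it receives messages routed with either key
channel.queue_bind(queue="alerts", exchange="notifications", routing_key="payment.failed")
channel.queue_bind(queue="alerts", exchange="notifications", routing_key="login.suspicious")

# A message carries exactly one routing key
channel.basic_publish(exchange="notifications", routing_key="payment.failed", body=b"card declined")
```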
2. Exchanges
An exchange in RabbitMQ is like a traffic controller. It receives messages but does not store them; its key job is to decide which queue(s) each message should go to, based on rules.
If the routing key of a message does not match the binding key of any queue, the message will not be delivered.
There are several types of exchanges, each with its own routing style.
2a) Direct exchange
Think of a direct exchange like an exact address delivery. The exchange looks for queues with binding keys that exactly match the routing key.
- If only one queue matches, the message will only be sent there (1:1).
- If multiple queues have the same binding key, the message will be copied to all of them (1:many).
2b) Fanout exchange
A fanout exchange is like shouting through a loudspeaker.
Every message is copied to all queues bound to the exchange. The routing keys are ignored, and it is always a 1:many broadcast.
Fanout exchanges are useful when the same message needs to reach multiple queues whose consumers may process it in different ways.
2c) Topic exchange
A topic exchange works like a subscription system with categories.
Every message has a routing key, for example "order.completed". Queues can then subscribe to patterns such as "order.*". This means that whenever a message is related to an order, it will be delivered to any queues that have subscribed to that category.
Depending on the patterns, a message might end up in just one queue or in several at the same time.
There are two important special cases for binding keys:
- * (star) matches exactly one word in the routing key.
- # (hash) matches zero or more words.
Let’s illustrate this to make the syntax a lot more intuitive.
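Here’s a small sketch with pika (queue names and routing keys are invented for the example):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="events", exchange_type="topic")

# "order.*" matches exactly one word after "order": order.created, order.completed ...
channel.queue_declare(queue="order_updates")
channel.queue_bind(queue="order_updates", exchange="events", routing_key="order.*")

# "order.#" matches zero or more words: order, order.completed, order.completed.eu ...
channel.queue_declare(queue="order_audit")
channel.queue_bind(queue="order_audit", exchange="events", routing_key="order.#")

# Delivered to both queues: matches order.* and order.#
channel.basic_publish(exchange="events", routing_key="order.completed", body=b"{}")

# Delivered only to order_audit: two words follow "order", so order.* does not match
channel.basic_publish(exchange="events", routing_key="order.completed.eu", body=b"{}")
```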

2d) Headers exchange
A headers exchange is like sorting mail by labels instead of addresses.
Instead of looking at the routing key (like "order.completed"), the exchange inspects the headers of a message: these are key–value pairs attached as metadata. For instance:
- x-match: all, priority: high, type: email → the queue will only get messages that have both priority=high and type=email.
- x-match: any, region: us, region: eu → the queue will get messages where at least one of the conditions is true (region=us or region=eu).
The x-match field is what determines whether all rules must match or any one rule is enough.
Because multiple queues can each declare their own header rules, a single message might end up in just one queue (1:1) or in several queues at once (1:many).
Headers exchanges are less common in practice, but they’re useful when routing depends on more complex business logic. For example, you might want to deliver a message only if customer_tier=premium, message_format=json, or region=apac.
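A rough sketch of a headers binding with pika, using the made-up rules above:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="mail_sorter", exchange_type="headers")

# Bind with header rules instead of a routing key; x-match decides "all" vs "any"
channel.queue_declare(queue="urgent_emails")
channel.queue_bind(
    queue="urgent_emails",
    exchange="mail_sorter",
    routing_key="",  # ignored by headers exchanges
    arguments={"x-match": "all", "priority": "high", "type": "email"},
)

# This message carries matching headers, so it lands in urgent_emails
channel.basic_publish(
    exchange="mail_sorter",
    routing_key="",
    body=b"{}",
    properties=pika.BasicProperties(headers={"priority": "high", "type": "email"}),
)
```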
2e) Dead letter exchange
A dead letter exchange is a safety net for undeliverable messages.
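A common way to wire one up (the exchange and queue names here are just placeholders) is to point a queue’s dead-letter arguments at a separate exchange, so rejected or expired messages are rerouted there instead of vanishing:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# The safety net: a normal exchange + queue that collect dead-lettered messages
channel.exchange_declare(exchange="dlx", exchange_type="direct")
channel.queue_declare(queue="dead_letters", durable=True)
channel.queue_bind(queue="dead_letters", exchange="dlx", routing_key="failed")

# The working queue: anything rejected or expired here is rerouted to the DLX
channel.queue_declare(
    queue="work",
    durable=True,
    arguments={
        "x-dead-letter-exchange": "dlx",
        "x-dead-letter-routing-key": "failed",
    },
)
```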
3. A push delivery model
This means that as soon as a message enters a queue, the broker will push it out to a consumer that is subscribed and ready. The consumer doesn’t request messages and instead just listens on the queue.
This push approach is great for low-latency delivery — messages get to consumers as soon as possible.
Useful features in RabbitMQ
RabbitMQ’s architecture lets you shape message flow to fit your workload. Here are some useful patterns.
Work queues — competing consumers pattern
You publish tasks into one queue, and many consumers (e.g. Celery workers) all listen to that queue. The broker delivers each message to exactly one consumer, so workers “compete” for work. This effectively gives you simple load balancing.
If you’re on Celery, you’ll want to keep worker_prefetch_multiplier=1. This means a worker only prefetches one message at a time, which stops slow workers from hoarding tasks.
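In Celery terms that’s a one-line setting; a minimal sketch, assuming a Celery app called myapp:

```python
from celery import Celery

app = Celery("myapp", broker="amqp://guest:guest@localhost//")

# Each worker process prefetches only one message at a time,
# so a slow task doesn't sit on a backlog of reserved tasks.
app.conf.worker_prefetch_multiplier = 1
```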
Pub/sub pattern
Multiple queues are bound to an exchange, and each queue gets a copy of the message (fanout or topic exchanges). Because each queue gets its own copy, different consumers can process the same event in different ways.
Explicit acknowledgements
RabbitMQ uses explicit acknowledgements (ACKs) to guarantee reliable delivery. An ACK is a confirmation sent from the consumer back to the broker once a message has been successfully processed.
When a consumer sends an ACK, the broker removes that message from the queue. If the consumer NACKs or dies before ACKing, RabbitMQ can redeliver (requeue) the message or route it to a dead letter queue for inspection or retry.
There is, however, an important nuance when using Celery. Celery does send acknowledgements by default, but it sends them early — right after a worker receives the task, before it actually executes it. This behaviour (acks_late=False, which is the default) means that if a worker crashes midway through running the task, the broker has already been told the message was handled and won’t redeliver it.
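If your tasks are idempotent and you’d rather have them redelivered after a crash, you can flip this behaviour. A sketch, assuming the same Celery app object as above:

```python
# Acknowledge only after the task has finished running.
# If the worker dies mid-task, RabbitMQ will redeliver the message.
app.conf.task_acks_late = True

# Often paired with this, so tasks are also requeued when a worker
# process is killed (e.g. OOM) rather than being marked as failed.
app.conf.task_reject_on_worker_lost = True
```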
Priority queues
RabbitMQ has an out-of-the-box priority queueing feature that lets higher-priority messages jump the line. Under the hood, the broker creates an internal sub-queue for each priority level defined on a queue.
For example, if you configure five priority levels, RabbitMQ maintains five internal sub-queues. Within each level, messages are still consumed in FIFO order, but when consumers are ready, RabbitMQ will always try to deliver messages from higher-priority sub-queues first.
Maintaining these sub-queues means more overhead as the number of priority levels grows. RabbitMQ’s docs note that although priorities between 1 and 255 are supported, values between 1 and 5 are highly recommended.
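Declaring a priority queue is just a queue argument plus a priority on the message; a sketch with pika (the queue name is a placeholder):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare a queue with 5 priority levels
channel.queue_declare(queue="jobs", durable=True, arguments={"x-max-priority": 5})

# Higher-priority messages are delivered before lower-priority ones
channel.basic_publish(
    exchange="",
    routing_key="jobs",  # default exchange routes by queue name
    body=b"urgent job",
    properties=pika.BasicProperties(priority=5),
)
```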
Message TTL & scheduled deliveries
Message TTL (per-message or per-queue) automatically expires stale messages, and delayed delivery is available via plugins (e.g., the delayed-message exchange) when you need scheduled execution.
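A sketch of both TTL variants with pika (queue name and TTL values are arbitrary):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Per-queue TTL: any message older than 60 seconds is expired
# (and can be dead-lettered if the queue has a DLX configured)
channel.queue_declare(queue="stock_ticks", arguments={"x-message-ttl": 60000})

# Per-message TTL: expiration is given in milliseconds, as a string
channel.basic_publish(
    exchange="",
    routing_key="stock_ticks",
    body=b"AAPL 191.20",
    properties=pika.BasicProperties(expiration="30000"),
)
```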
How to optimise your RabbitMQ and Celery setup
When you deploy Celery with RabbitMQ, you’ll notice a few “mystery” queues and exchanges appearing in the RabbitMQ management dashboard. These aren’t mistakes — they’re part of Celery’s internals.
After a few painful rounds of trial and error, here’s what I learned about how Celery really uses RabbitMQ under the hood — and how to tune it properly.
Kombu
Celery relies on Kombu, a Python messaging framework. Kombu abstracts away the low-level AMQP operations, giving Celery a high-level API to:
- Declare queues and exchanges
- Publish messages (tasks)
- Consume messages in workers
It also handles serialisation (JSON, Pickle, YAML, or custom formats) so tasks can be encoded and decoded across the wire.
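For example, a JSON-only setup looks roughly like this (app name and broker URL are placeholders):

```python
from celery import Celery

app = Celery("myapp", broker="amqp://guest:guest@localhost//")

# Kombu handles the encoding/decoding; here we restrict everything to JSON
app.conf.task_serializer = "json"
app.conf.result_serializer = "json"
app.conf.accept_content = ["json"]  # reject any other content type
```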
Celery events and the celeryev exchange
Celery includes an event system that tracks worker and task state. Internally, events are published to a special topic exchange called celeryev.
There are two such event types:
- Worker events, e.g. worker.online, worker.heartbeat, worker.offline, are always on and are lightweight liveness signals.
- Task events, e.g. task-received, task-started, task-succeeded, task-failed, are disabled by default unless the -E flag is added.
You have fine-grained control over both types of events: you can turn off worker events (by turning off gossip, more on that below) while keeping task events on.
Gossip
Gossip is Celery’s mechanism for workers to “chat” about cluster state — who’s alive, who just joined, who dropped out, and occasionally elect a leader for coordination. It’s useful for debugging or ad-hoc cluster coordination.
By default, Gossip is enabled. When a worker starts:
- It creates an exclusive, auto-delete queue just for itself.
- That queue is bound to the celeryev topic exchange with the routing key pattern worker.#.
Because every worker subscribes to every worker.* event, the traffic grows quickly as the cluster scales.
With N workers, each one publishes its own heartbeat, and RabbitMQ fans that message out to the other N-1 gossip queues. In effect, you get an N × (N-1) fan-out pattern.
In my setup with 100 workers, that meant a single heartbeat was duplicated 99 times. During deployments — when workers were spinning up and shutting down, generating a burst of join, leave, and heartbeat events — the pattern spiraled out of control. The celeryev exchange was suddenly handling 7–8k messages per second, pushing RabbitMQ past its memory watermark and leaving the cluster in a degraded state.
When this memory limit is exceeded, RabbitMQ blocks publishers until usage drops. Once memory falls back under the threshold, RabbitMQ resumes normal operation.
However, this means that during the memory spike the broker becomes unusable — effectively causing downtime. You won’t want that in production!
The solution is to disable Gossip so workers don’t bind to worker.#. You can do this in the Docker Compose file where the workers are spun up:
celery -A myapp worker --without-gossip
Mingle
Mingle is a worker startup step where the new worker contacts other workers to synchronise state — things like revoked tasks and logical clocks. This happens only once, during worker boot. If you don’t need this coordination, you can also disable it with --without-mingle.
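Combined with the gossip flag from earlier, the worker command becomes:
celery -A myapp worker --without-gossip --without-mingle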
Occasional connection drops
In production, connections between Celery and RabbitMQ can occasionally drop — for example, due to a brief network blip. If you have monitoring in place, you may see these as transient errors.
The good news is that these drops are usually recoverable. Celery relies on Kombu, which includes automatic connection retry logic. When a connection fails, the worker will attempt to reconnect and resume consuming tasks.
As long as your queues are configured correctly, messages are not lost:
- durable=True (queue survives broker restart)
- delivery_mode=2 (persistent messages)
- Consumers send explicit ACKs to confirm successful processing
If a connection drops before a task is acknowledged, RabbitMQ will safely requeue it for delivery once the worker reconnects.
Once the connection is re-established, the worker continues normal operation. In practice, occasional drops are fine, as long as they remain infrequent and queue depth doesn’t build up.
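Here’s roughly what those settings look like when declaring queues through Celery and Kombu (queue, exchange, and app names are placeholders):

```python
from celery import Celery
from kombu import Exchange, Queue

app = Celery("myapp", broker="amqp://guest:guest@localhost//")

# durable=True: the queue definition survives a broker restart
app.conf.task_queues = (
    Queue("default", Exchange("default", type="direct"),
          routing_key="default", durable=True),
)

# "persistent" (delivery_mode=2): messages are written to disk,
# so they survive a broker restart too
app.conf.task_default_delivery_mode = "persistent"
```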
To end off
That’s all, folks! These are some of the key lessons I’ve learned running RabbitMQ + Celery in production. I hope this deep dive has helped you better understand how things work under the hood. If you have more tips, I’d love to hear them in the comments, and do reach out!