Message queue
MES-ij kyoo
A system that stores and delivers messages between services, allowing asynchronous communication.
A message queue sits between two services and holds messages until the receiver is ready to process them. Service A puts a message in the queue. Service B picks it up when it can. If Service B is down, the message waits in the queue. If Service B is slow, messages accumulate but nothing breaks. The queue absorbs the difference in speed between producer and consumer.
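The decoupling described above can be sketched with Python's standard-library `queue.Queue` (the service names are illustrative, not a real broker API):

```python
import queue
import threading

# Minimal sketch of producer/consumer decoupling.
q = queue.Queue()

def service_a():
    # Producer: puts messages in the queue and moves on immediately.
    for i in range(3):
        q.put(f"message-{i}")

def service_b(results):
    # Consumer: picks messages up whenever it is ready.
    for _ in range(3):
        results.append(q.get())
        q.task_done()

results = []
service_a()  # the producer finishes even though no consumer is running yet
consumer = threading.Thread(target=service_b, args=(results,))
consumer.start()
consumer.join()
```

Because the producer only touches the queue, it is unaffected by whether the consumer is slow, busy, or not yet started — that buffering is the whole point.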
RabbitMQ, Amazon SQS, and Apache Kafka are the most popular message queue systems (though Kafka is technically a distributed event log). They solve different problems at different scales. SQS is the simplest: managed by AWS, no infrastructure to run, works out of the box. RabbitMQ offers more routing flexibility and runs anywhere. Kafka handles millions of messages per second and stores them durably for replay.
Message queues enable a pattern called "work queues" that is fundamental to scaling. Instead of one server processing all image uploads synchronously, the web server puts each upload into a queue. Ten worker servers each pull from the queue and process uploads in parallel. If traffic spikes, you add more workers. If traffic drops, workers sit idle. The queue is the buffer that decouples production from consumption.
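The work-queue pattern above can be sketched in Python with threads standing in for worker servers (`process_upload` is a stand-in for real image processing, not part of any queue API):

```python
import queue
import threading

jobs = queue.Queue()
processed = []
lock = threading.Lock()

def process_upload(job):
    # Stand-in for real work (resizing, moderation, etc.).
    return f"thumbnail-for-{job}"

def worker():
    while True:
        try:
            job = jobs.get_nowait()  # workers sit idle once the queue is empty
        except queue.Empty:
            return
        with lock:
            processed.append(process_upload(job))
        jobs.task_done()

# The web server's side: enqueue uploads and return immediately.
for i in range(100):
    jobs.put(f"upload-{i}")

# Scaling out is just starting more workers against the same queue.
workers = [threading.Thread(target=worker) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

All ten workers pull from the same queue, so each job is processed exactly once; adding capacity means starting more worker threads (or servers), with no change to the producer.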
Examples
A platform processes image uploads asynchronously.
When a user uploads a profile photo, the web server puts a message in an SQS queue with the image URL and user ID. A fleet of workers polls the queue, downloads the image, generates thumbnails in five sizes, runs content moderation, and updates the user's profile. The user sees "Photo uploading..." and gets a notification when processing completes. The web server responds in 200ms regardless of how long processing takes.
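The worker's message handler might look like the sketch below. The queue plumbing, image processing, and moderation are stubbed out; `THUMBNAIL_SIZES` and the helper names are illustrative, not SQS APIs:

```python
THUMBNAIL_SIZES = [32, 64, 128, 256, 512]  # five sizes, per the example

def generate_thumbnails(image_url):
    # Stub: a real worker would download and resize the image.
    return {size: f"{image_url}?w={size}" for size in THUMBNAIL_SIZES}

def moderate(image_url):
    # Stub: assume the image passes content moderation.
    return True

def handle_message(message):
    # One queue message in, one profile update out.
    if not moderate(message["image_url"]):
        return None
    return {
        "user_id": message["user_id"],
        "thumbnails": generate_thumbnails(message["image_url"]),
    }

profile_update = handle_message(
    {"user_id": 42, "image_url": "https://example.com/photo.jpg"}
)
```

The handler only ever sees one message at a time, which is what lets the fleet scale horizontally: any worker can process any message.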
A team uses a dead letter queue to handle failures.
The email sending worker occasionally fails when the email provider has an outage. Failed messages are retried 3 times. After 3 failures, the message moves to a dead letter queue (DLQ). The team monitors the DLQ size. When the email provider recovers, an engineer reviews the DLQ, confirms the messages are valid, and replays them. No emails are lost. No failed messages clog the main queue.
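The retry-then-DLQ flow can be sketched in a few lines; `FlakyProvider` is a made-up stand-in that simulates an email provider outage:

```python
import queue

MAX_ATTEMPTS = 3
dead_letter_queue = queue.Queue()

class FlakyProvider:
    """Simulated email provider that fails a set number of times."""
    def __init__(self, failures):
        self.failures = failures

    def send(self, email):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("provider outage")

def process(email, provider):
    for _ in range(MAX_ATTEMPTS):
        try:
            provider.send(email)
            return True
        except RuntimeError:
            continue  # retry
    dead_letter_queue.put(email)  # after 3 failures, park it in the DLQ
    return False

sent = process("welcome@example.com", FlakyProvider(failures=2))   # 3rd try works
parked = process("receipt@example.com", FlakyProvider(failures=5)) # goes to DLQ
```

The key property: a permanently failing message ends up in the DLQ for later replay instead of blocking the main queue forever.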
A queue absorbs a traffic spike without downtime.
The company sends a promotional email to 500,000 users. Within minutes, 50,000 users click the link simultaneously. The web servers accept the requests and queue order-processing jobs, so the queue grows to 50,000 messages. The 10 worker servers together process 200 messages per second, and the queue drains in about four minutes. Without the queue, 50,000 simultaneous requests would overwhelm the order processing service and crash it.
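The drain-time arithmetic from the example, spelled out:

```python
backlog = 50_000      # messages queued by the traffic spike
throughput = 200      # messages/second across the whole worker fleet

drain_seconds = backlog / throughput   # 250 seconds
drain_minutes = drain_seconds / 60     # roughly 4 minutes
```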
Frequently asked questions
When should you use a message queue instead of a direct API call?
Use a queue when the work can happen later (sending emails, generating reports, processing uploads), when the consumer might be slower than the producer, or when you need to survive failures gracefully. Use a direct API call when you need the result immediately (checking inventory before confirming an order) or when the operation is fast and reliable. A good rule of thumb: if the user does not need to wait for the result, put it in a queue.
What is the difference between a message queue and a pub/sub system?
In a message queue, each message is consumed by exactly one consumer. If three workers read from the same queue, each message goes to one of them. In pub/sub (publish/subscribe), each message goes to all subscribers. If three services subscribe to a topic, all three get every message. Use a queue for distributing work (process this image). Use pub/sub for broadcasting events (an order was placed, and multiple services need to know).
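The delivery difference can be shown with plain Python data structures (a toy illustration, not a real broker API):

```python
from collections import deque

# Queue semantics: each message goes to exactly one consumer.
work_queue = deque(["msg-1", "msg-2", "msg-3"])
worker_a = work_queue.popleft()  # takes msg-1
worker_b = work_queue.popleft()  # takes msg-2; never sees worker_a's message

# Pub/sub semantics: each message is delivered to every subscriber.
subscribers = {"billing": [], "shipping": [], "analytics": []}

def publish(event):
    for inbox in subscribers.values():
        inbox.append(event)

publish("order-placed")  # all three services receive the event
```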
Related terms
Event-driven architecture: a design pattern where systems communicate by producing and consuming events instead of direct calls.
Microservices: an architecture where an application is built as a collection of small, independent services.
Horizontal scaling: adding more servers to handle increased load, instead of upgrading existing servers.
Webhook: an HTTP callback that sends data to your application automatically when an event occurs in another system.
Idempotency: the property where performing an operation multiple times produces the same result as performing it once.

Want the complete playbook?
Picks and Shovels is the definitive guide to developer marketing. Amazon #1 bestseller with practical strategies from 30 years of marketing to developers.