Message queue
MES-ij kyoo
A system that stores and delivers messages between services, allowing asynchronous communication.
A message queue sits between two services and holds messages until the receiver is ready to process them. Service A puts a message in the queue. Service B picks it up when it can. If Service B is down, the message waits in the queue. If Service B is slow, messages accumulate but nothing breaks. The queue absorbs the difference in speed between producer and consumer.
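The decoupling described above can be sketched with Python's standard-library `queue.Queue` (the service names are illustrative, not a real broker API):

```python
import queue
import threading

# Minimal sketch of producer/consumer decoupling.
q = queue.Queue()

def service_a():
    # Producer: puts messages in the queue and moves on immediately.
    for i in range(3):
        q.put(f"message-{i}")

def service_b(results):
    # Consumer: picks messages up whenever it is ready.
    for _ in range(3):
        results.append(q.get())
        q.task_done()

results = []
service_a()  # the producer finishes even though no consumer is running yet
consumer = threading.Thread(target=service_b, args=(results,))
consumer.start()
consumer.join()
```

Because the producer only touches the queue, it is unaffected by whether the consumer is slow, busy, or not yet started — that buffering is the whole point.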
RabbitMQ, Amazon SQS, and Apache Kafka are the most popular message queue systems (though Kafka is technically a distributed event log). They solve different problems at different scales. SQS is the simplest: managed by AWS, no infrastructure to run, works out of the box. RabbitMQ offers more routing flexibility and runs anywhere. Kafka handles millions of messages per second and stores them durably for replay.
Message queues enable a pattern called "work queues" that is fundamental to scaling. Instead of one server processing all image uploads synchronously, the web server puts each upload into a queue. Ten worker servers each pull from the queue and process uploads in parallel. If traffic spikes, you add more workers. If traffic drops, workers sit idle. The queue is the buffer that decouples production from consumption.
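The work-queue pattern above can be sketched in Python with threads standing in for worker servers (`process_upload` is a stand-in for real image processing, not part of any queue API):

```python
import queue
import threading

jobs = queue.Queue()
processed = []
lock = threading.Lock()

def process_upload(job):
    # Stand-in for real work (resizing, moderation, etc.).
    return f"thumbnail-for-{job}"

def worker():
    while True:
        try:
            job = jobs.get_nowait()  # workers sit idle once the queue is empty
        except queue.Empty:
            return
        with lock:
            processed.append(process_upload(job))
        jobs.task_done()

# The web server's side: enqueue uploads and return immediately.
for i in range(100):
    jobs.put(f"upload-{i}")

# Scaling out is just starting more workers against the same queue.
workers = [threading.Thread(target=worker) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

All ten workers pull from the same queue, so each job is processed exactly once; adding capacity means starting more worker threads (or servers), with no change to the producer.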
Examples
A platform processes image uploads asynchronously.
When a user uploads a profile photo, the web server puts a message in an SQS queue with the image URL and user ID. A fleet of workers polls the queue, downloads the image, generates thumbnails in five sizes, runs content moderation, and updates the user's profile. The user sees "Photo uploading..." and gets a notification when processing completes. The web server responds in 200ms regardless of how long processing takes.
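The worker's message handler might look like the sketch below. The queue plumbing, image processing, and moderation are stubbed out; `THUMBNAIL_SIZES` and the helper names are illustrative, not SQS APIs:

```python
THUMBNAIL_SIZES = [32, 64, 128, 256, 512]  # five sizes, per the example

def generate_thumbnails(image_url):
    # Stub: a real worker would download and resize the image.
    return {size: f"{image_url}?w={size}" for size in THUMBNAIL_SIZES}

def moderate(image_url):
    # Stub: assume the image passes content moderation.
    return True

def handle_message(message):
    # One queue message in, one profile update out.
    if not moderate(message["image_url"]):
        return None
    return {
        "user_id": message["user_id"],
        "thumbnails": generate_thumbnails(message["image_url"]),
    }

profile_update = handle_message(
    {"user_id": 42, "image_url": "https://example.com/photo.jpg"}
)
```

The handler only ever sees one message at a time, which is what lets the fleet scale horizontally: any worker can process any message.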
A team uses a dead letter queue to handle failures.
The email sending worker occasionally fails when the email provider has an outage. Failed messages are retried 3 times. After 3 failures, the message moves to a dead letter queue (DLQ). The team monitors the DLQ size. When the email provider recovers, an engineer reviews the DLQ, confirms the messages are valid, and replays them. No emails are lost. No failed messages clog the main queue.
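The retry-then-DLQ flow can be sketched in a few lines; `FlakyProvider` is a made-up stand-in that simulates an email provider outage:

```python
import queue

MAX_ATTEMPTS = 3
dead_letter_queue = queue.Queue()

class FlakyProvider:
    """Simulated email provider that fails a set number of times."""
    def __init__(self, failures):
        self.failures = failures

    def send(self, email):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("provider outage")

def process(email, provider):
    for _ in range(MAX_ATTEMPTS):
        try:
            provider.send(email)
            return True
        except RuntimeError:
            continue  # retry
    dead_letter_queue.put(email)  # after 3 failures, park it in the DLQ
    return False

sent = process("welcome@example.com", FlakyProvider(failures=2))   # 3rd try works
parked = process("receipt@example.com", FlakyProvider(failures=5)) # goes to DLQ
```

The key property: a permanently failing message ends up in the DLQ for later replay instead of blocking the main queue forever.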
A queue absorbs a traffic spike without downtime.
The company sends a promotional email to 500,000 users. Within minutes, 50,000 users click the link simultaneously. The web servers accept the requests and queue order-processing jobs, so the queue grows to 50,000 messages. The 10 worker servers together process 200 messages per second, and the queue drains in about four minutes. Without the queue, 50,000 simultaneous requests would overwhelm the order processing service and crash it.
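The drain-time arithmetic from the example, spelled out:

```python
backlog = 50_000      # messages queued by the traffic spike
throughput = 200      # messages/second across the whole worker fleet

drain_seconds = backlog / throughput   # 250 seconds
drain_minutes = drain_seconds / 60     # roughly 4 minutes
```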
Frequently asked questions
When should you use a message queue instead of a direct API call?
Use a queue when the work can happen later (sending emails, generating reports, processing uploads), when the consumer might be slower than the producer, or when you need to survive failures gracefully. Use a direct API call when you need the result immediately (checking inventory before confirming an order) or when the operation is fast and reliable. A good rule of thumb: if the user does not need to wait for the result, put it in a queue.
What is the difference between a message queue and a pub/sub system?
In a message queue, each message is consumed by exactly one consumer. If three workers read from the same queue, each message goes to one of them. In pub/sub (publish/subscribe), each message goes to all subscribers. If three services subscribe to a topic, all three get every message. Use a queue for distributing work (process this image). Use pub/sub for broadcasting events (an order was placed, and multiple services need to know).
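The delivery difference can be shown with plain Python data structures (a toy illustration, not a real broker API):

```python
from collections import deque

# Queue semantics: each message goes to exactly one consumer.
work_queue = deque(["msg-1", "msg-2", "msg-3"])
worker_a = work_queue.popleft()  # takes msg-1
worker_b = work_queue.popleft()  # takes msg-2; never sees worker_a's message

# Pub/sub semantics: each message is delivered to every subscriber.
subscribers = {"billing": [], "shipping": [], "analytics": []}

def publish(event):
    for inbox in subscribers.values():
        inbox.append(event)

publish("order-placed")  # all three services receive the event
```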
Related terms
Event-driven architecture: a design pattern where systems communicate by producing and consuming events instead of direct calls.
Microservices: an architecture where an application is built as a collection of small, independent services.
Horizontal scaling: adding more servers to handle increased load, instead of upgrading existing servers.
Webhook: an HTTP callback that sends data to your application automatically when an event occurs in another system.
Idempotency: the property where performing an operation multiple times produces the same result as performing it once.

Want the complete playbook?
Picks and Shovels is the definitive guide to developer marketing. Amazon #1 bestseller with practical strategies from 30 years of marketing to developers.