SQS’s ApproximateNumberOfMessages metric isn’t just a count; it doubles as a delay signal, because it shows how many messages are waiting for a consumer to pick them up.
Let’s watch this in action. Imagine a simple SQS queue named my-processing-queue. We’ve got a producer sending messages, and a consumer that’s a bit slow.
# Producer sends a message
aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue --message-body "{\"data\": \"payload1\"}"
# Producer sends another message
aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue --message-body "{\"data\": \"payload2\"}"
# Consumer receives messages (but is slow to process)
aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue --max-number-of-messages 10
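If you check ApproximateNumberOfMessages for my-processing-queue right after sending those messages, and before the consumer’s receive call, you’ll see a value of 2: two messages in the queue, available for retrieval. In CloudWatch the same count is published as ApproximateNumberOfMessagesVisible, and because SQS pushes metrics to CloudWatch at one-minute intervals, the graph can lag the queue by a minute or so.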
Now, if your consumer is configured to process messages one by one and takes 10 seconds per message, the ApproximateNumberOfMessages metric will start to fluctuate. As messages are sent, the count goes up. As the consumer receives and processes them, the count goes down. If the producer is faster than the consumer, the count will steadily climb.
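To make that arithmetic concrete, here is a minimal sketch of how the backlog grows when production outpaces consumption. The helper name backlog_after and the producer rate in the usage line are illustrative assumptions; only the 10-second processing time (6 messages per minute for a single consumer) comes from the scenario above.

```shell
# Hypothetical helper (not part of the AWS CLI): estimate the visible backlog
# after a number of minutes, given steady per-minute send and processing rates.
# Usage: backlog_after MINUTES SENT_PER_MIN PROCESSED_PER_MIN [STARTING_BACKLOG]
backlog_after() {
  minutes=$1
  sent_per_min=$2
  processed_per_min=$3
  start=${4:-0}
  backlog=$(( start + (sent_per_min - processed_per_min) * minutes ))
  # A queue cannot hold fewer than zero messages.
  if [ "$backlog" -lt 0 ]; then
    backlog=0
  fi
  echo "$backlog"
}

# One consumer at 10 s per message handles 6 messages per minute.
# Assumed producer rate: 12 messages per minute.
backlog_after 5 12 6   # prints 30
```

With these assumed rates the queue gains 6 messages every minute, which is the steady climb described above.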
The core problem this metric helps solve is understanding queue backlog and potential processing bottlenecks. When ApproximateNumberOfMessages is high and consistently rising, it signals that your consumers aren’t keeping up with the rate of incoming messages. This can lead to increased message latency, potential timeouts, and ultimately, a degraded user experience if these messages are tied to user-facing actions. Conversely, a consistently low or zero count means your consumers are processing messages as fast as they arrive.
Conceptually, SQS increments this count when a message is sent and decrements it when a consumer receives the message. It’s crucial to understand that the count drops on receipt, not on successful processing and deletion. Between receipt and deletion the message is "in flight": for the duration of its visibility timeout it doesn’t contribute to ApproximateNumberOfMessages, and it isn’t available for other consumers to pick up. If the consumer doesn’t delete the message before the timeout expires, the message becomes visible again and the count goes back up. The metric is "approximate" because SQS stores messages across many servers, so the reported count is eventually consistent rather than exact.
The key levers you control are the rate of message production and the capacity/speed of your consumers. You can scale up consumers (e.g., by adding more instances or increasing their processing throughput) to reduce the ApproximateNumberOfMessages if it’s consistently high. You can also implement strategies like batching messages for processing or adjusting the VisibilityTimeout of the queue. A longer VisibilityTimeout gives consumers more time to process a message before it becomes visible again, but it also means messages stay out of circulation for longer if a consumer crashes.
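As a rough sizing aid for the scaling lever, the sketch below estimates how many consumers are needed to keep pace with a given arrival rate. It assumes each consumer handles one message at a time with a fixed processing time; consumers_needed is a hypothetical helper, and the rates in the usage line are made-up inputs.

```shell
# Hypothetical helper: minimum number of single-threaded consumers needed to
# keep up with the arrival rate, assuming a fixed processing time per message.
# Usage: consumers_needed SENT_PER_MIN SECONDS_PER_MESSAGE
consumers_needed() {
  sent_per_min=$1
  seconds_per_message=$2
  # Work arriving each minute, measured in consumer-seconds.
  work_seconds=$(( sent_per_min * seconds_per_message ))
  # Each consumer supplies 60 seconds of work per minute; round up.
  echo $(( (work_seconds + 59) / 60 ))
}

consumers_needed 12 10   # prints 2
```

This figure only holds the backlog steady. Draining messages that have already piled up takes some headroom beyond it.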
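Here’s the part that trips people up: the names differ depending on where you look. In CloudWatch, ApproximateNumberOfMessagesVisible is the metric that reflects messages ready for processing. ApproximateNumberOfMessages is the equivalent queue attribute returned by GetQueueAttributes, and it also counts only visible messages, which is why the two names are often used interchangeably. Messages that a consumer has received and that are still within their visibility timeout are counted separately, as ApproximateNumberOfMessagesNotVisible. So, if you see ApproximateNumberOfMessagesNotVisible climbing while ApproximateNumberOfMessagesVisible stays flat, it means your consumers have picked up messages but haven’t finished processing and deleting them yet. The total number of messages the queue is holding is the sum of the visible and not-visible counts, plus any delayed messages (ApproximateNumberOfMessagesDelayed).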
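One way to keep the counts straight is to look at them side by side. The sketch below assumes the values come from the GetQueueAttributes API, where ApproximateNumberOfMessages reports visible messages, ApproximateNumberOfMessagesNotVisible reports in-flight messages, and ApproximateNumberOfMessagesDelayed reports delayed ones. The helper queue_summary is hypothetical, and the numbers in the usage line are invented for illustration.

```shell
# Hypothetical helper: combine the three approximate counts into one summary.
# Usage: queue_summary VISIBLE IN_FLIGHT [DELAYED]
queue_summary() {
  visible=$1
  in_flight=$2
  delayed=${3:-0}
  total=$(( visible + in_flight + delayed ))
  echo "visible=$visible in_flight=$in_flight delayed=$delayed total=$total"
}

# Assumed way to fetch the inputs with the AWS CLI (needs credentials; not run here):
#   aws sqs get-queue-attributes \
#     --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue \
#     --attribute-names ApproximateNumberOfMessages \
#                       ApproximateNumberOfMessagesNotVisible \
#                       ApproximateNumberOfMessagesDelayed

queue_summary 2 5 0   # prints visible=2 in_flight=5 delayed=0 total=7
```

A small visible count next to a large in-flight count points at slow processing rather than too few receive calls.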
The next thing you’ll likely want to monitor is ApproximateAgeOfOldestMessage to understand how long messages are waiting in the queue.