SQS throughput is capped not by the queue itself, but by how quickly your consumers can process messages, and how many concurrent consumers you can spin up.
Let’s watch a consumer work through a queue. Imagine a simple application that takes messages from an SQS queue, does some work, and then deletes each message.
```python
import boto3
import json
import time

sqs = boto3.client('sqs', region_name='us-east-1')
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue'

def process_message(message_body):
    # Simulate work
    data = json.loads(message_body)
    item_id = data.get('item_id')
    print(f"Processing item: {item_id}")
    time.sleep(0.5)  # Simulate I/O- or CPU-bound work
    print(f"Finished processing item: {item_id}")

def receive_and_process_messages():
    while True:
        try:
            response = sqs.receive_message(
                QueueUrl=queue_url,
                MaxNumberOfMessages=10,  # Max allowed by SQS
                WaitTimeSeconds=20,      # Long polling
                VisibilityTimeout=30     # Ensure enough time to process
            )
            if 'Messages' in response:
                for message in response['Messages']:
                    process_message(message['Body'])
                    sqs.delete_message(
                        QueueUrl=queue_url,
                        ReceiptHandle=message['ReceiptHandle']
                    )
            else:
                print("No messages received, waiting...")
        except Exception as e:
            print(f"An error occurred: {e}")
            time.sleep(5)  # Back off on error

if __name__ == "__main__":
    receive_and_process_messages()
```
This consumer receives up to 10 messages at a time, uses a 20-second `WaitTimeSeconds` for long polling to reduce empty responses, and a 30-second `VisibilityTimeout` to give it ample time to finish processing before a message becomes visible again. The `time.sleep(0.5)` simulates the actual work. If that sleep is the bottleneck and you can only run one such process, your throughput is limited to 2 messages per second (1 / 0.5 seconds).
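That arithmetic generalizes into a quick capacity model. The helper below is hypothetical, not part of the consumer code, but it makes the two levers explicit: consumer count and per-message latency.

```python
def max_throughput(consumers, seconds_per_message):
    """Upper bound on messages/second when each consumer
    works through its messages sequentially."""
    return consumers / seconds_per_message

print(max_throughput(1, 0.5))   # one consumer: 2.0 messages/second
print(max_throughput(10, 0.5))  # ten consumers: 20.0 messages/second
```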
The core problem SQS throughput optimization solves is breaking this single-consumer bottleneck. You can increase throughput by:
- **Increasing batch size:** The `MaxNumberOfMessages` parameter in `receive_message` can be set from 1 to 10. Larger batches mean fewer API calls for the same number of messages, reducing overhead. However, the visibility clock for every message in a batch starts at receipt, so if you process the batch sequentially, the last messages have the least time remaining; if processing overruns the `VisibilityTimeout`, those messages become visible again and may be processed twice unless handled idempotently.

- **Parallelizing consumers:** The most direct way to increase throughput is to run multiple instances of your consumer concurrently. If one consumer can process 2 messages/second, 10 consumers can process 20 messages/second. This is often achieved with Auto Scaling Groups (ASGs) for EC2 instances or by scaling out Lambda functions. Each `receive_message` call returns its own `ReceiptHandle` per message, so concurrent consumers don't interfere with each other's deletes.

- **Tuning `VisibilityTimeout`:** This is the duration a message is hidden from other consumers after being received. It must be at least as long as your longest expected processing time. If it's too short, messages reappear before they're finished, leading to duplicate processing. If it's excessively long, a stalled consumer holds onto messages unnecessarily, delaying retries. A common heuristic is to set it to 6x your average processing time to absorb retries and occasional spikes.

- **Optimizing `WaitTimeSeconds`:** This parameter controls long polling. Setting it between 1 and 20 seconds (the maximum) reduces the number of empty `receive_message` calls, saving on request costs and reducing CPU load on your consumers. A value of 20 seconds is usually optimal for balancing responsiveness and cost.

- **Choosing the right queue type:** Standard queues offer nearly unlimited throughput with best-effort ordering. FIFO (First-In, First-Out) queues guarantee strict ordering and exactly-once processing but are limited to 300 API calls per second per action (up to 3,000 messages per second with maximum batching). If strict ordering isn't critical, Standard is the way to go for higher throughput.

- **Batching sends and deletes:** Just as `receive_message` returns up to 10 messages, `SendMessageBatch` and `DeleteMessageBatch` accept up to 10 messages per API call. If you're sending from a single producer, batching can significantly improve send throughput. Similarly, a consumer that processes a batch and then calls `delete_message_batch` is more efficient than issuing individual `delete_message` calls.
When you’re processing messages, the `ReceiptHandle` is crucial. It’s a unique identifier for a specific receive of a message, not for the message itself. When you delete a message or change its visibility, you must use this `ReceiptHandle`. If your consumer crashes after receiving a batch but before deleting all of its messages, the `VisibilityTimeout` will expire and those messages become available for another consumer to pick up. This is why `VisibilityTimeout` is so critical.
The most misunderstood aspect of SQS throughput is the interplay between `VisibilityTimeout` and actual processing time. If your `VisibilityTimeout` is 30 seconds but processing reliably takes 45 seconds, you’ve created both a throughput bottleneck and a duplicate-processing problem: SQS makes the message visible again after 30 seconds, even while your consumer is still working on it. Another consumer can then pick it up, leading to duplicate work or conflicting state changes unless processing is idempotent. Setting `VisibilityTimeout` comfortably above your longest expected processing time for a single message is paramount.
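When processing time is genuinely unpredictable, one common pattern is a visibility "heartbeat": periodically extend the timeout while work is still in flight, using `change_message_visibility`. The sketch below assumes the `sqs` client and `queue_url` from earlier; the extend-at-half-the-window threshold in `should_extend` is a chosen heuristic, not an SQS requirement.

```python
import threading
import time

def should_extend(elapsed_seconds, visibility_timeout):
    """Heuristic: extend once more than half the window is used up."""
    return elapsed_seconds > visibility_timeout / 2

def process_with_heartbeat(sqs, queue_url, message, work_fn,
                           visibility_timeout=30, check_interval=5):
    """Run work_fn on a message, extending its visibility while it runs."""
    start = time.monotonic()
    worker = threading.Thread(target=work_fn, args=(message['Body'],))
    worker.start()
    while worker.is_alive():
        worker.join(timeout=check_interval)
        elapsed = time.monotonic() - start
        if worker.is_alive() and should_extend(elapsed, visibility_timeout):
            # Hide the message for another full window, measured from now.
            sqs.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=message['ReceiptHandle'],
                VisibilityTimeout=visibility_timeout,
            )
            start = time.monotonic()  # Track from the new window's start
```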
The next hurdle you’ll likely face is managing dead-letter queues for failed messages.