SQS producer throughput limits are surprisingly flexible: a Standard queue supports a nearly unlimited send rate without any special configuration, and even a FIFO queue can absorb 3,000 messages per second with batching, which often catches people off guard when they assume a hard, low ceiling.

Let’s watch this in action. Imagine a simple producer script, written in Python using boto3, that’s sending messages to an SQS queue.

import boto3
import time

sqs = boto3.client('sqs', region_name='us-east-1')
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-high-throughput-queue'

message_body = "This is a test message for high throughput."
num_messages = 100000
messages_sent = 0
start_time = time.time()

for i in range(num_messages):
    try:
        response = sqs.send_message(
            QueueUrl=queue_url,
            MessageBody=message_body,
            # DelaySeconds=0 # Default is 0, no delay
        )
        messages_sent += 1
        if (i + 1) % 1000 == 0:
            print(f"Sent {i + 1} messages...")
    except Exception as e:
        print(f"Error sending message {i+1}: {e}")
        break

end_time = time.time()
duration = end_time - start_time
throughput = messages_sent / duration if duration > 0 else 0

print(f"\n--- Performance Summary ---")
print(f"Total messages attempted: {num_messages}")
print(f"Total messages sent: {messages_sent}")
print(f"Total time taken: {duration:.2f} seconds")
print(f"Achieved throughput: {throughput:.2f} messages/second")

When you run this, assuming your AWS credentials are set up and the queue exists, you’ll see output like the following. Note that each send_message call blocks on a full network round trip, so a single-threaded loop is latency-bound: at a few milliseconds per request you’ll see tens to low hundreds of messages per second, not thousands:

Sent 1000 messages...
Sent 2000 messages...
...
Sent 100000 messages...

--- Performance Summary ---
Total messages attempted: 100000
Total messages sent: 100000
Total time taken: 1694.12 seconds
Achieved throughput: 59.03 messages/second

This illustrates the real bottleneck. The standard SQS queue isn’t throttling this producer; the producer is spending nearly all of its time waiting on round trips. The problem usually isn’t the queue hitting a hard limit; it’s how you’re interacting with it.

The core concept is that SQS is designed for high availability and scalability. When you send a message, it’s immediately durably stored and made available for consumers. The "throughput limit" is often misunderstood as a strict, fixed number per queue. In reality, SQS scales automatically. The primary bottleneck you’ll encounter is usually within your producer application’s ability to generate and send messages, or network latency between your producer and the SQS service endpoint.

When you send a message to SQS using SendMessage, you’re making an API call to the SQS service, which writes the message to its distributed storage. For standard queues, SQS offers "at-least-once delivery" and "best-effort ordering." This architecture allows very high throughput because individual message writes can be parallelized across a vast, distributed backend. Standard queues have no fixed per-queue message rate; if you’re just hammering send_message as fast as your application can manage, your own request loop and network round trips will become the bottleneck long before SQS does.

There are two main types of SQS queues: Standard and FIFO. Standard queues offer nearly unlimited throughput but don’t guarantee order. FIFO queues guarantee order and exactly-once processing but are limited to 300 API calls per second per action (SendMessage, ReceiveMessage, or DeleteMessage); with maximum batching of 10 messages per call, that works out to 3,000 messages per second. High-throughput FIFO mode raises these limits substantially (the exact figure varies by region). The 3,000 messages/second figure, then, is a FIFO batching ceiling, not a cap on Standard queues.
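If you do need FIFO ordering, the send call itself differs slightly. Here’s a minimal sketch; fifo_send_kwargs is a hypothetical helper of my own (not part of boto3), and the queue URL and group name in the usage comment are made up:

```python
import uuid

def fifo_send_kwargs(queue_url, body, group_id):
    """Build kwargs for sqs.send_message() against a FIFO queue.

    MessageGroupId is required on FIFO queues; ordering is guaranteed
    per message group. MessageDeduplicationId can be omitted if the
    queue was created with content-based deduplication enabled.
    """
    return {
        'QueueUrl': queue_url,
        'MessageBody': body,
        'MessageGroupId': group_id,
        'MessageDeduplicationId': str(uuid.uuid4()),
    }

# Usage (assumes an existing boto3 client and a .fifo queue):
# sqs.send_message(**fifo_send_kwargs(
#     'https://sqs.us-east-1.amazonaws.com/123456789012/my-queue.fifo',
#     'order-created', 'customer-42'))
```

Messages in different groups can be processed in parallel, so spreading traffic across many MessageGroupId values is how you approach the FIFO throughput ceiling.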

The key levers you control as a producer are:

  • Batching: Instead of sending messages one by one, you can use SendMessageBatch. This API call allows you to send up to 10 messages in a single request. This is a huge throughput booster. Each SendMessageBatch call counts as one API request, but you’re sending multiple messages.

    messages_to_send = [
        {'Id': 'msg1', 'MessageBody': 'Batch message 1'},
        {'Id': 'msg2', 'MessageBody': 'Batch message 2'}
    ]
    response = sqs.send_message_batch(
        QueueUrl=queue_url,
        Entries=messages_to_send
    )
    # A batch can partially fail; entries listed under 'Failed'
    # should be retried individually or in a follow-up batch.
    for failure in response.get('Failed', []):
        print(f"Entry {failure['Id']} failed: {failure.get('Message')}")
    

    The limit for SendMessageBatch is 10 messages per batch, with a combined payload of up to 256 KB per request; there is no separate cap on how many batch calls per second you can make to a Standard queue. Batching cuts your API request count (and cost) by up to 10x, and it lets a single producer thread move roughly 10x more messages per network round trip.

  • Producer Concurrency: Running multiple producer processes or threads that all send to the same queue. SQS scales out its internal processing, so it can handle many concurrent connections and requests from multiple producers.

  • Network and Instance Performance: Your producer’s EC2 instance (or wherever it’s running) needs sufficient CPU, memory, and network bandwidth to make those API calls rapidly. If your producer is slow, it will be the bottleneck, not SQS.

  • Region Latency: The physical distance between your producer and the SQS endpoint in AWS affects how quickly API calls can be made and responses received.

  • SQS Queue Type: As mentioned, Standard queues have higher throughput than FIFO queues. If you need ordering, you accept the lower limits.
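The concurrency lever above is easy to sketch. This is a hedged example, not an AWS API: parallel_send is a hypothetical helper, and send_one would wrap your real sqs.send_message call. boto3 clients (unlike sessions and resources) are safe to share across threads like this:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_send(send_one, bodies, workers=8):
    """Fan message bodies out across a thread pool.

    send_one is any callable taking one message body, e.g.:
        lambda body: sqs.send_message(QueueUrl=queue_url, MessageBody=body)
    Returns the number of messages sent. pool.map blocks until every
    send completes (and re-raises the first exception, if any).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(1 for _ in pool.map(send_one, bodies))
```

Because each thread overlaps its round-trip wait with the others, throughput scales roughly linearly with worker count until CPU or network bandwidth saturates.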

What most people don’t realize is how the per-action limits compose with batching. On a FIFO queue, the 300-calls-per-second quota applies per API action, so 300 send_message calls move 300 messages in a second, while 300 send_message_batch calls with 10 entries each move 3,000 messages in the same second. On a Standard queue there is no such cap to work around, but batching still matters: it determines how many messages each network round trip carries, and the round trip is what actually limits a producer.
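Putting that batching math into practice, here’s a hedged sketch of a sender built around it. The helper names (to_batches, send_batched) are my own, not part of boto3:

```python
def to_batches(bodies, batch_size=10):
    """Turn message bodies into SendMessageBatch entry lists of <= 10.
    Entry Ids only need to be unique within a single batch."""
    entries = [{'Id': str(i % batch_size), 'MessageBody': b}
               for i, b in enumerate(bodies)]
    return [entries[i:i + batch_size]
            for i in range(0, len(entries), batch_size)]

def send_batched(sqs, queue_url, bodies):
    """Send all bodies via send_message_batch; returns (sent, failed)."""
    sent, failed = 0, 0
    for batch in to_batches(bodies):
        resp = sqs.send_message_batch(QueueUrl=queue_url, Entries=batch)
        sent += len(resp.get('Successful', []))
        failed += len(resp.get('Failed', []))  # candidates for retry
    return sent, failed
```

Compared with the serial loop at the top of this article, each round trip now carries 10 messages instead of 1, so the same latency budget yields roughly 10x the throughput before you add any concurrency.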

The next common hurdle after optimizing producer throughput is ensuring your consumers can keep up with the high volume of messages SQS can deliver, which often leads to discussions about consumer scaling and efficient message processing.
