SQS FIFO queues in high throughput mode can handle thousands of write transactions per second. Since a single SendMessageBatch call carries up to 10 messages, sustaining roughly 7,000 batched calls per second translates to 70,000 messages per second.
Here’s how you can achieve that scale and what it looks like in practice:
Let’s imagine a system where an order processing service needs to send millions of order details to downstream services for fulfillment. We want to ensure these orders are processed in the exact order they were received, and we need to do it fast. SQS FIFO queues are the tool for this job, especially when configured for high throughput.
First, you need to create an SQS FIFO queue with high throughput enabled. In the console this is the "Enable high throughput FIFO" option at creation time; under the hood it is controlled by two queue attributes, DeduplicationScope and FifoThroughputLimit, which can also be changed later with SetQueueAttributes.
```shell
aws sqs create-queue \
  --queue-name my-order-processing-fifo.fifo \
  --attributes '{"FifoQueue": "true", "ContentBasedDeduplication": "false", "DeduplicationScope": "messageGroup", "FifoThroughputLimit": "perMessageGroupId"}' \
  --region us-east-1
```
When you create a FIFO queue, you’ll notice the .fifo suffix is mandatory. ContentBasedDeduplication is set to false here because we’ll be managing deduplication using our own message attributes or message body content, which is often more robust for complex business logic. DeduplicationScope set to messageGroup and FifoThroughputLimit set to perMessageGroupId are what actually enable high throughput mode: deduplication and throughput limits apply per message group rather than per queue.
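To confirm the mode took effect, you can read those two attributes back. A minimal sketch, assuming boto3 with AWS credentials configured; `is_high_throughput` and `check_queue` are illustrative helper names, not part of the SQS API:

```python
def is_high_throughput(attributes: dict) -> bool:
    """True when the attribute map describes a high-throughput FIFO queue."""
    return (
        attributes.get("DeduplicationScope") == "messageGroup"
        and attributes.get("FifoThroughputLimit") == "perMessageGroupId"
    )

def check_queue(queue_url: str) -> bool:
    import boto3  # imported here so the pure helper above loads without the SDK
    sqs = boto3.client("sqs", region_name="us-east-1")
    resp = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["DeduplicationScope", "FifoThroughputLimit"],
    )
    return is_high_throughput(resp.get("Attributes", {}))
```

Wiring the check into a deployment pipeline catches the common mistake of creating a plain FIFO queue and assuming high throughput is on by default.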
Now, how do you send messages to achieve 70,000 messages per second? It’s not about sending individual messages one by one at an insane rate. The magic happens with batching. SQS allows you to send up to 10 messages in a single SendMessageBatch API call.
Consider this Python snippet using boto3:
```python
import boto3
import json
import uuid

sqs = boto3.client('sqs', region_name='us-east-1')
queue_url = 'YOUR_QUEUE_URL'  # Replace with your actual queue URL

def send_batch_messages(num_batches, messages_per_batch=10):
    # SendMessageBatch accepts at most 10 entries per call.
    messages_per_batch = min(messages_per_batch, 10)
    for i in range(num_batches):
        entries = []
        for j in range(messages_per_batch):
            message_id = str(uuid.uuid4())
            entries.append({
                'Id': message_id,
                'MessageBody': json.dumps({'order_details': {'order_id': f'ORDER-{i}-{j}-{message_id}'}}),
                'MessageGroupId': f'GROUP-{i % 100}',  # Example: distribute across 100 groups
                'MessageDeduplicationId': message_id   # Simple deduplication for this example
            })
        response = sqs.send_message_batch(
            QueueUrl=queue_url,
            Entries=entries
        )
        print(f"Batch {i+1}/{num_batches} sent. "
              f"Success: {len(response.get('Successful', []))}, "
              f"Failed: {len(response.get('Failed', []))}")

# To reach 70,000 messages/sec with the 10-message batch limit:
# 7,000 SendMessageBatch calls/sec * 10 messages/call = 70,000 messages/sec.
# So you'd aim for roughly 7,000 calls to send_message_batch per second,
# each carrying a full batch of 10 messages.

# Example for demonstration (not actual 70k/sec):
# send_batch_messages(num_batches=1000, messages_per_batch=10)
```
The crucial components here are:
- `SendMessageBatch`: This API call is your workhorse. It allows you to send up to 10 messages in a single request.
- `MessageGroupId`: This is fundamental to FIFO. All messages with the same `MessageGroupId` are processed in strict order. To achieve high throughput, you need to distribute your messages across many `MessageGroupId`s. If you send all messages to a single `MessageGroupId`, you’ll be bottlenecked by that single group’s processing capacity, not the overall queue capacity. The example uses `f'GROUP-{i % 100}'` to distribute messages across 100 groups.
- `MessageDeduplicationId`: For FIFO, you need to ensure that duplicate messages aren’t processed. You can either let SQS handle this automatically by setting `ContentBasedDeduplication` to `true` (which uses the message body for hashing) or provide your own `MessageDeduplicationId`. Providing your own is often preferred for better control and performance, especially when batching, as shown with `message_id`.
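One way to spread load across groups without losing per-entity ordering is to derive the `MessageGroupId` from a stable hash of a business key, such as a customer ID. A sketch; `message_group_for` and `NUM_GROUPS` are illustrative names, not part of the SQS API:

```python
import hashlib

NUM_GROUPS = 100  # tune to the parallelism your consumers can sustain

def message_group_for(order_key: str, num_groups: int = NUM_GROUPS) -> str:
    """Map an ordering key (e.g. a customer id) to a stable MessageGroupId.

    A stable hash keeps every message for one key in the same group
    (preserving strict ordering within that key) while spreading distinct
    keys across num_groups groups for parallel throughput.
    """
    digest = hashlib.sha256(order_key.encode("utf-8")).hexdigest()
    return f"GROUP-{int(digest, 16) % num_groups}"
```

Hashing a business key, rather than round-robin assignment, guarantees that two orders from the same customer never race each other through different groups.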
To reach 70,000 messages per second at the 10-message batch limit, you need to sustain approximately 7,000 SendMessageBatch API calls per second, each carrying a full batch. This requires a robust producer application, potentially running on multiple instances, with optimized network connectivity to AWS.
The system then scales by increasing the number of partitions for your FIFO queue. When you enable high throughput, SQS distributes your message groups across partitions and adds partitions automatically as load grows. Each partition can independently handle roughly 300 write API calls per second, which is about 300 messages/sec unbatched or up to 300 * 10 = 3,000 messages/sec with full batches. At that rate, around 24 partitions can sustain 70,000 messages per second.
When consuming messages, note that there is no ReceiveMessageBatch API: a single ReceiveMessage call can itself retrieve up to 10 messages via the MaxNumberOfMessages parameter. To maximize throughput, your consumers should batch their receives this way.
```python
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20  # Long polling is crucial for efficiency
)
```
The WaitTimeSeconds parameter (long polling) is critical. Instead of polling every second and getting no messages, long polling keeps the connection open for up to 20 seconds, reducing the number of API calls and improving efficiency.
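Putting receive, process, and delete together, a consumer loop might look like the sketch below. It assumes boto3 with AWS credentials configured; the `handler` callback and `delete_entries` helper are illustrative, not part of the SQS API:

```python
def delete_entries(messages):
    """Build the Entries payload for delete_message_batch from received messages."""
    return [
        {"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
        for i, m in enumerate(messages)
    ]

def consume(queue_url, handler):
    import boto3  # imported here so delete_entries stays importable without the SDK
    sqs = boto3.client("sqs", region_name="us-east-1")
    while True:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # batch the receive
            WaitTimeSeconds=20,      # long polling
        )
        messages = resp.get("Messages", [])
        if not messages:
            continue
        handler(messages)  # your processing logic; an exception skips the delete
        # Delete in a batch to mirror the batched receive; messages left
        # undeleted reappear after the visibility timeout expires.
        sqs.delete_message_batch(QueueUrl=queue_url, Entries=delete_entries(messages))
```

Deleting only after the handler succeeds is what makes redelivery work: a crash mid-processing leaves the messages in the queue to be received again.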
A common pitfall is not distributing MessageGroupIds sufficiently. If all your messages are going to one or a few groups, you are not leveraging the parallel processing capabilities of SQS FIFO high throughput mode. The system is designed to partition your FIFO queue internally based on MessageGroupIds and process these partitions in parallel.
After successfully sending and receiving messages at high throughput, the next challenge you’ll encounter is managing the lifecycle of messages that might fail processing downstream, potentially leading to reprocessing loops.