SQS’s 256KB message size limit isn’t just a constraint; it’s a feature that forces you to think about data gravity.

Let’s watch SQS and S3 dance. Imagine a system processing user-uploaded images. The metadata for each image (user ID, timestamp, image dimensions, tags) is relatively small, but the image binary itself can be megabytes.

// SQS Message (before S3 offload)
{
  "messageId": "123e4567-e89b-12d3-a456-426614174000",
  "receiptHandle": "AQEB...",
  "body": "{\"userId\": \"user-123\", \"timestamp\": \"2023-10-27T10:00:00Z\", \"imageSize\": \"1920x1080\", \"tags\": [\"nature\", \"landscape\"], \"imageData\": \"iVBORw0KGgoAAAANSUhEUgAA...\"}"
}

This imageData field, if it held the actual image bytes, would blow past SQS’s limit, and the SendMessage call would be rejected. The solution is to store the large payload in S3 and put only the S3 object’s key in the SQS message.

// SQS Message (after S3 offload)
{
  "messageId": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "receiptHandle": "AQEB...",
  "body": "{\"userId\": \"user-123\", \"timestamp\": \"2023-10-27T10:00:00Z\", \"imageSize\": \"1920x1080\", \"tags\": [\"nature\", \"landscape\"], \"s3ObjectKey\": \"images/user-123/abc123xyz.jpg\"}"
}

When a consumer receives this message, it extracts s3ObjectKey, retrieves the actual image from S3, and then processes it. This pattern is often called the "SQS Large Message Pattern" (or, more generally, the claim-check pattern), and AWS ships an official implementation for Java as the Amazon SQS Extended Client Library.

The core problem this solves is decoupling message size from SQS’s inherent limitations. SQS is optimized for fast, reliable delivery of small to medium-sized messages. It’s not designed as a blob store. Trying to cram large data into SQS leads to:

  • Message Size Errors: SendMessage calls with a body over 262,144 bytes are rejected outright.
  • Increased Latency: Larger messages take longer to serialize, transmit, and deserialize.
  • Higher Costs: SQS bills each 64 KB chunk of a payload as a separate request, so a 256 KB message costs four times as much as a 64 KB one.
  • SQS Throughput Limits: Large messages also crowd out batching, since the size limit applies to an entire SendMessageBatch request.

The mental model is simple: SQS for communication, S3 for storage. The SQS message becomes a pointer to the actual data stored elsewhere.
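That pointer model can be sketched as a small dispatch helper: serialize the message, measure its UTF-8 byte size, and either inline the payload or substitute the S3 key. The names here (`SQS_MAX_BYTES`, `build_message_body`) are illustrative, not part of any AWS SDK:

```python
import base64
import json

# SQS's hard limit on the message body, in bytes.
SQS_MAX_BYTES = 262_144

def build_message_body(metadata: dict, payload: bytes, s3_key: str,
                       limit: int = SQS_MAX_BYTES) -> dict:
    """Return an SQS-ready body: inline the payload if it fits,
    otherwise replace it with a pointer to the S3 object."""
    inline = dict(metadata, payloadBase64=base64.b64encode(payload).decode("ascii"))
    if len(json.dumps(inline).encode("utf-8")) <= limit:
        return inline
    # Too big: the payload belongs in S3; the message carries only the key.
    return dict(metadata, s3ObjectKey=s3_key)

small = build_message_body({"userId": "user-123"}, b"tiny", "images/a.jpg")
large = build_message_body({"userId": "user-123"}, b"x" * 300_000, "images/a.jpg")
```

In practice you would upload to S3 before taking the pointer branch; the point here is that the size check happens on the serialized bytes, before SendMessage ever sees the payload.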

Here’s how you’d implement it:

  1. Producer Side:

    • Upload the large payload (e.g., an image file, a large JSON document, a PDF) to an S3 bucket, and note the bucket and object key (e.g., my-bucket and path/to/file.bin).
    • Construct a small SQS message containing the metadata and the S3 object key.
    • Send this small message to your SQS queue.
    import boto3
    import json
    
    s3_client = boto3.client('s3')
    sqs_client = boto3.client('sqs')
    
    # --- Producer ---
    bucket_name = 'my-large-payload-bucket'
    s3_key = 'user-uploads/image_abc123.jpg'
    image_data = b'...' # The actual large image binary data
    
    # 1. Upload to S3
    s3_client.put_object(Bucket=bucket_name, Key=s3_key, Body=image_data)
    
    # 2. Construct small SQS message
    sqs_message_body = {
        'userId': 'user-987',
        'originalFilename': 'my_photo.jpg',
        's3ObjectKey': s3_key,
        's3Bucket': bucket_name
    }
    
    # 3. Send to SQS
    queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue'
    sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(sqs_message_body)
    )
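One detail the producer code glosses over: if two uploads ever share an S3 key, the second silently overwrites the first. A common fix is to derive the key from a UUID; the prefix scheme below is an assumption for illustration:

```python
import uuid

def make_payload_key(user_id: str, filename: str) -> str:
    """Build a collision-resistant S3 key for an offloaded payload."""
    # Keep the original extension so consumers can infer the content type.
    suffix = filename.rsplit(".", 1)[-1] if "." in filename else "bin"
    return f"user-uploads/{user_id}/{uuid.uuid4()}.{suffix}"

key = make_payload_key("user-987", "my_photo.jpg")
```

A stable prefix per user also makes later cleanup (lifecycle rules, audits) easier to scope.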
    
  2. Consumer Side:

    • Receive a message from the SQS queue.
    • Parse the message body to extract the S3 object key and bucket name.
    • Use the S3 client to download the object from S3.
    • Process the downloaded data.
    • Delete the SQS message.
    import boto3
    import json
    
    s3_client = boto3.client('s3')
    sqs_client = boto3.client('sqs')
    
    # --- Consumer ---
    queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue'
    
    # 1. Receive message
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20 # Long polling
    )
    
    if 'Messages' in response:
        message = response['Messages'][0]
        message_body_dict = json.loads(message['Body'])
    
        s3_key = message_body_dict['s3ObjectKey']
        s3_bucket = message_body_dict['s3Bucket']
        receipt_handle = message['ReceiptHandle']
    
        # 2. Download from S3
        s3_object = s3_client.get_object(Bucket=s3_bucket, Key=s3_key)
        large_payload_data = s3_object['Body'].read() # This is the actual image data
    
        # 3. Process the data
        print(f"Processing data for key: {s3_key}, size: {len(large_payload_data)} bytes")
        # ... do image processing ...
    
        # 4. Delete SQS message
        sqs_client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=receipt_handle
        )
    

When you configure SQS queues, remember that the MaximumMessageSize parameter refers to the SQS message body itself, not the data it points to. The hard ceiling is 262,144 bytes (256 KB); a SendMessage call with a larger body is rejected with an invalid-parameter error. The S3 object, by contrast, can be up to 5 TB.
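A subtle point when checking sizes yourself: the limit counts bytes of the encoded body, not characters, so multi-byte UTF-8 text shrinks your budget. A quick illustration:

```python
import json

body = json.dumps({"tags": ["café", "日本"]}, ensure_ascii=False)

chars = len(body)                       # character count
size_bytes = len(body.encode("utf-8"))  # what counts against the 262,144-byte limit

# Multi-byte characters ("é" is 2 bytes, each kanji is 3) push the byte
# count above the character count.
print(chars, size_bytes)
```

Measuring `len(body)` on the string therefore under-counts; always measure the encoded bytes before deciding whether to offload.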

The common pitfall here is forgetting that the S3 objects now need their own lifecycle management. If a consumer fails and the message is redelivered, re-downloading the same S3 object is harmless. But once the SQS message is successfully processed and deleted, nothing in your pipeline references the S3 object anymore. You need a separate mechanism (such as an S3 Lifecycle Policy or a dedicated cleanup job) to remove payload objects that are no longer needed; otherwise your S3 costs will climb indefinitely.
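One low-effort cleanup option is an S3 lifecycle rule that expires payload objects after a window longer than your queue's message retention (at most 14 days), at which point no live message can still reference them. A sketch of the rule structure; the prefix and 15-day window are assumptions for illustration:

```python
# Lifecycle configuration expiring offloaded payloads once they outlive
# SQS's maximum message retention (14 days). Prefix and window are examples.
lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-offloaded-payloads",
            "Filter": {"Prefix": "user-uploads/"},
            "Status": "Enabled",
            "Expiration": {"Days": 15},
        }
    ]
}

# Applied with:
# s3_client.put_bucket_lifecycle_configuration(
#     Bucket="my-large-payload-bucket",
#     LifecycleConfiguration=lifecycle_config,
# )
```

If consumers can lag badly (e.g., a dead-letter queue holding messages for days), size the expiration window to the worst-case end-to-end delay, not the happy path.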

Once you’ve mastered offloading large payloads to S3, your next consideration will be ordered processing of these messages: standard SQS queues don’t guarantee strict ordering, so you’ll need a FIFO queue when order matters.

Want structured learning?

Take the full SQS course →