SQS temporary queues are actually stateful, not ephemeral: they are ordinary, durable queues that the application itself must create and delete. Their primary value is giving each client its own reply channel, so it never has to poll a shared queue filtering for the reply to one specific request.

Let’s see this in action. Imagine a service A that needs to ask another service B a question and get an answer back.

```python
# Service A (the client)
import json

import boto3

sqs = boto3.client('sqs')
sns = boto3.client('sns')

# Create a temporary queue for replies
response = sqs.create_queue(
    QueueName='my-app-reply-queue-12345',  # In a real app, this would be dynamic
    Attributes={
        'VisibilityTimeout': '30',  # How long a message is hidden after being received
        'MessageRetentionPeriod': '86400'  # How long messages stay in the queue
    }
)
reply_queue_url = response['QueueUrl']
reply_queue_arn = sqs.get_queue_attributes(
    QueueUrl=reply_queue_url,
    AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Publish a request to the known Service B request topic,
# embedding the reply queue ARN so Service B knows where to answer
sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:service-b-requests-topic',
    Message=json.dumps({'reply_to': reply_queue_arn, 'payload': 'what is 2+2?'}),
    MessageAttributes={
        'MessageType': {'StringValue': 'Request', 'DataType': 'String'}
    }
)

# Now, service A polls its *own* reply queue
while True:
    response = sqs.receive_message(
        QueueUrl=reply_queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=10  # Long polling to reduce cost and latency
    )
    if 'Messages' in response:
        message = response['Messages'][0]
        print(f"Received reply: {message['Body']}")
        sqs.delete_message(
            QueueUrl=reply_queue_url,
            ReceiptHandle=message['ReceiptHandle']
        )
        break
    print("No reply yet, waiting...")

# Clean up the temporary queue
sqs.delete_queue(QueueUrl=reply_queue_url)
```

This Service A code demonstrates the core pattern. It first creates a queue specifically for receiving replies. Notice the QueueName is somewhat arbitrary here, but in a real distributed system, you’d want a way to generate unique, predictable names or use a discovery mechanism. The crucial part is that Service A knows this queue is for its replies.

Then, it publishes a request to the SNS topic that feeds Service B's request queue. The request message includes the ARN of Service A's reply queue. Service B, upon processing the request, will send its answer back to the reply_to queue specified in the original request.
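On Service B's side, the handler only needs to parse the request, extract the reply queue ARN, and send the answer there. A minimal sketch of that half of the exchange (the function names and the ARN-to-URL conversion are my own illustration, not from the original):

```python
import json

def handle_request(body: str) -> tuple[str, str]:
    """Parse a request body like {"reply_to": ..., "payload": ...}
    and return (reply queue ARN, serialized reply body)."""
    request = json.loads(body)
    answer = f"You asked: {request['payload']}"  # placeholder business logic
    return request['reply_to'], json.dumps({'answer': answer})

def send_reply(reply_queue_arn: str, reply_body: str) -> None:
    """Deliver the reply to the requester's temporary queue.

    boto3 is imported lazily so handle_request stays testable offline.
    """
    import boto3

    sqs = boto3.client('sqs')
    # SQS calls take a queue URL, not an ARN, so convert first.
    # ARN format: arn:aws:sqs:<region>:<account-id>:<queue-name>
    _, _, _, _region, account_id, queue_name = reply_queue_arn.split(':')
    queue_url = sqs.get_queue_url(
        QueueName=queue_name,
        QueueOwnerAWSAccountId=account_id
    )['QueueUrl']
    sqs.send_message(QueueUrl=queue_url, MessageBody=reply_body)
```

Keeping the parsing separate from the SQS call also makes the business logic unit-testable without AWS credentials.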

The most important part is the polling loop. Service A isn’t polling the Service B request queue hoping to find its specific answer. Instead, it’s polling its own dedicated reply queue. This is the "temporary" or "ephemeral" aspect – it’s a queue created for a specific interaction and then deleted.

This pattern solves the "request-reply" problem in a decoupled, asynchronous environment. Without temporary queues, Service A would have to poll a shared request queue and filter messages for its own request ID, which is inefficient and scales poorly. Or, Service B would need to know Service A’s direct network address, which breaks the decoupling.

The system handles this via two key SQS features:

  1. CreateQueue: Allows dynamic creation of queues. You can specify QueueName and various Attributes like VisibilityTimeout (how long a message is hidden from other consumers after being received) and MessageRetentionPeriod (how long messages persist).
  2. ReceiveMessage with WaitTimeSeconds: This enables long polling. Instead of returning immediately if no messages are present, SQS keeps the connection open for up to 20 seconds (the maximum allowed value, though 10 is shown in the example for brevity) waiting for messages. This significantly reduces the number of ReceiveMessage calls, saving costs and reducing latency.

The MessageAttributes in the publish call are also vital. While the payload contains the actual data, MessageType (or any custom attribute) can be used by Service B to understand the type of message it received and how to route the reply.
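If the subscription uses raw message delivery (so SNS message attributes surface as plain SQS message attributes), Service B can read the attribute straight off a message returned by `receive_message` with `MessageAttributeNames=['All']`. A small hypothetical helper:

```python
def message_type(message: dict) -> str:
    """Read the MessageType attribute from a message dict as returned by
    sqs.receive_message(..., MessageAttributeNames=['All']).

    Assumes the SNS subscription has raw message delivery enabled;
    otherwise the attributes are nested inside the SNS JSON envelope.
    """
    attrs = message.get('MessageAttributes', {})
    return attrs.get('MessageType', {}).get('StringValue', 'Unknown')
```

Routing on an attribute avoids parsing the body just to decide which handler should run.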

The core of the "temporary" nature isn’t that the queue disappears on its own, but that the application is responsible for cleaning it up. A common pattern is to use a unique identifier for the temporary queue name (e.g., a UUID) and then explicitly call sqs.delete_queue once the reply is received and processed. If the application crashes before deleting it, the queue persists indefinitely; SQS never deletes a queue on its own. Only the messages inside it expire once MessageRetentionPeriod elapses, so orphaned reply queues have to be swept up out of band.
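That lifecycle can be packaged up so cleanup happens even when the client code raises. A sketch, assuming boto3; the uuid-based naming and the context manager are my own illustration:

```python
import uuid
from contextlib import contextmanager

def make_reply_queue_name(prefix: str = 'my-app-reply') -> str:
    """Unique, identifiable queue name (SQS allows up to 80 characters:
    letters, digits, hyphens, and underscores)."""
    return f'{prefix}-{uuid.uuid4()}'

@contextmanager
def temporary_queue(sqs, prefix: str = 'my-app-reply'):
    """Create a reply queue and delete it even if the caller raises.

    A short MessageRetentionPeriod limits how long stray replies linger
    if cleanup never runs; the queue itself still needs explicit deletion.
    """
    url = sqs.create_queue(
        QueueName=make_reply_queue_name(prefix),
        Attributes={'MessageRetentionPeriod': '3600'}
    )['QueueUrl']
    try:
        yield url
    finally:
        sqs.delete_queue(QueueUrl=url)
```

Usage would look like `with temporary_queue(sqs) as reply_queue_url:` wrapping the publish-and-poll loop from the Service A example.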

A subtle but crucial detail is how the reply queue ARN is passed. By publishing the request to an SNS topic (service-b-requests-topic in the example) and having Service B subscribe to that topic, Service B can then read the request message, extract the reply queue ARN, and send the reply directly to that SQS queue. This decouples Service B from needing to know about Service A’s existence beyond the initial request.
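Wiring Service B's request queue to that topic takes one subscription plus a queue policy allowing SNS to deliver into it. A sketch of that setup, assuming boto3 (the helper names are hypothetical):

```python
import json

def allow_sns_policy(queue_arn: str, topic_arn: str) -> str:
    """Queue policy that lets the given SNS topic send into the queue."""
    return json.dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Principal': {'Service': 'sns.amazonaws.com'},
            'Action': 'sqs:SendMessage',
            'Resource': queue_arn,
            # Restrict delivery to this one topic
            'Condition': {'ArnEquals': {'aws:SourceArn': topic_arn}},
        }],
    })

def subscribe_queue(sns, sqs, topic_arn, queue_url, queue_arn):
    """Attach the policy, then subscribe the queue to the topic."""
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={'Policy': allow_sns_policy(queue_arn, topic_arn)},
    )
    sns.subscribe(
        TopicArn=topic_arn,
        Protocol='sqs',
        Endpoint=queue_arn,
        Attributes={'RawMessageDelivery': 'true'},  # deliver the bare body
    )
```

Without the queue policy, SNS deliveries are silently rejected, which is a common source of "my subscriber never receives anything" confusion.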

The next challenge you’ll likely face is managing the lifecycle of these temporary queues effectively, especially in high-throughput scenarios, and ensuring that replies are not lost if the client application crashes before processing them.
