Redriving messages out of an SQS dead-letter queue (DLQ) is less a move than a replay: each message re-enters the target queue as a new message, which has side effects on message IDs and receive counts that often surprise people.

Let’s see it in action. Imagine you have a source-queue and a dlq-queue. You want to redrive messages from dlq-queue back to source-queue.

First, create the queues if they don’t exist.

aws sqs create-queue --queue-name source-queue --attributes DelaySeconds=0,VisibilityTimeout=30
aws sqs create-queue --queue-name dlq-queue --attributes DelaySeconds=0,VisibilityTimeout=30

Now, send a message to source-queue that will eventually end up in dlq-queue. For demonstration, we’ll simulate a failure by not deleting the message.

aws sqs send-message --queue-url $(aws sqs get-queue-url --queue-name source-queue --query 'QueueUrl' --output text) --message-body "This message will fail"

Each time a consumer receives the message without deleting it, the message becomes visible again once the visibility timeout (30 seconds in this example) expires. Once it has been received more than maxReceiveCount times, SQS moves it to the DLQ named in the source-queue’s RedrivePolicy. Nothing is moved until that policy exists, so it is configured in the next step. For the redrive discussion that follows, assume the message has already failed and is now in dlq-queue.

To set up dead-lettering, you configure the source-queue to point at the dlq-queue, not the other way around. This is done via the RedrivePolicy attribute on the source-queue, which names the DLQ’s ARN and a maxReceiveCount.

SOURCE_QUEUE_URL=$(aws sqs get-queue-url --queue-name source-queue --query 'QueueUrl' --output text)
DLQ_QUEUE_URL=$(aws sqs get-queue-url --queue-name dlq-queue --query 'QueueUrl' --output text)

# get-queue-attributes returns nothing unless --attribute-names is given.
DLQ_ARN=$(aws sqs get-queue-attributes --queue-url "$DLQ_QUEUE_URL" --attribute-names QueueArn --query 'Attributes.QueueArn' --output text)

aws sqs set-queue-attributes \
    --queue-url "$SOURCE_QUEUE_URL" \
    --attributes '{
        "RedrivePolicy": "{\"deadLetterTargetArn\": \"'"$DLQ_ARN"'\", \"maxReceiveCount\": 5}"
    }'

This command associates the dlq-queue as the dead-letter target for source-queue. Once a message in source-queue has been received 5 times without being deleted, the next receive attempt moves it to dlq-queue instead of delivering it.
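The nested quoting in that attribute document is easy to get wrong, because RedrivePolicy is a JSON document stored as a string inside another JSON object. A minimal sketch of a helper that builds it separately follows; the function name `redrive_policy_attributes` is hypothetical, and it assumes the ARN contains no double quotes or backslashes.

```shell
# Hypothetical helper: print the --attributes JSON for a RedrivePolicy.
# $1 = ARN of the dead-letter queue, $2 = maxReceiveCount (defaults to 5).
redrive_policy_attributes() {
    local dlq_arn="$1"
    local max_receives="${2:-5}"
    # The inner policy is a JSON string value, so its quotes are escaped
    # with a backslash inside the outer object.
    printf '{"RedrivePolicy": "{\\"deadLetterTargetArn\\": \\"%s\\", \\"maxReceiveCount\\": %s}"}\n' \
        "$dlq_arn" "$max_receives"
}
```

Assuming a variable such as `DLQ_ARN` holds the DLQ’s ARN, the output can be passed straight to the CLI: `aws sqs set-queue-attributes --queue-url "$SOURCE_QUEUE_URL" --attributes "$(redrive_policy_attributes "$DLQ_ARN" 5)"`.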

Now, to redrive messages from dlq-queue back to source-queue, you use the SQS console or the StartMessageMoveTask API. This is the crucial step that often causes confusion. You are not directly configuring the dlq-queue to send messages back. Instead, you initiate a task to move them.

Let’s initiate a message move task:

aws sqs start-message-move-task \
    --source-arn "$(aws sqs get-queue-attributes --queue-url "$DLQ_QUEUE_URL" --attribute-names QueueArn --query 'Attributes.QueueArn' --output text)" \
    --destination-arn "$(aws sqs get-queue-attributes --queue-url "$SOURCE_QUEUE_URL" --attribute-names QueueArn --query 'Attributes.QueueArn' --output text)"

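This command starts an asynchronous task that moves messages from dlq-queue to source-queue and returns a TaskHandle. StartMessageMoveTask is not idempotent: SQS allows only one active move task per source queue, so a second call while a task is running is rejected with an error instead of returning the existing task. You can monitor progress with ListMessageMoveTasks, which reports each task’s status and the approximate number of messages moved so far, and stop a running task with CancelMessageMoveTask.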
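Monitoring a move task can be sketched as a small polling loop around `aws sqs list-message-move-tasks`. The helper names `move_task_status` and `wait_for_move_task` are hypothetical, and the sketch assumes the most recent task for the queue is the first entry in `Results`.

```shell
# Hypothetical helper: print the status of the most recent move task for the
# source queue whose ARN is $1 (for example RUNNING, COMPLETED, or FAILED).
move_task_status() {
    aws sqs list-message-move-tasks \
        --source-arn "$1" \
        --max-results 1 \
        --query 'Results[0].Status' \
        --output text
}

# Hypothetical helper: poll every $2 seconds (default 5) while the task is
# still RUNNING or CANCELLING, then print the last status seen.
wait_for_move_task() {
    local status
    while status=$(move_task_status "$1"); do
        case "$status" in
            RUNNING|CANCELLING) sleep "${2:-5}" ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$status"
}
```

Calling `wait_for_move_task "<dlq-arn>" 10` blocks until the task leaves its transient states. If the CLI call itself fails, the loop stops and an empty line is printed, so callers should treat an empty result as an error.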

The mental model here is that the RedrivePolicy on the source queue defines where messages go when they fail. The StartMessageMoveTask API is the mechanism to manually initiate the transfer of messages from the DLQ back to another queue. This is often done for reprocessing failed messages after a bug fix or configuration change.
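When reprocessing after a bug fix, it can help to throttle the move so the consumer is not flooded. A minimal sketch using the `--max-number-of-messages-per-second` option of `start-message-move-task` follows; the wrapper name `start_redrive` and the default rate of 10 are assumptions for illustration.

```shell
# Hypothetical helper: start a throttled move task and print its TaskHandle.
# $1 = DLQ ARN, $2 = destination queue ARN, $3 = messages per second (default 10).
start_redrive() {
    aws sqs start-message-move-task \
        --source-arn "$1" \
        --destination-arn "$2" \
        --max-number-of-messages-per-second "${3:-10}" \
        --query 'TaskHandle' \
        --output text
}
```

The printed handle can be kept, for example with `handle=$(start_redrive "<dlq-arn>" "<source-arn>" 25)`, and later passed to `aws sqs cancel-message-move-task --task-handle "$handle"` if the redrive needs to be stopped.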

A redriven message arrives in the destination queue as a new message: it gets a new message ID and enqueue time, and its receive count starts over. The visibility timeout does not start at redrive. It applies each time a consumer receives the message, using the destination queue’s VisibilityTimeout (30 seconds here) unless the receive call overrides it; the DLQ’s own VisibilityTimeout plays no part. If the consumer keeps failing, the message is received again after each timeout and, once it exceeds maxReceiveCount, lands back in the DLQ. Redriving before the underlying issue is fixed therefore creates a loop.
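One way to observe how many times a redriven message has been received is to read its ApproximateReceiveCount system attribute. The sketch below is hedged: the helper name `peek_receive_count` is hypothetical, and the receive it performs itself counts toward maxReceiveCount, so it is meant for experiments and not for production queues.

```shell
# Hypothetical helper: receive one message from the queue URL in $1 and print
# its ApproximateReceiveCount. A visibility timeout of 0 makes the message
# visible again immediately after this call.
peek_receive_count() {
    aws sqs receive-message \
        --queue-url "$1" \
        --max-number-of-messages 1 \
        --visibility-timeout 0 \
        --attribute-names ApproximateReceiveCount \
        --query 'Messages[0].Attributes.ApproximateReceiveCount' \
        --output text
}
```

Running `peek_receive_count "$SOURCE_QUEUE_URL"` right after a redrive and again after a failed processing attempt shows how close the message is to being dead-lettered again.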

The one thing most people don’t realize is that a move task cannot filter or modify messages: the body and message attributes are carried over as-is, but system-managed metadata is not. The message ID, SentTimestamp, and ApproximateReceiveCount all describe the new message, so downstream consumers that deduplicate, order, or alert on those values can behave unexpectedly after a redrive.
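If you want to check for yourself whether a body survived the move unchanged, one option is to compare a locally computed digest with the MD5OfBody field that `send-message` and `receive-message` return. This is a minimal sketch; the helper name `body_md5` is hypothetical, and it assumes GNU `md5sum` is available (macOS ships `md5` instead).

```shell
# Hypothetical helper: print the MD5 hex digest of the string in $1, for
# comparison with the MD5OfBody value reported by the SQS CLI commands.
body_md5() {
    printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}
```

For the message sent earlier, `body_md5 "This message will fail"` can be compared with the MD5OfBody of the message received from source-queue after the redrive.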

The next concept to explore is handling message deduplication during redrive.
