An SQS redrive policy doesn’t move messages on command; it tells SQS where to send messages that consumers have repeatedly failed to process.
Let’s see this in action. Imagine we have a main queue, my-processing-queue, and a dead-letter queue, my-dlq. We want messages that fail processing in my-processing-queue to end up in my-dlq.
First, the source queue needs to know which DLQ to use. This is done via the RedrivePolicy attribute on the source queue, not on the DLQ.
Here’s how you’d set it up using the AWS CLI:
# RedrivePolicy must be passed as a JSON-encoded string, not as a nested object.
aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue \
  --attributes '{
    "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:my-dlq\",\"maxReceiveCount\":\"5\"}"
  }'
In this command:
deadLetterTargetArn: This is the ARN of the DLQ that will receive messages. It must be a valid SQS queue ARN.
maxReceiveCount: This is the number of times a message can be received from my-processing-queue before it’s considered "failed". Once a message’s receive count exceeds this value, SQS automatically moves it to the DLQ.
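The same configuration can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names are hypothetical, and sqs_client is assumed to be a boto3 SQS client (for example boto3.client("sqs")). The detail it illustrates is that SQS attribute values are strings, so the redrive policy is serialized to JSON before being sent.

```python
import json


def build_redrive_attributes(dlq_arn: str, max_receive_count: int) -> dict:
    """Return the Attributes mapping for a SetQueueAttributes call.

    SQS attribute values are strings, so the policy is serialized to JSON.
    """
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    return {"RedrivePolicy": json.dumps(policy)}


def attach_dlq(sqs_client, source_queue_url: str, dlq_arn: str,
               max_receive_count: int = 5) -> None:
    """Set the redrive policy on the SOURCE queue (not on the DLQ).

    sqs_client is assumed to be a boto3 SQS client.
    """
    sqs_client.set_queue_attributes(
        QueueUrl=source_queue_url,
        Attributes=build_redrive_attributes(dlq_arn, max_receive_count),
    )
```

Keeping the attribute builder separate from the API call makes the serialization easy to check without touching AWS.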
Crucially, the access policy attached to the DLQ must grant SQS permission to send messages to it. Note that this is a resource-based queue policy (the queue’s Policy attribute), not an IAM identity policy. If this is missing, messages will just sit in the source queue and eventually expire when the retention period runs out, even if the RedrivePolicy is correctly set.
Here’s an example of the necessary access policy for the DLQ:
{
  "Version": "2012-10-17",
  "Id": "arn:aws:iam::123456789012:policy/SQS-DLQ-Policy",
  "Statement": [
    {
      "Sid": "AllowSqsToSendMessageToDLQ",
      "Effect": "Allow",
      "Principal": {
        "Service": "sqs.amazonaws.com"
      },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:123456789012:my-dlq"
    }
  ]
}
Notice how the Principal is sqs.amazonaws.com, the Action is sqs:SendMessage, and the Resource is the ARN of the DLQ. This policy allows the SQS service itself to send messages to my-dlq.
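As a rough illustration, the policy can be built and attached from Python. This is a sketch under assumptions: the function names are hypothetical, and sqs_client is assumed to be a boto3 SQS client. The policy is set through the queue’s Policy attribute, which, like other SQS attributes, takes a JSON string.

```python
import json


def build_dlq_access_policy(dlq_arn: str) -> dict:
    """Build the queue access policy described above for a given DLQ ARN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSqsToSendMessageToDLQ",
                "Effect": "Allow",
                "Principal": {"Service": "sqs.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": dlq_arn,
            }
        ],
    }


def apply_dlq_access_policy(sqs_client, dlq_url: str, dlq_arn: str) -> None:
    """Attach the policy to the DLQ itself (not to the source queue).

    sqs_client is assumed to be a boto3 SQS client.
    """
    sqs_client.set_queue_attributes(
        QueueUrl=dlq_url,
        Attributes={"Policy": json.dumps(build_dlq_access_policy(dlq_arn))},
    )
```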
The RedrivePolicy is an attribute of the source queue, but the permissions live in the access policy on the DLQ. This is a common point of confusion. You’re essentially telling my-processing-queue to send failed messages to my-dlq, but my-dlq has to explicitly allow sqs.amazonaws.com to send messages to it.
When a message in my-processing-queue has been received more than maxReceiveCount times (e.g., after 5 receives without a successful delete), SQS automatically removes it from my-processing-queue and places it in my-dlq. There it becomes available to whatever consumes the DLQ, allowing for re-examination and potential manual redrive.
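The receive-count rule can be illustrated with a toy, in-memory model. This is not how SQS is implemented, and every name here (Message, ToyQueue) is hypothetical. It also ignores visibility timeouts: a message that is received but never deleted goes straight back on the queue, as if its timeout had already expired.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    body: str
    receive_count: int = 0


@dataclass
class ToyQueue:
    """In-memory stand-in for a source queue with a redrive policy."""
    max_receive_count: int
    messages: deque = field(default_factory=deque)
    dead_letters: list = field(default_factory=list)

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self) -> Optional[Message]:
        """Deliver the next message, dead-lettering any that used up their receives."""
        while self.messages:
            msg = self.messages.popleft()
            if msg.receive_count >= self.max_receive_count:
                # One more receive would exceed the limit: move it to the DLQ instead.
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            # The consumer never deletes the message, so it returns to the queue.
            self.messages.append(msg)
            return msg
        return None
```

With max_receive_count set to 5, a message that is never deleted is handed out five times; the sixth receive attempt finds it over the limit and moves it to dead_letters.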
The most surprising thing about SQS redrive is that moving a message to the DLQ is an automatic, background operation performed by SQS. You don’t issue a "redrive" command when a message fails; you configure the policy, and SQS handles the rest based on receive counts. "Redriving" from the DLQ is a separate, explicitly started operation: you move the failed messages back to the original queue (or to another queue) to reprocess them, for example with the console’s DLQ redrive feature or the StartMessageMoveTask API. Adding a redrive policy to the DLQ itself would not do this, since a redrive policy only acts on messages that exceed maxReceiveCount.
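One way to move messages back out of a DLQ is the SQS message move task API. The sketch below wraps it in a hypothetical helper, redrive_from_dlq; sqs_client is assumed to be a boto3 SQS client from a boto3 release recent enough to include start_message_move_task. When no destination is given, SQS sends messages back to the source queues they came from.

```python
from typing import Optional


def redrive_from_dlq(sqs_client, dlq_arn: str,
                     destination_arn: Optional[str] = None,
                     max_per_second: Optional[int] = None) -> str:
    """Start a message move task from the DLQ and return its task handle.

    sqs_client is assumed to be a boto3 SQS client.
    """
    params = {"SourceArn": dlq_arn}
    if destination_arn is not None:
        # Omit DestinationArn to return messages to their original source queues.
        params["DestinationArn"] = destination_arn
    if max_per_second is not None:
        # Optional throttle so the consumers are not flooded.
        params["MaxNumberOfMessagesPerSecond"] = max_per_second
    response = sqs_client.start_message_move_task(**params)
    return response["TaskHandle"]
```

The move runs asynchronously on the SQS side, so the returned handle is only an identifier for the task, not a sign that the messages have already moved.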
If you set up the RedrivePolicy correctly on the source queue but messages are still not appearing in the DLQ after exceeding maxReceiveCount, the most likely culprit is a missing or incorrect access policy on the DLQ itself.
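A quick way to check both halves of the setup is to read the attributes back and inspect them. The following is a rough sketch with hypothetical helper names; sqs_client is assumed to be a boto3 SQS client. The policy check uses exact string matching only, so it does not understand wildcards, conditions, or Deny statements.

```python
import json


def _as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allows_sqs_send(policy_json, dlq_arn: str) -> bool:
    """Return True if the policy has an Allow for sqs:SendMessage from SQS on the DLQ."""
    if not policy_json:
        return False
    policy = json.loads(policy_json)
    for stmt in _as_list(policy.get("Statement")):
        principal = stmt.get("Principal", {})
        services = _as_list(principal.get("Service")) if isinstance(principal, dict) else []
        if (stmt.get("Effect") == "Allow"
                and "sqs.amazonaws.com" in services
                and "sqs:SendMessage" in _as_list(stmt.get("Action"))
                and dlq_arn in _as_list(stmt.get("Resource"))):
            return True
    return False


def diagnose(sqs_client, source_queue_url: str, dlq_url: str, dlq_arn: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    source = sqs_client.get_queue_attributes(
        QueueUrl=source_queue_url, AttributeNames=["RedrivePolicy"]
    ).get("Attributes", {})
    redrive = json.loads(source.get("RedrivePolicy", "{}"))
    if redrive.get("deadLetterTargetArn") != dlq_arn:
        problems.append("source queue RedrivePolicy does not point at the DLQ")
    dlq = sqs_client.get_queue_attributes(
        QueueUrl=dlq_url, AttributeNames=["Policy"]
    ).get("Attributes", {})
    if not allows_sqs_send(dlq.get("Policy"), dlq_arn):
        problems.append("DLQ access policy has no matching Allow for sqs:SendMessage")
    return problems
```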