SQS queue-based autoscaling is a way to automatically adjust the number of worker instances (EC2 or ECS tasks) based on the number of messages waiting in an Amazon SQS queue.

Let’s look at an SQS queue with messages, and how we can scale out EC2 instances to process them.

Imagine a queue named my-processing-queue. When messages arrive in this queue, we want to automatically spin up more EC2 instances to process them. Conversely, when the queue is empty, we want to scale down those instances to save costs.

Here’s a simple setup:

1. SQS Queue:

  • Name: my-processing-queue
  • Visibility Timeout: 300 seconds (5 minutes). This means once a message is picked up by a worker, it’s hidden for 5 minutes. If the worker doesn’t delete it within that time (e.g., it crashes), the message becomes visible again.

2. EC2 Auto Scaling Group:

  • We’ll create an Auto Scaling group that launches EC2 instances.
  • Launch Template: This defines the EC2 instance type, AMI, security groups, and user data. User data is crucial here; it’s a script that runs when the instance first boots. This script will typically include logic to pull messages from SQS, process them, and delete them.
  • Scaling Policy: This is where we define when to scale. We’ll use a Target Tracking Scaling Policy.

3. CloudWatch Alarm:

  • The Auto Scaling Group needs a signal to know when to scale. This signal comes from CloudWatch metrics.
  • For SQS, the key metric is ApproximateNumberOfMessagesVisible. This metric tells us how many messages are waiting in the queue and are visible (i.e., not currently being processed by a worker).
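To see the signal the scaling policy will react to, you can read the backlog directly from SQS, or query the same metric as SQS reports it to CloudWatch. The queue URL, account ID, and time window below are placeholders:

```shell
# Current backlog straight from SQS (queue URL is a placeholder):
aws sqs get-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue \
    --attribute-names ApproximateNumberOfMessagesVisible

# The same signal as CloudWatch sees it (SQS publishes to the AWS/SQS
# namespace at one-minute intervals; adjust the time window as needed):
aws cloudwatch get-metric-statistics \
    --namespace AWS/SQS \
    --metric-name ApproximateNumberOfMessagesVisible \
    --dimensions Name=QueueName,Value=my-processing-queue \
    --statistics Average \
    --period 300 \
    --start-time 2024-01-01T00:00:00Z \
    --end-time 2024-01-01T01:00:00Z
```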

How it works in practice:

Let’s say you have a web application that drops tasks into my-processing-queue.

  • Initial State: The queue is empty. Your EC2 Auto Scaling group has a minimum of 1 instance running. This instance is idle, waiting for work.
  • Work Arrives: Messages start appearing in my-processing-queue. The ApproximateNumberOfMessagesVisible metric in CloudWatch begins to increase.
  • Scaling Out:
    • You configure a Target Tracking policy for your Auto Scaling group. You set a target value for ApproximateNumberOfMessagesVisible per instance. For example, you might aim for 10 messages per instance.
    • If the total ApproximateNumberOfMessagesVisible divided by the current number of instances goes above your target (e.g., 50 messages visible and 2 instances running, so 25 messages/instance), CloudWatch triggers an alarm.
    • The alarm tells the Auto Scaling group to add more instances. The group launches new EC2 instances based on your Launch Template.
    • As new instances join the group, the ApproximateNumberOfMessagesVisible per instance starts to decrease, moving back towards your target.
  • Processing: Each EC2 instance runs a worker process (defined in its user data or installed by a configuration management tool). This process continuously polls the my-processing-queue for messages. When it receives a message, it processes it and then deletes it from the queue.
  • Scaling In:
    • When the rate of incoming messages slows down, or processing catches up, the ApproximateNumberOfMessagesVisible metric drops.
    • If the ApproximateNumberOfMessagesVisible per instance falls below your target (e.g., 5 messages visible and 5 instances running, so 1 message/instance), CloudWatch triggers a different alarm.
    • This alarm tells the Auto Scaling group to remove instances. With the default termination policy, the group first picks the Availability Zone with the most instances, then terminates the instance using the oldest launch template or launch configuration, continuing until the metric is back near the target or the minimum number of instances is reached.
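The poll-process-delete loop from the Processing step can be sketched as a small boto3 worker. This is a minimal sketch: the queue URL and the handle() logic are placeholders, and a real worker would add error handling and retries.

```python
# Sketch of a worker loop: long-poll the queue, process each message,
# delete it only after success.
import json

def handle(body: str) -> dict:
    # Placeholder business logic: decode the task payload.
    return json.loads(body)

def run(queue_url: str, region: str = "us-east-1") -> None:
    import boto3  # imported here so handle() stays dependency-free
    sqs = boto3.client("sqs", region_name=region)
    while True:
        # Long polling (WaitTimeSeconds) avoids busy-looping on an empty queue.
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for msg in resp.get("Messages", []):
            handle(msg["Body"])
            # Delete only after successful processing; if the worker crashes
            # first, the message becomes visible again after the visibility
            # timeout and another instance can pick it up.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )

if __name__ == "__main__":
    run("https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue")
```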

Example Configuration Snippet (AWS CLI conceptual):

# 1. Create the SQS Queue
aws sqs create-queue \
    --queue-name my-processing-queue \
    --attributes VisibilityTimeout=300

# 2. Create a Launch Template (simplified - needs AMI, instance type, etc.)
# Note: UserData must be base64-encoded when passed via the CLI
# (base64 -w0 is GNU coreutils; on macOS use plain base64).
USER_DATA=$(base64 -w0 <<'EOF'
#!/bin/bash
apt-get update -y && apt-get install -y python3 python3-pip
pip3 install boto3
# Your worker script here - polls SQS, processes, deletes
python3 /opt/worker/process_sqs.py my-processing-queue us-east-1
EOF
)
aws ec2 create-launch-template \
    --launch-template-name MyWorkerTemplate \
    --launch-template-data "{\"UserData\": \"$USER_DATA\"}"

# 3. Create the EC2 Auto Scaling Group
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name MyWorkerASG \
    --launch-template LaunchTemplateName=MyWorkerTemplate,Version=1 \
    --min-size 1 \
    --max-size 10 \
    --desired-capacity 2 \
    --vpc-zone-identifier "subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy" \
    --tags "Key=Name,Value=SQSWorker,PropagateAtLaunch=true"

# 4. Create the Target Tracking Scaling Policy
# There is no predefined SQS metric type for EC2 Auto Scaling target tracking,
# so the policy tracks a custom "backlog per instance" metric that you publish
# yourself (the MyApp namespace and BacklogPerInstance name are placeholders).
# ScaleOutCooldown: seconds after a scale-out activity before another
# scale-out can start; ScaleInCooldown: seconds after a scale-in activity
# before another scale-in can start.
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name MyWorkerASG \
    --policy-name SQSBacklogTargetTracking \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "TargetValue": 10.0,
        "CustomizedMetricSpecification": {
            "Namespace": "MyApp",
            "MetricName": "BacklogPerInstance",
            "Statistic": "Average",
            "Unit": "Count"
        },
        "ScaleOutCooldown": 300,
        "ScaleInCooldown": 600
    }'

The custom metric is crucial. EC2 Auto Scaling's predefined metric types only cover CPU, network I/O, and ALB request count; there is no built-in SQS type, and the raw ApproximateNumberOfMessagesVisible metric doesn't account for how many instances are running. The pattern AWS's own documentation recommends is to publish a "backlog per instance" metric (visible messages divided by in-service instances) on a schedule and let the target tracking policy hold it at your target. You can read the visible-message count at any time with aws sqs get-queue-attributes --queue-url YOUR_QUEUE_URL --attribute-names ApproximateNumberOfMessagesVisible.

The most surprising true thing about this setup is that the Auto Scaling group doesn't directly "see" the SQS queue itself; it relies entirely on CloudWatch metrics that represent the state of the queue. SQS publishes ApproximateNumberOfMessagesVisible to CloudWatch, and the Auto Scaling group's target tracking policy monitors the backlog metric derived from it.
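Because the group only reacts to CloudWatch, the backlog-per-instance number has to exist as a metric before a policy can track it. A minimal publisher sketch, meant to run on a schedule (cron or a Lambda); the MyApp namespace and BacklogPerInstance metric name are illustrative assumptions, not AWS-defined names:

```python
# Computes visible messages per in-service instance and publishes it as a
# custom CloudWatch metric for the target tracking policy to follow.
def backlog_per_instance(visible_messages: int, running_instances: int) -> float:
    # Avoid division by zero when the group has scaled to its minimum.
    return visible_messages / max(running_instances, 1)

def publish(queue_url: str, asg_name: str, region: str = "us-east-1") -> None:
    import boto3  # imported here so the pure helper above has no dependencies
    sqs = boto3.client("sqs", region_name=region)
    asg = boto3.client("autoscaling", region_name=region)
    cw = boto3.client("cloudwatch", region_name=region)

    attrs = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessagesVisible"],
    )
    visible = int(attrs["Attributes"]["ApproximateNumberOfMessagesVisible"])

    groups = asg.describe_auto_scaling_groups(AutoScalingGroupNames=[asg_name])
    in_service = sum(
        1
        for inst in groups["AutoScalingGroups"][0]["Instances"]
        if inst["LifecycleState"] == "InService"
    )

    cw.put_metric_data(
        Namespace="MyApp",
        MetricData=[{
            "MetricName": "BacklogPerInstance",
            "Value": backlog_per_instance(visible, in_service),
            "Unit": "Count",
        }],
    )
```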

This means you can scale based on message backlog without your worker instances needing to know about autoscaling, and without the Auto Scaling Group needing direct IAM permissions to ReceiveMessage or DeleteMessage from SQS. The worker processes running on the EC2 instances handle the SQS interactions.

The next concept you’ll likely encounter is how to manage the worker process itself within the EC2 instances, and ensuring it can reliably poll, process, and delete messages without getting stuck or losing work.

Want structured learning?

Take the full SQS course →