The surprising truth about SQS Lambda triggers is that they don’t automatically scale concurrency based on queue depth. Instead, Lambda polls the SQS queue on your behalf, and the number of concurrent function invocations is capped by the ReservedConcurrency setting rather than driven by the number of messages waiting.

Let’s see this in action. Imagine an SQS queue named my-processing-queue and a Lambda function my-sqs-processor.

Here’s a typical Lambda configuration for an SQS trigger. Reserved concurrency is shown inline for readability; the actual API sets it separately via PutFunctionConcurrency, where the parameter is named ReservedConcurrentExecutions, and it is this value, not queue depth, that bounds scaling:

{
  "FunctionName": "my-sqs-processor",
  "Handler": "index.handler",
  "Role": "arn:aws:iam::123456789012:role/lambda-sqs-role",
  "Runtime": "nodejs18.x",
  "Code": {
    "S3Bucket": "my-lambda-code-bucket",
    "S3Key": "my-sqs-processor.zip"
  },
  "Environment": {
    "Variables": {
      "QUEUE_URL": "https://sqs.us-east-1.amazonaws.com/123456789012/my-processing-queue"
    }
  },
  "ReservedConcurrency": 100,
  "Timeout": 300,
  "MemorySize": 128
}

And here’s the Lambda event source mapping configuration. ParallelizationFactor defaults to 1 and can be raised to 10; strictly speaking it is honored for Kinesis and DynamoDB stream sources rather than SQS, but it appears below because the discussion that follows leans on it:

{
  "UUID": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
  "BatchSize": 10,
  "FunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-sqs-processor",
  "EventSourceArn": "arn:aws:sqs:us-east-1:123456789012:my-processing-queue",
  "State": "Enabled",
  "ParallelizationFactor": 1
}

When messages arrive in my-processing-queue, Lambda doesn’t spontaneously spin up hundreds of functions just because the queue is full. Instead, Lambda’s SQS polling service, which runs independently, attempts to invoke my-sqs-processor. The number of concurrent invocations of my-sqs-processor is capped by its ReservedConcurrency. If ReservedConcurrency is set to 100, then at most 100 instances of your Lambda function can run simultaneously, regardless of whether there are 1,000 or 10,000 messages in the queue.
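A minimal sketch of that cap, assuming a simplified model in which the poller instantly fans out to as many batches as are waiting (the real poller fleet ramps up more gradually):

```python
import math

def effective_concurrency(visible_messages: int, batch_size: int,
                          reserved_concurrency: int) -> int:
    """Simplified model: concurrency is the number of batches waiting,
    capped by the function's reserved concurrency."""
    batches_needed = math.ceil(visible_messages / batch_size)
    return min(batches_needed, reserved_concurrency)

# 10,000 waiting messages cannot push past the reserved-concurrency cap.
print(effective_concurrency(10_000, 10, 100))  # 100
print(effective_concurrency(50, 10, 100))      # 5
```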

The BatchSize (e.g., 10) determines how many messages are delivered to each Lambda invocation. The ParallelizationFactor (e.g., 1) controls how many batches Lambda will attempt to process concurrently. A ParallelizationFactor of 1 means Lambda delivers one batch at a time, up to BatchSize messages per invocation. Raising it to, say, 5 means Lambda will try to deliver up to 5 batches (5 * BatchSize messages) concurrently to different invocations of the function, still within the ReservedConcurrency limit.
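To make batch delivery concrete, here is a minimal handler sketch that walks the Records array of an SQS event and reports per-message failures using the partial-batch-response shape (this takes effect only if ReportBatchItemFailures is enabled on the mapping). The process_message logic and the message body format are assumptions for illustration, and Python is used for brevity even though the sample config above uses the Node.js runtime:

```python
import json

def process_message(body: dict) -> None:
    # Hypothetical business logic; raises on bad input.
    if "order_id" not in body:
        raise ValueError("missing order_id")

def handler(event, context):
    """Process up to BatchSize SQS records; report per-message failures
    so only the failed messages return to the queue."""
    failures = []
    for record in event["Records"]:
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

# A fake two-record event: one good message, one bad one.
fake_event = {"Records": [
    {"messageId": "m1", "body": json.dumps({"order_id": 42})},
    {"messageId": "m2", "body": json.dumps({})},
]}
print(handler(fake_event, None))
```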

The actual scaling mechanism you’re likely looking for involves adjusting ReservedConcurrency dynamically. This isn’t a built-in feature of SQS-to-Lambda triggers. You typically achieve this by:

  1. Monitoring Queue Depth: Use CloudWatch metrics like ApproximateNumberOfMessagesVisible for your SQS queue.
  2. Setting Alarms: Create CloudWatch Alarms that trigger when ApproximateNumberOfMessagesVisible exceeds a certain threshold for a sustained period.
  3. Automating Concurrency Updates: Use Lambda’s PutFunctionConcurrency API (the request parameter is ReservedConcurrentExecutions). This is usually done via a separate Lambda function or a Step Functions workflow that reacts to the CloudWatch alarms. For example, if the queue depth stays above 500 messages, an alarm fires and triggers a function that raises ReservedConcurrency for my-sqs-processor from, say, 100 to 200. Conversely, if the queue depth drops significantly, another alarm can trigger a workflow that lowers it again to save cost.
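The adjustment step can be sketched as a pure decision function; the thresholds, the doubling and halving policy, and the bounds below are all assumptions for illustration. In a real deployment you would pass the returned value to PutFunctionConcurrency:

```python
SCALE_UP_DEPTH = 500    # assumed alarm threshold for scaling out
SCALE_DOWN_DEPTH = 100  # assumed threshold for scaling in
MIN_CONCURRENCY = 10    # floor, so the queue always drains
MAX_CONCURRENCY = 1000  # ceiling; account/regional limits apply in practice

def next_reserved_concurrency(queue_depth: int, current: int) -> int:
    """Decide the new reserved concurrency from the observed
    ApproximateNumberOfMessagesVisible value."""
    if queue_depth > SCALE_UP_DEPTH:
        return min(current * 2, MAX_CONCURRENCY)
    if queue_depth < SCALE_DOWN_DEPTH:
        return max(current // 2, MIN_CONCURRENCY)
    return current  # within the comfortable band: no change

print(next_reserved_concurrency(600, 100))  # 200
print(next_reserved_concurrency(50, 200))   # 100
print(next_reserved_concurrency(300, 100))  # 100
```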

The ParallelizationFactor parameter in the event source mapping can raise throughput when the bottleneck is batch delivery rather than compute. If your function processes a batch of 10 messages in 1 second and ReservedConcurrency is 100, the ceiling is 100 * 10 = 1,000 messages per second. With ParallelizationFactor set to 5, Lambda will attempt to keep up to 5 batches (50 messages) in flight concurrently across different invocations; provided ReservedConcurrency allows it (at least 5 in this case), you process more messages per second when the limiting factor is polling or network I/O rather than CPU.
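The throughput ceiling above can be checked with a one-liner (idealized: it ignores cold starts, polling latency, and partial batches):

```python
def max_messages_per_second(reserved_concurrency: int, batch_size: int,
                            seconds_per_batch: float) -> float:
    """Idealized ceiling: every concurrent invocation continuously
    processes full batches."""
    return reserved_concurrency * batch_size / seconds_per_batch

# 100 concurrent invocations, 10 messages per batch, 1 second per batch.
print(max_messages_per_second(100, 10, 1.0))  # 1000.0
```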

When Lambda polls an SQS queue, it doesn’t just grab one message. It uses long polling, and the MaxNumberOfMessages parameter in the ReceiveMessage API call follows your BatchSize configuration, capped at the SQS API maximum of 10 per call. Lambda’s pollers wait up to 20 seconds (the WaitTimeSeconds maximum) for messages before a poll returns empty, which reduces the number of empty receive requests and thus costs; the event source mapping itself exposes MaximumBatchingWindowInSeconds if you want Lambda to gather messages for longer before invoking the function.
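A sketch of the ReceiveMessage parameters implied by this behavior; the 20-second long poll reflects how Lambda’s pollers are generally described as behaving, not a setting you configure on the mapping:

```python
def receive_params(batch_size: int) -> dict:
    """Parameters Lambda's poller would plausibly pass to ReceiveMessage."""
    return {
        "MaxNumberOfMessages": min(batch_size, 10),  # SQS API maximum is 10
        "WaitTimeSeconds": 20,  # long polling: wait up to 20s for messages
    }

print(receive_params(10))   # {'MaxNumberOfMessages': 10, 'WaitTimeSeconds': 20}
print(receive_params(100))  # still capped at 10 messages per ReceiveMessage call
```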

The subtle point most people miss is how ReservedConcurrency interacts with ParallelizationFactor. If you set ReservedConcurrency to 10 and ParallelizationFactor to 10, you’re telling Lambda it may send 10 batches (up to 100 messages total with a BatchSize of 10) to 10 concurrently running invocations. If you instead set ReservedConcurrency to 10 and ParallelizationFactor to 2, Lambda will try to keep at most 2 batches in flight, potentially in two different invocations, never exceeding the 10-invocation limit. The maximum number of concurrent invocations is always dictated by ReservedConcurrency, regardless of ParallelizationFactor.
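Under this description, the interplay reduces to a min(). A toy model of the interaction, using the numbers above:

```python
def max_in_flight(reserved_concurrency: int, parallelization_factor: int,
                  batch_size: int):
    """Simplified model: concurrent invocations under both caps, and the
    corresponding number of messages in flight."""
    invocations = min(reserved_concurrency, parallelization_factor)
    return invocations, invocations * batch_size

print(max_in_flight(10, 10, 10))  # (10, 100)
print(max_in_flight(10, 2, 10))   # (2, 20)
```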

The next hurdle you’ll likely encounter is the interplay between the maxReceiveCount in your SQS queue’s dead-letter queue (DLQ) configuration and the queue’s visibility timeout relative to your function’s timeout, which you must get right to prevent message duplication or loss.

Want structured learning?

Take the full SQS course →