Rate limiting is a fundamental security and stability mechanism. The key insight is that it is less about blocking requests than about managing the rate at which a service processes them, so that the service never collapses under load.
Let’s see Traefik’s rate limiting in action. Imagine a simple API endpoint we want to protect. Here’s a Traefik configuration that limits requests to 10 per second per IP address:
```yaml
http:
  routers:
    my-api-router:
      rule: "Host(`api.example.com`) && PathPrefix(`/v1/users`)"
      service: my-api-service
      middlewares:
        - rate-limit-middleware

  services:
    my-api-service:
      loadBalancer:
        servers:
          - url: "http://10.0.0.10:8080"

  middlewares:
    rate-limit-middleware:
      rateLimit:
        average: 10  # average requests per second
        burst: 20    # maximum burst size
        period: 1s   # time window for the average (1s is the default)
        sourceCriterion:
          ipStrategy: {}  # rate limit per client IP (the default behavior)
```
When a client sends requests to api.example.com/v1/users, Traefik intercepts them. Each client IP gets a sustained allowance of 10 requests per second, with headroom for bursts of up to 20. For example, if a client makes 5 requests at time t, 10 more at t+0.5s, and another 10 at t+0.9s, all 25 can be allowed: the burst capacity absorbs the spike while the allowance replenishes between arrivals. But a client that keeps sending more than 10 requests per second will soon exhaust that headroom, and the excess requests are denied with a 429 Too Many Requests status code.
The core problem Traefik’s rate limiting middleware solves is preventing a single client, or a small group of clients, from overwhelming a backend service with an excessive volume of requests. This is crucial for maintaining service availability, preventing denial-of-service (DoS) attacks, and ensuring fair usage among all users. It acts as a protective shield, absorbing traffic spikes and only letting through requests at a rate the service can reliably handle.
Internally, Traefik uses a token bucket algorithm for rate limiting. Each client (identified by the sourceCriterion, defaulting to IP address) is associated with a "bucket." This bucket has a certain capacity (burst) and refills with "tokens" at a steady rate (average per period). When a request arrives, Traefik attempts to take a token from the bucket. If a token is available, the request is allowed to pass. If the bucket is empty, the request is rejected. The burst value determines how many requests can be processed in quick succession when the bucket is full, providing a buffer for legitimate traffic spikes. The average and period define the sustained rate limit.
The sourceCriterion is flexible. Besides the default client IP (ipStrategy), you can rate limit based on a request header (requestHeaderName) or on the request host (requestHost). This allows for finer-grained control, such as rate limiting by an API key passed in a header rather than by the client's IP, which might be shared by many users behind a NAT.
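A sketch of header-based limiting might look like this (the middleware name and the X-Api-Key header are illustrative; requestHeaderName takes a single header name):

```yaml
http:
  middlewares:
    per-key-rate-limit:
      rateLimit:
        average: 5
        burst: 10
        sourceCriterion:
          requestHeaderName: X-Api-Key  # one bucket per distinct key value
```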
The period setting doesn’t just define the window for the average calculation; it also influences how quickly the burst capacity is replenished. A shorter period with a high average and burst will allow for more aggressive traffic bursts that recover quickly, while a longer period will smooth out traffic over a more extended duration.
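For instance, a sketch of a longer-period limit (middleware name illustrative): the effective sustained rate is average divided by period, so 100 per 1m is roughly 1.67 requests per second, with the burst absorbing short spikes above that:

```yaml
http:
  middlewares:
    smooth-limit:
      rateLimit:
        average: 100
        period: 1m  # sustained rate: 100 requests per minute (~1.67/s)
        burst: 50   # up to 50 requests may pass back-to-back
```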
What’s often overlooked is the interaction between multiple rate limiters. If you have a global rate limiter and then a specific rate limiter on a particular API path, Traefik evaluates them sequentially. A request must satisfy all applicable rate limiters. This means you can build layered defenses: a broad limit for the entire application, and then tighter limits for critical or resource-intensive endpoints.
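A layered setup might be sketched like this (router, middleware names, and limits are illustrative); the middlewares are applied in the order listed, and a request must pass both:

```yaml
http:
  routers:
    reports-router:
      rule: "Host(`api.example.com`) && PathPrefix(`/v1/reports`)"
      service: my-api-service
      middlewares:
        - global-rate-limit  # broad limit, evaluated first
        - report-rate-limit  # tighter limit for this expensive path

  middlewares:
    global-rate-limit:
      rateLimit:
        average: 100
        burst: 200
    report-rate-limit:
      rateLimit:
        average: 2
        burst: 5
```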
The next logical step after mastering rate limiting is to consider how to differentiate traffic and apply different policies, which is where Traefik's headers middleware, with its customRequestHeaders and customResponseHeaders options, comes into play for advanced traffic shaping and observability.