Traefik’s weighted round robin is a powerful mechanism to distribute traffic across multiple instances of a service, but it’s often misunderstood as a simple percentage split.

Let’s see it in action. Imagine we have a service called my-app running on two different versions, v1 and v2. We want to send 80% of the traffic to v1 and 20% to v2.

Here’s how you’d configure that with Traefik’s Kubernetes CRDs. Note that weights don’t belong on a ServersTransport, which only configures the connection between Traefik and your backends (TLS, timeouts, and so on). The weighted split is expressed with a TraefikService:

apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: my-app-wrr
spec:
  weighted:
    services:
      - name: my-app-v1 # Kubernetes Service in front of the v1 pods
        port: 80
        weight: 8
      - name: my-app-v2 # Kubernetes Service in front of the v2 pods
        port: 80
        weight: 2

And then in your IngressRoute:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: my-app-ingress
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`my-app.example.com`)
      kind: Rule
      services:
        - name: my-app-wrr
          kind: TraefikService

When a request comes in for my-app.example.com, Traefik resolves the my-app-wrr service. The weight values (8 and 2) are not direct percentages. Instead, Traefik treats them as relative proportions. The total weight is 8 + 2 = 10, so v1 (with weight 8) receives 8/10 = 80% of the requests, and v2 (with weight 2) receives 2/10 = 20%.
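The normalization is simple arithmetic; as a quick sanity check (plain Python, names illustrative):

```python
# Convert Traefik-style relative weights into traffic shares.
weights = {"v1": 8, "v2": 2}
total = sum(weights.values())
shares = {name: w / total for name, w in weights.items()}
print(shares)  # {'v1': 0.8, 'v2': 0.2}
```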

This mechanism is incredibly useful for canary deployments. You can gradually shift traffic to a new version of your application while monitoring for errors or performance regressions. If v2 starts showing issues, you can simply adjust the weights back to favor v1 or even set v2’s weight to 0 to stop sending it traffic entirely.
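Rolling back is then just an edit to the weights. Sketched as the weighted section of a TraefikService (a fragment only, not a complete manifest; names illustrative):

```yaml
  weighted:
    services:
      - name: my-app-v1
        port: 80
        weight: 10  # all traffic back on the stable version
      - name: my-app-v2
        port: 80
        weight: 0   # canary receives no traffic
```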

A useful mental model is a pool of "slots" equal to the sum of all weights. For our example, there are 10 slots: 8 assigned to v1 and 2 to v2. In practice, though, Traefik’s weighted round robin is deterministic rather than random: it interleaves requests across the backends in proportion to their weights, in the spirit of the smooth weighted round-robin algorithm popularized by nginx. Over any sufficiently long run of requests, the distribution matches the configured ratio.
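To make the interleaving concrete, here is a minimal smooth weighted round robin in plain Python. This is an illustrative sketch of the algorithm family, not Traefik’s actual implementation:

```python
def smooth_wrr(servers, n):
    """Yield n backend picks from {name: weight} using smooth weighted round robin."""
    current = {name: 0 for name in servers}
    total = sum(servers.values())
    picks = []
    for _ in range(n):
        # Each backend's score grows by its weight; the leader is picked
        # and penalized by the total, which spreads picks out evenly.
        for name, weight in servers.items():
            current[name] += weight
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks

picks = smooth_wrr({"v1": 8, "v2": 2}, 10)
print(picks.count("v1"), picks.count("v2"))  # 8 2
```

Note that every window of 10 consecutive picks contains exactly 8 for v1 and 2 for v2, with the v2 picks spaced out rather than bunched together.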

Crucially, each entry in the weighted list is itself a service, typically a Kubernetes Service backed by one or more pods. Traefik balances evenly across a Service’s pods as usual, so the weights control the split between logical versions of your application, not individual pods. If you want finer-grained ratios, split the pods into separate Services and weight each one. For instance, with two deployments for v1 and one for v2, and an 80/20 split between v1 and v2, you could assign weights of v1-a: 4, v1-b: 4, and v2: 2.
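That split could be sketched as the weighted section of a TraefikService over three separate Kubernetes Services (a fragment only; names hypothetical):

```yaml
  weighted:
    services:
      - name: my-app-v1-a  # first v1 deployment
        port: 80
        weight: 4
      - name: my-app-v1-b  # second v1 deployment
        port: 80
        weight: 4
      - name: my-app-v2
        port: 80
        weight: 2          # 4 + 4 + 2 = 10, so v1 still gets 80% overall
```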

What many people miss is that Traefik doesn’t guarantee the exact ratio over every window of traffic. Each Traefik instance balances only the requests it actually sees, so with multiple Traefik replicas, long-lived keep-alive connections, or very low traffic, the observed split can drift from 80/20 over short periods. The distribution converges on the configured ratio as the total number of requests grows.

The next step is often to explore how Traefik handles health checks in conjunction with weighted routing, ensuring that unhealthy instances don’t receive traffic regardless of their assigned weight.

Want structured learning?

Take the full Traefik course →