Canary deployments in Traefik are fundamentally about controlling how much traffic each version of your service receives, based on weights you define.
Let’s see Traefik in action. Imagine you have two versions of your myapp service: myapp-v1 and myapp-v2. You want to send 90% of traffic to v1 (your stable version) and 10% to v2 (your new, canary version).
Here’s how you’d configure this using Traefik’s dynamic configuration, typically via a Kubernetes Custom Resource Definition (CRD) like IngressRoute:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: myapp-ingress
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`myapp.example.com`)
      kind: Rule
      services:
        - name: myapp-v1
          port: 80
          weight: 90 # 90% of traffic goes to myapp-v1
        - name: myapp-v2
          port: 80
          weight: 10 # 10% of traffic goes to myapp-v2
This IngressRoute tells Traefik: "When a request comes in for myapp.example.com on the websecure entrypoint, split the traffic between myapp-v1 and myapp-v2 according to the specified weights." Traefik will then dynamically update its routing table to reflect these weights, sending approximately 90 out of every 100 requests to myapp-v1 and 10 to myapp-v2.
The problem this solves is reducing the risk of deploying new service versions. Instead of a full cutover where all users are immediately exposed to a potentially buggy new version, you can gradually roll out the new version to a small subset of users. If issues arise, you can quickly roll back by adjusting the weights (e.g., setting v2 weight to 0) before significant impact.
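As a sketch of that rollback, only the weights in the services list of the same IngressRoute need to change; the rest of the manifest stays as shown above:

```yaml
      services:
        - name: myapp-v1
          port: 80
          weight: 100 # all traffic back to the stable version
        - name: myapp-v2
          port: 80
          weight: 0   # canary receives no traffic
```

Because Traefik watches its dynamic configuration, applying this change takes effect without restarting anything.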
Internally, Traefik uses a weighted round-robin scheduler for this. A common way to implement it (the "smooth" weighted round-robin popularized by Nginx) is to give each service a running credit: on every request, each service's credit grows by its weight, the service with the highest credit is chosen, and the chosen service's credit is then reduced by the sum of all weights. This interleaves the services smoothly rather than sending long runs to one backend, and guarantees that over time the traffic distribution matches the defined weights.
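A minimal sketch of that smooth weighted round-robin idea in Python (this is an illustration of the general algorithm, not Traefik's actual source code):

```python
def smooth_wrr(services, n):
    """Simulate n picks of smooth weighted round-robin.

    services: dict mapping service name -> integer weight.
    Returns the list of chosen service names, in order.
    """
    total = sum(services.values())
    credit = {name: 0 for name in services}
    picks = []
    for _ in range(n):
        # Every service accumulates credit proportional to its weight.
        for name, weight in services.items():
            credit[name] += weight
        # The service with the most credit wins this request...
        chosen = max(credit, key=credit.get)
        # ...and pays back the total weight, so others catch up.
        credit[chosen] -= total
        picks.append(chosen)
    return picks

picks = smooth_wrr({"myapp-v1": 9, "myapp-v2": 1}, 100)
print(picks.count("myapp-v1"), picks.count("myapp-v2"))  # → 90 10
```

Note how the canary's picks are spread evenly through the sequence (roughly one in every ten requests) rather than clustered, which is exactly the behavior you want when sampling real traffic.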
The key levers you control are the weight values for each service in your routing configuration. These are simple integers. A weight of 0 effectively disables traffic to that service. You can also have more than two services in a weighted route, distributing traffic across multiple versions or even different backend implementations.
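For instance, a three-way split (with a hypothetical second canary, myapp-v3) might look like this in the services list:

```yaml
      services:
        - name: myapp-v1
          port: 80
          weight: 80
        - name: myapp-v2
          port: 80
          weight: 15
        - name: myapp-v3 # hypothetical second canary
          port: 80
          weight: 5
```

The weights are relative, so they do not need to sum to 100; Traefik distributes traffic in proportion to each weight's share of the total.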
A common misconception is that weights are strictly enforced per-request. In reality, Traefik’s weighted distribution is an aggregate behavior. Over short windows, concurrent in-flight requests and, if you run several Traefik replicas, the fact that each replica keeps its own scheduling state can produce small deviations from the exact ratio. You might see 12 requests reach v2 in a burst of 100, but over thousands of requests the observed split converges to the defined weights.
The next step after mastering weighted traffic splitting is often implementing advanced routing rules for your canary, like routing specific user agents or request headers to the canary version.
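As a sketch of that next step, a second, more specific route can send opted-in requests straight to the canary while everyone else stays on the weighted (or stable) path. The `X-Canary` header name here is an assumption for illustration; any header you control works:

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: myapp-ingress
spec:
  entryPoints:
    - websecure
  routes:
    # Requests carrying the opt-in header always hit the canary.
    - match: Host(`myapp.example.com`) && Headers(`X-Canary`, `true`)
      kind: Rule
      services:
        - name: myapp-v2
          port: 80
    # Everyone else stays on the stable version.
    - match: Host(`myapp.example.com`)
      kind: Rule
      services:
        - name: myapp-v1
          port: 80
```

By default Traefik prioritizes routes by rule length, so the longer, more specific header rule is evaluated before the plain Host rule.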