Tekton’s Affinity Assistant is, in effect, a quiet scheduling advisor: it nudges tasks to share nodes, not because it’s "nice," but because it aggressively minimizes the blast radius of failure.
Let’s see it in action. Imagine you have a critical CI/CD pipeline with a sequence of tasks: `build`, `test`, and `deploy`. Without Affinity Assistant, Kubernetes might schedule `build` on node `n1`, `test` on `n2`, and `deploy` on `n3`. If `n2` fails, your entire test stage is gone.
With Affinity Assistant enabled, Tekton tries to keep related tasks together: if `build` lands on `n1`, it will strongly prefer to schedule `test` on `n1` as well.
Here’s a simplified Pipeline resource demonstrating its use:
```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-test-deploy
spec:
  tasks:
    - name: build
      taskSpec:
        steps:
          - name: echo-build
            image: ubuntu
            script: |
              echo "Building..."
              sleep 60
    - name: test
      taskRef:
        name: test # Assuming a separate Task for testing
      runAfter:
        - build
      # This is where Affinity Assistant comes in:
      affineTo:
        # Tells Tekton to try and schedule this task on the same node
        # as the task named 'build'
        task: build
    - name: deploy
      taskRef:
        name: deploy # Assuming a separate Task for deployment
      runAfter:
        - test
      affineTo:
        # Tells Tekton to try and schedule this task on the same node
        # as the task named 'test'
        task: test
```
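To actually execute the pipeline above, you would create a PipelineRun referencing it. A minimal sketch (the run name is arbitrary):

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: build-test-deploy-run
spec:
  pipelineRef:
    # References the Pipeline defined above by name
    name: build-test-deploy
```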
When this pipeline runs, Tekton’s reconcilers (the PipelineRun and TaskRun controllers) read the `affineTo` field. If the TaskRun for `build` is scheduled on node `worker-01`, the TaskRun for `test` will be created with a `nodeSelector` or affinity rule that strongly prefers `worker-01`.
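For the simpler `nodeSelector` variant, the injected pod-template fragment might look like this (a sketch; the node name is whatever `build` actually landed on):

```yaml
spec:
  template:
    spec:
      # Pin this pod to the node where the 'build' TaskRun ran.
      # kubernetes.io/hostname is a standard node label set by the kubelet.
      nodeSelector:
        kubernetes.io/hostname: worker-01
```

Unlike an affinity rule, a `nodeSelector` is a blunt instrument: it names one node explicitly rather than describing "wherever the `build` pod is."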
This solves two problems at once: resilience and efficiency. By co-locating dependent tasks, you achieve:
- **Reduced Data Transfer:** If your `build` task produces an artifact and `test` needs to consume it, keeping them on the same node means the artifact might just stay on local disk or in a local cache, rather than being transferred across the network to a different node. This is especially crucial for large artifacts.
- **Improved Cache Hit Rates:** If you’re using shared caching mechanisms (like build caches or dependency caches), co-locating tasks increases the likelihood that the cache is already warm on the node where the next task is scheduled.
- **Mitigated Blast Radius:** This is the primary driver for Affinity Assistant. If a node fails, all tasks scheduled on it are affected. By keeping related tasks together, you isolate the failure. If `build` and `test` are on `n1`, and `n1` goes down, both `build` and `test` fail. This is often preferable to `build` succeeding on `n1`, `test` failing on `n2` (due to `n2`’s failure), and then needing to re-run `build` anyway because the `test` stage is incomplete. The system signals a clear point of failure.
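To make the cache point concrete, here is a hypothetical task snippet that mounts a node-local directory as a dependency cache via a `hostPath` volume (the path and names are illustrative, not part of the pipeline above). The cache is only warm for `test` if it lands on the same node as `build`:

```yaml
# Hypothetical pipeline-task snippet: a node-local dependency cache.
# Only useful if dependent tasks are co-located on the same node.
- name: test
  taskSpec:
    steps:
      - name: run-tests
        image: ubuntu
        script: |
          ls /cache || true   # warm only if 'build' ran on this node
        volumeMounts:
          - name: local-cache
            mountPath: /cache
    volumes:
      - name: local-cache
        hostPath:
          path: /var/cache/ci   # assumed node-local cache directory
          type: DirectoryOrCreate
```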
The underlying mechanism involves Tekton’s reconciler components. When a PipelineRun is created, the controller creates TaskRun objects. For TaskRuns with `affineTo`, the controller inspects the TaskRun of the referenced task (if it’s already running or completed) and injects Kubernetes affinity rules or `nodeSelector` fields into the TaskRun’s pod template. This is not a Tekton-level scheduler deciding where pods go; it’s Tekton influencing the Kubernetes scheduler by providing stronger hints.
The `affineTo` field is a powerful directive, but the Kubernetes scheduler still makes the final placement decision based on node availability, resource constraints, and other existing pod affinities. The assistant adds rules to the TaskRun’s pod spec, typically a `requiredDuringSchedulingIgnoredDuringExecution` pod-affinity rule. Note the trade-off: a `required` rule is a hard constraint, so a pod that cannot be co-located stays `Pending` rather than landing on another node. For example, if `build` is on `worker-01`, the `test` TaskRun might get a pod spec like this (simplified):
```yaml
spec:
  template:
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  tekton.dev/pipeline: build-test-deploy # Label identifying the pipeline
                  tekton.dev/task: build                 # Label identifying the affine task
              topologyKey: kubernetes.io/hostname
```
This tells Kubernetes: "I require this pod to be scheduled on a node that already has a pod running for the `build` task within the `build-test-deploy` pipeline."
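If a hard requirement is too strict, the same co-location intent can be expressed as a soft preference with Kubernetes’ `preferredDuringSchedulingIgnoredDuringExecution`, which lets the pod fall back to another node when co-location is impossible (a sketch reusing the labels from the example above):

```yaml
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100   # highest weight: strongest soft preference
        podAffinityTerm:
          labelSelector:
            matchLabels:
              tekton.dev/pipeline: build-test-deploy
              tekton.dev/task: build
          topologyKey: kubernetes.io/hostname
```

The choice between `required` and `preferred` is a choice between guaranteed co-location (at the cost of possibly waiting in `Pending`) and guaranteed scheduling (at the cost of possibly losing the co-location benefits).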
Most people don’t realize that `affineTo` respects `runAfter` implicitly. An affinity rule referencing a task A can only be resolved once A’s TaskRun has actually been scheduled onto a node; if A hasn’t started yet, the dependent task might be scheduled elsewhere. Declaring `runAfter: [A]` alongside `affineTo: task: A` (as the `test` and `deploy` tasks above do) guarantees the ordering, so by the time the dependent task is scheduled, A’s node is known and the affinity rule kicks in.
The next logical step after mastering task co-location is understanding how to manage resource contention when multiple critical tasks must share a node.