Tekton’s Prometheus metrics aren’t just counters; they’re a live, granular view into the heart of your CI/CD, revealing bottlenecks and failures before they snowball.

Let’s see it in action. Imagine you’ve got a simple pipeline that clones a repo, builds a Docker image, and pushes it. Here’s how you’d expose Tekton’s metrics and then query them with Prometheus:

First, make sure Tekton Pipelines itself is installed. The pipelines controller exposes a Prometheus metrics endpoint out of the box, so no separate exporter component is required. If you haven’t installed it yet:

kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

The controller’s metrics are served by the tekton-pipelines-controller Service in the tekton-pipelines namespace, by default on port 9090 at the /metrics path. You’ll need to port-forward to reach it locally if you haven’t set up an Ingress:

kubectl port-forward -n tekton-pipelines service/tekton-pipelines-controller 9090:9090

Now, Prometheus needs to scrape these metrics. If you’re running Prometheus in Kubernetes with the Prometheus Operator, this is typically done with a ServiceMonitor; for a local setup, add this to your prometheus.yml:

scrape_configs:
  - job_name: 'tekton'
    static_configs:
      # If port-forwarding locally:
      - targets: ['localhost:9090']
      # Or, if Prometheus is in-cluster and can reach the controller Service:
      # - targets: ['tekton-pipelines-controller.tekton-pipelines.svc.cluster.local:9090']

Restart Prometheus, and you should start seeing Tekton metrics.
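
If your Prometheus runs in-cluster under the Prometheus Operator, a ServiceMonitor is the idiomatic alternative to a static scrape config. Here is a minimal sketch; the selector label (app.kubernetes.io/name: controller) and port name (http-metrics) are assumptions about a default install, so verify them against your actual controller Service (kubectl get svc -n tekton-pipelines --show-labels) before applying:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
spec:
  namespaceSelector:
    matchNames:
      - tekton-pipelines
  selector:
    matchLabels:
      # Assumed label on the controller Service; check with --show-labels.
      app.kubernetes.io/name: controller
  endpoints:
    - port: http-metrics  # assumed name of the Service's 9090 metrics port
      interval: 30s
```

With this in place, Prometheus discovers the controller endpoint automatically and no prometheus.yml edit is needed.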

The core problem Tekton metrics solve is providing visibility into the distributed, asynchronous nature of pipeline execution. Without them, you’re flying blind, relying on log aggregation that is hard to quantify or alert on. Metrics give you quantifiable data on completion times, error rates, and controller load at the PipelineRun and TaskRun levels.

The key metrics to watch (shown here with the names used by older releases; newer releases prefix them with tekton_pipelines_controller_, so check your /metrics output) are:

  • tekton_pipelinerun_duration_seconds: a histogram measuring the total time a PipelineRun takes from start to finish, labelled by pipeline name (pipeline), status (status), and namespace (namespace).
    • Query: rate(tekton_pipelinerun_duration_seconds_count{pipeline="my-app-pipeline"}[5m]) to see the rate of completed runs, or sum by (pipeline, status) (rate(tekton_pipelinerun_duration_seconds_sum{pipeline="my-app-pipeline"}[5m])) / sum by (pipeline, status) (rate(tekton_pipelinerun_duration_seconds_count{pipeline="my-app-pipeline"}[5m])) for average durations — histograms are averaged as sum over count, not with avg().
  • tekton_taskrun_duration_seconds: Similar to PipelineRun duration, but for individual TaskRuns within a PipelineRun. This is crucial for pinpointing slow tasks.
    • Query: sum by (task, status) (rate(tekton_taskrun_duration_seconds_sum{pipeline="my-app-pipeline"}[5m])) / sum by (task, status) (rate(tekton_taskrun_duration_seconds_count{pipeline="my-app-pipeline"}[5m]))
  • Reconciler and workqueue metrics: the controller also exports Knative-style reconciler metrics (the exact names vary by release; browse the controller’s /metrics output for series containing reconcile or workqueue). Spikes in reconcile latency or sustained workqueue depth indicate controller overload.
    • Query: for a reconcile-latency histogram, divide rate(<metric>_sum[1m]) by rate(<metric>_count[1m]) to get the average processing time.
  • tekton_pipelinerun_count: a counter of completed PipelineRuns, labelled by outcome (status of success, failed, or cancelled); its companion gauge tekton_running_pipelineruns_count tracks how many runs are executing right now.
    • Query: increase(tekton_pipelinerun_count{status="failed"}[1h]) to count runs that failed in the last hour.
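
One note on averaging the duration metrics: they are Prometheus histograms, so the average comes from dividing the _sum series by the _count series rather than averaging the metric directly. A quick sketch of that arithmetic, using two made-up sample lines rather than real controller output:

```shell
# Why averages are computed as _sum / _count: a Prometheus histogram only
# exposes running totals, so average duration = total seconds / number of runs.
# The two sample lines below are illustrative, not real controller output.
awk '/_sum/ {s=$2} /_count/ {c=$2} END {print s/c}' <<'EOF'
tekton_pipelinerun_duration_seconds_sum{pipeline="my-app-pipeline",status="success"} 930.0
tekton_pipelinerun_duration_seconds_count{pipeline="my-app-pipeline",status="success"} 6
EOF
# prints 155
```

The sum-over-count PromQL expressions compute exactly this ratio, windowed with rate() so it reflects recent runs instead of the all-time total.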

These metrics are generated by the Tekton Pipelines controller itself: as it reconciles PipelineRuns and TaskRuns it records durations and counts, and serves them from its own metrics endpoint. (Tekton Results, a separate optional component, archives run history for long-term storage; it is not what feeds Prometheus.)

The most surprising thing about Tekton metrics is how granular they get without explicit configuration. You don’t need to add special annotations or Prometheus-specific exporters to your Tasks; Tekton automatically instruments its core resources. This means you get performance data on every PipelineRun and TaskRun out of the box, letting you correlate performance degradation with specific pipelines or with individual Tasks within them.

To use these metrics effectively, you’ll want to set up dashboards in Grafana. Visualizing tekton_taskrun_duration_seconds over time, broken down by task name, will immediately highlight which parts of your pipelines are consistently slow. Alerting on failed-run counts or on rising reconcile latency is also a powerful proactive measure.
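
For the alerting piece, here is a Prometheus Operator PrometheusRule sketch that fires when any PipelineRun fails. The metric name (tekton_pipelinerun_count, a counter labelled by status) follows the older naming; on newer releases substitute the tekton_pipelines_controller_ prefix, and adjust the window and threshold to taste:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: tekton-pipeline-alerts
  namespace: tekton-pipelines
spec:
  groups:
    - name: tekton
      rules:
        - alert: TektonPipelineRunFailed
          # Fires if any PipelineRun ended in failure during the last 15 minutes.
          expr: increase(tekton_pipelinerun_count{status="failed"}[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "One or more Tekton PipelineRuns failed in the last 15 minutes"
```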

The next logical step after monitoring pipeline performance is to dive into the metrics of the underlying Kubernetes resources that your Tasks are running on, such as Pod CPU and memory usage, which can be exposed by kube-state-metrics and node-exporter.

Want structured learning?

Take the full Tekton course →