Grafana Tempo, when deployed via its Kubernetes Operator, doesn’t just store traces; it’s a distributed system that prioritizes availability and scalability by decoupling its components.

Let’s see it in action. Imagine you’ve got Tempo deployed and you’re sending traces from a sample application. You can query these traces using the Grafana UI, which Tempo integrates with seamlessly. For example, you might see a trace for a specific HTTP request, broken down into its constituent spans, each with its own duration and metadata. This visual representation is key to understanding the flow of requests across your microservices.
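In Grafana's Explore view you can narrow the search with TraceQL, Tempo's query language. As a sketch, a query like the following finds slow, failed HTTP requests (the `http.status_code` attribute name assumes OpenTelemetry semantic conventions; your instrumentation may differ):

```
{ span.http.status_code >= 500 && duration > 2s }
```

Each matching trace can then be opened to inspect its individual spans.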

The Operator simplifies managing Tempo’s distributed architecture. Tempo itself is composed of several microservices: the Distributor, Ingester, Querier, Query Frontend, and Compactor. The Distributor receives incoming spans and shards them across Ingesters. The Ingester batches spans into blocks and writes those blocks to object storage. The Querier fetches trace data from the Ingesters and from object storage and serves it back through the Query Frontend to Grafana. The Compactor merges and deduplicates blocks in object storage to keep queries fast. The Operator automates the deployment, scaling, and configuration of these components as Kubernetes resources (Deployments, StatefulSets, Services, etc.). This means you don’t manually craft Kubernetes manifests for each Tempo component; the Operator does it for you based on a custom resource definition (CRD).

The core problem Tempo solves is the difficulty in debugging distributed systems. When a request fails or is slow, pinpointing the problematic service can be a nightmare. Tempo, by collecting and aggregating traces from all your services, gives you a unified view of request lifecycles. You can see the exact path a request took, how long each service took to respond, and where errors occurred.

Internally, the Operator manages Tempo by observing a Tempo custom resource. You define the desired state of your Tempo deployment in this resource – things like the number of replicas for each component, the storage backend (e.g., S3, GCS, MinIO), and specific configurations for each service. The Operator then continuously reconciles the actual state of your Kubernetes cluster with this desired state, creating, updating, or deleting Kubernetes resources as needed. For instance, if you increase the replicas field for the ingester in your Tempo CR, the Operator will scale up the corresponding Kubernetes Deployment or StatefulSet.
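As a sketch, scaling the ingester could look like this in a TempoStack custom resource. Treat the field paths as illustrative of the pattern; the exact schema depends on your Operator version's CRD:

```yaml
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  template:
    ingester:
      # The Operator reconciles the ingester workload to this replica count
      replicas: 3
```

Applying this change is an ordinary `kubectl apply`; the Operator's reconcile loop does the rest.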

The levers you control are primarily within the Tempo CR. You specify the image for each component, allowing you to pin specific versions. Under spec.mode, you choose between singleBinary (all components in one pod, good for testing) and distributed (each component as a separate Deployment/StatefulSet, for production). Storage configuration is crucial: spec.storage.trace defines where traces are stored. For S3, this would look like:

```yaml
storage:
  trace:
    backend: s3
    s3:
      bucket: my-tempo-traces
      endpoint: s3.amazonaws.com
      region: us-east-1
```

Resource requests and limits for each component are also managed here, ensuring your Tempo deployment plays nicely with your cluster’s resources.
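One pattern here is declaring an overall resource budget that the Operator divides among the components; again, a hedged sketch rather than a guaranteed schema, so verify the field names against your CRD:

```yaml
spec:
  resources:
    total:
      limits:
        # Total budget the Operator apportions across
        # distributor, ingester, querier, and friends
        cpu: "2"
        memory: 4Gi
```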

What most people miss is how Tempo’s query path interacts with its object storage backend. While Tempo can cache auxiliary data such as bloom filters and index pages (for example, via Memcached or Redis), the trace blocks themselves are fetched from object storage (like S3) and the trace is reconstructed on the fly. This means the performance of your object storage is directly tied to your trace query latency, and optimizing your object storage’s read performance is as critical as scaling your Querier replicas.

The next concept to explore is how to integrate Tempo with other Grafana ecosystem components like Loki for logs and Prometheus for metrics, creating a powerful observability stack.

Want structured learning?

Take the full Tempo course →