Grafana Tempo can process traces in near real-time, but not by streaming them directly to storage.

Let’s see this in action. Imagine a system generating traces. These traces are typically sent to Tempo’s ingester.

Traces -> Tempo Ingester -> Storage (e.g., S3, GCS)
(The Tempo Querier sits on the read path: it reads recent traces from the ingesters and older traces from storage.)

The streaming pipeline here refers to how Tempo handles incoming traces before they hit persistent storage. It’s about the internal flow and processing within the Tempo components themselves.

The core problem Tempo solves is distributed tracing at scale, especially when dealing with high volumes of trace data. Traditional tracing systems might batch traces for storage, leading to latency between an event happening and its trace data being queryable. Tempo’s architecture aims to minimize this latency.

Tempo’s ingester is the entry point for trace data in this simplified flow (in a full deployment, a distributor sits in front and routes spans to ingesters). It receives spans from various sources (like OpenTelemetry Collectors) via protocols like OTLP.

// Example OTLP Trace Data (simplified)
{
  "resourceSpans": [
    {
      "resource": { "attributes": [...] },
      "scopeSpans": [
        {
          "scope": { "name": "my-app" },
          "spans": [
            {
              "traceId": "a1b2c3d4e5f67890",
              "spanId": "1122334455667788",
              "kind": "SPAN_KIND_INTERNAL",
              "name": "http.request",
              "startTimeUnixNano": "1678886400000000000",
              "endTimeUnixNano": "1678886400100000000",
              "attributes": [...],
              "status": { "code": "STATUS_CODE_OK" }
            }
          ]
        }
      ]
    }
  ]
}
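A consumer of this format walks the nested resourceSpans → scopeSpans → spans structure to pull out facts about each span. Here is a minimal sketch in Python; the field names follow the OTLP JSON encoding shown above, and the payload itself is a hypothetical example:

```python
import json

# A simplified OTLP-style payload, mirroring the structure shown above.
otlp_payload = """
{
  "resourceSpans": [{
    "resource": {"attributes": []},
    "scopeSpans": [{
      "scope": {"name": "my-app"},
      "spans": [{
        "traceId": "a1b2c3d4e5f67890",
        "spanId": "1122334455667788",
        "name": "http.request",
        "startTimeUnixNano": "1678886400000000000",
        "endTimeUnixNano": "1678886400100000000"
      }]
    }]
  }]
}
"""

def span_durations_ms(payload: str) -> dict:
    """Map span name -> duration in milliseconds."""
    doc = json.loads(payload)
    durations = {}
    for rs in doc["resourceSpans"]:
        for ss in rs["scopeSpans"]:
            for span in ss["spans"]:
                # OTLP JSON encodes 64-bit nanosecond timestamps as strings.
                start = int(span["startTimeUnixNano"])
                end = int(span["endTimeUnixNano"])
                durations[span["name"]] = (end - start) / 1e6
    return durations

print(span_durations_ms(otlp_payload))  # {'http.request': 100.0}
```

Note that the nanosecond timestamps arrive as strings: OTLP's JSON encoding serializes 64-bit integers that way to avoid precision loss, so they must be parsed before doing arithmetic.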

Once the ingester receives a trace, it doesn’t immediately write the entire trace to object storage. Instead, it performs several operations:

  1. Validation and Deduplication: It checks that spans are well-formed and combines spans sharing a trace ID into a single trace, discarding repeats.
  2. Sharding/Balancing: Each trace is owned by a specific ingester, chosen by consistent hashing of the trace ID (in a full deployment, the distributor performs this routing before the ingester ever sees the spans).
  3. Batching and Block-Local Indexing: Traces accumulate in an in-memory block, backed by a write-ahead log for durability. Each block carries a bloom filter and an index of the trace IDs it contains; these are what allow the Querier to locate traces efficiently without scanning every block.
  4. Buffering/Flushing: Completed blocks are written to object storage asynchronously. This buffering is where the "streaming" aspect comes into play – data flows through these internal buffers and is queryable before the flush completes.
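The sharding idea can be illustrated with a small sketch: group incoming spans by trace ID, then hash the trace ID to pick a shard, so all spans of one trace land in the same place. This is illustrative Python under simplified assumptions, not Tempo's actual implementation (Tempo uses a token-based hash ring):

```python
import hashlib
from collections import defaultdict

def shard_for(trace_id: str, num_shards: int) -> int:
    # Hash the trace ID so every span of a trace maps to the same shard.
    digest = hashlib.sha256(trace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Incoming spans as (trace_id, span_name) pairs (hypothetical data).
spans = [
    ("a1b2c3d4e5f67890", "http.request"),
    ("a1b2c3d4e5f67890", "db.query"),
    ("ffeeddccbbaa0099", "http.request"),
]

# Group spans by trace, then route each trace to its shard.
by_trace = defaultdict(list)
for trace_id, name in spans:
    by_trace[trace_id].append(name)

routes = {tid: shard_for(tid, num_shards=4) for tid in by_trace}
```

Because the mapping depends only on the trace ID, the two spans of trace `a1b2...` always route to the same shard, which is what lets a single ingester assemble the complete trace.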

The Querier is responsible for retrieving traces, but the ingester does not push data to it. Instead, when a query comes in, the Querier asks the ingesters for traces that are still in memory and searches the blocks in object storage for older ones.

# Example Tempo Configuration (simplified; key names follow Tempo's YAML config)
querier:
  frontend_worker:
    frontend_address: tempo-query-frontend:9095  # where to pull queries from
  trace_by_id:
    query_timeout: 10s

storage:
  trace:
    backend: s3              # object storage holds the trace blocks
    s3:
      bucket: tempo-traces
      endpoint: s3.amazonaws.com

Note that Tempo needs no separate index database: the bloom filters and per-block indexes live alongside the blocks in object storage.

The Querier uses each block's bloom filter to cheaply rule blocks out, then uses the surviving blocks' indexes to locate the trace and retrieves the actual trace data from the configured object storage (like S3). The "streaming pipeline" means that as soon as a trace reaches an ingester, it can potentially be queried, rather than waiting for a batch write to complete.
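Tempo's storage blocks carry bloom filters so a querier can cheaply answer "does this block possibly contain trace X?" – a bloom filter never gives a false "no", only occasional false "yes"es. A toy version shows the idea (illustrative Python only; Tempo's real filters are more sophisticated):

```python
import hashlib

class TinyBloom:
    """A toy bloom filter: no false negatives, rare false positives."""
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one big integer

    def _positions(self, item: str):
        # Derive num_hashes independent bit positions for the item.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all(self.bits & (1 << pos) for pos in self._positions(item))

# One filter per storage block: the querier skips any block whose
# filter rules the trace ID out, and only fetches the remaining blocks.
block_filter = TinyBloom()
block_filter.add("a1b2c3d4e5f67890")

block_filter.might_contain("a1b2c3d4e5f67890")  # True: it was added
block_filter.might_contain("0000000000000000")  # almost certainly False
```

The payoff is that a trace-ID lookup touches only a handful of candidate blocks instead of every block in the bucket, which is what makes object storage viable as the query backend.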

The most surprising true thing about Tempo’s "streaming" is that it’s not a continuous, real-time stream of trace data being written to object storage. Instead, it’s a near real-time flow: traces are buffered, assembled into indexed blocks, and only later written asynchronously to object storage. The ingester acts as a high-throughput buffer, making trace data available for querying very quickly by holding recent traces in memory (kept durable by a write-ahead log) and answering querier requests directly. The actual trace payloads are written to object storage in blocks optimized for storage costs and retrieval patterns, not for immediate query availability.

The "streaming pipeline" is more accurately a high-throughput, low-latency ingestion and indexing process that makes traces queryable rapidly, even before the full trace data is durably stored in its final object storage location.
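This "queryable immediately, durably stored later" behavior can be sketched as a buffer that serves reads right away and flushes to a backend only when a threshold is crossed. A toy model under simplified assumptions (Tempo actually cuts blocks by size and age, and the flush is genuinely asynchronous):

```python
class BufferingIngester:
    """Toy ingester: traces are queryable immediately, flushed in batches."""

    def __init__(self, flush_threshold: int):
        self.flush_threshold = flush_threshold
        self.buffer = {}   # trace_id -> data, queryable the moment it arrives
        self.storage = {}  # simulated object storage, the data's eventual home

    def ingest(self, trace_id: str, data: str):
        self.buffer[trace_id] = data
        if len(self.buffer) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        # Batched write to "object storage" (synchronous in this toy).
        self.storage.update(self.buffer)
        self.buffer.clear()

    def query(self, trace_id: str):
        # Recent traces come from the buffer, older ones from storage.
        return self.buffer.get(trace_id) or self.storage.get(trace_id)

ing = BufferingIngester(flush_threshold=2)
ing.ingest("t1", "trace-1")
assert ing.query("t1") == "trace-1"   # queryable before any flush happens
ing.ingest("t2", "trace-2")           # threshold reached -> batch flushed
assert ing.query("t1") == "trace-1"   # still queryable, now from "storage"
```

The key property is that `query` never waits on the flush: the read path falls through from the fast buffer to the durable backend, mirroring how Tempo queriers consult ingesters first and object storage second.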

The next concept you’ll likely encounter is the role of the Tempo Distributor and how it interacts with the ingester and storage for achieving high availability and scalability.
