Grafana Tempo’s remote write functionality for span metrics is a pipeline that ingests trace data, derives metrics from the spans it contains, and writes those metrics out to Prometheus-compatible storage.

Let’s dive into how it actually works. Imagine you have distributed tracing in place (Jaeger, OpenTelemetry instrumentation, or an OpenTelemetry Collector) generating spans. These spans carry rich information about requests as they traverse your services: duration, status, service name, operation name, and various attributes. Your tracing pipeline sends this span data to Tempo. But Tempo doesn’t have to just store raw spans; the span data can also be transformed into metrics, and those metrics remote-written to a Prometheus-compatible backend. This is the "Span Metrics Pipeline."

Here’s a simplified look at a typical flow:

  1. Span Ingestion: Your tracing pipeline (e.g., an OpenTelemetry Collector configured with an otlp exporter) sends spans to Tempo. Tempo accepts spans via Jaeger (Thrift/Protobuf), Zipkin, or the OpenTelemetry Protocol (OTLP).
  2. Span Storage: Tempo stores these raw spans, making them searchable and viewable in Grafana.
  3. Metrics Extraction (The Pipeline): This is where the magic for span metrics happens. Tempo (via its metrics-generator component) or, more commonly, an intermediary like the OpenTelemetry Collector processes the incoming spans. It looks at specific attributes and inherent properties of each span (such as duration) and aggregates them into metrics.
    • Example: A common metric extracted is http.server.duration (or similar, depending on your instrumentation). This metric is derived from the duration of spans that represent HTTP server requests. Other metrics might include counts of requests per service/operation, error rates (spans with an error status), or latency percentiles.
  4. Metric Export: These extracted metrics are then exported, either pushed via Prometheus remote write to a metrics backend (Prometheus, Mimir) or exposed on an endpoint for Prometheus to scrape.

This pipeline allows you to gain operational insights from your traces without needing to instrument your applications again specifically for metrics. You get metrics like request rates, error rates, and latency distributions directly from your existing trace data.
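Once these metrics are stored, the classic RED queries fall out directly. A sketch in PromQL (metric and label names are illustrative; the exact ones depend on which component generates the metrics and how it is configured):

```promql
# Request rate per service
sum by (service) (rate(traces_spanmetrics_calls_total[5m]))

# Error rate: fraction of spans with error status
sum by (service) (rate(traces_spanmetrics_calls_total{status_code="STATUS_CODE_ERROR"}[5m]))
  / sum by (service) (rate(traces_spanmetrics_calls_total[5m]))

# p95 latency from the duration histogram
histogram_quantile(0.95,
  sum by (le, service) (rate(traces_spanmetrics_latency_bucket[5m])))
```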

How to Set It Up (The Core Idea)

The most common and flexible way to leverage Tempo’s span metrics pipeline is not by having Tempo itself do the heavy lifting of metric generation from spans. Instead, you typically use the OpenTelemetry Collector.

Here’s the mental model:

  • Tracing Backend: Generates spans.
  • OpenTelemetry Collector (OTel Collector):
    • Receives spans from the tracing backend.
    • Processes these spans through a receiver (e.g., otlp).
    • Generates metrics from those spans using the spanmetrics connector (the successor to the deprecated spanmetrics processor).
    • Exports the generated metrics to a Prometheus-compatible endpoint.
    • (Optionally) Exports the raw spans to Tempo for storage and debugging.
  • Prometheus/Mimir: Scrapes (or receives via remote write) and stores the generated metrics.
  • Grafana: Visualizes both traces (from Tempo) and metrics (from Prometheus/Mimir).
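For the optional step of forwarding raw spans to Tempo, you just add another exporter to the Collector’s traces pipeline. A sketch, assuming Tempo is reachable at tempo:4317, its OTLP gRPC port (endpoint and TLS settings are placeholders for your environment):

```yaml
exporters:
  otlp/tempo:
    endpoint: tempo:4317 # placeholder; Tempo's OTLP gRPC port
    tls:
      insecure: true # fine for local demos; enable TLS in production
```

You would then list otlp/tempo alongside your other exporters in the traces pipeline.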

Live Example: OTel Collector Configuration

Let’s say you’re using the OTel Collector to receive spans from your applications (via OTLP) and you want to generate metrics from them.

receivers:
  otlp:
    protocols:
      grpc:
      http:

connectors:
  # spanmetrics ships as a *connector*: it consumes spans from a traces
  # pipeline and emits aggregated metrics into a metrics pipeline.
  # (The older spanmetrics processor is deprecated in its favor.)
  spanmetrics:
    # CUMULATIVE (the default) is what a scraping Prometheus expects;
    # use DELTA only for backends that ingest delta temporality.
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
    # Buckets for the duration histogram derived from span latencies.
    histogram:
      explicit:
        buckets: [5ms, 10ms, 25ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 2s, 5s]
    # Span/resource attributes to attach as metric labels, in addition to
    # the built-in service.name, span.name, span.kind, and status.code.
    dimensions:
      - name: http.method
        default: GET
      - name: http.route
      - name: http.status_code

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889" # Exposes the generated metrics on this port

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics] # Feed spans into the connector (add an OTLP exporter for Tempo here too)
    metrics:
      receivers: [spanmetrics] # Receive the metrics the connector generates
      exporters: [prometheus] # Export them for Prometheus to scrape

In this configuration:

  • The spanmetrics connector sits between the two pipelines: it appears as an exporter in the traces pipeline and as a receiver in the metrics pipeline.
  • aggregation_temporality defaults to cumulative, which is what a scraping Prometheus expects; delta temporality is for backends that ingest deltas. The accepted values are the long strings AGGREGATION_TEMPORALITY_CUMULATIVE and AGGREGATION_TEMPORALITY_DELTA.
  • For every span, the connector automatically produces a call counter and a duration histogram (surfaced through the Prometheus exporter as something like calls_total and duration_milliseconds_bucket/_sum/_count; exact names depend on the Collector version and any configured namespace).
  • dimensions specifies which span attributes become labels on those metrics, on top of the built-in service.name, span.name, span.kind, and status.code labels.

This collector would then expose metrics on http://localhost:8889/metrics. You’d configure Prometheus to scrape this endpoint.
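A matching Prometheus scrape job might look like this (the job name and target address are assumptions for your environment):

```yaml
# prometheus.yml (excerpt)
scrape_configs:
  - job_name: "otel-collector-spanmetrics"
    scrape_interval: 15s
    static_configs:
      - targets: ["otel-collector:8889"] # the Collector's prometheus exporter port
```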

The "Aha!" Moment: Why This Isn’t Just Tempo

The most surprising truth about Tempo’s span metrics pipeline is that the metrics don’t have to be generated by Tempo at all. Tempo does ship a metrics-generator component that can derive span metrics server-side and remote-write them to Prometheus-compatible storage, but the more flexible and idiomatic place to do this work is often the OpenTelemetry Collector. The Collector acts as the intelligent intermediary, transforming raw trace data into Prometheus-readable metrics before they ever reach a metrics store. This separation of concerns is key: Tempo stores traces, Prometheus stores metrics, and the OTel Collector (or Tempo’s metrics-generator) bridges them.
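If you do want Tempo itself to own this step, a minimal metrics-generator sketch looks roughly like this (field names follow Tempo 2.x configuration; the remote write URL and WAL path are placeholders):

```yaml
# tempo.yaml (excerpt)
metrics_generator:
  storage:
    path: /var/tempo/generator/wal # placeholder WAL location
    remote_write:
      - url: http://prometheus:9090/api/v1/write # placeholder endpoint
        send_exemplars: true

overrides:
  defaults:
    metrics_generator:
      processors: [span-metrics] # enable span metrics (service-graphs is another option)
```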

This approach also means you can send your spans to Tempo for archival and analysis, and simultaneously extract metrics from them using the OTel Collector, all from the same source of spans.

The next problem you’ll likely encounter is figuring out how to correlate these metrics back to individual traces effectively in Grafana.
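In Grafana, the Tempo data source’s trace-to-metrics links are one way to close that loop. A provisioning sketch (URLs, UIDs, and the query are illustrative; field names follow Grafana’s Tempo data source documentation):

```yaml
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    url: http://tempo:3200 # placeholder
    jsonData:
      tracesToMetrics:
        datasourceUid: prometheus # UID of your Prometheus data source
        queries:
          - name: "Request rate"
            query: "sum(rate(traces_spanmetrics_calls_total{$__tags}[5m]))" # illustrative
```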

Want structured learning?

Take the full Tempo course →