A flat JSON log format is surprisingly effective for trace correlation with Tempo because it embeds the trace ID directly into every log line, making it searchable without any complex indexing.

Let’s see this in action with a simple example. Imagine we have a service that logs events. In a flat JSON format, the log lines might look like this:

{"level":"info","ts":"2023-10-27T10:00:00Z","caller":"my_service/main.go:25","msg":"User logged in","user_id":"user123","traceID":"abc123xyz"}
{"level":"debug","ts":"2023-10-27T10:00:01Z","caller":"my_service/auth.go:50","msg":"Authentication successful","user_id":"user123","traceID":"abc123xyz"}
{"level":"info","ts":"2023-10-27T10:00:05Z","caller":"my_service/main.go:30","msg":"User session created","user_id":"user123","traceID":"abc123xyz"}

Notice the traceID field. This is the magic. Tempo itself stores traces, not logs; the log lines go to your log backend (typically Loki). Because the traceID is right there in the payload, when you’re looking at a trace in Grafana and click the "Logs" link, Grafana can automatically filter the logs to show only those associated with that specific trace ID.

This solves the problem of piecing together the journey of a request across different services and components. Traditionally, you’d correlate logs by timestamps and perhaps request IDs, which is brittle and requires manual effort. Standardizing on a common field like traceID within the log payload allows for direct, programmatic correlation.

Internally, Tempo’s distributor component receives spans, not logs, and Tempo writes trace data to object storage (like S3 or GCS). The flat JSON logs live in your log backend, and correlation happens at query time: Grafana takes the trace ID from the trace you’re viewing and runs a matching query against the logs. The "flat" nature means the traceID is a top-level field, making it trivial for the log query to find and match it.

The key levers you control are primarily in how your applications generate logs. You need to ensure that:

  1. Your logging library is configured to output JSON.
  2. Your application logic injects the current traceID into the log context. This is typically done by passing the trace context through function calls or using a context propagation mechanism.
  3. Your log shipping pipeline (e.g., Promtail or Fluentd) delivers the logs to your log backend, and Grafana is configured to link traces from Tempo to that backend.

For example, if you’re using go.uber.org/zap as your logger, you might do something like this:

import (
	"net/http"

	"go.uber.org/zap"
	"go.opentelemetry.io/otel/trace"
)

func myHandler(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	span := trace.SpanFromContext(ctx)
	traceID := span.SpanContext().TraceID().String() // Get the trace ID

	logger, _ := zap.NewProduction() // Or your configured logger
	defer logger.Sync()

	// Add traceID to the log fields
	logger.Info("Request received",
		zap.String("traceID", traceID),
		zap.String("method", r.Method),
		zap.String("path", r.URL.Path),
	)
	// ... rest of your handler logic
}

This ensures that every log message generated within the context of a trace has that traceID embedded. Grafana’s trace-to-logs link then picks it up.

The most surprising aspect is how little effort is required on the Tempo and Grafana side for this to work, assuming your logs are already in JSON and carry a traceID. The complexity shifts to ensuring your application correctly generates and propagates trace IDs into its logs. If your logs are not in JSON, or if the trace ID field name is inconsistent across services (e.g., traceid vs. traceID vs. trace_id), automatic correlation breaks. You’d then need to normalize these fields in a log processing pipeline (such as Fluentd or Logstash) before the logs reach your backend, or configure the field extraction in Grafana’s data source settings to match each variant.

Once you have trace correlation working, the next logical step is to explore distributed tracing within Grafana itself, linking traces to metrics.

Want structured learning?

Take the full Tempo course →