Tempo’s tail sampling is a powerful way to control trace volume without losing critical information.
Let’s see it in action. Imagine you have a microservice architecture and you’re getting inundated with traces from a high-traffic, low-value endpoint. You want to keep all traces from your critical checkout service, but only sample 10% of traces from the user-profile service.
Here’s a snippet of a Tempo configuration file (tempo.yaml) demonstrating tail sampling rules:
```yaml
distributor:
  # ... other distributor config ...
  tail_sampling:
    policies:
      - name: "keep-critical-traces"
        type: "status_code"
        status_code: "error"   # keep all traces with an error status code
        percentage: 100
      - name: "sample-user-profile"
        type: "trace_id"
        service: "user-profile"   # target the user-profile service
        percentage: 10            # keep only 10% of traces from this service
      - name: "sample-other-services"
        type: "trace_id"
        percentage: 1             # default to 1% for all other services
```
This configuration tells Tempo to:
- Keep all traces marked with an error status code (policy `keep-critical-traces`). This is our safety net.
- Keep only 10% of traces originating from the `user-profile` service (policy `sample-user-profile`). This significantly reduces volume for a noisy service.
- Keep only 1% of all other traces (policy `sample-other-services`). This is a general catch-all for the rest of your services.
The problem this solves is the classic trade-off between observability and cost/storage. Full tracing everywhere is expensive and can overwhelm your storage. Sampling at the head (when the trace starts) can lose critical traces if an error occurs later in the trace. Tail sampling, however, makes decisions after the entire trace has been collected, ensuring you don’t discard a trace that later reveals an error.
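The difference is easy to demonstrate. Here is a minimal sketch (plain Python, not Tempo code) contrasting the two strategies: the head sampler must decide before it knows whether the trace will fail, while the tail sampler sees the completed trace and can always keep errors.

```python
import random

def head_sample(trace_id: str, rate: float) -> bool:
    # Head sampling: decided when the trace starts, before we know
    # whether it will end in an error.
    return random.random() < rate

def tail_sample(trace: dict, rate: float) -> bool:
    # Tail sampling: decided after the whole trace has been collected,
    # so error traces can be kept unconditionally.
    if trace["status"] == "error":
        return True
    return random.random() < rate

random.seed(0)
traces = [{"id": f"t{i}", "status": "error" if i % 50 == 0 else "ok"}
          for i in range(1000)]  # 20 error traces out of 1000

kept_head = [t for t in traces if head_sample(t["id"], 0.10)]
kept_tail = [t for t in traces if tail_sample(t, 0.10)]

errors = sum(t["status"] == "error" for t in traces)
print(sum(t["status"] == "error" for t in kept_head), "of", errors,
      "error traces survive head sampling")
print(sum(t["status"] == "error" for t in kept_tail), "of", errors,
      "error traces survive tail sampling")
```

With head sampling, roughly 90% of the error traces are discarded before anyone knows they contained an error; with tail sampling, all of them survive.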
Internally, Tempo’s distributor component handles tail sampling. Once a trace is considered complete (in practice, after a decision window has elapsed since its spans began arriving), the distributor evaluates the configured sampling policies against the trace’s attributes (like service name, status code, or even custom attributes). Based on the matching policies and their percentages, it decides whether to keep or discard the trace. The trace_id type policy is particularly interesting because it uses a consistent hashing algorithm based on the trace ID, so that for a given trace ID the sampling decision is always the same across all distributor instances. This prevents a single trace from being sampled by one instance and dropped by another if it happens to hit multiple distributors.
The percentage field isn’t a strict guarantee for every single trace. It’s a probabilistic target: over a large number of traces, the observed rate will approach the specified percentage. For fine-grained control, you can combine policies of different types. For instance, you could have a policy that keeps 100% of traces from checkout if they have an error, but only 5% otherwise.
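Such a combination might look like the following fragment. This reuses the field names from the snippet above for illustration; the exact policy schema (in particular, whether a status_code policy accepts a service field) is an assumption you should verify against your Tempo version’s configuration reference.

```yaml
- name: "checkout-errors"
  type: "status_code"
  service: "checkout"    # hypothetical: scopes the status-code match to one service
  status_code: "error"
  percentage: 100        # always keep failing checkout traces
- name: "checkout-baseline"
  type: "trace_id"
  service: "checkout"
  percentage: 5          # otherwise keep 5% of checkout traces
```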
A common misconception is that trace_id sampling is arbitrary. It’s not. Tempo uses a deterministic algorithm (typically based on murmurhash or similar) on the trace ID to assign it to a "bucket." The percentage then dictates how many of these buckets are marked for sampling. This means if you have multiple Tempo distributor instances, a trace with a given ID will always be sampled or dropped consistently across all instances.
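The bucket mechanism can be sketched in a few lines. This is an illustration, not Tempo’s actual implementation: the hash function (CRC32 here), the bucket count, and the function names are all stand-ins, but the property demonstrated, that every instance derives the same decision from the same trace ID, is the real point.

```python
import zlib

NUM_BUCKETS = 100

def bucket(trace_id: str) -> int:
    # Deterministic hash of the trace ID: every instance running this
    # function maps a given trace ID to the same bucket.
    return zlib.crc32(trace_id.encode()) % NUM_BUCKETS

def should_sample(trace_id: str, percentage: int) -> bool:
    # A percentage of N keeps the N lowest-numbered buckets,
    # i.e. roughly N% of trace IDs over a large population.
    return bucket(trace_id) < percentage

# The same trace ID yields the same decision on every "instance".
decisions = [should_sample("4bf92f3577b34da6a3ce929d0e0e4736", 10)
             for _ in range(3)]
print(decisions)  # all three entries are identical
```

Because the decision is a pure function of the trace ID, no coordination between distributor instances is needed: determinism alone guarantees consistency.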
The next concept you’ll likely encounter is how to integrate these sampling decisions with your alerting strategy, ensuring that critical errors are always visible even with aggressive sampling.