Tempo’s ingestion rate limits are a feature, not an annoyance: they prevent a single noisy tenant from overwhelming your backend and causing cascading failures. When a limit is hit, Tempo rejects the write with an explicit error rather than silently dropping data, so clients can back off and retry.

Here’s how Tempo handles ingestion, and how you can tune it.

Tempo ingests traces through its distributor component. The distributor is stateless and shards incoming trace data across the ingester components. Each ingester batches traces into blocks and writes those blocks directly to object storage; Tempo is deliberately index-free, so there is no separate indexing backend such as Loki or Elasticsearch to operate.
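To make the moving pieces concrete, here’s a minimal, illustrative tempo.yaml skeleton (the receiver choice, bucket name, and endpoint are placeholders for your environment, not recommendations):

```yaml
# Illustrative skeleton only - adapt to your deployment
distributor:
  receivers:                  # protocols the distributor accepts traces on
    otlp:
      protocols:
        grpc:

ingester:
  max_block_duration: 30m     # how long a block accumulates before it is cut

storage:
  trace:
    backend: s3               # blocks are written straight to object storage
    s3:
      bucket: tempo-traces    # placeholder bucket name
      endpoint: s3.amazonaws.com
```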

The distributor itself has configurable limits to prevent it from accepting more data than the downstream ingesters can handle. These limits are per-tenant and live in the overrides block of the configuration; the main ones are ingestion_rate_limit_bytes, ingestion_burst_size_bytes, and max_bytes_per_trace.

Let’s look at a common scenario: you’re seeing 429 Too Many Requests errors from the Tempo API when pushing traces, or traces simply never appear in Grafana. This usually means one of these limits is being hit.

The most common cause is a single trace exceeding max_bytes_per_trace. This limit exists to stop one massive trace from consuming excessive memory as it accumulates in the ingesters; when a trace crosses it, further spans for that trace are rejected with a TRACE_TOO_LARGE error.

To diagnose this, look at the logs of your Tempo distributor and search for TRACE_TOO_LARGE. You can also monitor the tempo_distributor_spans_received_total and tempo_discarded_spans_total metrics in Grafana; the latter carries a reason label (trace_too_large, rate_limited, live_traces_exceeded) that tells you exactly why spans are being dropped.

If you’re hitting this, the fix is to raise max_bytes_per_trace in the overrides block. The default is around 5MB. A common adjustment is to increase it to 10000000 (10MB) or 20000000 (20MB) depending on your trace sizes and available memory.

overrides:
  max_bytes_per_trace: 20000000 # 20MB

This works because traces are now allowed to grow larger before Tempo cuts them off, so legitimately large traces are stored in full. The trade-off is a higher worst-case memory footprint per trace in the ingesters, so raise it in measured steps.

Another frequent culprit is the per-tenant ingestion rate limit, ingestion_rate_limit_bytes. This is a byte-based rate limit applied by the distributor to keep any one tenant from overwhelming the ingester pool.

Check your distributor logs for RATE_LIMITED errors or 429 responses related to overall throughput. The tempo_discarded_spans_total metric with reason="rate_limited" is your key indicator here.
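To catch this before users notice, one option is a Prometheus alerting rule on the discard metric, broken out by reason. This is a sketch; the group name, threshold, and duration are placeholders to adapt:

```yaml
groups:
  - name: tempo-ingestion          # placeholder group name
    rules:
      - alert: TempoDiscardingSpans
        # Fires when any spans are dropped, labeled by cause
        # (rate_limited, trace_too_large, live_traces_exceeded, ...)
        expr: sum by (reason) (rate(tempo_discarded_spans_total[5m])) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tempo is discarding spans ({{ $labels.reason }})"
```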

If this is the bottleneck, you’ll need to increase ingestion_rate_limit_bytes, and ingestion_burst_size_bytes along with it so short spikes aren’t rejected. The defaults are roughly 15MB/s and 20MB. You might double both if your ingesters have the headroom.

overrides:
  ingestion_rate_limit_bytes: 30000000 # 30MB/s
  ingestion_burst_size_bytes: 40000000 # 40MB

This raises the number of bytes per second the distributor will accept for each tenant, allowing higher overall ingestion volume.

Sometimes, the issue isn’t the distributor’s limits, but the ingesters themselves. The max_block_duration and max_block_bytes settings in the ingester block control how long and how large a block grows before it is cut and flushed to object storage. If blocks are cut too often, ingesters spend their time flushing, and the resulting backpressure eventually reaches the distributor.

Look for flush errors in the ingester logs and monitor tempo_ingester_failed_flushes_total. Also keep an eye on ingester memory usage, since every in-progress block is buffered in the ingester until it is flushed.

To fix this, increase max_block_duration and max_block_bytes so blocks are cut less frequently. For example, you might set:

ingester:
  max_block_duration: 1h
  max_block_bytes: 268435456 # 256MB

This allows ingesters to accumulate more data per block before writing, reducing the frequency of flushes, at the cost of more memory held per in-flight block.
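Block cutting interacts with a few neighboring ingester settings, sketched below; treat the values as illustrative and verify the defaults against your Tempo version:

```yaml
ingester:
  trace_idle_period: 10s        # idle time before a live trace is cut to the head block
  flush_check_period: 10s       # how often the ingester checks for blocks ready to flush
  max_block_duration: 1h
  max_block_bytes: 268435456    # 256MB
  complete_block_timeout: 15m   # how long a completed block stays queryable in the ingester
```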

Don’t forget the max_traces_per_user setting, which caps the number of live (still-assembling) traces held per tenant in each ingester. If this limit is low relative to your traffic, ingesters will reject new traces with LIVE_TRACES_EXCEEDED even though the cluster has spare capacity.

Monitor tempo_ingester_live_traces and look for the live_traces_exceeded reason in tempo_discarded_spans_total.

Raise max_traces_per_user in the overrides block. The default is 10,000; size it from your trace volume and how long traces stay live, since a trace counts against the limit until it has been idle long enough to be cut to a block.

overrides:
  max_traces_per_user: 50000

This lets each ingester hold more in-flight traces before it starts rejecting new ones.

The querier component is also worth a look, even though it sits on the read path. Recent traces are served directly from the ingesters, so heavy query concurrency competes with ingestion for ingester resources. The querier.max_concurrent_queries setting caps how many queries each querier runs at once.

Monitor trace-retrieval latency and querier saturation; sustained high latency suggests the queriers, or the ingesters behind them, are overloaded.

If queriers are the constraint, you can raise max_concurrent_queries. For example:

querier:
  max_concurrent_queries: 50

This allows each querier to run more simultaneous queries, improving read responsiveness, but raise it cautiously: those extra queries also land on the ingesters you just tuned.

Finally, consider the ingestion_rate_strategy setting. With the default local strategy, each distributor applies the full ingestion_rate_limit_bytes independently; with the global strategy, the limit is shared across all distributors, acting as one cluster-wide ceiling. If a global limit is set too low, it caps your overall ingestion no matter how many distributors you run.

Check the overrides block of your main tempo.yaml. If the strategy is global and the limit is restrictive, consider increasing it.

# Example: cluster-wide rate limit shared across distributors
overrides:
  ingestion_rate_strategy: global
  ingestion_rate_limit_bytes: 150000000 # 150MB/s

This acts as a master ceiling: even when individual distributors have headroom, the cluster as a whole won’t exceed the defined threshold.
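If only certain tenants need higher limits, you don’t have to raise the defaults for everyone. Tempo can load per-tenant limits from a separate file via per_tenant_override_config; the file path and tenant ID below are placeholders:

```yaml
# tempo.yaml
overrides:
  per_tenant_override_config: /conf/overrides.yaml
```

```yaml
# /conf/overrides.yaml (placeholder path)
overrides:
  "noisy-tenant":                        # placeholder tenant ID
    ingestion_rate_limit_bytes: 30000000
    max_bytes_per_trace: 20000000
```

Tempo polls this file periodically, so per-tenant limits can typically be adjusted without a restart.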

The next bottleneck you’ll likely hit after optimizing ingestion is on the read path: queriers and the compactor working through the larger volume of newly written blocks in object storage.

Want structured learning?

Take the full Tempo course →