Tempo’s Parquet backend offers blazing-fast trace searches by trading write-side work for query efficiency: instead of storing each trace as a single row or blob, it stores trace data in a columnar format.
Let’s see it in action. Imagine you’ve got a distributed system, and a user reports a slow request. You need to find that specific request’s trace in Tempo and see what happened.
Here’s a trace, represented conceptually, that you might be looking for:
```json
{
  "traceId": "a1b2c3d4e5f67890",
  "spans": [
    {
      "traceId": "a1b2c3d4e5f67890",
      "spanId": "001",
      "parentSpanId": null,
      "operationName": "HTTP GET /users",
      "startTime": 1678886400000000000,
      "duration": 500000000,
      "tags": {
        "http.method": "GET",
        "http.status_code": 200,
        "http.url": "/users",
        "service.name": "user-service"
      },
      "logs": []
    },
    {
      "traceId": "a1b2c3d4e5f67890",
      "spanId": "002",
      "parentSpanId": "001",
      "operationName": "DB Query",
      "startTime": 1678886401000000000,
      "duration": 200000000,
      "tags": {
        "db.statement": "SELECT * FROM users WHERE id = ?",
        "service.name": "user-service"
      },
      "logs": []
    }
  ]
}
```
When Tempo stores this trace in Parquet, it doesn’t store it as a single JSON blob. Instead, it flattens and columnarizes this data. Think of it like a spreadsheet where each column represents a specific piece of information across many traces.
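To make the flattening idea concrete, here is a minimal, hypothetical Python sketch (not Tempo’s actual code) that turns a list of spans into per-field column lists, with nested tags flattened into dotted column names:

```python
# Hypothetical sketch: flatten a trace's spans into per-field column lists,
# mimicking in spirit how a columnar store lays out data. Not Tempo's real code.
from collections import defaultdict

def columnarize(spans):
    """Turn a list of span dicts into a dict of column-name -> list of values."""
    columns = defaultdict(list)
    for span in spans:
        for key, value in span.items():
            if key == "tags":  # flatten nested tags into dotted column names
                for tag_key, tag_value in value.items():
                    columns[f"tags.{tag_key}"].append(tag_value)
            else:
                columns[key].append(value)
    return dict(columns)

spans = [
    {"spanId": "001", "operationName": "HTTP GET /users",
     "tags": {"service.name": "user-service", "http.status_code": 200}},
    {"spanId": "002", "operationName": "DB Query",
     "tags": {"service.name": "user-service"}},
]

cols = columnarize(spans)
print(cols["operationName"])      # ['HTTP GET /users', 'DB Query']
print(cols["tags.service.name"])  # ['user-service', 'user-service']
```

Note that a real columnar format also has to handle ragged columns (here, `tags.http.status_code` has fewer entries than `spanId`), typically by storing nulls or definition levels; the sketch ignores that detail.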
For example, you might have columns like:
```
traceId:                a1b2c3d4e5f67890, anotherTraceId, yetAnotherTraceId, …
spanId:                 001, 002, 003, 004, …
operationName:          "HTTP GET /users", "DB Query", "HTTP POST /orders", …
service.name:           "user-service", "order-service", "payment-service", …
startTime:              1678886400000000000, 1678886401000000000, 1678886405000000000, …
duration:               500000000, 200000000, 1000000000, …
tags.http.status_code:  200, 500, 200, …
```
When you query Tempo, say for all traces from user-service that had a 500 status code, Tempo can efficiently scan only the service.name and tags.http.status_code columns. It doesn’t need to read the entire trace data for every trace. This is the core of columnar storage’s power for analytical queries.
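A toy Python sketch shows why this is cheap. Answering the query touches only the two predicate columns; every other column (and the data itself are made-up values for illustration) is never read:

```python
# Hypothetical sketch of a columnar scan: to answer "traces from user-service
# with a 500 status", we scan only two columns and never touch the rest.

columns = {
    "traceId":               ["t1", "t2", "t3"],
    "service.name":          ["user-service", "order-service", "user-service"],
    "tags.http.status_code": [200, 500, 500],
    "operationName":         ["HTTP GET /users", "HTTP POST /orders", "DB Query"],
    # ...many more columns that the query below never reads
}

def find_matches(cols):
    # Scan just the two predicate columns; collect row indices that match.
    hits = [
        i for i, (svc, status)
        in enumerate(zip(cols["service.name"], cols["tags.http.status_code"]))
        if svc == "user-service" and status == 500
    ]
    # Only now fetch traceId values for the matching rows.
    return [cols["traceId"][i] for i in hits]

print(find_matches(columns))  # ['t3']
```

In a row-based layout, the same query would have to deserialize every full trace just to check two fields.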
The problem Tempo’s Parquet backend solves is the sheer scale of trace data. Storing billions of traces in a row-based format (like a traditional document database) would make searching for specific traces incredibly slow and resource-intensive. By organizing data into columns, Tempo can quickly identify and retrieve only the relevant data blocks needed to satisfy a query.
The internal workings involve breaking down traces into spans, and then flattening these spans into a tabular structure. This structure is then written to disk as Parquet files. Tempo manages these files, often organized by time chunks and trace IDs, in object storage like S3 or GCS. When a query comes in, Tempo’s query engine uses these file indices and the columnar nature of Parquet to prune away irrelevant data. It asks the object store for specific "row groups" or even specific columns within those row groups that match the query predicates.
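The row-group pruning step can be sketched as follows. Parquet stores min/max statistics per row group, so a reader can skip any group whose value range cannot satisfy the predicate; the groups and timestamps below are invented for illustration:

```python
# Hypothetical sketch of row-group pruning via min/max statistics.
# A time-range query only reads row groups whose range overlaps the window.

row_groups = [
    {"min_start": 100, "max_start": 199, "rows": ["span-a", "span-b"]},
    {"min_start": 200, "max_start": 299, "rows": ["span-c"]},
    {"min_start": 300, "max_start": 399, "rows": ["span-d", "span-e"]},
]

def groups_to_read(groups, t_from, t_to):
    """Keep only row groups whose [min, max] range overlaps the query window."""
    return [g for g in groups
            if g["max_start"] >= t_from and g["min_start"] <= t_to]

selected = groups_to_read(row_groups, 250, 350)
print([g["rows"] for g in selected])  # the first group is pruned away
```

The real reader does this against statistics stored in the Parquet footer, which means pruning decisions cost only small metadata reads against the object store.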
You control Tempo’s Parquet backend primarily through its configuration. Key settings include the object storage bucket and path where Parquet files are stored, retention policies for how long data is kept, and the sampling rate if you’re using it. The system automatically handles the creation and management of Parquet files, including compaction and garbage collection, based on these configurations.
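As a rough illustration, a storage configuration along these lines wires the backend to S3 and sets retention (field names are based on Tempo’s documented config structure; verify them against the docs for your Tempo version before use):

```yaml
# Illustrative Tempo configuration fragment -- check field names against
# the Tempo documentation for your version.
storage:
  trace:
    backend: s3                  # object store: s3, gcs, azure, or local
    s3:
      bucket: my-tempo-traces
      endpoint: s3.us-east-1.amazonaws.com
compactor:
  compaction:
    block_retention: 336h        # how long to keep blocks (here, 14 days)
```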
What’s often overlooked is how Tempo internally manages the lifecycle of these Parquet files. It doesn’t just dump data; it actively manages file creation, potentially merging smaller files into larger ones for better read performance (compaction), and deleting old files according to retention policies. This background process is crucial for maintaining query efficiency and managing storage costs over time, ensuring that the "index" of your trace data remains lean and effective.
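The merging half of that lifecycle can be sketched in a few lines. The point of compaction is simply that many small blocks become few large ones, so later queries open fewer objects (this is a toy model, not Tempo’s compactor):

```python
# Hypothetical sketch of compaction: many small blocks of spans are merged
# into fewer, larger blocks so queries open fewer files. Not Tempo's code.

def compact(blocks, max_per_block=4):
    """Merge small span lists into blocks holding up to max_per_block spans."""
    merged, current = [], []
    for block in blocks:
        for span in block:
            current.append(span)
            if len(current) == max_per_block:
                merged.append(current)
                current = []
    if current:
        merged.append(current)
    return merged

small_blocks = [["s1"], ["s2", "s3"], ["s4"], ["s5"]]
print(compact(small_blocks))  # [['s1', 's2', 's3', 's4'], ['s5']]
```

Real compaction also rewrites the Parquet files themselves, which regenerates row-group statistics over the larger blocks and keeps pruning effective as data accumulates.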
The next step after optimizing your Parquet backend configuration is understanding how to tune your query patterns for maximum efficiency.