TimescaleDB’s parallel query execution doesn’t just speed up queries; it fundamentally changes how you reason about query performance by distributing work across multiple CPU cores within a single node.
Here’s a typical scenario: you’ve got a growing time-series dataset, and your analytical queries are starting to crawl. You know TimescaleDB is built on PostgreSQL, and you’ve heard about "parallel query," but you’re not sure if it’s on by default, how to enable it, or what knobs to turn.
Let’s see it in action. Imagine a `temperatures` hypertable holding millions of sensor readings.
```sql
-- Sample data generation (simplified for illustration)
CREATE TABLE temperatures (
    time        TIMESTAMPTZ NOT NULL,
    sensor_id   INT NOT NULL,
    temperature DECIMAL NOT NULL
);

-- Create a hypertable
SELECT create_hypertable('temperatures', 'time');

-- Add an index for faster lookups
CREATE INDEX ON temperatures (time DESC, sensor_id);

-- Populate with some data (e.g., 10 million rows)
INSERT INTO temperatures
SELECT
    NOW() - (random() * INTERVAL '30 days'),
    (random() * 1000)::INT,
    (random() * 50)::DECIMAL
FROM generate_series(1, 10000000);
```
Now, a common analytical query: finding the average temperature per sensor over the last day.
```sql
SELECT
    sensor_id,
    AVG(temperature) AS avg_temp
FROM temperatures
WHERE time >= NOW() - INTERVAL '1 day'
GROUP BY sensor_id;
```
Without parallel query, this might be a single-threaded operation, maxing out one CPU core. With parallel query enabled and tuned, PostgreSQL (and by extension, TimescaleDB) can break this query down and assign different parts of the data scan and aggregation to different CPU cores on the same database server. You’ll see the query plan change, often showing "Gather" or "Gather Merge" nodes, indicating that results from multiple worker processes are being combined.
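For the aggregation above, a parallel plan typically has a shape like the following excerpt (illustrative only; costs, row counts, and the planned worker count will differ on your hardware and data):

```
Finalize HashAggregate
  Group Key: sensor_id
  ->  Gather
        Workers Planned: 2
        ->  Partial HashAggregate
              Group Key: sensor_id
              ->  Parallel Seq Scan on temperatures
                    Filter: ("time" >= (now() - '1 day'::interval))
```

Each worker produces a partial aggregate over its share of the rows; the leader gathers those partial results and finalizes the grouping.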
The problem this solves is inherent to single-threaded processing: no matter how many cores your server has, a sequential query can only run as fast as one of them. Parallel query allows a single database node to leverage multiple cores, effectively multiplying its processing power for certain types of queries.
Internally, when a query can be parallelized, PostgreSQL spawns multiple "worker" processes. These workers execute a portion of the query plan, often scanning and processing different data blocks or partitions. A "leader" process then coordinates these workers, collecting their intermediate results and merging them into the final output. This is particularly effective for "data-parallel" operations like table scans, index scans, aggregations, and joins where the work can be easily divided. TimescaleDB’s partitioning (hypertables) can sometimes enhance this by allowing parallel workers to focus on specific chunks, although the primary mechanism is PostgreSQL’s internal parallel execution.
The primary lever you control is max_parallel_workers_per_gather. This setting determines the maximum number of worker processes that can be spawned for a single query that requires parallel execution.
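You can inspect the current value of this and the other parallel-related settings directly from SQL:

```sql
-- List every parallel-related setting and its current value
SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE '%parallel%'
ORDER BY name;

-- Or check a single parameter
SHOW max_parallel_workers_per_gather;
```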
Enabling and tuning parallel query happens through standard PostgreSQL configuration parameters, which TimescaleDB inherits.
- Enable Parallel Query: Parallel query is generally enabled by default in modern PostgreSQL versions (which TimescaleDB uses), but its effectiveness is governed by several parameters. The most crucial ones are:

  - `max_parallel_workers_per_gather`: the maximum number of worker processes that can be started by a single parallel operation (a `Gather` or `Gather Merge` node).
  - `max_worker_processes`: the total number of background processes the system can run, including parallel workers, autovacuum workers, etc. Parallel workers are drawn from this pool, so `max_parallel_workers_per_gather` cannot usefully exceed `max_worker_processes`.

  To set these, edit `postgresql.conf` or use `ALTER SYSTEM`. A common starting point on a multi-core machine (e.g., 8 cores) is:

  ```sql
  -- Example for a server with 8 CPU cores
  ALTER SYSTEM SET max_parallel_workers_per_gather = 4;  -- Use half your cores for parallel workers
  ALTER SYSTEM SET max_worker_processes = 8;             -- Allow enough for parallel workers + others
  -- Reload configuration for changes to take effect
  SELECT pg_reload_conf();
  ```

  Note that `max_worker_processes` can only be changed with a server restart; `pg_reload_conf()` is sufficient for the other parameters discussed here.

  Why this works: `max_parallel_workers_per_gather` directly dictates how many parallel workers can assist a single query. Setting it to half the available cores is a common heuristic that leaves capacity for the OS and other PostgreSQL processes, preventing overall system contention. `max_worker_processes` must be large enough to accommodate the sum of all potential parallel workers across all active queries plus other essential background processes.
- Tune `parallel_setup_cost` and `parallel_tuple_cost`: These parameters influence the planner’s decision on whether to use parallel execution. The planner estimates the cost of starting parallel workers (`parallel_setup_cost`) and the cost of passing each tuple from a worker to the leader (`parallel_tuple_cost`). If the estimated cost of the parallel plan is higher than the sequential one, the planner opts for the sequential plan.

  ```sql
  -- Example tuning (adjust based on your query patterns)
  ALTER SYSTEM SET parallel_setup_cost = 500;   -- Default is 1000
  ALTER SYSTEM SET parallel_tuple_cost = 0.05;  -- Default is 0.1
  SELECT pg_reload_conf();
  ```

  Why this works: lowering `parallel_setup_cost` or `parallel_tuple_cost` reduces the estimated "cost penalty" of parallelization, making the planner more likely to choose a parallel plan. You might need to experiment here; very low values can lead to parallel plans being chosen for queries that would be faster sequentially.
- Tune `min_parallel_table_scan_size` and `min_parallel_index_scan_size`: These settings prevent the planner from using parallel scans for very small tables or indexes, where the overhead of starting workers would outweigh the benefits.

  ```sql
  -- Example tuning (adjust based on your typical data distribution)
  ALTER SYSTEM SET min_parallel_table_scan_size = '16MB';  -- Default is 8MB
  ALTER SYSTEM SET min_parallel_index_scan_size = '1MB';   -- Default is 512kB
  SELECT pg_reload_conf();
  ```

  Why this works: these parameters act as thresholds. If a table or index scan is expected to read less data than these values, the planner will likely stick to sequential execution. Raising them means the planner only considers parallel scans for larger amounts of data, reserving parallelization for genuinely large operations.
- Consider `force_parallel_mode` (for testing/debugging): This parameter forces a `Gather` node on top of any plan that is parallel-safe, which is useful for verifying whether a query can run in parallel at all. (In PostgreSQL 16 it was renamed to `debug_parallel_query`.)

  ```sql
  -- To test whether a query *can* run in parallel (even if the planner wouldn't choose it)
  SET force_parallel_mode = on;
  -- Restore normal cost-based planning for comparison
  SET force_parallel_mode = off;
  ```

  Why this works: this is a direct override of the planner’s cost-based logic, intended purely for testing. `on` helps you isolate whether a query is parallel-safe at all; `off` (the default) restores normal behavior so you can establish a baseline.
- Monitor `pg_stat_activity` and `EXPLAIN (ANALYZE, BUFFERS)`: After making changes, observe your queries. `EXPLAIN (ANALYZE, BUFFERS)` is your best friend. Look for `Gather` or `Gather Merge` nodes in the plan; the output will also show the actual time spent and indicate whether parallel workers were used.

  ```sql
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT
      sensor_id,
      AVG(temperature) AS avg_temp
  FROM temperatures
  WHERE time >= NOW() - INTERVAL '1 day'
  GROUP BY sensor_id;
  ```

  Why this works: `EXPLAIN ANALYZE` shows you the actual execution plan and timing. If you see `Gather` nodes, the query finishes faster than before, and `pg_stat_activity` shows extra worker processes for your query, you’re seeing parallel query in action. If you don’t see `Gather` nodes, or the query is slower, your settings might need further adjustment, or the query might not be amenable to parallelization.
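To confirm that workers are actually running, query `pg_stat_activity` from a second session while the query executes; parallel workers appear with their own `backend_type` (available since PostgreSQL 10):

```sql
-- Run from a second session while the parallel query is active
SELECT pid, backend_type, state, LEFT(query, 60) AS query_snippet
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';
```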
The one thing most people don’t realize about parallel query is that it’s not just for massive tables; it can significantly benefit queries that scan large portions of smaller tables if those scans are the dominant cost, provided the overheads are managed. The parallel_setup_cost and parallel_tuple_cost are the primary levers that tell the planner when that overhead is worth it.
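Because these are ordinary planner settings, you can experiment at the session level before committing anything with `ALTER SYSTEM` — a low-risk way to check whether a borderline query benefits:

```sql
-- Try cheaper parallel costs for this session only
SET parallel_setup_cost = 100;
SET parallel_tuple_cost = 0.01;

EXPLAIN
SELECT sensor_id, AVG(temperature) AS avg_temp
FROM temperatures
WHERE time >= NOW() - INTERVAL '1 day'
GROUP BY sensor_id;

-- Revert this session to the server-wide values
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```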
Once you have parallel query working effectively, the next step is often understanding how TimescaleDB’s chunking interacts with PostgreSQL’s parallel execution, and how to optimize queries that span many chunks.
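As a first step in that direction, you can check which chunks a time predicate actually touches with TimescaleDB’s `show_chunks` (the named arguments below are from the TimescaleDB 2.x API):

```sql
-- Chunks that may contain rows from the last day
SELECT show_chunks('temperatures', newer_than => NOW() - INTERVAL '1 day');
```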