TimescaleDB secondary partitioning doesn’t actually create more partitions; it reorders existing ones for faster data access.

Let’s see it in action. Imagine a massive table sensor_data storing temperature readings from thousands of IoT devices. We’ve already partitioned it by time, which is standard.

CREATE TABLE sensor_data (
    time TIMESTAMPTZ NOT NULL,
    device_id INT NOT NULL,
    temperature DOUBLE PRECISION
);

SELECT create_hypertable('sensor_data', 'time');

Now, we frequently query sensor_data filtering by device_id within a specific time range. Without secondary partitioning, TimescaleDB might have to scan through many time chunks to find data for a single device_id.

-- This query can be slow on a large, time-partitioned table
SELECT AVG(temperature)
FROM sensor_data
WHERE device_id = 123
  AND time >= '2023-10-01'
  AND time < '2023-10-02';

This is where secondary partitioning by space comes in. We add a segmenting key that TimescaleDB uses within each existing time chunk. This doesn’t add new chunks; it groups rows into per-key segments inside each chunk, which behaves much like an index.

-- Add secondary partitioning by device_id
ALTER TABLE sensor_data
SET (timescaledb.compress, timescaledb.compress_segmentby = 'device_id');

-- Reorganize existing chunks to apply the new segmenting key:
-- compressing a chunk physically regroups its rows by device_id
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('sensor_data') AS c;

-- Optionally, add a policy that compresses new chunks automatically
-- once they are a week old
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

When we set timescaledb.compress_segmentby = 'device_id', TimescaleDB groups rows belonging to the same device_id together as it compresses each time chunk: every compressed batch holds data for exactly one device. Compression does shrink storage, but for our query the important effect is data locality.
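You can verify how a hypertable is segmented by querying TimescaleDB’s informational catalog (a sketch, assuming a recent 2.x release where the timescaledb_information.compression_settings view is available):

-- Which columns segment and order the compressed data?
SELECT attname, segmentby_column_index, orderby_column_index
FROM timescaledb_information.compression_settings
WHERE hypertable_name = 'sensor_data';

Columns with a non-null segmentby_column_index are the segmenting keys; here device_id should appear with index 1.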

Now, that same query is dramatically faster:

-- This query is now much faster
SELECT AVG(temperature)
FROM sensor_data
WHERE device_id = 123
  AND time >= '2023-10-01'
  AND time < '2023-10-02';

TimescaleDB can now efficiently locate the data for device_id = 123 within the relevant time chunk(s) without scanning unrelated segments. It’s like having an index on device_id, but applied within each time partition. Compressing the existing chunks is the crucial step, because that is what physically reorganizes the data inside each chunk according to the new segmentby key.
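You can confirm the effect yourself with a standard EXPLAIN (a sketch; exact plan nodes vary by TimescaleDB version):

EXPLAIN (ANALYZE, BUFFERS)
SELECT AVG(temperature)
FROM sensor_data
WHERE device_id = 123
  AND time >= '2023-10-01'
  AND time < '2023-10-02';
-- After compression with segmentby, the plan scans the compressed chunk
-- and filters on device_id at the segment level, touching far fewer
-- pages than a full scan of the uncompressed chunk would.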

The compress_segmentby option is the mechanism for secondary partitioning by space. It tells TimescaleDB how to group data within each time chunk. You can specify multiple columns for segmentby if you often query by combinations of keys.
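For example, if your dashboards filter by both device and region (region is a hypothetical column, not part of the schema above), you might segment by both and order rows by time within each segment using the companion compress_orderby option:

-- Segment by two keys; within each segment, store rows newest-first
ALTER TABLE sensor_data
SET (timescaledb.compress,
     timescaledb.compress_segmentby = 'device_id, region',
     timescaledb.compress_orderby = 'time DESC');

Only chunks compressed after this change pick up the new layout; previously compressed chunks keep their old organization until recompressed.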

The most surprising aspect is that this reordering is tied to the compression mechanism: segmentby only takes effect once a chunk is actually compressed. Compressed chunks store one batch of columnar data per segmentby value, along with min/max metadata, so queries can skip every segment that doesn’t match the filter. In practice, the locality benefit and the storage savings arrive together.

The next step is understanding how to choose the right columns for compress_segmentby based on your query patterns.
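A rough heuristic to start with: a good segmentby column should leave enough rows per value to fill full compressed batches. A quick cardinality check (a sketch):

-- How many rows does each candidate segmentby value hold, on average?
SELECT COUNT(DISTINCT device_id) AS devices,
       COUNT(*) / GREATEST(COUNT(DISTINCT device_id), 1) AS avg_rows_per_device
FROM sensor_data;
-- Aim for many rows (ideally thousands) per segmentby value; very
-- high-cardinality keys leave segments too small to compress well.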

Want structured learning?

Take the full TimescaleDB course →