TimescaleDB’s tiered storage lets you move older, less frequently accessed data to cheaper object storage, drastically cutting costs while keeping that data fully queryable (reads from the object tier are transparent, though slower than local disk).
Let’s see it in action. Imagine you have a massive time-series dataset, say, IoT sensor readings, and you want to keep the last year of data on fast, local disk (for active queries) but archive everything older to S3 for long-term retention.
Here’s a hypothetical iot_sensor_readings hypertable:
CREATE TABLE iot_sensor_readings (
time TIMESTAMPTZ NOT NULL,
device_id INT NOT NULL,
temperature DOUBLE PRECISION,
humidity DOUBLE PRECISION
);
SELECT create_hypertable('iot_sensor_readings', 'time');
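By default, create_hypertable partitions data into 7-day chunks. Since tiering operates at chunk granularity, you may want finer-grained chunks; the interval can be set explicitly (a 1-day interval is shown here as an example):

```sql
-- Optional: use 1-day chunks so tiering decisions apply at daily granularity.
SELECT create_hypertable(
    'iot_sensor_readings',
    'time',
    chunk_time_interval => INTERVAL '1 day'
);
```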
We’ll simulate some data and then set up tiered storage.
First, let’s add some data and observe its location. We’ll create a few chunks.
-- Add some data
INSERT INTO iot_sensor_readings VALUES
(NOW() - INTERVAL '2 days', 101, 22.5, 45.2),
(NOW() - INTERVAL '2 days', 102, 23.1, 44.8),
(NOW() - INTERVAL '1 day', 101, 22.7, 45.5),
(NOW() - INTERVAL '1 day', 103, 21.9, 46.1),
(NOW() - INTERVAL '10 days', 101, 20.1, 50.3),
(NOW() - INTERVAL '10 days', 102, 20.5, 49.8);
-- This query shows chunk information.
-- Initially, all data resides on 'main' (local) storage.
SELECT
    chunk_name,
    chunk_tablespace,
    range_start,
    range_end
FROM
    timescaledb_information.chunks
WHERE
    hypertable_name = 'iot_sensor_readings'
ORDER BY
    range_start DESC;
You’d see output like:
   chunk_name     | chunk_tablespace |     range_start     |      range_end
------------------+------------------+---------------------+---------------------
 chunk_1_1234567  | main             | 2023-10-27 10:00:00 | 2023-10-28 10:00:00
 chunk_1_7890123  | main             | 2023-10-17 10:00:00 | 2023-10-27 10:00:00
Now, let’s set up object storage. We’ll use AWS S3 as an example. You’ll need an S3 bucket and credentials configured for your PostgreSQL server.
-- Create an object store access method for S3.
-- (Illustrative DDL: the exact syntax for registering object storage
-- depends on your TimescaleDB version and deployment.)
CREATE ACCESS METHOD s3_bucket TYPE STORAGE HANDLER 's3';
-- Configure the S3 connection details.
-- Replace 'your-bucket-name' and the region with your actual values.
-- Ensure your PostgreSQL instance has IAM permissions or credentials configured
-- to access this S3 bucket.
ALTER ACCESS METHOD s3_bucket SET (
    bucket 'your-bucket-name',
    region 'us-east-1'
);
Next, we create a tablespace that points to this object store (again, illustrative DDL):
-- Create a tablespace named 's3_storage' linked to our S3 access method.
CREATE TABLESPACE s3_storage
OWNER postgres
FOREIGN DATA WRAPPER s3_bucket;
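You can confirm the tablespace was registered by querying the standard PostgreSQL catalog:

```sql
-- List all tablespaces known to the server, with their owners.
SELECT spcname, spcowner::regrole AS owner
FROM pg_tablespace;
```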
Now, we can define a data tiering policy. This policy tells TimescaleDB which data to move and where. We want to move data older than 30 days to our s3_storage tablespace.
-- Define the data tiering policy for iot_sensor_readings (illustrative syntax).
-- 'time' is the time column of the hypertable.
-- '30 days' is the threshold: data older than this will be considered for moving.
-- 's3_storage' is the target tablespace for older data.
ALTER TABLE iot_sensor_readings
SET (
timescaledb.compress = true, -- Compression is often a prerequisite or highly recommended
timescaledb.compress_segmentby = 'device_id',
timescaledb.compress_chunk_time_interval = '7 days',
timescaledb.data_labeling = true,
timescaledb.policy = jsonb_build_object(
'data_tiers', jsonb_build_array(
jsonb_build_object(
'name', 'main',
'priority', 100,
'location', 'pg_default', -- Or your preferred local tablespace
'retention_period', '30 days'
),
jsonb_build_object(
'name', 's3_tier',
'priority', 50,
'location', 's3_storage',
'retention_period', '0 days' -- '0 days' means keep indefinitely once moved
)
)
)
);
The timescaledb.policy setting is where the magic happens. It defines data_tiers. Each tier has a name, a priority (higher tiers are considered first), a location (tablespace), and a retention_period.
- The main tier holds active data, lives on pg_default (your local disk), and keeps data for 30 days.
- The s3_tier holds archived data, lives on s3_storage (your object store), and keeps data indefinitely ('0 days' retention means "keep forever" once data reaches this tier).
TimescaleDB will automatically start moving data that falls outside the retention_period of the higher-priority tiers to the next available tier. In this case, data older than 30 days will be moved from main to s3_storage.
You can manually trigger a data rebalancing (which includes moving data between tiers) if you want to see the effect immediately, though it’s usually an automated background process.
-- Manually trigger a rebalance operation.
-- (Illustrative; tier movement normally runs as a scheduled background job.)
SELECT rebalance_chunk_data('iot_sensor_readings');
After the background process or manual rebalance completes, querying the chunks again will show the updated tablespace for older data:
SELECT
    chunk_name,
    chunk_tablespace,
    range_start,
    range_end
FROM
    timescaledb_information.chunks
WHERE
    hypertable_name = 'iot_sensor_readings'
ORDER BY
    range_start DESC;
You would now see something like:
   chunk_name     | chunk_tablespace |     range_start     |      range_end
------------------+------------------+---------------------+---------------------
 chunk_1_1234567  | main             | 2023-10-27 10:00:00 | 2023-10-28 10:00:00
 chunk_1_7890123  | s3_storage       | 2023-10-17 10:00:00 | 2023-10-27 10:00:00
Notice how chunk_1_7890123, which contains data older than 30 days relative to the current time, has been moved to s3_storage. Queries that hit this chunk will automatically fetch data from S3, transparently to the user.
The most surprising thing about this is how seamlessly it integrates. You don’t rewrite queries, and you don’t need to manage separate databases or ETL pipelines for archival. TimescaleDB handles the data movement and retrieval internally: when you query the iot_sensor_readings table, the query planner identifies which chunks are needed and, if a chunk lives in s3_storage, fetches it on demand.
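For example, a query that spans both tiers needs no special syntax. This hypothetical 90-day aggregate would read recent chunks from local disk and older chunks from S3 in a single statement:

```sql
-- Aggregates across local and S3-backed chunks transparently.
SELECT device_id,
       avg(temperature) AS avg_temp,
       avg(humidity)    AS avg_humidity
FROM iot_sensor_readings
WHERE time > NOW() - INTERVAL '90 days'
GROUP BY device_id
ORDER BY device_id;
```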
When data is moved to object storage, TimescaleDB doesn’t just copy the raw data. It first compresses the chunk and then uploads the compressed data. This significantly reduces the amount of data stored in object storage, further lowering costs. The compression settings defined via timescaledb.compress and timescaledb.compress_chunk_time_interval apply before data is moved to object storage.
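You can see how much space compression saves before chunks are tiered by using TimescaleDB’s standard chunk_compression_stats function:

```sql
-- Compare per-chunk sizes before and after compression.
SELECT chunk_name,
       before_compression_total_bytes,
       after_compression_total_bytes
FROM chunk_compression_stats('iot_sensor_readings');
```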
The retention_period in the policy applies to the data within the chunk, not the chunk itself. So, if a chunk contains data spanning 40 days, and the retention period for the main tier is 30 days, TimescaleDB will automatically split that chunk. The newer 30 days of data will remain in the main tier, and the older 10 days of data will be moved to the s3_storage tier. This granular splitting is key to maintaining performance.
The next step in optimizing storage is often to configure automatic compression for your hypertable, which works hand-in-hand with tiered storage.
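As a sketch, a compression policy that compresses chunks once their data is seven days old looks like this (add_compression_policy is part of TimescaleDB’s standard API; it requires timescaledb.compress = true, which we set above):

```sql
-- Automatically compress chunks whose data is older than 7 days.
SELECT add_compression_policy('iot_sensor_readings', INTERVAL '7 days');
```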