TimescaleDB job scheduling isn’t about running arbitrary scripts; it’s about orchestrating background workers that perform maintenance and analytical tasks on your time-series data.

Let’s see it in action. Imagine you have a massive hypertable, metrics_data, and you want to regularly compress older chunks to save space and improve query performance. You can add a compression policy, which schedules a background job that calls compress_chunk on eligible chunks.

-- First, ensure you have the TimescaleDB extension enabled
-- CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

-- Enable compression on the hypertable before defining the policy
ALTER TABLE metrics_data SET (timescaledb.compress);

-- Define the policy: compress chunks older than 7 days
SELECT add_compression_policy('metrics_data', INTERVAL '7 days');

-- Compression policies manage their own jobs; for custom work, use `add_job` directly.
-- A custom action must be a function or procedure accepting (job_id INT, config JSONB).
-- Example: a job that runs a custom analysis procedure every hour
CREATE OR REPLACE PROCEDURE analyze_recent_data(job_id INT, config JSONB)
LANGUAGE plpgsql
AS $$
DECLARE
  recent_count BIGINT;
BEGIN
  -- Simulate some analysis, e.g., counting records from the last hour
  SELECT count(*) INTO recent_count
  FROM metrics_data
  WHERE time >= NOW() - INTERVAL '1 hour';
  RAISE NOTICE 'Analyzed % rows from the last hour', recent_count;
END;
$$;

SELECT add_job('analyze_recent_data', '1 hour');

The add_job function is your primary interface. It takes the name of a function or procedure (one that accepts a job id and a JSONB config) and a schedule interval. TimescaleDB then manages a background worker that executes it on that schedule. The add_compression_policy and add_retention_policy functions are convenience wrappers around this machinery that set up jobs for common maintenance tasks.
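A retention policy, for example, is just such a scheduled job under the hood. A short sketch, using the same metrics_data hypertable:

-- Drop chunks older than 90 days; returns the id of the job it creates
SELECT add_retention_policy('metrics_data', INTERVAL '90 days');

-- Removing the policy also deletes its job
SELECT remove_retention_policy('metrics_data');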

The core problem this solves is offloading repetitive, resource-intensive tasks from your application’s main connection pool to dedicated background workers. This prevents your application queries from being blocked by long-running maintenance operations and ensures that data management happens reliably without manual intervention.

Internally, TimescaleDB’s job scheduler stores job definitions in a catalog table (_timescaledb_config.bgw_job) and per-job execution statistics, including last run times, in _timescaledb_internal.bgw_job_stat. When a background worker starts, it polls for jobs that are due, and a job’s last-run information is updated after each execution. Rather than querying these internal tables directly, use the timescaledb_information.jobs and timescaledb_information.job_stats views to see your scheduled jobs and their status.

-- View all scheduled jobs
SELECT * FROM timescaledb_information.jobs;

-- View compression policy jobs specifically
SELECT * FROM timescaledb_information.jobs WHERE proc_name = 'policy_compression';

-- Check run history: last run status, total runs, total failures
SELECT * FROM timescaledb_information.job_stats;

The schedule_interval parameter for add_job is a standard PostgreSQL INTERVAL, such as '1 hour', '1 day', or '15 minutes'; cron expressions are not supported. By default, a job’s next run is scheduled relative to when the previous run finished. Recent versions add the fixed_schedule and initial_start parameters, which let you align execution to predictable times, such as the top of every hour.
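A sketch of a fixed schedule for the analyze_recent_data action (this assumes TimescaleDB 2.9 or later, which introduced these parameters):

-- Run at the top of every hour, anchored to the start of the next day
SELECT add_job(
  'analyze_recent_data',
  schedule_interval => INTERVAL '1 hour',
  initial_start     => date_trunc('day', now()) + INTERVAL '1 day',
  fixed_schedule    => true
);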

The true power lies in defining your own functions that interact with your data. This could be anything from calculating rolling averages, identifying anomalies, pruning old data beyond retention policies, or triggering external alerts based on data conditions. The key is that these actions must be written as functions or procedures in one of PostgreSQL’s procedural languages (like PL/pgSQL) and accept two arguments: the job’s id (INT) and its config (JSONB). The config payload lets you parameterize a job without redefining the underlying function.
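One pattern this enables is a config-driven job. In this sketch, the prune_stale_rows procedure and its max_age config key are hypothetical names:

CREATE OR REPLACE PROCEDURE prune_stale_rows(job_id INT, config JSONB)
LANGUAGE plpgsql
AS $$
DECLARE
  -- Read the retention window from the job's config payload
  max_age INTERVAL := (config ->> 'max_age')::INTERVAL;
BEGIN
  DELETE FROM metrics_data WHERE time < NOW() - max_age;
END;
$$;

-- The config passed here arrives as the procedure's second argument
SELECT add_job('prune_stale_rows', '1 day', config => '{"max_age": "30 days"}');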

One aspect often overlooked is how job execution is managed under failure. TimescaleDB’s background workers are designed to be resilient: if a job fails or its worker crashes, the scheduler records the failure and retries the job, backing off between attempts rather than silently skipping to the next scheduled run. You can also configure how many background workers are available with the timescaledb.max_background_workers setting, which caps how many jobs can run concurrently and lets you scale job execution capacity.
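A minimal postgresql.conf sketch for this (the values are illustrative, and both settings take effect only after a restart):

# Load the extension's background worker framework at startup
shared_preload_libraries = 'timescaledb'

# Scheduler plus job workers; size this to your expected concurrent jobs
timescaledb.max_background_workers = 16

# PostgreSQL's global worker pool must leave headroom for these workers
max_worker_processes = 32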

The next logical step is to understand how to monitor the actual execution of these jobs and troubleshoot when they don’t run as expected, especially in distributed or high-availability setups.

Want structured learning?

Take the full TimescaleDB course →