The timescaledb extension
Work with time-series data in Postgres with the timescaledb extension
timescaledb
enables the efficient storage and retrieval of time-series data. Time-series data is a sequential collection of observations or measurements recorded over time. For example, IoT devices continuously generate data points with timestamps, representing measurements or events. timescaledb
is designed to handle large volumes of time-stamped data and provides SQL capabilities on top of a time-oriented data model such as IoT data, sensor readings, financial market data, and other time-series datasets.
This guide provides an introduction to the timescaledb
extension. You’ll learn how to enable the extension in Neon, create hypertables, run simple queries, and analyze data using timescaledb
functions. Finally, you’ll see how to delete data to free up space.
note
timescaledb
is an open-source extension for Postgres that can be installed on any Neon Project using the instructions below.
Version availability:
The version of timescaledb
available on Neon depends on the version of Postgres you select for your Neon project.
- Postgres 14 -
timescaledb
2.10.1 - Postgres 15 -
timescaledb
2.10.1 - Postgres 16 -
timescaledb
2.13.0 - Postgres 17 - not yet available
Only Apache-2 licensed features are supported. Compression is not supported.
timescaledb
extension
Enable the You can enable the extension by running the following CREATE EXTENSION
statement in the Neon SQL Editor or from a client such as psql
that is connected to Neon.
For information about using the Neon SQL Editor, see Query with Neon's SQL Editor. For information about using the psql
client with Neon, see Connect with psql.
Create a hypertable
timescaledb
hypertables are a high-level abstraction, extending traditional Postgres tables to manage temporal data more effectively. A hypertable simplifies the organization and retrieval of time-series information by providing built-in partitioning based on time intervals.
To begin with, create a SQL table for temperature data:
Convert it to a hypertable using the create_hypertable
function:
You should receive the following output:
It is possible to use both standard SQL commands and timescaledb
functions (which will be covered later).
To use an SQL query to insert data in the weather_conditions
table:
To retrieve the data by time in descending order:
You should receive the following output:
Load weather data
You can use the sample weather dataset from TimescaleDB and load it into your Neon database using psql.
Download the weather data:
Load the data into Neon database - enter the username, password, host and database name. You can find these details in the Connection Details widget on the Neon Dashboard.
You should receive the following output:
Use hyperfunctions to analyze data
You can now start using timescaledb
functions to analyze the data.
Get the first temperature reading for each location:
The aggregate function first
was used to get the earliest temperature
value based on time
within an aggregate group.
You should receive the following output:
Get the latest temperature reading for each location:
The aggregate function last
was used to get the latest temperature
value based on time
within an aggregate group.
You should receive the following output:
Calculate the average temperature per hour for a specific device:
The query uses the time_bucket
hyperfunction to group timestamps into one-hour intervals, calculating the average temperature for each interval from the table for a specific device, and then displays the results for the top 10 intervals.
You should receive the following output:
Bucket device humidity data:
Here, we use the histogram
function to create a distribution of humidity values within specified buckets (40
to 60
with a size of 5
) for each device_id
.
You should receive the following output:
Use the approximate_row_count
function to get the approximate number of rows in weather_conditions
hypertable:
You should receive the following output:
Working with chunks
Chunks are fundamental storage units within hypertables. Instead of storing the entire time-series dataset as a single monolithic table, timescaledb
breaks it down into smaller, manageable chunks. Each chunk represents a distinct time interval, making data retrieval and maintenance more efficient.
The show_chunks
function can be used to understand the underlying structure and organization of your time-series data and provides insights into how your hypertable is partitioned.
You should receive the following output:
show_chunks
output indicates the presence of two internal chunks within your hypertable. To show detailed chunks information:
You should receive the following output:
You can use the drop_chunks
function to remove data chunks whose time range falls completely before (or after) a specified time.
It returns a list of the chunks that were dropped.
You should receive the following output:
Data deletion
You may run into space concerns as data accumulates in timescaledb hypertables. While Neon's Postgres service does not support compression, deleting old data is an option if you don't need to hold on to it for long periods of time.
You can use the drop_chunks
function outlined above to easily delete outdated chunks from a hypertable. For example, to delete all chunks older than 3 months:
The query deletes any chunks that contain only data older than 3 months.
To automatically run this deletion periodically, you can setup a cron task. For example, adding this line to the crontab will run the deletion query every day at 1AM:
note
Please be aware that Neon's Autosuspend feature may affect the running of scheduled jobs. It may be necessary to start the compute before running a job.
This will help ensure the hypertable size is managed by deleting old unneeded data. Tune the interval passed to drop_chunks and the cron schedule based on your data retention needs.
Conclusion
You were able to configure the timescaledb extension in Neon and create a hypertable to store weather
data. Then you executed simple queries and analyzed data using a combination of standard SQL and timescaledb
functions before finally using drop_chunks()
to delete data.