PostgreSQL generate_series for generating time series

Ever needed to aggregate data per time period where you didn’t have data points in every part of the period? I did when building Callcounter. I wanted to generate a chart of requests per hour, where I didn’t have requests in every hour. Using a naive PostgreSQL query I ended up with results such as this:

=> SELECT date_trunc('hour', created_at),
-> COUNT(*) FROM events GROUP BY 1 LIMIT 10;

     date_trunc      | count 
---------------------+-------
 2021-03-22 10:00:00 |    32
 2021-03-22 11:00:00 |    92
 2021-03-22 13:00:00 |   213
 2021-03-22 14:00:00 |   237
 2021-03-22 16:00:00 |   331
 2021-03-22 17:00:00 |   389
 2021-03-22 18:00:00 |   452
 2021-03-22 19:00:00 |   493
 2021-03-22 20:00:00 |   273
 2021-03-22 21:00:00 |   152
(10 rows)

Clearly there is no data for 12:00-13:59 and 15:00-15:59 in the database. Wouldn’t it be great if we can fix this at the database level? That’s where the PostgreSQL generate_series() function comes into play. We can use it to generate a consecutive range of items, which we can then join with our events table.

=> SELECT * FROM generate_series('2021-04-22 10:00'::timestamp,
-> '2021-04-22 21:00', '1 hours');

   generate_series   
---------------------
 2021-04-22 10:00:00
 2021-04-22 11:00:00
 2021-04-22 12:00:00
 2021-04-22 13:00:00
 2021-04-22 14:00:00
 2021-04-22 15:00:00
 2021-04-22 16:00:00
 2021-04-22 17:00:00
 2021-04-22 18:00:00
 2021-04-22 19:00:00
 2021-04-22 20:00:00
 2021-04-22 21:00:00
(12 rows)

Depending on the ORM you’re using you’ll need to find out how to join this with your events. Let me know on Twitter if you’re also interested in the ActiveRecord solution I chose.