Cumulative Totals with SQL Window Functions
October 8, 2012
I have built up a database over the years, and each table has a created
column with date time in. I wanted to produce a cumulative total of rows based on the date the row was added. I'm using PostgresSQL8.4.12
Too Many Rows
My first attempt at this produced a row for each row added, as the created
column is datetime with millisecond precision.
SELECT e.created, count(*) OVER(ORDER BY e.created) FROM contests_contestevent e ORDER BY 1
This produced results like this:
2009-02-21 19:08:21.328148+00 | 1 2009-02-21 19:22:08.69062+00 | 2 2009-02-21 20:15:15.159401+00 | 3 2009-02-21 23:08:44.897416+00 | 4 2009-02-21 23:31:29.177185+00 | 5 2009-02-22 22:28:39.774899+00 | 6 2009-02-22 22:43:44.090386+00 | 7 2009-02-24 21:56:26.525112+00 | 8 2009-02-24 22:31:25.508663+00 | 9 2009-02-26 21:42:41.255125+00 | 10
Group By Month
What I actually wanted was to group rows by month and year. This can be done with the following query:
SELECT DISTINCT ON (EXTRACT (year FROM e.created), EXTRACT (month FROM e.created)) EXTRACT (month FROM e.created), EXTRACT (year FROM e.created), COUNT(*) OVER(ORDER BY e.created) FROM contests_contestevent e ORDER BY EXTRACT (year FROM e.created), EXTRACT (month FROM e.created)
which produces the following results:
2 | 2009 | 1 3 | 2009 | 32 4 | 2009 | 198 5 | 2009 | 256 6 | 2009 | 614 7 | 2009 | 666 8 | 2009 | 801 9 | 2009 | 807 10 | 2009 | 1024 11 | 2009 | 1188 12 | 2009 | 1292 1 | 2010 | 1328 2 | 2010 | 1434 3 | 2010 | 1506 4 | 2010 | 1651 5 | 2010 | 1731 6 | 2010 | 1954 7 | 2010 | 2048 8 | 2010 | 2257 9 | 2010 | 2347