Cumulative Totals with SQL Window Functions

October 8, 2012

I have built up a database over the years, and each table has a created column with date time in. I wanted to produce a cumulative total of rows based on the date the row was added. I'm using PostgresSQL8.4.12

Too Many Rows

My first attempt at this produced a row for each row added, as the created column is datetime with millisecond precision.

SELECT e.created, count(*) OVER(ORDER BY e.created)
FROM contests_contestevent e
ORDER BY 1

This produced results like this:

 2009-02-21 19:08:21.328148+00 |     1
 2009-02-21 19:22:08.69062+00  |     2
 2009-02-21 20:15:15.159401+00 |     3
 2009-02-21 23:08:44.897416+00 |     4
 2009-02-21 23:31:29.177185+00 |     5
 2009-02-22 22:28:39.774899+00 |     6
 2009-02-22 22:43:44.090386+00 |     7
 2009-02-24 21:56:26.525112+00 |     8
 2009-02-24 22:31:25.508663+00 |     9
 2009-02-26 21:42:41.255125+00 |    10

Group By Month

What I actually wanted was to group rows by month and year. This can be done with the following query:

SELECT DISTINCT ON (EXTRACT (year FROM e.created), EXTRACT (month FROM e.created)) 
        EXTRACT (month FROM e.created), 
        EXTRACT (year FROM e.created), 
        COUNT(*) OVER(ORDER BY e.created)
FROM contests_contestevent e
ORDER BY EXTRACT (year FROM e.created), EXTRACT (month FROM e.created)

which produces the following results:

         2 |      2009 |     1
         3 |      2009 |    32
         4 |      2009 |   198
         5 |      2009 |   256
         6 |      2009 |   614
         7 |      2009 |   666
         8 |      2009 |   801
         9 |      2009 |   807
        10 |      2009 |  1024
        11 |      2009 |  1188
        12 |      2009 |  1292
         1 |      2010 |  1328
         2 |      2010 |  1434
         3 |      2010 |  1506
         4 |      2010 |  1651
         5 |      2010 |  1731
         6 |      2010 |  1954
         7 |      2010 |  2048
         8 |      2010 |  2257
         9 |      2010 |  2347

References