Django ORM Extra
May 12, 2010
Given a two model structure, BlogEntry and BlogComment, where there is one BlogEntry to many BlogComments, then there are three possible ways of getting hold of the comment count for a given entry, when displaying a list of entries.
Simple
The simple approach is to use the reverse links on the model, to get hold of the blogcomment_set
{{entry.blogcomment_set.count}} comments
The disadvantage of this approach is that is requires an extra select per entry in the list.
Django ORM
The django ORM approach is to use extra:
lBlogEntries = BlogEntry.objects.all().extra(select={'comment_count': 'SELECT count(*) FROM blog_blogcomment WHERE blog_blogcomment.blog_entry_id=blog_blogentry.id'},)
This adds extra fields into the select. You can now use:
{{entry.comment_count}} comments
from the template.
Raw SQL
The final (and not recommended) route is to use raw SQL:
from django.db import connection lCursor = connection.cursor() lCursor.execute("""SELECT (SELECT count(*) FROM blog_blogcomment WHERE blog_blogcomment.blog_entry_id=blog_blogentry.id) AS "comment_count", "blog_blogentry"."publication_date", "blog_blogentry"."title", "blog_blogentry"."slug" FROM "blog_blogentry" ORDER BY "blog_blogentry"."publication_date" DESC""") lRows = lCursor.fetchall() lBlogEntries = [ ] for row in lRows: lEachBlogEntry = BlogEntry() lEachBlogEntry.comment_count = row[0] lEachBlogEntry.publication_date = row[1] lEachBlogEntry.title = row[2] lEachBlogEntry.slug = row[3] lBlogEntries.append(lEachBlogEntry)
This makes maintenance difficult, but it does allow finegrained control over which fields are fetched.
Performance Improvements
I've just revisited some forum code I wrote when I first started with Django. I always knew it was slow, but never realised quite how slow until I fired it up using the Django debug toolbar. To show a list of forums, along with the number of posts, last time that forum was posted to, and whether there were unread posts or not: 1068 queries, 81958ms. Gulp. Maybe I should have started using the debug toolbar sooner!
Anyway, replacing the complex set of Django ORM selects with this:
lForums = Forum.on_site.all().extra(select={ 'topic_count': 'SELECT count(*) FROM forum_forumtopic WHERE forum_forumtopic.forum_id = forum_forum.id', 'post_count' : """SELECT count(*) FROM forum_topicpost WHERE forum_topicpost."forumTopic_id" IN (SELECT id FROM forum_forumtopic WHERE forum_forumtopic.forum_id = forum_forum.id)""", 'last_post_date' : """SELECT max(forum_topicpost.created) FROM forum_topicpost WHERE forum_topicpost."forumTopic_id" IN (SELECT id FROM forum_forumtopic WHERE forum_forumtopic.forum_id = forum_forum.id)""", 'unread_count' : """SELECT count(*) FROM forum_topicpost WHERE forum_topicpost."forumTopic_id" IN (SELECT id FROM forum_forumtopic WHERE forum_forumtopic.forum_id = forum_forum.id) AND forum_topicpost.id NOT IN (SELECT post_id FROM forum_postread WHERE forum_postread.user_id=%s)""" % request.user.id, })
brought the total down to eight queries, and 66ms. That's quite an improvement!