Correlated Subqueries

January 6, 2008

The Rails' ORM ActiveRecord has support for counter caches. Say you have a blog, like this one, which has articles, and each article has many comments. If you are going to show a list of articles with the number of comments each article has, without a counter cache, you will end up with N+1 SQL queries. One query to get the articles, and one select count(*) from comments where article_id = ? query for each article.

To avoid the N+1 select, you can add a column in the articles table that contains the number of comments the article has. With some ORM frameworks, this becomes somewhat of a pain, because you have to keep that count column updated, but ActiveRecord makes it easy. If you just create a table called <table_name>_count on a parent table, it will keep the total of the records that belong to that record in that column. So then when you call article.comments_count, there is no SQL query executed.

This is nice, but what happens if you have a site in production and you want to add a counter cache column? You've got to create the column and then get the count updated. You can do this in a migration, but you can also do it in SQL using a correlated subquery. Using MySQL, you can do it like this:

alter table articles add column comments_count int(11) default 0;
update articles set articles.comments_count = 
  (select count(comments.id) from comments where articles.id = comments.article_id);

Posted in Technology | Tags MySQL, ActiveRecord | 0 Comments