Learning from Rails

February 14, 2007

I've said before that learning different programming frameworks and languages makes you a better programmer. For example, learning about Ruby on Rails will make you a better Java programmer even if you don't ever build any apps in Ruby on Rails. Here's an example of that.

In a recent post, I pointed out how Rails makes querying across relationships easy with the :include option, which adds outer joins to the SQL query. In Hibernate, this is referred to as eager fetching.

In Hibernate 3, by default, any associated collections of an object are loaded lazily. This means that you initially get a proxy to a collection that contains no data. On first access to the collection, Hibernate fetches the data from the database. Rails also works like this.
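To make the idea concrete, here is a minimal sketch of lazy loading in plain Java. It is not Hibernate's actual proxy implementation; LazyList, loadComments and queryCount are hypothetical names made up for illustration, and the "query" is just a method that counts how often it is called.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Illustration only: a hand-rolled lazy collection, not a Hibernate class.
class LazyListDemo {
    static final class LazyList<T> {
        private final Supplier<List<T>> loader; // stands in for the database query
        private List<T> data;                   // stays null until first access

        LazyList(Supplier<List<T>> loader) {
            this.loader = loader;
        }

        boolean isLoaded() {
            return data != null;
        }

        List<T> get() {
            if (data == null) {
                data = loader.get(); // the "query" runs only on first access
            }
            return data;
        }
    }

    // Counts how many times the fake query has been executed.
    static int queryCount = 0;

    static List<String> loadComments() {
        queryCount++;
        return Arrays.asList("first comment", "second comment");
    }

    public static void main(String[] args) {
        LazyList<String> comments = new LazyList<>(LazyListDemo::loadComments);
        System.out.println("loaded before access: " + comments.isLoaded());
        System.out.println("size: " + comments.get().size());
        comments.get(); // second access reuses the already loaded data
        System.out.println("queries executed: " + queryCount);
    }
}
```

The point of the sketch is only that creating the collection costs nothing and the loader runs once, on first access.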

In many cases, this is a good thing, because you fetch only the data you need from the database, which leads to better performance. There are two cases where it becomes a problem:

First, you don't make your first call to the collection until after the Hibernate Session is closed. This will cause the dreaded LazyInitializationException. There are two ways to handle this problem:

a. You can open a session at the very beginning of the request and close it at the very end. This is most often done with a Filter, in what is commonly known as the OpenSessionInView pattern. Incidentally, Rails doesn't require this; it "Just Works".

b. You can eagerly load all the data that you want.

The second problem with lazy loading is the dreaded N+1 Select problem. Let's say you select all the posts from your blog, and each post has a collection of comments associated with it. Now that you have all your posts, you want to print out each post and each comment for each post. You'll basically end up writing some kind of nested for loop, but each time you loop over the comments of a post, the lazily loaded comments collection has to be fetched from the database in a separate query. So the SQL queries executed during your request will be:

SELECT * FROM posts
SELECT * FROM comments WHERE post_id = 1
SELECT * FROM comments WHERE post_id = 2
SELECT * FROM comments WHERE post_id = 3
...

This is where the N+1 name comes from. You execute 1 query for the main objects (posts, in this case), and then N more queries, where N is the number of main objects. Now imagine that each comment had a collection of some kind associated with it. With M comments per post, the total becomes 1+N+(N*M) queries, which starts to put a real strain on your database, not to mention making your pages slow.
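The query counts are easy to check with a small simulation. The sketch below uses no Hibernate at all; FakeDb and its methods are hypothetical stand-ins that simply count how many "queries" are issued when comments are fetched per post versus in one joined query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: an in-memory stand-in for a database that counts queries.
class NPlusOneDemo {
    static final class FakeDb {
        final Map<Integer, List<String>> commentsByPost = new LinkedHashMap<>();
        int queries = 0;

        FakeDb(int posts, int commentsPerPost) {
            for (int p = 1; p <= posts; p++) {
                List<String> comments = new ArrayList<>();
                for (int c = 1; c <= commentsPerPost; c++) {
                    comments.add("comment " + c + " on post " + p);
                }
                commentsByPost.put(p, comments);
            }
        }

        // Plays the role of: SELECT * FROM posts
        List<Integer> selectPostIds() {
            queries++;
            return new ArrayList<>(commentsByPost.keySet());
        }

        // Plays the role of: SELECT * FROM comments WHERE post_id = ?
        List<String> selectComments(int postId) {
            queries++;
            return commentsByPost.get(postId);
        }

        // Plays the role of a single query joining posts to comments
        Map<Integer, List<String>> selectPostsWithComments() {
            queries++;
            return commentsByPost;
        }
    }

    // Lazy style: one query for the posts, then one per post for its comments.
    static int lazyQueryCount(int posts, int commentsPerPost) {
        FakeDb db = new FakeDb(posts, commentsPerPost);
        for (int postId : db.selectPostIds()) {
            db.selectComments(postId);
        }
        return db.queries;
    }

    // Eager style: everything comes back from one joined query.
    static int eagerQueryCount(int posts, int commentsPerPost) {
        FakeDb db = new FakeDb(posts, commentsPerPost);
        db.selectPostsWithComments();
        return db.queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy queries:  " + lazyQueryCount(3, 2));
        System.out.println("eager queries: " + eagerQueryCount(3, 2));
    }
}
```

With 3 posts the lazy path issues 4 queries (1 + 3) and the eager path issues 1, which is the gap that grows with every post you add.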

So, in either case 1b or 2, you want to use SQL joins to get all the data at once. In Java, your Hibernate code usually lives in a Data Access Object (DAO). So PostDAO.get(Long id) gets the post with the given id, but all collections are loaded lazily. To have them loaded eagerly, you create another method like PostDAO.getWithComments(Long id). If you are using Hibernate's Criteria API, that method will contain a line of code like setFetchMode("comments", FetchMode.JOIN). This is roughly equivalent to Rails' :include option in the find method.

The difference is that the Rails find method is so simple and expressive that you can feel confident using :include in the controller layer. In Java, to accomplish the same thing, you need to set the fetch mode on the criteria, which means you would need access to the Hibernate Criteria object outside of the DAO. This would be considered exposing the details of the persistence layer to the business layer, and therefore bad design.

So that's why you create those findWithComments methods. But the problem is that sometimes you have an object with several associations, and there are various cases where you need certain associations and not others. Now you need a lot of methods on your DAO like findWithCommentsAndSomethingAndSomethingElse. While I was thinking about this, the thought occurred to me: wouldn't it be great if there were something like :include in Java/Hibernate?

Well, you can easily mimic something similar. If you write a DAO method like PostDAO.find(Collection<String> includes), you can pass a collection of the names of the associations you want, and then inside the DAO method do something like this:

// Build a criteria for Post and join-fetch each requested association.
DetachedCriteria c = DetachedCriteria.forClass(Post.class);
for (String include : includes) {
    c.setFetchMode(include, FetchMode.JOIN);
}

You will get back an object with all of the associations that you need loaded. This isn't rocket science, but I'm just not sure that I would have thought about doing this had I not seen the way it is implemented in Rails.
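One thing worth considering with string-based includes is that a typo only shows up at runtime. Here is a rough sketch of checking the names before applying them. To keep it runnable without Hibernate on the classpath, RecordingCriteria is a hypothetical stand-in for DetachedCriteria that just records the requested paths, and the association names other than "comments" are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: validates include names before "join-fetching" them.
class IncludeDemo {
    // Hypothetical stand-in for Hibernate's DetachedCriteria. It only records
    // which association paths were asked to be join-fetched.
    static final class RecordingCriteria {
        final List<String> joinFetched = new ArrayList<>();

        void joinFetch(String associationPath) {
            joinFetched.add(associationPath);
        }
    }

    // Assumed association names on Post; only "comments" appears in this post.
    static final Set<String> POST_ASSOCIATIONS =
            new HashSet<>(Arrays.asList("comments", "author", "tags"));

    static RecordingCriteria buildCriteria(Collection<String> includes) {
        RecordingCriteria criteria = new RecordingCriteria();
        for (String include : includes) {
            if (!POST_ASSOCIATIONS.contains(include)) {
                // Fail early with a clear message instead of deep inside the ORM.
                throw new IllegalArgumentException("Unknown association: " + include);
            }
            criteria.joinFetch(include);
        }
        return criteria;
    }

    public static void main(String[] args) {
        RecordingCriteria criteria = buildCriteria(Arrays.asList("comments", "author"));
        System.out.println("join-fetched: " + criteria.joinFetched);
    }
}
```

In a real DAO the joinFetch call would be the setFetchMode(include, FetchMode.JOIN) line from above; the whitelist is just one possible way to keep callers from passing arbitrary strings into the persistence layer.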

Posted in Technology | Tags Hibernate, Ruby, Java, Rails

Comments

1.

I thought I would get an example of the clever Ruby way of doing things in contrast to Java, but this example did not show anything of the sort. Overall, it is amazing to see how much fuss there is about Ruby and how little comes out of it.

# Posted By oldman on Friday, April 13 2007 at 10:03 AM

2.

seeing a dao class interface that has methods like this, eg:

findWithComments(long objectId)
findWithCommentsAndFoo(long objectId)
findWithCommentsAndBar(long objectId)
findWithCommentsAndBarAndFoo(long objectId)

every programmer who is worth their money, no matter if he/she has ever used ActiveRecord, should see that this should be handled by just one method, and that the part after "comments" should be passed in as a parameter. The code in each method looks alike except for the property name that is passed to Hibernate for eager fetching, so it's obvious to single that out as a method parameter; otherwise you violate the DRY principle.

# Posted By Lutz Müller on Friday, April 13 2007 at 11:15 AM

3.

Well, I never had thought of a really good way of specifying which collections to have eagerly loaded in a non-hibernate specific way. Looking back on it, passing in a collection of Strings does seem pretty obvious, but I hadn't seen that "pattern" (if that constitutes a pattern) being used or advocated in Java development. Rails gave me that idea and the Java code I wrote was better and more DRY for it.

# Posted By Paul on Friday, April 13 2007 at 1:58 PM

4.

Sorry, but all your discussion is incorrect. I guess you need to study Hibernate or any other JPA framework a bit more, because then you will discover that "lazy loading" is a very nice and advanced feature that brings a lot in the majority of cases (in my experience it's usable in 80% of the cases). In the 20% of cases where you don't need it, you can disable it centrally. Absolutely no problems, no code required, nothing.
Please consider studying Hibernate, it will be worth it!

# Posted By Renat Zubairov on Saturday, April 14 2007 at 3:47 AM

5.

Renat,

I'm not sure which part of the discussion you feel is incorrect. You are correct that lazy loading is a nice feature of Hibernate and other ORM systems like ActiveRecord. But in some situations, you know you are going to need eager fetching to avoid the N+1 select problem. That doesn't mean you want to completely disable lazy loading for all queries. This is why ActiveRecord and Hibernate both support eager fetching.

See this page:
http://www.hibernate.org/hib_docs/v3/reference/en/html/performance.html#performance-fetching-custom
Also see "Eager loading of associations" on this page:
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html

# Posted By Paul on Saturday, April 14 2007 at 9:23 AM

6.

Hibernate also allows you to set a batch size, so that rather than N+1 selects you have (N/batchSize)+1 selects. Which often translates to 1+1 selects, which is really not a problem.

Also, Seam (my preferred Java framework, although there are many to choose from) does a fantastic job of persisting the database session for the whole page rendering, so lazy loading is a thing of the past. Other simple techniques (threadLocals, et cetera) are also well known solutions to that problem.

# Posted By Richard on Saturday, April 14 2007 at 7:34 PM

7.

Having worked with both frameworks on a professional basis, I have this to say.

Hibernate is great. Really really great. Really. Rails is freakin' sweet. Freakin' doodly sweet. Really. That said, I don't understand where the comparison comes from.

Hibernate is a data mapper. It's not an active record pattern implementation, never was meant to be and most certainly will never be. It fills gaps left by the active record design pattern, and yes, it falls in many many more.

DATA MAPPER != ACTIVE RECORD

Anyone who doesn't understand this didn't take their architecture classes. You should know better. An active record pattern maps one business domain entity to one table, while a data mapper is used to fill gaps between multi-table entities and the object world.

As I've said, I worked with both. And yes, I LEARNED FROM RAILS. A lot, as a matter of fact. Let's reuse the example where you have a FooDAO.find(Long id) method. The most stupid thing to do (yes, there's no worse way I can think of right now) is this:

FooDAO.find(Long id)
FooDAO.findWithBar(Long id)
FooDAO.findWithBarAndWhatever(Long id)

Who the heck would do that in the first place anyways?? Not doing this is certainly not what I learned from Rails. They should teach you that this is a 'no no' in junior school.

Rails has taught me rigor and creativity.

RIGOR

With Rails, you can't violate the framework layers. In fact, you will never have to do it. Lesson 1 is therefore : With a good architecture, you don't even need workarounds. If you have to build a hack for the framework because it gets in your way, choose another one.

Java offers a shitload of frameworks, and Hibernate can't handle all situations, despite the numerous claims. If needed, don't try to hack Hibernate; use something else that will work side by side with it. And don't flame me for this, it will not double the workload. It does double the workload if you thought that Hibernate would do everything in the first place, but again, you should know better.

Hibernate can't guess what you need initialized in each situation. It's better like this anyway. Theoretically, it's the service layer's responsibility. Services define what the actions on a given system are. Let them decide. Don't even try to put business logic in DAOs. It's a dead end.

CREATIVITY

Use the language to your advantage. Hibernate's eager fetching can become a serious pain in the sack. You'll often end up with 1200 selects executed for a single entity load. And lazy loading is not a perfect solution in itself. Believe me, when you have polymorphic many-to-many associations, or any exotic business domain mapped in Hibernate, you could end up having a real problem.

As mentioned above, the correct solution to lazy fetching is to create methods which implement a lazy loading management framework. It creates a layer between your application and Hibernate because, again, Hibernate can't handle every situation.

Java has interfaces, right? So let's create object interfaces which abstract the two frameworks: Hibernate and a custom-made lazy loader manager.

FooDAO.find(Long id)
FooDAO.find(Long id, String[] lazyPathsToInit)*

* the second method calls the first one, then relays the initialization to a path finder which initializes the correct properties before returning the object tree.

There you go! Rails-like behavior for a creative solution. You've just mimicked the :include mechanism of Rails, but in a form suitable for a data mapper: Hibernate. You have now combined two frameworks and your service layer didn't see the difference.

There are a lot of lessons to be learned from Rails. And I didn't even mention the most important lesson of all: Convention over Adaptation. This one could be argued over and over, indeed. But I believe that conventions and standards were at the root of computing in the first place, so why not adopt them?

Cheers !

# Posted By Luc on Tuesday, April 17 2007 at 4:29 PM

8.

Regarding the "first" problem: take a look at Seam and you'll discover that there's no LIE in the views anymore. The OpenSessionInView is a very limited curiosity of the past...

# Posted By Manuel Palacio on Sunday, May 13 2007 at 4:44 PM
