Method Dispatching in Ruby

9:37 PM EDT Thursday, May 29 2008

Patrick Farley's "Ruby Internals" talk from Mountain West Ruby Conf gives a clear picture of how Ruby's class/object system handles method dispatching. I highly recommend that any Rubyist watch the video. Or better yet, if you are at RailsConf, check out the presentation live and in person on Saturday. This article is a summary and restatement of the ideas put forth in that presentation.

Every object in Ruby is an instance of a class. Each object knows which class it is an instance of (which I will refer to as its class pointer), keeps track of some basic properties (is it frozen, is it tainted, is it a virtual class, etc.) and has a table of instance variables. What objects don't have is methods.

Classes are very similar to objects. In fact, in Ruby, classes are objects. (Note that this may or may not be the case in the internals of the interpreter, and is not the case in MRI, but it doesn't matter as far as this discussion goes.) In Ruby, classes are instances of the class Class. Once you understand this, many things in Ruby like def self.foo or class << self really start to make sense.
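To make "classes are objects" concrete, here is a small sketch that can be pasted into irb. The Animal class and the @kingdom instance variable are just illustrative names I made up, not anything from the talk.

```ruby
class Animal; end

# A class is an ordinary object whose class is Class.
animal_is_a_class = Animal.class == Class   # true
class_is_a_class  = Class.class == Class    # true: Class is an instance of itself
animal_is_object  = Animal.is_a?(Object)    # true

# Like any other object, a class can carry its own instance variables.
Animal.instance_variable_set(:@kingdom, "Animalia")
kingdom = Animal.instance_variable_get(:@kingdom)

puts animal_is_a_class, class_is_a_class, animal_is_object, kingdom
```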

Because classes are objects, they have a class pointer, the same basic properties as all other objects, and a table of their own instance variables. But in addition, a class has a super pointer, which points at the parent class of the class, and a method table, which is where all methods for a class are stored.
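Ruby exposes both of these from the language itself. In the sketch below (Animal, Dog, breathe and speak are made-up names), superclass follows the super pointer and instance_methods(false) lists only the methods stored directly on that class, which is a reasonable stand-in for looking at its method table.

```ruby
class Animal
  def breathe; "inhale"; end
end

class Dog < Animal
  def speak; "woof"; end
end

# The super pointer, as seen from Ruby code.
dog_parent    = Dog.superclass      # Animal
animal_parent = Animal.superclass   # Object

# Passing false skips inherited methods, leaving only what is
# defined directly on each class.
dog_methods    = Dog.instance_methods(false)      # [:speak]
animal_methods = Animal.instance_methods(false)   # [:breathe]

p dog_parent, animal_parent, dog_methods, animal_methods
```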

So when you send a message to an object (a.k.a. call a method on an object), Ruby first looks at the class pointer of the object to figure out which class it is pointing at. Then it looks in the method table for that class. If the method is there, Ruby calls it. If it is not, Ruby uses the super pointer of the class to find its parent class, checks the parent class's method table, and keeps doing that until it finds the method or gets to a class that has no parent. And that's it. One of the "rules" of Ruby is that this process for looking up a method, which is known as method dispatching, is used in all cases. There is only one way of doing method dispatch.
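The lookup described above can be sketched in plain Ruby. The helper find_method_owner is hypothetical and deliberately simplified: it starts at obj.class and follows superclass, so it ignores the singleton classes and modules covered later in this article, and it only sees public and protected methods. It is a toy model of the idea, not how the interpreter implements it.

```ruby
class Animal
  def breathe; "inhale"; end
end

class Dog < Animal
  def speak; "woof"; end
end

# Hypothetical helper: walk the chain of super pointers by hand,
# checking each class's own method table for the given name.
def find_method_owner(obj, name)
  klass = obj.class
  while klass
    return klass if klass.instance_methods(false).include?(name)
    klass = klass.superclass   # nil once we run out of parents
  end
  nil
end

rex = Dog.new
speak_owner   = find_method_owner(rex, :speak)     # Dog
breathe_owner = find_method_owner(rex, :breathe)   # Animal
missing_owner = find_method_owner(rex, :purr)      # nil

p speak_owner, breathe_owner, missing_owner
```

For methods defined on ordinary classes, the result agrees with what Ruby itself reports through rex.method(:breathe).owner.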

Now, you might see code like this:

class Animal; end
dog = Animal.new
cat = Animal.new
def dog.speak; puts "woof"; end
def cat.speak; puts "meow"; end
dog.speak
cat.speak

Which prints woof and then meow. In this example, dog and cat are both instances of the same class Animal, but they respond differently to the speak method. But how does Ruby achieve this instance specific behavior? Objects can't have methods, so the method must be defined on a class, but the method could not have been defined on the Animal class, otherwise one would have overwritten the other.

The answer is the singleton class. It is called the singleton class because it is a class that there can only be one instance of. Ruby wants to have instance-specific behavior and also wants to adhere to the rule of having just one mechanism for method dispatch. In order to achieve that, when you define a method on an instance, Ruby creates a singleton class, makes the class pointer of the object point to the singleton class, and makes the super pointer of the singleton class point to the object's original class.
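You can poke at this from Ruby with Object#singleton_class. A minimal sketch, reusing the Animal example (the methods here return strings rather than printing, purely to keep the checks simple):

```ruby
class Animal; end

dog = Animal.new
cat = Animal.new
def dog.speak; "woof"; end

dog_singleton = dog.singleton_class

# speak lives in the singleton class's method table, not Animal's.
singleton_methods_on_dog = dog_singleton.instance_methods(false)   # [:speak]
animal_own_methods       = Animal.instance_methods(false)          # []
speak_owner              = dog.method(:speak).owner

# The singleton class's super pointer points at the original class.
singleton_parent = dog_singleton.superclass                        # Animal

# The other instance is unaffected.
cat_can_speak = cat.respond_to?(:speak)                            # false

p singleton_methods_on_dog, animal_own_methods, singleton_parent, cat_can_speak
```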

Once you have the singleton class in place, method dispatch can proceed as normal and everything works and you now have instance specific behavior. But what about this?

dog.class == cat.class

This evaluates to true. But that can't be, because the class pointers of dog and cat are pointing at different classes, a singleton class for each object. The answer here is that the method class does not actually return the direct parent of the object. It returns the first non-virtual class in the object's inheritance chain. A singleton class is a virtual class, and there are other types of virtual classes that I will mention later.
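Here is a short sketch of that difference between class and the singleton class. Module#singleton_class? needs Ruby 2.1 or later, which is much newer than this post, so treat that line as optional.

```ruby
class Animal; end

dog = Animal.new
cat = Animal.new
def dog.speak; "woof"; end
def cat.speak; "meow"; end

# class skips over the singleton class and reports Animal for both.
same_class = dog.class == cat.class                           # true

# The singleton classes themselves are two distinct objects.
same_singleton = dog.singleton_class == cat.singleton_class   # false

# Ruby 2.1+ can tell you whether a class is a singleton class.
dog_singleton_is_virtual = dog.singleton_class.singleton_class?   # true
animal_is_virtual        = Animal.singleton_class?                 # false

p same_class, same_singleton, dog_singleton_is_virtual, animal_is_virtual
```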

Now that you understand the concept of singleton classes, class methods should be clear. As I said earlier, every class is an instance of Class. Each instance happens to be assigned to a constant, but it doesn't have to be. We can modify our previous example like this:

animal = Class.new
dog = animal.new
cat = animal.new
def dog.speak; puts "woof"; end
def cat.speak; puts "meow"; end
dog.speak
cat.speak
puts "dog is a #{dog.class}"

This will print woof and meow as before, and the last line will print something like dog is a #<Class:0x129dd0c>. Normally the last statement would print dog is a Animal, but a class doesn't get a name assigned to it until you assign the class to a constant.
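The naming behavior is easy to check for yourself. In this sketch the constant name Creature is made up for illustration:

```ruby
animal = Class.new
name_before = animal.name     # nil: an anonymous class has no name yet

# Assigning the class object to a constant gives it that name.
Creature = animal
name_after = animal.name      # "Creature"

dog = animal.new
description = "dog is a #{dog.class}"   # "dog is a Creature"

p name_before, name_after, description
```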

So class methods are really just instance specific behavior defined on the class object. Class methods are defined in the method table of the singleton class of the class, not in the class's own method table. There is a difference between the way the singleton class of any Ruby object is handled and the way a singleton class of a class is handled. Because of this difference, Patrick refers to the singleton class of a class as a metaclass, although there is much debate about that particular nomenclature. Ola Bini says Ruby doesn't have metaclasses at all, whereas Why The Lucky Stiff refers to all singleton classes as metaclasses. Personally, I think Patrick's definition makes the most sense and I'm going to use his definition throughout the rest of the article. Also, the metaclass of a class is notated by prefixing it with a single quote, so Dog's metaclass is 'Dog.
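A quick way to see where a class method actually lives, again using made-up method names (kingdom and legs):

```ruby
class Animal
  def self.kingdom; "Animalia"; end   # class method
  def legs; 4; end                    # ordinary instance method
end

metaclass = Animal.singleton_class

# The class method is stored on the metaclass...
metaclass_methods = metaclass.instance_methods(false)   # [:kingdom]
kingdom_owner     = Animal.method(:kingdom).owner

# ...while Animal's own method table only holds instance methods.
animal_methods = Animal.instance_methods(false)         # [:legs]

p metaclass_methods, animal_methods
```

This is also why def self.kingdom and a def kingdom inside class << self end up in the same place: both put the method in the metaclass's method table.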

The difference between a metaclass and singleton class has to do with the super pointer. Let's say for example, we define a Dog class that is a child class of Animal and we put a class method on each:

class Animal
  def self.foo
    puts "foo"
  end
end
class Dog < Animal
  def self.bar
    puts "bar"
  end
end
Dog.foo
Dog.bar

Both Dog and Animal are instances of Class. Each one's class pointer points to its own metaclass, and those metaclasses' class pointers point to Class (there are actually a few other things in the chain, but for the purpose of this discussion, it's enough to say that they point to Class). foo is defined in 'Animal's method table and bar is defined in 'Dog's method table. What we want, and what actually happens, is that class methods are inherited. But that wouldn't work under the rules of method dispatch as I've covered them so far.

Under those rules, when you send the message foo to Dog, Ruby would check Dog's class pointer, which points at its metaclass. The method table of 'Dog has no foo method, so Ruby would then check 'Dog's super pointer, which would point at Class. Class's method table has no foo either, so you would get a NoMethodError.

But that's not what happens. What does happen is that the foo method defined in 'Animal gets called. The rule that makes this happen is:

The super of your metaclass is the metaclass of your super

Normally when a singleton class is created for any object, the super of the singleton class points to the class of the object, but when it is a metaclass (the singleton of an object that is a class), the super gets pointed to the metaclass of the object's super. So as the fortune cookie says, "The super of your metaclass is the metaclass of your super".
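The fortune cookie rule can be checked directly in Ruby 1.9 and later. In this sketch the class methods return strings instead of printing, and rex is a made-up instance used for contrast:

```ruby
class Animal
  def self.foo; "foo"; end
end

class Dog < Animal
  def self.bar; "bar"; end
end

# The super of your metaclass is the metaclass of your super.
rule_holds = Dog.singleton_class.superclass == Animal.singleton_class   # true

# So foo is found on 'Animal when sent to Dog.
inherited_result = Dog.foo                                              # "foo"
foo_owner_is_animal_meta = Dog.method(:foo).owner == Animal.singleton_class

# Contrast: the singleton class of an ordinary object points its
# super at the object's class.
rex = Dog.new
plain_singleton_parent = rex.singleton_class.superclass                 # Dog

p rule_holds, inherited_result, foo_owner_is_animal_meta, plain_singleton_parent
```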

One last wrinkle in method dispatching is modules. Modules, or mixins, as they are sometimes called, provide a way of adding methods to a class that gives you the benefits of multiple inheritance without actually doing multiple inheritance.

A module is just like a class: it has a super pointer and a method table. In fact, in Ruby, Class inherits from Module. When you include a module in a class, effectively what happens is that the module inserts itself in the inheritance chain of that class, much like how the singleton class works. But the problem is that in order to do that, you would have to change the module's super pointer, and since the same module can be included in different classes, that can't work.

So the way Ruby deals with this is that when a module is included, a virtual class is created. That virtual class doesn't have a method table of its own; instead it has a pointer to the method table in the module. This means that if a module is included in multiple classes and the module is later changed, all of those classes see the change. The virtual class's super pointer points at the next class up in the inheritance chain, so each class ends up with a virtual class for each module it includes, and method dispatching can continue to work as it does in all other cases.
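The virtual class itself is not visible from Ruby code, but its effects are. In this sketch (Speaker, speak and shout are made-up names), ancestors shows the module sitting in the chain, superclass skips straight past it, and a method added to the module after it was included shows up in every including class:

```ruby
module Speaker
  def speak; "hello"; end
end

class Dog
  include Speaker
end

class Cat
  include Speaker
end

# The module appears in the lookup chain right after the class...
dog_chain_start = Dog.ancestors.take(2)   # [Dog, Speaker]

# ...but superclass skips over the virtual class and reports Object.
dog_parent = Dog.superclass               # Object

# Reopen the module after it has been included.
module Speaker
  def shout; speak.upcase + "!"; end
end

# Both classes see the new method through the shared method table.
dog_shout = Dog.new.shout                 # "HELLO!"
cat_shout = Cat.new.shout                 # "HELLO!"

p dog_chain_start, dog_parent, dog_shout, cat_shout
```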



1. More animal sounds here: http://snippets.dzone.com/posts/show/3378

# Posted By dj on Wednesday, June 04 2008 at 03:16 EDT

2. Paul:

Thanks for introducing me to the wonderful video. I learned so much more from watching this video than reading other blogs/books.

# Posted By Neeraj on Wednesday, June 18 2008 at 03:26 EDT
