Paul Barry

In Search Of Sharper Tools

July 21, 2009

After reading the comments on my last post, I realized that I neglected to define what I mean by metaprogramming. Rubyists probably already know what I mean, but people coming from other programming languages might have different ideas about the term.

I’ve actually mentioned this in a few talks I’ve given about Ruby. I don’t really like the word metaprogramming; it’s a little bit nebulous. I think a better term is dynamic code generation. I like that term because I think most programmers will have a pretty good mental model of what I am talking about when I say it. There are several features of Ruby that, when combined, allow you to do almost anything to bend the language to your will. To understand how to do that, I recommend reading the book Metaprogramming Ruby, which is in beta right now.
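To make “dynamic code generation” concrete, here is a minimal sketch (the Temperature class is just an invented example, not from any library) in which methods are defined at runtime from a list of names rather than written out by hand:

```ruby
# Methods are generated at runtime with define_method, driven by plain data.
class Temperature
  # One reader method is created for each scale in the list.
  [:celsius, :fahrenheit].each do |scale|
    define_method(scale) { @degrees[scale] }
  end

  def initialize(celsius)
    @degrees = { celsius: celsius, fahrenheit: celsius * 9.0 / 5 + 32 }
  end
end

t = Temperature.new(100)
t.celsius     # => 100
t.fahrenheit  # => 212.0
```

Nothing here is exotic on its own, but this is the building block: code that writes methods based on data available only at runtime.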

I’ll give a short, real world example of what I’m talking about. I’m working on a Rails app that uses a database that has bit field columns. I want to treat the boolean flags in the bit field as though they are regular boolean attributes throughout my code. So I wrote a Ruby gem has-bit-field, which generates all the methods necessary to work with the bit field columns. You define what is in the bit field in the most clear, simple, elegant way possible by just stating which column has the bit field and then what attributes should be generated for each bit:

class Person < ActiveRecord::Base
  has_bit_field :bit_field, :likes_ice_cream, :plays_golf, :watches_tv, :reads_books
end

This is the kind of abstraction that the metaprogramming capabilities of Ruby afford you. I threw this together, with tests, in an hour or so. Can you imagine the amount of nonsense you would have to go through in Java to create an abstraction equivalent to this?
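To give a sense of what a class macro like this does under the hood, here is a simplified sketch of how it could be written with define_method. This is not the actual has-bit-field source; the Person class below uses a plain attr_accessor in place of an ActiveRecord column so the sketch stands on its own:

```ruby
# A hand-rolled sketch of a has_bit_field-style class macro.
module HasBitField
  def has_bit_field(field, *attributes)
    attributes.each_with_index do |attr, i|
      mask = 1 << i  # each attribute gets its own bit in the integer field

      # Reader: person.plays_golf? => true if the bit is set
      define_method("#{attr}?") { (send(field) & mask) != 0 }

      # Writer: person.plays_golf = true sets the bit; false clears it
      define_method("#{attr}=") do |value|
        bits = send(field)
        send("#{field}=", value ? bits | mask : bits & ~mask)
      end
    end
  end
end

class Person
  attr_accessor :bit_field  # stand-in for the database column
  extend HasBitField
  has_bit_field :bit_field, :likes_ice_cream, :plays_golf

  def initialize
    @bit_field = 0
  end
end

p = Person.new
p.plays_golf = true
p.plays_golf?       # => true
p.likes_ice_cream?  # => false
```

The macro is a few lines of dense code, but every class that uses it gets to state its intent in a single declarative line.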

This type of abstraction is what I think makes Ruby a great language, but I realize you are definitely walking a fine line with this kind of thing. It’s possible for this sort of thing to go wrong quickly. The first way these things go wrong is when the abstraction is not intuitive and not documented. A good abstraction should be almost intuitive, so that other programmers can guess what it does. This is commonly referred to as the Principle of Least Surprise. That doesn’t mean you are excused from providing some sort of documentation explaining how it works, especially for more complex abstractions.

The reason it’s important that the abstraction is clear is that most of the time the code that defines the abstraction is dense at best, and downright ugly at worst. This isn’t a problem specific to Ruby, as anyone who has worked with Lisp macros can attest. But in the end I’d rather have a small chunk of code that is tested and documented, that I rarely need to look at, and that lets me make the code where the business logic is defined as clear as possible. If other programmers are constantly having to dive into the guts of these abstractions just to understand how the code works, you have officially created a mess. And this is no ordinary mess: this is meta-spaghetti, a mess on an order of magnitude not possible in statically typed languages.

So does this mean you shouldn’t use Ruby? Not at all, and I think Glenn Vanderburg sums it up best:

Weak developers will move heaven and earth to do the wrong thing. You can’t limit the damage they do by locking up the sharp tools. They’ll just swing the blunt tools harder.

I think developers often associate “blunt tools” with static typing, because really they associate static typing with Java. I’m not sure that static typing is in fact a blunt tool. If static typing means I can’t create these kinds of abstractions, then yes, it’s a blunt tool. But can you do this kind of thing with Scala Compiler Plugins? How about with Template Haskell? What about with MetaOCaml? If you can, are those tools then sharper than Ruby? Or is there a way to define abstractions like these without metaprogramming at all?

Types and Metaprogramming: Can We Have Both?

July 21, 2009

Michael Feathers has posted an excellent article called Ending the Era of Patronizing Language Design. You should go read the whole article, but the crux of the argument boils down to:

The fact of the matter is this: it is possible to create a mess in every language. Language designers can’t prevent it. All they can do is determine which types of messes are possible and how hard they will be to create. But, at the moment that they make those decisions, they are far removed from the specifics of an application. Programmers aren’t. They can be responsible. Ultimately, no one else can.

I agree with Michael, but I see it from a slightly different viewpoint. The most common argument for static typing is that it will prevent you from shooting yourself in the foot. If your program has syntax errors or type errors, the compiler will catch them for you. This means that you can write fewer tests than you would need to write for the same program written in a dynamically typed language.

This argument I completely disagree with. Having a spell checker doesn’t mean you can proofread less. No amount of spell checking and grammar checking is going to prevent your writing from having errors. If you don’t test your program, it will have errors.

I like looking at static typing in a more “glass is half full” sort of way. The bigger benefit of static typing is that the type system can be used as a mechanism to make the code more self-descriptive. Here’s an analogy from Real World Haskell:

A helpful analogy to understand the value of static typing is to look at it as putting pieces into a jigsaw puzzle. In Haskell, if a piece has the wrong shape, it simply won’t fit. In a dynamically typed language, all the pieces are 1x1 squares and always fit, so you have to constantly examine the resulting picture and check (through testing) whether it’s correct.

I think the jigsaw puzzle analogy is even better if you think about it a different way. Think of a simple puzzle that is made up of only nine pieces. If each piece is a 1x1 square, you have to look at the part of the picture on each piece to determine where it goes. If you had a puzzle with nine uniquely shaped pieces, you could solve the puzzle without looking at the picture on each piece at all. You could assemble the puzzle simply by connecting the pieces that fit together.

When analyzing a program, “looking at the picture on the piece” means reading the body of a method. Let’s take a look at some Ruby code without looking at the body of the method. Take, for example, these two method declarations:

def select(*args)
def select(table_name=self.class.table_name, options={})

The first method gives us no hint as to what this method is, whereas in the second example, we can start to see what’s going on with this method. What if the method signature looked like this:

SQLQueryResult execute(SQLQuery query)

It should be obvious what this does. So it should be clear that having types in our method signatures is not the language’s way of making sure we don’t make errors, but instead is a means for expressing the intent of how the code works and how it should be used.

The problem is that this is not the only way of making code more self-descriptive or easier to reason about. The authors of Structure and Interpretation of Computer Programs state:

A powerful programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about processes. Thus, when we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three mechanisms for accomplishing this:

  • primitive expressions, which represent the simplest entities the language is concerned with,
  • means of combination, by which compound elements are built from simpler ones, and
  • means of abstraction, by which compound elements can be named and manipulated as units.

The metaprogramming capabilities of Ruby, which are on par with those of any other language, including Lisp, provide a powerful means of abstraction. ActiveRecord is a clear example of that:

class Order < ActiveRecord::Base
  belongs_to :customer
  has_many :line_items
end
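Here is a toy sketch of the kind of method generation behind class macros like these. This is not how ActiveRecord actually works (it does far more, backed by the database); the point is just that a few lines of metaprogramming can define a whole declarative vocabulary:

```ruby
# A toy, in-memory imitation of association macros -- not ActiveRecord.
module ToyAssociations
  def belongs_to(name)
    attr_accessor name  # generates order.customer and order.customer=
  end

  def has_many(name)
    # Generates order.line_items, a lazily initialized collection.
    define_method(name) do
      @collections ||= Hash.new { |h, k| h[k] = [] }
      @collections[name]
    end
  end
end

class Order
  extend ToyAssociations
  belongs_to :customer
  has_many :line_items
end

order = Order.new
order.customer = "ACME Corp"
order.line_items << "widget"
order.line_items  # => ["widget"]
```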

Can statically typed languages provide a means of abstraction on par with that of dynamically typed languages? Steve Yegge doesn’t think so. I’m not sure, and I am in constant search of that answer. The combination of metaprogramming, object-orientation and functional programming in Ruby provides a powerful means of abstraction. I would love to have all of that as well as a type system, so I would hate to see the programming community completely give up on static typing and just declare that dynamic typing has won.

Testing in a statically-typed language

December 19, 2008

One of the many “fringe” languages that has been on my radar for some time now is Haskell. Haskell has what looks to be a pretty good book available online and a local book club forming, which presents a great opportunity to study it. What is interesting to me about Haskell is that it is a statically-typed language, like Java, not a dynamically-typed language like Lisp, Ruby, Python, JavaScript, etc. I have really become fond of dynamic typing while programming in Ruby, but I always like studying languages that challenge your fundamental beliefs about programming.

Incidentally, this is one of the reasons I have been really interested in Clojure. It challenges the validity of object-oriented programming in a very serious way. In Clojure, you don’t define new classes; instead your program is simply composed of functions that take input, possibly call other functions, and return values. Studying functional languages helps you to see the benefits and weaknesses of object-oriented programming. Read more about this idea of studying programming in Glenn Vanderburg’s article about Koans.

So anyway, back to Haskell, and the paradigm here that is worth studying: static typing versus dynamic typing. I must admit, after moving from Java (a statically-typed language) to Ruby (a dynamically-typed language), I’m currently a big fan of dynamically-typed languages. So I’m trying to keep an open mind in studying this, but I have to object to this passage from Chapter 2 of Real World Haskell:

Programs written in dynamically typed languages require large suites of tests to give some assurance that simple type errors cannot occur. Test suites cannot offer complete coverage: some common tasks, such as refactoring a program to make it more modular, can introduce new type errors that a test suite may not expose.

In Haskell, the compiler proves the absence of type errors for us: a Haskell program that compiles will not suffer from type errors when it runs. Refactoring is usually a matter of moving code around, then recompiling and tidying up a few times until the compiler gives us the “all clear”.

One thing I learned when moving from Java to Ruby is that developers often rely on the compiler for a false sense of security. The notion that you can refactor a bunch of code, get it to the point where it compiles, and then feel that your code works is dangerous. Your code can still contain all sorts of runtime and logic errors, therefore if you want to have any degree of certainty that your code is free of bugs, you need to have some sort of test suite, regardless of whether you are using a dynamically or statically typed language.

I’ve also heard the argument made that statically-typed languages allow you to write fewer tests, but I have not found this to be the case. Most syntax and type errors will be found with the same tests you use to test the logic of the program. In other words, you don’t have to write one suite of tests to check for syntax errors, one for type checking and another for testing the logic of your program.
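To make that concrete, here’s a small sketch (the total method is just an invented example): the same call that exercises the logic also surfaces a type mistake, as a runtime TypeError, so no separate suite of “type tests” is needed.

```ruby
# total expects numeric prices; summing a String raises TypeError at runtime.
def total(prices)
  prices.reduce(0) { |sum, price| sum + price }
end

total([1, 2, 3])  # => 6

begin
  # Any test that drives this code path with bad data exposes the
  # type mistake, right alongside any logic bugs.
  total(["1", "2"])
rescue TypeError
  # raises TypeError, caught here
end
```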

So the moral of the story is that regardless of whether you are writing your code in a statically or dynamically typed language, you still have to TATFT.