Paul Barry

The Busy Rails Developer's Intro To Rake

April 20, 2009

This is just a quick 60-second intro to Rake. If you are doing development with Ruby on Rails, you undoubtedly use Rake on a daily basis. For example, you use rake db:migrate to run your migrations or rake test:units to run your unit tests. But do you know how to write your own Rake tasks? If you have never done that, you might hesitate to write one, thinking you can just write a quick Ruby script rather than take the time to figure out how to write a Rake task.

So to create a rake task, in an existing Rails app, create a file called lib/tasks/app.rake. You can name it whatever you want as long as it ends in .rake. I’m choosing to use app because we are going to write some app-specific tasks. In the file, put this:

task :hello_world

Now from the command line you can run your task with rake hello_world. Nothing happens, but it runs. Now let’s have it print hello world:

task :hello_world do
  puts "Hello, World!"
end

Now when you run your task, it prints “Hello, World!”. Let’s add a description to our task to let people know what it does:

desc "Prints 'Hello, World!'"
task :hello_world do
  puts "Hello, World!"
end

Now if you run rake -T, you will see your hello_world task in the list of tasks. rake -T only shows tasks that have a description. You can also run rake -D hello_world to see the full description of the task. You should give all of your tasks that you expect users to run from the command-line a description.

Now a problem with our task is what happens if someone else wants to write a task named hello_world? Well, we would have a namespace problem. So what we want to do is put all of our tasks into the app namespace:

namespace :app do
  desc "Prints 'Hello, World!'"
  task :hello_world do
    puts "Hello, World!"
  end
end

So now we can run our task as rake app:hello_world.
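To see why the namespace helps, here is a minimal sketch that runs as a plain Ruby script instead of through the rake command. The plugin namespace and the GREETINGS constant are made up for illustration; only the app:hello_world task name comes from the example above.

```ruby
require "rake"

# Make task/namespace available explicitly, since this is a plain
# script and not a file loaded by the rake command.
extend Rake::DSL

# Hypothetical log so we can see which task bodies ran.
GREETINGS = []

namespace :app do
  task :hello_world do
    GREETINGS << "hello from app"
  end
end

# A second, made-up namespace defining a task with the same short name.
namespace :plugin do
  task :hello_world do
    GREETINGS << "hello from plugin"
  end
end

# Each task is addressed by its namespaced name, so the two do not clash.
Rake::Task["app:hello_world"].invoke
Rake::Task["plugin:hello_world"].invoke

puts GREETINGS
```

Both tasks keep the short name hello_world, but Rake stores them under app:hello_world and plugin:hello_world, so neither one overwrites or extends the other.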

So this is obviously not a real task. Let’s say we want to know what the load path of our app looks like. Easy, we’ll just do this:

namespace :app do
  desc "Prints load path of this app"
  task :load_paths do
    Rails.configuration.load_paths.each do |p|
      puts p
    end
  end
end

When you try to run this rake task, you will get this error:

rake aborted!
uninitialized class variable @@configuration in Rails

The problem is that by default, a rake task doesn’t load the Rails environment. It’s easy to tell it to do that with this:

namespace :app do
  desc "Prints load path of this app"
  task :load_paths => :environment do
    Rails.configuration.load_paths.each do |p|
      puts p
    end
  end
end

By saying :load_paths => :environment, you are saying that the load_paths task depends on the environment being loaded. Or more specifically, you are saying “run the environment task before running this task”. There is a task called “environment”, and it loads the Rails environment. You won’t see it under rake -T because it has no description, since it is not a task you should run directly, only as a dependency of other tasks.
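The ordering that a dependency gives you can be checked outside of Rails with a small standalone sketch. The environment task below is only a stand-in that records that it ran (the real one is defined by Rails), and RUN_ORDER is a made-up constant for this example.

```ruby
require "rake"

# Make task/namespace available explicitly in a plain script.
extend Rake::DSL

# Hypothetical record of the order in which task bodies execute.
RUN_ORDER = []

# Stand-in for the environment task that Rails defines; it loads nothing,
# it only notes that it was run.
task :environment do
  RUN_ORDER << :environment
end

namespace :app do
  # Same dependency syntax as above: run environment first.
  task :load_paths => :environment do
    RUN_ORDER << :load_paths
  end
end

Rake::Task["app:load_paths"].invoke

puts RUN_ORDER.inspect
```

Note that the dependency is written as plain :environment even though the task lives inside the app namespace; Rake looks for the prerequisite in the current namespace first and then falls back to the top level.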

Now that we have the Rails environment loading, when you run the rake task, you will get the output you expect. If you make your task depend on environment, you will also be able to access your models from within your task. Now that you know the basics of Rake, you can easily get going making Rake tasks for your Rails app.

Posted in Technology | Topics Ruby, Rails, Rake | 8 Comments

Implicit Conversions: Scala's Type Safe Answer To Ruby's Open Class

April 17, 2009

Let’s say that you are writing an application that squares things often, so you would like to be able to do this:

>> 4.squared
=> 16

In Ruby, that’s easy-peasy:

class Integer
  def squared
    self * self
  end
end

Wham. Done. Open it right up, shove your method in there and you are bending the language to your will. But what if you couldn’t modify existing classes? Maybe you would do this:

class IntegerWrapper
  def initialize(value)
    @value = value
  end
  def squared
    @value * @value
  end
end

Cool. Now you haven’t modified Integer, but you get the same effect. So you just use it like this:

>> IntegerWrapper.new(4).squared
=> 16

WOAH! WTF? That is a lot of extra syntax. Having to call wrapper classes like this really makes it too verbose. So now we shift gears into Scala mode:

scala> class IntegerWrapper(val value : Int) { def squared = value * value }
defined class IntegerWrapper

scala> new IntegerWrapper(4).squared
res0: Int = 16

As you can see on the first line, we define the same IntegerWrapper class. Then we use it just the same way we do in Ruby. But now we throw in an implicit conversion to make the syntactic magic happen:

scala> implicit def wrapInt(i:Int) = new IntegerWrapper(i)
wrapInt: (Int)IntegerWrapper

scala> 4.squared
res1: Int = 16

The implicit def defines an implicit conversion from an Int to an IntegerWrapper. This is our way to tell Scala that if we try to call squared on an Int, it should use this function to convert the Int to an IntegerWrapper for us. Now as you can see, we can call squared on what appears to be an Int, and get the same syntactic advantage of adding methods to existing classes, without really modifying existing classes.

Posted in Technology | Topics Scala, Ruby | 6 Comments

Keyword Arguments and the Case for Literal Syntax for Hashes

March 8, 2009

I’ve studied various programming languages and one feature I’ve grown to love is keyword arguments. Here’s Paul Graham’s take on keyword arguments:

Rtml even depended heavily on keyword parameters, which up to that time I had always considered one of the more dubious features of Common Lisp. Because of the way Web-based software gets released, you have to design the software so that it’s easy to change. And Rtml itself had to be easy to change, just like any other part of the software. Most of the operators in Rtml were designed to take keyword parameters, and what a help that turned out to be. If I wanted to add another dimension to the behavior of one of the operators, I could just add a new keyword parameter, and everyone’s existing templates would continue to work. A few of the Rtml operators didn’t take keyword parameters, because I didn’t think I’d ever need to change them, and almost every one I ended up kicking myself about later. If I could go back and start over from scratch, one of the things I’d change would be that I’d make every Rtml operator take keyword parameters.

In order to have keyword arguments in your language, all you really need is Hash as a first-class data type with a literal syntax. This is a feature that all modern programming languages should have. One reason is that it makes keyword arguments trivial to implement and use. If your language has a literal syntax for Hashes, you don’t need much else in terms of language syntax to support keyword arguments. Python has a little extra built-in support for keyword arguments. In Ruby, hashes are used as keyword arguments to methods pretty easily. To clarify, when I say Hash, I’m talking about the data structure that is referred to as a dictionary in Python, NSDictionary in Objective-C, a Hash in Ruby, or a Map in Java. Objective-C does not have a literal syntax for Hashes, and neither does Java, and I’ll show why not having a literal syntax for Hashes leads to not having keyword arguments.

Objective-C does not have keyword arguments, but what it does have is named arguments. I’m not sure if keyword arguments and named arguments are the official correct terms, but I like those terms and I’m going to define specifically what I mean by each. Named arguments in Objective-C mean that each and every argument to a method call must have a name. In a language like Java which doesn’t have named arguments, you could see something like this:

Date iWeeksFromNow = now.add(0, 0, (i*7), 0, 0, 0);

It’s hard to tell what this method does, because you have to know what the position of each argument means. In Objective-C, it might look like this:

NSCalendarDate *iWeeksFromNow = [now dateByAddingYears:0 
                                                months:0
                                                  days:(i*7)
                                                 hours:0
                                               minutes:0
                                               seconds:0];

This call is self-documenting, because you can now easily determine what each parameter means. Although this is an improvement, it still has some flaws. First, I still must know the correct order of the arguments. For example, this won’t work:

NSCalendarDate *iWeeksFromNow = [now dateByAddingSeconds:0 
                                                 minutes:0
                                                   hours:0
                                                    days:(i*7)
                                                  months:0
                                                   years:0];

More egregiously, you can’t omit the arguments for which you would like to supply no value or have the default value used. This is why we see all these parameters with a value of 0 being passed in. You could fix this API in Objective-C by defining a separate method to add each unit, so something like this would work:

NSCalendarDate *iWeeksFromNow = [now dateByAddingDays:(i*7)];

But what if you want to specify values for two of the arguments? You end up with a huge explosion of methods in the API. This also leads to another big problem: what if you want to add another value that could be passed in? For example, let’s say we wanted to be able to pass in weeks, which is exactly what we are trying to do in this example. We could modify the API and call the code like this:

NSCalendarDate *iWeeksFromNow = [now dateByAddingYears:0 
                                                months:0
                                                  days:0
                                                 weeks:i
                                                 hours:0
                                               minutes:0
                                               seconds:0];

But that means we have to go change all the code that doesn’t have weeks in the method call to include it. Depending on how much code you have calling the API, this could be a nightmare. In this case we are talking about a core method of Cocoa, so I would suspect that Apple is unlikely to ever change this method for that reason. This is why APIs developed in a language without keyword arguments sometimes stagnate over time, for fear of breaking backward compatibility. If this were a keyword argument method, support for accepting weeks as an argument could easily be added without breaking any existing code.

Keyword arguments, as typically implemented in Ruby or Python, are optional and unordered. A method like this one in a Ruby API might look like this:

iWeeksFromNowNextYear = now.add(:weeks => i, :years => 1)

This is a very clean way to design an API. It can easily be extended in the future when more arguments need to be added. I don’t need to pass in values for parameters I don’t care about. It is clean and easy to use as the caller of the API. Technically, this exact kind of thing is possible in Java and Objective-C. NSCalendarDate could have a method called dateByAdding: that takes an NSDictionary, which might make calling code look like this:

NSCalendarDate *iWeeksFromNowNextYear = [now dateByAdding: 
  [NSDictionary dictionaryWithObjectsAndKeys:
    [[NSNumber alloc] initWithInt:i], @"weeks",
    [[NSNumber alloc] initWithInt:1], @"years",
    nil]];

But as you can see, there is too much syntactic noise for this kind of thing to ever become idiomatic, which goes back to my original point: having a literal syntax for Hashes is a huge win for the syntax of a language.
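For comparison, here is a rough sketch of what the receiving side of a call like now.add(:weeks => i, :years => 1) could look like in Ruby. Everything in it is an assumption made for illustration: add_to_date, OFFSET_DEFAULTS and the set of supported units are invented names, and it works on Ruby’s Date class and not on any real Cocoa-style API.

```ruby
require "date"

# Hypothetical list of supported units. Every unit is optional and
# defaults to zero, so callers only mention what they care about.
OFFSET_DEFAULTS = { :years => 0, :months => 0, :weeks => 0, :days => 0 }.freeze

# Illustrative helper: takes a Date plus a hash of offsets.
def add_to_date(date, offsets = {})
  # Reject misspelled keys instead of silently ignoring them.
  unknown = offsets.keys - OFFSET_DEFAULTS.keys
  raise ArgumentError, "unknown units: #{unknown.inspect}" unless unknown.empty?

  o = OFFSET_DEFAULTS.merge(offsets)

  # Date#>> shifts by whole months; Date#+ shifts by days.
  shifted = date >> (o[:years] * 12 + o[:months])
  shifted + (o[:weeks] * 7 + o[:days])
end

start = Date.new(2009, 3, 8)

# Order does not matter, and unused units are simply left out.
puts add_to_date(start, :weeks => 2, :years => 1)
puts add_to_date(start, :years => 1, :weeks => 2)
puts add_to_date(start, :days => 5)
```

Supporting a new unit later would mean adding one key to the defaults and one term to the arithmetic; calls that never mention the new key keep working unchanged, which is the property the Objective-C method with fixed named arguments lacks.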

Posted in Technology | Topics ObjectiveC, Python, Ruby | 2 Comments

Testing in a statically-typed language

December 19, 2008

One of the many “fringe” languages that has been on my radar for some time now is Haskell. Haskell has what looks to be a pretty good book available online and a local book club forming, which presents a great opportunity to study it. What is interesting to me about Haskell is that it is a statically-typed language, like Java, not a dynamically-typed language like Lisp, Ruby, Python, JavaScript, etc. I have really become fond of dynamic typing while programming in Ruby, but I always like studying languages that challenge your fundamental beliefs about programming.

Incidentally, this is one of the reasons I have been really interested in Clojure. It challenges the validity of object-oriented programming in a very serious way. In Clojure, you don’t define new classes; instead, your program is simply composed of functions that take input, possibly call other functions, and return values. Studying functional languages helps you to see the benefits and weaknesses of object-oriented programming. Read more about this idea of studying programming in Glenn Vanderburg’s article about Koans.

So anyway, back to Haskell, and the paradigm here that is worth studying is static typing versus dynamic typing. I must admit, after moving from Java (a statically-typed language) to Ruby (a dynamically-typed language), I’m currently a big fan of dynamically-typed languages. So I’m trying to keep an open mind in studying this, but this passage from Chapter 2 of Real World Haskell I have to object to:

Programs written in dynamically typed languages require large suites of tests to give some assurance that simple type errors cannot occur. Test suites cannot offer complete coverage: some common tasks, such as refactoring a program to make it more modular, can introduce new type errors that a test suite may not expose.

In Haskell, the compiler proves the absence of type errors for us: a Haskell program that compiles will not suffer from type errors when it runs. Refactoring is usually a matter of moving code around, then recompiling and tidying up a few times until the compiler gives us the “all clear”.

One thing I learned when moving from Java to Ruby is that developers often get a false sense of security from the compiler. The notion that you can refactor a bunch of code, get it to the point where it compiles and then feel that your code works is dangerous. Your code can still contain all sorts of runtime and logic errors. Therefore, if you want to have any degree of certainty that your code is free of bugs, you need to have some sort of test suite, regardless of whether you are using a dynamically or statically typed language.

I’ve also heard the argument made that statically-typed languages allow you to write fewer tests, but I have not found this to be the case. Most syntax and type errors will be found with the same tests you use to test the logic of the program. In other words, you don’t have to write one suite of tests to check for syntax errors, one for type checking and another for testing the logic of your program.
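As a small illustration of that point, here is a sketch in Ruby where a single expectation about behavior flags both a type mistake and a logic mistake. The check helper and the three label functions are invented for this example and are not taken from any test framework.

```ruby
# Hypothetical mini test helper: runs the block and reports what happened.
def check(expected)
  actual = yield
  actual == expected ? :pass : :wrong_value
rescue TypeError, NoMethodError
  # In a dynamic language a type mistake shows up as a runtime error,
  # which the same behavioral check trips over.
  :type_error
end

# Correct implementation.
def label_ok(count)
  "#{count} items"
end

# Type bug: adding a String to an Integer raises TypeError at runtime.
def label_type_bug(count)
  count + " items"
end

# Logic bug: the types are fine, but the answer is off by one.
def label_logic_bug(count)
  "#{count + 1} items"
end

# One expectation, written once, exercised against all three versions.
puts check("3 items") { label_ok(3) }
puts check("3 items") { label_type_bug(3) }
puts check("3 items") { label_logic_bug(3) }
```

A compiler for a statically-typed language would reject the second version before it ever ran, but it would accept the third one without complaint, so the test is needed either way.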

So the moral of the story is that regardless of whether you are writing your code in a statically or dynamically typed language, you still have to TATFT.