Emulating Haskell's Maybe in Java with Checked Exceptions

July 17, 2009

Haskell is one of the interesting languages that I've been looking into off and on for a while now. I think the type system is really interesting and if you don't know Haskell, you should check it out. I've found that the best introduction to the language is Learn You A Haskell. Once you've got the basics down, you should tackle Real World Haskell, which is a much more in-depth book. I'm still working through it when I have some spare time.

One interesting concept in Haskell is the Maybe type. The purpose of the Maybe type is to declare in the type signature of a function that a function might return a value or it might return nothing. In Java, the return type of method can't indicate that. For example, if you have a method with this signature:

Object lookup(String key)

You have no idea if that will actually give you an object or not. The lookup method may or may not be implemented in such a way that prevents it from returning null. If the method does return null, you are risking the possibility of encountering the dreaded NullPointerException.

Haskell doesn't have this problem because it doesn't treat null in the same way that Java treats null. As we just implied in the previous paragraph, this is a valid implementation of the lookup method from the Java compiler's perspective:

Object lookup(String key) {
  return null;
}

Haskell's equivalent to null is called Nothing, but the difference is that Nothing cannot stand in for all other types, as it can in Java. So if you actually had a type called Object in Haskell, a function that returns an Object cannot return Nothing.

A practical example of this would be looking up a value in a Map. When you try to retrieve the value for a key in a map, you might get a value or you might not find one. Here is an invalid Haskell program that tries to do that:

import Prelude hiding (lookup)
import Data.Map

zipToCityMap = fromList([("21230", "Baltimore"), ("21046", "Columbia")])

main = do
  putStrLn "What is your zip code?"
  zip <- getLine
  putStrLn $ "You live in " ++ (lookup zip zipToCityMap)

If you try to run this (which you can do on a mac by first installing ghc through macports with the command sudo port install ghc, then saving the contents above to city.hs and running the command runghc city.hs), you will get this compilation error:

city.hs:9:32:
    Couldn't match expected type `[Char]'
           against inferred type `Maybe [Char]'
    In the second argument of `(++)', namely
        `(lookup zip zipToCityMap)'
    In the second argument of `($)', namely
        `"You live in " ++ (lookup zip zipToCityMap)'
    In the expression:
          putStrLn $ "You live in " ++ (lookup zip zipToCityMap)

What the compiler is trying to say is that the lookup function returns a Maybe [Char], not a [Char]. [Char] is a type that is a list of characters. In Haskell, the type String is just an alias for [Char]. By wrapping the [Char] in a Maybe, the lookup function requires it's callers to unravel the value out of the Maybe in order to use it as well as explicitly handle the case that no value was found. If we modify our program to look like this:

import Prelude hiding (lookup)
import Data.Map

zipToCityMap = fromList([("21230", "Baltimore"), ("21046", "Columbia")])

main = do
  putStrLn "What is your zip code?"
  zip <- getLine
  case lookup zip zipToCityMap of
    Nothing -> putStrLn "I don't know where you live"
    Just city -> putStrLn $ "You live in " ++ city

This will compile and run. The program will ask you for your zip code and it will print the city you live in, but if it can't find it in its "database", it prints a friendly response rather than blowing up. Now here's a Java version of this program:

import java.util.Scanner;
import java.util.Map;
import java.util.HashMap;

class City {

  String name;
  String state;

  public City(String name, String state) {
    this.name = name;
    this.state = state;
  }

  public static City lookup(String zip) {
    Map zipToCityMap = new HashMap();
    zipToCityMap.put("21230", new City("Baltimore", "MD"));
    zipToCityMap.put("21046", new City("Columbia", "MD"));
    return (City)zipToCityMap.get(zip);
  }

  public static void main(String[] args) {
    System.out.println("What is your zip code?");
    Scanner in = new Scanner(System.in);
    String zip = in.nextLine();
    System.out.println("You live in "+lookup(zip).name);
  }

}

If you compile and run this, if you enter one of the chosen zip codes, it will tell you what city you live in. But if you enter an incorrect zip code, you will get the dreaded NullPointerException:

$ javac City.java
Note: City.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
$ java City
What is your zip code?
21230
You live in Baltimore
$ java City
What is your zip code?
90210
Exception in thread "main" java.lang.NullPointerException
        at City.main(City.java:26)

Now to get around this you would normally just check to see if the city is null and then print an alternative method if it is:

City city = lookup(zip);
if(city != null) {
  System.out.println("You live in "+.name);  
} else {
  System.out.println("I don't know where you live");
}

We know to do this only because we know the lookup method might return null. It would be nice if when defining the lookup method we could make it explicit to the callers of the method that the method might return null and have the compiler generate an error if the caller does not explicit check for the condition of a null result. If this were the case, a NullPointerException would be impossible. This is exactly what Haskell's Maybe type does.

A few other people have attempted to define a class in Java that mimics the Haskell Maybe type. That appears to work somewhat, but actually seems fairly convoluted to me. Another way to achieve the same goal would be to use a checked exception. To do this, put this in a file called MaybeNullException:

public class MaybeNullException extends Exception {}

Then add throws MaybeNullException to the end of the method signature for the lookup method. With just those changes, you will get a compilation error:

City.java:26: unreported exception MaybeNullException; must be caught or declared to be thrown
    System.out.println("You live in "+lookup(zip).name);

Which is exactly what we want. We have indicated in the method signature of our lookup method that it might return null, therefore the caller of our method is required to do something about it.

To get this to work, first we need to make the lookup method actually throw the exception if the city isn't found. This is semantically equivalent to returning Nothing in the Haskell version:

City city = (City)zipToCityMap.get(zip);
if(city == null) {
  throw new MaybeNullException();
} else {
  return city;
}

And then in the method where we call lookup, we need to check for the Exception:

try {
  System.out.println("You live in "+lookup(zip).name);  
} catch(MaybeNullException e) {
  System.out.println("I don't know where you live");  
}

Now we get the desired behavior:

$ javac -cp . City.java
Note: City.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
$ java City
What is your zip code?
21230
You live in Baltimore
$ java City
What is your zip code?
90210
I don't know where you live

The problem with this solution, at least from a Java perspective, is that many people, myself included, consider checked exceptions to be a failed experiment. Making use of checked exceptions all over your code results in having to add a lot more boilerplate and try/catch statements. But is this a small price to pay for eliminating the NullPointerException? Or does the Maybe type impose too much overhead in Haskell, making it doomed to fall out of favor the same way checked exceptions did in Java? My thought is no, that packing as much information as possible into the type of functions is the essence of Haskell. Also in Java, checked exceptions are something programmers learned to essentially get rid of by wrapping them in unchecked exceptions, where Nothing is very core to the way Haskell works. Nevertheless I thought this was an interesting thought experiment and might serve as a good analogy for Java programmers learning Haskell.

I should point out one more thing on this topic which is that the Maybe type is definitely more powerful in Haskell because it is a Monad. Because of this you can chain function calls together without having to deal with the possibility of something like a null pointer exception. The Java Checked Exception doesn't provide us a similar mechanism.

Posted in Technology | Tags NullPointerException, Maybe, Haskell, Java, MaybeMonad, CheckedExceptions | 18 Comments

A rake task for tracking your time with git

July 7, 2009

Are you using Ruby on Rails? Are you using Git? Do you have a need to track how long you spend on things? Then I have just the thing for you.

I threw together a quick rake task that gets all of your commits in a git repo and parses out the times and commit message from them. Then it formats them with the time and also the time interval between them. You can get the rake task to track your time from this gist.

The output will look something like this:

Fri, Jul 07 10:55AM  20m 49s  Added toolbar for controllers using temp...
Fri, Jul 07 10:34AM  21h 52m  Added support for using page templates i...
Thu, Jul 07 12:42PM  37m 57s  LH#77, fixed issue with tests failing on...
Thu, Jul 07 12:04PM  12m 18s  LH#67, added a limit option to the rende...
Thu, Jul 07 11:52AM  17m 30s  Removed debug statement                 ...
Thu, Jul 07 11:34AM  19h 52m  LH#66, added :path option to render menu...
Wed, Jul 07 03:41PM           Added DSL for modifying portlet behavior...
Tue, Jun 06 02:05PM  18h 44m  LH#119, multiple HTML fields on one bloc...
Mon, Jun 06 07:20PM   6h 21m  Converted docs to textile               ...
Mon, Jun 06 12:58PM           Fix for LH#118, create directories in ge...
Sat, Jun 06 10:22PM           Added support for other template handler...
Fri, Jun 06 04:49PM   0m 58s  bump build                              ...
Fri, Jun 06 04:48PM  23m 11s  Fix LH#106: Section not correctly loadin...
Fri, Jun 06 04:25PM  34m 25s  Fix for LH#107, images were not showing ...
Fri, Jun 06 03:51PM   9m 48s  Fix for LH#110, can't view usages of a p...
Fri, Jun 06 03:41PM  11m 12s  Fix for LH#113, check to see if there is...
Fri, Jun 06 03:30PM   2m 52s  Fixed LH#114, documentation typo        ...
Fri, Jun 06 03:27PM   0m 38s  bump build number                       ...
Fri, Jun 06 03:26PM   5h 38m  Fix for LH#98, tags not getting updated ...
Fri, Jun 06 09:48AM  33m 14s  Fixed LH#105, deleted portlets showing u...

It doesn't actually truncate the commit messages, I just did that here to make each one fit on a line. If the time interval is over 24 hours, it doesn't bother printing the interval, because you probably didn't actually work on that one commit for 37 hours straight. I've been thinking if you really want to track time this way then each time you sit down to start hacking on a project, you just make a minor change to the .gitignore or something and then commit it with a message like "started hacking on foo", so then when you commit your first chunk of actual work, you will know how long you spend on that.

Posted in Technology | Tags Git, Ruby, Rake | 4 Comments

Programming Clojure

June 28, 2009

Programming Clojure

A few weeks ago I received my printed copy of Programming Clojure by Stuart Halloway in the mail. I had been a technical reviewer of the book so I was excited to see it finally in print. In case you haven't heard of it yet, Clojure is a programming language designed by Rich Hickey, a Lisp dialect, that runs on the Java Virtual Machine and is designed support concurrent programming.

Clojure has excellent documentation and Rich has posted several great videos of talks he has given that cover the rational for Clojure as well as an introduction into the major concepts. I highly recommend that you watch those videos if you haven't already because Rich does a great job explaining why concurrency is hard using the typical Object-Oriented model that we program in today and how the features of Clojure support a better model for concurrency. Whether you are programming in Java, C#, Erlang, Haskell, Python, Ruby, etc., you will probably be able to learn something from these talks, and plus Rich is just an interesting guy to listen to.

So after all that, you might be saying why do we need a book about Clojure? The answer is that although the documentation is good, it can be a little intimidating when first learning Clojure. For programmers with little or no Lisp or functional programming experience, figuring out how to do basic things the idiomatic way in Clojure can be a daunting task. Stuart's book does an excellent job filling this gap.

The book covers all the major feature of Clojure and is very up-to-date. As a reviewer I got to see the evolution of this book from revision to revision and it was amazing to me to see how much work Stuart put in. Chapters were often completely re-written to keep up with changes in the language that occurred before it stabilized in a 1.0 release. I think the final product greatly benefited from that work and is an excellent resource for learning Clojure. I encourage you to pick up a copy today.

Posted in Technology | Tags Clojure | 2 Comments

Zip Code Proximity Search with Rails

June 27, 2009

So you're building the next big social networking website using Rails and like all the other hip kids you are going to need to allow your users to search for other users near them. The fancy term for this is "Proximity Search". For our search, we just want be able to find other people that are generally within some radius, like 5, 10 or 25 miles. For this, there is no need to geocode the address for each user in our database, we'll just use their zip code. So effectively, in our system, every user's location is just the center point of their zip code.

For starters we want to create a zip code model:

script/generate model zip code:string city:string state:string lat:decimal lon:decimal

That will create a model and a migration. You need to alter the migration to specify the precision and scale for the lat and the lon.

t.decimal :lat, :precision => 15, :scale => 10
t.decimal :lon, :precision => 15, :scale => 10

So to populate this database, luckily the good people over at the US Census Bureau have the data readily available for us. I've created a rake task to download and load that data into your zips table. Simply put the load.rake file from this gist into the lib/tasks directory of your Rails app.

So now when you run rake load:zip_codes you should see something like:

== Loaded 29470 zip codes in ( 1m 40s) ========================================

Next we need a table for our users. So let's generate a model and a migration:

script/generate model user

I'll save you the hassle of typing out all the fields at the command-line and just give them to you here. Paste this into the create_users migration that was generated:

t.string   :username
t.string   :email
t.string   :password
t.string   :password_confirmation
t.string   :first_name
t.string   :last_name
t.string   :address
t.string   :city
t.string   :state
t.integer  :zip_id    

Next you need to hook up the relationship between the zip and the user. This is basic stuff, the zip has many users and the user belongs to a zip.

Now we need some users to play with. A great tool for this is Mike Subelsky's Random Data gem. I've already created a rake task that uses this gem to create some test user accounts. You call it like this:

rake load:random_users[10000]

The 10000 is the number of users we want the rake task to generate for us. Did you know you can pass command-line arguments to a rake task like that? Pretty spiffy. 10000 is a pretty good number because it gives us a fairly large dataset to work with and is still able to load in a reasonable amount of time. 10000 users finished in about 6 minutes and 30 seconds for me.

Next we need to setup our methods to do the querying. For this I basically used Josh Huckabee's Simple Zip Code Perimeter Search method, but re-worked it a little so we can use named scope with it. You can grab the code for both zip.rb and user.rb from the gist.

There are a couple of things we get here. First is a named scope to easily find zip codes. Looking at the output of the loading of the random users, the last one for me was Mr. Steven Moore of Koloa, HI, 96756. So let's see how many other people are in that zip code. Start up script/console and run this:

>> Zip.code(96756).users.count
=> 1

Hmm...I guess it's lonely in Hawaii. Let's find the zip code that randomly ended up with the most inhabitants:

>> Zip.count_by_sql "select zip_id, count(*) as count 
from users group by zip_id order by count desc limit 1"
=> 18177

Ok, so that's the id of the zip record, not to be confused with the actual zip code. So let's find the first person in this zip code:

>> user = Zip.find(18177).users.first
=> #<User id: 1267, username: "cabel1266", ...>

I got Ms. Cheryl Abel of Bloomville, NY. So now for the big moment. What we really want to do is find everyone within 25 miles of Cheryl.

>> user.within_miles(25).count(:all)
49

Looks like Cheryl has 49 people nearby. Let's see who they are:

>> user.within_miles(25).all.each{|u|
?> puts "%.2f %20s, %2s, %5s" %
?> [u.distance, u.city, u.state, u.zip.code]}
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
7.04            Worcester, NY, 12197
7.04            Worcester, NY, 12197
7.43             Maryland, NY, 12116
8.09             Meredith, NY, 13753
8.54            De Lancey, NY, 13752
8.71     Livingston Manor, NY, 12758
9.11             Roseboom, NY, 13450
9.88          Jordanville, NY, 13361
...

So there you have it! I'm still trying to work out some kinks with this and get it to work with count and will paginate, so if you have any suggestions, fork the gist, hack away and leave a comment. I'll update this post when I get count and pagination working.

Posted in Technology | Tags Ruby, Rails | 2 Comments

Ruby Nation

June 13, 2009

Thanks to everyone who attended my talk yesterday at Ruby Nation about BrowserCMS. I've posted my slides online here. Unfortunately the demo part of the talk wasn't record, so I'll try to record a screencast of the demo. Look for that in the next few days.

If you did attend my talk, I would appreciate it if you take a minute to rate it.

Posted in Technology | Tags Ruby | 0 Comments

<< Newer Articles   Older Articles >>