Paul Barry

Zip Code Proximity Search with Rails

June 27, 2009

So you’re building the next big social networking website using Rails and like all the other hip kids you are going to need to allow your users to search for other users near them. The fancy term for this is “Proximity Search”. For our search, we just want be able to find other people that are generally within some radius, like 5, 10 or 25 miles. For this, there is no need to geocode the address for each user in our database, we’ll just use their zip code. So effectively, in our system, every user’s location is just the center point of their zip code.

For starters we want to create a zip code model:

script/generate model zip code:string city:string state:string lat:decimal lon:decimal

That will create a model and a migration. You need to alter the migration to specify the precision and scale for the lat and the lon.

t.decimal :lat, :precision => 15, :scale => 10
t.decimal :lon, :precision => 15, :scale => 10

So to populate this database, luckily the good people over at the US Census Bureau have the data readily available for us. I’ve created a rake task to download and load that data into your zips table. Simply put the load.rake file from this gist into the lib/tasks directory of your Rails app.

So now when you run rake load:zip_codes you should see something like:

== Loaded 29470 zip codes in ( 1m 40s) ========================================

Next we need a table for our users. So let’s generate a model and a migration:

script/generate model user

I’ll save you the hassle of typing out all the fields at the command-line and just give them to you here. Paste this into the create_users migration that was generated:

t.string   :username
t.string   :email
t.string   :password
t.string   :password_confirmation
t.string   :first_name
t.string   :last_name
t.string   :address
t.string   :city
t.string   :state
t.integer  :zip_id    

Next you need to hook up the relationship between the zip and the user. This is basic stuff, the zip has many users and the user belongs to a zip.

Now we need some users to play with. A great tool for this is Mike Subelsky’s Random Data gem. I’ve already created a rake task that uses this gem to create some test user accounts. You call it like this:

rake load:random_users[10000]

The 10000 is the number of users we want the rake task to generate for us. Did you know you can pass command-line arguments to a rake task like that? Pretty spiffy. 10000 is a pretty good number because it gives us a fairly large dataset to work with and is still able to load in a reasonable amount of time. 10000 users finished in about 6 minutes and 30 seconds for me.

Next we need to setup our methods to do the querying. For this I basically used Josh Huckabee’s Simple Zip Code Perimeter Search method, but re-worked it a little so we can use named scope with it. You can grab the code for both zip.rb and user.rb from the gist.

There are a couple of things we get here. First is a named scope to easily find zip codes. Looking at the output of the loading of the random users, the last one for me was Mr. Steven Moore of Koloa, HI, 96756. So let’s see how many other people are in that zip code. Start up script/console and run this:

>> Zip.code(96756).users.count
=> 1

Hmm…I guess it’s lonely in Hawaii. Let’s find the zip code that randomly ended up with the most inhabitants:

>> Zip.count_by_sql "select zip_id, count(*) as count 
from users group by zip_id order by count desc limit 1"
=> 18177

Ok, so that’s the id of the zip record, not to be confused with the actual zip code. So let’s find the first person in this zip code:

>> user = Zip.find(18177).users.first
=> #<User id: 1267, username: "cabel1266", ...>

I got Ms. Cheryl Abel of Bloomville, NY. So now for the big moment. What we really want to do is find everyone within 25 miles of Cheryl.

>> user.within_miles(25).count(:all)
49

Looks like Cheryl has 49 people nearby. Let’s see who they are:

>> user.within_miles(25).all.each{|u|
?> puts "%.2f %20s, %2s, %5s" %
?> [u.distance, u.city, u.state, u.zip.code]}
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
0.00           Bloomville, NY, 13739
7.04            Worcester, NY, 12197
7.04            Worcester, NY, 12197
7.43             Maryland, NY, 12116
8.09             Meredith, NY, 13753
8.54            De Lancey, NY, 13752
8.71     Livingston Manor, NY, 12758
9.11             Roseboom, NY, 13450
9.88          Jordanville, NY, 13361
...

So there you have it! I’m still trying to work out some kinks with this and get it to work with count and will paginate, so if you have any suggestions, fork the gist, hack away and leave a comment. I’ll update this post when I get count and pagination working.

Posted in Technology | Topics Ruby, Rails | 2 Comments

Comments

1.

Paul, this is a great article. One comment: I noticed the zip code database you reference only contains around 29,000 of the 43,000 active US zip codes. I found this database as well which seems more complete and only requires slight modification to your script.

http://www.boutell.com/zipcodes/

# Posted By Doug Petkanics on Wednesday, January 20 2010 at 9:49 PM

Comments Disabled