February 24, 2010
Let's say you're dealing with a large Rails codebase and you've got a Hash stored in a global variable or a constant and you want to know who is changing that Hash. Here's a contrived example:
IMPORTANT_STUFF = {
  :password => "too many secrets"
}
def change_password(h)
  h[:password] = "FAIL"
end
def print_password
  puts IMPORTANT_STUFF[:password]
end
print_password
change_password(IMPORTANT_STUFF)
print_password
Here it's pretty obvious where the Hash gets changed, but as I said, imagine you are trying to figure this out in a much larger codebase. Something is changing the value of IMPORTANT_STUFF and you don't know what. So how do you figure out what is? Easy, you do what Lester Freeman would do!

We set up a sting! We put a wire tap on IMPORTANT_STUFF and monitor all communication with IMPORTANT_STUFF. So how do we do that? Let's create a class that proxies all communication with a Hash:
class HashSpy
  def initialize(hash={})
    @hash = hash
  end
  def method_missing(method_name, *args, &block)
    puts "***** hash access"
    puts " before: #{@hash.inspect}"
    r = @hash.send(method_name, *args, &block)
    puts " after: #{@hash.inspect}"
    puts " backtrace:\n #{caller.join("\n ")}"
    r
  end
end
This uses a couple of interesting Ruby techniques. First, we pass the actual Hash to the constructor. Then we use method_missing so that any method called on the HashSpy is forwarded to the Hash, and the return value of that call is returned to the caller. Note that in Ruby 1.8 this isn't a transparent proxy, because if you call class on the HashSpy you get HashSpy, not Hash. In Ruby 1.9, you can have your object inherit from BasicObject, which doesn't define those methods, making it easier to build a transparent proxy. In Ruby 1.8, you can use Jim Weirich's Blank Slate pattern.
In HashSpy's method_missing, we use caller to get a backtrace of the current call stack, which will tell us who the perpetrator is.
So, if we just change IMPORTANT_STUFF to be created like this:
IMPORTANT_STUFF = HashSpy.new(
  :password => "too many secrets"
)
Now when we run the program, we'll get output something like this:
***** hash access
before: {:password=>"too many secrets"}
after: {:password=>"too many secrets"}
backtrace:
hash_spy.rb:27:in `print_password'
hash_spy.rb:30
too many secrets
***** hash access
before: {:password=>"too many secrets"}
after: {:password=>"FAIL"}
backtrace:
hash_spy.rb:23:in `change_password'
hash_spy.rb:31
***** hash access
before: {:password=>"FAIL"}
after: {:password=>"FAIL"}
backtrace:
hash_spy.rb:27:in `print_password'
hash_spy.rb:32
FAIL
And by reading through the output, we can see that the second time the hash is accessed is when the value is changed, so the perpetrator is on line 23 of hash_spy.rb in the change_password method. Here's the entire script in one gist for reference.
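The BasicObject approach mentioned earlier could look something like the following. This is only a sketch, assuming Ruby 1.9 or later; the TransparentHashSpy name and the out parameter are made up for illustration and are not part of the original script. Because BasicObject doesn't mix in Kernel, caller has to be reached through an explicit ::Kernel receiver.

```ruby
# A sketch of a more transparent spy, assuming Ruby 1.9+ (BasicObject).
# TransparentHashSpy and its `out` parameter are hypothetical names.
class TransparentHashSpy < BasicObject
  def initialize(hash = {}, out = $stdout)
    @hash = hash
    @out = out
  end

  # BasicObject defines almost nothing, so calls such as `class`,
  # `inspect` and `respond_to?` land here and are forwarded as well.
  def method_missing(method_name, *args, &block)
    before = @hash.inspect
    result = @hash.__send__(method_name, *args, &block)
    @out.puts "***** hash access (#{method_name})"
    @out.puts "  before: #{before}"
    @out.puts "  after:  #{@hash.inspect}"
    # Kernel isn't mixed into BasicObject, so call caller explicitly.
    @out.puts "  backtrace:\n    #{::Kernel.caller.join("\n    ")}"
    result
  end
end
```

With this version, asking the spy for its class is forwarded to the wrapped Hash, so code that checks the type of IMPORTANT_STUFF is less likely to notice the wire tap.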
Posted in Technology | Tags: Ruby, Rails | 2 Comments
December 9, 2009
Whether you are a Java or a Ruby programmer, I'm sure you are familiar with this idiom:
require 'logger'
log = Logger.new(STDOUT)
log.level = Logger::INFO
log.debug("hello")
log.info("Done")
That's a simple logger where the log level is set to info, so the debug statement isn't logged, but the info statement is. One gotcha to look out for is something like this:
require 'logger'
log = Logger.new(STDOUT)
log.level = Logger::INFO
def fib(n)
  if n < 1
    0
  elsif n < 2
    1
  else
    fib(n-1) + fib(n-2)
  end
end
log.debug("fib(30) => #{fib(30)}")
log.info("Done")
This also just logs "Done", but it takes more than a few seconds to do so. The reason is that even though the string passed to debug is never logged, the Ruby interpreter still has to pay the cost of building the string (which means calling fib(30)) and passing it to debug, where it gets ignored.
If you are an old Java programmer like me, you'll probably know you can fix it like this:
require 'logger'
log = Logger.new(STDOUT)
log.level = Logger::INFO
def fib(n)
  if n < 1
    0
  elsif n < 2
    1
  else
    fib(n-1) + fib(n-2)
  end
end
if log.debug?
  log.debug("fib(30) => #{fib(30)}")
end
log.info("Done")
That works, but it's not the Ruby way of doing it. It's the idiomatic way of doing it in Java, but that's because Java, as of this writing, has neither anonymous functions nor a concise syntax for creating them. The Ruby way of doing it is:
require 'logger'
log = Logger.new(STDOUT)
log.level = Logger::INFO
def fib(n)
  if n < 1
    0
  elsif n < 2
    1
  else
    fib(n-1) + fib(n-2)
  end
end
log.debug { "fib(30) => #{fib(30)}" }
log.info("Done")
The difference between this version and the original is that instead of passing a string to debug, we pass a block that returns a string when it is called. We don't have to wrap it in an if statement because the block can be conditionally evaluated based on the current log level.
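To see why the block version is cheap, here is a rough sketch of how a logger can defer evaluating the block. TinyLogger is a made-up class for illustration, not the standard library's Logger, though the idea is the same: check the level first, and only yield if the message is actually going to be written.

```ruby
# TinyLogger is a hypothetical, stripped-down logger used only to
# illustrate conditional block evaluation.
class TinyLogger
  LEVELS = { :debug => 0, :info => 1 }

  def initialize(level, out = $stdout)
    @level = level
    @out = out
  end

  def debug(message = nil, &block)
    add(:debug, message, &block)
  end

  def info(message = nil, &block)
    add(:info, message, &block)
  end

  private

  def add(severity, message)
    # Bail out before touching the block if this severity is disabled.
    return false if LEVELS[severity] < LEVELS[@level]
    # Only now is the (possibly expensive) block evaluated.
    message = yield if message.nil? && block_given?
    @out.puts "#{severity.to_s.upcase}: #{message}"
    true
  end
end
```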
The difference between the if statement and the block is admittedly minor. That being said, prefer the block syntax. :)
The important thing to remember is that if you have a debug statement that does any kind of calculating, pass it a block instead of just a string to avoid the overhead associated with unnecessarily building the string.
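If you want to convince yourself of the difference without waiting on fib(30), a small experiment like this one counts how often the expensive part runs. It's a sketch: count_expensive_calls and the lambda standing in for the expensive calculation are names I made up, and the log goes to a StringIO instead of STDOUT so the output can be inspected.

```ruby
require 'logger'
require 'stringio'

# Returns how many times the "expensive" work ran, plus what was logged.
def count_expensive_calls(level)
  out = StringIO.new
  log = Logger.new(out)
  log.level = level
  calls = 0
  expensive = lambda { calls += 1; "expensive value" }

  log.debug("eager: #{expensive.call}")    # the string is always built
  log.debug { "lazy: #{expensive.call}" }  # the block runs only if debug is enabled
  [calls, out.string]
end

info_calls, _ = count_expensive_calls(Logger::INFO)
debug_calls, _ = count_expensive_calls(Logger::DEBUG)
puts "INFO level:  #{info_calls} expensive call(s)"
puts "DEBUG level: #{debug_calls} expensive call(s)"
```

At INFO level only the eager call does any work; at DEBUG level both do.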
Posted in Technology | Tags: Ruby, Java, RubyOnRails | 1 Comment
November 5, 2009
If you haven't seen it before, Peter Norvig has a spelling corrector that is written
in just 21 lines of Python code (not counting blank lines and the import). He also lists a few other implementations in other languages,
including one in Ruby. The Ruby one was listed as 34 lines. I was surprised that
it was that many lines more in Ruby, so I wanted to give it a try. I didn't look at the
Ruby solution first and here's what I came up with:
require 'set'
def words(text)
  text.downcase.scan(/[a-z]+/)
end
def train(features)
  features.inject(Hash.new(1)){|model, f| model[f] += 1; model }
end
NWORDS = train(words(File.read('big.txt')))
ALPHABET = 'abcdefghijklmnopqrstuvwxyz'.split(//)
def edits1(word)
  s = (0..word.size).map{|i| [word[0,i], word[i,word.size]]}
  deletes = s.map{|a,b| !b.empty? ? a + b[1..-1] : nil}.compact
  transposes = s.map{|a,b| b.size > 1 ? a + b[1].chr + b[0].chr + b[2..-1] : nil}.compact
  replaces = s.map{|a,b| !b.empty? ? ALPHABET.map{|c| a + c + b[1..-1]} : nil}.flatten.compact
  inserts = s.map{|a,b| ALPHABET.map{|c| a + c + b}}.flatten
  Set.new(deletes + transposes + replaces + inserts)
end
def known_edits2(word)
  s = edits1(word).map do |e1|
    edits1(e1).map{|e2| NWORDS.include?(e2) ? e2 : nil}.compact
  end.flatten
  s.empty? ? nil : Set.new(s)
end
def known(words)
  s = Set.new(words.select{|w| NWORDS.include?(w)}) # every known word, not just the first
  s.empty? ? nil : s
end
def correct(word)
  candidates = known([word]) || known(edits1(word)) || known_edits2(word) || [word]
  candidates.max{|a,b| NWORDS[a] <=> NWORDS[b] }
end
To run this, download the data file, put the code in a file called spelling.rb, then start up IRB:
$ wget http://norvig.com/big.txt
$ irb -r spelling -f --simple-prompt
>> correct "speling"
=> "spelling"
>> correct "korrecter"
=> "corrected"
This one weighs in at 30 lines. I tried to do this as close to the Python implementation
as possible. I also tried to use idiomatic Ruby. You could shave the number of lines down
below 21, but it wouldn't meet any reasonable Ruby style guidelines. I'm still probably
cheating a little as a few of those lines are approaching 100 characters, but it's at least
reasonable, in my opinion.
Here are some things I noticed that caused the Ruby version to be longer or less clear:
end vs. significant indentation
6 of the lines are just an end statement. Python uses indentation to end methods,
so the end statements aren't needed in Python. This adds more lines, but has trade-offs.
I actually really like the idea of significant indentation; it's one of the reasons
that I'm such a big fan of Haml. But it falls down in certain places.
For example, Ruby has Embedded Ruby, which looks similar to JSP or PHP, but it's actually trivial to implement the basic cases. It truly is embedded Ruby, because the code between the <% %> and <%= %> tags is just Ruby code. You commonly see things like this:
<% if logged_in? %>
  Welcome, <%= current_user.login %>
<% else %>
  Howdy Stranger!
<% end %>
You can't do this in Python because of the lack of the end statement. This is why
I'm surprised Haml was invented in Ruby and not Python. In Python, it fits with the
language and is actually necessary, whereas in Ruby, significant indentation isn't
part of the language and ERB is actually pretty good. There is a Python port of Haml,
but I'm not sure how well that works or how widely it is used in the Python community.
List Comprehensions vs. Blocks
Python and Ruby, compared to C, Java, etc., have very powerful, concise syntax for
iterating through and transforming collections, but they are very different.
Python uses list comprehensions and Ruby uses blocks. As you can see above,
there are certain cases where list comprehensions are very compact. List comprehensions
can have a guard, which is a little cleaner than the Ruby equivalent where you have to
return nil if the guard doesn't match and then compact that result. Also,
when you want to iterate over two lists, you can do so with multiple for clauses,
whereas in Ruby you nest blocks and then flatten the result.
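For what it's worth, Ruby does have alternatives to the map-then-compact and map-then-flatten patterns I used in edits1. The sketch below uses hypothetical helper names (splits, deletes_with_compact and so on) rather than the code from the spelling corrector, and flat_map assumes Ruby 1.9.2 or later.

```ruby
# Every way of splitting a word into a [head, tail] pair.
def splits(word)
  (0..word.size).map { |i| [word[0, i], word[i..-1]] }
end

# Guard expressed by returning nil and compacting afterwards.
def deletes_with_compact(word)
  splits(word).map { |a, b| b.empty? ? nil : a + b[1..-1] }.compact
end

# Same guard expressed up front with reject, closer to a comprehension's "if".
def deletes_with_reject(word)
  splits(word).reject { |_, b| b.empty? }.map { |a, b| a + b[1..-1] }
end

# Two nested iterations collapsed with flat_map instead of map + flatten.
def inserts_with_flat_map(word, letters)
  splits(word).flat_map { |a, b| letters.map { |c| a + c + b } }
end

p deletes_with_compact("cat")
p deletes_with_reject("cat")
p inserts_with_flat_map("a", %w[x y])
```

It's still not as compact as a list comprehension, but the guard reads a little more naturally when it comes first.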
Collection Slicing
Python has a little cleaner, more consistent syntax for breaking up collections
into sub collections. It also treats strings as a collection of characters more
consistently.
Truthiness
In Python, many things are considered false. I'm not sure what the entire list is,
but it seems to include empty collections (and therefore empty strings).
Since Ruby only treats nil and false as false, we have to return nil instead
of an empty set from the known and known_edits2 methods, so that the series
of statements in the first line of the correct method will work correctly.
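Here's a tiny illustration of that difference. The non_empty and pick_candidates helpers are hypothetical names, not part of the spelling corrector; they only show why an empty collection has to be turned into nil before it can take part in a || chain.

```ruby
# In Ruby an empty array is truthy, so || stops at it.
p([] || ["fallback"])

# Hypothetical helper: map an empty collection to nil so || skips it.
def non_empty(collection)
  collection.empty? ? nil : collection
end

# Mirrors the shape of the first line of correct: take the first
# non-empty candidate list, falling back to the last argument.
def pick_candidates(exact, one_edit, fallback)
  non_empty(exact) || non_empty(one_edit) || fallback
end

p pick_candidates([], ["spelling"], ["speling"])
```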
In summary, in this case, the Python code is shorter and clearer, but it's pretty close.
I'm sure there are other code examples where the Ruby code would be a little shorter,
but I do think in general, Ruby and Python are going to be pretty close in terms of
code clarity and number of lines of code.
Posted in Technology | Tags: Python, Ruby | 0 Comments
October 9, 2009
You've probably seen more than a few articles on the web showing how to build a Rack app. If not, here's a good one to start with. You'll quickly see that building a Rack app is really simple, and that simplicity is what makes Rack awesome. But what about writing a Rack-compliant server? Well, it turns out that is pretty easy as well.
I just pushed a little Rack-compliant HTTP server that I wrote using GServer to GitHub. The whole thing is less than 200 lines of code. The core of it is short enough that I can explain how it works here.
First, GServer. GServer, which is short for "Generic Server", makes it pretty simple to create a multithreaded TCP server. Taking out some error handling code, here's what the GServer looks like for our Rack HTTP Server:
require 'gserver'

module GThang
  class HttpServer < GServer
    attr_reader :port, :rack_app
    def initialize(options={})
      @port = options[:Port] || 8080
      @rack_app = options[:rack_app]
      super(@port)
    end
    # Called by GServer once per client connection
    def serve(socket)
      RackHandler.new(socket, rack_app, port).handle_request
    end
  end
end
So all there is to a GServer is basically a serve method. This will be called each time a client connects to the server. The argument to the method is the client socket connection. You read and write data from the socket as you see fit for your application. As you can see here, we just pass the socket, along with the Rack app and the port, to the RackHandler initializer and then call handle_request on that. We'll look at how you set up the Rack app in a minute, but first let's take a look at the meat of what the RackHandler does. The handle_request method looks like this:
def handle_request
  return unless add_rack_variables_to_env
  return unless add_connection_info_to_env
  return unless add_request_line_info_to_env
  return unless add_headers_to_env
  send_response(app.call(env))
end
So what happens is the various add_ methods build up the Rack environment. Once the environment is ready, we call the Rack app. The Rack app responds with the standard three-element array (status, headers, body), which we pass off to the send_response method, which writes the actual HTTP response to the client. Take a look at the full code for this on GitHub for the details.
Now the fun part is that we have a fully functional HTTP server that is capable of acting as a file server or serving a Rails app. All we have to do is give the HttpServer the correct Rack app. If you look in the examples, you'll see this for the file server:
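To give a feel for the last step, here's a minimal sketch of turning a Rack response into bytes on the wire. This is not the send_response from the repository; write_rack_response and REASON_PHRASES are names I'm assuming for illustration, and it ignores things a real server has to care about, such as keep-alive and chunked encoding.

```ruby
require 'stringio'

# Hypothetical lookup table; a real server would cover many more codes.
REASON_PHRASES = { 200 => "OK", 404 => "Not Found", 500 => "Internal Server Error" }

# Writes a Rack response triple (status, headers, body) to any IO-like object.
def write_rack_response(io, response)
  status, headers, body = response
  io.write("HTTP/1.1 #{status} #{REASON_PHRASES.fetch(status, 'Unknown')}\r\n")
  headers.each { |name, value| io.write("#{name}: #{value}\r\n") }
  io.write("\r\n")                       # blank line separates headers from body
  body.each { |chunk| io.write(chunk) }  # a Rack body only has to respond to each
ensure
  body.close if body.respond_to?(:close)
end

# A StringIO stands in for the client socket here.
hello_app = lambda { |env| [200, { "Content-Type" => "text/plain" }, ["hello"]] }
socket = StringIO.new
write_rack_response(socket, hello_app.call({}))
puts socket.string
```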
GThang::HttpServer.run(
  Rack::CommonLogger.new(
    Rack::Lint.new(
      Rack::Directory.new(root, Rack::File.new(root)))),
  :Port => 8080)
I chose to write it this way to make it clear what is actually happening. You will normally see the builder DSL used to configure a Rack app, which would look like this:
use Rack::CommonLogger
use Rack::Lint
run Rack::Directory.new(root, Rack::File.new(root))
This is obviously a lot cleaner, but to understand how Rack works, you have to realize that all this is doing is what we see in the first example: use takes a middleware class and instantiates it with the next app in the chain as its first argument. A Rack app with Rack middleware is simply a chain of apps that call the next app in the chain, possibly modifying the environment or response before or after the rest of the chain is called.
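You can see the chain idea without Rack installed at all, since middleware is just an object that wraps another app and responds to call. In this sketch, AddRequestId, UpcaseBody, HELLO_APP and the "demo.request_id" key are all invented for illustration; they aren't part of Rack or of my server.

```ruby
# Middleware that modifies the environment before calling the next app.
class AddRequestId
  def initialize(app)
    @app = app
  end

  def call(env)
    env["demo.request_id"] = 42
    @app.call(env)
  end
end

# Middleware that modifies the response after the rest of the chain has run.
class UpcaseBody
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map { |chunk| chunk.upcase }]
  end
end

# The innermost app: anything that responds to call works, including a lambda.
HELLO_APP = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["id=#{env['demo.request_id']}"]]
end

# Built by hand, the same way the nested Rack::CommonLogger.new(...) example is.
STACK = AddRequestId.new(UpcaseBody.new(HELLO_APP))

p STACK.call({})
```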
So there you have it, beauty in simplicity.
Posted in Technology | Tags: Rails, Ruby, Rack | 4 Comments
September 14, 2009
This past weekend I attended the Windy City Rails conference. It was in a great location in the heart of downtown Chicago and seemed to have a pretty good turnout. There were many great talks, but this blog post will focus on a specific talk, and more precisely, part of a talk. Yehuda Katz gave a talk on the status of Rails 3. One of the things that he mentioned, which you may have already heard, is that Rails 3 will require Ruby 1.8.7 or higher, dropping support for Ruby 1.8.6. He also mentioned why they are doing this, and I found the reason interesting. It's not that the Rails core team wants to take advantage of any specific new features; it's that Ruby 1.8.6 has a bug that has been fixed in 1.8.7.
To see the bug in action, I recommend that you install Ruby Version Manager (rvm). Once you have installed rvm, install Ruby 1.8.6 and Ruby 1.8.7.
The bug is that in Ruby 1.8.6, the hash method for Hash doesn't generate the same hash code for different Hash objects with the same contents:
$ rvm use 1.8.6
$ irb
ruby-1.8.6-p383 > {:x => 1}.hash
=> 1313270
ruby-1.8.6-p383 > {:x => 1}.hash
=> 1307060
ruby-1.8.6-p383 > {:x => 1}.hash
=> 1296440
ruby-1.8.6-p383 > {:x => 1} == {:x => 1}
=> true
ruby-1.8.6-p383 > h = {{:x => 1} => "foo"}
=> {{:x=>1}=>"foo"}
ruby-1.8.6-p383 > h[{:x => 1}]
=> nil
So despite the fact that the two hashes have the same contents and are equal, you can't use a hash as a key in another hash, because key lookup depends on equal keys having equal hash codes, which they don't. This is fixed in Ruby 1.8.7:
$ rvm use 1.8.7
$ irb
ruby-1.8.7-p174 > {:x => 1}.hash
=> 327875
ruby-1.8.7-p174 > {:x => 1}.hash
=> 327875
ruby-1.8.7-p174 > {:x => 1}.hash
=> 327875
ruby-1.8.7-p174 > {:x => 1} == {:x => 1}
=> true
ruby-1.8.7-p174 > h = {{:x => 1} => "foo"}
=> {{:x=>1}=>"foo"}
ruby-1.8.7-p174 > h[{:x => 1}]
=> "foo"
This is important because you could use a hash to cache calls to a method that takes a hash as its argument, but only if you can use a hash as the key. This is one of the main reasons Rails 3 is going to require 1.8.7. They could make it work for both 1.8.6 and 1.8.7 and higher, but why? It simplifies things to just require that you upgrade to Ruby 1.8.7 to use Rails 3. If you are using 1.8.6, this is probably a gotcha that you should be aware of.
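As a rough illustration of that kind of caching, here's a sketch of a memoizer keyed on an options hash. OptionsCache is a made-up class, not anything from Rails, and it assumes Ruby 1.8.7 or later, where equal hashes produce equal hash codes.

```ruby
# Hypothetical cache keyed on an options hash. On Ruby 1.8.6 every fetch
# would miss, because equal hashes there don't share a hash code.
class OptionsCache
  attr_reader :misses

  def initialize(&computation)
    @computation = computation
    @cache = {}
    @misses = 0
  end

  def fetch(options)
    return @cache[options] if @cache.key?(options)
    @misses += 1
    # Store a frozen copy so later changes to the caller's hash can't corrupt the key.
    @cache[options.dup.freeze] = @computation.call(options)
  end
end

cache = OptionsCache.new { |options| options[:x] * 2 }
cache.fetch(:x => 1)
cache.fetch(:x => 1)   # a different Hash object with the same contents
puts "misses: #{cache.misses}"
```

On 1.8.7 and later the second fetch is served from the cache, so the computation only runs once.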
Posted in Technology | Tags: Rails, Ruby | 0 Comments