Paul Barry

Thoughts on Monolith First

March 22, 2020

As I was reading about the Hanami Ruby web framework and making notes about how similar it is to an open-source web framework that I had been working on years ago, one of the architectural principles of Hanami that caught my attention is a pattern proposed by Martin Fowler called Monolith First. I encourage you to read Fowler’s article as well as the counterpoint, also posted on Fowler’s site, called Don’t Start With a Monolith.

I find that the topic of monolith vs. microservices comes up often when working with software teams building applications. Many engineers today take it as fact that a monolithic architecture is just a flawed way to build software. The analogy I would make is to waterfall vs agile. It is widely accepted within modern product and engineering organizations that the waterfall methodology is a flawed, outdated model of working and that agile development is more productive. I have heard the term waterfall used to describe a broken product development process, meaning someone can make the statement “what we are doing is waterfall” and the person making the statement assumes that everyone they are speaking to knows the implication is that it is bad. In other words, waterfall is synonymous with broken or flawed. There is no one arguing that waterfall is better than agile.

I have also been in similar conversations where the term monolith is treated the same way. There is a general assumption that everyone agrees monolith is bad and microservices is good. This is not the case though. There are people who argue for the benefits of monoliths, including Fowler, who is an extremely well-regarded software architect. Rails creator David Heinemeier Hansson also argues in favor of monolithic architecture and even tries to brand it in a positive light with a term he coined, “The Majestic Monolith”.

The point that Fowler argues is that it isn’t that a monolithic architecture is better, it is just that it has different advantages and disadvantages than a microservices architecture, and that the tradeoffs are in the favor of the monolith for new projects.

Now having been in many different technology organizations throughout my career and built many applications all across the spectrum of monolith to microservices and everything in between, I’ve come to agree with DHH on this issue, but here is the key point in his Majestic Monolith article:

The biggest factor in determining whether you should use a monolithic or microservices architecture is the size of your team and organization.

DHH isn’t arguing that a monolith is better, he is saying that it is the right architecture for a small team. A microservices architecture isn’t right for a small team in the same way that a monolith isn’t right for a large organization.

So to Fowler’s point on monolith first, I would say it depends on the size of the team and organization. If you are building a new application at Amazon or Google, a monolith isn’t going to make sense, but if you are a 3 person startup, it does.

This is a principle that has been known within software development circles for a long time. It was originally captured in Conway’s Law, which states that the design of software systems tends to reflect the structure of the organizations that build them.

Therefore, I think you can think of the monolith vs microservices argument as a decision that ends up being made for you once you decide what the organization and the team building the application will look like. If you are going to have a small team, it is likely a monolith will be more productive and if you are going to have a big team, it is likely that a microservices architecture will be more productive, so let that be your guide.

Posted in Technology

Hanami, a better Rails?

March 22, 2020

For some time now, I have been observing from afar Hanami, the framework previously known as Lotus. Hanami is promoted as a “modern web framework for Ruby”. What I see in Hanami is a new version of Rails re-designed from the ground up. It is still a full-featured framework that has everything needed to build web applications: actions, views, layouts, templates, mailers, models, ORM, etc. So this isn’t like Sinatra or Merb, lightweight frameworks that are only useful for very simple web applications. It’s a batteries-included, full-fledged web framework.

Hanami differs from Rails in several ways. First is the idea that each component of the framework should stand on its own. This is something that has always bugged me about Rails. It is very hard and unintuitive to use most of the components of Rails in a standalone, outside of Rails way. Not that you really do need to do this a lot, but where I did find this to be annoying is when debugging issues where you need to go into the internal implementation of Rails. I always found it to be unnecessarily complicated. I would often ask myself “why is this this hard?” and “why is there this much code required to do X?”.

For example, why can’t you just instantiate a controller, call it and get the response? There are several components like this, and as a side project, I started implementing things like this on my own. For example, I created Rack::Action. The idea is that each action is its own class, instead of multiple actions per controller. If you want to share functionality, you use inheritance. Each action ends up being a Rack app. It’s small (few hundred lines of code), fast and easy to use and understand. So I was excited to see that Hanami has actions that work the same way.
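To make the idea concrete, here is a minimal sketch of one class per action where each action is a Rack app. This is not the actual Rack::Action or Hanami API. The names BaseAction and ShowGreeting are hypothetical, and the only thing assumed is the Rack convention that an app responds to call(env) and returns a status, headers and body:

```ruby
require "uri"

# Hypothetical base class: every action is a Rack app, i.e. an object that
# responds to call(env) and returns [status, headers, body].
class BaseAction
  def self.call(env)
    new(env).call
  end

  def initialize(env)
    @env = env
  end

  def call
    body = respond
    [200, { "content-type" => "text/plain" }, [body]]
  end

  private

  # Shared behavior lives in the base class and is inherited by each action.
  def query_params
    URI.decode_www_form(@env.fetch("QUERY_STRING", "")).to_h
  end
end

# One class per action; it only has to say what to respond with.
class ShowGreeting < BaseAction
  private

  def respond
    "Hello, #{query_params.fetch("name", "world")}!"
  end
end

# No server or generated project needed: call it and look at the response.
status, headers, body = ShowGreeting.call({ "QUERY_STRING" => "name=Paul" })
puts status
puts body.first
```

Because the action is just an object with a call method, you can exercise it from the console or a test without booting a framework.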

Next is routing. Now that each action is its own Rack app, it’s easy to think of routing as just having a router that is also itself a Rack app, and all it does is find the right Rack app, call it and then return that response. So this is what led me to create Rack::Router. Again, it’s small (few hundred lines of code), fast and easy to use and understand. You can write a few lines of standard Ruby, instantiate a router and use it, without even having to generate a project. And again in Hanami, the router works very similarly.
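As a rough sketch of that idea, and not the real Rack::Router or Hanami router API, a router can be a small Rack app that looks up another Rack app by request method and path. The class name TinyRouter and its get method are hypothetical, and it only does exact path matches:

```ruby
# Hypothetical router: itself a Rack app whose only job is to find the
# right Rack app for the request and return that app's response.
class TinyRouter
  NOT_FOUND = [404, { "content-type" => "text/plain" }, ["Not Found"]].freeze

  def initialize
    @routes = {}
  end

  # Register any object that responds to call(env) for a GET path.
  def get(path, app)
    @routes[["GET", path]] = app
    self
  end

  def call(env)
    app = @routes[[env["REQUEST_METHOD"], env["PATH_INFO"]]]
    app ? app.call(env) : NOT_FOUND
  end
end

# A lambda is the simplest possible Rack app.
hello = ->(_env) { [200, { "content-type" => "text/plain" }, ["hello"]] }
router = TinyRouter.new.get("/hello", hello)

found = router.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/hello" })
missing = router.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/nope" })
puts found.first
puts missing.first
```

A real router also needs path parameters and other HTTP methods, but the shape stays the same: find an app, call it, return its response.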

So I’ll go on to views and templates. This is another one that bothers me about Rails. Why can’t I just instantiate a class and call a method to render a template? Do you know how to do that in Rails? Could you go into the console and just call a method where you give it a Hash and the name of a template and it would give you the rendered result? This is why I created Curtain. Again, same philosophy here: a simple to use standalone component, small, fast, etc. This one also has a design difference from Rails, where you have a class that is called a view, that is separate from the actual template. Rails calls templates “views”, which never really made sense to me. In the terminology I use in Curtain, a View is the object that is the context that the template is rendered within. Views eliminate the need for “helpers”, because you can just define methods on the view and then call them from the template. You can organize your helpers into modules and include them in specific views as needed. This is another one where Hanami follows the same philosophy as Curtain, including keeping view classes separate from the template.
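Here is a small sketch of a view object acting as the rendering context, using only ERB from the standard library. It is not the Curtain or Hanami API. GreetingView and its shout method are hypothetical, and for brevity it takes the template as a string rather than looking one up by name:

```ruby
require "erb"

# Hypothetical view class: the template is rendered in the context of this
# object, so its methods play the role of "helpers".
class GreetingView
  def initialize(locals)
    @name = locals.fetch(:name)
  end

  def shout(text)
    text.upcase
  end

  # Evaluate the template with this object's binding, so the template can
  # see instance variables and call methods defined on the view.
  def render(template)
    ERB.new(template).result(binding)
  end
end

template = "Hello, <%= shout(@name) %>!"
rendered = GreetingView.new({ name: "Paul" }).render(template)
puts rendered
```

This is the kind of thing you can try directly in a console: build a view from a Hash, hand it a template and get a string back.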

Another thing to call out is code reloading. To avoid having to restart your application in development, Rails has a lot of code to take care of that for you. A much simpler way of achieving the same effect is to use Shotgun. This is what Hanami recommends and what I have used in non-Rails Rack applications in the past as well. It is another prudent choice here by Hanami to just encourage the use of Shotgun rather than complicate the framework with this functionality.

The last thing that I’ll point out where I’ve independently come to the same conclusion as Hanami is what Hanami calls Interactors. I have to admit that I wasn’t familiar with the term Interactor until I saw it in the Hanami docs, but I’m definitely familiar with the concept. The basic idea here is that for somewhat complex business or persistence logic, which might involve wrapping multiple persistence calls in one transaction, instead of having that pollute your models with class or instance methods, you write an Interactor for each of these operations. In my applications, I’ve been calling these things “Services” and I’ve heard that same terminology used by others. I have found this pattern to be a huge improvement in code testability, reusability and organization. This is such an improvement in the quality of application architecture that it is a bit of a head scratcher to me that there isn’t something in Rails like this, if for no other reason than to standardize and encourage the pattern.
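A minimal sketch of the pattern in plain Ruby, assuming nothing about Hanami’s actual Interactor API. FakeStore is a hypothetical in-memory stand-in for a database with transactions, and SignUpUser is a hypothetical service that wraps two writes in one transaction:

```ruby
# Hypothetical in-memory store standing in for a database connection.
class FakeStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Roll back every write made inside the block if the block raises.
  def transaction
    snapshot = @rows.dup
    yield
  rescue StandardError
    @rows = snapshot
    raise
  end

  def insert(row)
    @rows << row
  end
end

# One class per business operation, kept out of the models.
class SignUpUser
  Result = Struct.new(:success, :error)

  def initialize(store)
    @store = store
  end

  def call(email)
    @store.transaction do
      @store.insert({ type: :user, email: email })
      # Simulates a failure partway through the operation.
      raise ArgumentError, "invalid email" unless email.include?("@")
      @store.insert({ type: :welcome_email, to: email })
    end
    Result.new(true, nil)
  rescue ArgumentError => e
    Result.new(false, e.message)
  end
end

store = FakeStore.new
ok = SignUpUser.new(store).call("paul@example.com")
bad = SignUpUser.new(store).call("not-an-email")
p ok.success
p bad.error
p store.rows.size
```

The service can be tested with nothing but a store object, and the failed call leaves no partial rows behind because the transaction restores the snapshot.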

I haven’t had a chance to build a real application with Hanami yet, but I’d like to give it a try at some point. There are so many design decisions in Hanami I agree with and have considered myself in the past. 7 years ago, I started to work on a framework that I was calling Sharp that pulled together all the components that I have mentioned in this article in a similar way to what Hanami has. I never got the opportunity to get it to be production ready, “life got in the way” as they say, but it is nice to see something complete and polished that is similar philosophically in so many ways.

Posted in Technology | Topics Ruby, Rails

Fresh Coat of Paint

March 22, 2020

Wow, it’s been over 10 years since I last wrote anything new on my blog here or updated the application that it runs on, so I finally got around to updating it today. It’s got a new responsive design and is running on Rails 6.0. I have a few new articles in mind to write, so check back for new content soon!

Posted in General

Zipping Arrays

August 23, 2010

When programming in any language, you are sure to be in a situation at some point where you have two or more arrays that match up by index. For example, say you have this:

cities = %w[Baltimore Washington Pittsburgh]
teams = %w[Ravens Redskins Steelers]

So in this case, the name of the team in the nth city is the nth team. In languages like Java and JavaScript, a common method for doing this would be to use a for loop and pull each value out of the array using the index:

for i in (0...cities.size)
  puts "%s %s" % [cities[i], teams[i]]
end

As you can see, this works in Ruby as well. The output of that will be:

Baltimore Ravens
Washington Redskins
Pittsburgh Steelers

But Ruby’s Enumerable module has a built-in method for handling this that you might not know about. On any Enumerable, you can call zip and pass in another array and it will return a two-dimensional array with each of the values paired up:

p cities.zip(teams) 
# => [["Baltimore", "Ravens"], ["Washington", "Redskins"], ["Pittsburgh", "Steelers"]]

Conveniently, Ruby’s each method also allows you to assign each value of the sub-array to a variable in the block. So we can perform the for loop from above like this:

cities.zip(teams).each do |city, team|
  puts "%s %s" % [city, team]
end

Which outputs:

Baltimore Ravens
Washington Redskins
Pittsburgh Steelers

No indexes to keep track of. Also, the zip method can take multiple arrays, so you can zip up more than one array and iterate through them in a similar fashion:

qbs = %w[Flacco McNabb Roethlisberger]

cities.zip(teams, qbs).each do |city, team, qb|
  puts "%s %s %s" % [city, team, qb]
end

Which outputs:

Baltimore Ravens Flacco
Washington Redskins McNabb
Pittsburgh Steelers Roethlisberger
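
Two details are worth knowing when the arrays don’t line up perfectly. This short sketch reuses the cities from above with a deliberately shorter teams array (short_teams is a name made up for this example) and also shows turning the pairs into a Hash:

```ruby
cities = %w[Baltimore Washington Pittsburgh]
short_teams = %w[Ravens Redskins]

# zip pads with nil when the argument is shorter than the receiver.
pairs = cities.zip(short_teams)
p pairs
# => [["Baltimore", "Ravens"], ["Washington", "Redskins"], ["Pittsburgh", nil]]

# When the receiver is shorter, the extra values in the argument are dropped.
p short_teams.zip(cities)
# => [["Ravens", "Baltimore"], ["Redskins", "Washington"]]

# An array of pairs converts directly to a Hash for lookup by city.
lookup = pairs.to_h
p lookup["Baltimore"]
# => "Ravens"
```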

Posted in Technology | Topics Ruby | 6 Comments

Fibers in Ruby 1.9

April 1, 2010

One of the new features in Ruby 1.9 is Fibers. In order to understand how Fibers work, we need to first understand how threads work.

A thread is an execution context. When a Ruby program starts, there is a main thread, which you can access by calling Thread.current. You can create new threads within your program. Here’s an example of creating a thread:

Thread.new do
  puts "start"
  sleep 5
  puts "finish"
end
Thread.list.each do |t|
  p t
end

The first thing this program does is create a new thread. What the thread should run is passed in via a block. Remember that in Ruby, the block is a Proc which does not execute when it is created, only when it is called. Next we call Thread.list to iterate over each of the threads that exist in our program. Unlike Thread.new, the each method does call its block immediately, so we see each thread printed out. What you actually see when you run this program is hard to say. When running in 1.9, you might see this:

#<Thread:0x000001008648e0 run>
#<Thread:0x00000101031010 run>

What we can see is that we have a couple of threads and they are both ready to be run. We don’t see the puts "start" from the thread we created because in this case, the program exited before our thread got a chance to run. You might see this:

#<Thread:0x000001008648e0 run>
start#<Thread:0x00000101003a08 sleep>

In this case, we can see that while we were iterating over the thread list, after we printed the main thread and before we printed the second thread, the second thread started executing. Now also notice that the status of the second thread is sleep. This doesn’t only mean the thread called sleep; a thread can also have a status of sleep if it’s waiting on IO.
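In both runs above, the program ends before the thread prints finish. If you want the main thread to wait for another thread, you can call join on it. This is a small sketch of that, with a shorter sleep than the example above and a variable name (worker) chosen just for illustration:

```ruby
worker = Thread.new do
  sleep 0.1
  :finished
end

# join blocks the calling (main) thread until the worker has run to completion.
worker.join

p worker.alive?  # false, the thread is done
p worker.status  # false, which means it terminated normally
p worker.value   # :finished, the return value of the block
```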

What is happening in this code is that we have multiple threads and a thread scheduler is deciding when each thread should execute. In Ruby 1.8 MRI the thread scheduler is part of the ruby interpreter process and in Ruby 1.9 YARV, the thread scheduling is being handled by the operating system, but in either case what is happening is conceptually similar.

The thread scheduler allows each thread to run for a short period of time, like 10ms. Once that time runs out or when the thread’s status changes to sleep, the thread scheduler finds the next thread that isn’t sleeping and lets that thread run for 10ms. This continues throughout the life of the program. It’s actually more complicated than this, but at the heart of it, this is what happens.

Every time the thread scheduler switches from one thread to the next, it has to switch the execution context to allow the next thread to run. There is some overhead with this that can add up if your program has to do a lot of context switching. More importantly, there is no way of knowing ahead of time when the context switching will occur. So not only is this inefficient, it’s also dangerous, because the outcome of your program can change based on circumstances out of your control. In order to achieve concurrency in your program though, Ruby has to switch from one thread to the next at some point, so the thread scheduler has to just guess. But what if you could indicate in your code exactly when you want a context switch to occur?

Enter fibers in Ruby 1.9. The easiest way to understand fibers is to think about them as being very similar to threads. When your program starts, there is a current fiber. You can create more fibers as your program runs. Each fiber defines some code to run. Here’s our example from above:

require 'fiber'
fibers = [Fiber.current]
fibers << Fiber.new do
  puts "start"
  sleep 3
  puts "finish"
end
fibers.each do |f|
  p f
end

In this case, the output of our program is more deterministic. It will be something like this (the only thing that varies is the ids of the objects):

#<Fiber:0x0000010109cdc8>
#<Fiber:0x0000010109cd58>

Unlike threads, fibers don’t have a state that can be runnable or sleeping. This is because with fibers, only one fiber in the process can be running at once. This is true of threads as well, there can only be one thread running at once within one Ruby process. The difference is that a fiber gets to decide how long it wants to run for, unlike threads, which get preempted by the thread scheduler.

In our example above, our second fiber never executed because the main fiber never started it. In this case, the main fiber ran until the end of the program. If we want to run the fiber, we have to call resume on it:

require 'fiber'
fibers = [Fiber.current]
fibers << Fiber.new do
  puts "start"
  sleep 3
  puts "finish"
end
fibers.each do |f|
  p f
end
fibers.last.resume

Now we will see the fibers printed out as before, but then since we call resume on our second fiber, then it will execute, print start, then after 3 seconds, print finish:

#<Fiber:0x0000010101f030>
#<Fiber:0x0000010101efc0>
start
finish

Where things actually get interesting with fibers is that once a fiber is started, it can then yield back to the fiber that started it. Then, you can call resume on the fiber and it will pick up executing where it left off. Take a look at this example:

require 'fiber'
you = Fiber.new do
  Fiber.yield "potato"
  Fiber.yield "tomato"
end
puts "I say potato"
puts "You say #{you.resume}"
puts "I say tomato"
puts "You say #{you.resume}"

The output of this will be:

I say potato
You say potato
I say tomato
You say tomato

What happens here is when the second puts is called, it calls you.resume. This means start executing you, which is a fiber. The return value of the call to resume will be the argument to Fiber.yield. A good mental model for thinking about fibers is a stack. When you call resume on a fiber, that fiber gets pushed on to the stack and starts executing. It executes until it’s finished or until it calls Fiber.yield. Fiber.yield means pop the current fiber off the stack, keep track of where that fiber was, and resume executing the fiber that’s at the top of the stack now. This is why in our example above, when we call resume on you the second time, Fiber.yield "potato" doesn’t happen because the fiber is already past that point, so Fiber.yield "tomato" is executed.
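The same resume and yield mechanics make it easy to write a generator, where the fiber keeps its local state between calls. This is a small sketch of my own (the names fib and first_ten are just for this example) that produces Fibonacci numbers one resume at a time:

```ruby
fib = Fiber.new do
  a, b = 0, 1
  loop do
    # Hand the current value back to whoever called resume, then pause here.
    Fiber.yield a
    # Execution continues from this line on the next resume.
    a, b = b, a + b
  end
end

# Each resume runs the fiber until its next Fiber.yield.
first_ten = Array.new(10) { fib.resume }
p first_ten
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The loop inside the fiber never ends, but that’s fine, because the fiber only runs when the main fiber asks for the next value.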

Fibers have some powerful uses in the context of code that does asynchronous IO. Mike Perham gave a talk at Austin on Rails which covers using Fibers with EventMachine, which I highly recommend. For more detail on threads and thread scheduling, I recommend the “Scaling Ruby” envycast, which is available at PeepCode. Also check out this post on Ruby Inside, which has a list of 8 other articles on Fibers.

Posted in Technology | Topics Ruby | 6 Comments