Why mod_rails is great for light-duty Rails apps

The Ruby on Rails story is usually presented to the new developer as a wonderful break from tradition that makes a developer’s life so much better than the frameworks of the past. The clattering of skeletons in the closet you’re hearing? Well, that’s because Rails makes the sysadmin’s life much worse than PHP or Java does. That just improved on Friday, with the release of mod_rails. If you’re looking for a way to do shared (or low traffic) hosting of Rails applications, this is for you.

With Java there’s this alien environment of CLASSPATHs and WARs and JARs and heap size limits, but once you get it up and running, developers can include libraries in with their application or the lib/ directory of the J2EE server, and the sysadmin doesn’t have to care. A Java developer is unlikely to ask you to build and install a pile of custom libraries.

With PHP it’s just another Apache module, but you might need to build a few extra libraries and maybe custom-compile Apache. Once you get it up and running, though, you don’t even need to restart the server when you deploy new code. It’s automatically updated.

With Ruby on Rails, it has been far uglier, especially as you go further back. The standard “Matz’s Ruby Interpreter” (MRI) doesn’t thread well and is quite remarkably slow, and Ruby + Rails in an MRI process use a lot lot lot of memory. So you don’t really want RoR running inside each Apache process. Folks used to use FastCGI (which should have died over a decade ago, but lingers on like a bad cold) but now use Mongrel, which is conceptually kind of like FastCGI, except that it actually works. Mongrel presents the application via HTTP, which is much easier to understand and integrate with other parts of your architecture (such as a load balancer) than FastCGI.

In J2EE you’d run one big honkin’ JVM that used lots of memory to load up your code and data structures, but then ran many threads inside that one process. With the limitations of the MRI (green threads, plus many, many trips into non-thread-safe C code that require a “giant lock” that essentially makes it single-threaded), you instead run one process per thread. That’s like Apache+PHP or OpenSSH or many other Unix programs that fork, right? Well, sort of. The issue is that the kernel does not see your Ruby code as something that all those forked processes can share; it sees the parsed Ruby code as data, and when the MRI’s garbage collector marks all those objects during garbage collection, the kernel sees this data as recently changed, differently for each forked process. So not only do you need 30-70MB or more per process, but very little of that is shared between processes. Ouch!

A second problem is that these processes take a while to start up and load the code, so it’s not reasonable to embed the Ruby interpreter in Apache when using Rails; the overhead is just too high. So the Mongrel solution is to pre-launch a bunch of interpreters, and have them just sit there until requests arrive. That’s pretty inefficient from a memory standpoint, but the latency when a request comes in is quite low since there is no initialization needed.

There have been a few interesting alternatives under development: JRuby is very promising, because it reuses all of the investment in VM development that Sun made over the last 10+ years for Java. At this point the JVM is pretty darn good at running many threads across multiple CPU cores, and at garbage collecting efficiently, among other things. These are key weaknesses of MRI, so running Rails on JRuby seems like a huge benefit. I haven’t tried it yet but I suspect that this will become one of the 2 or 3 most common ways to run Rails applications in the near future.

Another interesting alternative was some experimental hacking to MRI’s garbage collector by Hongli Lai, to store its working data separately from the objects being examined, so that preloaded Ruby code would remain shared by many forked interpreter processes over long periods of time. In other words, this is a potentially major memory use savings for Mongrel cluster users, which would in turn allow the sysadmin to run more Mongrels to service more simultaneous requests, or to bump up the database cache, or to increase the size of the running memcached instance. So, this would indirectly be a performance booster, and Ruby could really use that.

This experimentation apparently became Ruby Enterprise Edition, which as of this writing is not available yet. But the other development coming from Hongli Lai’s new company, Phusion, is Passenger, a.k.a. mod_rails.

What’s interesting about mod_rails for the beginning Rails developer is that it is intended to make Rails hosting easier, particularly for shared hosting environments, which have been struggling to offer Rails hosting in a uniform and cost-effective fashion. That means that in a short while (weeks?), shared hosting plans for fiddling around with Rails will become much cheaper and more widely available than they are now.

What’s interesting about mod_rails for the experienced sysadmin is that it mimics the min/max process pooling behavior of Apache, and addresses startup overhead in a clever way. It also serves static images via Apache automatically, eliminating the need for a separate block of mod_rewrite rules that must be crafted carefully so as to avoid conflicts with mod_proxy.

The architectural overview is comprehensive and well written, but here’s a summary: The Spawn Server makes a tree of child processes that preload Ruby, Rails, and your application code for you, and the preloaded process is then fork()ed to satisfy incoming requests. So the first request after startup incurs startup overhead (in my case, 5 seconds to load the Redmine login page) but subsequent requests get much better response time (0.6s to reload that login page).
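To make the preload-then-fork idea concrete, here is a minimal Ruby sketch. It is not Passenger’s code: preload_app and spawn_worker are hypothetical names, the “application” is just a small hash standing in for the expensive-to-load Ruby, Rails, and application code, and it assumes a Unix system where fork is available.

```ruby
# Sketch of "preload once, fork per worker". Unix only (uses fork).
# Hypothetical names throughout; this is not how Passenger is implemented.

# Stand-in for the slow step: loading Ruby libraries, Rails, and the app.
def preload_app
  { loaded_by: Process.pid, routes: %w[/login /issues] }
end

# Fork a worker that inherits the already-loaded app from its parent,
# and report back over a pipe so the parent can see what happened.
def spawn_worker(app)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # No reload here: `app` was copied into the child by fork().
    writer.puts "worker #{Process.pid} reused app loaded by pid #{app[:loaded_by]}"
    writer.close
    exit!(0) # leave immediately, skipping the parent's at_exit handlers
  end
  writer.close
  message = reader.read.chomp
  reader.close
  Process.wait(pid)
  message
end

if __FILE__ == $PROGRAM_NAME
  app = preload_app              # paid once, like the first slow request
  2.times { puts spawn_worker(app) } # each worker starts from the loaded state
end
```

The point of the sketch is only the ordering: the slow load happens once in the parent, and each fork starts from that loaded state instead of repeating it.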

That seems like a lot of overhead in terms of big Ruby processes. Here’s what I measured just now: 97MB free with just Apache running (no spawn server yet). After the first page view, there was 36MB free, and four new processes: the Spawn Server taking a little over 6MB (rsize), the FrameworkSpawner taking 20MB (rsize), the ApplicationSpawner taking 34MB (rsize), and one Rails process taking 34MB (rsize).

The new “free” value is 36MB. The Buffers and used Swap values remained constant, with only 48KB of swap used. So that means that all four processes, which would seem to need 94MB to run (34+34+20+6), are actually overlapping enough that they are using only 61MB (97-36). And the ApplicationSpawner eventually terminates, leaving 36MB still free, which makes sense – it’s the process that fork()ed the Rails process, so they should ideally be overlapping nearly 100%. I’m surprised that this is so high; based on the GC experimentation that Hongli Lai did, I would have expected them not to overlap as much.
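The arithmetic above can be written out as a small Ruby sketch. The figures are the rounded measurements from this post; memory_overlap is a hypothetical helper name, and the “shared” number is only an estimate derived from those rounded values.

```ruby
# Rounded rsize figures from the measurements above, in MB.
RSIZE_MB = {
  spawn_server: 6,
  framework_spawner: 20,
  application_spawner: 34,
  rails_process: 34
}.freeze

# Compare what the processes appear to need (sum of rsize) with what
# actually disappeared from "free" memory. The difference is an estimate
# of how much memory the forked processes are sharing.
def memory_overlap(free_before_mb, free_after_mb, rsizes)
  apparent = rsizes.values.sum
  actual = free_before_mb - free_after_mb
  { apparent: apparent, actual: actual, shared: apparent - actual }
end

if __FILE__ == $PROGRAM_NAME
  p memory_overlap(97, 36, RSIZE_MB)
end
```

With 97MB free before and 36MB free after, this gives 94MB apparent, 61MB actual, and roughly 33MB shared, which is about the size of one of the two 34MB processes.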

The idle Rails process also exits eventually, controlled by the RailsPoolIdleTime setting. That saves memory but re-introduces the startup overhead. That leaves the FrameworkSpawner and the Spawn Server running, taking about 25MB of memory (quite close to the 20+6 shown by their rsize values).
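As a rough model of that idle-timeout behavior, here is a toy Ruby sketch. IdlePool is a hypothetical class, not anything from Passenger, and time is passed in explicitly as plain numbers (seconds) so the behavior is easy to follow; the 300-second timeout is an arbitrary illustrative value, not a claim about the RailsPoolIdleTime default.

```ruby
# Toy model of demand-based spawning with an idle timeout.
# Hypothetical class; it tracks only "when was each app last used".
class IdlePool
  def initialize(idle_time)
    @idle_time = idle_time
    @last_used = {} # app name => time of last request
  end

  # Handle a request at time `now`. Returns :spawned if the app had no
  # running process (startup overhead is paid), :reused otherwise.
  def handle(app, now)
    result = @last_used.key?(app) ? :reused : :spawned
    @last_used[app] = now
    result
  end

  # Shut down every app that has been idle longer than the timeout,
  # and return the names of the apps still running.
  def reap(now)
    @last_used.delete_if { |_app, last| now - last > @idle_time }
    @last_used.keys
  end
end

if __FILE__ == $PROGRAM_NAME
  pool = IdlePool.new(300)
  p pool.handle("redmine", 0)   # first visitor pays the startup cost
  p pool.handle("redmine", 100) # later visitors reuse the process
  p pool.reap(401)              # idle for 301s, so the process is gone
  p pool.handle("redmine", 500) # startup cost is paid again
end
```

The trade-off is visible in the last call: a short timeout frees memory sooner, but the next visitor after a quiet period pays the startup cost again.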

Let’s compare this memory footprint to a Mongrel cluster. In a Mongrel cluster the processes start up and stay running forever, so the users are unlikely to incur much startup overhead at all, since it’s done long before they visit the application. Some amount of application-specific internal overhead is still an issue, though; that might include gradually filling an initially empty memcached, template compilation and/or caching, etc. As for memory, each Mongrel would need the same 34MB of memory, but there’s no Spawn Server, FrameworkSpawner, or ApplicationSpawner, so the extra 25MB of overhead would not be present with a Mongrel cluster.

That means that for a shared hosting setup where many low-traffic Rails sites may be hosted, or a multifunction server where serving one or more low-traffic Rails applications is just part of the job, mod_rails is a benefit. When the Rails app isn’t being used, it will exit and free up that memory for other processes. The starting and stopping of Rails with mod_rails is automatic and demand-based, so the sysadmin can tune it and forget about it.

On the other hand, a single dedicated server or VPS with a fixed amount of memory serving a single application would be better off with Mongrel, because of the lower memory overhead (25MB less), and the fact that the Mongrel processes start up before users need them and stay running indefinitely. Mongrel clusters could still potentially benefit from the Ruby Enterprise Edition’s garbage collector tweak if forking were used after preloading all of the code.
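The trade-off between the two setups can be sketched as back-of-the-envelope Ruby. The 34MB and 25MB constants are the measurements from this post; the method names are hypothetical, and the model deliberately ignores any memory shared between forked processes, so treat the totals as upper bounds.

```ruby
# Measured figures from this post, in MB.
RAILS_PROCESS_MB = 34     # one Rails process (or one Mongrel)
SPAWNER_OVERHEAD_MB = 25  # Spawn Server + FrameworkSpawner left running

# A Mongrel cluster keeps all of its processes running all the time.
def mongrel_footprint_mb(processes)
  processes * RAILS_PROCESS_MB
end

# mod_rails pays the spawner overhead, plus only the Rails processes
# that are currently alive (zero when the app has been idle long enough).
def mod_rails_footprint_mb(active_processes)
  SPAWNER_OVERHEAD_MB + active_processes * RAILS_PROCESS_MB
end

if __FILE__ == $PROGRAM_NAME
  puts "Mongrel, 3 processes:  #{mongrel_footprint_mb(3)}MB"
  puts "mod_rails, 3 active:   #{mod_rails_footprint_mb(3)}MB"
  puts "mod_rails, app idle:   #{mod_rails_footprint_mb(0)}MB"
end
```

For a busy single application the Mongrel numbers are lower by the 25MB overhead; for an application that sits idle most of the day, the mod_rails idle figure is what matters.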

A single-purpose dedicated server running mod_rails could attain similar performance to a Mongrel cluster by simply setting the RailsPoolIdleTime value to a very high number. Then the Rails processes would hang around, and although you’d pay the price of a 25MB memory overhead, the startup overhead would only be paid by the very first visitor. However, you’d lose the main benefit of mod_rails, which is demand-based pool resizing, particularly if you’re running more than one application, Rails version, or Ruby interpreter version.

In short, I think mod_rails is very nice, and having actually used it I’m impressed with how polished it is for a 1.0 product. But if you’re already running a single application as a Mongrel cluster on a dedicated server, there’s no point in switching.
