architecture – Pervasive Code

Why mod_rails is great for light-duty Rails apps

Jamie Flournoy — Mon, 14 Apr 2008 20:30:25 +0000

The Ruby on Rails story is usually presented to the new developer as a wonderful break from tradition that makes a developer’s life so much better than the frameworks of the past. The clattering of skeletons in the closet you’re hearing? Well, that’s because it makes the sysadmin’s life much worse than PHP or Java. That just improved on Friday, with the release of mod_rails. If you’re looking for a way to do shared (or low traffic) hosting of Rails applications, this is for you.

With Java there’s this alien environment of CLASSPATHs and WARs and JARs and heap size limits, but once you get it up and running, developers can include libraries in with their application or the lib/ directory of the J2EE server, and the sysadmin doesn’t have to care. A Java developer is unlikely to ask you to build and install a pile of custom libraries.

With PHP it’s just another Apache module, but you might need to build a few extra libraries and maybe custom-compile Apache. Once you get it up and running, though, you don’t even need to restart the server when you deploy new code. It’s automatically updated.

With Ruby on Rails, it has been far uglier, especially as you go further back. The standard “Matz Ruby Interpreter” (MRI) doesn’t thread well and is quite remarkably slow, and Ruby + Rails in an MRI process use a lot lot lot of memory. So you don’t really want RoR running inside each Apache process. Folks used to use FastCGI (which should have died over a decade ago, but lingers on like a bad cold) but now use Mongrel, which is conceptually kind of like FastCGI, except that it actually works. Mongrel presents the application via HTTP, which is much easier to understand and integrate with other parts of your architecture (such as a load balancer) than FastCGI.

Whereas in J2EE you’d run one big honkin’ JVM that used lots of memory to load up your code and data structures, but then ran many threads inside that one process, with the limitations of the MRI (green threads and many, many trips into non thread safe C code that requires the use of a “giant lock” that essentially makes it single-threaded), you run one process per thread. That’s like Apache+PHP or OpenSSH or many other unix programs that fork, right? Well, sort of. The issue is that your Ruby code is not seen by the kernel as something that all those forked processes can share; it sees the parsed Ruby code as data, and when the MRI’s garbage collector marks all those objects during garbage collection, it seems this data as being recently changed, differently for each forked process. So not only do you need 30-70MB or more per process, but very little of that is shared between processes. Ouch!

A second problem is that these processes take a while to start up and load the code, so it’s not reasonable to embed the Ruby interpreter in Apache when using Rails; the overhead is just too high. So the Mongrel solution is to pre-launch a bunch of interpreters, and have them just sit there until requests arrive. That’s pretty inefficient from a memory standpoint, but the latency when a request comes in is quite low since there is no initialization needed.

There have been a few interesting alternatives under development: JRuby is very promising, because it reuses all of the investment in VM development that Sun made over the last 10+ years for Java. At this point the JVM is pretty darn good at running many threads across multiple CPU cores, and at garbage collecting efficiently, among other things. These are key weaknesses of MRI, so running Rails on JRuby seems like a huge benefit. I haven’t tried it yet but I suspect that this will become one of the 2 or 3 most common ways to run Rails applications in the near future.

Another interesting alternative was some experimental hacking to MRI’s garbage collector by Hongli Lai, to store its working data separately from the objects being examined, so that preloaded Ruby code would remain shared by many forked interpreter processes over long periods of time. In other words, this is a potentially major memory use savings for Mongrel cluster users, which would in turn allow the sysadmin to run more Mongrels to service more simultaneous requests, or to bump up the database cache, or to increase the size of the running memcached instance. So, this would indirectly be a performance booster, and Ruby could really use that.

This experimentation apparently became Ruby Enterprise Edition, which as of this writing is not available yet. But the other development coming from Hongli Lai’s new company, Phusion, is Passenger, a.k.a. mod_rails.

What’s interesting about mod_rails for the beginning Rails developer is that it is intended to make Rails hosting easier, particularly for shared hosting enviroments, which have been struggling to offer Rails hosting in a uniform and cost-effective fashion. That means that in a short while (weeks?), shared hosting plans for fiddling around with Rails will become much cheaper and more widely available than they are now.

What’s interesting about mod_rails for the experienced sysadmin is that it mimics the min/max process pooling behavior of Apache, and addresses startup overhead in a clever way. It also serves static images via Apache automatically, eliminating the need for a separate block of mod_rewrite rules that must be crafted carefully so as to avoid conflicts with mod_proxy.

The architectural overview is comprehensive and well written, but here’s a summary: The Spawn Server makes a tree of child processes that preloads Ruby, Rails, and your application code for you, and then that is fork()ed to satisfy incoming requests. So the first request after startup incurs startup overhead (in my case, 5 seconds to load the Redmine login page) but subsequent requests get much better response time (.6s to reload that login page).

That seems like a lot of overhead in terms of big Ruby processes. Here’s what I measured just now: 97MB free with just Apache running (no spawn server yet). After the first page view, there was 36MB free, and four new processes: the Spawn Server taking a little over 6MB (rsize), the FrameworkSpawner taking 20MB (rsize), the ApplicationSpawner taking 34MB (rsize), and one Rails process taking 34MB (rsize).

The new “free” value is 36MB. The Buffers and used Swap values remained constant, with only 48KB of swap used. So that means that all four processes, which would seem to need 94MB to run (34+34+20+6), are actually overlapping enough that they are using only 61MB (97-36). And the ApplicationSpawner eventually terminates, leaving 36MB still free, which makes sense – it’s the process that fork()ed the Rails process, so they should ideally be overlapping nearly 100%. I’m surprised that this is so high; based on the GC experimentation that Hongli Lai did, I would have expected them not to overlap as much.

The idle Rails process exits eventually also, controlled by the RailsPoolIdleTime setting. That saves memory but re-introduces the startup overhead. That leaves the FrameworkSpawner and the SpawnServer running, taking about 25MB of memory (quite close to the 20+6 shown by their rsize values).

Let’s compare this memory footprint to a Mongrel cluster. In a Mongrel cluster the processes start up and stay running forever, so the users are unlikely to incur much startup overhead at all, since it’s done long before they visit the application. Some amount of application-specific internal overhead is still an issue, though; that might include gradually filling an initially empty memcached, template compilation and/or caching, etc. As for memory, each Mongrel would need the same 34MB of memory, but there’s no SpawnServer, FrameworkServer, or ApplicationServer, so the extra 25MB of overhead would not be present with a Mongrel cluster.

That means that for a shared hosting setup where many low-traffic Rails sites may be used, or a multifunction server where serving one or more low-traffic Rails applications is just part of the job, mod_rails is a benefit. When the Rails app isn’t being used, it will exit and free up that memory for other processes. The starting and stopping of Rails with mod_rails is automatic and demand-based, so the sysadmin can tune it and forget about it.

On the other hand, a single dedicated server or VPS with a fixed amount of memory serving a single application would be better off with Mongrel, because of the lower memory overhead (25MB less), and the fact that the Mongrel processes start up before users need them and stay running indefinitely. Mongrel clusters could still potentially benefit from the Ruby Enterprise Edition’s garbage collector tweak if forking were used after preloading all of the code.

A single-purpose dedicated server running mod_rails could attain similar performance to a Mongrel cluster by simply setting the RailsPoolIdleTime value to a very high number. Then the Rails processes would hang around, and although you’d pay the price of a 25MB memory overhead, the startup overhead would only be paid by the very first visitor. However, you’d lose the main benefit of mod_rails, which is demand-based pool resizing, particularly if you’re running more than one application, Rails version, or Ruby interpreter version.

In short, I think mod_rails is very nice, and having actually used it I’m impressed with how polished it is for a 1.0 product. But if you’re already running a single application as a Mongrel cluster on a dedicated server, there’s no point in switching.

Document Databases – New Kids on an Old Block

Jamie Flournoy — Sat, 16 Feb 2008 06:16:16 +0000

There’s a new crop of databases that has appeared lately, under the rubric of “document databases”, and there’s quite a lot of enthusiasm for them given that they tend to be slow and very feature-poor compared to the SQL RDBMSs that are the typical persistence mechanism for web applications. What’s mainly appealing about them is that they are easy to use, and theoretically quite scalable, compared to the traditional “one big SQL database server” approach.

But the simplicity of these new document databases is tied to some significant trade-offs in the current implementations. And so I’m going to try and put them into context with some of the other data persistence options that have been around for a while, but which aren’t currently getting as much hype as document databases. Hopefully that will help all of us to understand how these new and evolving document databases can be useful to us, and what the alternatives are in areas where they may not fit well.

I’d first like to try and deconstruct a false dichotomy that I’ve noticed being used in arguments in favor of some of these new databases. That dichotomy casts SQL RDBMSs (such as MySQL, Oracle, PostgreSQL, MS SQL Server, etc.) as big, complicated, and hard to scale, compared to document databases which are small, simple, and easy to scale. The main problem with this dichotomy is that there are far more choices than just two. Each database product embodies a set of design choices, and although there is some clustering of decisions into general types of product, the boundaries are a lot fuzzier than a product evangelist might have you believe.

Furthermore, trade-offs made early in a product’s lifetime may have been altered over time. A good example is the no-longer-true characterization of MySQL being fast but not reliable, vs. PostgreSQL being reliable but not fast. In reality, recent releases of both products are moving toward being very fast and very reliable.

Because there are so many database products out there, I’m going to have to fall back on a small subset of example products, as illustrations of issues that may or may not exist in a particular product you’re looking at. The key for an application architect evaluating a persistence mechanism is to understand the abstract concepts, and to figure out which ones matter to your current application. That will let you select a product (or a combination of products, including some custom code perhaps) that suits you. As with all aspects of architecture, there is no cookbook you can use, and in six months all the options will change. You have to analyze your needs first, and then get your hands dirty with evaluation second.

So, let’s start deconstructing some of the examples into design decisions.

SQL RDMBS

This is the traditional choice for web applications. You get a remote query language, so you can specify in great detail what you want to retrieve. You get very precise control over data representation, including some that may be burdensome if you’re not concerned with internationalization, multiple currencies, and time zones. You get a lot of control over performance and a lot of information about how things work at a low level inside the database, from indexes to data page size to transaction logs and checkpoint frequency. Most of them include ACID transaction support, which is nice for reliability, but which usually obligates you to implement a backup scheme or else they will eventually stop accepting new transactions and/or run out of disk space.

Some of the design drawbacks include:
– the use of local storage, for performance and for a guarantee that data has been committed to disk
– the use of a high level query language (SQL) and a query optimizer, so the specific process the database uses to satisfy your query is not in your face (and thus may be surprisingly inefficient, if you aren’t familiar with how it works)
– the use of a proprietary network protocol, which means that you need a special client library for just that one product, which may or may not implement all of the features that the server offers (such as encryption)

However, there are some variations that make the edges of this category fuzzier. There are ACID-compliant SQL RDBMSs that have no network layer, and are very lightweight; in some cases they may not even support concurrent access. Examples include HSQLDB and SQLite.

Networked Filesystem

This is typically used for accessing shared file servers, or allowing “thin client” behavior so that users can get to their own environment and data from any given endpoint. Examples include NFS, SMB, AFP, GFS, and quite a few others. The main advantage of these systems is that the remote filesystem is represented as being directly connected to the local system, while also being available to other users or other client systems who are connected to the same remote system.

Trade-offs of this design include:
– performance on a LAN may be good, but over a slow, high latency link may be very poor
– there is usually no ACID transaction support, just file locking
– file ownership and permissions can be very hard to manage
– if the file server goes offline, the entire local system may hang or crash

In particular, content indexing, complex querying, and data integrity features are generally not offered. You can layer that on top, though, but that layer will not necessarily work if it was originally designed to work on local filesystems. In particular I’m thinking about DBM files; they’re fast and easy to use but not all of them will work properly with files located on a network filesystem.

Also, directory scanning performance can be very poor if thousands of files are located in a single directory; listing all files starting with the letter T may actually require the entire directory to be retrieved and filtered on the client side.

Variations include FTP and WebDAV, which are not intended to simulate a local filesystem, but instead have filesystem-like semantics. Some operating systems will mount them as remote filesystems anyway, for ease of use for viewing and copying files, but it’s not possible to lock a remote file, so safe multiuser access is not possible.

Object Database

Object databases offer a direct representation of an application’s data in almost exactly the same form that exists in memory. Whereas a relational database stores data in tabular form regardless of the particulars of a client application, an object database stores data in the same form that the application uses. The exception to this is in the representation of references to other objects; at some level these encapsulate pointers to the memory address of the data in the application’s address space, and this must be substituted with a pointer to the location in the database’s storage system before storing it.

An object database will not handle time zones or internationalization, but nor will it complicate those matters if the application handles those already. The data is simply stored as-is. Also, object databases typically do offer ACID transaction support.

Aside from the conceptual simplicity of the similar data model, one major bonus of an object database is that the use of pointers makes data retrieval extremely fast; rather than parsing a query and searching indexes for the on-disk location of a desired object, the application can simply ask the database for it by its reference.

The big trade-offs here are twofold:

First, the lack of indirection through a query language and an indexing system mean that the application developer must anticipate all of the queries that will be needed, and incorporate collections into the the object graph that will be used to get to the stored objects. Otherwise the application’s object model will need to be updated frequently to include these later.

Second, altering the application’s object model and then retrieving data stored using older code can be very complex. Because the stored objects and application’s code are out of sync in this situation, additional application code must be written to convert existing stored objects into the new representation and persist them back to the database.

Combine those two trade-offs, and it’s clear that the performance benefit comes with the price of considerable additional application development effort.

Also, an object database is by nature bound to a single application, rather than being a point of integration between multiple applications. Any attempt to create a shared code library that manages access to the object database introduces potential “impedance mismatches” between each application and the shared object model, which reduces the simplicity that an object database offers in comparison to a relational database.

Document Database

Arguably a rejection of relational technology, document databases offer several advantages compared to the three classes of database previously mentioned. Documents need not be internally represented as a flat set of key-value pairs as seen in a SQL RDBMS; for example, the document may be an XML document. Queries are possible and may even use a standardized query language to express the conditions for matching desirable documents. Documents may have internal structure that is understood by the database server (so that it can query against the document’s contents), as in the case of an XML database, or they may have an external metadata structure consisting of key-value pairs, or both.

The drawbacks of this type of system derive from the fact that it is similar in many ways to each of the other database types.

Querying ability means that the server must incorporate some kind of indexing system for performance reasons, which means that the document must either internally or externally conform to some sort of standard data model. Some document database systems simply omit querying except by the document’s main ID (similar to a SQL primary key, a network filesystem’s filename, or an object database’s storage reference). This has the same drawback as with an object database: the application must take on the responsibility of managing querying, searching, and sorting itself, across the network.

The similarity to a networked filesystem may also have scalability benefits; if referential data integrity is not provided, then documents can be located on any remote system, and partitioning is simple. However, distributed queries will still need to be managed at the application level, and potentially any transaction becomes a distributed transaction, since a changed document on one server may be referenced from any number of documents in any number of other servers. (This is also true of the other systems, though.)

Still, because a document database is closest to a networked filesystem, it may be suitable for simple requirements where a relational database or object database seems to complex and slow, but where the bare-bones functionality of a networked filesystem is too simple. The compromise of a binary file with simple additional metadata or properties attached “out of band” with the data itself, or of a structured document format that is flexible but not ideal, may be acceptable if sophisticated querying is possible as a result.

It’s hard to provide good examples of document databases, because the category is very broad, and includes a lot of simple projects that provide just a little functionality above and beyond what WebDAV already provides. But a few that I’ve heard of recently include CouchDB, SimpleDB, and RDDB.

CouchDB imposes a simple key-value data structure on document content, but no internal document schema or grouping of documents by type. It does offer indexing and querying. Notably, it also offers transparent replication, at a field level (changes to two different fields in two copies of the same document are synchronized to both copies).

Amazon SimpleDB similarly imposes a key-value structure, though one key can have multiple values. It too offers a query language, and indexing. Because it’s built on Amazon’s S3 service, transparent replication is also included.

Future Prospects

The main complaints that I’ve seen directed at relational databases involve two things: one, the difficulty of scaling them up, and two, the restrictive data model. Sharding (a.k.a. data partitioning) is the usual remedy for scaling problems, but that requires the elimination of referential integrity in the SQL RDBMSs I’m aware of, and requires distributed transactions in order to preserve ACID transaction properties across denormalized copies of the modified data.

Interestingly, these issues are the same across the board, regardless of database type. Either you abandon transactions, or you move them up to a level that’s aware of the data partitioning. I see no reason why these and other high-end RDBMS features couldn’t be offered in a proxy layer that possibly even contains the query processing as well.

One way to approach this is to build a closed system with a given set of features and a limited API that permits a single query language. This seems to be the way that CouchDB and SimpleDB are approaching the problem.

Another way to approach this problem is to simply say that the storage back-ends of relational databases could be enhanced to incorporate built-in transparent partitioning. I don’t think that SQL RDBMSs will abandon the concept of a table schema any time soon, but there’s no reason why products that already include XML query and indexing capabilities and free-form natural language indexing (a.k.a. Full Text Search) couldn’t also include indexing capabilities for simple key-value structured data inside a single column of semi-structured data, giving most of the same functionality as a document database.

Given that, the remaining limitation of a SQL RDBMSs is the requirement that the back-end storage system be located on a disk drive physically connected to the same server, and that the storage be touched only by processes running on the same server together so that they can coordinate access to the data.

For now, though, document databases look like they can be very useful for certain types of persistence requirements; I don’t see them as a viable substitute for everything that a SQL RDBMS does, but that perception is limited mainly by the choices at hand. CouchDB looks like the most generally useful option so far, though I’d like to see the addition of optional schemas (opt-in on a per-object level, as seen in LDAP), and/or a pluggable language option. (It seems that everyone using a document database is also enamored of their application language and dislikes the idea of putting logic in the data tier unless it’s written in the same language.)

I welcome your comments – this is mostly a brain dump of things I’ve seen before to help myself and others contextualize the new document databases, and document databases are evolving too rapidly for me to keep up with all of them on my own.

Capacity vs. Scalability

Jamie Flournoy — Wed, 14 Nov 2007 00:23:58 +0000

In I still donâ€t get the fascination with Ruby on Rails, Andy Davidson writes:
Scaling does not mean â€œAllows you to throw money at the problemâ€, it means â€œCan deal with workloadâ€. He goes on to recommend mod_perl instead of Rails.

I’m not interested whether he likes Rails or not. Lots of people hate Rails, and I don’t care. I’m not going to make a big deal about the fact that he’s comparing a runtime architecture (Apache + mod_perl) with a framework (Ruby on Rails).

Those are insignificant compared to his claim that scalability means “Can deal with workload”. Actually, that’s a description of capacity.

Scalability is a very distinct concept from capacity. Scalability is not a true/false property of a system; there are degrees of scalability, which can be represented in a 2D graph of # of simultaneous requests that you can service with an acceptable response time (X axis), plotted against the resources required to service those requests (on the Y axis). The function f in the y=f(x) equation that is behind that graph is how scalable your application is.

If it’s a straight line, that’s quite good: “linear scalability”. More requests cost the same amount per request as the ones you’re getting now. Double your customers, double your net profits.

If it curves down away from a straight line, that’s even better than linear scalability: you’ve attained an economy of scale, so twice as many requests costs less than twice as much as the amount you’re paying now.

If it curves up away from a straight line, that’s bad, because more load means a greater cost per request. Each new customer makes you less money than the last one. Eventually you will grow to a point where you lose money and your business fails. This is what people are referring to when they say something won’t scale. Linear (or better) scalability curves are what people mean when they say something will scale.

In the worst case, the upward curve is asymptotic to a vertical line. In other words, at some number N of simultaneous requests coming in, you “hit a wall”, and no amount of extra resources will help you. “Allows you to throw money at the problem”, as Mr. Davidson puts it, actually describes all three curves, except for this worst case of curving upward asymptotically. But as long as you don’t hit a wall, “Can deal with workload” is satisfied. The more interesting questions, though, are how much it costs you to add capacity, and whether there’s a certain number of requests above which you start to make or lose money.

Of course, the ideal curves are not what you see in practice. In reality you buy resources in chunks, such as a server or a specific plan of bandwidth, power, and rack space from your colocation provider. The graph looks more like a staircase, and going from N customers to N+1 customers means you have to spend $thousands on new hardware. Each of those chunks represents a certain amount of capacity. Capacity is just a measure of how large each chunk is, or of the largest value of X that your server cluster can support without more resources.

But you can’t just extrapolate from the fact that a single server S will support, say, 100 requests per second, that 100 of them will support 10,000 requests per second. If only it worked that way, capacity planning would be really easy. Sadly, architecting a web application for linear scalability is hard. (It’s doable and the approach is fairly well documented, but it’s not easy.)

I think it’s worth pointing out something now which should be obvious: the slope of a straight line doesn’t change its curvature. If you’re paying a silly amount for each request because you’re using an inefficient architecture that scales linearly, but you’re making an even larger silly amount from your customers, you’re still going to be in business if you have 10x as many customers. You may be leaving money on the table due to inefficient use of resources, but you’re not ruined.

If you’re lucky enough to be in that situation, you can probably hire one or two sysadmin/developer ninjas to optimize your app and change the slope of your line downward. Alternatively, you might decide to increase your profits by just buying more servers and advertising. You could even do both.

Likewise, if you’re losing an average of $5 per customer visit regardless of how many customers you have (with that hard-to-attain linear scalability again), then adding a bunch of servers isn’t going to help you. Sun, Compaq, and Dell sold a ton of server hardware in the late 1990s to companies that didn’t understand this.

In a more realistic scenario, you might be paying a lot for servers but not have much revenue. Improving your application’s efficiency would reduce the cost of your resources somewhat, and you might change from losing money to making money. That’s great, but if your curve bent upward before, it still bends upward, and if you were gonna hit a wall before, you still will. The right decision might be to worry about that later, once you’re profitable. At that time you could afford to change your architecture to scale better. Or, you might choose to invest in the future and focus about scalability improvements and growing your customer base now, losing money now but making piles of cash later when you eventually improve your efficiency.

In conclusion, to tie the two terms together: scalability is a measure of how cost-effectively you can grow (or shrink) your capacity.

And to tie this topic to my ongoing claim that trading runtime language performance for developer productivity is generally a good idea for web apps:

Language performance does not affect whether an application scales or not. It is a coefficient to the cost of capacity.

The cost of capacity affects the slope of your curve, but not the curvature. That’s important. Your architecture and application design are what affect the curvature of your scalability. You need to pay attention to both: the curvature of your scalability function and the cost of capacity will tell you where to invest your developer and sysadmin resources for the best return on investment.

Evaluating Future Web Application Technologies

Jamie Flournoy — Mon, 12 Nov 2007 23:22:30 +0000

Technical Architecture is a Form of Investing. I’m reminded of this sort of thinking because of recent news from RubyConf 2007.

First, IronRuby joins Ruby.NET in providing a Ruby runtime on .NET. They’re at different stages of completeness, and building on different .NET runtimes (DLR vs. the regular CLR), but the important point is that Microsoft is investing in dynamic languages. Is it ready for production today? Probably not. But keep an eye on Ruby, Python, and JavaScript if you’re a .NET developer.

Second, JRuby 1.1b1 has been released and as expected is considerably faster (see item #5 in this link) than the standard “MRI” runtime. JRuby joins Jython and Rhino in providing a JVM-based runtime for a dynamic language, with features designed to help developers mix and match the dynamic language code with Java code.

See the trend here? Python, Ruby, and JavaScript are emerging as the dynamic languages of the future for .NET and Java developers.

The hard work done by Sun and Microsoft to make their VMs work well is being leveraged by the next wave of languages. Threads, high performance I/O, memory management, and portability are all features that are quite expensive to get right, and the .NET and Java platforms have pretty much achieved that at this point. (Piggybacking newer, higher-level languages on these mature runtimes means that you get a mature new language runtime faster than if each language’s runtime were built from scratch and painstakingly debugged in isolation from the others.)

There are still some hurdles (performance, type safety fears, lack of mass market acceptance, ECMAScript 4 standardization and adoption, etc.), but in 2 or 3 years, things are going to change dramatically in the web application development world. The seeds of this change are already sown, and it’s just a matter of time. Threads, SQL, OOP, and garbage collection are all features of web application architectures that were initially controversial, but have now met with general acceptance. Dynamic languages are clearly the next step.

Obviously, Java and C# are far from dead, and in 10 years people will still be coding in Java and C#, because as with other languages like C and assembly, the newest and highest-level language isn’t automatically right for every project. But if you’re building web applications, most of what your code does falls into the categories of string manipulation, collection operations, or file and socket I/O. Image processing, crypto, full text search, and other CPU-heavy, byte-twiddling features may be part of your application, but you’re not writing the image scaler, RC4 cipher, or inverted index yourself; those are done in a library, probably written in C, and you’re just calling it. So your needs are likely to be similar to the sweet spot of dynamic languages: maximum expressivity and the fancy features to let you write clever code, making you productive and making the code as clean and elegant as possible. In other words, they put developer productivity first (lower labor cost and shorter development schedules) at the expense of runtime performance. Since hardware gets cheaper over time but code gets uglier over time, this is probably the right choice to make for most web application projects.

Another interesting benefit of layering instead of starting over is that the integration between dynamic languages and Java or CLR languages is much nicer than managed vs. unmanaged code in .NET or, even worse, JNI in Java. That is, it won’t be a bloody mess to mix and match code, from a technical feasibility standpoint. This matters, because The Big Rewrite is among the Things You Should Never Do. But little bitty rewrites are fine, especially if you have a thorough test suite to help you avoid breaking things. (By the way, dynamic languages are great for writing automated tests.)

Which of these three (JavaScript, Python, or Ruby) is going to be dominant 5 years from now? I don’t think any of them will be. The dynamic language community is fragmented, and the various vendors and big sponsors of these three languages are fairly entrenched already. Microsoft is investing in all three; Google has standardized on Python and JavaScript to the exclusion of Ruby; Sun has hired the JRuby team; Mozilla is heavily invested in JavaScript; Adobe supports JavaScript in AIR but not Ruby or Python, etc.

In fact, if you encapsulated the glue code sufficiently well, you could mix and match JavaScript, Python, and Ruby in your application, and port your hideous hydra between the JVM and .NET. You would be wasting a lot of effort since the three languages are largely similar, but you could do it. Alternatively, you could create a portability layer between the DLR and the JVM a la WxWindows, and write-once-debug-everywhere in a more productive language than Java.

These are all repugnant ideas, but only because as I write this and as you read this, we probably realize that to attempt this today would be a huge task. But what about in 2010? Probably not so gross. What about an application that could be executed on Silverlight, AIR, Firefox, SWT, and Mono, unmodified? How about a mobile app that runs on smartphones regardless of the runtime (.NET vs. J2SE)? Not gross at all, and not unthinkable if your app is written in JavaScript using some kind of portability layer that doesn’t exist yet.

In the longer term, JavaScript (a.k.a. ECMAScript 4) is likely to become extremely popular. As far as I know it’s not quite a perfect fit for Steve Yegge’s The Next Big Language, but it’s the closest thing there is, and it has two critical advantages over Ruby and Python that will make it successful: C-family syntax (which makes development tools cheaper to build) and effectively unanimous buy-in from vendors and developers.

So, what about the other dynamic languages that people are using in large numbers today? What’s going to happen to ActionScript, CFScript, PHP, and VB?

ActionScript and CFScript are pretty close to JavaScript by design; I’ve read that ActionScript 3 is actually compliant with the ECMAScript 4 draft specfication. It’s pretty clear that Adobe is betting on JavaScript. In the near future (2 or 3 years) I predict that Adobe will rev its products and support ECMAScript 4 across the board.

PHP and VB.NET/VBScript will hang around for a long time because they’re approachable and already very popular, but they’ve already peaked, and will steadily decline as developers switch to C# (on the .NET side) and Rails (on the Linux side), and then JavaScript as soon as a serious web app framework and an ISP-friendly runtime exist. Microsoft will keep investing in VB to keep customers happy; Yahoo will keep investing in PHP because it is so heavily invested in PHP already; new developers will find PHP to be an easy starting point for light duty web development, with tons of documentation and free applications that they can download and hack. But PHP will not inherit the kingdom from C# or Java, and the languages which do achieve mainstream success after C# and Java will do everything that PHP does language-wise, and the market momentum around those languages will make them better than PHP at what PHP does. Developers will ask themselves why they would write the client side and server side in two different languages, especially when the server-side language is more expressive and has better portability and libraries. That’s not true yet, but it will be in a couple of years. In 5 years or so PHP and VBScript will go the way of Perl CGIs: still used, but by a community a tenth of the size it is today.

What about the new Java-based dynamic language, Groovy? Groovy is interesting, but it’s too late. The Java mainstream of vendors and developers only recently managed to convince the world of “serious” C++ developers that automatic garbage collection and JIT compiled bytecodes can actually work in a high traffic context. The next battle, to promote the dynamic language features that Java lacks but which Groovy brings, will take years to fight. Once a developer makes a decision to not use standard Java, Groovy is on a more or less level playing field with the JVM-hosted versions of Python, JavaScript, and Ruby, but each of those languages has far greater adoption than Groovy, and each of them has greater opportunity for leverage on other runtimes than Groovy. For a Java developer, once the door is opened to other languages, the only advantage Groovy has is that its syntax is familiar. Compare this to JavaScript which web developers also need to know how to use; why learn a third language (Groovy) in addition to Java and JavaScript? Over time the simplicity of coding and debugging in JavaScript on client and server, together with dynamic-language productivity, will overcome the momentum of the Java standard, and web developers using server-side Java now will gradually replace it with JavaScript on the JVM. Conservative attitudes in the mainstream Java community (including Fortune 500 companies and the many offshore development firms that write code for them) will make this take quite a while – probably 5 years before JavaScript becomes a common part of the architectures that currently use J2EE, and 10 years before Java goes the way of COBOL (maintained forever but not used for new projects).

So in conclusion, keeping an eye on the future value of a technology, including who’s investing in it and who’s talking about investing in it, is critical to making your own investments today. In five years you’re not going to be using the same technology stack that you are today, and your project’s success and your own salary will be tied in large part to how well you invested today.

Technical Architecture is a Form of Investing

Jamie Flournoy — Mon, 12 Nov 2007 22:02:06 +0000

Technical architecture (the act of researching and specifying a set of technologies to address a particular need) is a form of investing. Sadly, like stock market investors, many technical architects are blinded by hype, hero worship, tribalism, and short-sightedness, and make poor decisions as a result. A comparison between current web application development issues and the stock market may help you to avoid these tendencies in yourself or your team.

All too often in architecture, we get defensive about decisions we’ve already made and attack things that we’re ignorant of, forgiving the faults of our Own Stuff and downplaying the strengths of Their Stuff. We assume that Our Stuff is going to win and Their Stuff is going to lose, because we like Our Stuff and hate Their Stuff, and the best stuff always wins, right?

It doesn’t work like that, though, and our judgement is biased toward the things we already are familiar with. The stock we’re holding is undervalued, while the stock we want to buy is overhyped. The technology we’re using right now is obviously better than the technology we don’t know anything about. The stock will bounce back; we’ll sell it when it turns around and hits twice its current price. The new release with feature XYZ will make everything so much easier that we’ll be able to take Fridays off.

Concepts from financial investing that are helpful when thinking about technology more objectively are the Efficient Market Hypothesis, Fundamental Analysis, Market Capitalization, and Net Present Value. Each of these represents a structured way of thinking about the current opinion of the value of something in a manner that includes its potential future value and still gives you a single number to work with.

Do you know something that most other people don’t know? Paul Graham says you can exploit this, in Beating the Averages says . EMH says you’re fooling yourself.
Are you judging something based on hearsay? Fundamental Analysis is a tool for thinking about whether a stock is as good or bad as people say it is, and how that is likely to change in the future.
Are you ignoring the behavior of others when making decisions for yourself? Market Capitalization is an aggregate of what everybody thinks, regardless of whether they’re right or wrong.
Are you judging something based on its current value instead of where it will be, and how much it might cost you to get to a certain point in the future? Net Present Value is a tool for figuring out the value of future outcomes so you can decide whether it’s worthwhile to try to achieve them.

All of these concepts are just models for thinking about what’s really going on, and as such you should understand their shortcomings before using them. A major shortcoming is that your predictions are still going to be based on your subjective experience, and as such may be biased toward making you feel good about the choices you’ve already made. But they still hold lessons for techies trying to make choices about technology choices, for current projects or for improving your future employment options.

In Part 2 I’ll talk about why this is a timely topic.

ActiveRecord: the Visual Basic of Object Relational Mappers

Jamie Flournoy — Fri, 05 Oct 2007 02:07:53 +0000

I’ve been working with Ruby on Rails intensively for several months, and I’ve finally found a place where Rails can’t readily be extended to do what I want. It’s ActiveRecord, which is probably the most controversial part of Rails.

I’m reminded of a James Gosling quote disparaging Microsoft tools, particularly Visual Basic: “The easy stuff is easy, but the hard stuff is impossible.” There’s a parallel between VB and Rails in this instance, in that if you only let yourself use the high level tools, the hard stuff is impossible, but the designers specifically tell you to do the hard stuff using a lower level toolset. The controversy that surrounds “X can’t do everything, therefore it sucks” should really be focusing on the feasibility of going through that trapdoor to do things “the hard way”. This is what Delphi did, which is why so many folks chose it over VB; it made the hard stuff easier.

Here’s the task I need to accomplish, for which ActiveRecord is not well suited: complex queries involving SQL functions and multiple-table joins. I want to join a few tables together, order by a SQL function, include with each result row the result of a SQL function that operates on each row, and have all that come back as a graph of high-level objects.

Despite my attempts to use plugins, extend and/or fix bugs in those plugins, and to dig through the ActiveRecord source to figure out what the documentation won’t tell me, I was unable to get it to work. Most of the parts of what I wanted was possible: acts_as_tsearch cleverly weaves SQL functions into a high-level ActiveRecord::Base.find calls; paginating_find provides a very convenient pagination API on top of ActiveRecord::Base.find, and ActiveRecord includes some clever association tricks such as automatic many-to-many relationships (has_and_belongs_to_many), eager loading of associated records using a join (via the :include option to ActiveRecord::Base.find), and a fairly low-level :joins option that lets you add tables to a ‘find’ query which can be used in your :conditions. Problem is, they don’t all work together in a fancy way.

Really, the issue in this case is related to the design choices that went into ActiveRecord.

Some ORMs (object-relational mappers) are designed in a modular fashion: there is a part that helps you describe the relationships between your model objects, a part that helps you construct queries, and a part that does the storage and retrieval. Sometimes there’s another part that uses your description of object relationships to create an empty database with the appropriate data model, or that looks at an existing database and creates an object model that matches it. Sometimes there’s an import/export tool for bulk data loading or dumping as well.

ActiveRecord has the first three functions integrated (which has benefits and drawbacks compared to a more modular approach), has a very isolated schema manipulation module, and has a somewhat isolated data loader tool.

The relationships are explicitly declared in source code using associations: has_one, has_many, belongs_to, and has_and_belongs_to_many. These are pretty fancy and provide some convenience features that make the associations appear as object collections, such that changing the collection and saving it turns into insert/delete/update activity in the database.

Query construction is basically tied to the objects themselves, in a way that greatly simplifies star-join queries, but which handles only the simplest joins across multiple tables, and is barely able to handle self-referential joins at all. So, you can easily load an object (or group of similar objects) and associated objects, but OLAP-style queries (“what are the top 5 states where customers are located who have bought classical CDs within 2 weeks of their release using American Express and had them shipped as gifts via UPS 3-day Select?”) are impossible. Oddly, views, functions, and stored procedures could bridge the gap between real-world data models and ActiveRecord’s limited set of association types, but they are not supported either.

The storage and retrieval code is inseparable from the query code, and so it is not possible to examine and modify the final SQL before it is executed, nor is it possible to provide an arbitrary query and have the results be parsed into an object graph based on the associations you have defined. The code that would allow these features appears to exist and be sufficiently well designed to allow this with a fairly small amount of changes to ActiveRecord. However, it is currently (as of Rails 1.2.3, which is the current release) not part of the documented API and is declared private.

There is a limited facility for constructing simple objects from arbitrary SQL, in find_by_sql. This loses essentially all of the high level functionality of the find method; most notably, it isn’t possible to use find_by_sql results to instantiate an object graph, rather than a flat array of objects (similar to the eager loading feature in the regular find method).

ActiveRecord has fairly good high-level schema creation functionality (“migrations”). Though it lacks concepts for all but the basic database objects, support can be added for foreign key constraints (I kid you not, they aren’t supported by Rails itself!) and views. There’s also a simple way to execute arbitrary SQL. Migrations aren’t technically that amazing, but rather they’re a helpful organizational approach to what can be a really hairy problem: defining a schema and then applying changes to live databases while keeping track of what changes you’ve already applied.

Finally, there is a test data loading facility called Fixtures. The common opinion of Fixtures seems to be that they are broken by design and should be avoided. The main issue I’ve found with them is that the implementation ignores the kind of database design elements that any book on SQL would recommend, such as foreign keys and check constraints. I managed to circumvent this with a combination of a plugin and some customization, described in detail in my previous post, Rails, Fixtures, the Test DB, and Test::Unit. With those changes, all test fixture data is preloaded in the right order (so constraints aren’t violated) before any tests run, and any data alterations within tests are rolled back automatically by Rails.

A secondary issue with Fixtures is that they go directly from YAML text files to SQL INSERT statements, bypassing the ActiveRecord Model classes. ActiveRecord does pretty much rule out any fancy mapping between database tables and objects, so that’s not a problem, but this model-skipping fixture loading implementation means that any code in your model object (validations, before_save filters, etc.) will not be executed when loading fixtures. So fixtures do not work well with the otherwise pervasive Rails design rule of “put all the intelligence in the application”.

Still, despite the commonly-held disdain for using fixtures at all, I find that they can be tamed. In fact I’ve even created a base data facility for loading the fundamental data set that needs to be in the live database (e.g. initial admin user info). My approach is basically to alter fixture behavior to treat it as essentially a bulk data loading tool, and to do the extra housekeeping after loading to make up for the fact that the ActiveRecord model code was bypassed.

As far as I know, there is no bulk data dumping functionality in Rails.

So, to summarize, of the five main ORM features, here’s how ActiveRecord stacks up:

Describing Relationships: Easy to understand and use, with lots of slick functionality
Querying: Easy to understand and use, but limited to simple join structures, and not possible to customize query building or rewrite SQL before execution
Storage and Retrieval: Very easy to use, but only within the limits of the query builder’s features
Schema manipulation: Easy to understand and use; limited in functionality but readily extensible; solid third party plugins are available for missing schema objects
Bulk Loading and Dumping: Loading is badly designed and implemented, but fixable with some effort; dumping is not offered

Okay, so it definitely makes the easy stuff easy. But what about the rest?

As I observed before, ActiveRecord is not designed as a set of modules that you use to assemble a solution that fits your needs. That’s more of the Java approach to design, and it trades flexibility for convenience. It can be a major pain to assemble a working system out of all of those abstract Java APIs, which are sometimes so comically over-patternized as to draw mockery such as the hilarious “Are Javalanders Happy?” code snippet from Execution in the Kingdom of Nouns. Rails makes the opposite trade-off: sacrifice flexibility and gain a very approachable API.

Unfortunately, the Java approach (too abstract to readily use, but extremely flexible) is easily wrapped with a simpler, more convenient, less customizable API. The Rails approach isn’t internally componentized (have a look at ActiveRecord’s activerecord/base.rb source file in its 2,165-line glory, almost all of which is one class), so if you want to fiddle with its internal behavior, you can’t. So with Rails, it’s all or nothing: high level slickness for simple requirements, or hand-written SQL and hand-coded results mapping for your complex requirements.

As I said at the beginning, though, the key question is not how comprehensive the high level feature set is. More important is the question of how painful things are when you drop down to a lower level for a greater degree of control.

It would be nice if there were a middle level of complexity, between the high-level ‘find’ method and ‘has_xxx’ associations, and raw SQL. There isn’t. I think that the reason there isn’t one is that there is still a persistent belief among many Rails core team members and community members that databases should be stupid: just a persistent hash. Once upon a time I worked that way myself: I didn’t have access to or skill with a SQL RDBMS, and so I solved all of my persistence problems with DBM files, which (using Perl’s Tie::Hash class) are conceptually just persistent hashtables. miniSQL was little more than a SQL query parser on top of that sort of storage engine, and MySQL originally was pretty similar. But big databases have all sorts of useful features that address complicated persistence requirements in a fairly elegant way.

Given that Ruby fans like the idea of domain specific languages, which let you work in a super high level language customized to the problem domain, it’s surprising that Rails groupthink is that SQL is bad. It’s actually a very high level language, and allows a well written database to do some pretty amazing optimization on the fly because it provides a strong layer of abstraction between what you requested and how the storage engine provides it.

No, it’s not dynamic, nor is it pure relational perfection, but it’s pretty darn good. Pre- and post-event validations and arbitrary callbacks to user-specified code, functions providing behavior on top of data… these are all things that Ruby and Rails fans hold in high regard when provided by Ruby and Rails, but which are considered a bad idea at the database layer. As I discussed at length in Rails and the notion of Stupid Databases Being a Good Idea, this is a philosophy rooted in DRY, but it has some major flaws.

Mainly, there is the issue that some things must be done in the data tier, and trying to put them in the application tier doesn’t work. The best example that comes to mind is full text search. Satisfying queries is the database’s job, period. It’s just hideously slow to try and do an inner join in the application across a network link to a database. If you find yourself doing this, that’s a pretty good sign that your architecture is broken. But some queries are too complicated for ActiveRecord, so sometimes you must choose between a series of high level queries whose results are intersected in application code (easy to understand, but extremely inefficient), or hand coded SQL.

Well, SQL is fast and is a high level domain-specific language, so it isn’t actually a bad tool for the job. The problem is that this approach (the trapdoor to the lower level API) is regarded differently by different people. Some see it as a common and reasonable approach to complex requirements; others see it as a bad evil scary thing that should be avoided at all costs, a kludge and a design mistake.

As a result, the low level option in Rails is anemic. It’s there, but you’re not supposed to use it. Ruby’s ActiveRecord Makes Dropping to Raw SQL a Royal Pain (Probably on Purpose) notes that there are no bind variables allowed in ActiveRecord. You may be saying, “No, wait a minute, I’ve used them, that can’t be right.” That’s what I thought. Look at the source; the bind variable functionality is actually a high level feature built on top of drivers that don’t have that feature. Whatever you did at the high level, it’s going to the driver as a single string. Okay, it’s nice that they added that feature, especially since it provides a single point of testing and verification for safe escaping. But that functionality (in sanitize_sql) is not part of the public API. Fortunately that same article provides a workaround that makes sanitize_sql accessible, so you can use bind variables in your hand coded SQL code, and pretend that the driver supports them. But that’s not likely to work forever.

The key problem with ActiveRecord is its least common denominator feature set, based around the least featureful of all popular SQL databases: MySQL. Years ago, MySQL AB (the vendor of the MySQL database) took a strong philosophical stand against pretty much any advanced database features (which their product lacked, and which competing products had), but lately they’ve softened and added those features that they claimed nobody really needed. In the meantime, Rails has been designed with minimal expectations for database sophistication; therefore, the limited functionality of ActiveRecord is fairly complete, assuming you’re using a database with similarly limited functionality.

Triggers, stored procedures, functions, data integrity constraints, nested transactions, and views are all examples of unsupported database functionality. Try and use them via ActiveRecord’s high level API, and you will quickly see how fragile and inflexible ActiveRecord really is. If you shouldn’t need those features in your database, then you shouldn’t need anything that ActiveRecord doesn’t already provide, so it shouldn’t matter that you can’t extend ActiveRecord.

Truly, these are features that you need only in a few small cases in your application, so looking at individual queries they’re needed rarely (which is not the same thing as “never”). But looking at whether you need one or more of them in a given application, they’re needed more often than not. The pain of using hand coded SQL makes this worse: some tricky things could be done either using a view or stored procedure, or using a really slick dynamic SQL statement. Making all of those options painful means that even a clever developer can’t use anything in their bag of tricks to craft an elegant solution.

Unfortunately, non-trivial web applications need things like full text search, complex associations between persistent objects, non-trival summary information about associated objects, and complex reports, and ActiveRecord fails at all of these. These are not just things that big dumb ancient companies that like using Object COBOL think they need; Amazon and eBay need them too.

The acts_as_tsearch plugin is a good case study of ActiveRecord’s design flaws. TSearch2 is the standard PostgreSQL full text search engine, and it’s pretty good in my opinion. It’s also pretty straightforward to use. Unfortunately for developers using Rails, TSearch2 uses SQL functions (mainly to_tsquery and rank_cd). The acts_as_tsearch plugin tries to inject SQL into ActiveRecord’s queries via the high-level find interface, but ultimately fails as soon as you use the :joins or :include options. The problem is that ActiveRecord has a very simplistic idea of how queries and joins work, and so if you need to inject SQL functions to get the job done (as is necessary in TSearch2 queries), too bad. (See also issues 7 and 8 in acts_as_tsearch, in which I describe and attempt to clean up the mess that results when you use find_by_tsearch in non-trivial ways.)

A fellow Rails developer asked me in all seriousness why I wasn’t abandoning the full text search functionality of TSearch2 and just using a completely separate, redundant database product designed exclusively for full text search. Seriously, that is considered the “easy” approach: one database for full text search, and another for ACID/OLTP/CRUD. Honestly if I were going to go down that road I would try hard to just abandon the SQL RDMBS and put everything in the other database, since Lucene and its imitators are capable of far more than just find-text-in-document queries. The pain of duplicating everything, using two query languages, two document representations (in addition to the object representation in Ruby) and writing application-tier query correlation makes the double-DB approach seem very unwise.

It makes far more sense to me to use the SQL RDMBS’s full text search facility, even if there’s a 2x or 3x read performance penalty, because the conceptual simplicity of having one powerful storage tier (instead of two halves cobbled together) eliminates a ton of ugliness in the application, and the SQL RDBMS is going to get clustered for reads anyway. Nevertheless, even if I’m wrong about this case (putting search in the SQL RDBMS instead of in a separate server), there are other cases for needing a smart database that gives you exactly the results you need and lets you push data logic into the data tier.

So, what do I suggest? Abandon Rails? Nope. I still like Ruby a lot, and find Rails very useful. I just think that ActiveRecord needs to support the low-level and middle-level abstractions better.

Specifically, supporting bind variables (either by exposing that sanitize_sql function, or better yet by making drivers and connection adapters support bind variables for real) would make the find_by_sql, select_all, and exec approaches to low-level SQL query execution less painful.

More difficult, and substantially more valuable, would be refactoring ActiveRecord::Base to split it up in the way I described above: association descriptions and unmarshalling code separate from query building code separate from SQL execution and result retrieval code. All of this could remain hidden for most users under the same old slick high-level API, but for advanced requirements, the ability to fiddle with the SQL and still use the built in high-level unmarshalling code to create object graphs from flat result sets would be very powerful, and useful.

I looked at one alternative to ActiveRecord, called Sequel, which overlaps with ActiveRecord only partially. It is a query builder and lazy result proxy, which is actually what I thought ActiveRecord would do when I first started working with Rails. The proxy design means that you can either keep adding constraints or start fetching results, from the same Dataset class. This seems like a pretty good approach, though I haven’t really looked closely to make sure it would fit what ActiveRecord needs.

What Sequel lacks, though, is the unmarshalling side: turning a 2-dimensional (rows of columns) result set into a complex object graph (customers with orders with order lines with products from suppliers stored in warehouses), with user-controlled eager or lazy loading behavior. Ruby is well-suited to a design that would allow user-specified code (i.e., a block) to decompose each row into the object graph associated with that row, leaving the remaining associations on those objects to be lazily provided via future queries.

So, I think there is hope for ActiveRecord, definitely. I considered the idea of rolling a minimal Hibernate clone, or some other sort of challenger to ActiveRecord, but I don’t that ActiveRecord is broken beyond repair. I think the shortest path to a badass Ruby ORM is through improvements (refactoring and abstraction) to ActiveRecord.

So, if you’ve read this far, you probably care about these issues. Here’s my call to action: Please help me make ActiveRecord less like VB and more like Delphi. Who else is interested in helping me with this effort? Are there alternatives that I’ve missed, or components that could be integrated into ActiveRecord to make it better?

Immature developer attitudes revealed in flames regarding CDBaby

Jamie Flournoy — Sun, 23 Sep 2007 20:36:31 +0000

Derek Sivers of CDBaby kicks ass. He got a sophisticated and very very user-friendly, efficient, straightforward e-commerce system (including the back-end systems) written in PHP. Based on what I’ve read, he’s up there with Phil Greenspun in my opinion; that is, he’s among those who understand strategy and customer service and low-level technology and are able to build systems that don’t suck, resisting the temptation to be distracted by technological panaceas and fads. I may disagree with their individual technology decisions, but their higher-level thinking is excellent, so they’re definitely in the class of people who I’ll give the benefit of the doubt.

So when I read 7 reasons I switched back to PHP after 2 years on Rails I was a bit surprised, but not much. He’s experienced with PHP (he says he’s written 90,000 lines of code for CDBaby!), and has a huge installed base of code he wrote and understands intimately. He tried Rails, it didn’t work the way he wanted, and he went back to PHP. It was immediately obvious to him that this was what he should continue using.

The most shrill and arrogant among the Rails community have been rather unkind, partly due to this rather poorly written Slashdot headline that misrepresents what Derek says in his article.

I’m using Rails and I’m generally happy with it, though I have had to do some customization of the framework (mostly with pre-existing Rails plugins from like-minded developers) to suit my style. At this time I plan to continue using Rails for my project, and to keep using it in future projects.

But, I thought it would be instructive to summarize the arguments made by the folks slamming Derek in the comments to his article.

Rails is the correct solution for CDBaby, regardless of what the founder//designer/lead programmer of CDBaby thinks.
If you don’t agree with Rails’ design, you’re wrong, and you need to change your design if not your whole business to fit Rails.
PHP is bad and can never result in good code. Conversely, Ruby is good and is always better than PHP.
Objects are better than SQL and tables. Domain specific languages are great, as long as they’re written in Ruby; using SQL as a domain specific language for data manipulation and querying is not Ruby and therefore is a bad idea. Also, just because one person can code something in PHP in 2 months that runs on one server, instead of two people taking over two years to do it in Rails (which is notoriously hardware-hungry), doesn’t make PHP + SQL better. Because objects make you more productive.
Just because you wrote a successful online store yourself from scratch, have an O’Reilly column, and then hired a Rails core team member who now works for the same company as most of the rest of the Rails core team including DHH (37Signals), doesn’t mean that you actually had people smart enough to rewrite your app in Rails. You need to prove to the world that what you wanted to do was impossible in Rails before we’ll give you permission to make technical decisions for yourself.
If you don’t like Rails it must be because Rails is too good for you. Maybe instead of learning from years of real world experience running a business and writing the entire software suite for that business yourself, and hiring one of the best Rails developers in the world for your transition to a new architecture, you should have gone back and gotten a CS degree.

Since this is such an active flamewar with such sloppy readers (responding to the Slashdot headline’s misapprehension of what Derek wrote, instead of what he actually wrote), let me say this clearly:

The above listed points are not my opinions. They are summaries of opinions I find immature and silly.

(I’m assuming someone somewhere will read this post in anger and make themselves look foolish by arguing against these points as if I believed them anyway. I tried to warn ya…)

We can learn something from this. This same flamewar keeps appearing over and over and over, all over the internet, and before that, on BBSs and in print.

The simple fact of human mortality means that most of us are going to be learning and making mistakes that we imagine a wise, seasoned developer wouldn’t make. But that guy retired and is fishing, so it’s up to us to screw up and learn and hopefully do better next time.

For the last few thousand years, we’ve had the advantage of being able to read books, and get wisdom that way. But there are fads, and hyped books that seem to know it all. So you have to read a lot of books, and try a lot of contradictory ways of doing things, to get a mature enough perspective to make wise choices on future projects.

The point that is generally missed by everyone trying to do anything new to them, is that there’s a lot more out there that you don’t understand, and a lot of what you think is your brilliant new invention has been done before. The more you learn, the more humble you become as you learn that there are people WAY smarter that you are, and that there are people who have come before you who you will never catch up to.

In the case of web development, that means that there are people out there who disagree with you, and you might not actually know everything about everything like you think you do. Local experience and immersion in the problem they’re trying to solve makes them much better suited to solving their problem than some armchair quarterback. It’s tempting to fold your arms and feel smug about your quick dismissal of someone else who clearly isn’t as big a genius as you are, but the more certain you are of your own brilliance, the more likely it is that you’re just ignorant. (See also: Unskilled And Unaware Of It.)

In Derek’s case, he clearly jumped feet first into Rails and hired a rockstar developer, and then made the decision given a uniquely advantageous perspective that it just wasn’t working, after giving it a hell of a lot more time to pay off than I would have. (I’m not exactly sitting here scratching my head wondering “how could Derek have been so wrong?”)

What matters a lot more than choice of programming language is the ability to get the project done, meaning tested and correct and launched. Apparently for Derek, PHP is the way to get that done, and Rails ain’t.

Finally, consider the parallels between Derek in this case, and Phil Greenspun in his book (late 90’s). They both use SQL more than the “Objectistas” would like, which is to say, they use it directly and like it. They both use languages that are not the most fancy high level languages available, but they do completely understand how their application works, from UI all the way down to hardware performance, and they make integrated decisions that take business strategy and technological factors into consideration.

This is really critical to understand about why both of these guys are so smart: both of these people put aside dogma and made decisions that were all over the map, sometimes pragmatic and hackish, sometimes very rigorous and disciplined, depending on their assessment of the particular micro-issue they were addressing at the moment. They weren’t subscribing to any particular software development philosophy (Big Ball of Mud, XP, RUP, UML, Agile whatever, waterfall, etc.), nor did they just choose the most hyped architecture of the day and blindly stick with it, but freely mixed and matched whatever they thought was appropriate given their perspective on the situation.

In other words, they didn’t fall into the trap of thinking that there is One True Way to do everything. They improvised. They integrated their own experience and perspective with the wisdom of others, and made the decision that worked for them.

I hate to sound like a moral relativist, because I’m not. But as a practitioner, I’m definitely becoming more and more of a process relativist every day. One of the best ways to make a project fail is to try and force-fit an architecture and/or a methodology onto that project without customizing them to the particular details of the project. The best methodology is called “Roll Your Own.”

And so I leave you with this question, in the hope that it will help you with your current and future projects: Have you made any dogmatic decisions about your process or technology that are hurting your project?

Maybe it’s smarter at this point to keep doing what you’re already doing, but maybe there are some things that you could change. Try to keep your Mind Like Water, strong enough to cut through mountains but flexible enough to flow around obstacles, and making the decisions of which one to do based on the information that you and only you have.

J2ME: Write Once, Be Disappointed Everywhere

Jamie Flournoy — Mon, 20 Aug 2007 05:37:59 +0000

We developers and other nerdy folk are used to using strange and klunky applications that do something special, and we’re used to that trade-off.

Eclipse is an IDE so it’s hard to imagine it not being baroque and difficult to use, requiring weeks of effort to become productive. JBidWatcher has saved me a lot of money on eBay so I could probably put a dollar value on how much it’s worth to endure its bizarre UI. Azureus is fairly fugly also but it does a very good job and has a deep, sophisticated UI that’s fairly easy to understand, so despite the eyesore, it’s at least fairly clear. The common thread among all of these is that they are all written in Java, and that they are so valuable that it’s worthwhile to overlook the ugly UIs.

Now imagine those sorts of trade-offs, but on already difficult to use mobile devices, and aimed at consumers. Are you making a strategically wise choice by sacrificing usability and control over the user interface, and probably access to platform-specific features such as dialing the phone, in order to save money on development? Adam Breindel talks about this in When Building a Smartphone App, Resist the Siren Song of J2ME.

Adam and I worked on a J2ME application and I totally agree with him about the disillusionment of trying to write a single app that would work across phones. Issues include:

Complex and difficult application installation procedures for end-users: How do you install the JVM on the phone? How do you get the plain J2ME app packaged up so that the phone will accept it? Does the app require manual user configuration before use? Is there a different launching process from other apps?
Lack of control of the user interface: hardware details such as how many buttons you have, whether there’s a stylus, etc. differ from phone to phone, and the API to let you code once and let J2ME handle the layout for each device leaves your code very disconnected from what’s actually happening on the screen.
Not being able to use recent J2ME APIs because even the latest phones only support older, more minimal J2ME APIs
Not being able to do things on a phone that would seem obvious, like dialing the phone, opening a hyperlink in the phone’s browser, sending an SMS, or making a network connection. Either these are entirely impossible or require phone-specific or JVM-specific tools and procedures to sign your application, or having the runtime nag the user to request permission to do something that they just asked the app to do for them.

For all the noise Sun is making about broad J2ME penetration, the developer experience is quite disappointing, and as a result, the user experience is also quite disappointing. You can look up J2ME features and APIs and get excited, but when you actually deploy your app to a handset, it won’t load, or runs terribly slowly, or looks awful, or simply doesn’t do the things that the API says will happen when you call it a certain way. Suddenly the strict J2SE and J2EE logo certification programs make sense, because the J2ME approach of making so much functionality specified but optional leaves developers high and dry. The phone supports J2ME version xyz, but write an app coded to that API that works on the emulator and deploy it to a handset and lo and behold, all those optional APIs turn out to be missing even though the handset is capable of that functionality, and some mandatory API functions are not working. Here be dragons.

Case in point: can’t dial the phone on a Treo 650 (at least, not as of a year ago). The J2ME API tells you how to do it. The PalmOS JVM (made by IBM) lets you make the API call, and returns a successful response. Nothing happens. IBM says they’re aware of this issue. The end. The docs say you can, the code you write says you did, the phone just doesn’t do it.

Case in point: Every time you start an application and it accesses the network for the first time on a Treo 650, the user is nagged for permission to access the network. Quit the app and start again, nagged again. IBM has a tool that you can use to sign the app, but you have to use their VisualAge Micro Edition IDE which costs hundreds of dollars to do that. Try and find and download the trial version. A year ago, it was not possible. So, making that persistent nag go away probably costs several hundred dollars. (I never verified that it actually works, just that IBM says the way to sign the app so that it’s trusted is to do that, and that there was no available free way to get that tool.)

These are minor issues, but they certainly interfere with the quick usage pattern of a mobile app, and make it annoying to use your app. Imagine what that would be like if you had a competitor with a native application for that phone, whose application probably cost them more, but their app is better and the user likes it a lot more.

The important distinction here is not cost, it’s ROI. It costs a lot to develop a similar application for each smartphone platform, using that platform’s native tools. It would seem to cost a lot less to develop a J2ME app. But that’s only if you assume that it’s OK to abandon features and settle for a horrid user experience in the course of development.

It’s likely that your goal as a development team is to develop an app that has a predefined feature set that you know the device can support, and a predefined UI design that your mobile-savvy UI people are sure will go over well with users accustomed to that particular kind of smartphone. In that case you will almost certainly fail to accomplish that goal using J2ME. You have to scale back your goal so that you’re satisfied that you got something kinda like what you wanted working on a bunch of phones, and determined users will probably be able to figure out how to install it and make it work.

I call that phenomenon “write once, be disappointed everywhere.”

Let’s continue talking about cost, though. The native apps may require (or suggest) different programming language skills for different devices. It might seem wise to just write everything in C, but I think that’s a false economy as well. The phone APIs will differ so much that you will really need a native developer for each platform, not a team of generic C developers who will figure out the individual phone stuff and be freely floating resources that you can assign to whatever app version needs their attention. Smartphones may run Linux, may run Windows Mobile, may run PalmOS, may run Symbian… these are different operating systems with very different ideas of how applications run and coexist. The platform specific knowledge (APIs, appropriate UI feel, device capabilities) is probably an order of magnitude harder to learn and maintain than the ability to get an application working in a given programming language.

How much of a great C programmer’s skill is really the C language, and how much of it is proficiency with the available libraries on the platform he or she is accustomed to? I think close to 90% of their professional skill set is platform and library familiarity, and 10% syntax and low-level understanding of how the language actually works.

In light of this (just using C doesn’t mean developers or code are portable across smartphones), consider that there are high level languages available for some smartphones. What if that 90% platform familiarity means they can use a language and/or development environment that makes them 5 or 10 times as productive, after spending just a few days or a couple of weeks learning the language syntax?

After our very disappointing J2ME-on-PalmOS port experience, Adam found Handheld Basic which initially appalled me (oh no, BASIC!) but turned out to be a great choice. It’s a flavor of BASIC, so learning the language didn’t take long, and the support happens to be quite good (lots of code samples) so picking up the library portion of that 90% didn’t take long at all. I imagine that C# on Windows Mobile is similar. As more and more phone start to use Linux as their OS (which will be a particularly huge improvement for PalmOS based phones), you’ll be able to use Python, Ruby, Mono, J2SE (a whole different animal from J2ME), TCL, or pretty much any other high level language available for Linux. At LinuxWorld 2006 I saw a development device running unmodified GNOME desktop apps running on a PalmOS device alongside PalmOS apps. There are more options appearing all the time, and with the exception of Handheld Basic, most of them are ports of familiar, mature, well understood desktop languages, with class libraries relevant to mobile devices.

So, I recommend that you work in whatever high level language lets you do all the platform specific stuff you want, and if that means a different language per phone, that’s actually going to be the least expensive way to get to the apps you actually wanted to build. A big chunk of the scary cost of developing native apps goes away, because native doesn’t necessarily mean abandoning Java for C. (In fact I found Handheld Basic to be a more productive environment for me after a couple of weeks than J2ME was, despite my ~8 years of full time Java experience before starting that project.)

That pretty much means no reuse for you. Sorry, but that’s the deal right now.

If that unique-app-per-platform cost is too scary, consider a few ways to save money:

For networked apps (aren’t they all?) ask yourself if there’s some logic that could just as easily be done on the server as on the client. Is there some complicated parsing code on the client that could be simplified by changing the response format from the server to something that’s easier to parse?
Can you remove or alter certain features from a subset of platforms you intend to support, so that your premium supported platforms get your ideal app, whereas a few less popular phones still get a nice app, but perhaps one that doesn’t have every feature available on your premium app. You might be reading this and thinking “but that’s what you said J2ME would force me to do! Why is this any better?” The distinction is that you are in control of the decision of what to leave out to save money, whereas with J2ME it’s the platform vendor who makes that decision for everyone using their platform. If you’re writing an address book, not being able to dial the phone is lethal; not writing the code to let the user attach a photo to the entries is not.
Can you move some user-facing functionality to the server to make the client simpler? Maybe there are some rarely-used features that could be done via a desktop or mobile web browser, so you can focus on putting the ten-times-a-day features on the handset, and making those features fast and convenient. You probably already made that trade-off in general, but perhaps for some kinds of handset, you’ll move the dividing line a little further, so that for users of particularly rare phones, some moderately frequent features can’t be done on the phone. This could also be a good approach for new handset types: design a “lite” app and a “full” app version, build the “lite” app first on each platform, and let user demand tell you whether the full app is worth it. Your developers can tell you much more accurately how much the incremental functionality would cost, since they already have done the lite version. Maybe you’d provide a VoiceXML interface, or mobile web browser interface, for that feature so that the user can still do whatever the feature requires while they’re far from a desktop PC.

A final consideration: labor. Maybe you have some Java developers, or C developers, and don’t have developers good at Handheld Basic or C#, so you’re not inclined to fire them all and hire new developers to do native apps, you’re thinking maybe C on every handset, or J2ME, is still the right choice. I still say that you should probably use native apps in native high level languages on every smartphone platform. If you’re committed to a strategy of good apps on a bunch of different phones, I think I’ve made clear that native apps are the only way to currently get there; J2ME simply doesn’t let you make good apps. So what’s left is C code written by C developers vs. high level code written by C developers who have to retrain.

As I said above, I don’t see much chance that your developers or your code will be portable across different smartphones if you use C. Maybe you’ll get 5-10% savings that way (some code ported across phones, or some developer hours shuffled between platform teams). But you’d get a 5-10x cost saving from using something very modern and high-level instead of C, and the overhead of training for the language would be very very small as compared to the large and unavoidable overhead of having to learn what the handsets can do and how the APIs for the handset’s OS work.

I suppose that means that if you have a bunch of C developers who lack smartphone skills, you’re in a pickle, but that situation seems kind of unlikely to me (a mobile app company hires a bunch of Unix and Win32 C developers with no mobile phone skills?). More likely is that you’d find a mobile developer who is proficient with one or more handset OSs and the best tools for each one, and they may be of the opinion that C is the best choice since they can reuse skills and some code across platforms. I would say that in that case you need to convince them to (or more likely encourage them to do what they were already considering, which is to) go ahead and find a highly productive high level programming language/environment for each smartphone platform.

A final possibility if you have a ton of platforms to target and a pile of killer C programmers is to try and port something like Python across most of your target platforms, or to make a very high-level API or domain-specific language that runs inside your own custom C portability layer. You might be able to find an open source option that you can invest some developer hours in, so that the actual application-specific code that you write on each platform is minimized and portable. But I suspect that this would still be expensive and would result in some J2ME-like UI abstractions that ended up being very unsatisfying in the end.

Best of luck!

Rails, Fixtures, the Test DB, and Test::Unit

Jamie Flournoy — Fri, 03 Aug 2007 02:37:22 +0000

From what I’ve seen, Rails’ weakest features lie in the way it prepares the test database and test data, and Ruby’s Test::Unit isn’t much better than the awful but ubuiquitous JUnit that Java developers are accustomed to. I set out this week to impose my preferences on Rails in this area, and that took some effort. Here’s what I did.

When I’ve implemented (in Java) what Rails does for database preparation, I did it like this:

Create the test database exactly the same way that the developers’ databases are created: by running the exact same code, pointed at a different database.
Load the appropriate sets of data for the test database. “Sets” is plural on purpose; most non-trivial databases include code tables, which constitute base data which are essentially part of the database design itself. Then, test code will want a fixed set of known test data to act upon, so that tests can measure whether the code did the right thing given the test data (the right inputs yield the right outputs).
Run the individual tests, providing some way of assuring that changes to the test data are undone before the next test.

At first (11 years ago) I used a hand-maintained SQL DDL file to create the databases. Later I split that up into one file per table, and made a list of the proper ordering of tables during creation (reversible for deletion). Later still, with Hibernate, I ditched the DDL and let a higher-level ORM description of the table do the schema generation (which was painful in Hibernate since it wasn’t made to do that except from the command line, but it was possible to hack it into a state of relative beauty). The test data was always loaded from a bunch of text files that were easy to hand-edit (as opposed to a bunch of SQL INSERT statements).

Running the test with assurance of pristine test data was more or less horrific in a J2EE+Hibernate 2.x environment. The design of Hibernate and JUnit made it difficult to wrap tests in transactions, and the version of MySQL that we were using had no transactional storage engines available at all (MyISAM? Thanks, Red Hat!), so I ended up falling back on an intrusive but relatively high-performance design that required tests to declare if they were going to alter the test data, so that the test teardown method knew it had to reload the test data. Since we were waiting for Hibernate 3.0, MySQL 5.x, and a few other things to become part of our architecture, I left that solution in place and ended up moving on to a new job before fixing it.

Rails initially seemed to nail this problem: the test database is automatically made based on the development database; the data is loaded from YAML files called Fixtures, which feature a very simple and straightforward API, and tests run inside individual transactions. Nice!

Except not. Fixtures are loaded by specifying the tables for which you need test data loaded, and this is done in each Test::Unit::TestCase class, of which I have several hundred. They are stupidly reloaded each time you say a given TestCase is going to use them. Worse, the tables you’re using for this TestCase are emptied out using SQL DELETE statements, but if there is test data in other tables that has foreign key dependencies on the data being deleted, fixture loading will fail. (Rails was not designed for FKs to be enabled in the database, so encountering this this bug is a side effect of enabling them via the plugin.) This deletion behavior is pointless in light of transactions wrapping each test, but if you’re using MySQL MyISAM you can’t use transactions, so it needs to be there for people using MyISAM, which is to say, crazy people who care not for their data.

Since Test::Unit, like Java’s JUnit, lacks a hook for the beginning or end of a given TestCase class’s set of tests, there’s no way to accumulate a list of fixtures created and then delete them and/or reload them at the end. That would at least allow you to undo the creation of the fixtures so that the tables were all empty before the next set of fixtures were loaded. Sadly, Test::Unit is not that clever.

I initially fixed this problem a couple of months ago, using a hack that simply refuses to delete and re-create (test data) fixtures if they’re already loaded. That works since the fixture data progressively accumulates and is always clean since changes within tests are rolled back at the end of those tests.

Upon adding a trigger to a Rails migration and then writing a test case that checked to see if it was working, I found the true ugliness. Rails has Migrations, which in my opinion are an excellent feature that works well, and is a more useful generalization of my ordered-list-o-tables and set of table-definition text files. But… when creating the test database, Rails uses the SchemaDumper‘s schema.rb output to create it, instead of using migrations. Talk about principle of least astonishment… I was pretty astonished. We have migrations, which is how we create databases! Great! So let’s use this other thing instead.

Also, SchemaDumper does not in fact dump the schema; it dumps tables and indices only. The RedHillOnRails foreign keys core plugin adds foreign key dumping to this output, but forget about check constraints, triggers, and stored procedures. Those schema objects are ignored, so your test database is not the same as your development (or production) database. Whoops.

I thought of about a dozen ways to deal with this:

Abandon triggers and do it all in Rails, make a TODO to fix this later, and get on with feature implementation
Add code to the tests to check for the missing schema objects and add them if missing (eww)
Replace the db:test:prepare Rake task with one that tells PostgreSQL to copy the database as-is
Replace the db:test:prepare Rake task with one that tells PostgreSQL to use pg_dump instead of ActiveRecord::SchemaDumper
Hack the PostgreSQL-specific code that SchemaDumper uses to look at the pg_proc and pg_trigger system catalogs and use code similar to the RedHillOnRails Core plugin to dump stored procs and triggers into schema.rb also
Just dump using pg_dump into a temp file and parse the output and add that to schema.rb (ewwwwwww)

etc. etc.

I finally found the Migrate Test DB Rake Plugin which simply uses your Rails Migrations to create the test database. Lovely. Except I now had some new problems.

rake db:schema:purge for PostgreSQL does dropdb/createdb on the test database to empty it out. That creates a database with no built in procedural langauges, so stored procs won’t work. Adding the language to that database is a DB superuser task, so it couldn’t be done inside of Rake. Fortunately I found that I could solve this via “createlang plpgsql template1” which puts plpgsql in the template database used for creating new databases. Easy.
My never-delete-fixtures code got into a fight with my base-data-loader code. They both used Fixtures to load data, and so the base data fixtures made the never-delete-fixtures code think that the test data was already in. So the tests failed due to lacking test data.

I fixed this initially by modifying my BaseDataLoader class to not load base data if RAILS_ENV is ‘test’, and added code to the Migrate Test DB Plugin to set RAILS_ENV to ‘test’ right before running the migrations on the test database. This is a workaround, really, because it still leaves the base data either missing entirely, or duplicated.

Then I switched to the Preload Fixtures plugin which is nice but still leads to FK related errors. It grab the fixture names from your test/fixtures directory and loads all the files it finds, in the order it found them. That fails since alphabetical order and the required table creation order are different in my case.

Fortunately since I’m using the Migrate Test DB Plugin I can just observe the order in which tables were created and tell the Preload Fixtures plugin to do its work in the same order. This is in my environment.rb because that’s where all my project-wide monkeypatching currently lives. (Cleaning that up and maybe plugin-izing it is a TODO for the future.)

# Due to FKs, gotta specify ordering of fixture preloading here. Why not let migration create_table statements do it?
# (depends on Migrate Test DB Plugin being present; is here for the benefit of the preload_fixtures plugin)
module ActiveRecord::ConnectionAdapters::SchemaStatements
    alias create_table_orig create_table
    def create_table(table_name, options = {}, &block)
        fixture_filename = "#{table_name}.yml"
        if File.file?(File.join([RAILS_ROOT, 'test', 'fixtures' ,fixture_filename]))
            ENV['FIXTURES'] = [ENV['FIXTURES'], fixture_filename].compact.join(',')
            # puts ENV['FIXTURES']
        end
        create_table_orig(table_name, options, &block)
    end
end

Sadly if you run “rake test” it runs ruby as a subprocess in order to do “rake test:units”, “rake test:functionals”, and “rake test:integration”. That means that the migrations are run once (before the tests), but that the preloading is done three times. The second and third times through, though, the preloading fails since it’s trying to delete-then-create each table’s fixtures in table-creation order. So, a patch to preload_fixtures.rb is needed, to ensure that deletes are done first, in the reverse order of table creation. Here’s what the new preload! method looks like:

def self.preload!
    puts "PRELOADING FIXTURES..."

    require 'active_record/fixtures'
    ActiveRecord::Base.establish_connection(:test)
    fixture_filenames = (ENV['FIXTURES'] ? ENV['FIXTURES'].split(/,/) : Dir.glob(File.join(RAILS_ROOT, 'test', 'fixtures', '*.{yml,csv}')))
    
    # delete first, in reverse order
    fixture_filenames.reverse.each do |fixture_file|
        table_name = File.basename(fixture_file, '.*') # hack; might not be correct if class name != camelized table name
        ActiveRecord::Base.connection.delete "DELETE FROM #{table_name}", 'Fixture Delete'
    end
    
    fixture_filenames.each do |fixture_file|
      Fixtures.create_fixtures(File.join(RAILS_ROOT, 'test', 'fixtures'), File.basename(fixture_file, '.*'))
    end      
    puts "DONE. Loaded #{Fixtures.all_loaded_fixtures.keys.length} fixtures."
  end

I’m not sure, but I think there’s an assumption in there that the table name is the same as the fixture name. My patch also makes that assumption, which is true in the case of my project. But in your project you might not have done that, so further hackery might be needed.

So, it all seems to work correctly now, and I’m back to working on my trigger code. If this seems like it took a lot of effort, it did, but I think it’ll be worth it once I start using stored procs and triggers more. That phase begins now.

Rails and the notion of Stupid Databases Being a Good Idea

Jamie Flournoy — Fri, 03 Aug 2007 02:16:40 +0000

For the last few days I’ve been struggling to bend Rails to my will regarding the proper way to assure data consistency. Today I made some progress. This builds upon some research I did a few months ago, and hopefully this is a more or less complete solution to the problem of making Rails work the way I want it to regarding test databases.

DHH has clearly stated that he does not like a smart database. This is common among application developers, particularly in the agile methods camp, in that they generally appear not to understand relational set theory, or if they do, they believe that it is inherently inferior to object oriented methods (which lack a theoretical basis, as Fabian Pascal will happily shout at anyone who will listen). I gather from DHH’s statements that he merely is trying to practice Don’t Repeat Yourself (a.k.a. DRY, one of the most important values of Rails). I gather from Rails itself that he either respects the need of some folks to disagree with him enough to provide hooks to bypass ActiveRecord, or that he at least agreed with someone else’s patch. By this I mean that there are ways around the ORM features of ActiveRecord, to do raw SQL and to execute raw DDL at database creation time, which implies that he isn’t trying to force his opinions on others, but rather to make it easier to do things his way than to do them a different way.

Fair enough. Rails is opinionated software, as DHH often says, and I have found several cases where letting go of my particular way of doing things has been fine, given that Rails has a different but equally valid way of doing things that is made super easy by the framework.

However, I disagree with his decision to keep the DB stupid, for two reasons.

First, I prefer to put logic where it belongs, rather than gathering it all in one place. AJAX, and in particular Google Maps, is a good example of presentation logic going where it belongs, making the whole application work better. SQL RDBMSs have features that can be abused, and in some cases these features are there because a wrong-thinking but wealthy client demanded them, but most of the advanced features that a “Real Database” has are there so that you can protect yourself against data loss or data corruption. The database is in a unique position to let you declare rules for things that must always be true, and then to trust that the database will never violate those rules. Older versions of MySQL were notably lacking in these features and their absence was justified by MySQL staff who basically said “you don’t need that, and if you want it, you’re confused.” Rails has inherited some of these damaged assumptions from MySQL, leaving basic relational features like foreign keys out of the framework(!). Fortunately Rails allows plugins, and there is a set of foreign key plugins that overturn this decision. But in general, if the database belongs to your application, that’s not an excuse to move database functionality into application code. By calling it your application’s database (as opposed to an Integration Database) you imply that it is part of your application, and therefore any rules or procedural code in it is necessarily also part of your application. You can’t monopolize the database and say that no one else has any business using it, while at the same time holding it at arm’s length and saying it’s not a valid part of the application. It is. No, business rules probably don’t belong in the database, but basic data consistency maintenance (in rule or procedural form) does.

I’m being charitable here, but my experience with individual practitioners of the Stupid Database Method invariably ends with me finding out that they don’t really understand databases at all (hence the desire to abstract the database away entirely with a driver plugin architecture topped by an ORM layer, lest they have to understand how a specific database product works), and would rather remain ignorant and reinvent the same functionality in the application layer or in the ORM layer. (It’s a case of “when all you have is a hammer, everything looks like a nail”, where the hammer is a general-purpose programming language, and you’re looking at a problem of high performance concurrent transactional programming.)

Not surprisingly, the database of choice for these folks is the least featureful, lowest cost, easiest to install one available. Because naturally it’s much more agile to write and debug new multithreaded transactional code in a high level dynamic language. than it is to get the same functionality for free in a thoroughly tested product that’s written in C. Right? Perhaps DHH is not one of these people. I assume he is not, again based on what he has said and coded. But nevertheless, the folks I’ve talked to personally who agree with his point of view are all coming from a point of view of willful ignorance.

Secondly, I prefer to employ defense in depth against data errors. Transient errors can have workarounds, but data errors are permanent, and that means that if your data is valuable, the damage done can be irreversible. Just because it’s possible for correct application code to avoid race conditions, improper escaping, etc. doesn’t mean that you should put all your eggs in that basket. When the price of data corruption is high (i.e. if you value the data in your database) then it’s worth the duplication of effort: test the application code, but also put a constraint in the database that will catch things the application code missed.

This is the same sort of thinking that leads to using automated unit tests, then functional tests, then integration tests, and then some manual QA, all overlapping. Duplication of effort? Yes. Worth it? Yes. Database bugs are arguably the worst kind of bugs to find in production, so they merit extra code that maybe isn’t absolutely necessary for the application to work, but is nice to have since you’d like to sleep at night.

So, I feel justified in wanting to put CHECK constraints and triggers in my database.

The implementation details are discussed in part 2.