ActiveRecord: the Visual Basic of Object Relational Mappers

I’ve been working with Ruby on Rails intensively for several months, and I’ve finally found a place where Rails can’t readily be extended to do what I want. It’s ActiveRecord, which is probably the most controversial part of Rails.

I’m reminded of a James Gosling quote disparaging Microsoft tools, particularly Visual Basic: “The easy stuff is easy, but the hard stuff is impossible.” There’s a parallel between VB and Rails in this instance, in that if you only let yourself use the high level tools, the hard stuff is impossible, but the designers specifically tell you to do the hard stuff using a lower level toolset. The controversy that surrounds “X can’t do everything, therefore it sucks” should really be focusing on the feasibility of going through that trapdoor to do things “the hard way”. This is what Delphi did, which is why so many folks chose it over VB; it made the hard stuff easier.

Here’s the task I need to accomplish, for which ActiveRecord is not well suited: complex queries involving SQL functions and multiple-table joins. I want to join a few tables together, order by a SQL function, include with each result row the result of a SQL function that operates on each row, and have all that come back as a graph of high-level objects.

Despite my attempts to use plugins, extend and/or fix bugs in those plugins, and to dig through the ActiveRecord source to figure out what the documentation won’t tell me, I was unable to get it to work. Most of the parts of what I wanted was possible: acts_as_tsearch cleverly weaves SQL functions into a high-level ActiveRecord::Base.find calls; paginating_find provides a very convenient pagination API on top of ActiveRecord::Base.find, and ActiveRecord includes some clever association tricks such as automatic many-to-many relationships (has_and_belongs_to_many), eager loading of associated records using a join (via the :include option to ActiveRecord::Base.find), and a fairly low-level :joins option that lets you add tables to a ‘find’ query which can be used in your :conditions. Problem is, they don’t all work together in a fancy way.

Really, the issue in this case is related to the design choices that went into ActiveRecord.

Some ORMs (object-relational mappers) are designed in a modular fashion: there is a part that helps you describe the relationships between your model objects, a part that helps you construct queries, and a part that does the storage and retrieval. Sometimes there’s another part that uses your description of object relationships to create an empty database with the appropriate data model, or that looks at an existing database and creates an object model that matches it. Sometimes there’s an import/export tool for bulk data loading or dumping as well.

ActiveRecord has the first three functions integrated (which has benefits and drawbacks compared to a more modular approach), has a very isolated schema manipulation module, and has a somewhat isolated data loader tool.

The relationships are explicitly declared in source code using associations: has_one, has_many, belongs_to, and has_and_belongs_to_many. These are pretty fancy and provide some convenience features that make the associations appear as object collections, such that changing the collection and saving it turns into insert/delete/update activity in the database.

Query construction is basically tied to the objects themselves, in a way that greatly simplifies star-join queries, but which handles only the simplest joins across multiple tables, and is barely able to handle self-referential joins at all. So, you can easily load an object (or group of similar objects) and associated objects, but OLAP-style queries (“what are the top 5 states where customers are located who have bought classical CDs within 2 weeks of their release using American Express and had them shipped as gifts via UPS 3-day Select?”) are impossible. Oddly, views, functions, and stored procedures could bridge the gap between real-world data models and ActiveRecord’s limited set of association types, but they are not supported either.

The storage and retrieval code is inseparable from the query code, and so it is not possible to examine and modify the final SQL before it is executed, nor is it possible to provide an arbitrary query and have the results be parsed into an object graph based on the associations you have defined. The code that would allow these features appears to exist and be sufficiently well designed to allow this with a fairly small amount of changes to ActiveRecord. However, it is currently (as of Rails 1.2.3, which is the current release) not part of the documented API and is declared private.

There is a limited facility for constructing simple objects from arbitrary SQL, in find_by_sql. This loses essentially all of the high level functionality of the find method; most notably, it isn’t possible to use find_by_sql results to instantiate an object graph, rather than a flat array of objects (similar to the eager loading feature in the regular find method).

ActiveRecord has fairly good high-level schema creation functionality (“migrations”). Though it lacks concepts for all but the basic database objects, support can be added for foreign key constraints (I kid you not, they aren’t supported by Rails itself!) and views. There’s also a simple way to execute arbitrary SQL. Migrations aren’t technically that amazing, but rather they’re a helpful organizational approach to what can be a really hairy problem: defining a schema and then applying changes to live databases while keeping track of what changes you’ve already applied.

Finally, there is a test data loading facility called Fixtures. The common opinion of Fixtures seems to be that they are broken by design and should be avoided. The main issue I’ve found with them is that the implementation ignores the kind of database design elements that any book on SQL would recommend, such as foreign keys and check constraints. I managed to circumvent this with a combination of a plugin and some customization, described in detail in my previous post, Rails, Fixtures, the Test DB, and Test::Unit. With those changes, all test fixture data is preloaded in the right order (so constraints aren’t violated) before any tests run, and any data alterations within tests are rolled back automatically by Rails.

A secondary issue with Fixtures is that they go directly from YAML text files to SQL INSERT statements, bypassing the ActiveRecord Model classes. ActiveRecord does pretty much rule out any fancy mapping between database tables and objects, so that’s not a problem, but this model-skipping fixture loading implementation means that any code in your model object (validations, before_save filters, etc.) will not be executed when loading fixtures. So fixtures do not work well with the otherwise pervasive Rails design rule of “put all the intelligence in the application”.

Still, despite the commonly-held disdain for using fixtures at all, I find that they can be tamed. In fact I’ve even created a base data facility for loading the fundamental data set that needs to be in the live database (e.g. initial admin user info). My approach is basically to alter fixture behavior to treat it as essentially a bulk data loading tool, and to do the extra housekeeping after loading to make up for the fact that the ActiveRecord model code was bypassed.

As far as I know, there is no bulk data dumping functionality in Rails.

So, to summarize, of the five main ORM features, here’s how ActiveRecord stacks up:

  1. Describing Relationships: Easy to understand and use, with lots of slick functionality
  2. Querying: Easy to understand and use, but limited to simple join structures, and not possible to customize query building or rewrite SQL before execution
  3. Storage and Retrieval: Very easy to use, but only within the limits of the query builder’s features
  4. Schema manipulation: Easy to understand and use; limited in functionality but readily extensible; solid third party plugins are available for missing schema objects
  5. Bulk Loading and Dumping: Loading is badly designed and implemented, but fixable with some effort; dumping is not offered

Okay, so it definitely makes the easy stuff easy. But what about the rest?

As I observed before, ActiveRecord is not designed as a set of modules that you use to assemble a solution that fits your needs. That’s more of the Java approach to design, and it trades flexibility for convenience. It can be a major pain to assemble a working system out of all of those abstract Java APIs, which are sometimes so comically over-patternized as to draw mockery such as the hilarious “Are Javalanders Happy?” code snippet from Execution in the Kingdom of Nouns. Rails makes the opposite trade-off: sacrifice flexibility and gain a very approachable API.

Unfortunately, the Java approach (too abstract to readily use, but extremely flexible) is easily wrapped with a simpler, more convenient, less customizable API. The Rails approach isn’t internally componentized (have a look at ActiveRecord’s activerecord/base.rb source file in its 2,165-line glory, almost all of which is one class), so if you want to fiddle with its internal behavior, you can’t. So with Rails, it’s all or nothing: high level slickness for simple requirements, or hand-written SQL and hand-coded results mapping for your complex requirements.

As I said at the beginning, though, the key question is not how comprehensive the high level feature set is. More important is the question of how painful things are when you drop down to a lower level for a greater degree of control.

It would be nice if there were a middle level of complexity, between the high-level ‘find’ method and ‘has_xxx’ associations, and raw SQL. There isn’t. I think that the reason there isn’t one is that there is still a persistent belief among many Rails core team members and community members that databases should be stupid: just a persistent hash. Once upon a time I worked that way myself: I didn’t have access to or skill with a SQL RDBMS, and so I solved all of my persistence problems with DBM files, which (using Perl’s Tie::Hash class) are conceptually just persistent hashtables. miniSQL was little more than a SQL query parser on top of that sort of storage engine, and MySQL originally was pretty similar. But big databases have all sorts of useful features that address complicated persistence requirements in a fairly elegant way.

Given that Ruby fans like the idea of domain specific languages, which let you work in a super high level language customized to the problem domain, it’s surprising that Rails groupthink is that SQL is bad. It’s actually a very high level language, and allows a well written database to do some pretty amazing optimization on the fly because it provides a strong layer of abstraction between what you requested and how the storage engine provides it.

No, it’s not dynamic, nor is it pure relational perfection, but it’s pretty darn good. Pre- and post-event validations and arbitrary callbacks to user-specified code, functions providing behavior on top of data… these are all things that Ruby and Rails fans hold in high regard when provided by Ruby and Rails, but which are considered a bad idea at the database layer. As I discussed at length in Rails and the notion of Stupid Databases Being a Good Idea, this is a philosophy rooted in DRY, but it has some major flaws.

Mainly, there is the issue that some things must be done in the data tier, and trying to put them in the application tier doesn’t work. The best example that comes to mind is full text search. Satisfying queries is the database’s job, period. It’s just hideously slow to try and do an inner join in the application across a network link to a database. If you find yourself doing this, that’s a pretty good sign that your architecture is broken. But some queries are too complicated for ActiveRecord, so sometimes you must choose between a series of high level queries whose results are intersected in application code (easy to understand, but extremely inefficient), or hand coded SQL.

Well, SQL is fast and is a high level domain-specific language, so it isn’t actually a bad tool for the job. The problem is that this approach (the trapdoor to the lower level API) is regarded differently by different people. Some see it as a common and reasonable approach to complex requirements; others see it as a bad evil scary thing that should be avoided at all costs, a kludge and a design mistake.

As a result, the low level option in Rails is anemic. It’s there, but you’re not supposed to use it. Ruby’s ActiveRecord Makes Dropping to Raw SQL a Royal Pain (Probably on Purpose) notes that there are no bind variables allowed in ActiveRecord. You may be saying, “No, wait a minute, I’ve used them, that can’t be right.” That’s what I thought. Look at the source; the bind variable functionality is actually a high level feature built on top of drivers that don’t have that feature. Whatever you did at the high level, it’s going to the driver as a single string. Okay, it’s nice that they added that feature, especially since it provides a single point of testing and verification for safe escaping. But that functionality (in sanitize_sql) is not part of the public API. Fortunately that same article provides a workaround that makes sanitize_sql accessible, so you can use bind variables in your hand coded SQL code, and pretend that the driver supports them. But that’s not likely to work forever.

The key problem with ActiveRecord is its least common denominator feature set, based around the least featureful of all popular SQL databases: MySQL. Years ago, MySQL AB (the vendor of the MySQL database) took a strong philosophical stand against pretty much any advanced database features (which their product lacked, and which competing products had), but lately they’ve softened and added those features that they claimed nobody really needed. In the meantime, Rails has been designed with minimal expectations for database sophistication; therefore, the limited functionality of ActiveRecord is fairly complete, assuming you’re using a database with similarly limited functionality.

Triggers, stored procedures, functions, data integrity constraints, nested transactions, and views are all examples of unsupported database functionality. Try and use them via ActiveRecord’s high level API, and you will quickly see how fragile and inflexible ActiveRecord really is. If you shouldn’t need those features in your database, then you shouldn’t need anything that ActiveRecord doesn’t already provide, so it shouldn’t matter that you can’t extend ActiveRecord.

Truly, these are features that you need only in a few small cases in your application, so looking at individual queries they’re needed rarely (which is not the same thing as “never”). But looking at whether you need one or more of them in a given application, they’re needed more often than not. The pain of using hand coded SQL makes this worse: some tricky things could be done either using a view or stored procedure, or using a really slick dynamic SQL statement. Making all of those options painful means that even a clever developer can’t use anything in their bag of tricks to craft an elegant solution.

Unfortunately, non-trivial web applications need things like full text search, complex associations between persistent objects, non-trival summary information about associated objects, and complex reports, and ActiveRecord fails at all of these. These are not just things that big dumb ancient companies that like using Object COBOL think they need; Amazon and eBay need them too.

The acts_as_tsearch plugin is a good case study of ActiveRecord’s design flaws. TSearch2 is the standard PostgreSQL full text search engine, and it’s pretty good in my opinion. It’s also pretty straightforward to use. Unfortunately for developers using Rails, TSearch2 uses SQL functions (mainly to_tsquery and rank_cd). The acts_as_tsearch plugin tries to inject SQL into ActiveRecord’s queries via the high-level find interface, but ultimately fails as soon as you use the :joins or :include options. The problem is that ActiveRecord has a very simplistic idea of how queries and joins work, and so if you need to inject SQL functions to get the job done (as is necessary in TSearch2 queries), too bad. (See also issues 7 and 8 in acts_as_tsearch, in which I describe and attempt to clean up the mess that results when you use find_by_tsearch in non-trivial ways.)

A fellow Rails developer asked me in all seriousness why I wasn’t abandoning the full text search functionality of TSearch2 and just using a completely separate, redundant database product designed exclusively for full text search. Seriously, that is considered the “easy” approach: one database for full text search, and another for ACID/OLTP/CRUD. Honestly if I were going to go down that road I would try hard to just abandon the SQL RDMBS and put everything in the other database, since Lucene and its imitators are capable of far more than just find-text-in-document queries. The pain of duplicating everything, using two query languages, two document representations (in addition to the object representation in Ruby) and writing application-tier query correlation makes the double-DB approach seem very unwise.

It makes far more sense to me to use the SQL RDMBS’s full text search facility, even if there’s a 2x or 3x read performance penalty, because the conceptual simplicity of having one powerful storage tier (instead of two halves cobbled together) eliminates a ton of ugliness in the application, and the SQL RDBMS is going to get clustered for reads anyway. Nevertheless, even if I’m wrong about this case (putting search in the SQL RDBMS instead of in a separate server), there are other cases for needing a smart database that gives you exactly the results you need and lets you push data logic into the data tier.

So, what do I suggest? Abandon Rails? Nope. I still like Ruby a lot, and find Rails very useful. I just think that ActiveRecord needs to support the low-level and middle-level abstractions better.

Specifically, supporting bind variables (either by exposing that sanitize_sql function, or better yet by making drivers and connection adapters support bind variables for real) would make the find_by_sql, select_all, and exec approaches to low-level SQL query execution less painful.

More difficult, and substantially more valuable, would be refactoring ActiveRecord::Base to split it up in the way I described above: association descriptions and unmarshalling code separate from query building code separate from SQL execution and result retrieval code. All of this could remain hidden for most users under the same old slick high-level API, but for advanced requirements, the ability to fiddle with the SQL and still use the built in high-level unmarshalling code to create object graphs from flat result sets would be very powerful, and useful.

I looked at one alternative to ActiveRecord, called Sequel, which overlaps with ActiveRecord only partially. It is a query builder and lazy result proxy, which is actually what I thought ActiveRecord would do when I first started working with Rails. The proxy design means that you can either keep adding constraints or start fetching results, from the same Dataset class. This seems like a pretty good approach, though I haven’t really looked closely to make sure it would fit what ActiveRecord needs.

What Sequel lacks, though, is the unmarshalling side: turning a 2-dimensional (rows of columns) result set into a complex object graph (customers with orders with order lines with products from suppliers stored in warehouses), with user-controlled eager or lazy loading behavior. Ruby is well-suited to a design that would allow user-specified code (i.e., a block) to decompose each row into the object graph associated with that row, leaving the remaining associations on those objects to be lazily provided via future queries.

So, I think there is hope for ActiveRecord, definitely. I considered the idea of rolling a minimal Hibernate clone, or some other sort of challenger to ActiveRecord, but I don’t that ActiveRecord is broken beyond repair. I think the shortest path to a badass Ruby ORM is through improvements (refactoring and abstraction) to ActiveRecord.

So, if you’ve read this far, you probably care about these issues. Here’s my call to action: Please help me make ActiveRecord less like VB and more like Delphi. Who else is interested in helping me with this effort? Are there alternatives that I’ve missed, or components that could be integrated into ActiveRecord to make it better?

29 thoughts on “ActiveRecord: the Visual Basic of Object Relational Mappers”

  1. >You say that you cannot do complex queries. find_by_sql is your friend.

    I mentioned find_by_sql, and why it isn’t sufficient, in paragraph 11.

    It doesn’t do eager loading; that costs you one or more additional queries per row. In computational complexity-speak, that’s O(n) or worse, instead of O(1). Ouch.

  2. eBay actually don’t even do joins in the database. It sounds crazy, but they think the performance is better. I personally would not use the RDBMS’s built in search feature. The DB is likely to be a bottleneck in any significant application, and search would make that much much worse.

  3. Rails groupthink is not that SQL is bad. This is patently false. Dropping down to find_by_sql for hard queries is not bad or dirty, its great, exactly because SQL is great.

  4. Here’s a suggestion to solve your sample problem

    ”what are the top 5 states where customers are located who have bought classical CDs within 2 weeks of their release using American Express and had them shipped as gifts via UPS 3-day Select?”

    Express these conditions as plain SQL and use ActiveRecord to find only the ids (primary keys) of the states. Then, in a second query, use the ids in a condition to only return the matching records and also include the necessary associations for eager loading.

    Alternatively, define a view in the database

    http://activewarehouse.rubyforge.org/rails_sql_views/

    , define a model class, OlapState, or something, that maps to it and query through this special model class, including earger loading. You may even define an abstract base class:

    AbstractState < ActiveRecord::Base self.abstract_class = true … end for State and OlapState. Big caveat: I haven’t tried this latter approach! Anyway, I agree with you that ActiveRecord would benefit from better modularization. Nonetheless, it is possible to work around most problems, often even elegantly possible.

  5. Woah, I’m totally with you, man. Nice article!

    I tried RBatis, and I quite like it. It simply lets you define queries (in sql), and result maps (in ruby). I am not sure about complex object graphs – I did not try them. But for simple joins I tried it and it is possible with eager and lazy loading. With only little effort RBatis can be integrated into rails and can even be mixed with ActiveRecord models.

    There is also a new ORM which is called DataMapper. I only looked at some of their rspec’s and did not try it, but they have an API which partially is compatible with sequel and AR. They also claim to be thread safe and much faster then ActiveRecord.

    I think many problems that you described come from the fact that ActiveRecord was from the ground up designed following the active record pattern, unlike some other ORM’s which may first implement a general level of ORM – where you can execute arbitrary sql, map it together, typecast stuff, combine queries. On top of that you can always implement the active record pattern, but the ActiveRecord library skipped that level.

    If you are serious about writing your own ORM library, then you should also take a look at some other libraries like SQLAlchemy (Python) and Hibernate (Java). There also seems to be a nice one for Smalltalk which may be interesting. And you should read some stuff on vietnam, of course ;-)

  6. >Then, in a second query, use the ids in a condition to only
    >return the matching records and also include the necessary
    >associations for eager loading.

    This is a pretty good idea, with O(2) instead of O(n) complexity. The problem is that ActiveRecord’s associations are pretty simplistic, and so there are many cases where “include the necessary associations for eager loading” is not possible. The API docs paint a pretty picture, but if you actually try and do nested :includes across a couple of HABTMs, you get an exception.

    So, we’re back to pretty much the same trade-offs as find_by_sql: maximum SQL expressiveness, but lacking a hook to unmarshal the results using your own code (or better yet, the AR default code plus a teeny helper you added to handle something AR can’t on its own).

    A view solves the problem in a different way but brings the same problems as find_by_sql: it’s easy to put in arbitrary SQL and get a result set that contains all the data you need; the problem is getting AR to unmarshal that, so you avoid dozen or hundreds of additional round-trips to the database to populate the associations of each resulting model object.

  7. Very well-written piece. I’ve experienced many of these limitations myself, and have been aggressively pushing the boundaries of ActiveRecord. The idea of constructing graphs of objects from arbitrary queries is being researched and will solve an awful lot of problems, but it really requires the right API to work well. The lack of composite keys is another issue that’s really hard to work around. I’ve had good luck with foreign keys, but haven’t gotten into a lot of views, triggers or stored procedures, so i can’t speak from experience about the specific difficulties there.

    I think at the root of the problem is not a philosophy about SQL, but rather the fact that ORM is a fundamentally difficult problem. I disagree with your assertion that the Java approach could be wrapped up with a simple interface more easily than ActiveRecord could be componentized. On its surface that seems right, but the reality is that ActiveRecord is pretty simple to go into and refactor. It’s one cohesive unit with good test coverage, so overall it’s much easier to change because it simply doesn’t do as much.

    The goal in Rails development is to make the easy things easy with a minimal code base while allowing for the difficult things. To that end, I think the core team would accept patches and refactorings that make the system more interoperable with advanced database features, even if they wouldn’t accept a whole new module for dealing with (eg) stored procedures.

    Overall I’d rather have a simple ORM that exposes SQL directly than an ORM that attempts to solve every problem and ends up being much more complex than SQL to begin with. To that end, I think you are on the right track with your refactoring and abstracting idea. Check out Zach Dennis’ work as he has a lot of great ideas about stuff that needs to be moved out to the adapter level. I also have interest in working on this, but not necessarily much time, but I’ll definitely review code and contribute in ideas as I can.

  8. There is another big problem, which is the term “DSL”.

    I think we need to use it differently, it is too often used for many different aspects and situations.

    I once even read someone who wrote that BLAST is a “Domain specific language”. BLAST is nothing than 20 different chars, which will appear as a string. NOTHING ELSE!

    If people start to say this is a DSL, then a XML that is suited for a specific “domain” is also a proper DSL. And you see where this all leads to.

    I propose that we instead focus one ONE EXTREMELY important aspect of a GOOD DSL.

    It is one that presents to other humans, a certain logic aspect. For example:

    order 4 pizza

    That would be a nice DSL, wouldn’t it?
    Its small… limited… but also doesnt cary the overhead of ANY language which
    puts you to limits.

    DSLs like this are 10000x easier to debug in this regard, because you, as a human,
    ALWAYS know that the information stored within that is correct.

    You cant use XML because XML isnt really for humans. You also cant use YAML because YAML enforces you to use ways in a certain way (and if you dont, you get a parsing error, something which i think a proper DSL _for_ humans should never give you. )

    Now, some DSLs may be a lot more complex… Ruby’s syntax encourages you to do a full DSL with only a few minor issues (since ruby parser wants valid ruby code), but even then such a DSL-like construct may be of benefit.

    My favourite recent example for this is _why’s Shoes application. I hope people do use it more. It separates a lot of problems which other solutions (like pure C gtk) have.

    But back to my main point:

    -I hope people use the term DSL with proper care and thought!

  9. >The DB is likely to be a bottleneck in any significant application,
    >and search would make that much much worse.

    DB read activity is easy to scale up. The bottleneck is usually hard disk writes, and I can see how an ACID transaction with an insert/update and a resulting text index update could become a problem eventually.

    However, using an external search server now (and doing the inner join in the app) is a bunch of extra coding/testing work, sysadmin work, and performance drag now, in order to prevent a potential future performance bottleneck much later.

    I’d rather keep it clean and simple and efficient now, and worry about what happens when I outgrow a cluster of badass DB servers when and if that ever happens. That sure would be a nice problem to have.

  10. Great article, thank your very much for it. Found it through its discussion on Reddit.

    I’ve been wanting to write it for a while, because I keep bumping against ActiveRecord’s limitations myself, but I knew it would be long. You did a great job of it.

    About self-referential joins with eager loading and conditions, such as these:

    @attendee.meetings.find(:all,
    :include => :attendees,
    :conditions => [‘attendees.id in (:friends)’, {:friends => @attendee.friends.collect{|f|f.id}]
    )

    My ugly workaround is something along the lines of

    @meetings = @attendee.meetings
    @friends_meetings = Meeting.find(:all,
    :include => :attendees
    :conditions => [‘attendees.id in (:friends) and meetings.id in (:meetings)’, {:friends => @attendee.friends.collect{|f|f.id}, :meetings => @meetings}]
    )
    @meetings.each {|m| m.friend_attendees = @friends_meetings.detect{|fm| fm.id == m.id}.attendees}

    Sucks, but works.

  11. Very nice write up. A lot of what you mention here could be abstracted up to just about any ORM. In the Java world Hibernate has a lot of limitations much like ActiveRecord. Not the same limitations, but limitations none-the-less.

    What has surprised me is that you haven’t gotten the “it’s open source so go fix it yourself” answer that seems common in the RoR community.

    Regardless, there is something to be said for being able to use SQL to its fullest.

  12. For an alternative design, you might want to look at SQLAlchemy – http://sqlalchemy.org/ – which is a Python library. It’s very modular, very adaptive. I love it. It’s basically the only way I’ll interact with an RDBMS in Python ever again, even if all I used it for was connection pooling.

    SQLAlchemy’s design offers many layers. There’s the connection management and pooling. There’s schema definition and mapping (basically table, column, and index definitions) in Python, which can be reflective or explicit. There’s querying, which can be done with raw SQL or with Python objects, operators, and expressions. And then there’s the object-relational-mapping layer. You can map classes to simple tables, or to arbitrary selects – including ones with SQL functions (select count(orders) as order_count). You can map across multiple tables. There’s support for many kinds of inheritance, including polymorphic single table, polymorphic multi-table, etc. And when doing queries with the mapped objects, you can still do quite wild queries and enable / disable eager loading and other mapped features, and still get the object graph back.

    It’s a far more complex system than ActiveRecord, and it offers no “DSL Cuteness” by default (there’s no “belongs_to_and_has_many”), but there are extensions that offer that kind of functionality. But as a complex system, it stays within its native language (Python) – there’s no XML sit-ups or anything like that.

    I think it’s a terrifically designed system and it’s the first RDBMS toolkit that’s given me everything I want, from raw SQL and result sets to complex object mappings. I think it would be a good to look at for ideas.

    SQLAlchemy ORM Tutorial: http://www.sqlalchemy.org/docs/04/ormtutorial.html

  13. Thanks for comment #6 above. I hear often from Railists blogs that SQL is bad. Some Rails user (I use it myself) think its not … I think the camp is sorta split in this respect. But, when keynote speakers at RailsConf essentially say “I dont like SQL” of course the public at large will think they speak for the majority of Rails developers.

    just my 2 cents.

  14. >Maybe you should also try & test Ambition
    I saw that a while back. It doesn’t strike me as useful.

    From that page:
    >Ambition has a simple goal: making you stop writing SQL in your queries and only stick to Ruby.

    At last, an even less expressive abstraction. Rapture!

    >For now, Ambition is still just wrapping ActiveRecord::Base#find

    No help with the hard stuff, just more syntactic sugar (and icing and sprinkles) on the easy stuff.

  15. DHH does not think any such thing. He explicitly says in the Rails-book that SQL is great for what it’s for (which is why dropping down to find_by_sql should not be frowned upon), SQL is just a bit too verbose in the simplest cases.

  16. This is what DHH wrote:

    But Isn’t SQL Dirty?

    Ever since developers first wrapped relational databases with an object-oriented layer, they’ve debated the question of how deep to run the abstraction. Some object-relational mappers seek to eliminate the use of SQL entirely, hoping for some object-oriented purity by forcing all queries through an OO layer.

    ActiveRecord does not. It was built on the notion that SQL is neither dirty nor bod, just verbose in the trivial cases. The focus on removing the need to deal with the verbosity in those trivial cases (writing a 10-attribute insert by hand will leave any programmer tired) but keeping the expressiveness around for the hard queries — the type SQL was created to deal with elegantly.

    Therefore, you shouldn’t feel guilty when you use find_by_sql to handle either performance bottlenecks or hard queries. Start out using the object-oriented interface for productivity and pleasure, and then dip beneath the surface for a close-to-the-metal experience when you need to do so.

  17. >This is what DHH wrote:
    I have that book (Agile Web Development with Rails) and I’ve read that section too. I agree with it.

    Maybe it’s a vocal minority who are anti-RDMBS and/or anti-SQL – it’s hard to tell. I can change the wording to something other than “groupthink” if it’ll help.

    The issue that matters to me is the API and how it could be better. If there are people who would reject patches to make using SQL with Rails easier due to an ideological disdain of SQL, that needs to be addressed in order to make the API better. if not, well, joy and celebration.

  18. Jamie: There are of course lots of different opinions about AR, especially now that Rails has become so popular and there are countless of groupings and opinions on the matter.

    Perhaps it is no longer even possible to speak of any “groupthink” pertaining to the Rails-community as a whole, on any topic whatsoever. That is, unless one refers to the _core_ of the Rails community, in which case there definately are no “SQL is dirty” sentiments.

  19. Ouch, I’ve been struggling with unit tests that load and use fixtures – and also rely on callbacks – for the past couple of days. I just read this part of your post:

    “…A secondary issue with Fixtures is that they go directly from YAML text files to SQL INSERT statements, bypassing the ActiveRecord Model classes….this model-skipping fixture loading implementation means that any code in your model object (validations, before_save filters, etc.) will not be executed when loading fixtures”

    In my case, there are no foreign key dependencies, but I’m using a legacy database that doesn’t include autoincrement or support sequences, so I need to generate primary key id’s in the before_create callback. Since fixtures bypass the model, can’t use them in the tests.

    Very informative post. Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *