Two design episodes in Rails
Friday, February 25th, 2011
After the fact, it’s easy to find better solutions. I present two examples from my experience with an ecommerce Rails application.
How to avoid joins, part I
We needed to make both product categories and discounts appear or disappear at specific times. The canonical solution seemed, at the time, to declare
    class Schedule < ActiveRecord::Base
      belongs_to :schedulable, :polymorphic => true

      def active?
        Time.now.between?(self.start, self.end)
      end
    end

    class Discount < ActiveRecord::Base
      has_one :schedule, :as => :schedulable
      # ...
    end

    class Category < ActiveRecord::Base
      has_one :schedule, :as => :schedulable
      # ...
    end
The schedules table contains the two columns “start” and “end”. The logic for a schedule is pretty simple: it’s “active?” if “now” is between “start” and “end”.
A lot of thought and discussion went into deciding when to save a Schedule object in the schedules table; we wanted to write
    class Category < ...
      def active?
        self.schedule.active?
      end
    end
but this presumes that there indeed is a schedule saved in the database for the category, so it had to be
    class Category < ...
      def active?
        self.schedule && self.schedule.active?
      end
    end
and similar complications arose when adding or changing the schedule. Selecting all the active categories got more complicated, for we needed an extra join, and a left outer join at that. That impacted all queries on categories or discounts, in particular the free search query, which really didn't need to get more complicated. We decided to add an after_create callback so that every new discount or category would always have an associated schedule. The decision to go with the "polymorphic has-one association" led to complication and increased coupling.
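The nil-guard problem described above can be reproduced outside Rails. What follows is only a minimal sketch: plain Ruby Structs stand in for the ActiveRecord models, the category names are made-up sample data, the `now` parameter and the method name `active_unguarded?` are additions for illustration, and the guarded version is coerced to a strict boolean.

```ruby
# Plain-Ruby stand-ins for the ActiveRecord models (an assumption of this sketch).
Schedule = Struct.new(:start, :end) do
  # Active when `now` lies between start and end, as in the article.
  def active?(now = Time.now)
    now.between?(self.start, self.end)
  end
end

Category = Struct.new(:name, :schedule) do
  # Unguarded version: raises NoMethodError when there is no schedule.
  def active_unguarded?(now = Time.now)
    schedule.active?(now)
  end

  # Guarded version: a category without a schedule counts as inactive.
  def active?(now = Time.now)
    !schedule.nil? && schedule.active?(now)
  end
end

now = Time.now
scheduled   = Category.new("Shoes", Schedule.new(now - 60, now + 60))
unscheduled = Category.new("Hats", nil)

puts scheduled.active?(now)
puts unscheduled.active?(now)

begin
  unscheduled.active_unguarded?(now)
rescue NoMethodError => e
  puts "unguarded call failed: #{e.class}"
end
```

Every caller that touches the schedule has to remember the same guard, which is the kind of coupling the paragraph above complains about.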
So what is a simpler, caveman's solution? Well, why not add "validity_start" and "validity_end" columns to both categories and discounts tables? Our Ruby code would become:
    module Schedulable
      def active?
        Time.now.between?(self.validity_start, self.validity_end)
      end
    end

    class Discount < ActiveRecord::Base
      include Schedulable
      # ...
    end

    class Category < ActiveRecord::Base
      include Schedulable
      # ...
    end
So all it takes to make a model "schedulable" is to include the Schedulable module, and add two more columns to the model table.
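To try the module without a database, here is a small sketch in plain Ruby. It assumes Structs in place of the ActiveRecord models, passes `now` explicitly so the result does not depend on the clock, and uses invented sample records ("Summer sale", "OLD10"); none of that comes from the original application.

```ruby
module Schedulable
  # True when `now` falls between validity_start and validity_end.
  # The `now` parameter is an addition for this sketch.
  def active?(now = Time.now)
    now.between?(validity_start, validity_end)
  end
end

# Structs stand in for the two tables with their extra columns.
Category = Struct.new(:name, :validity_start, :validity_end) do
  include Schedulable
end

Discount = Struct.new(:code, :validity_start, :validity_end) do
  include Schedulable
end

now = Time.now
current = Category.new("Summer sale", now - 3600, now + 3600)
expired = Discount.new("OLD10", now - 7200, now - 3600)

# Selecting the active items needs no lookup in a second table.
active_items = [current, expired].select { |item| item.active?(now) }
puts active_items.map { |item| item.class.name }.inspect
```

Because the validity columns live on the record itself, there is no missing-schedule case to guard against.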
Analysis of the "caveman" solution:
- Duplication? In the definition of the two tables, perhaps a bit. But no duplication in Ruby code, and far less code to write.
- Denormalized? Actually no. A "schedule" is not something relevant to the business; it doesn't need an identity. It is not a business entity so it's proper that we implement it as a collection of attributes rather than with its own table.
- Good Rails design? I think it is. The decision to make Schedule a model was bad design, as a Schedule is not a business entity.
So the "caveman" solution actually is what a good data modeler would have chosen from the start. We were so fascinated by how easy it was to use the "polymorphic association" to remove the duplication of the two extra columns that we ended up complicating our life for no good reason.
Lesson learned: always consider what a caveman would do. He might be smarter than you!
Lesson learned: Rails is an effective way to put a web GUI in front of a database. Think like a data modeler. Think Entity-Relationship.
How to avoid joins, part II
Later in the same project, we needed to make the website respond in English or Italian. Rails 2 is well equipped for localizing the GUI out-of-the-box, but it does not deal with the problem of translating the properties of your model. In our case, we needed to translate the names and descriptions of products.
The canonical Rails solution at the time was to use the Globalize2 gem. It's actually a good gem, but in retrospect it was not a good fit for our problem.
Fact: we didn't need to support 1000 languages. Just 2. Maybe 3 or 4 in the next 5 years.
Fact: we didn't need to translate 100 attributes in 100 models. Just 3 properties in 2 models.
The Globalize2 gem adds a join to a "globalize_translations" table to every query. That didn't do any good to the free search query, which was awfully complicated already! So while in theory Globalize2 is transparent, in practice you have to modify many queries to take it into account.
Once again, what would our friend the caveman do? You guessed it: add extra columns instead of a join. Replace columns "name" and "description" with "name_en", "name_it", "description_en", "description_it". You add a bit of drudgery to the schema definition, but your queries become simpler. Since schema migrations in Rails are very easy, adding a new language by adding the few extra columns would not be a big problem.
And all the Ruby code it would take to produce the globalized descriptions in the web pages would be to define
    class Product < ...
      def globalized_name
        if self.attributes.include? "name_#{current_locale}"
          self.attributes["name_#{current_locale}"]
        else
          # fallback to Italian
          self.name_it
        end
      end
    end
My solution is not transparent, but it's simple and easy to understand; while Globalize2 tries to be transparent but does not quite succeed.
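The lookup-with-fallback idea can also be tried in plain Ruby. This is a sketch under assumptions, not the application's code: a Hash plays the role of ActiveRecord's `attributes`, the locale is passed in explicitly instead of coming from `current_locale`, the method name `localized` is hypothetical, and the product data is invented.

```ruby
class Product
  FALLBACK_LOCALE = "it" # Italian fallback, as in the article

  def initialize(attributes)
    @attributes = attributes
  end

  # Returns the value of the "<field>_<locale>" column, or the Italian
  # column when no column exists for the requested locale.
  def localized(field, locale)
    @attributes.fetch("#{field}_#{locale}") do
      @attributes["#{field}_#{FALLBACK_LOCALE}"]
    end
  end
end

product = Product.new(
  "name_en"        => "Olive oil",
  "name_it"        => "Olio d'oliva",
  "description_it" => "Extra vergine"
)

puts product.localized("name", "en")        # English column exists
puts product.localized("name", "fr")        # no French column, falls back
puts product.localized("description", "en") # no English column, falls back
```

Generalizing over the field name keeps one method for both "name" and "description" while staying explicit about where the text comes from.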
Lesson learned: less magic please. If you can't achieve complete transparency, then go for being explicit *and* simple.