Thursday, February 25, 2010

The Holy Grail Of Persistence?

One of the very first CTO-grade decision I had to take in the making of Snoget was to pick what would become our main transactional persistence engine. Since we're using Erlang exclusively for our production servers, the solution seemed easy: use Mnesia. But I settled for PostgreSQL.

At this point, anyone who's been dealing with O/R mapping (like Ted Neward who said: "Object/relational mapping is the Vietnam of Computer Science"), should cry fool: Mnesia would offer me persistence without any impedence mismatch with the application runtime environment and I preferred a SQL database to it? Actually, to someone who has used an O/R mapper before and who switched to Erlang, discovering Mnesia for the first time is a sheer heavenly moment similar to that:


The Holy Grail Of Persistence

Though Mnesia is very clearly not presented as a replacement for general purpose RDBMSes, one can not avoid to seriously consider using it, just because there is such a low cost into moving data from and to an Erlang application.

As a developer, I already had my share of joys and pains from working with non-standard persistence engines (like Tamino and X-Hive). I also learned from others who did the same, in much greater scale than me, and who shared their experience about it. So it is with great circumspection that I approached the decision of using a niche database engine instead of a mainstream one.

That being said, here are the four key decision points that made me favor PostgreSQL:

  • Schema Migration - For a startup, it's critical to be able to evolve a database schema with the less friction possible as features are often in a state of flux.

    Using a standard DB like PostgreSQL allowed us to leverage Ruby's ActiveRecord Migration, which is not only handy for migrating forward (as you do in production) but also backwards (as you sometimes have to do in development). Though Mnesia record evolution is possible, the fact that data migration concerns permeate into the application code is very unpleasant. Going schema free was a tempting option but would not have come close to the flexibility ActiveRecord and PostgreSQL gave us.

  • Supporting Resources - Being able to solve problems quickly is essential for a startup: for everything that is not your core business, you usually rely a lot on the information available out there.

    PostgreSQL has an extensive body of knowledge available online and in print. When things go haywire or in case of doubt, you're pretty much guaranteed that a Google search will bring you at least a couple of pages where people asked the exact same question and got answers for them. With Mnesia, the amount of available information is way reduced, simply because it's still very much a niche database.

  • Standard Connectivity - When you're focused on building something new, the last thing you want is wasting time in re-inventing the wheel: interoperable building blocks are key.

    Using an standard database like PostgreSQL gave us immediate access to tools like Pentaho's Data Integration, which we use to massage data. Though we could have built an army of supporting tools to perform the same on Mnesia, it's always better to use something that's already there. I has also allowed us to fully leverage Ruby On Rails to build an awesome back office in no time. Though there are some Ruby-Erlang bridges out there, none gives you all the RAD features you get when plugging Rails to a standard database.

  • Operational Simplicity - In a startup, there's no DBA to nurse your database engine: you have to deal with it so it better be simple to operate.

    Installing, upgrading, backing-up, restoring PostgreSQL databases are all well defined operations, supported by a wealth of tools. The security model is straightforward too. And there are plenty of options for monitoring what's happening under the hood and analyze and tune performances. I have no doubt all this is possible with Mnesia, but in a less familiar and straightforward manner.
Of course, there is a downside in using PostgreSQL with Erlang, and a pretty big one: there is no official driver for it so you're fully subject to the talent of the developer whose driver you'll be using. For us, it quickly turned out that the driver we started with was the Achilles' Heel of our application and we had to switch to another implementation, which turned out to be very solid. The switch was painful because there is no such thing as edbc, i.e. a standard for database connectors in Erlang. If you switch driver, you get a new API!

At this point, some pundits must be fuming and asking why SQL? What about NoSQL? Partially for the same reasons quoted above. But more importantly, we're not locked with PostgreSQL: we mainly rely on this database engine for its transactional capacities, not for its relational ones. If the need arise, the way our application is architectured would allow us to swap-in another persistence engine, provided it's transactional, one functional domain at a time and this without too much pain.

Finally, if you wonder if I picked up PostgreSQL because I was familiar with this database, the answer is that I never used it before. But nothing looks like a RDBMS than another RDBMS. Granted they don't shine like the Holy Grail, but still they'll happily power your software house.