Friday, March 20, 2009

Application Pwnership

Who owns this application? What can possibly be complicated about such a simple and innocent question?

Unfortunately, the answer to such a question is not that easy. Or at least, we have created software organizations that make it hard to answer.

Though it makes sense to have a division of labor between teams specialized in certain aspects of software applications, the main problem lies in the partial system views that such a division creates.

Out of developers' hands, an application is oftentimes perceived by QA, DBAs or Operations teams as a giant black box:
(yes, it is a canonical black monolith of 1-4-9 proportions)

In fact, this giant black box has a few keyholes in it and, through these keyholes, the other teams can barely peer into it. Consequently, the black box effect leads to QA teams seeing applications as sets of buttons to push, DBAs seeing them as data and table spaces, and Operations teams dealing only with cryptic log files and alarms of all sorts.

No-one knows what is inside this darn black box. And when something goes haywire, only the Wizard of Oz is deemed able to do something:

Now that everybody is so agile, this kind of concern may sound irrelevant. After all, developers are now generalizing specialists, so the barrier between them and other teams is now significantly lowered. Or is it?

The reality remains, for the most part, a difficult hand-off of applications between teams. I have seen various attempts to improve things (including detailed operational manuals), but at the end of the day most of them amounted to shallow knowledge transfers.

Ownership needs more.

Do you have any success stories where applications have been successfully pwned by different teams?

Wednesday, March 11, 2009

Hot reload and the SRP

Not so long ago, I was tasked with the development of an in-memory IP address geolocation library. Yep, that was pretty cool and challenging at the same time (well, the challenge made it cool, right?).

In this short post, I want to share how the design of one component, the data driver, evolved over time and how the Single Responsibility Principle (SRP) inspired its refactoring. The data driver was the poor guy in charge of loading millions of geolocation data entries with the smallest possible memory footprint on a JVM (it did its job quite well, as the sizes of the zipped raw data and its in-memory form were of the same magnitude).

As in any project, everything started nice and simple. I have removed many moving parts and simplified the design to focus only on the discussion point, but what you see hereafter is pretty similar to where I started:

Quickly enough though, the rosy picture turned to a slightly less appealing color (kinda sorta brownish).

Loading all this data in memory takes time (around 15 seconds). It quickly became unacceptable to further slow down the usual crippled bootstrap of a JEE application server by holding the initialization thread longer than necessary. Consequently, the data driver had to delegate its actual initialization to another thread in order to free the main thread so it could perform its lengthy EJB bootstrapping business.

On top of that, because IP address blocks get re-assigned regularly, a geolocation database must be refreshed frequently, or else you end up with clients that appear to be in Antarctica instead of Kansas (that would be tough on the penguins). So the data driver had to be capable of hot reloading its data at any point in time while still being responsive (i.e. by continuing to serve the old data until the new data was fully loaded).

So here I went adding all these features and I ended up with that:

In complete violation of the SRP, my original jolly little driver had become the Mother Of All Drivers, capable of doing everything and even more.

At this point, my virtual green wristband started to burn my wrist pretty badly. I was hearing Uncle Bob's voice threatening me with disasters of cosmic proportions if I did not live up to my professional standards and refactor the code right away.

So I came up with this refactoring:

I gave the ReloadingDriver the single responsibility of hot reloading. It delegated the data access operation to the actual driver, to which it kept a private reference. Instead of dealing with state with flags, I refactored it according to the state pattern and used a null object to represent the "not ready" state.
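For readers who prefer code to boxes and arrows, here is a minimal sketch of that shape in modern Java. It is not the original code: apart from ReloadingDriver, every name here (GeoDriver, NotReadyDriver, InMemoryDriver, locate, the Supplier-based loader) is a hypothetical stand-in, and the data is reduced to a plain map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical data access contract; the real library's interface is not shown in this post.
interface GeoDriver {
    String locate(String ipAddress);
}

// Null object for the "not ready" state: callers never have to check a flag.
final class NotReadyDriver implements GeoDriver {
    @Override
    public String locate(String ipAddress) {
        return "UNKNOWN";
    }
}

// The actual driver: answers lookups from one snapshot of the data, nothing else.
final class InMemoryDriver implements GeoDriver {
    private final Map<String, String> locationsByIp;

    InMemoryDriver(Map<String, String> locationsByIp) {
        // Defensive copy so the snapshot cannot change after construction.
        this.locationsByIp = new HashMap<>(locationsByIp);
    }

    @Override
    public String locate(String ipAddress) {
        return locationsByIp.getOrDefault(ipAddress, "UNKNOWN");
    }
}

// Single responsibility: swap a freshly loaded driver in place of the current one.
final class ReloadingDriver implements GeoDriver {
    // The current state is itself a driver; it starts as the null object.
    private final AtomicReference<GeoDriver> current =
            new AtomicReference<>(new NotReadyDriver());
    private final Supplier<GeoDriver> loader;

    ReloadingDriver(Supplier<GeoDriver> loader) {
        this.loader = loader;
    }

    // Builds the new driver first (the slow part), then publishes it atomically.
    // Lookups keep hitting the old driver until set() is called.
    void reload() {
        GeoDriver fresh = loader.get();
        current.set(fresh);
    }

    @Override
    public String locate(String ipAddress) {
        return current.get().locate(ipAddress);
    }
}
```

Before the first call to reload(), every lookup is answered by the null object; afterwards, each reload() replaces the delegate in one atomic step, so a caller sees either the old data or the new data, never a half-loaded mix.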

To give you a better idea of how the state of the ReloadingDriver evolves, I have added a vertical time-line to the following diagram:

Interestingly, the mechanism that performs the regular reload is the same one that performs the initial load that transitions from "not ready" to "Version1".

As a closing note, none of this would have been possible without my best friends: the Executor and the AtomicReference. I want to thank them here for their constant support in my concurrent software development endeavors.
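One way to get that "same mechanism" property is to schedule a single task with no initial delay, so its first run is the initial load and every later run is a reload. The sketch below assumes a ScheduledExecutorService and uses a hypothetical VersionedData class whose "data" is just a version label; the real loading code is not shown in this post.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder whose state moves from "not ready" to Version1, Version2, ...
final class VersionedData {
    private final AtomicReference<String> current = new AtomicReference<>("not ready");
    private final AtomicInteger loads = new AtomicInteger();
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // The one and only loading task: it does not know or care whether it is
    // performing the initial load or a later reload.
    private void load() {
        String fresh = "Version" + loads.incrementAndGet(); // stand-in for the slow load
        current.set(fresh); // readers see the previous value until this line runs
    }

    // Returns immediately, so the caller's (bootstrap) thread is never held.
    // Initial delay 0 makes the first run the "not ready" -> "Version1" transition.
    // Caveat: if load() ever throws, scheduleAtFixedRate cancels the later runs.
    void start(long period, TimeUnit unit) {
        executor.scheduleAtFixedRate(this::load, 0, period, unit);
    }

    String current() {
        return current.get();
    }

    // Stops the background thread; without this the JVM would not exit.
    void stop() {
        executor.shutdownNow();
    }
}
```

Calling start(1, TimeUnit.HOURS), for instance, would trigger the first load right away on the executor's thread and then one reload per hour, while current() stays responsive throughout.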

Tuesday, March 10, 2009

SOA's Eulogy is Liberating

One of the thoughts I gathered from last night's panel on the possible death of SOA pertains to the natural consequence of the pushback against the WS-DeathStar and the spike of interest in the REST architecture.

So what is the consequence of dropping the dream of web-level distributed transactions and beyond-the-firewall data consistency?

Except for Juval Löwy (who I guess was playing devil's advocate with a mischievous wit), the panelists were unanimous on the need to create systems that handle failures gracefully, can self-heal and achieve eventual consistency.

David Platt said it very clearly: we need to design systems so they handle failures gracefully at each level of their architecture. Michele Leroux Bustamante added that this obviously does not remove the need for reliable storage (or messaging) at some point, but never beyond the firewall.

They presented these concepts with a bafflingly unabashed attitude, which I immensely appreciated. There is nothing more liberating than thought leaders debunking FUD.