Wednesday, December 16, 2009

Is Test Overlap A Necessary Evil?

In a recent blog post titled "The Limitations of TDD", Jolt Awards colleague Andrew Binstock shared some reservations Cédric Beust has about TDD. When a person of extensive experience like Cédric speaks about testing, you pay attention. And I did.

Among the very interesting quotes from Cédric that Andrew has reproduced, the following really struck me:
Another important point is that unit tests are a convenience for *you*, the developer, while functional tests are important for your *users*. When I have limited time, I always give priority to writing functional tests. Your duty is to your users, not to your test coverage tools.

You also bring up another interesting point: overtesting can lead to paralysis. I can imagine reaching a point where you don't want to modify your code because you will have too many tests to update (especially in dynamically typed languages, where you can't use tools that will automate this refactoring for you). The lesson here is to do your best so that your tests don't overlap.
Trust me, as a test-infected developer, I would love to stay in a state of self-delusion and pretend that test-induced paralysis doesn't exist. But that would be a lie: the reality is grimmer than the wonderland of testing I would wish to live in. The reality is that tests both encourage and resist change.

On the one hand, tests encourage and support refactoring: when the behavior of the application should not change but the code needs to be re-organized, tests are a blessing. They give you the courage to dare changing code because of the immediate feedback they give when you've been refactoring a little too aggressively. And this is priceless.

On the other hand, tests resist behavioral changes. Because tests have captured all the nitty-gritty of your application, when the time comes to change its behavior, you will need to invest time to adapt your tests accordingly, whether you rework the tests first or not. As Cédric pointed out, in a dynamically typed language this is immensely painful, as development tools are almost useless in assisting you with the required changes. Similarly, if you use mock objects, you are in for a descent into a deeper Circle of Hell, where more painful and frustrating manual fixes await you.
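To make this concrete, here is a minimal Java sketch (all names invented for illustration) of why interaction-capturing tests resist behavioral change: the hand-rolled mock records the exact calls made by the code under test, so any change in behavior breaks the recording, not just the production code.

```java
import java.util.ArrayList;
import java.util.List;

interface Notifier { void send(String user, String message); }

// A hand-rolled mock: it records every interaction verbatim.
class RecordingNotifier implements Notifier {
    final List<String> calls = new ArrayList<>();
    public void send(String user, String message) {
        calls.add(user + ":" + message);
    }
}

class Greeter {
    private final Notifier notifier;
    Greeter(Notifier notifier) { this.notifier = notifier; }
    // Change "hello" to anything else and every test that asserted
    // on the recorded calls has to be updated too.
    void greet(String user) { notifier.send(user, "hello"); }
}
```

A test asserting that `greet("bob")` recorded exactly `"bob:hello"` has pinned down the behavior: useful while refactoring, a tax when the behavior itself must change.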

So, is there any hope out of this love / hate relationship? Knowing that "the only way to go fast is to go well", dumping tests altogether is certainly not an option. Could the solution lie in Cédric's very last words: "do your best so that your tests don't overlap"?

At this point, I don't know yet, but I've decided that, as a starting point, I should estimate the amount of overlap I'm dealing with in the Erlang game server I'm working on. Interestingly, what I've found could pretty much apply to the vast majority of the Java projects I've previously worked on. Maybe it applies to your projects too?

The first thing I've looked at is the testing overlap that exists between two layers of our application:

As you can see, the overlap exists because tests of the upper layer rely on mocks to simulate all the happy paths and most of the unhappy paths of the underlying layer. The overlap is not total because a layer tends to reduce the granularity of the unhappy paths it faces internally, in order to expose the upper layer to a limited number of bad situations to deal with. Hence the limited number of mocked features in the overlap area.
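A small Java sketch of this granularity reduction (class and exception names are hypothetical): the lower layer collapses several internal failure modes into a single unhappy path for the layer above, so the upper layer's mocks only need to simulate that one failure.

```java
// The single unhappy path exposed to the layer above.
class StorageUnavailableException extends RuntimeException {
    StorageUnavailableException(Throwable cause) { super(cause); }
}

class StorageLayer {
    String load(String key) {
        try {
            return doLoad(key); // may fail in many distinct ways internally
        } catch (java.io.IOException | IllegalStateException e) {
            // Timeouts, dropped connections, corrupt pages... all reduced
            // to one coarse-grained failure for the upper layer.
            throw new StorageUnavailableException(e);
        }
    }

    private String doLoad(String key) throws java.io.IOException {
        if (key == null) throw new java.io.IOException("no key");
        return "value-for-" + key;
    }
}
```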

When applied to a typical vertical slice of our system, it looks like this:

This is not too bad. Until the wind of feature change comes blowing on this mock-based house of cards, life is peachy.

Until now, the tests I have been looking at were only unit and database ones. If I add our functional tests on top of the overlap diagram, here is what I get:

Now the application container is also tested, plus we get an insane amount of overlap.

But the amount of overlap is not what I want to discuss first: it's the test coverage profile that I want to look at. Notice how the functional tests explore fewer unhappy paths as they exercise deeper application layers. This can be explained simply: some unhappy paths are very hard to reproduce via the reduced set of functionalities exposed at the top level, oftentimes because they require a very specific and complex state to be established beforehand, or conditions that can only be met in case of low-level failures (loss of networking, for example).

It's obviously out of the question to consider dropping functional tests in order to reduce the testing overlap. As Cédric said, they are the only tests that have a true value for the end user of the system. My experience confirms that you can reach a nearly flawless first-time client integration if your functional tests have a coverage profile that is similar to the one in the last figure above.

The only problem lies in the quality of feedback you get from functional testing: because it's impossible to make the gory details of the errors encountered when exploring unhappy paths surface at the uppermost level, your system must have a solid logging strategy that allows you to precisely track issues, should you decide to code using functional tests as your only safety net.

So are the unit tests overlapped by the functional tests the ones that must go? Cédric again gives the answer: if time is short, it's better to focus on the functional tests. Of course, if you already have a battery of unit tests in place, keep them.

But maybe, just maybe, as you move to your next project, consider writing functional tests first? That way you build the tests that truly matter before anything else and, if time permits, you can write unit tests as you implement the features expected by the functional tests.

Sunday, November 15, 2009

Zulu Zabbix

I am posting this mainly for the sake of reference and, maybe, helping others with the same problem.

If, like us, you're running the Zabbix monitoring platform in Zulu time (aka UTC), you may have noticed a time glitch when displaying historical graphs.

The cause of this problem is simple: the fancy controls in the browser-based user interface are rendered using JavaScript, hence based on the time of the machine used to browse the graphs.

Though we are strict in running all our servers in Zulu time, we haven't crossed the chasm of running all our workstations, and the rest of our lives, in UTC. So here is the simple fix you can apply to js/sbinit.js:
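The original patch is not reproduced here, but a sketch of the idea looks like this (the function name is ours, not Zabbix's, and the exact sbinit.js internals depend on your Zabbix version):

```javascript
// Shift a Unix timestamp by the browser's local offset so the
// JavaScript-rendered controls line up with UTC-based server data.
function toBrowserAdjustedTime(unixTime) {
    // getTimezoneOffset() returns minutes *behind* UTC
    // (e.g. +300 for UTC-5), so subtracting it adds the local offset.
    return unixTime - new Date().getTimezoneOffset() * 60;
}
```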

The idea is to simply add the local browser time offset to the Unix time. With this fix in place, you will enjoy good looking graphs and correct navigation in them.

Time is really the stumbling block of software engineering...

Wednesday, November 11, 2009

Meeedia Playeeer

I've been caressing the idea of buying a Wi-Fi enabled media player in order to tap into the gigabytes of (legal) music that sit on my NAS. I've considered investing in a Logitech Squeezebox, or a similar product, but I wasn't sure such a device would be able to play directly from an NFS share, without any music server running somewhere.

Just when I started to consider building a player out of a SheevaPlug, I remembered the ultimate source of cheap hardware ready to be repurposed: eBay. $125 and a few days later, I had a like-new black Asus Eee PC 2G Surf waiting to be turned into a music player.
The unit came with Ubuntu Eee Hardy Heron on it and only 50 MB of free space left on its 2 GB solid state drive. After a merciless review of all the installed applications, I ended up with 200 MB of free space, ready to host a music player.

Finding the right music player was no small feat.

I really enjoy Audacious on my work laptop because it's plain simple and is able to play music directly from an NFS mount without any glitch. But it lacks an integrated library manager, which is a must for any software powering a machine dedicated to playing music.

So I went on trying all the players with an integrated music library manager I could find in the standard Heron software repository (I won't name names because most of these applications now have better versions available). All of them suffered from multiple woes rooted in their poor handling of network fluctuations. The most common issue was a too-short, non-configurable music buffer, leading to broken playback. The worst was a library manager that not only took ages to scan my 16+ GB of music but also, on the first network glitch, started to delete songs, one by one, from the partial library it had created (talk about defensive programming gone bad).

So I ended up installing Amarok. The reason why I didn't install it immediately, knowing it had been my favorite player for all the time I was on Kubuntu (until the KDE 4 debacle), is its sheer size. It's a 120 MB install, and on an almost full drive it didn't feel like a good idea to try it first.

This turned out to be the perfect match! Not only does Amarok play music from my NFS mount without a glitch, but its music library is totally unaffected by disturbances in the Wi-Fi signal.

All in all, my Eee Music Player is doing great. It only takes a few seconds to be resurrected after being suspended and music starts playing soon after.

Do you think repurposing full-fledged computers into single-application hosts is a crazy idea? Is it something you've considered or done already?

Saturday, October 31, 2009

Software Manifestos: A Matter Of Trust?

As software manifestos have started to proliferate these past months, I have started to wonder what could be the root cause for their creation. Why would thought leaders gather, assert a small set of values and shrink-wrap them as a manifesto, calling for others to sign it? My feeling is that these manifestos are the expression of a pushback on a particular aspect of software development that went insane.

Here is a little game: match the manifestos with the software insanities they push back on:

The insanities:
  • Big methodology and design up-front
  • Army of flying monkeys testing
  • Snake-oil vendors and ivory tower architecture
  • Reckless programmers and incompetent coders

The manifestos:
  • Agile manifesto
  • Software craftsmanship manifesto
  • QA manifesto
  • SOA manifesto

(One manifesto I see missing here is the "recruiter manifesto", which should push back on inane keyword-driven head hunting schemes solely able to put the wrong people at the wrong spots)

If we dig deeper, we become tempted to ask why our industry is suffering from such insanities. What makes software different? Could it be because of complexity?

Complexity. Software entities are more complex for their size than perhaps any other human construct because no two parts are alike (at least above the statement level). If they are, we make the two similar parts into a subroutine--open or closed. In this respect, software systems differ profoundly from computers, buildings, or automobiles, where repeated elements abound.
Frederick P. Brooks, Jr., No Silver Bullet

The natural reaction to complexity is to try to escape it at all cost, even if it means wilfully practising self-deception. Hence silver bullets, hence snake oil vendors, hence all these methodologies, governance committees and ivory towers that are there to nurse the insecurity of higher levels of management by giving them the impression software creation is under control and, finally, out of the hands of programmers.

Of course, it doesn't work that way: years and millions of dollars later, reality comes knocking at the door, manifestos are getting written and everyone is sent back to the same fundamental question they've been trying so hard to avoid: how to build trust in software developers?

And that's of course a question for us, software developers. How can we build such trust when so many forces push in the opposite direction?

Granted that software development is unpredictably complex and that this complexity reveals itself when the devil shows up (those pesky details), it is clear that the overall battle of trust is fought during each decision, when tackling each detail and writing each line of code.

I think we could learn a few lessons from the world of aviation, where trust in pilots has been built progressively and methodically. When you fly an airplane, you have plenty of decisions to make, and losing any of these battles can end very badly for everyone. So why are pilots trusted? Aren't they fully superseded by ATC anyway? The answer is no: even if ATC has authority, the PIC (Pilot In Command) has the last word, because he is the one out there dealing with the ultimate reality of flight. Despite its authority, ATC doesn't micromanage the pilot: the pilot is in command.

To have the privilege to be a PIC, you have to remain current and regularly prove that you can be trusted for your judgement based on your skills, experience and training.

If the acronym didn't sound so bad, I would dare suggest that programmers become DICs, i.e. Developers In Command. Though working under different forms of authority, DICs would be fully trusted to take the final decisions in the daily battle of writing code. In this world, it wouldn't be a heresy to say that developers can build large and complex software systems from the ground up, without the need for snake oil, committees or big design.

When trust is manifest, we won't need manifestos anymore.

Monday, September 07, 2009

Looking for my seams

Like any test infected programmer switching to a new development platform, I have spent my first days working with Erlang looking for my seams. Here, I am talking about seams as defined by Michael Feathers in Working Effectively with Legacy Code: "A seam is a place where you can alter behavior in your program without editing in that place." As such, seams are key enablers for unit testing as they allow you to redirect calls leading out of your SUT to mocks or stubs or any kind of test double you tend to favor.

In object-oriented programming, this is a given thanks to polymorphism and dependency injection. But in Erlang, where SUTs are MUTs (modules under test) and the common idiom for invoking a function is module:function(parameters), things are a little less obvious. Indeed, hard-wired function calls from one module to another don't leave much room for any kind of substitution. Without the capacity to fully test my modules in isolation, I quickly started to feel uneasy. After a few days, it felt like free-falling without a parachute.

Then I started to seriously investigate my options...


Macros

Macros allow you to define blocks of instructions that the pre-compiler will substitute wherever you reference them. When used in conjunction with flow-control statements, macros can be used to swap one code fragment for another by passing a parameter to the compiler. This seems to fit the bill, as you can use conditional macros to alter behavior without editing the places where the macro is used.

This said, I have quickly ruled out the use of macros as a valid seam. Imagine having to do this for all the function calls leading out of the MUT:
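The original snippet is lost, but a reconstructed sketch (module and macro names are invented) shows the burden of wrapping every single outbound call in a conditional macro:

```erlang
-module(score_keeper).
-export([save_score/2]).

%% Toggled by compiling with `erlc -DTEST`; one such macro is needed
%% for every function call leading out of the MUT.
-ifdef(TEST).
-define(STORE_SCORE(PlayerId, Score), mock_store:put(PlayerId, Score)).
-else.
-define(STORE_SCORE(PlayerId, Score), db_store:put(PlayerId, Score)).
-endif.

save_score(PlayerId, Score) ->
    ?STORE_SCORE(PlayerId, Score).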

Moreover, if a mistake existed in the non-unit-test branch of a conditional macro, I would have had to wait for integration tests or actual deployment to get feedback on the issue.


Dynamic invocation

Though the common idiom is to early-bind the module and function you want to call, Erlang is fully capable of late binding and dynamic invocation, as this very crude example illustrates:
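The original example is not shown here; a crude reconstruction could look like this:

```erlang
%% Module and function are ordinary variables, bound at runtime.
crude_example() ->
    Module = lists,
    Function = reverse,
    [3, 2, 1] = Module:Function([1, 2, 3]),
    %% apply/3 does the same thing explicitly
    [3, 2, 1] = apply(Module, Function, [[1, 2, 3]]).
```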

This opens interesting possibilities for MUTs that expose higher order functions. If the function that must be tested accepts one or several functions, passing a mock implementation is just a matter of providing an anonymous function of the same arity. This mock would perform nothing besides storing the received parameters in a shared storage, like the process dictionary, for later inspection.

Unfortunately, not all functions receive their dependencies as parameters: some perform direct calls to functions in other modules. It could be a plausible, if drastic, design decision to forbid all direct inter-module calls in favor of passing dependencies as anonymous functions via additional arguments. Some have suggested using a record to pass around all your application dependencies as a single extra argument added to all functions.

Interesting, but the idea of polluting all functions with additional arguments is less than palatable. In fact, it would be great if these extra arguments could be defined module-wide and implicitly added to each of its functions... Rejoice! Parameterized modules have been introduced to perform exactly this delicious syntactic-sugar trick!

Parameterized modules

I have discovered parameterized modules while writing controllers for Mochiweb. In this pretty cool HTTP server, the request reference that your processing function receives points to a parameterized module, allowing this kind of neat syntax:
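The original snippet is missing; based on Mochiweb's request API, the kind of call meant here looks roughly like this:

```erlang
%% Req is an instance of a parameterized module, so calls read
%% almost like object-oriented code.
handle_request(Req) ->
    Path = Req:get(path),
    Req:respond({200, [{"Content-Type", "text/plain"}], Path}).
```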

Though this may feel object oriented, don't get fooled: behind the scenes, there is no instance of anything. The Request reference contains all the hidden parameters that the get function needs besides the atom specifying what you want to get. What really happens is more likely something like this:
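A sketch of the desugared form (the exact expansion is an implementation detail of parameterized modules):

```erlang
%% The Req instance is silently appended as an extra argument:
Path = mochiweb_request:get(path, Req).
```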

But because the Mochiweb Request is a parameterized module, all the extra parameters have been specified once, packed in the reference and stay hidden there for your utmost convenience!

From there, it's easy to see how to write stubs for parameterized modules: just write another parameterized module that exports functions with the same signatures as the ones you use in the real module. Here is a very incomplete but fully working request stub for Mochiweb:
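The original stub is not reproduced; here is a reconstructed sketch in the same spirit (only a couple of functions are stubbed, with the process dictionary used for recording):

```erlang
-module(mochiweb_request_stub, [Method, Path]).
-export([get/1, respond/1]).

%% Only the functions actually used by the MUT are stubbed.
get(method) -> Method;
get(path)   -> Path.

respond(Response) ->
    %% Record the response in the process dictionary for later assertions.
    put(responded_with, Response),
    Response.
```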

Note how I use the process dictionary to store values that I will later retrieve for asserting everything went as expected. By using parameterized modules, I have been able to reach near 100% code coverage. Does this mean parameterized modules are the best thing since sliced bread?

Well, so much for the free lunch as there are some drawbacks to consider:
  • Experimental - Parameterized modules are still officially considered an experimental feature of Erlang, hence subject to change. Unlike the Java world, where everything is kept forever just in case, Erlang doesn't patronize developers: if this feature is one day bound for oblivion, it will be tossed out. And quickly.
  • Unchecked - Unlike with a direct module function reference, compile-time checking is not available, leading to possible bad surprises at runtime. If the parameterized module reference your code uses does not expose the expected function, you're in for a nasty error. In fact, you can happily pass a reference to a Foo module to a function that expects a totally unrelated Bar module. As a tentative mitigation, I have added a verification function to my modules so they can ensure at start-up time that they are correctly wired. This smells of framework envy, so I'm not fully satisfied with this approach.
  • Confusing - Because the actual module is not directly referred to, reading such code becomes more complicated. You have to infer from the context (or some coding conventions, or even comments) which module will actually be wired in at runtime. Decreasing understandability is definitely not a good thing.

Despite these downsides, I still believe that the complete MUT isolation and behavior-swapping facilities offered by parameterized modules make them a very interesting tool for the test-minded Erlang developer.

Closing notes

MUTs have other kinds of dependencies that you will want to substitute at unit testing time. To name a few:
  • Process dependencies - A MUT can contain functions that directly depend on other processes via their PIDs (process IDs). An interesting seam here is the local registry of processes (and ports) that you can use to set-up test processes and register them under the same name as the ones used at runtime.
  • Mnesia - Stubbing out calls from the controllers to the DAO is a good strategy, but what about the DAO itself? Instead of stubbing out each Mnesia call, I have opted for running it in-memory at unit-test time (à la HSQLDB) and activating file persistence only at runtime. This is extremely fast, so very well suited for the task.

Finally, if you wonder what unit testing framework I am using, I have opted for etap, which I find very simple and powerful enough for my needs. If you want something more structured and feature-rich, EUnit is the answer.

Free fall is over: I have found my seams and landed seamlessly. Please share your own test infected adventures in Erlang.

UPDATE 23-SEP-2009: Hot code swapping is also a very powerful seam, that has been smartly leveraged to create ErlyMock, a quite capable mock framework for Erlang.

Saturday, August 15, 2009

Why Software Craftsmanship?

If you wonder why the Software Craftsmanship movement is valuable, Calvin and Hobbes have the answer for you:

© 1996 Bill Watterson

Wednesday, August 12, 2009

Zombie ESBs and the Integration Craftsman

During the past months, ThoughtWorkers have been regularly pounding on ESBs in a manner that Martin Fowler has neatly summarized like this:

"Hang around my colleagues at ThoughtWorks and you soon get the impression that the only good Enterprise Service Bus (ESB) is a dead ESB. Jim Webber refers to them as Egregious Spaghetti Boxes. So it's not uncommon to hear tales of attempts to get them out of systems that don't need them."

The reasons for such a reaction to ESBs are multiple and, more often than not, very valid. I think they stem from two main issues: the proprietary nature of such platforms (see Ford's "Standards Based versus Standardized") and the architectural quagmire an excess of "enthusiasm" towards them can entail (see Dörnenburg's "Making ESB pain visible" and Webber's "Guerilla SOA").

The only problem I have with thought leaders pounding on ESBs is the negative aura it can create around developers involved in integration projects.

What? Why do I dare talk about integration when the subject is ESBs? Well, both subjects have become intertwined because many so-called ESBs out there are simply re-purposed integration platforms. And by re-purposed I really mean deployed in an ESB topology, because ESB is first and foremost a topology and not a product (as Ross Mason pointed out in "To ESB or not to ESB").

So can developers working on integration projects be real craftsmen? I think they can and I think they should.

This may sound a little naive but it's not. Consider the following:

  • Integration has patterns. Thanks to the work of Gregor Hohpe and Bobby Woolf, developers have access to vendor-independent semantics in the form of the Enterprise Integration Patterns. Being able to model and discuss integration without referring to a particular implementation is invaluable for craftsmen.
  • Integration has testing. It is oftentimes a challenge to test a complex integration project, and developers could be tempted to skip testing altogether. Once again thanks to Gregor Hohpe, but with Wendy Istvanick this time, testing at all levels of an integration project has been proven possible and documented.
  • Integration does not preclude SDLC practices. One point we tried to make in Mule in Action is that even if your project consists of configuring an integration tool, you should not cease to be a craftsman: you should exercise good judgment and abide by your professional standards. You want to shoot for no less than reproducible builds and deployments in your integration projects.

So, whether ESBs are better dead or undead, developers dealing with integration projects should strive to be software craftsmen above anything else.

PS. Isn't it ironic that Gregor is an ex-ThoughtWorker?

Saturday, July 18, 2009

Mule in Action: Now Treeware!

I feel a little like George McFly, now...

Trees had to die to get us there, but here we are: Mule in Action is now treeware. And in case you missed it, the making-of was here.

Enjoy the reading!

Tuesday, July 07, 2009

Legacy Code Sonar Signature

In "Working Effectively with Legacy Code", Michael Feathers gives this definition:
To me, legacy code is simply code without tests.
He also adds:
I've gotten some grief for this definition.
Indeed, defining legacy code is hard.

After purging one of our projects of code that we considered legacy (deprecated, EJB2...), we noticed this interesting trend in our Sonar dashboard:

Notice how code coverage is way less affected than complexity and rules compliance. Our legacy code had test coverage, not the greatest ever, but it had some.

I am therefore wondering: could we postulate that legacy code is code that has fallen short of software development standards?

This is actually what the above graph expresses: ditching legacy code drastically improved compliance with standards, expressed as acceptable complexity and adherence to coding rules. These standards evolve over time: as legacy code is left abandoned, it slowly drifts away and gets less and less compliant just by standing still.

Have you noticed similar trends?

Monday, June 29, 2009

Demeter's Wrath and Angry Monkeys

Transgressing the Law of Demeter can not only attract the grain goddess' wrath on you but can also turn classes into angry monkeys. Let's see how.

Consider this freshly created method and notice how it asks for more than it needs, setting the stage for the upcoming drama that involves a fuming Demeter in her heavenly spa:
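The original snippet is lost; a hypothetical reconstruction (names invented, and the computation itself is a stand-in) conveys the idea:

```java
class User {
    private final String countryCode;
    User(String countryCode) { this.countryCode = countryCode; }
    public String getCountryCode() { return countryCode; }
}

class SchniblitzComputer {
    // Asks for more than it needs: a whole User, when a country code would do.
    public static long computeSchniblitz(User user) {
        String countryCode = user.getCountryCode();
        // stand-in for the unspeakable computation
        return countryCode.hashCode() & 0xFFFF;
    }
}
```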

This method needs a country code to perform its unspeakable computation but asks for a User object. Why would anyone do that? Well, if User objects are common goods around the class that holds this method, it may look like a natural thing to do.

Inevitably, methods in other classes will need a schniblitz computed for them. They will use the above method and will, therefore, need to provide it with a User object. And they may well be cool with that.

Now enter the angry monkeys. From dependency to dependency, the fact that a User object is needed to compute a schniblitz gets propagated. After a while, once the calling code is far enough from the original method (in another module or project, handled by a different team at another site), no one will have the slightest clue why a User object is needed. But a barrage of classes will angrily ask for it.

So when an innocent victim comes around, it will be badly beaten if it doesn't present a valid User object in exchange for a nicely computed schniblitz.

But, for this victim, this may well be a big deal. Perhaps it does not have a full User object handy, so it will need several database calls to build one completely, just to pass it along and get its schniblitz. Or perhaps it deals with NewUser objects, which are the next great thing on its side of the world, forcing it to create complex and error-prone converters to turn a NewUser into a User and... get the darn schniblitz!

The kicker is that the innocent victim had a contextually valid country code all along. Had the original method reduced its needs and simply asked for what it required, the angry monkeys would have had no "Thou Shalt Provide A User" mantra to sing, and life would have been beautiful.

If you find that this whole story sounds too made-up to be true, take a closer look at your code. I bet that, if your code base has a few gray hairs and a bunch of modules, you're more than likely to have a few angry monkeys here and there.

Eek. Eek.

Tuesday, June 09, 2009

Mule + Groovy = REST

Groovy's MarkupBuilder makes outputting REST microformats a breeze.

Read more about this in my guest blog entry "Having Some REST with Mule’s Power Tools" that MuleSource has just published on "From the Mule’s mouth".

UPDATE AUG-2009: InfoQ has just published a longer version of this blog entry.

Monday, May 25, 2009

Strict Unit Tests for Public Data Contracts

Suppose we have the following code:
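The original listing is not reproduced; reconstructing it from the names used later in this post (RESULT, doThing() and the value "Joy"), it would look something like:

```java
// Reconstructed sketch of the public API under discussion.
class Thing {
    private static final String RESULT = "Joy";

    public String doThing() {
        return RESULT;
    }
}
```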

When testing such code, it is tempting to relax the visibility of RESULT to package-protected in order to write tests that share the constant value:
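A sketch of that tempting test, with the class reconstructed alongside it (a plain `assert` stands in for whatever test-framework assertion was originally used):

```java
class Thing {
    static final String RESULT = "Joy"; // visibility relaxed for the test

    public String doThing() {
        return RESULT;
    }
}

class ThingTest {
    void testDoThing() {
        // The constant is shared: if RESULT changes, this test
        // silently keeps passing.
        assert new Thing().doThing().equals(Thing.RESULT);
    }
}
```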

After all, reuse is good, right?

Well, in this case, I think that reusing this constant is not a good idea if your API is a public one (or if this code gets exposed as a service, which is practically the same thing).

In fact, I advocate writing the test like this:
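And a sketch of the advocated version, with the expected value deliberately duplicated as a literal (again a reconstruction; a plain `assert` stands in for a test-framework assertion):

```java
class Thing {
    private static final String RESULT = "Joy";

    public String doThing() {
        return RESULT;
    }
}

class ThingDataContractTest {
    void testDoThing() {
        // "Joy" is the public data contract: changing RESULT now
        // breaks this test, which is exactly the point.
        assert new Thing().doThing().equals("Joy");
    }
}
```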

But why the duplication?

The catch with public APIs is that they create long lasting expectations in an uncontrolled number of users and systems. Consequently, stability should be their essential characteristic. Through interfaces, it is easy to provide an illusion of stability: as long as the API is backwards-compatible binary-wise (or operation-wise for services), it can change at will and life is peachy.

So what is so risky with the static field above?

Well, the fact of the matter is that the value returned by the doThing() method is also part of the contract. Indeed, beyond the object-oriented concept of interface, data is also part of the overall contract with a particular class (or service). So data should exhibit the same stability as the interface itself.

When sharing a constant with the unit test, it is possible to modify the data contract without noticing. Suppose I change the value of RESULT from "Joy" to "Happy": the first test will give me a green light, while the second will turn red. And it is the latter that I am looking for: I want my unit tests to tell me that I have broken the data contract of my class.

Not its users...

Monday, May 11, 2009

Mule and the Home Service Bus

While following the discussions on Oasis Blue's SmartGrid Interest List, I noticed that smart device makers quickly reacted to the draft charter for the proposed OASIS Energy Market Information Exchange (eMIX) Technical Committee by stating that their capacity to implement full-fledged SOAP clients was limited. Looking at the bare-metal specifications of protocols like Zigbee, it is easy to understand that SOAP would be a whole different ball game.

Of course, when I heard about this need for protocol adaptation, my favorite quadruped quickly came to mind (I know, when you have a hammer...). So I ended up speculating about this:

Yes, that is Mule ESB running on a Sheeva Plug. In fact, I should say Mule HSB, as in this case the platform would serve as a Home Service Bus.

What could be the role of such an HSB?

Protocol adaptation comes to mind first: allowing all your home devices to interact internally, but also with the outside world, via "higher order protocols" like SOAP or anything more RESTful. What would happen if your smart meter started talking with your dryer?

Orchestration would be another benefit. Here I am not thinking in terms of the classic, already-solved home automation problem. Imagine orchestrating your home devices to satisfy predefined power consumption patterns. What would happen if a BPEL engine was running your house?

Home rules application would be the ultimate benefit. By plugging a rules engine into Mule, one could define advanced scenarios in which a house would automatically negotiate energy purchases and sales based on inference rules and facts (like: is there a plug-in hybrid sleeping in your garage?). What would happen if your house was smart enough to make money while you sleep?

I agree, this is a lot of speculation. But I reckon that a capable Home Service Bus will one day become a must in our homes.

PS. Google just told me that Bert Boerland coined the term Home Service Bus. We must be onto something ;-)

Sunday, May 10, 2009

Organic Distributed Systems

Migrating monolithic systems to distributed ones is probably one of the most exhilarating tasks in software development.

Monolithic systems, even if they engage in interconnected relationships, remain pretty much like silos (I like to compare a network of monolithic systems to silos connected by monkey bridges). Reflecting on the work I have been doing in this domain for the past years, I came to realize how much an IT landscape of distributed systems ends up resembling a living organism.

Indeed, some properties of the organic world emerge in a system that evolves towards distribution.

Distributed systems are more resilient. Local issues in living organisms tend to remain local instead of endangering the whole system. This is achieved via redundancy, heterogeneity and a limited coupling between each part of an organism. Interestingly, the same applies to distributed software systems: if properly decoupled (interface-wise and time-wise), a particular system can be in pain without taking down the whole operation.

Distributed systems are harder to diagnose. Rare or complex diseases are hard to diagnose and often require many analyses to be performed. Distributed systems present the same challenge, complicating forensics when something goes haywire. Using tracing mechanisms, like correlation IDs, can simplify such diagnostics, the same way DNA tracing can help figure out the spread of a particular gene (or virus!).
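To make the correlation ID idea concrete, here is a minimal, hypothetical sketch (the envelope and method names are mine, not from any particular framework): a message is stamped with a unique ID at the edge, and every derived message in downstream systems carries the same ID, so logs from all hops can be joined afterwards.

```java
import java.util.UUID;

public class CorrelationIdDemo {
    // A toy message envelope: the correlation ID is minted once and never changes.
    record Message(String correlationId, String payload) {
        static Message create(String payload) {
            return new Message(UUID.randomUUID().toString(), payload);
        }
        // Derive a follow-up message in another system, keeping the same ID.
        Message derive(String newPayload) {
            return new Message(correlationId, newPayload);
        }
    }

    public static void main(String[] args) {
        Message order = Message.create("order-received");
        Message invoice = order.derive("invoice-created");
        // Same ID on both hops: grep the logs for it and you can
        // reconstruct the full journey of this business transaction.
        System.out.println(order.correlationId().equals(invoice.correlationId()));
    }
}
```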

Distributed systems can self-heal. Living organisms embark all sorts of self-healing mechanisms, from localized ones (cicatrization) to global ones (fever). Because each member of a distributed system focuses on specific tasks and has reduced coupling to the rest, it has more freedom to perform recovery operations in isolation, without needing the involvement of any other member. That said, it will still need an escalation mechanism in case the issue cannot be treated locally.

There are surely other qualities distributed systems exhibit that make them look like living things. Can you think of any?

Thursday, April 09, 2009

Jitter Mule Functional Tests With Jitr

Embedding Mule in a web application allows you to tap into the Servlet layer of your favorite web container, which is a good thing as you are supposedly very familiar with its behavior and tuning.

When it comes to writing functional tests for such an application, my strategy was to replace the Servlet endpoints with stock HTTP ones, a substitution that is trivial to perform thanks to the modularity of Mule configuration files: I simply loaded a slightly different set of files at functional test time than the set configured in my web.xml. Since I was writing the tests as subclasses of org.mule.tck.FunctionalTestCase, all the goodness of Mule was readily available as protected methods and members.

Still, I was not very satisfied with this approach because of the discrepancy between what my functional tests were exercising (the stock HTTP transport) and what I was actually using (the Servlet transport). This discrepancy bit me badly in recent weeks, with a problem that only showed up with the Servlet transport.

Then came Jitr, the latest creation of international software samurai Josh Devins:
Jitr (pronounced "jitter") is a JUnit Integration Test Runner. It allows your web application integration tests to easily run against a lightweight web container in the same JVM as your tests.
It turns out Jitr is extremely well suited to functionally testing a Mule instance embedded in a web application. Here is the structure of such a Jitr-powered test:

Thanks to the access to the MuleContext, all the internals of the loaded Mule instance are available, which means that I am free to use my asynchronous testing strategies. It's party time, really!

Note that Jitr tests are pure JUnit 4 tests, without any sub-classing needed.

Because I am still loading a slightly different configuration at test time, as I am replacing file endpoints with VM ones for easier testing, I have to load a different web.xml file. Nothing to worry about, as Jitr can load this file from an alternate location through configuration alone. So here is the actual beginning of my test class:

Jitr is a capable new tool that will complement the toolbox of any developer who has to run functional tests on web services.

Friday, March 20, 2009

Application Pwnership

Who owns this application? What can possibly be complicated about such a simple and innocent question?

Unfortunately, the answer to such a question is not that easy. Or at least, we have created software organizations that make it hard to answer.

Though it makes sense to have a division of labor between different teams, each qualified in certain aspects of software applications, the main problem resides in the partial system views that such a division creates.

Out of developers' hands, an application is oftentimes perceived by QA, DBAs or Operation teams as a giant black box:
(yes, it is a canonical black monolith of 1-4-9 proportions)

In fact, this giant black box has a few keyholes on it, and through these keyholes these teams can barely peer into it. Consequently, the black box effect leads to QA teams seeing applications as sets of buttons to push, DBAs seeing them as data and table spaces, and Operation teams dealing only with cryptic log files and alarms of all sorts.

No one knows what is inside this darn black box. And when something goes haywire, only the Wizard of Oz is deemed able to do something:

Now that everybody is so agile, this kind of concern may sound irrelevant. After all, developers are now generalizing specialists, so the barrier between them and other teams is now significantly lowered. Or is it?

The reality remains, for the most part, a difficult hand-off of applications between teams. I have seen various attempts to improve things (including detailed operational manuals), but at the end of the day most of these attempts amounted to shallow knowledge transfers.

Ownership needs more.

Do you have any success story where applications have been successfully pwned by different teams?

Wednesday, March 11, 2009

Hot reload and the SRP

Not so long ago, I was tasked with the development of an in-memory IP address geolocation library. Yep, that was pretty cool and challenging at the same time (well, the challenge made it cool, right?).

In this short post, I want to share how the design of one component, the data driver, has evolved over time and how the Single Responsibility Principle (SRP) inspired this refactoring. The data driver was the poor guy charged with loading millions of geolocation data entries with the smallest possible memory footprint on a JVM (it did its job quite well, as the sizes of the zipped raw data and its in-memory form were of the same order of magnitude).

As in any project, everything started nice and simple. I have removed many moving parts and simplified the design to focus only on the discussion point, but what you see hereafter is pretty similar to where I started:

Quickly enough though, the rosy picture turned to a slightly less appealing color (kinda sorta brownish).

Loading all this data in memory takes time (around 15 seconds). It quickly became unacceptable to further slow down the already sluggish bootstrap of a JEE application server by holding the initialization thread longer than necessary. Consequently, the data driver had to delegate its actual initialization to another thread in order to free the main thread, so it could perform its lengthy EJB bootstrapping business.

On top of that, because IP address blocks get re-assigned regularly, a geolocation database must be refreshed frequently, else you end up with clients that appear to be in Antarctica instead of Kansas (that would be tough on the penguins). So the data driver had to be capable of hot reloading its data at any point in time while still being responsive (i.e. by continuing to use the old data until the new set was fully loaded).

So here I went adding all these features, and I ended up with this:

In complete violation of the SRP, my original jolly little driver had become the Mother Of All Drivers, capable of doing everything and even more.

At this point, my virtual green wristband started to burn my wrist pretty badly. I was hearing Uncle Bob's voice threatening me with disasters of cosmic proportions if I did not live up to my professional standards and refactor the code right away.

So I came up with this refactoring:

I gave the ReloadingDriver the single responsibility of hot reloading. It delegated the data access operations to the actual driver, to which it kept a private reference. Instead of tracking state with flags, I refactored it according to the state pattern and used a null object to represent the "not ready" state.

To give you a better idea of how the state of the ReloadingDriver evolves, I have added a vertical time-line to the following diagram:

Interestingly, the mechanism that performs the regular reload is the same one that performs the initial load, transitioning from "not ready" to "Version1".

As a closing note, none of this would have been possible without my best friends: the Executor and the AtomicReference. I want to thank them here for their constant support in my concurrent software development endeavors.
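To illustrate the design with a minimal sketch (all names here are illustrative, not the actual library's API), here is how an Executor, an AtomicReference and a null object can combine into a hot-reloading driver:

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

interface GeoDriver {
    String locate(String ip);
}

// Null object representing the "not ready" state: it answers immediately
// with "unknown" instead of blocking callers during the initial load.
class NotReadyDriver implements GeoDriver {
    public String locate(String ip) { return "unknown"; }
}

// The actual driver: immutable snapshot of the geolocation data.
class InMemoryDriver implements GeoDriver {
    private final Map<String, String> data;
    InMemoryDriver(Map<String, String> data) { this.data = data; }
    public String locate(String ip) { return data.getOrDefault(ip, "unknown"); }
}

class ReloadingDriver implements GeoDriver {
    // The only mutable state: an atomic reference to the current delegate.
    private final AtomicReference<GeoDriver> delegate =
        new AtomicReference<>(new NotReadyDriver());
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // don't keep the JVM alive just for reloads
            return t;
        });

    ReloadingDriver(Supplier<Map<String, String>> loader) {
        // The same task performs both the initial load ("not ready" -> v1)
        // and every subsequent hot reload: callers keep hitting the old
        // driver until the swap, which is a single atomic set.
        executor.scheduleWithFixedDelay(
            () -> delegate.set(new InMemoryDriver(loader.get())),
            0, 1, TimeUnit.HOURS);
    }

    public String locate(String ip) { return delegate.get().locate(ip); }
}
```

The swap from one data version to the next is a single atomic set, so readers never block and never observe a half-loaded state.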

Tuesday, March 10, 2009

SOA's Eulogy is Liberating

One of the thoughts I gathered from last night's panel on the possible death of SOA pertains to the natural consequence of the push back on the WS-DeathStar and the spike of interest in the REST architectural style.

So what is the consequence of dropping the dream of web-level distributed transactions and beyond-the-firewall data consistency?

Except for Juval Löwy (who I guess was playing devil's advocate with a mischievous wit), the panelists were unanimous on the need to create systems that handle failures gracefully, can self-heal and achieve eventual consistency.

David Platt said it very clearly: we need to design systems so they handle failures gracefully at each level of their architecture. Michele Leroux Bustamante added that this obviously does not remove the need for reliable storage (or messaging) at some point, but never beyond the firewall.

They presented these concepts with a bafflingly unabashed attitude, which I immensely appreciated. There is nothing more liberating than thought leaders debunking FUD.

Friday, February 27, 2009

Conversation with a Web Thread

DD: Hi Mr. Web Thread and thanks for joining us.
WT: My pleasure. Do you mind if I stay in the pool?

DD: Hmm? Sure, why not. So, can you please tell us how your life is nowadays?
WT: Life has been pretty good. I have become very popular recently and came to perform some massive gigs in highly trafficked web sites. I really like this pool.

DD: Mmhh, okay. How do you think developers treat you, nowadays?
WT: Well, I am glad you ask. I think things have improved a lot, thanks to the emergence of concepts like continuations and AJAX. Still, I sometimes get badly beaten by some reckless coders. This pool is awesome.

DD: So, if you were to give a piece of advice to these programmers, what would it be?
WT: I think that these developers only need to understand the ultimate goal of my life.

DD: Which is?
WT: Coming back to that pool!

DD: Pardon me?
WT: You see, I get pulled out of this pool very often, and sometimes I am forced out of it for too long. In that case, I am like a fish out of water. And when I suffer like that, the whole application suffers.

DD: So what you are saying is that developers should strive to let you return to the pool as quickly as possible?
WT: Absolutely.

DD: Whatever the cost may be?
WT: Well, it depends of course, but if they can afford to let me return with slightly stale data, that is the thing to do.

DD: Where would they get such stale data from?
WT: Oh, I realize I never introduced you to Mrs. Web Thread, née Cache. She usually takes care of this.

DD: But ladies, sorry, caches are complex. They require eviction strategies, invalidation message broadcasts, etc.
WT: Or not. You can use a simple time-evicted opportunistic cache, provided it matches your business needs.

DD: Ah, I see, something like a 5-minute cache.
WT: Or 5 seconds.

DD: You've got to be kidding me, what's a 5-second cache worth?
WT: Do you have any idea of all the things I can do in 5 seconds? Do you realize that it is an agonizing eternity for me?

DD: Well, uh, I guess not. Let us lighten up the debate a little. Now that you are a rock star, thanks to your success in multi-million-user web sites, do you get to sign a lot of autographs? Do people recognize you a lot?
WT: Sure, I can hardly go anywhere without being spotted. But, believe it or not, it still happens that I get mixed up with my cousins.

DD: Oh, you have cousins?
WT: Yes, Worker Thread and Background Thread. Sometimes, I get confused with one of them and receive way too much work to do or work that simply does not concern me at all.

DD: What do you do in that case?
WT: As I said before: I become grumpy and the whole web tier suffers with me too. I kind of like to share my pain!

DD: OK, Mr. Web Thread, I think it is time we wrap up now. Thanks a lot for your time and sharing your experience with us.
WT: You are welcome. Now back to the pool.
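For the curious, the "simple time-evicted opportunistic cache" that Mrs. Web Thread handles could be sketched like this (a hypothetical minimal implementation, shown only to make the idea concrete; a real caching library would usually be preferable):

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class TimeEvictedCache<T> {
    private record Entry<V>(V value, long loadedAt) {}

    private final Supplier<T> loader;
    private final long ttlMillis;
    private final AtomicReference<Entry<T>> entry = new AtomicReference<>();

    TimeEvictedCache(Supplier<T> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    T get() {
        Entry<T> current = entry.get();
        long now = System.currentTimeMillis();
        if (current == null || now - current.loadedAt() > ttlMillis) {
            // Opportunistic: whoever notices the expiry reloads. Concurrent
            // reloads are harmless here since the last write simply wins,
            // and everyone else keeps serving slightly stale data meanwhile.
            current = new Entry<>(loader.get(), now);
            entry.set(current);
        }
        return current.value();
    }
}
```

Even with a time-to-live of five seconds, a page hit hundreds of times per second would go from hundreds of backend calls to roughly one every five seconds.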


Just Read: Statistics Hacks

More than twenty years after my last statistics class, this book really tasted like a rejuvenating read. It is well structured, with an opening focused on theory followed by numerous applications in all sorts of domains (yes, including poker, though my preferred subject was the Drake Equation). As such, the book will stand as a quick reference guide to which the reader will return every now and then.

Recommended for aging minds in need of a refresher (like mine) or curious minds wanting to learn more about statistics and how relevant they are to everyday life.

As a side note, this is the first e-book I bought: I got it in EPUB format, which renders almost flawlessly on my Sony PRS-700 (text, figures and schemas are good, though very large tables get cut). Kudos to O'Reilly, who do a truly great job with their e-books, providing them in three formats and not limiting the number of times you can download them (so I don't even bother backing up my purchases).

Wednesday, February 25, 2009

Standards vs. Upgrade pains

In Standards Based vs. Standardized, Neal Ford develops a very interesting argument that is mainly focused on what he sees being "wrong with SOA (Service Oriented Architecture) in the way that it's being sold by vendors", but really touches a vast subject that has many thought-provoking ramifications.

While considering the application upgrades I am involved in or that occur around me, it is clear that applications deployed on standardized platforms have easier upgrade paths. Or do they?

Let me consider the whole spectrum of applications I am involved with and discuss the related upgrade pains, how standards play a role there, and whether they are worth the effort. I won't discuss the necessity of upgrading application platforms: I reckon everybody feels the same itch when they see an application running on a platform that is many releases behind its current revision.


Not standardized
Neal mentions ESBs as canonical non-standardized platforms, so I will start with them and consider my experience with tracking Mule versions. Many aspects of Mule are standards-based, from its protocol stacks to the zillions of libraries it is built upon (which I consider to be industry standards). Even its configuration mechanism piggybacks on Spring's schema-driven configuration. What is not standardized is Mule's API and configuration syntax. These therefore tend to evolve quickly, and sometimes drastically, as Mule expands and refines its feature base. Consequently, tracking Mule versions is no small feat: it requires pretty advanced knowledge of the platform to be performed right (and the assistance of extensive unit and integration tests, but everyone has those already, right?). Why would anyone willingly jump through such hoops? Because it is worth it: being exposed to Mule's non-standard aspects is rewarded with full access to its complete feature set and its full power.


Standardized
At the other end of the spectrum lie web applications that fully pack their dependencies and can be deployed on pretty much any container that understands their standardized configuration. A typical example is a Spring/Hibernate application deployed on a JEE web container, like Tomcat or Jetty. Note that, even if this kind of application can have very complex needs, its reliance on the JEE standard is actually quite small: a few entries in web.xml that hook some basic application entry points to the container, and that is pretty much it. The rest is not standardized but fully resides in the realm of what the application controls (Spring and Hibernate configurations, for example). Hence migrating between different versions, or even different implementations, of the JEE container is usually a no-brainer.

Somewhat standardized

Finally, in the middle of the spectrum, we find the applications deployed on platforms where standards fall short of allowing them to easily follow releases. Consider a full-fledged JEE application, say deployed on JBoss. In this case, two forces act against the standard. The first is the limited scope of the standard: anything that has been left to the interpretation of the implementor will likely differ enough between major releases to cause trouble. The second is the fact that the standard itself is a keyhole applied against the full range of features of the platform: these features are there, accessible, often compelling, and it requires a lot of willpower to use them only indirectly (by shielding them behind a custom facade) or not at all (by pulling in extra dependencies to provide the needed features). These two forces combined usually lead to full-range JEE applications that do not evolve gracefully when their underlying standardized platform is upgraded.

So there is a clear trade-off between having full access to a platform and being isolated from the disruptive aspects of its evolution over time. It is also true that standards with more teeth are probably needed (OSGi is an interesting one in this matter).

But everybody knew this already.

Sunday, February 22, 2009

Two minutes on

Always interested in sharing the futility of my own existence, I thought that I could use to broadcast my boredom level (i.e. LinkedIn status) on different channels.

All went fine until I hit the page for connecting my account to my LinkedIn one. See for yourself:

Seriously, guys, you want the password of my LinkedIn account? Is this really the best we can do for integrating social networks? Are we waiting until the Wrath of OWASP falls on us to fix this?

Then I closed my account.

Thursday, February 19, 2009

Integration Testing Asynchronous Services

Whether you use Mule or another integration framework, writing integration tests for asynchronous services can be a little tricky, as you may start running assertions in your main test thread while messages are still being processed.

Suppose you want to ensure that your content-based router sends messages to the expected target services. Your integration test suite will consist of sending valid and invalid messages to the input channel in front of this router and checking that they end up in the expected destination services.

But how can you be sure that delivery has happened and that it is safe to start asserting values?

The naive solution to this problem consists of sleeping for an arbitrary delay (say, one second) before running any assertion.
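In code, the naive approach looks something like this (a self-contained sketch where a queue stands in for the destination service; in a real Mule test the dispatch would go through the MuleClient):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class NaiveAsyncTest {
    // Stand-in for the destination service: delivery lands in this queue.
    static final BlockingQueue<String> received = new LinkedBlockingQueue<>();

    // Simulates asynchronous delivery happening on another thread.
    static void dispatchAsync(String message) {
        new Thread(() -> received.offer(message)).start();
    }

    public static void main(String[] args) {
        dispatchAsync("hello");
        try {
            // Arbitrary wait: too long on a fast machine,
            // possibly too short on a busy integration server.
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!received.contains("hello")) {
            throw new AssertionError("message not delivered");
        }
        System.out.println("naive test passed (this time)");
    }
}
```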

It is naive because it arbitrarily slows down your test suite without giving any guarantee that the message will have been delivered. It may work on your machine, but the same suite running on a busy integration server will behave very differently: one second may not be a long enough wait if the threads handling the message delivery behind the scenes are slower than usual.

A problematic fix to this broken solution consists of polling:
while (messageNotReceived) Thread.sleep(1000);

This is problematic because if, for any reason, the message does not hit the expected service, this loop will hang the test suite forever. Of course, you can add a counter to limit the loop and reduce the sleep time to be more reactive, but this amounts to re-inventing the wheel when higher-level concurrency primitives are ready-made for exactly this.

Indeed, we can use a countdown latch to hold the main test thread until the message hits the expected service. Here is the code I use for my Mule integration testing:
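Here is a framework-agnostic sketch of that pattern (the listener registration stands in for Mule's notification mechanism, and all names are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class LatchAsyncTest {

    interface MessageListener { void onMessage(String message); }

    // Stand-in for the service under test: invokes its listener asynchronously.
    static class AsyncService {
        MessageListener listener;
        void send(String message) {
            new Thread(() -> listener.onMessage(message)).start();
        }
    }

    static String runAndWait(AsyncService service, Runnable trigger) {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> received = new AtomicReference<>();
        // Register the listener BEFORE triggering, so no event can be missed.
        service.listener = message -> {
            received.set(message);
            latch.countDown();
        };
        trigger.run();
        try {
            // Safeguard timeout: fail fast instead of hanging the suite forever.
            if (!latch.await(15, TimeUnit.SECONDS)) {
                throw new AssertionError("message never hit the expected service");
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return received.get();
    }

    public static void main(String[] args) {
        AsyncService service = new AsyncService();
        String message = runAndWait(service, () -> service.send("hello"));
        System.out.println("received: " + message);
    }
}
```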

I leverage the notification mechanism of Mule and the capacity of its functional test component to broadcast such an event when it is hit by a message. Though this is a Mule-specific implementation, the design principle I discuss here is applicable to other situations.

Notice how I pass the triggering action of the test (like sending a message with the MuleClient) as a Runnable to this method. This allows me to prepare a latch and register a listener before calling this Runnable, and then to wait on the latch until it is lowered. When execution comes back to the caller, it can safely assert properties on the received message.

Obviously, the test is designed so its normal execution leads to a message always hitting the right service. Hence, in practice, the wait on the latch is very short, way under the safeguard timeout of 15 seconds.

But what if we want to test that no message will hit a particular service?

This is when things get tricky with asynchronous testing, because there is no systematic approach to testing the absence of an event when it can occur unpredictably in the future.

One strategy I use is to turn the problem around and change the negative assertion into a positive one; for example, by always ensuring a message will go somewhere, even if that somewhere is a service that sends messages to oblivion. By doing so, instead of testing that a bad message never hits a good service, I test that a bad message hits the oblivion service.
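A minimal sketch of this inversion (with a toy content-based router and queue-backed services; all names are illustrative):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class OblivionTest {
    // Queue-backed stand-ins for the destination services.
    static final BlockingQueue<String> good = new LinkedBlockingQueue<>();
    static final BlockingQueue<String> oblivion = new LinkedBlockingQueue<>();

    // Toy content-based router: valid messages go to the good service,
    // everything else to an explicit dead-end instead of nowhere.
    static void route(String message) {
        new Thread(() ->
            (message.startsWith("valid") ? good : oblivion).offer(message)
        ).start();
    }

    // Positive assertion helper: block until the bad message shows up
    // in oblivion, or time out.
    static String awaitOblivion(long timeoutSeconds) {
        try {
            return oblivion.poll(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```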

Do you have any other advice to share about asynchronous testing?

Friday, January 23, 2009

The Sin of Generalization

The more I progress in my journey as a software engineer, the more I realize how our profession suffers from the Sin of Generalization. The good news is that this sin can be fought with a rope and a mast. Let me explain.

This almost deadly sin manifests itself in two ways:
  • A tool or a methodology that has been successful in one context gets generalized and applied to other contexts, where it ill fits.
  • A generic platform or framework is created to solve all sorts of needs, while ad hoc solutions would have been more to the point.
This may sound funny coming from me, as I am the lead of NxBRE, a .NET generalized rule engine! The reality is that NxBRE, like all rule engines and all generalized tools, has a sweet spot where its usage makes sense. Outside of this sweet spot, using a generalized tool is in fact counterproductive, because the idiosyncrasies it brings (limitations, bugs, specific configuration...) outweigh its benefits.

Of course, there is value in re-use and generic tools. The point is that it is always worth considering the relevance of a generic tool or methodology every time you intend to use it. This sounds like an obvious statement, but experience shows that generalized solutions are as attractive and noxious as sirens.

Hence the rope and the mast.

Sunday, January 18, 2009

Mule in Action, the Making of

Now that the dust has started to settle, I have some time to reflect on writing Mule in Action and share my experience with the process.

I will spare you the "it's a long and exhausting process that is a true trial for one's will", because you already know that such an endeavor requires a personal investment that is beyond whatever time-consuming hobby you could have (that is basically about sacrificing all evenings and Saturdays on the altar of authorship for a solid 6 months).

A technical book like Mule in Action gets its relevance from the richness and correctness of the examples it contains. Inspired by Brian Goetz's experience writing Java Concurrency in Practice, John and I decided to treat our examples like a full-fledged project. For this, we created a Google Code project and built all the samples with Maven. These examples are sometimes started by a command line script, but the true value resides in the unit and integration tests that cover them.

These tests are a life saver because the community edition of Mule 2, especially in its early days, was a moving target, with non-backward-compatible API changes and deep behavioral alterations occurring between releases. Any time a new version of Mule is released, we upgrade to it and run a full build of the book examples. Anything that breaks signals a potential change in the book itself. But how do we know if the book needs to be changed, and where?

We did not use a code extraction/insertion script, like Brian did, but we decided to enable complete traceability by tagging our code fragments and keeping each tag alongside its fragment in a hidden field in the body of the book. This way, whenever we modify the source code, we know if and where the book is affected.

This leads us to the format of the book itself. We opted for DocBook, which is "just" XML backed by a schema. What does this give us? First, the possibility to grep for a specific code tag or sentence and replace it using simple commands. Second, the possibility to diff two versions of the text (because, of course, we store all the chapters and related images in Subversion): this is extremely convenient in a multi-author environment like ours. And, finally, the possibility to use a simple but strict writing environment. Indeed, you do not need a word processor for writing a book: it is just a useless collection of bells and whistles. Any editor with schema validation and a basic spell checker is all you need.

So much for the tooling, but what about the practice itself? I quickly realized that, like a movie, a book is produced in an out-of-order manner. Trying to write a whole chapter in reading order is futile and could lead to a very frustrating writer's block. Once I started to accept this idea, writing became way easier, with chapters appearing in whatever order inspiration led me, as the backing code samples were being created. Even for a purely technical book, inspiration matters: it is a pretty stubborn animal that goes only the way it wants (am I talking about Mule?) and must not be forced otherwise.

Finally, I found that music was an extremely powerful productivity booster. Maintaining high productivity is essential for such a technical book, which is condemned to premature obsolescence if it is not written within a short period of time. Listening to lyric-less music was my way to bootstrap writing sessions, even when I would often have preferred to call it a day and get some rest!

We struggled hard to stay away from writing a user guide, which is available online anyway, and opted to focus on practical examples touching on most of the common and advanced usages of this ESB. Hopefully, the book will prove good enough to give our readers both a deep and wide coverage of Mule.