Monday, September 07, 2009

Looking for my seams

Like any test-infected programmer switching to a new development platform, I spent my first days with Erlang looking for my seams. Here, I am talking about seams as defined by Michael Feathers in Working Effectively with Legacy Code: "A seam is a place where you can alter behavior in your program without editing in that place." As such, seams are key enablers for unit testing, as they allow you to redirect calls leading out of your SUT (system under test) to mocks, stubs or whatever kind of test double you favor.

In object-oriented programming, this is a given thanks to polymorphism and dependency injection. But in Erlang, where SUTs are MUTs (modules under test) and the common idiom for invoking a function is module:function(parameters), things are a little less obvious. Indeed, hard-wired function calls from one module to another don't leave much room for any kind of substitution. Without the capacity to fully test my modules in isolation, I quickly started to feel uneasy. After a few days, it felt like free-falling without a parachute.

Then I started to seriously investigate my options...


Macros

Macros allow you to define code fragments that the preprocessor will substitute for you wherever you refer to them. When used in conjunction with conditional directives such as -ifdef, macros can be used to switch one code fragment for another by passing a flag to the compiler. This seems to fit the bill, as you can use conditional macros to alter behavior without editing the places where the macro is used.

That said, I quickly ruled out macros as a valid seam. Imagine having to do this for all the function calls leading out of the MUT:
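As a minimal sketch (the seam_macro module and the TODAY macro are made up for illustration), a single outgoing call guarded by a conditional macro could look like this:

```erlang
-module(seam_macro).
-export([today_label/0]).

%% Hypothetical seam: the call leading out of the module hides behind a macro.
%% Compiling with `erlc -DTEST seam_macro.erl` swaps in a canned value.
-ifdef(TEST).
-define(TODAY(), {2009, 9, 7}).
-else.
-define(TODAY(), erlang:date()).
-endif.

%% Formats a date as "YYYY-M-D"; where the date comes from is decided
%% at compile time, not here.
today_label() ->
    {Y, M, D} = ?TODAY(),
    lists:flatten(io_lib:format("~p-~p-~p", [Y, M, D])).
```

Every other outgoing call would need its own -ifdef block of the same shape.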

Moreover, if there were a mistake in the production branch of the conditional macro, the unit tests would never exercise it: I would have to wait for integration tests or actual deployment to get feedback on the issue.


Funs

Though the common idiom is to early-bind the module and function you want to call, Erlang is fully capable of late binding and dynamic invocation, as this very crude example illustrates:
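A small sketch of late binding, using lists:reverse/1 from the standard library as the target (the seam_dynamic module name is hypothetical):

```erlang
-module(seam_dynamic).
-export([demo/0]).

demo() ->
    Module = lists,
    Function = reverse,
    %% Module and function are plain atoms held in variables, so the
    %% target of the call is only resolved at runtime.
    [3, 2, 1] = Module:Function([1, 2, 3]),
    %% apply/3 does the same with an explicit argument list.
    [3, 2, 1] = apply(Module, Function, [[1, 2, 3]]),
    %% A fun can also capture an exported function for later use.
    Reverse = fun lists:reverse/1,
    Reverse([1, 2, 3]).
```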

This opens interesting possibilities for MUTs that expose higher-order functions. If the function under test accepts one or several functions as arguments, passing a mock implementation is just a matter of providing an anonymous function of the same arity. This mock would do nothing besides storing the received parameters in some shared storage, like the process dictionary, for later inspection.
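To make this concrete, here is a sketch with a hypothetical notify_all/2 function whose only dependency is passed in as a fun. The mock records each call in the process dictionary under an arbitrary key, sent:

```erlang
-module(seam_fun).
-export([notify_all/2, demo/0]).

%% Function under test: its only dependency, Send, is passed in as a fun.
notify_all(Users, Send) ->
    lists:foreach(fun(User) -> Send(User, "welcome") end, Users).

demo() ->
    %% Mock of arity 2: stores the received parameters and does nothing else.
    MockSend = fun(User, Message) ->
        Calls = case get(sent) of
                    undefined -> [];
                    Previous -> Previous
                end,
        put(sent, [{User, Message} | Calls]),
        ok
    end,
    notify_all([alice, bob], MockSend),
    %% Read the recorded calls back in call order and clean up.
    lists:reverse(erase(sent)).
```

The test then asserts on the returned list, here one {User, Message} pair per user.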

Unfortunately, not all functions receive their dependencies as parameters; most perform direct calls to functions in other modules. A plausible, if drastic, design decision would be to forbid all direct inter-module calls in favor of passing dependencies as anonymous functions via additional arguments. Some have suggested using a record to pass around all your application dependencies as a single extra argument added to every function.
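The record idea could be sketched as follows. The deps record, its store and notify fields and the register_user/2 function are all invented for the example:

```erlang
-module(seam_deps).
-export([register_user/2, demo/0]).

%% Hypothetical record bundling every dependency of the application.
-record(deps, {store, notify}).

%% The function under test never names another module directly;
%% it only calls the funs found in the record.
register_user(Name, #deps{store = Store, notify = Notify}) ->
    ok = Store(Name),
    Notify(Name).

demo() ->
    %% Test wiring: both dependencies are replaced by trivial stubs.
    Deps = #deps{store = fun(_Name) -> ok end,
                 notify = fun(Name) -> {notified, Name} end},
    register_user(alice, Deps).
```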

Interesting, but the idea of polluting all functions with additional arguments is less than palatable. In fact, it would be great if these extra arguments could be defined once per module and implicitly added to each of its functions... Rejoice! Parameterized modules have been introduced to perform exactly this delicious syntactic sugar trick!


Parameterized modules

I have discovered parameterized modules while writing controllers for Mochiweb. In this pretty cool HTTP server, the request reference that your processing function receives points to a parameterized module, allowing this kind of neat syntax:

Though this may feel object-oriented, don't be fooled: behind the scenes, there is no instance of anything. The Request reference contains all the hidden parameters that the get function needs besides the atom specifying what you want to get. What really happens is more likely something like this:
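The following sketch is not Mochiweb's code. It mimics the mechanism with a hypothetical req_sketch module whose reference is a plain tuple tagged with the module name, and it assumes a compiler that honors the tuple_calls option so that the Req:get(path) form still dispatches:

```erlang
-module(req_sketch).
-compile(tuple_calls).  %% assumption: required on recent OTP releases
-export([new/2, get/2, demo/0]).

%% Pack the hidden parameters once, tagged with the module name.
new(Method, Path) -> {?MODULE, Method, Path}.

%% The packed reference arrives as an extra, last argument.
get(method, {?MODULE, Method, _Path}) -> Method;
get(path, {?MODULE, _Method, Path}) -> Path.

demo() ->
    Req = new('GET', "/index.html"),
    %% The convenient form...
    Sugared = Req:get(path),
    %% ...and the explicit call it stands for.
    Explicit = ?MODULE:get(path, Req),
    Sugared = Explicit,
    {Req:get(method), Explicit}.
```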

But because the Mochiweb Request is a parameterized module, all the extra parameters have been specified once, packed in the reference and stay hidden there for your utmost convenience!

From there, it's easy to see how to write stubs for parameterized modules: just write another parameterized module that exports functions with the same signatures as the ones you use in the real module. Here is a very incomplete but fully working request stub for Mochiweb:
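A sketch in that spirit, emulating the parameterized module with a tuple tagged with the module name and assuming the tuple_calls compiler option is available. The request_stub module, the handle/1 controller and the stub_response key are assumptions, not the original stub:

```erlang
-module(request_stub).
-compile(tuple_calls).  %% assumption: required on recent OTP releases
-export([new/2, get/2, respond/2, demo/0]).

%% Build a stub reference carrying canned values for the test.
new(Method, Path) -> {?MODULE, Method, Path}.

%% Same names and arities the code under test relies on.
get(method, {?MODULE, Method, _Path}) -> Method;
get(path, {?MODULE, _Method, Path}) -> Path.

%% Record the response in the process dictionary instead of writing
%% to a socket.
respond(Response, {?MODULE, _Method, _Path}) ->
    put(stub_response, Response),
    ok.

%% Stand-in for a controller under test; it only sees the Req reference.
handle(Req) ->
    case Req:get(method) of
        'GET' -> Req:respond({200, [], "you asked for " ++ Req:get(path)});
        _ -> Req:respond({405, [], "method not allowed"})
    end.

demo() ->
    ok = handle(new('GET', "/users")),
    {200, [], "you asked for /users"} = erase(stub_response),
    ok = handle(new('POST', "/users")),
    erase(stub_response).
```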

Note how I use the process dictionary to store values that I will later retrieve to assert that everything went as expected. By using parameterized modules, I have been able to reach nearly 100% code coverage. Does this mean parameterized modules are the best thing since sliced bread?

Well, so much for the free lunch as there are some drawbacks to consider:
  • Experimental - Parameterized modules are still officially considered an experimental feature of Erlang, hence subject to change. Unlike the Java world where everything is kept forever just in case, Erlang doesn't patronize developers, so if this feature is one day bound for oblivion, it will be tossed out. And quickly.
  • Unchecked - Unlike with a direct reference to a module's function, compile-time checking is not available, leading to possible bad surprises at runtime. If the parameterized module reference your code uses does not expose the expected function, you're in for a nasty error. In fact, you can happily pass a reference to a Foo module while your function expects a totally unrelated Bar module. As a tentative mitigation, I have added a verification function to my modules so they ensure at start-up time that they are correctly wired. This feels like framework envy, so I'm not fully satisfied with this approach.
  • Confusing - Because the actual module is not directly referred to, reading such code becomes more complicated. You have to infer from the context (or some coding conventions, or even comments) which module will actually be wired in at runtime. Decreasing understandability is definitely not a good thing.
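The start-up verification mentioned under "Unchecked" could take many forms. One possible sketch, not the function I actually use, checks the exports of a wired-in module against a list of required {Function, Arity} pairs (wiring_check and verify/2 are hypothetical names):

```erlang
-module(wiring_check).
-export([verify/2, demo/0]).

%% Returns ok when Module exports every {Function, Arity} in Required,
%% otherwise {error, {missing, MissingPairs}}.
verify(Module, Required) ->
    Exports = Module:module_info(exports),
    case [FA || FA <- Required, not lists:member(FA, Exports)] of
        [] -> ok;
        Missing -> {error, {missing, Missing}}
    end.

demo() ->
    %% The standard lists module stands in for a wired-in dependency.
    ok = verify(lists, [{reverse, 1}, {map, 2}]),
    verify(lists, [{reverse, 1}, {no_such_function, 3}]).
```

Such a check only proves that names and arities line up; it says nothing about whether the module behaves as expected.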

Despite these downsides, I still believe that the complete MUT isolation and behavior-swapping facilities offered by parameterized modules make them a very interesting tool for the test-minded Erlang developer.


Closing notes

MUTs have other kinds of dependencies that you will want to substitute at unit testing time. To name a few:
  • Process dependencies - A MUT can contain functions that directly depend on other processes via their PIDs (process IDs). An interesting seam here is the local registry of processes (and ports) that you can use to set-up test processes and register them under the same name as the ones used at runtime.
  • Mnesia - Stubbing out calls from the controllers to the DAO is a good strategy, but what about the DAO itself? Instead of stubbing out each Mnesia call, I have opted for running it in-memory at unit test time (à la HSQLDB) and activating file persistence only at runtime. This is extremely fast, so very well suited to the task.
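The process registry seam can be sketched like this, assuming the production code talks to a process registered as event_logger (a made-up name, as are seam_registry and log_event/1):

```erlang
-module(seam_registry).
-export([log_event/1, demo/0]).

%% Function under test: sends to whatever process is registered as
%% event_logger and cannot tell a real logger from a test double.
log_event(Event) ->
    event_logger ! {log, Event},
    ok.

demo() ->
    Test = self(),
    %% Test double: forwards the first message it gets to the test process.
    Stub = spawn(fun() ->
        receive Msg -> Test ! {stub_received, Msg} end
    end),
    %% Register the double under the name used at runtime.
    true = register(event_logger, Stub),
    ok = log_event(user_created),
    receive
        {stub_received, Got} -> Got
    after 1000 ->
        timeout
    end.
```

The stub exits after forwarding one message, which also frees the registered name for the next test.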

Finally, if you wonder what unit testing framework I am using, I have opted for etap, which I find very simple and powerful enough for my needs. If you want something more structured and feature-rich, EUnit is the answer.

Free fall is over: I have found my seams and landed seamlessly. Please share your own test-infected adventures in Erlang.


UPDATE 23-SEP-2009: Hot code swapping is also a very powerful seam that has been smartly leveraged to create ErlyMock, a quite capable mock framework for Erlang.