Saturday, April 26, 2008

Not Dash Bored

I love dashboards. You probably got that from my previous post. If I had enough screens, I would be surrounded by dashboards of all kinds: work task lists, continuous integration server control panels, project metrics sites and server monitors.

Maybe this comes from the time when I was handling another kind of dashboard, one on which my life directly depended!

(Yes, I am a pre-glass cockpit flying dinosaur)

Dashboards are great because they present a synthetic view of a situation in a form that is visually expressive and does not require a lot of concentrated attention to capture relevant and crucial information.

The most recent dashboard I built exposes particular aspects of several instances of Mule ESB I have under my control. Through JMX, Mule exposes a wealth of statistics about the different components of a particular instance. Here is a small portion of the HTML console that displays this information:


This is more information than the brain, or at least my brain, can digest efficiently and quickly enough. Hence I created a simplified view that represents the variations of message routing statistics on a selection of components:


The components are simply selected by name: I decided to prefix the important ones with "process" and "dispatch" and derived from there a simple selection pattern. I use different colors to show different states:
  • Green: no activity,
  • Yellow: at least one message went through,
  • Orange: the backing event queue has been resized up,
  • Red: an error has been routed,
  • Gray: no delta available (first call of the dashboard),
  • Black: component statistics unavailable.
Extra symbols represent a non-active state of a component, such as paused or stopped. The dashboard itself is the aggregation (frames, yuck!) of the HTML output of a specific component deployed alongside the other ones.
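
The color rules above can be sketched as a small classification function. This is only an illustration, not the code of the actual dashboard: the `Snapshot` fields are hypothetical stand-ins for the counters Mule exposes over JMX, the precedence between colors (error first, then queue resize, then activity) is an assumption, and so is the use of `startsWith` for the name selection.

```java
// Illustrative sketch only: names and precedence rules are assumptions.
final class DashboardColors {

    enum Color { GREEN, YELLOW, ORANGE, RED, GRAY, BLACK }

    // Hypothetical snapshot of one component's routing statistics.
    static final class Snapshot {
        final long routedMessages;
        final long routedErrors;
        final int queueSize;

        Snapshot(long routedMessages, long routedErrors, int queueSize) {
            this.routedMessages = routedMessages;
            this.routedErrors = routedErrors;
            this.queueSize = queueSize;
        }
    }

    // Components are selected by name prefix, as described in the post.
    static boolean isSelected(String componentName) {
        return componentName.startsWith("process")
            || componentName.startsWith("dispatch");
    }

    // previous == null means first call of the dashboard (no delta yet);
    // current == null means the component statistics are unavailable.
    static Color classify(Snapshot previous, Snapshot current) {
        if (current == null) {
            return Color.BLACK;
        }
        if (previous == null) {
            return Color.GRAY;
        }
        if (current.routedErrors > previous.routedErrors) {
            return Color.RED;
        }
        if (current.queueSize > previous.queueSize) {
            return Color.ORANGE;
        }
        if (current.routedMessages > previous.routedMessages) {
            return Color.YELLOW;
        }
        return Color.GREEN;
    }
}
```

Working on deltas between two snapshots, rather than on absolute counters, is what lets a component go back to green once it becomes quiet again.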

Of course, this does not compare to the professional grade monitoring tools you can buy from MuleSource, but is already handy for deployments of limited scope and criticality.

I think the most interesting aspect of this dashboard is how quickly you can develop the ability to tell a normal behavior pattern from a faulty one. It is pretty much like reading the matrix undecoded... Now how could you get bored with such a board!

UPDATE 03-MAY-2008: This dashboard is now available on MuleForge.

Sunday, April 20, 2008

Building Value

This week, one of my colleagues (a guy named Josh) was all grumpy about the time he spent adding documentation to his projects' Maven sites so operations could deploy his application properly. This made me reflect on how great this tool is for building not only software but value in general.

By giving developers the opportunity to document in the same environment as where they code and by embedding the HTML rendition of this documentation in the project site alongside all the other technical reports, Maven presents to the stakeholders an overview of the value built by a project.

Value is a vague term, so let me be more specific:
  • Intrinsic value: technical metrics, like the ones coming from static bug analysis, package dependencies, test coverage or code style compliance, represent the core value of the code in terms of quality, flexibility and maintainability.
  • Business value: results from acceptance testing tools like Selenium or FitNesse represent the capacity of the application to satisfy business requirements.
  • Corporate value: on top of the auto-generated technical documentation from the code base, all the extra documentation that is added, whether it is installation guides, monitoring procedures or deployment diagrams, brings value to the company as a whole, from operation teams to new recruits.
With all this goodness available, I am still baffled by the limited number of managers who pay attention to the reports generated by Maven. I imagine that these reports might be too technical for "generic" managers, who are in the business of software as if they were in the business of gravel and stone delivery. Moreover, the usual metrics for software development generally focus only on feature delivery and deadlines.

This said, thanks to continuous integration and the dashboard plugin, I believe it is possible to catch the interest of a broader audience because it is now possible to display trends instead of static values: management understands trends.

For example, a flat test coverage value is meaningless, but a trend showing that it increases means that quality, and thus value, is increasing too. Similarly, comparing projects based on their metrics is nonsense, while comparing their trends makes sense.

Have you had any good experience sharing Maven sites with management? Did they get the feeling that value was being built?

Saturday, April 12, 2008

Abstraction First

When designing services, the common wisdom is to opt for a contract-first approach, instead of an implementation-first one. There is no question that this is a valid approach but I think the emphasis should be put on the necessity to design a good abstraction first.

Consider this: technical implementations, especially in strongly typed languages, often result in the pollution of the client model by the service model. Interfaces or stubs used by a client to perform remote invocations can very easily become mixed with its own domain.

This creates a tension between a service provider and its consumers, as consumers often tend to pull the contract too far toward their side because they consider the local crystallization of the contract to be a part of their model. Exacerbated, this tendency can result in tight coupling between both parties.

Let me give you an old but typical example from my tumultuous past.

A little more than a decade ago, I worked on a corporate centralized contact management system. It was supposed to serve contact details (persons, organizations, addresses and whatnot) to all the applications in use in the company, including secretaries' word processors. Admittedly an interesting project, it in fact quickly turned out to be a death march. I soon learned this was the sixth attempt at such an endeavor and that people wanted to see how a n00b like me would fare.

I then realized that each department wanted very different things out of this system and that their views would not be reconcilable. I ended up shipping a version that was only usable by the secretaries, which I think was a smart move as you must always be good to them!

The next n00b was assigned the creation of version 7 of the contact manager, building on what I did with the goal of generalizing it to all departments. Of course it failed and the project disappeared forever. Maybe the mythical number 7 had to be reached before the whole thing could be killed.


At that time, had I known better, I think the best I could have done would have been to advocate for the deployment of an LDAP provider. Indeed, a directory server accessed via this protocol does not try to be everything for everybody and does not present a contract that any application would consider using in its own domain model. Yet it offers a simple and powerful abstraction that an application can use to query directory information and then tie it to its own object model.

Let me quote Uncle Bob:
"Abstraction is the elimination of the irrelevant and the amplification of the essential"

To me this sounds almost like a caricature, where the most prominent features are made so obvious and visible that there is no doubt left about what really matters. A good abstraction for a service should then translate into clear intent and well-defined boundaries, which would guide the creation of valuable contracts.

Finally, for all these existing services that want to invade your application domain, there are fortunately ways to remain insulated. For example you could:
  • use dynamic language scriptlets to perform remote invocations and set values on your local domain,
  • use Dozer to tie client stubs and your objects,
  • consider invocation responses as raw XML and extract values out of them with XPath.
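
The last option can be sketched with the XPath API that ships with the JDK. This is a minimal illustration under stated assumptions: the XML layout (`contactResponse`, `fullName`, `email`) and the `Contact` class are hypothetical, invented for the example and not taken from any real service.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch only: element names and the Contact class are assumptions.
final class ContactResponseReader {

    // Local domain object, free of any service-generated stub type.
    static final class Contact {
        final String name;
        final String email;

        Contact(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Treats the service response as raw XML and picks out only what we need.
    static Contact fromXml(String responseXml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            String name = xpath.evaluate("/contactResponse/contact/fullName", document);
            String email = xpath.evaluate("/contactResponse/contact/email", document);
            return new Contact(name, email);
        } catch (Exception e) {
            // Parsing and XPath failures are reported as one unchecked error.
            throw new IllegalArgumentException("Unreadable contact response", e);
        }
    }
}
```

The only coupling left with the service is the pair of XPath expressions: extra elements in the response go unnoticed, and the local `Contact` class never sees a service-side type.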

Saturday, April 05, 2008

Fruit and coffea? I don't think so...

Last year, a guy named Marc Fleury ranted about how the Macintosh was not a suitable platform for software development. At the time I thought he just hated it because he is French, and French people hate everything. Let me quote him:

"a mac is like a bimbo". It looks good and shiny from a distance, you think you really want to try it. But once you do, after 2 weeks of "doing it" you are bored, bored to tears. Tired of everything, the pretty animation stuff, the big tatas, the transparent look and feel, the big tatas, the genie bullshit animation, the big tatas, the stuff that is different from windows "just to be different", the big tatas and the empty brains.

Nowadays, I am the one who daily (hourly) complains about how clunky and inappropriate Mac OS X is for Java developers. People around me roll their eyes and probably assume I just hate it because I am French too, and French people hate everything.

But, boy, since I started to use this platform professionally, how much of a hindrance to my development activities I find the Mac to be! How often the OS gets in my way and pulls me out of the state of flow I am in...

Here is a non-exhaustive list of my grievances, so you can decide if it is mere ranting or if there are some valid reasons for my grumbling:
  • Bitter Java: official JDK support is way behind other OSes. For example, the Apple JDK 6 was beta when I was on Tiger and disappeared on Leopard. Sure I could follow the great work of Landon Fuller with Open JDK, but I honestly do not have time to invest in such endeavors.
  • Keyboard support is a joke: even if I am using QuickSilver, I constantly have to grab the mouse to click this, highlight that or dismiss a pop-up. The last one really drives me nuts: why is it that the escape key does not always cancel a pop-up dialog?
  • Focus messy: there is always an application that steals the focus out of my working window. Maybe this is the fault of badly behaving applications, but it is the first OS where this happens to me so often that I get annoyed by it. Is this OS making it easier for applications to become focus-rude?
  • Prompt to freeze: the UI freezes very easily, to the point that I cannot even switch applications or invoke the task killer. It seems that a badly behaving application can very easily mess up the whole user interface, hence the whole OS.
  • Hidden BSD: it is hard to say this but as far as the overall stability is concerned, OS X reminds me of Windows ME sans Blue Screen of Death. I have to do a hard reboot at least once a week, whether the screen saver locks me out forever or the UI freezes to the point that I cannot do anything except sit on the power button. I have not seen this in any OS for almost a decade. And having this kind of behavior on an almost newly re-installed machine, with minimal applications running, is a plain disappointment.
  • Finder sucks big time: for an OS that is supposed to be all about user experience, I find the Finder to be a complete disgrace. Try to create a new directory: it never ends up where you want it. Try to shift-select files with the keyboard: going up and down performs some counter-intuitive file selections. Then use the mouse to drag the selected files: they might end up where you drop them, or not. Instead of adding a new view in Leopard (cover flow), if only Apple could have fixed the existing ones so they become really usable (who uses the insane column view?).
  • Flaky AirPort: I am constantly losing connectivity with my Wi-Fi router. Maybe my cheap D-Link router is the issue, but then why do my other non-Mac machines have no problem keeping their connections up and running for hours?

I will not mention Eclipse that dies unexpectedly on Leopard while it was stable on Tiger (yes I have configured the JVM memory parameters, thank you). I will not mention the disgrace that is Entourage, because it is a Microsoft product and Apple cannot be blamed for it. And I will not mention that my MacBook Pro hard drive fried just after a year (bye-bye guarantee), which is the first time something like this has happened to me in the ten years that I have been working on laptops.

'Nuff said! So what are my platforms of choice for Java development? In order of preference: Kubuntu, Windows XP and... Mac OS X. But at home, I am very happy with the little white MacBook we use for browsing, e-mailing and managing photos. For this kind of home activity, having an OS that shows off is acceptable. For professional usage, the less the OS gets in your face, the better it is.

Wednesday, April 02, 2008

Healthy Health Checks?

Here is a little story that happened to a friend of mine a few years ago. Users of his web application started to complain about the system being broken and not responding anymore. The curious thing was that the operation team was not aware of any issue. After checking with them, it appeared that the monitor they had in place for this web application was simply checking if an HTTP response was received. Any response. Even a 500 one!

This sounds naive and ridiculous but setting up application monitoring is a subject that is a little more hairy than it appears at first glance. Consider another more recent case that came to my attention: in this case, the application was still replying positively to its health check monitor but was not functioning properly, as it was unable to access required file system resources. Again, the end users were affected while the monitoring was happily receiving correct responses from the application.
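
On the monitor side, the first trap is easy to avoid by looking at what the response says instead of merely noticing that one arrived. Here is a minimal sketch, assuming the application prints a marker string in its health page; the `STATUS=OK` marker and the `ResponseCheck` name are hypothetical.

```java
// Illustrative sketch only: the marker string is an assumed convention.
final class ResponseCheck {

    // Hypothetical text the application is assumed to return when it is sane.
    static final String EXPECTED_MARKER = "STATUS=OK";

    // A response counts as healthy only if it is a 2xx AND carries the marker.
    static boolean isHealthy(int httpStatus, String body) {
        boolean successStatus = httpStatus >= 200 && httpStatus < 300;
        return successStatus && body != null && body.contains(EXPECTED_MARKER);
    }
}
```

This does nothing for the second case, where the application itself answers positively while being broken inside: that part has to be solved on the application side.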

So how can we, software developers, create health checks that operations can rely on?

Taking the canonical multi-tiered web application as an example, the following diagram shows a health check that is too shallow to be useful (in red) and one that exercises the full layer depth (in green).
While it is clear that the shallow approach brings little value as far as end-user quality of service is concerned, why don't we always shoot for the deep approach then?

Well, if you consider how a serious load balancer appliance (like BIG-IP) works, you will realize that it performs health checks very regularly (by default every 5 seconds) in order to have the most up-to-date view of the sanity of the members of the pools it handles. Bearing this in mind, if a health check request exercised the full depth of an application, you would have a permanent load added to your system, which would increase the strain on your various resources, down to the database itself. With a farm of n servers, the cumulative strain induced by the health check requests on all its members would start to be non-negligible on any shared resource.

My take on this would be the following: create an internal watchdog that evaluates the sanity of the application at a reasonable pace and report the current state of this watchdog when a monitor requests a health check from the application.

As shown in the above diagram, the watchdog life cycle is decoupled from the health check one, which reduces the strain on the underlying resources while allowing the monitoring environment to become aware of an application issue almost as soon as the application realizes it itself (because the monitor polling frequency will be kept high).
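
A minimal sketch of such a watchdog follows, assuming the deep check can be expressed as a `BooleanSupplier`. The class and method names are hypothetical, and a real implementation would probably also want to flag a watchdog that has stopped running.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Illustrative sketch only: names and scheduling choices are assumptions.
final class Watchdog {

    private final BooleanSupplier deepCheck;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "watchdog");
            thread.setDaemon(true); // never keeps the JVM alive on its own
            return thread;
        });

    // Unknown is treated as unhealthy until the first evaluation has run.
    private volatile boolean healthy = false;

    Watchdog(BooleanSupplier deepCheck) {
        this.deepCheck = deepCheck;
    }

    // Runs the expensive check at the watchdog's own, slower pace.
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::evaluate, 0, period, unit);
    }

    // Exercises the full depth once and caches the outcome.
    void evaluate() {
        boolean result;
        try {
            result = deepCheck.getAsBoolean();
        } catch (RuntimeException e) {
            result = false; // a failing check means an unhealthy application
        }
        healthy = result;
    }

    // What the health check endpoint returns: a cheap read of the cached state.
    boolean isHealthy() {
        return healthy;
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

The health check page would simply call `isHealthy()`, so the load balancer can poll it every few seconds while the database and other shared resources are only hit once per watchdog period, for example with `watchdog.start(30, TimeUnit.SECONDS)`.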

What is your own experience in this field and what is the path you have followed in order to build dependable health checks?