Saturday, October 18, 2008

Two-Minutes Scala Rules Engine

Scala is really seductive. To drive my learning of this JVM programming language, I thought I should go beyond the obvious experiments and try to tackle a subject dear to my heart: building a general purpose rules engine.

Of course, the implementation I am going to talk about here is naive and inefficient but the rules engine is just a pretext here: this is about Scala and how it shines. And it is also a subset of what Scala is good at: I mainly scratch the surface by focusing on pattern matching and collections handling.

The fact representation I use is the same than the one used by RuleML: a fact is a named relationship between predicates. For example, in "blue is the color of the sky", color is the relationship between blue and sky. The rules engine matches these facts on predefined patterns and infers new facts out of these matches.

In Scala, I naturally opted for using a case class to define a fact, as the pattern matching feature it offers is all what is needed to build inference rules:
Normally you do not need to implement your own hashcode functions, as Scala's default ones are very smart about your objects' internals, but there is a known issue with case classes that forced me to do so.

For rules, I opted to have them created as objects constrained by a particular trait:
The method of note in this trait is run which accepts a list of n facts and returns zero (None) or one (Some) new fact ; with n equals to the value of the arity function.

So how is the list of facts fed to the rule run method created? This is where the naive and inefficient part comes into play: each rule tells the engine the number of facts it needs to receive as arguments when called. The engine then feeds the rule with all the unique combinations of facts that match its arity.

This is an inefficient but working design. The really amazing part here is how Scala shines again by providing all what is required to build these combinations in a concise manner. This is achieved thanks to the for-comprehension mechanism as shown here:
Notice how this code is generic and knows nothing about the objects it combines. With this combination function in place, the engine itself is a mere three functions:
With this in place, let us look at the facts and rules I created to solve the classic RuleML discount problem (I added an extra customer). Here is the initial facts definition:
And here are the two rules:
With this in place, calling:
produces this new fact base with the correct discount granted to good old Peter Miller but not to the infamously cheap John Doe:
This is truly a two minutes rules engine: don't ship it to production! Thanks to the standard pattern matching and collection handling functions in Scala, this was truly a bliss to develop. On top of that, Scala's advanced type inference mechanism and very smart and explicit compiler take strong typing to another dimension, one that could seduce ducks addicts.

So why does Scala matter? I believe the JVM is still the best execution platform available as of today. It is stable, fast, optimized, cross-platform and manageable. An alternative byte-code compiled language like Scala opens the door to new approaches in software development on the JVM.

Closing note: an interesting next step will consist in parallelizing the engine by using Scala actors. But this is another story...