Scala is well known as a concise, readable programming language. One of the reasons is that many popular libraries offer their users DSLs to work with. These convenient APIs make creating programs simpler by saving our keystrokes and (most importantly) by improving readability. But it’s not always the case … Sometimes we have to work with what we were given and sometimes it might be a verbose Java API. I will show you how you can make development easier by abstracting away rough edges of the underlying code.
To demonstrate my point I will use Encog. Encog is a great library for machine learning and neural networks created by Jeff Heaton. It provides C#, C++ and Java bindings. We will work a bit with the last one. Encog is powerful and I would really like to see it as part of the Scala ecosystem. Of course we can work with the Java bindings directly, but their verbosity makes it hard to use both for Scala hAkkers (accustomed to simple, convenient APIs) and newcomers without a lot of programming experience.
Our DSL will be incomplete and only cover a small portion of Encog. Our main goal will be to make the code simple, both for maintainers and end users. Note, we won’t resort to any magic here. We won’t use macros, byte code manipulation or any other “magic techniques”. We will only use simple features, the Scala type system and composition.
First let us take a look at the Encog XOR example
As you can see it’s precisely the same Java example that you can find in the wiki but written in Scala. I even left the semicolons in place. It’s a lot of code for a simple XOR network. It’s also a bit hard to read if you are not accustomed to Encog. What does
null mean in this example? Is it safe to pass it there? Why didn’t we pass null into the other layers? What is the role of that boolean flag we are passing with
ActivationSigmoid? What does 2 mean in the first layer? Is this noisy
do ... while loop really needed?
That’s a lot of questions… let’s make things simpler using Scala.
This is the same example written using our DSL
I think we can agree that this example is a bit shorter and also much simpler. Our new example does exactly what it says:
Take the input matrix and put it into the network we defined a bit earlier, make sure it is trained using resilient propagation until the error is smaller than 0.01, and hopefully get the ideal matrix as a result.
We will dissect this example into smaller pieces and explain the implementation throughout the rest of this article.
We added custom syntax to simple Scala
Arrays. Why is that? First of all we want the Arrays to be dense. Using the classic approach it was easy to create something like this
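As a minimal sketch of the problem (the values and the `JaggedSketch` wrapper are made up for illustration), plain nested arrays happily accept rows of different lengths:

```scala
object JaggedSketch {
  // Compiles without complaint, although the rows have 3, 1 and 2 columns.
  val a: Array[Array[Double]] = Array(
    Array(1.0, 2.0, 3.0),
    Array(4.0),
    Array(5.0, 6.0)
  )

  // The type Array[Array[Double]] says nothing about row widths,
  // so the mismatch can only be detected at runtime.
  val rowWidths: List[Int] = a.map(_.length).toList
}
```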
As you can see each sub-array in a has a different number of columns… But a was supposed to be a matrix! Code expecting it to be one will crash at runtime. This clearly isn’t type-safe…
And here is our code
| operator separates values in a row and
\\ – rows within the matrix. The parentheses are there for grouping. The code won’t compile if any of the rows is of a different length than the others. How do we achieve this? The idea behind the implementation is ridiculously simple.
The whole magic starts with defining a class
EncogArray1 which holds 1 element and defines the concatenation operator
|. This operator is also very simple - the only thing it does is create an instance of
EncogArray2 and pass its own value and the argument into it. All of the classes do exactly the same thing … simple as that. Of course we didn’t define classes up to a thousand, but a limit of 5 works fine for our requirements. We made the first class in the chain implicit so users don’t have to bother with explicitly instantiating the class; instead they can use
| right away.
Here’s a simplified version - I removed all the methods I use internally and the type-level stuff
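The sketch below illustrates the idea under stated assumptions. The names `EncogArray1` to `EncogArray3` follow the text, but the bodies, the `Matrix2` type and the `toArray` helper are hypothetical and are not the library's actual code. It stops at three elements to stay short.

```scala
object ArrayDslSketch {
  // Implicit entry point: lets a bare Double start a row with `|`.
  implicit class EncogArray1(val a: Double) {
    def |(b: Double): EncogArray2 = EncogArray2(a, b)
  }

  // One class per row width; `|` always returns the next wider class.
  final case class EncogArray2(a: Double, b: Double) {
    def |(c: Double): EncogArray3 = EncogArray3(a, b, c)
    // Only another 2-element row is accepted, so row widths must match.
    def \\(next: EncogArray2): Matrix2 = Matrix2(Vector(this, next))
  }

  final case class EncogArray3(a: Double, b: Double, c: Double)

  // Hypothetical holder for a matrix whose rows all have two columns.
  final case class Matrix2(rows: Vector[EncogArray2]) {
    def \\(next: EncogArray2): Matrix2 = Matrix2(rows :+ next)
    def toArray: Array[Array[Double]] = rows.map(r => Array(r.a, r.b)).toArray
  }

  // Parentheses are needed because `\\` binds tighter than `|`.
  val matrix: Matrix2 = (1.0 | 2.0) \\ (3.0 | 4.0) \\ (5.0 | 6.0)
  val row3: EncogArray3 = 1.0 | 2.0 | 3.0
  // (1.0 | 2.0) \\ (3.0 | 4.0 | 5.0) would not compile: EncogArray3 is not an EncogArray2.
}
```

Because each row width is a distinct type, a row of the wrong length is a type error rather than a runtime surprise.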
The other operators
\\ (for 2d concatenation) and
\\\ (for 3d) are defined using the same trick as
|. As a result there’s no real difference between them and you can use all of them in the same “level” like this
1.0 | 2.0 \\ 3.0 \\\ 4.0. But I would advise not to mix them up or it will get messy. Another thing - remember to use parentheses to denote dimensions. One corner case you probably noticed is the 4 rows with a single column matrix from the example. In Scala expressions like
(1) are equivalent to plain 1, so we had to come up with a convenient syntax for that - if you specify a
Tuple1(1) our Encog DSL will understand that you want a row with a single element.
Full code is available in PrimitiveValuesImplicits
Our goal here is to redefine the way networks are created so that it’s easier to read and harder to make a mistake. Let’s look at the code first
In this example
network is an instance of
LayersHolder class, which you can think of as a thin wrapper over
Seq[EnrichedBasicLayer]. It provides helper methods to combine (via the
+ operator) and transform (internal
network method) provided layers. So what is an
EnrichedBasicLayer? It’s an implicit class defining a few overloaded methods named
having wrapped around an
ActivationFunction (you don’t have to know what it is, but for now you can assume
ActivationSigmoid is one). A notable exception is the
InputLayer which is simply a function returning an
EnrichedBasicLayer prepared to be the input layer. Each
having creates and returns a modified copy of
this. This way we can achieve a fluent English-like API. We are returning the same class over and over again, because we want to give the user more flexibility. Soon enough we will explore another way of handling things.
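To make the copy-returning style concrete, here is a minimal sketch that does not depend on Encog. Everything in it is hypothetical: `Activation` and `Sigmoid` stand in for Encog's activation functions, `Layer` stands in for `EnrichedBasicLayer` (a plain case class here, not an implicit wrapper), `Layers` stands in for `LayersHolder`, and the parameters accepted by `having` are assumptions.

```scala
object LayerSketch {
  // Hypothetical stand-ins for Encog's activation functions.
  sealed trait Activation
  case object Sigmoid extends Activation

  final case class Layer(activation: Activation, neurons: Int = 0, bias: Boolean = false) {
    // Each overload returns a modified copy; `this` is never mutated.
    def having(count: Int): Layer = copy(neurons = count)
    def having(withBias: Boolean): Layer = copy(bias = withBias)
    // Combining two layers starts a holder.
    def +(next: Layer): Layers = Layers(Vector(this, next))
  }

  // Thin wrapper over a sequence of layers.
  final case class Layers(layers: Vector[Layer]) {
    def +(next: Layer): Layers = Layers(layers :+ next)
  }

  val base: Layer = Layer(Sigmoid)
  val hidden: Layer = base.having(4).having(true)
  val network: Layers = base.having(2) + hidden + base.having(1)
}
```

Since every `having` returns the same type, calls can be chained in any order, which is the flexibility mentioned above.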
Again nothing fancy here. You can browse the full code in EncogImplicits
Now let’s discuss the backbone of our DSL. Up to this point we were lazy. We defined some helper structures and some syntax, we created a bunch of things and prepared the data. But no actual computation happened. Until now.
The “magic” here is defined in two parts. First we define the way we want to compute the result (
procedure). Then we force the computation using
get(). The long sentence we used to define the computation is really a chain of function invocations. The
input variable is implicitly transformed into
EmptyNeuralNetworkStructure behind the scenes. The road from that point is simple.
EmptyNeuralNetworkStructure defines an
into method that returns an instance of a class defining
using, which in turn returns an instance of another class with another method, and so on. At the end of this chain we will have an
EmptyNeuralNetworkStructure ready to be trained. Training occurs when
get() is called and a pair of trained network and training data set is returned wrapped in a case class. Here we used a stricter approach to creating a fluent API than with layers: the DSL forces us to compose functions in a specific order. We do that because this use case is far more complex and we wanted to avoid confusing users.
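The staged approach can be sketched without Encog as follows. This is an illustration under loud assumptions: the method names `into`, `using` and `get` come from the text, while `until`, every class name, and the fake "training" loop (which just shrinks a number by a fixed factor) are hypothetical stand-ins.

```scala
object FlowSketch {
  // Implicit entry point: plays the role the text assigns to EmptyNeuralNetworkStructure.
  implicit class Start(data: Seq[Seq[Double]]) {
    def into(network: Network): WithNetwork = new WithNetwork(data, network)
  }

  final case class Network(name: String)
  final case class Trained(network: Network, data: Seq[Seq[Double]], epochs: Int, error: Double)

  // Counts how often the (fake) training actually ran.
  var trainingRuns: Int = 0

  // Each stage exposes exactly one method, so calls must come in this order.
  final class WithNetwork(data: Seq[Seq[Double]], network: Network) {
    def using(shrinkFactor: Double): WithTraining = new WithTraining(data, network, shrinkFactor)
  }

  final class WithTraining(data: Seq[Seq[Double]], network: Network, shrinkFactor: Double) {
    def until(targetError: Double): Procedure = new Procedure(data, network, shrinkFactor, targetError)
  }

  final class Procedure(data: Seq[Seq[Double]], network: Network, shrinkFactor: Double, targetError: Double) {
    // Nothing is computed until get() is called.
    def get(): Trained = {
      trainingRuns += 1
      var error = 1.0
      var epochs = 0
      while (error >= targetError) { // stand-in for real training iterations
        error *= shrinkFactor
        epochs += 1
      }
      Trained(network, data, epochs, error)
    }
  }

  val input: Seq[Seq[Double]] = Seq(Seq(0.0, 1.0), Seq(1.0, 0.0))
  // Only describes the computation; no training has happened yet.
  val procedure: Procedure = input.into(Network("xor")).using(0.5).until(0.01)
}
```

Calling `FlowSketch.procedure.get()` is what finally runs the loop. Skipping a stage, for example calling `until` directly on `input`, would not type-check against these classes, which is how the specific order is enforced.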
Full source code for the flow is in StructureImplicits
And this is it. Although it might look hard at first glance, defining your own (basic) DSL is very often quite easy. One thing to notice is that the only Scala features that we used here are overloaded operators and implicits. We could achieve much more using advanced features of Scala … but this is a topic of Part 2 :)
Note 1. Before you jump into writing your very own DSL for every piece of code, please note that the main purpose of creating DSLs is to improve readability and reduce boilerplate code. Taken too far, domain-specific languages can have the opposite effect. Keep that in mind.
Note 2. Again, special thanks to Jeff Heaton and everyone on the Encog team. You rock!