The VAPOR principles for functional DSL design

Posted: September 24, 2012 in coding, design, Uncategorized

The SOLID principles are recognised for encapsulating (pun intended) many important aspects of OOP design.

I have recently been exploring some similar principles for designing software components in the form of functional DSLs in Clojure. I’m provisionally calling these the VAPOR principles.

The principles are:

  • Value-based programming
  • Abstraction maximisation
  • Pluggable / composable
  • Orthogonal
  • Readable expression

These are intended to apply to the general design of Clojure software components in the form of DSLs. I’m using the DSL term here quite broadly: not just to mean embedded languages, but the general way that you choose to express higher level constructs in your code base. You can regard it as API design if you like.

Below I will go into a little detail of what I mean by each of these principles:

Value-based programming

The first principle is that the entities representing / produced by your API should be values – in the sense of immutable values.

I’m using the word “entities” in a deliberately vague way because the concept is independent of the specific concrete implementation. Examples would be:

  • Pure functions – any pure function implementing clojure.lang.IFn is a potentially good candidate for an entity in Clojure. Much of the clojure.core functionality is provided using higher order functions in this way.
  • Maps – are a good example of using Clojure’s built-in immutable data structures. Ring is a great example of using maps to represent requests / responses for web server applications.
  • (Lazy) sequences – make sense as values (assuming the generation of the sequence does not have side effects!). They are very useful for applications producing arbitrary-sized result sets (potentially even larger than memory).
  • Records – A good choice for expressing DSL values. As an example, Clisk uses records to represent intermediate AST nodes for image generators. These can be used to generate images, or combined to create more sophisticated image generators.

The use of immutable values is a fundamental principle of functional programming. For a great discussion of the principles and motivation behind values, the talk by Rich Hickey is excellent.

In the design of DSLs/APIs, I think that this has a few special implications:

  • API entities should be expressed as values. In Clojure, this likely means that you should represent them as a map, a record or a pure function.
  • There should be no mutable state hidden in your API. No mutable globals, accumulators etc.
  • If you do any mutation internally (e.g. while calculating a result) then this should be invisible to the user. So using Clojure transients is OK.
  • If you follow this rule, then you should always be able to assign the result of a DSL/API operation to a symbol for later use. This gives you flexibility in composing elements of your DSL.
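As a rough illustration of these points, here is a minimal sketch. The names `circle`, `scale` and `radii` are made up for this example and do not come from any real library. Entities are plain maps, the only mutation is a transient that never escapes the function, and every result can be bound to a symbol.

```clojure
;; Hypothetical DSL entity: a "shape" is just an immutable map.
(defn circle [radius]
  {:type :circle :radius radius})

(defn scale
  "Returns a new shape; the original value is untouched."
  [shape factor]
  (update shape :radius * factor))

;; Internal mutation via a transient is invisible to the caller:
;; the function takes values and returns a persistent vector.
(defn radii [shapes]
  (persistent!
    (reduce (fn [acc s] (conj! acc (:radius s)))
            (transient [])
            shapes)))

;; Results are values, so they can be bound to symbols and reused.
(def small (circle 1))
(def big (scale small 10))

(radii [small big]) ; => [1 10]
```

Note that `small` is still `{:type :circle :radius 1}` after `big` has been derived from it, which is what makes later composition safe.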

There may be some exceptions to this rule, but they should be used very sparingly. In particular, I think that if you are going to break this principle then you should only do so when all the following conditions are met:

  • There is an absolute requirement for mutability. Potential reasons might include having to interact with an external mutable object (e.g. a database or a Java API you don’t control).
  • The mutable entities are passed in as a parameter by the user of your code. This way the user of the API gets to control the scope of mutability explicitly.
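A small sketch of the second condition, using an atom as a stand-in for whatever mutable object is genuinely required. The function name `record-hit!` is invented for this example.

```clojure
;; Hypothetical API function: the mutable thing (an atom holding a map
;; of counters) is supplied by the caller, never created or hidden here.
(defn record-hit!
  "Increments the count for `page` in the caller-supplied atom `stats`
  and returns the new count for that page."
  [stats page]
  (get (swap! stats update page (fnil inc 0)) page))

;; The caller owns the atom and therefore controls its scope.
(let [stats (atom {})]
  (record-hit! stats "/home")
  (record-hit! stats "/home")
  @stats) ; => {"/home" 2}
```

Because the atom is a parameter, two callers can never interfere with each other unless they deliberately share the same atom.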

Abstraction maximisation

A central principle of good software design is that you should program against abstractions, and this deserves a place as our second principle.

Maximising the level of abstraction means minimising the unnecessary details – the abstraction should have the minimum possible surface area necessary to express the desired concept. There is a quote attributed to Einstein that runs along these lines:

Everything should be made as simple as possible, but no simpler.

A good example is Clojure’s own sequence abstraction, which at its heart is based on just two core operations – first and next. But because this abstraction is so small and universal, it can be used in a huge variety of ways.
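To make this concrete, here is a sketch of a function written only against `seq`, `first` and `next`. The name `count-matching` is hypothetical; the point is that nothing in it depends on the concrete collection type.

```clojure
;; Written only against `seq`, `first` and `next`, so it works on
;; anything seqable: lists, vectors, strings, lazy sequences, nil.
(defn count-matching
  "Counts the elements of `coll` for which `pred` returns a truthy value."
  [pred coll]
  (loop [s (seq coll)
         n 0]
    (if s
      (recur (next s) (if (pred (first s)) (inc n) n))
      n)))

(count-matching even? [1 2 3 4])                    ; vector
(count-matching even? '(1 2 3 4))                   ; list
(count-matching #{\a \e \i \o \u} "abstraction")    ; string
(count-matching even? (range 10))                   ; lazy sequence
```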

A good video to watch on the topic of abstraction is

Pluggable / composable

The third principle relates to how entities in your DSL/API can be combined.

Pluggable combination is necessary for several reasons:

  • DSLs gain power by being able to compose simple primitives to create much more powerful operations. Consider Ring again as an example: by wrapping middleware around a simple web request handler, you can add complex functionality like cookie handling, security etc.
  • You often need to create functionality that integrates different APIs. It is then enormously helpful if these APIs have been written in a way that facilitates easy composability.

Achieving pluggability is strongly linked to the proper use of abstractions – if you build your APIs around the right abstractions, then it will be substantially easier for users to plug in other code that implements the abstraction.

It also implies that a DSL should come with some set of operators that are appropriate for plugging together entities in the DSL.
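The middleware idea can be sketched without depending on Ring itself. The following is only an illustration in the same style: the handler, the two wrappers and the "x-user" header are all invented for this example and are not part of Ring's API.

```clojure
;; A handler is a pure function from a request map to a response map.
(defn hello-handler [request]
  {:status  200
   :headers {}
   :body    (str "Hello, " (or (:user request) "stranger"))})

;; Hypothetical middleware: each takes a handler and returns a handler.
(defn wrap-user
  "Adds a :user key taken from a made-up x-user request header."
  [handler]
  (fn [request]
    (handler (assoc request :user (get-in request [:headers "x-user"])))))

(defn wrap-content-type
  "Sets a Content-Type header on the response."
  [handler content-type]
  (fn [request]
    (assoc-in (handler request) [:headers "Content-Type"] content-type)))

;; Because every piece has the same shape, ordinary function
;; application is the operator that plugs them together.
(def app
  (-> hello-handler
      wrap-user
      (wrap-content-type "text/plain")))

(app {:headers {"x-user" "Ada"}})
;; => {:status 200, :headers {"Content-Type" "text/plain"}, :body "Hello, Ada"}
```

Each wrapper can be added, removed or reordered independently, because the only contract between them is "handler in, handler out".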

Clisk again provides an example of operators being used to plug functionality together in a DSL: all of the functions in the Clisk DSL are really just about taking one or more image generating nodes and combining them to produce a more sophisticated image generating node. For example, the offset operator can be used with a noise generator to distort a checker pattern:

(offset vnoise (checker black white))

The power of such DSLs comes from the pluggability of the base abstractions. I’ll finish this section with another famous programming epigram that is closely related to this principle:

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.


Orthogonal

This principle reflects the fact that you want your DSL to focus on doing one thing well, and not get mixed up in trying to solve different problems along different dimensions.

In some ways this echoes Rich Hickey’s famous talk “Simple Made Easy” – by keeping things simple and focused you avoid “complecting” different issues that are better handled separately.

Often, failure to follow this principle works OK in the short run but ultimately results in various nasty symptoms:

  • It’s harder to test code because you also need to set up an environment that replicates or configures the orthogonal features.
  • If you have a complected solution A+B, and then find a library C that does the same as B but better, then you will find it hard to make A+C work. You are probably going to be stuck with A+B.
  • Pluggability gets compromised – if DSLs are not orthogonal then they are much less likely to plug together smoothly.
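One way to picture the A+B versus A+C situation is the sketch below. Everything in it is hypothetical: `order-total` plays the role of A (a pure calculation), and the reporting function passed in by the caller plays the role of B or C.

```clojure
;; Concern A: a pure calculation. It needs no environment to test.
(defn order-total [order]
  (reduce + (map (fn [{:keys [price qty]}] (* price qty))
                 (:lines order))))

;; Concern B: reporting, supplied by the caller as a plain function.
;; Any replacement C with the same shape plugs in without touching A.
(defn report-total!
  [report-fn order]
  (report-fn (str "Total: " (order-total order))))

(def order {:lines [{:price 10 :qty 2}
                    {:price 5  :qty 1}]})

(order-total order)            ; => 25
(report-total! println order)  ; prints "Total: 25"
```

Swapping `println` for a logging function or a test double requires no change to the calculation, which is the practical payoff of keeping the two concerns orthogonal.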

Taken to the extreme, failure to follow this principle leads to “frameworkitis” where a single framework attempts to do everything. I’ve written before about this problem in Composition over Convention, but basically you end up with a complex tangle of functionality that cannot be accessed independently.

Readable expression

The final principle is about making your DSL natural and usable for humans, not just machines.

Now readability can be somewhat subjective, but I think there are still some important techniques that should be applied:

  • Maximise expressiveness – Everything that is written should be there because it has meaning, not because it is part of some unnecessary ceremony. There should be no boilerplate. Note that this is not the same as attempting to minimise the number of characters, which is a recipe for unreadable code in most cases. It is more about the signal/noise ratio.
  • Prefer declarative statements – it’s easier for readers to comprehend the “what” that is intended rather than emulate a machine in their heads to decode the “how” of imperative code. core.logic is a great example of using declarative statements to express queries that would be extremely complex and difficult to read in imperative code.
  • Choose good names – with good names it should be fairly evident what your code is doing at a high level without readers needing to dive into the details.
  • Allow symbolic substitution – it helps enormously if the user can define an entity in your DSL, give it their own meaningful name and use it later. In Clojure, that means that your DSL should return entities that are values and can be let-bound to symbols (back to the V principle again).
  • Follow conventions – if there are good conventions in common use in your language, you should stick to them. Example: if you’re writing a Clojure DSL then I’d strongly recommend using vectors [] for anything that works like a binding form rather than inventing some wacky new syntax.
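Several of these techniques can be seen together in a tiny, made-up validation DSL. The functions `rule` and `failures` are assumptions for the sake of illustration: rules are declared as values, given meaningful names with an ordinary `let` (which uses the conventional vector binding form), and only then combined.

```clojure
(require '[clojure.string :as str])

;; Hypothetical mini-DSL: a rule is a value pairing a description
;; with a predicate.
(defn rule [description pred]
  {:description description :pred pred})

(defn failures
  "Returns the descriptions of the rules that `x` does not satisfy."
  [rules x]
  (for [{:keys [description pred]} rules
        :when (not (pred x))]
    description))

;; Rules are let-bound to names the reader can understand at a glance.
(let [non-blank      (rule "must not be blank" (complement str/blank?))
      short-enough   (rule "must be at most 10 characters" #(<= (count %) 10))
      username-rules [non-blank short-enough]]
  (failures username-rules "a-very-long-username"))
;; => ("must be at most 10 characters")
```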

As a good example of readability, I’d like to leave you with some Korma code:

(select user
  (with address)
  (fields :firstName :lastName :address.state)
  (where {:email ""}))
