Posts Tagged ‘macros’

“The first rule of macro club is: Don’t write macros”

I wrote this originally as an answer to a StackOverflow question. But the question got closed by some trigger-happy mods before I clicked post. How unfriendly!

Anyway, in case it is useful to people wondering when they should and shouldn’t use macros, I thought I would post it here instead….

(more…)

I discovered an interesting trick today which I thought worth sharing – you can add a side effect to the end of a sequence in Clojure which will only get called after the sequence is fully consumed for the first time.


(defmacro on-consumed [seq code]
  `(lazy-cat ~seq (do ~code nil)))

(def coll
     (on-consumed [1 2 3] (println "DONE")))

(take 2 coll)   ;; not fully consumed
=> (1 2)

(take 3 coll)   ;; not fully consumed (not gone past third item)
=> (1 2 3)

(take 4 coll)   ;; fully consumed past end of sequence
=> (1 2 3)
DONE            ;; printed to *out*

(take 4 coll)   ;; fully consumed for second time, no side effect
=> (1 2 3)

So how does this work?

(more…)

Clojure performs exceptionally well considering it is a dynamic language.

My rule of thumb is:

  • It’s 5-10x slower than optimised Java if you write idiomatic Clojure (with liberal use of higher order functions, lazy sequences and all that goodness)
  • It’s about the same speed as Java if you really optimise hard – you can basically generate approximately the same JVM bytecode.

This StackOverflow question prompted me to write a quicksort implementation in Clojure, which I’ve included here to demonstrate that matching Java speed is entirely possible within Clojure.

Warning against premature optimisation: this is not pretty code. You shouldn’t be writing code like this unless you have profiled your code and identified a real performance issue.

(more…)

Clojure already comes with built-in and and or macros. There are a couple of interesting things to note about these:

  • They are macros rather than functions. This enables them to have short-circuiting behaviour (a function couldn’t do this, since functions evaluate all their arguments).
  • The downside is that you can’t use them in higher order functions – (reduce and some-collection) won’t work since and isn’t a function.
  • They work with “truthiness” rather than boolean values. This is consistent with how the rest of Clojure works, and it has the nice side-effect that it enables you to use the Common Lisp idiom (or value default-value) to provide a default value in case of nils.

Anyway, suppose we want a similar macro that provides the XOR function (which isn’t yet in Clojure core)? How should we do this?

Let’s start by defining the trivial cases. This is often a good idea whenever you suspect that you may want to implement the more general case using recursion, and it can give you a warm glow that you are at least making some progress:

(defmacro xor
  ([] nil)
  ([a] a))

(xor)
=> nil

(xor true)
=> true

(xor nil)
=> nil

Note a few of points about the above:

  • You can have macros with multiple arity, just like you can with functions. No real surprise here since macros are ultimately just functions that get executed at compile-time, but it is useful in many cases.
  • We provide a zero-arity version – it is often worth thinking about the zero-argument case, since it is possible that your macro will get applied to an empty collection of arguments
  • xor with zero arguments evaluates to nil – sometimes it’s hard to immediately see what the zero-arg result should be. with xor it helps to think of xor as “bit parity” – with an even number of set bits in the input the output should be false, and this includes the zero case. So (xor) should be false.

Now we can add in the 2-argument case:

(defmacro xor 
  ([] nil)
  ([a] a)
  ([a b]
    `(let [a# ~a
           b# ~b]
      (if a# 
        (if b# nil a#)
        (if b# b# nil)))))

Again there are a few interesting things to note about this:

  • We use auto-gensyms (a# and b#) in a let statement at the start. This is an important pattern to use whenever you need the macro parameters more than once, as it ensures that the full code for a and b only gets expanded once. If you don’t do this, then you risk expanding the code for a and b multiple times, which in some cases can result in a significant performance and code size overhead (imagine if a and b were both large remotely executed computations….)
  • We define all 4 cases with nested if statements. Since if is the fundamental conditional construct in Clojure, it is both very efficient and correctly understands “truthiness”
  • We return a#, b# or nil – We could also use true/false but in the case it makes sense to use truthy/nil mainly to be consistent with  how and and or work (you don’t want to surprise your users!). true/false is more common when you are defining a predicate.
  • There is no short-circuiting – but this shouldn’t surprise us since (unlike and and or) xor needs to evaluate all inputs before it can determine the final truthiness value.

Finally we can tackle the general case. The key observation here is that xor can be defined recursively by applying xor to subsets of it’s own arguments:

(defmacro xor 
  ;; ......... other cases ..........
  ([a b & more]
   `(xor (xor ~a ~b) (xor ~@more))))

This feels good – our general case is a pretty simple recursive application of the basic cases. Does it work?

(xor nil 1 1 1)   => 1
(xor nil 1 1 nil) => nil
(xor nil nil)     => nil
(xor 1 1 1 1 1)   => 1
(xor 1 1 1 1 1 1) => nil

This is not the most comprehensive test suite ever – but it looks like we have success!

EDIT

As amalloy and Sgeo have rightly pointed out in the comments, xor doesn’t need to be a macro (as it doesn’t have the short-circuiting behaviour that necessitates the use of a macro with and and or). I wrote it as a macro for illustrative purposes (this post was after all intended as an exploration of the ideas involved in writing a simple recursive macro), but there are pros and cons of doing it either way:

  • As a macro, you gain consistency with and and or plus a minor performance benefit from avoiding function call overheads at runtime. This might be necessary for code generation or if you  have identified a particularly performance-sensitive piece of code that you need to optimise.
  • As a function, you get more flexibility in terms of use in higher order functions and the implementation itself is a bit simpler / more maintainable. As a consequence, using a function is usually the better solution in general purpose code.

I’ll leave it to the reader’s discretion to decide which approach is most appropriate in your particular case.

Clojure is a functional programming language at heart, but there are times when it is useful to have some imperative programming tools at hand – sometime imperative style is a good fit for the problem you are trying to solve, and at other times it is helpful to squeeze that last bit of performance out of critical code.*

One such construct is the humble “for” loop, which in Java looks like:

for (int i=0; i<10 ; i++) {
  System.out.println(i);
}

Here’s how we might like it to look in Clojure:

(for-loop [i 0 (< i 10) (inc i)]
  (print i)
)

Unfortunately, a for-loop similar to the one defined above doesn’t exist in Clojure. Fortunately we are in the Land of Lisp, so the lack of a language feature should never be an impediment – macros to the rescue!

A good piece of general advice is that you shouldn’t use macros unless you really need to. However, we have no option in this case. for-loop cannot be implemented as a function for at least the following reasons:

  • It needs to bind a symbol “i” at compile time
  • We don’t actually want to construct a vector as an argument to for-loop at runtime. This would be a significant overhead given our stated goal was to create an imperative for-loop suitable for micro-optimisations.
  • Even if bound, (< i 10) and (inc i) would execute just once as arguments to the function. There would be no way to run the repeatedly on each iteration of the loop as we require
  • The body of the for loop would also be executed just once as a parameter to the function

The common theme in also of these is that we need to perform some transformation of the code at compile time rather than following the normal function evaluation rules. Most languages don’t allow you to do this, but we have macros at our disposal in Clojure and can escape the normal constraints that programmers face.

So we have concluded that we definitely need a macro if we want to enable our for-loop syntax. How do we go about building one?

A very useful technique is to write the code that you would like the macro to produce for a specific case. Here’s how you might write the example above, using loop/recur to ensure that Clojure does an efficient tail-recursive loop:

(loop [i 0]
  (if (< i 10)
    (do
      (println i)
      (recur (inc i)))))

To make this into a macro you need to do two things:

  • Quote your example code with the syntax-quote (`)
  • Pull out the parameters for the macro, refering them in your macro code with unquote (~) for single forms and unquote-splice (~@) where you have a sequence of forms (e.g. multiple statements in the for-loop)

So the resulting macro would look like:

(defmacro for-loop [[symbol initial check-exp post-exp] & code]
 `(loop [~symbol ~initial]
    (if ~check-exp
      (do
        ~@code
        (recur ~post-exp)))))

Voila – we have extended the language with an efficient for-loop syntax in six lines of code.

*I’m not advocating premature optimization. This applies only to situations where you have indentified a genuine bottleneck and determined that the benefits of doing extra optimization outweight the costs in terms of your time and the potential impact on maintainability.