A fast quicksort in Clojure

Posted: August 29, 2012 in Uncategorized
Tags: , , ,

Clojure performs exceptionally well considering it is a dynamic language.

My rule of thumb is:

  • It’s 5-10x slower than optimised Java if you write idiomatic Clojure (with liberal use of higher order functions, lazy sequences and all that goodness)
  • It’s about the same speed as Java if you really optimise hard – you can basically generate approximately the same JVM bytecode.

This StackOverflow question prompted me to write a quicksort implementation in Clojure, which I’ve included here to demonstrate that matching Java speed is entirely possible within Clojure.

Warning against premature optimisation:┬áthis is not pretty code. You shouldn’t be writing code like this unless you have profiled your code and identified a real performance issue.

So here’s the code:

(set! *unchecked-math* true)

(defmacro swap [a i j]
 `(let [a# ~a
        i# ~i
        j# ~j
        t# (aget a# i#)]
    (aset a# i# (aget a# j#))
    (aset a# j# t#)))

(defmacro apartition [a pivot i j]
 `(let [pivot# ~pivot]
    (loop [i# ~i
           j# ~j]
      (if (<= i# j#)
        (let [v# (aget ~a i#)]
          (if (< v# pivot#)
            (recur (inc i#) j#)
            (do
              (when (< i# j#)
                (aset ~a i# (aget ~a j#))
                (aset ~a j# v#))
              (recur i# (dec j#)))))
        i#))))

(defn qsort
  ([^longs a]
    (qsort a 0 (alength a)))
  ([^longs a ^long lo ^long hi]
    (let [lo (int lo)
          hi (int hi)]
      (when
        (< (inc lo) hi)
        (let [pivot (aget a lo)
              split (dec (apartition a pivot (inc lo) (dec hi)))]
          (when (> split lo) (swap a lo split))
            (qsort a lo split)
            (qsort a (inc split) hi)))
      a)))

The key tricks are:

    • Do everything with primitive arithmetic
    • Use ints for the array indexes (this avoids some unnecessary casts, not a big deal but every little helps....)
    • Use macros rather than functions to break up the code (avoids function call overhead and parameter boxing)
    • Use loop/recur for maximum speed in the inner loop (i.e. partitioning the subarray)
    • Avoid constructing any new objects on the heap (so avoid vectors, sequences, maps etc.)

Finally the benchmark code and test runs. Note that when testing this kind of code you want to run it a few times to get a "steady state" timing which is free of JVM and possibly processor cache warm-up effects (as you can see the first couple of runs are noticeably slower):


(dotimes [i 10]
  (let [ys (long-array xs)]
    (time (qsort ys))))

"Elapsed time: 17.392009 msecs"
"Elapsed time: 13.020082 msecs"
"Elapsed time: 11.264932 msecs"
"Elapsed time: 11.974766 msecs"
"Elapsed time: 11.805373 msecs"
"Elapsed time: 11.925635 msecs"
"Elapsed time: 11.799507 msecs"
"Elapsed time: 11.567418 msecs"
"Elapsed time: 11.297931 msecs"
"Elapsed time: 11.367228 msecs"

EDIT

dnolen created an improved version of the quicksort that uses functions rather than macros. This is much nicer code - thanks dnolen for providing something much more beautiful! The macro version above is still marginally faster (on my machine at least) but the difference isn't big enough to justify the extra code complexity that this brings.

For anyone reading this post in the future, I'd recommend studying this version before resorting to macros.

About these ads
Comments
  1. Cool. One note – you can also use `definline` instead of `defmacro` – it creates a normal function, but signals the compiler to inline it at compile time, and spares some of the annoyance of working with macros.

    • mikera7 says:

      Interesting – didn’t know about definline before!

      Just had a quick experiment and can’t figure out how to make it work. Maybe that’s why it is marked “experimental” :-) with have a bit more of an experiment and see if I can come up with something that works…..

  2. dnolen says:

    This is a very ugly Clojure qsort and quite unnecessary. I’ve added a version that does not resort to macros and is faster than your version on my machine to the StackOverflow question: http://stackoverflow.com/questions/12176832/quicksort-in-clojure

    • mikera7 says:

      Yeah it’s certainly more of a frankenstein’s monster than a Cinderella :-) I certainly like your improved version better. I’ll add a comment so that future readers find the better version…..

    • mikera7 says:

      P.S. just as an extra datapoint my version is still slightly faster than yours (11ms vs 13ms) on my machine (Clojure 1.4 running in Counterclockwise in Eclipse)

      I benchmarked as follows and looked at the later timings (to try and mitigate JVM warm-up effects….)

      (def xs (let [rnd (java.util.Random.)] 
        (long-array (repeatedly 100000 #(.nextLong rnd)))))
      
      (dotimes [i 10] 
        (let [ys (long-array xs)]
          (time (qsort ys))))
      
  3. Hmm says:

    If you find yourself writing Clojure code like this, you might as well drop into Java. It’s trivial to interrop from Clojure and you’ll end up with a (hopefully) idiomatic solution on both the Clojure and Java side.

    • dnolen says:

      Now this I do not agree with. Clojure gives you good to tools to write very fast code w/o the verbosity of Java. There are good uses for macros when writing fast code – but simulating JVM inlining is not one of them.

    • mikera7 says:

      I personally think it is nice to have a choice – sometimes Java works well but at other times Clojure can be very helpful (e.g. if you are doing some code generation with macros)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s