Achieving awesome numerical performance in Clojure

Posted: March 7, 2013 in Uncategorized

core.matrix is starting to get serious as a tool for Clojure numerics. I’ve been contributing to the effort because I need it for my own machine learning work and also for some of my more fun / artistic projects like enlight and Clisk.

I’m pretty pleased with how it is working so far:

  • The API is proving to be a convenient way of doing matrix / vector maths in Clojure
  • Most of the usual matrix / vector operations are now supported
  • The idea of allowing multiple underlying implementations (using Clojure protocols) seems to be working well. Several people are working on different implementations / wrappers with various goals and levels of maturity
  • Performance is proving to be very good – more on this topic below

core.matrix itself is an API that defines functions to operate on vectors, matrices and other forms of multi-dimensional arrays. The operations are simple to use, and work as you would expect:

(use 'clojure.core.matrix)            ;; the main core.matrix API
(use 'clojure.core.matrix.operators)  ;; for +, -, * etc. operators

;; adding vectors
(+ [1 2 3] [4 5 6])
=> [5 7 9]

;; normalising a vector to unit length
(normalise [3 4 5])
=> [0.4242640687119285 0.565685424949238 0.7071067811865475]

;; matrix multiplication with a vector
;; (note: in later core.matrix versions * is element-wise; use mmul
;;  for matrix products)
(* [[2 0] 
    [0 2]] [1 2])
=> [2 4]

As you can see, all of the core.matrix functions work just fine with regular clojure vectors as parameters. This is one of the virtues of using protocols as a tool for implementing core.matrix – just extend the protocols to clojure.lang.IPersistentVector and voila: Clojure vectors are now first-class core.matrix values.
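To illustrate the mechanism, here's a minimal sketch of extending a protocol to Clojure's built-in vector type. The protocol and function names below are invented for illustration; core.matrix defines its own (much larger) set of protocols in `clojure.core.matrix.protocols`:

```clojure
;; Minimal sketch of the protocol-extension mechanism.
;; PMyNorm / my-norm are invented names, not core.matrix API.
(defprotocol PMyNorm
  (my-norm [v] "Euclidean length of a vector-like value"))

;; Extend the protocol to an existing type we don't own:
(extend-protocol PMyNorm
  clojure.lang.IPersistentVector
  (my-norm [v]
    (Math/sqrt (reduce + (map #(* % %) v)))))

(my-norm [3.0 4.0])
;; => 5.0
```

The same trick is what lets core.matrix treat `clojure.lang.IPersistentVector` as a full array implementation without touching Clojure's own source.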

But in order to understand the true performance potential, you need an optimised core.matrix implementation. I've chosen vectorz-clj for this purpose.

;; tell core.matrix to use a specific implementation
(set-current-implementation :vectorz) 

So with that setup complete, here's a quick benchmark (using criterium) for small-sized vector addition:

(require '[criterium.core :as c])

;; Adding two regular Clojure vectors with clojure.core/+
(let [a [1 2 3 4 5 6 7 8 9 10]
      b [1 2 3 4 5 6 7 8 9 10]]
  (c/quick-bench (dotimes [i 1000] (vec (map clojure.core/+ a b)))))  
;; => Execution time mean per addition: 1308 ns

;; Adding two core.matrix vectors (pure functions, i.e. creating a new vector)
(let [a (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])
      b (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (+ a b))))
;; => Execution time mean per addition: 68 ns

;; Adding two core.matrix vectors (mutable operation, i.e. adding to the first vector)
(let [a (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])
      b (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (add! a b))))
;; => Execution time mean per addition: 36 ns

;; Adding two core.matrix vectors using low-level Java interop
;; (import the underlying vectorz classes; type hints avoid reflection)
(import '[mikera.vectorz Vectorz AVector])

(let [^AVector a (Vectorz/create [1 2 3 4 5 6 7 8 9 10])
      ^AVector b (Vectorz/create [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (.add a b))))
;; => Execution time mean per addition: 11 ns

It’s only a simple test, but hopefully indicative of the kind of performance gains you can now expect to achieve with core.matrix: something like 20-100x the performance of the equivalent unoptimised Clojure code.
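Where does that gap come from? The naive version boxes every number and builds a lazy sequence, while an optimised backend runs a primitive loop over a flat double array. Here is a rough pure-Clojure sketch of that kind of inner loop; it is illustrative only, not the actual vectorz code:

```clojure
;; Sketch of the primitive inner loop an optimised backend runs
;; internally (illustrative; vectorz implements this in Java).
;; The type hints keep the arithmetic unboxed.
(defn add-doubles!
  "Adds b into a element-wise, mutating and returning a."
  ^doubles [^doubles a ^doubles b]
  (dotimes [i (alength a)]
    (aset a i (+ (aget a i) (aget b i))))
  a)

(vec (add-doubles! (double-array [1 2 3]) (double-array [4 5 6])))
;; => [5.0 7.0 9.0]
```

No boxing, no seq allocation, and the JIT can compile this down to a tight machine-code loop, which is why the mutable `add!` benchmark above gets so close to raw Java interop.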

Here’s another example to show that large vectors can also run very efficiently – adding together 30 million random numbers in a vector:

(let [v (new-vector 30000000)]
  (dotimes [i 30000000] (mset! v i (Math/random)))
  (time (esum v)))
=> "Elapsed time: 39.48065 msecs"

Here you can see that the overall time averaged about 1.3 nanoseconds per element. I think that is pretty near optimal on my machine, i.e. about the same performance you would expect from optimised native code. But all this is running in pure Java/Clojure on the JVM.
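The same principle applies to summation: a tight primitive loop over contiguous doubles. In pure Clojure a comparable loop can be written with `areduce`; again, this is just a sketch of the technique, not the vectorz implementation of `esum`:

```clojure
;; Sketch: an esum-style sum as a primitive areduce loop
;; (illustrative only; vectorz does this in Java internally)
(defn sum-doubles ^double [^doubles xs]
  (areduce xs i acc 0.0 (+ acc (aget xs i))))

(sum-doubles (double-array [1 2 3 4]))
;; => 10.0
```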

This has only been a short article, but I hope it demonstrates that Clojure has the potential to be extremely effective in the numerics space.

  1. Jonas says:

    Nice! I am doing research in computer vision and I am highly interested in doing linear algebra with Clojure instead of Matlab, for various reasons… Currently I have a Clojure DSL wrapped around EJML but I may also look at this library.

    • mikera7 says:

      Great stuff, please join the core.matrix effort if you can! Our best chance of getting a really good set of linear algebra libraries in Clojure is to work together on this stuff.

      Also if you like EJML then there is no reason why core.matrix can’t support EJML as a matrix backend as well : it is designed to be agnostic regarding the underlying implementation.

      • Jonas says:

        Would be nice to contribute but I don’t think I have the time now :-( … maybe in future. Anyway, it would be nice with a library with a basic prototyping mode, where you just type in S-expressions like

        (solve A B)

        and a compiled type-hinted mode for performance reasons where the S-expression is wrapped inside the compile-matrix-expression-macro:

(compile-matrix-expression
  [A (double-matrix 3 3) B (double-matrix 3 1)]
  (solve A B))

Also worth looking at how linear algebra is implemented in other languages, e.g. C++. See for instance Armadillo.

      • mikera7 says:

Yeah, I’ve been playing with some ideas like that but haven’t got very far (just some idea notes so far).

        My idea was to have expression data structures that can be used for different purposes: equation solving, optimised function compilation etc.
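As a rough illustration of the expressions-as-data idea (all names below are invented, not an actual core.matrix API), a matrix expression can be ordinary Clojure data that different tools then walk for different purposes:

```clojure
;; Hypothetical sketch: a matrix expression as plain Clojure data.
;; Nothing here is core.matrix API; it only shows the representation idea.
(def expr '(+ (mmul A B) C))

;; One possible "purpose": collect the unknowns an expression refers to,
;; e.g. as a first step for equation solving or compilation.
(defn free-vars [e]
  (cond
    (seq? e)    (set (mapcat free-vars (rest e)))
    (symbol? e) #{e}
    :else       #{}))

(free-vars expr)
;; => #{A B C}
```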
