core.matrix is starting to get serious as a tool for Clojure numerics. I’ve been contributing to the effort because I need it for my own machine learning work and also for some of my more fun / artistic projects like enlight and Clisk.
I’m pretty pleased with how it is working so far:
- The API is proving to be a convenient way of doing matrix / vector maths in Clojure
- Most of the usual matrix / vector operations are now supported
- The idea of allowing multiple underlying implementations (using Clojure protocols) seems to be working well. Several people are working on different implementations / wrappers with various goals and levels of maturity
- Performance is proving to be very good – more on this topic below
core.matrix itself is an API that defines functions to operate on vectors, matrices and other forms of multi-dimensional arrays. The operations are simple to use, and work as you would expect:
(use 'clojure.core.matrix)           ;; the main core.matrix API
(use 'clojure.core.matrix.operators) ;; for +, -, * etc. operators

;; adding vectors
(+ [1 2 3] [4 5 6])
=> [5 7 9]

;; normalising a vector to unit length
(normalise [3 4 5])
=> [0.4242640687119285 0.565685424949238 0.7071067811865475]

;; matrix multiplication with a vector
(* [[2 0] [0 2]] [1 2])
=> [2 4]
As you can see, all of the core.matrix functions work just fine with regular Clojure vectors as parameters. This is one of the virtues of using protocols as a tool for implementing core.matrix: just extend the protocols to clojure.lang.IPersistentVector and voilà, Clojure vectors are now first-class core.matrix values.
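To make the protocol idea concrete, here is a minimal sketch in plain Clojure. The protocol name `PElementwiseAdd` and the function `element-add` are hypothetical and exist only for this illustration; core.matrix's real protocols have different names and cover far more operations.

```clojure
;; Hypothetical protocol for illustration only; this is NOT one of the
;; protocols that core.matrix itself defines.
(defprotocol PElementwiseAdd
  (element-add [a b] "Returns the element-wise sum of a and b."))

;; Extending the protocol to the persistent vector interface makes every
;; ordinary Clojure vector a valid argument, with no wrapper type needed.
(extend-protocol PElementwiseAdd
  clojure.lang.IPersistentVector
  (element-add [a b]
    (mapv clojure.core/+ a b)))

(element-add [1 2 3] [4 5 6])
;; => [5 7 9]
```

An optimised implementation would extend the same protocol to its own array type, which is how one API can sit on top of several back ends.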
To see the true performance potential, though, you need an optimised core.matrix implementation, so I’ve chosen vectorz-clj for this purpose.
;; tell core.matrix to use a specific implementation
(set-current-implementation :vectorz)
So with that setup complete, here’s a quick benchmark (using criterium) for small-sized vector addition:
(require '[criterium.core :as c])
(import 'mikera.vectorz.Vectorz) ;; needed for the Java interop case at the end

;; Adding two regular Clojure vectors with clojure.core/+
(let [a [1 2 3 4 5 6 7 8 9 10]
      b [1 2 3 4 5 6 7 8 9 10]]
  (c/quick-bench (dotimes [i 1000] (vec (map clojure.core/+ a b)))))
;; => Execution time mean per addition: 1308 ns

;; Adding two core.matrix vectors (pure functions, i.e. creating a new vector)
(let [a (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])
      b (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (+ a b))))
;; => Execution time mean per addition: 68 ns

;; Adding two core.matrix vectors (mutable operation, i.e. adding to the first vector)
(let [a (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])
      b (matrix :vectorz [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (add! a b))))
;; => Execution time mean per addition: 36 ns

;; Adding two core.matrix vectors using low level Java interop
(let [a (Vectorz/create [1 2 3 4 5 6 7 8 9 10])
      b (Vectorz/create [1 2 3 4 5 6 7 8 9 10])]
  (c/quick-bench (dotimes [i 1000] (.add a b))))
;; => Execution time mean per addition: 11 ns
It’s only a simple test, but hopefully indicative of the kind of performance gains you can now expect to achieve with core.matrix: something like 20-100x the performance of the equivalent unoptimised Clojure code.
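As a quick sanity check on that range, the speedups can be worked out directly from the mean timings quoted in the benchmark. The map keys below are just labels chosen for this sketch.

```clojure
;; Mean times per addition in nanoseconds, copied from the benchmark results.
(def baseline-ns 1308.0) ; plain Clojure vectors with clojure.core/+

(def timings-ns
  {:pure-add     68.0   ; (+ a b) on vectorz vectors
   :mutable-add  36.0   ; (add! a b)
   :java-interop 11.0}) ; (.add a b)

(defn speedup
  "How many times faster a timing is than the plain Clojure baseline."
  [t]
  (/ baseline-ns t))

(into {} (for [[k t] timings-ns] [k (Math/round (speedup t))]))
;; => {:pure-add 19, :mutable-add 36, :java-interop 119}
```

So the pure and mutable core.matrix calls sit at the lower end of the quoted range, and the direct Java interop call lands slightly above it.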
Here’s another example to show that large vectors can also run very efficiently – adding together 30 million random numbers in a vector:
(let [v (new-vector 30000000)]
  (dotimes [i 30000000]
    (mset! v i (Math/random)))
  (time (esum v)))
=> "Elapsed time: 39.48065 msecs"
Here you can see that the overall time averaged about 1.3 nanoseconds per element added. I think that is pretty near optimal on my machine, i.e. about the same performance you would expect from optimised native code. But all this is running in pure Java/Clojure on the JVM.
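The per-element figure follows from the elapsed time and the element count. A small sketch of the arithmetic, where `ns-per-element` is a helper name introduced here for illustration:

```clojure
(def elapsed-ms 39.48065)     ; from the "Elapsed time" output of the run
(def element-count 30000000)  ; length of the vector being summed

(defn ns-per-element
  "Converts a total time in milliseconds to nanoseconds per element."
  [ms n]
  (/ (* ms 1e6) n))

(ns-per-element elapsed-ms element-count)
;; roughly 1.316, i.e. about 1.3 ns per element
```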
This has only been a short article, but I hope it demonstrates that Clojure has the potential to be extremely effective in the numerics space.