The sequence abstraction

Posted: August 27, 2012 in Uncategorized
Tags: , ,

The sequence abstraction brings me much joy.

In honour of this humble construct I thought I’d write a quick blog post: covering what it is and some examples of how elegant it can make your Clojure code.

Basics

At it’s core, the sequence in Clojure is something that provides two fundamental operations:

  • The ability to obtain the first item in the sequence
  • The ability to obtain the next items in the sequence (as a sequence, or nil if no more items exist)

Some quick examples of basic usage:

;; call first on a vector (implicitly treating it as a sequence)
(first [1 2 3 4])
=> 1

;; call next on a vector (implicitly treating it as a sequence)
(next [1 2 3 4])
=> (2 3 4)

Treating different things as sequences:

You can call seq to convert any “sequential” thing into a sequence, which works on a whole host of different types: any type of collection, Java arrays, Strings, etc.

;; get a sequence from a vector
(seq [1 2 3 4])
=> (1 2 3 4)

;; call seq on a java.lang.String, get a sequence of chars
(seq "foo")
=> (\f \o \o)

;; call seq on a map, getting a sequence of key-value pairs
(seq {:a 1 :b 2 :c 3})
=> ([:a 1] [:c 3] [:b 2])

;; call seq on a Java collection
(def mylist (java.util.ArrayList.))
(.add mylist "foo")
(.add mylist "bar")
(seq mylist)
=> ("foo" "bar")</pre>

Note that seq often isn’t needed explicitly since sequence functions implicitly call seq on their arguments – e.g. you can call first on a map directly if you just want the firt key-value pair.

The sequence library

Sequences in themselves are all well and good. But what makes them powerful is the extremely large and flexible library of functions that work upon sequences.

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

Here are just a few examples:

;; treat a vector as a sequence and reverse it
(reverse [1 2 3 4 5])
=> (5 4 3 2 1)

;; take the first few items from a sequence
(take 3 [1 2 3 4 5])
=> (1 2 3)

;; count the number of items in a list
(count '(:a :b :c :d))
=> 4

;; apply a function to every element of a sequence, return the results
(map (fn [x] (* x x)) [1 2 3 4 5])
=> (1 4 9 16 25)

;; sort a sequence
(sort "abracadabra")
=> (\a \a \a \a \a \b \b \c \d \r \r)

;; pour a sequence into a new collection
(into [1 2] (list 3 4 5 6))
=> [1 2 3 4 5 6]

These are just the tip of the iceberg – there are literally hundreds of functions in the core Clojure library that work on sequences, and many thousands more in the Clojure library ecosystem. It is a pretty ubiquitous abstraction.

Lazy sequences

Sequences have the nice property that they can be lazy – i.e. they are only calculated upon demand. This follows from the fact that they only need to provide the first and next operations, and it is open to the sequence implementation to defer the calculation of the rest of the sequence until it is required (via successive calls to next).

Lazy sequences have a few nice properties:

  • You can work with very large sequences, including sequences that exceed the total size of memory, providing you don’t try to hold the whole sequence in memory at once. The trick here is to avoid keeping a reference to the head of the sequence, as long as you do this then the garbage collector will clean up the earlier part of the sequence while you continue to process the rest.
  • At the extreme, they can even be infinite. Obviously you will never be able to fully realise an infinite sequence, but infinite sequences make it very easy to express certain algorithms elegantly (see the fibonacci example below) )and they work fine providing you only take as much of the sequence as you need.
  • They provide a convenient representation for many otherwise complex operations, such as a stream of messages resulting from IO

Some lazy sequence examples:

;; Take 10 items from an infinite sequence (infinite range of numbers)
(take 10 (range))
=> (0 1 2 3 4 5 6 7 8 9)

;; add up a larger-than-memory lazy sequence
(reduce + (range 0 10000000000))

;; Define an infinite lazy sequence of fibonacci numbers, take the first 10
(def fibs (concat [0 1] (lazy-seq (map + fibs (rest fibs)))))
(take 10 fibs)
=> (0 1 1 2 3 5 8 13 21 34)

Sequence tricks

Finally here are some assorted nice tricks you can use with sequences.

;; Use a threading operator to perform multiple operations on sequences
(->> [1 2 3 4 5 6 7 8]
     (take 5)
     (map (partial * 2))
     next)
=> (4 6 8 10)

;; get the frequencies of different elements in a sequence
(frequencies "abracadabra")
=> {\a 5, \b 2, \r 2, \c 1, \d 1}

;; get the most common element in a sequence (mode)
(first (first (sort-by second > (frequencies "abracadabra"))))
=> \a

;; transpose a matrix by treating it as a sequence of sequences
(apply mapv vector [[1 2] [3 4]])
=> [[1 3] [2 4]]
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s