While I have presented a number of topics in the past, this was the first time that I found myself being very sceptical on the way I should present things. I've tried asking the organizers to get a sense about the audience, but the organizers were unsure. I thought that I had to be somewhat generic on my approach and that made things a bit harder. Senior people are intrigued by other things compared to junior people and people coming from the Java world have different interests compared to people that are coming from the Javascript world. As I would mainly focus on Clojure and I was planning to show host interop, that could potentially be of not much interest to people from non JVM backgrounds, even though those concepts are not strictly tied to the JVM See for example Clojurescript.
With those things in mind, I wanted to have a mix of trivial and non-trivial examples to explain some concepts, because learning concepts can be much more interesting compared to showing how to do X or Y. At the same time a talk has to be entertaining, so some examples would need to be more fun even if I had to leave more important stuff out of the agenda. This is how I settled on having four small groups of examples - even though some grew quite lare - and leaving some time in the end for a more complete thing - which now I'm realizing that I most likely won't have time to go through.
Also, I've decided to have almost the entire presentation with a fired REPL and writing the code on-the-fly. After all, after all this years, that's the nicest thing about Clojure for me. At the same time I've decided to allow people to interrupt me at any time so we can experiment with whatever comes to their minds.
Below are those four sections with some of the examples I'm planning to show, and some that I will most likely be leaving out, but are worth mentioning here.
And of course all that needs to fit in, more or less, 45 minutes.
Let's go!
The good thing with Clojure is that explaining the syntax takes just a couple of minutes.
Creating a hello world can be as simple as;
(defn hello
"says hello to someone"
[name]
(println (str "hello " name)))
and calling this function is just:
(hello "world")
hello world
=> nil
It's worth mentioning and explaining the REPL, how Cursive works which I'll be using. Highlighting editor support in general and show some basic functions.
This will most likely be the shortest section.
Even though there were efforts in the Java world to get functional pesistent data structures, it's hard to leave what the SDK provides and what everything is using. Java Streams improved a lot the way to work with collections, but the basic interfaces are still there and those are mutable.
Clojure comes with persistent functional data structures which are immutable with a very flexible and extensive API.
What is particulary interseting about the data structures, which leads very quickly to the value of values is this:
(def list1 (list 1 2 3))
Lists are not the data structure Clojure devs usually work with (does the source code counts?), as that would be the vector, but it's easier to present what follows:
list1
=> (1 2 3)
Let's create a 2nd list by adding 4 to it:
(def list2 (conj list1 4))
Is this a surprise:
list2
=> (4 1 2 3)
The function conj
will add an element and get a new data structure. Where the element
is added is determined by the nature of the data structure. For linked lists it's more
natural to add this item to the front, for vectors it is to the end, and for sets you
don't know as those are unordered:
(conj [1 2 3] 4)
=> [1 2 3 4]
(conj #{1 2 3} 4)
=> #{1 4 3 2}
Now if we get back to our original lists. Let's create a 3rd one
(def list3 (conj list1 5))
Now calling the function rest
, will give us all the elements but the first:
(rest list3)
=> (1 2 3)
That's expected.
What if we now compare the lists 2 and 3 without the first element:
(= (rest list2) (rest list3))
=> true
That's expected. The =
functions compares if the values are equal. For example
(= [1 2 3] [1 2 3])
=> true
That's also expected. But are those two vectors identical?
(identical? [1 2 3] [1 2 3])
=> false
They are not, because those are not the same object. If someone checks under the hood,
a reference comparison (Java's ==
) will be found.
But:
(identical? (rest list2) (rest list3))
=> true
Aha! Even though we have created two new lists, both still share their structure
with list1
.
For more about this someone can start digging from section 3.4.1
in the
Clojure history paper.
It's time to create something more interesting. Factorial and Fibonacci, as common as they are, are my favorite things to showcase.
Let's start with a simple recursive factorial implementation:
(defn fact [n]
(if (= n 0)
1
(* n (fact (- n 1)))))
Built-in if
just joined the party, and there are a number of new guests
just outside the door. That works as expected:
(fact 5)
=> 120
Or does it?
(fact 120)
Execution error (ArithmeticException) at java.lang.Math/multiplyExact (Math.java:1032).
long overflow
Ah, that's a big number and thankfully Clojure doesn't allow it to overflow. We can easily
fix this, by changing to the *'
which supports arbitary precision.
(defn fact [n]
(if (= n 0)
1
(*' n (fact (- n 1)))))
(fact 120)
=>
6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000N
That works but what is this? Let's check:
(type (fact 5))
=> java.lang.Long
(type (fact 120))
=> clojure.lang.BigInt
As one could expect clojure.lang.BigInt
is built on top of java.math.BigInteger
.
So that works. Or ... does it?
(fact 12000)
Execution error (StackOverflowError) at clj-intro.demo/fact (demo.clj:4).
null
Unfortunately, our recursive definition is consuming the stack.
How about this:
(defn fact [n]
(loop [n n
r 1]
(if (= n 0)
r
(recur (- n 1) (*' r n)))))
That works for 12000. Trust me.
While loop/recur looks like a recursive thing, it's not.
We need to pull clj-java-decompiler that will helps us checking what is going on.
First we need to update our namespace definition:
(ns clj-intro.demo-final
(:require [clj-java-decompiler.core :as dec]))
Now we can decompile the loop
:
(dec/decompile
(loop [n 5
r 1]
(if (= n 0)
r
(recur (- n 1) (*' r n)))))
Oh!
// Decompiling class: clj_intro/demo$fn__2355
package clj_intro;
import clojure.lang.*;
public final class demo$fn__2355 extends AFunction
{
public static Object invokeStatic() {
long n = 5L;
Object r = RT.box(1L);
while (n != 0L) {
final long minus = Numbers.minus(n, 1L);
r = Numbers.multiplyP(r, n);
n = minus;
}
return r;
}
@Override
public Object invoke() {
return invokeStatic();
}
}
Yes, Clojure complies to JVM bytecode, and now are are looking under the hood.
But recur
can also work for functions, and not just loops. E.g:
(defn foo [x]
(when (not= x 0)
(println x)
(recur (- x 1))))
And
(foo 5)
5
4
3
2
1
=> nil
So why we didn't recur in our factorial example? I'll leave that as an exercise for the reader.
A more Clojure like way of defining factorial would be:
(defn fact [n]
(reduce *' (range 1 (inc n))))
It's finonacci time now!
Let's start simple and use the iterate
and take
functions:
(take 10 (iterate inc 0))
=> (0 1 2 3 4 5 6 7 8 9)
That souldn't provoke any suprices.
What about this:
((fn [[a b]]
[b (+ a b)]) [3 5])
=> [5 8]
We are defining an anonymous function, invoking it with the typle [3 5], destructuring the params,
and returning a new tuple of b
and a + b
.
Let's connect those two pieces:
(take 10 (iterate (fn [[a b]]
[b (+ a b)]) [0 1]))
=> ([0 1] [1 1] [1 2] [2 3] [3 5] [5 8] [8 13] [13 21] [21 34] [34 55])
Aha! Now we are providing a tuple of [0 1] as the iteration seed and generating one new tuple each time.
That's it really. Then we just take the first of the tuples.
(map first (take 10 (iterate (fn [[a b]]
[b (+ a b)]) [0 1])))
=> (0 1 1 2 3 5 8 13 21 34)
Let's rewritte it in a simpler way:
(->> [0 1]
(iterate (fn [[a b]]
[b (+ a b)]))
(take 10)
(map first))
And we are done:
(defn fibo-seq [n]
(->> [0 1]
(iterate (fn [[a b]]
[b (+' a b)]))
(take n)
(map first)))
And:
(drop 100 (fibo-seq 105))
=>
(354224848179261915075N 573147844013817084101N 927372692193078999176N 1500520536206896083277N 2427893228399975082453N)
But is it performant?
(drop 1000000 (fibo-seq 1000005))
Exception in thread "nREPL-session-f6706ab3-a5aa-4fcd-8161-1a89974bdd8c" java.lang.OutOfMemoryError: Java heap space
...
It's not. For a performant solution we would need to approach this differently which wasn't originally in the initial scope of this presentation. However, when chatting about this with a friend, he pointed out that it would be nice to present higher order functions.
I had thoughts about this, and my initial reaction was to not include higher order functions. Blasphemy, I know. My main reasoning behind this is that higher order functions are complex, and if someone is not accustomed to functional thinking it is not an easy-to-grasp concept.
I'm still not sure if I'll demo this, but as I said previously, here I can include outtakes!
So here it goes.
Let's get back to our fibonacci example, and create a recursive version.
(defn fibo [n]
(condp = n
0 0
1 1
(+ (fibo (- n 2)) (fibo (- n 1)))))
The new thing here is the condp
macro (more on macros below), which just allows
us having sort of switch
but really compact.
This works, but it's slow. Too slow actually.
(time (fibo 45))
"Elapsed time: 24255.134298 msecs"
=> 1134903170
Can we do better by using memoization?
This example and an explanation of atoms
can be found here.
Lets create the memoization higher order function:
(defn memo [f]
(let [mem (atom {})]
(fn [& args]
(if-let [mem-v (@mem args)]
mem-v
(let [v (apply f args)]
(swap! mem assoc args v)
v)))))
There are a lot of things going on here. So let's break things one by one.
atom
hash-map that stores for a given sequence of param args
the outcome of f
invocation on them.fn
that will be returned will enclose the mem
binding
and it will exist in its context.& args
. Those arguments are provided to
us in a array sequence.mem
atom for this key and if there is something there we provided it back. Otherwise:v
, store it in the atom and returning the v
back.For people new to Clojure or functional programming this may require some mental effort to digest. But it should be noted, that this is not a trivial example.
Now we can define the memo-fibo
by simply:
(def memo-fibo (memo fibo))
And, here it goes, it's way way faster this time:
(time (memo-fibo 45))
"Elapsed time: 0.027833 msecs"
=> 1134903170
That's all good, but will this work?
(time (memo-fibo 1000005))
Spoiler, no, but why?
Just for catharsis, we can once again implement fibonacci with a loop:
(defn fibo [n]
(loop [a 0
b 1
n n]
(if (= n 0)
a
(recur b (+' a b) (dec n)))))
This time it works:
(time (fibo 1000005))
"Elapsed time: 10082.013057 msecs"
=>
a very very long number
The JDK comes with tons of built-in things one may use. For example:
(slurp (new java.io.File "/dev/null"))
=> ""
Well, of course.
A Java engineer should wonder here: What happened to my checked exception?
When using clojure you are not forced to check for checked exceptions.
Checking for exceptions is done with the try
special form:
(try
(/ 1 0)
(catch ArithmeticException e
(println "oh no" (.getMessage e))))
oh no Divide by zero
=> nil
Clojure provides many interesting ways to interact with the host language and
what it provides.
One nice way, is using the doto
macro which can help make our code look like
the builder pattern:
(doto (java.util.HashMap.)
(.put "foo" 1)
(.put "bar" 2)
(println))
#object[java.util.HashMap 0x2b2c500e {bar=2, foo=1}]
=> {"bar" 2, "foo" 1}
But wait, what is a macro?
As the reference says:
Clojure has a programmatic macro system which allows the compiler to be extended by user code.
I don't intend getting into macros mainly because of time constraints, but it's still interesting to check what does this macro do:
(macroexpand-1 '(doto (java.util.HashMap.)
(.put "foo" 1)
(.put "bar" 2)
(println)))
=>
(clojure.core/let
[G__2049 (java.util.HashMap.)]
(.put G__2049 "foo" 1)
(.put G__2049 "bar" 2)
(println G__2049)
G__2049)
By expanding this macro we see that a hash map is contructed and bound to the symbol G__2049
.
Then there are the .put
calls, then a println
and in the end the symbol is returned.
OK, let's get back to the hash map. An interesting thing one could observe is that it looks
like an actual clojure hash map which can be constructed with the literal form {:a 1}
. But
does that mean that we can compare it to a Java hash map?
Indeed:
(= {"foo" 1 "bar" 2}
(doto (java.util.HashMap.)
(.put "foo" 1)
(.put "bar" 2)))
=> true
We can. This is most exciting.
It's time to check polymorphism and even though there are multiple things to show here, my main focus will be showing protocols and ad-hoc implementations of Java interfaces. It's also worth mentioning the expression problem.
Let's assume we want to create a generic mechanism for concatenating things.
Let's create a protocol, which kinda looks like a Java interface:
(defprotocol IConcat
(ccat [this that]))
This protocol can now be extended to existing types in the running VM.
I'm not using the name concat
it already exists in the Clojure core and it will
be used below.
We can try to:
(ccat "foo" "bar")
Execution error (IllegalArgumentException) at clj-intro.demo-final/eval2064$fn$G (demo_final.clj:107).
No implementation of method: :ccat of protocol: #'clj-intro.demo-final/IConcat found for class: java.lang.String
The compiler explains that there is no implementation of the method ccat
for the String
class.
Let's fix that:
(extend-type java.lang.String
IConcat
(ccat [this that]
(str this that)))
Let's try again:
(ccat "foo" "bar")
=> "foobar"
That looks good. Let's try with vectors now:
(ccat [1 2] [3 4])
Execution error (IllegalArgumentException) at clj-intro.demo-final/eval2064$fn$G (demo_final.clj:107).
No implementation of method: :ccat of protocol: #'clj-intro.demo-final/IConcat found for class: clojure.lang.PersistentVector
We can do the same thing as before:
(extend-type clojure.lang.PersistentVector
IConcat
(ccat [this that]
(concat this that)))
And now:
(ccat [1 2] [3 4])
=> (1 2 3 4)
Which alos looks good.
What if we mix things? Such as:
(ccat "foo" [1 2])
=> "foo[1 2]"
(ccat [1 2] "foo")
=> (1 2 \f \o \o)
Again that's something that may look confusing but it's relatively easy to explain. Main thing
to take into account here is that the first param to the ccat
function is the dispatching type,
so in the first case we will be using the str
function and in the second, the concat
, as in
the 1st case we are dispatching on java.lang.String
and in the 2nd on clojure.lang.PersistentVector
. The later,
concatenates collections and a String is just a collection of characters.
It's time to check some Java Streams interactions. Let's see:
(-> (java.util.stream.Stream/of (to-array ["12.43" "1.123" "0.54"]))
(.map (reify java.util.function.Function
(apply [this x]
(parse-double x))))
(.collect (java.util.stream.Collectors/toList)))
=> [12.43 1.123 0.54]
I know that people can get upset with the fully qualified names, so let's do some imports
by updating our namespace
declaration:
(ns clj-intro.interop
(:import (clojure.lang IPersistentCollection)
(java.util.function Function)
(java.util.stream Collectors Stream)))
Now:
(-> (Stream/of (to-array ["12.43" "1.123" "0.54"]))
(.map (reify Function
(apply [this x]
(parse-double x))))
(.collect (Collectors/toList)))
=> [12.43 1.123 0.54]
This looks much better. The key takeaway here is that we can use reify to implement Java interfaces in an ad-hoc way.
I hope this was instructive and fun. Even if I have gone through some of those examples many times, I still find it
interesting and entertaining to go through them again. And each time I go through them I'm finding
things I didn't know e.g. what type is & args
?
I was probably over ambitions to also include a more end-to-end examples i.e. building something that actually does something, so I'll keep those as pointers here to potential next blog posts: