My friend Andrew recently dicovered the joys of programming, and shared his cool Python code for testing numbers for primality. I wanted to encourage this enthusiasm, and make sure that he learned about functional programming before his mind got too stuck in the ruts of C-style imperative code (which makes up the bulk of examples out there), so I decided to play around with writing some of my own examples for him.
I think one of the best languages for exploring functional programming today is probably Racket, but I haven't had time to learn that yet myself. The pinnacle in many ways is Haskell but that's not nearly as approachable, and I would be even more lost trying to introduce someone there. But I have had the privilege of using Clojure for my professional work for the past several years, and find it a lovely, practical modern Lisp with a focus on simplicity and functional programming, and with a great interoperability story for the Java Virtual Machine environment, which is the indutrial-strength platform we use to deliver our products. (It's also been perfect for my open-source projects in areas like light show control and DJ performance synchronization.) So I'm well equipped to guide a tour through finding prime numbers in Clojure!
Since Clojure is a Lisp, it looks very different than C-like languages (including Python). The first thing to get used to is that pretty much everything is an expression, formed by grouping a list of symbols inside parentheses. The first symbol specifies what you want to do and is generally a function name (or a macro, or a special form, but we won't need to worry about those details in this discussion). Then you follow that with what you want to do it to. So for example, to divide 10 by 2, you would write:
(/ 10 2)
And you would see the result, 5
. Not too surprising. A bit more
surprising is what happens if you try dividing 10 by 3 (from now on,
to save space in these examples, I am going to show what I type at the
Clojure REPL prompt, followed by a comment prefixed with ;; =>
that
shows the response):
(/ 10 3)
;; => 10/3
Wait, 10/3
, what witchcraft is this? It didn't give me an answer, or
at least not the kind I expected! In fact, Clojure is biased towards
retaining as much precision as possible, so when you ask it to perform
a division whose result does not have an exact integer representation,
it returns a rational value which is exactly equal to what you wanted.
You can continue to use that result in further mathematical operations
with no loss of precision, or you can tell Clojure you want a
floating-point approximation of the result, like you would get in most
languages, like so:
(float (/ 10 3))
;; => 3.3333333
So, for finding primes, what we want to know is whether a number
evenly divides another. The rem
function helps with that, giving us
the remainder:
(rem 10 2)
;; => 0
(rem 10 3)
;; => 1
We can combine that with the zero?
predicate (a predicate is a
function that returns true
or false
given its input) to build up
our test for whether one number evenly divides another:
(zero? (rem 10 3))
false
(zero? (rem 10 2))
true
Great, we get true
when the second number evenly divides the first,
and false
otherwise. This is useful enough that I want to give it a
name, to be able to use it in other expressions. So I will define it
as a function, using the defn
macro:
(defn divides?
"Returns true when the numerator can be evenly divided by the
denominator."
[numerator denominator]
(zero? (rem numerator denominator)))
;; => #'user/divides?
There is a little more structure to this than we've encountered so
far, but hopefully it's still easy enough to figure out what is going
on. We are defining a new function named divides?
. The string that
follows the function name is an optional "doc string", which helps
explain how to use the function, both when reading the definition, and
also when you are typing at the REPL (this word comes from "Read,
Eval, Print, Loop", the core of interactive programming with a Lisp
like Clojure). At a REPL prompt you can type (doc divides?)
and get
that explanatory text:
(doc divides?)
;; => -------------------------
;; => user/divides?
;; => ([numerator denominator])
;; => Returns true when the numerator can be evenly divided by the
;; => denominator.
Following the doc string is the parameter list, which specifies what
the function expects to be given when called, in our case two values
which we name numerator
and denominator
. And then finally, the
last line is the actual body of the function, which uses the zero?
predicate and rem
function as we did in our earlier examples. The
return value we saw at the REPL simply reported that the function
divides?
had been created in the user
namespace, which goes a bit
far afield from what we are talking about today.
With this in place, we can try out our new function. It works just as you'd expect:
(divides? 10 2)
;; => true
(divides? 10 3)
;; => false
Incidentally, we can use doc
on the other functions we've been using
up to this point:
(doc zero?)
;; => -------------------------
;; => clojure.core/zero?
;; => ([num])
;; => Returns true if num is zero, else false
And now we are ready to build our prime-tester! As in Andrew's example
I want to start by verifying that we were given a positive integer.
There are two built-in predicates we can use for that, integer?
and
pos?
. All that remains is to check whether there is any number
ranging from two up to the square root of our candidate number that
evenly divides it. If not, then we can be sure it is prime.
While in imperative languages you do this kind of thing by writing an
explicit loop that assigns a variable to hold each value you want to
consider, in a functional language you instead filter sequences of
values using predicates. So let's start by figuring out how to express
the sequence of values we want to check. To test if ten is prime, the
range of values we need to test if it is divisible by starts at two,
and goes up to three (the largest integer that is less than the square
root of ten). There is a built-in function range
that gives a range
of values. If we try it out with the square root of our candidate
value, we see it seems almost give us the numbers we want:
(range (Math/sqrt 10))
;; => (0 1 2 3 4)
We have a couple of extra values at the beginning, because range
defaults to starting with zero, and although it goes far enough in
this case, because it gives you values up to but not including the
upper bound you supply, it would not work if we gave it a perfect
square as its input. Let's tackle the first problem first: We could
use the drop
function to get rid of the first two values, but if we
check the doc
for range
we find that there is also another way to
call it, which lets us supply the starting value too:
(doc range)
-------------------------
clojure.core/range
;; => ([] [end] [start end] [start end step])
;; => Returns a lazy seq of nums from start (inclusive) to end
;; => (exclusive), by step, where start defaults to 0, step to 1, and end to
;; => infinity. When step is equal to 0, returns an infinite sequence of
;; => start. When start is equal to end, returns empty list.
So (range 2 (Math/sqrt 10))
gets us almost there; it works great for ten:
(range 2 (Math/sqrt 10))
;; => (2 3)
But if we try it with 9, which is a perfect square, we don't get the value 3, which is its square root, and which we definitely do need to test as a factor:
(range 2 (Math/sqrt 10))
;; => (2)
To fix that, we will round the result of our square root operation
down to the closest integer less than or equal to it, using the
Math/floor
function, and then add one to that:
(range 2 (inc (Math/floor (Math/sqrt 9))))
(2 3)
That slightly more complex formula also works for ten:
(range 2 (inc (Math/floor (Math/sqrt 10))))
(2 3)
Ok, that gives us the numbers we want to check if ten is divisible by.
Ten is prime if it is not divisible by any of those numbers. And the
way we express that in Clojure is almost shockingly concise and
readable, once you get over all the parentheses. At this point I want
to show you the finished prime?
function, and explain how the final
pieces fit together:
(defn prime?
"Returns true when `n` is prime."
[n]
(and (integer? n)
(pos? n)
(not-any? (partial divides? n) (range 2 (inc (Math/floor (Math/sqrt n)))))))
We already looked at defn
, the doc string, and argument list. Our
prime?
function takes a single argument n
, and (as suggested by
the question mark at the end of the name) is a predicate, returning
true
or false
. In the body, we use a new and
function, which, as
its name suggests, returns true
if all of its arguments are
themselves true. In fact, it stops evaluating and returns false as
soon as it encounters a false argument, so we don't have to worry
about division by zero or other problems, because the first two things
we check are that n
is a positive integer.
That last line is where the rubber meets the road. And at first glance
it might seem confusing. But notice that the second half, the range
expression, we have already explored. That returns a sequence of
integers ranging from 2 to the square root of n
. It is used as the
second argument of the not-any?
function. not-any?
takes a
predicate and a collection, and returns true
if no elements of the
collection satisfy the predicate. The collection we gave it is the
list of numbers we want to check as potential divisors of our
candidate prime number. So we need a predicate that returns true
if
the potential divisor evenly divides n
.
In other words, we want a function that takes a single number, and
returns true
if that number evenly divides n
, whatever n
happened to be when prime?
was called. One of the really powerful
things about functional languages is that they make it very easy to
create, modify, and pass around functions, and our little bit of code
here is a perfect example of that. The partial
function is an
example of a higher order function, which means a function that
manipulates other functions. You give partial
the name of a function
that you want to work with, and some arguments, and it returns the
result of partially applying those arguments to the function.
What?
Well, let me be more conrete. We have already defined a function
divides?
that takes two arguments. We want a function that acts just
like that, only with the first argument, the numerator, already filled
in, because we want a predicate that takes one argument and tests
whether that value evenly divides n
. We can do exactly that:
(partial divides? n)
creates a new function that calls divides?
with the values n
and whatever you pass to this new function. So
that is exactly what we need as a predicate for not-any?
to use. Our
prime?
function will return true
if no integer from 2 to the
square root of n
can evenly divide n
. And in so few words! The
explanation is far bigger than the code itself, even counting the doc
strings. Let's try it out!
(prime? 10)
;; => false
(prime? 11)
;; => true
(prime? 1001)
;; => false
user=> (prime? 1009)
;; => true
So that was to show Andrew what his prime tester would look like in
Clojure. But this is just where the fun begins! Now that we have a
nice prime?
predicate, we can use it to easily build the list of all
prime numbers:
(def primes
"The prime numbers."
(filter prime? (range)))
The filter
function takes a predicate and a collection, and returns
a new collection that contains only those elements for which the
predicate was true. So in this case, we supply all the natural numbers
for our range (when you give no arguments to range
, the lower bound
is zero, and there is no upper bound), and filter out anything that
isn't prime.
Whoa, James, you're crazy. That can't work. I know I bought a fancy computer, but it doesn't have enough memory to store all numbers, and even if it could, even with my fast processor, it would take too long to figure out which elements in that infinite sequence were prime!
Here's the crazy thing, it does work! And the reason it does is that
Clojure supports lazy sequences, meaning the values are not produced
until you actually ask for them. So we can quite happily define
infinite sequences, and only pay for the production of the values we
actually want to work with. So yes, if you tried to print out the
value of primes
, your computer would chunk away until it ran out of
memory, but we can use the take
function to look at as many as we
want to without that problem. Here are the first ten primes:
(take 10 primes)
;; => (1 2 3 5 7 11 13 17 19 23)
And, actually, this reveals a mistake we made early on. Technically, 1
is not considered a prime number, and including it in this list is
going to break some of the fancy things we are about to do, so let's
go back and fix this. (I left this change until now, because I wanted
the code to exactly parallel Andrew's Python example, but we are going
to start diverging a bit.) Here is the revised version of prime?
:
(defn prime?
"Returns true when `n` is prime."
[n]
(and (integer? n)
(> n 1)
(not-any? (partial divides? n) (range 2 (inc (Math/floor (Math/sqrt n)))))))
I will include a complete copy of the final version of all this code at the end of this article for people who want to examine it as a whole, or load it into their own REPL and play.
Reloading everything with that change in place, we get this more traditional list of the first ten primes:
(take 10 primes)
(2 3 5 7 11 13 17 19 23 29)
And we can combine take
with drop
to skip to the part of the
sequence we're interested in. Here's how we would find out what the
thousandth prime is:
(take 1 (drop 999 primes))
;; => (7919)
Or what about the ten-thousandth?
(take 1 (drop 9999 primes))
;; => (104729)
But that was starting to make my computer work hard. There was a
noticeable delay between when I hit return
and when I got the answer
back. Interestingly, however, if I ask it a second time, I get the
answer instantly. That's because the primes
sequence remembers
values that have aleady been calculated, and only has to do work when
you ask it to go out beyond what it has already figured out.
Let's take a look at exactly how hard it is working, and start
exploring some interesting ways we can calculate primes faster. If I
quit the Clojure REPL, then reload my functions, I can time how long
getting the ten-thousandth prime took using the time
function.
At first I was getting utterly bogus values by trying it directly like this:
(time (take 1 (drop 9999 primes)))
;; => "Elapsed time: 0.02719 msecs"
But then I remembered one of the pitfalls of working with lazy
sequences. In the above attempt, we never did anything with the
expression, so it stayed lazy, and the prime number sequence never got
built. So to force it to really do the work, I added a call to
vec
, which converts the result to a non-lazy vector, thereby forcing
evaluation. Armed with that better approach, and starting with a fresh
Clojure REPL, I got the following results:
(time (vec (take 1 (drop 9999 primes))))
;; => "Elapsed time: 28318.878726 msecs"
;; => [104723]
The square brackets around the result show it is a vector rather than a sequence, and you can see it took nearly thirty seconds to come up with the answer.
Running it a second time it already knew the primes up to that point, so it went vastly faster:
(time (vec (take 1 (drop 9999 primes))))
;; => "Elapsed time: 1.320859 msecs"
;; => [104723]
No noticeable delay. Lazy sequences are cool! And we can use them in really powerful ways, as we will now explore.
That was some very compact code that gave us a nice list of prime
numbers. But we can do even better! Now that we have this sequence of
prime numbers, we can make use of it in our definition of prime?
itself, because it is actually redundant to try dividing n
by
every number from 2
to sqrt(n)
, we only need to test the prime
numbers in that range.
To make this convenient, let's define a couple of new functions. First, a predicate that will tell us whether the prime factor we are considering is small enough that we need to test it:
(defn sqrt-or-less?
"Returns true when `candidate` is less than or equal to the
square root of `n`."
[n candidate]
(<= candidate (Math/sqrt n)))
Then, a function that returns the set of prime numbers we need to try
dividing our number by, in order to see if it is prime. In other
words, all prime numbers from 2 up to the square root of our number.
Because this function is going to use our primes
sequence, but we
can't define primes
until later because it needs to use this
function, we have to promise the Clojure compiler that primes
is a
value we will define later on. Then we can write the new function
using it:
(declare primes)
(defn potential-prime-factors
"Returns the prime numbers from 2 up to the square root of `n`."
[n]
(take-while (partial sqrt-or-less? n) primes))
This introduces another new function, take-while
, which is like
take
but instead of knowing in advance how many values we want, we
can supply a predicate, and we will get back all the values up to the
one for which that predicate is no longer true. The predicate we want
is one that is true when the number is small enough to potentially be
a prime factor of n
, so that means it is less than or equal to the
square root of n
. We once again use the partial
higher-order
function to build that predicate, by "pre-filling" in the value of n
as the first argument of our sqrt-or-less?
predicate, and then we
keep feeding it prime numbers until we reach one that is too big.
I could have done this next bit without adding those two new helper functions, but that would require using Clojure's "function literals" to define functions right where I am using them, and that is more of a detour than makes sense for this introduction. And sometimes spelling things out and naming them makes for a more readable solution to my future self anyway.
Finally, a redefinition of our prime?
function, taking advantage of
these new helper functions:
(defn prime?
"Returns true if `n` is prime."
[n]
(and (integer? n)
(> n 1)
(not-any? (partial divides? n) (potential-prime-factors n))))
It's now even more self-explanatory, thanks to the nicely named helper functions. But more importantly, is it faster? If I start in a fresh Clojure REPL, load the code, and time it, I see...
(time (vec (take 1 (drop 9999 primes))))
;; => "Elapsed time: 121.442485 msecs"
;; => [104729]
Indeed! A tenth of a second is definitely faster than thirty. Nice. And what a fun exploration this has been. Thanks, Andrew! Even though I kind of knew where I was hoping to end up, I was impressed by just how perfectly lazy sequences fit this problem. I was able to define the prime number sequence using a function that depends on the prime number sequence, and since things are not evaluated until they are needed, that works perfectly, and only does work once, the first time it is needed.
When I first posted these examples, I had misremembered how big you need to go when testing potential prime factors, and was going all the way up to half of
n
, rather than to its square root. That still saved time, taking just over six seconds, but a tenth of a second is way better. Thanks to my newest coworker Sarah for reminding me of this important fact.
Here is the full code in its final version, as promised:
(defn divides?
"Returns true when the numerator can be evenly divided by the
denominator."
[numerator denominator]
(zero? (rem numerator denominator)))
(defn sqrt-or-less?
"Returns true when `candidate` is less than or equal to the
square root of `n`."
[n candidate]
(<= candidate (Math/sqrt n)))
(declare primes)
(defn potential-prime-factors
"Returns the prime numbers from 2 up to the square root of `n`."
[n]
(take-while (partial sqrt-or-less? n) primes))
(defn prime?
"Returns true if `n` is prime."
[n]
(and (integer? n)
(> n 1)
(not-any? (partial divides? n) (potential-prime-factors n))))
(def primes
"The prime numbers."
(filter prime? (range)))
Can you improve this documentation?Edit on GitHub
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close