Array manipulation library for Clojure with "sweet" array type notation and more safety by static types
Clojure has built-in support for Java arrays and provides a set of
facilities for manipulating them, including make-array, aget, aset and so on.
However, some of their difficulties tend to lead users to write verbose or
unexpectedly inefficient code. For example, array type hints can be cryptic
(e.g. [[D, [Ljava.lang.String;, etc.) and occasionally pretty hard for humans
to write manually.
These issues have been pointed out by various Clojurians in the past.
sweet-array aims to provide solutions for them. Concretely, it provides a
generic array constructor, versions of aget and aset that infer static types,
and a concise array type notation.
As a result, we can write code like the following using sweet-array:
(require '[sweet-array.core :as sa])
(defn ^#sweet/tag [[double]] array-mul [a b]
  (let [a (sa/cast [[double]] a)
        b (sa/cast [[double]] b)
        nrows (alength a)
        ncols (alength (sa/aget b 0))
        n (alength b)
        c (sa/new [[double]] nrows ncols)]
    (dotimes [i nrows]
      (dotimes [j ncols]
        (dotimes [k n]
          (sa/aset c i j
                   (+ (* (sa/aget a i k)
                         (sa/aget b k j))
                      (sa/aget c i j))))))
    c))
Instead of:
(defn ^"[[D" array-mul [^"[[D" a ^"[[D" b]
  (let [nrows (alength a)
        ncols (alength ^doubles (aget b 0))
        n (alength b)
        ^"[[D" c (make-array Double/TYPE nrows ncols)]
    (dotimes [i nrows]
      (dotimes [j ncols]
        (dotimes [k n]
          (aset ^doubles (aget c i) j
                (+ (* (aget ^doubles (aget a i) k)
                      (aget ^doubles (aget b k) j))
                   (aget ^doubles (aget c i) j))))))
    c))
Note that all the type hints in this code are mandatory to make it run as fast as the sweet-array version above.
Add the following to your project dependencies:
(new [T] n1 n2 ... nk)
The simplest way to create an array using this library is to use the
sweet-array.core/new macro. The new macro is a generic array constructor
which can be used to create both primitive and reference type arrays:
(require '[sweet-array.core :as sa])
(def xs (sa/new [int] 3))
(class xs) ;=> [I, which means int array type
(alength xs) ;=> 3
(def ys (sa/new [String] 5))
(class ys) ;=> [Ljava.lang.String;, which means String array type
(alength ys) ;=> 5
The first argument of the new macro is what we call an array type descriptor.
See the Array type notation section for more details, but roughly speaking,
an array type descriptor [T] denotes an array type whose component type is T
(e.g. [int] denotes the int array type and [String] denotes the String array type).
The new macro can also be used to create multi-dimensional arrays.
The following example creates a two-dimensional int array:
(def arr (sa/new [[int]] 2 3))
(class arr) ;=> [[I, which means 2-d int array type
(alength arr) ;=> 2
(alength (aget arr 0)) ;=> 3
In general, (sa/new [[T]] n1 n2) produces a 2-d array of type T of size n1 x n2,
(sa/new [[[T]]] n1 n2 n3) produces a 3-d array of type T of size n1 x n2 x n3,
and so on.
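As a quick illustration of the general rule, the sketch below creates a 3-d int array and checks the size of each dimension. It assumes sweet-array is on the classpath; the name cube is only for illustration, and the comparison with clojure.core/make-array is included just to show the plain-Clojure counterpart.

```clojure
(require '[sweet-array.core :as sa])

;; 3-d int array of size 2 x 3 x 4 (the name `cube` is illustrative)
(def cube (sa/new [[[int]]] 2 3 4))

(class cube)               ;=> [[[I, which means 3-d int array type
(alength cube)             ;=> 2
(alength (aget cube 0))    ;=> 3
(alength (aget cube 0 0))  ;=> 4

;; The same array type built with clojure.core/make-array
(= (class cube) (class (make-array Integer/TYPE 2 3 4))) ;=> true
```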
(new [T] [e1 e2 ... ek])
The new macro provides another syntax to create an array enumerating
the initial elements. (sa/new [T] [e1 e2 ... ek]) creates an array
initialized with the elements e1, e2, ..., ek:
(def arr (sa/new [int] [1 2 3]))
(alength arr) ;=> 3
[(aget arr 0) (aget arr 1) (aget arr 2)] ;=> [1 2 3]
In general, (sa/new [T] [e1 e2 ... ek]) is equivalent to:
(doto (sa/new [T] k)
  (aset 0 e1)
  (aset 1 e2)
  ...
  (aset (- k 1) ek))
This form can be used to initialize arrays of any dimensionality:
;; 2-d double array
(sa/new [[double]] [[1.0 2.0] [3.0 4.0]])
;; 3-d boolean array
(sa/new [[[boolean]]]
        [[[true false] [false true]]
         [[false true] [true false]]])
When initializing multi-dimensional arrays, the init expression for each element may itself be an array or an expression that evaluates to an array:
(def xs (sa/new [double] [1.0 2.0]))
(def ys (sa/new [double] [3.0 4.0]))
(sa/new [[double]] [xs ys])
(into-array [T] coll)
Another way to create an array is to use the sweet-array.core/into-array macro:
(require '[sweet-array.core :as sa])
(def arr (sa/into-array [int] (range 10)))
(class arr) ;=> [I
(alength arr) ;=> 10
Like clojure.core/into-array, sa/into-array converts an existing collection
(Seqable) into an array. Unlike clojure.core/into-array, the resulting array
type is specified with the array type descriptor as the first argument.
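To make the contrast concrete, here is a small sketch comparing the two, assuming sweet-array is on the classpath. clojure.core/into-array takes its component type from an optional class argument or, failing that, from the first element, whereas sa/into-array always takes an array type descriptor.

```clojure
(require '[sweet-array.core :as sa])

;; clojure.core/into-array: without a class argument, the component type
;; comes from the first element (a boxed Long here)
(class (into-array [1 2 3]))               ;=> [Ljava.lang.Long;
;; ... or from an explicit class object
(class (into-array Integer/TYPE [1 2 3]))  ;=> [I

;; sa/into-array: the type is always given as an array type descriptor
(class (sa/into-array [int] [1 2 3]))      ;=> [I
(class (sa/into-array [Long] [1 2 3]))     ;=> [Ljava.lang.Long;
```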
sa/into-array can also be used to create multi-dimensional arrays:
(def arr' (sa/into-array [[int]] (partition 2 (range 10))))
(class arr') ;=> [[I
[(aget arr' 0 0) (aget arr' 0 1) (aget arr' 1 0) (aget arr' 1 1)]
;=> [0 1 2 3]
(into-array [T] xform coll)
The sa/into-array macro optionally takes a transducer.
This form is inspired by and therefore analogous to (into to xform from).
That is, the transducer xform as the second argument will be applied
while converting the collection into an array:
(def arr (sa/into-array [int] (filter even?) (range 10)))
(alength arr) ;=> 5
[(aget arr 0) (aget arr 1) (aget arr 2)] ;=> [0 2 4]
This is especially useful to do transformations that increase or decrease the dimensionality of an array:
;; 1-d to 2-d conversion
(sa/into-array [[int]] (partition-all 2) (sa/new [int] [1 2 3 4]))
;; 2-d to 1-d conversion
(sa/into-array [double] cat (sa/new [[double]] [[1.0 2.0] [3.0 4.0]]))
(aget array idx1 idx2 ... idxk)
(aset array idx1 idx2 ... idxk val)
sweet-array provides its own version of aget / aset for indexing arrays.
They work almost the same way as aget / aset defined in clojure.core:
(require '[sweet-array.core :as sa])
(def ^"[I" arr (sa/new [int] [1 2 3 4 5]))
(sa/aget arr 2) ;=> 3
(sa/aset arr 2 42)
(sa/aget arr 2) ;=> 42
Like clojure.core's aget & aset (c.c/aget & aset for short), they can also be
used for multi-dimensional arrays:
(def ^"[[D" arr (sa/new [[double]] [[1.0 2.0] [3.0 4.0]]))
(sa/aget arr 1 1) ;=> 4.0
(sa/aset arr 1 1 42.0)
(sa/aget arr 1 1) ;=> 42.0
The difference is that sa/aget and sa/aset infer the static type of their
first argument and utilize it for several purposes as follows.
In a nutshell, they are safer and faster:
When the type of the first argument cannot be inferred, they fall back to
c.c/aget & aset and emit a reflection warning:
(set! *warn-on-reflection* true)
(fn [arr] (sa/aget arr 0))
;; Reflection warning, ... - call to static method aget on clojure.lang.RT can't be resolved (argument types: unknown, int).
(fn [arr] (sa/aget arr 0 0))
;; Reflection warning, ... - type of first argument for aget cannot be inferred
They also reject obviously invalid calls at macroexpansion time, such as indexing a non-array value or using more indices than the array has dimensions:
(sa/aget "I'm a string" 0)
;; Syntax error macroexpanding sweet-array.core/aget* at ...
;; Can't apply aget to "I'm a string", which is java.lang.String, not array
(sa/aget (sa/new [int] 3) 0 1 2)
;; Syntax error macroexpanding sweet-array.core/aget* at ...
;; Can't apply aget to (sa/new [int] 3) with more than 1 index(es)
sa/aget & sa/aset know that indexing [T] once results in the type T, and automatically insert obvious type hints to the expanded form, which reduces the cases where one has to add type hints manually:
(require '[criterium.core :as cr])
(def ^"[[I" arr
  (sa/into-array [[int]] (map (fn [i] (map (fn [j] (* i j)) (range 10))) (range 10))))
(cr/quick-bench (dotimes [i 10] (dotimes [j 10] (aget arr i j))))
;; Evaluation count : 792 in 6 samples of 132 calls.
;; Execution time mean : 910.441562 µs
;; Execution time std-deviation : 170.924552 µs
;; Execution time lower quantile : 758.037129 µs ( 2.5%)
;; Execution time upper quantile : 1.151744 ms (97.5%)
;; Overhead used : 8.143474 ns
;; The above result is way too slow due to reflection that is easy to overlook
;; To avoid this slowness, you'll need to add type hints yourself
(cr/quick-bench (dotimes [i 10] (dotimes [j 10] (aget ^ints (aget arr i) j))))
;; Evaluation count : 4122636 in 6 samples of 687106 calls.
;; Execution time mean : 139.098679 ns
;; Execution time std-deviation : 2.387043 ns
;; Execution time lower quantile : 136.235737 ns ( 2.5%)
;; Execution time upper quantile : 142.183007 ns (97.5%)
;; Overhead used : 8.143474 ns
;; Using `sa/aget`, you can simply write as follows:
(cr/quick-bench (dotimes [i 10] (dotimes [j 10] (sa/aget arr i j))))
;; Evaluation count : 5000448 in 6 samples of 833408 calls.
;; Execution time mean : 113.195074 ns
;; Execution time std-deviation : 4.641354 ns
;; Execution time lower quantile : 108.656324 ns ( 2.5%)
;; Execution time upper quantile : 119.427431 ns (97.5%)
;; Overhead used : 8.143474 ns
sweet-array also provides several utilities that are useful for dealing with
array types.
(type [T])
The sweet-array.core/type macro is a convenient way to obtain the array type
object (class) denoted by an array type descriptor:
(require '[sweet-array.core :as sa])
(sa/type [int]) ;=> [I
(sa/type [String]) ;=> [Ljava.lang.String;
(sa/type [[double]]) ;=> [[D
Each form shown above is more concise and straightforward than the corresponding traditional code:
(class (int-array 0)) ;=> [I
(class (make-array String 0)) ;=> [Ljava.lang.String;
(class (make-array Double/TYPE 0 0)) ;=> [[D
(instance? [T] expr)
The sweet-array.core/instance? macro is a predicate to check if a given value is
of the specified array type:
(sa/instance? [int] (sa/new [int] [1 2 3])) ;=> true
(sa/instance? [Object] (sa/new [int] [1 2 3])) ;=> false
(sa/instance? [String] "foo") ;=> false
(sa/instance? [T] expr) is just syntactic sugar for (instance? (sa/type [T]) expr).
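The equivalence can be checked directly. The following sketch assumes sweet-array is on the classpath and simply evaluates both forms against the same value; xs is an illustrative name.

```clojure
(require '[sweet-array.core :as sa])

(def xs (sa/new [int] [1 2 3]))

;; The sugared form ...
(sa/instance? [int] xs)               ;=> true
;; ... and the clojure.core form it stands for
(instance? (sa/type [int]) xs)        ;=> true

;; Arrays created with clojure.core functions are ordinary Java arrays too
(sa/instance? [int] (int-array 3))    ;=> true
(sa/instance? [[int]] (int-array 3))  ;=> false
```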
(cast [T] expr)
The sweet-array.core/cast macro is for coercing an expression to the specified
array type. It's useful for resolving reflection warnings when some expression
cannot be type-inferred:
(defn make-array [n] (sa/new [int] n))
(set! *warn-on-reflection* true)
(sa/aget (make-array 3) 0)
;; Reflection warning, ... - call to static method aget on clojure.lang.RT can't be resolved (argument types: unknown, int).
;=> 0
(sa/aget (sa/cast [int] (make-array 3)) 0)
;=> 0
Note that sa/cast only has a compile-time effect and does nothing at runtime.
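A common place for sa/cast is at the top of a function whose parameters carry no type hints, as in the array-mul example at the beginning. The sketch below assumes sweet-array is on the classpath and that, as in array-mul, a local bound to a sa/cast expression is treated as an [int] by alength and sa/aget. The function name sum-ints is hypothetical.

```clojure
(require '[sweet-array.core :as sa])

;; `sum-ints` is a hypothetical helper, not part of sweet-array
(defn sum-ints [xs]
  (let [xs (sa/cast [int] xs)   ; compile-time hint only, no runtime check
        n  (alength xs)]
    (loop [i 0, acc 0]
      (if (< i n)
        (recur (inc i) (+ acc (sa/aget xs i)))
        acc))))

(sum-ints (sa/new [int] [1 2 3])) ;=> 6
```

Because the cast has no runtime effect, passing something other than an int array is not caught by sa/cast itself.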
#sweet/tag [T]
For those who want to radically eliminate cryptic array type hints (e.g. ^"[I"
and ^"[Ljava.lang.String;") from their code, sweet-array provides reader syntax
that can be used as a replacement for them.
By prefixing an array type descriptor with #sweet/tag, you can use it as a type hint:
(defn ^#sweet/tag [String] select-randomly [^#sweet/tag [[String]] arr]
(sa/aget arr (rand-int (alength arr))))
This code compiles without any reflection warning, just as with:
(defn ^"[Ljava.lang.String;" select-randomly [^"[[Ljava.lang.String;" arr]
(sa/aget arr (rand-int (alength arr))))
sweet-array adopts what we call array type descriptors to denote array types
throughout the library. Following is the definition of sweet-array's
array type descriptors:
<array type descriptor> ::= '[' <component type> ']'
| <array type alias>
<component type> ::= <primitive type name>
| <reference type name>
| <array type descriptor>
<primitive type name> ::= 'boolean'
| 'byte'
| 'char'
| 'short'
| 'int'
| 'long'
| 'float'
| 'double'
<reference type name> ::= any valid class or interface name
<array type alias> ::= 'booleans'
| 'bytes'
| 'shorts'
| 'ints'
| 'longs'
| 'floats'
| 'doubles'
| 'objects'
An array type descriptor [T] denotes an array whose component type is T.
The component type itself may be an array type. For instance, [[T]] denotes
the two-dimensional array type of T, [[[T]]] denotes the three-dimensional
array type of T, and so on.
Array type aliases, such as ints and doubles, may also be used as array type
descriptors. They are completely interchangeable with their corresponding array type
descriptor: ints is equivalent to [int] and [doubles] is equivalent to [[double]],
and so on.
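The interchangeability of aliases and bracketed descriptors can be seen with the utilities described earlier. This is a minimal sketch, assuming sweet-array is on the classpath; the name m is illustrative.

```clojure
(require '[sweet-array.core :as sa])

;; An alias and its bracketed form denote the same array class
(= (sa/type ints) (sa/type [int]))            ;=> true
(= (sa/type [doubles]) (sa/type [[double]]))  ;=> true

;; Aliases can appear wherever a descriptor is expected
(def m (sa/new [doubles] 2 3))
(class m)                                     ;=> [[D
(sa/instance? [[double]] m)                   ;=> true
```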
Copyright © 2021 Shogo Ohta
This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.
This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.