Fast numerical computing for Clojure.
Write math with deftm, get world-class numerical performance on the JVM —
with full REPL interactivity, automatic differentiation, and GPU compilation.
(require '[raster.core :refer [deftm ftm]])
(require '[raster.numeric :refer [+ *]])
(require '[raster.math :refer [sqrt]])
;; Typed function — compiles to primitive JVM bytecode, no boxing
(deftm norm [x :- Double, y :- Double] :- Double
(sqrt (+ (* x x) (* y y))))
(norm 3.0 4.0) ;; => 5.0
The fastest way to see Raster in action:
# Clone and start a REPL
git clone https://github.com/replikativ/raster.git
cd raster
clojure -M:nREPL
Then open one of the interactive notebooks in your editor (Kindly/Clay compatible):
| Notebook | What you'll learn |
|---|---|
| Getting Started | deftm, typed dispatch, value types |
| Automatic Differentiation | Forward-mode, reverse-mode, value+grad |
| ODE Solvers | Lorenz attractor, adaptive solvers, events |
| Linear Algebra | Vec/Mat types, LU, Cholesky, SVD |
| Optimization | L-BFGS, Nelder-Mead, Newton's method |
| Deep Learning | MLP training with compiled AD |
Three games built on the Raster Vulkan engine — playable demos of typed dispatch, parallel primitives, and procedural generation:
Geometric Asteroids — polygon shapes via defvalue + deftm dispatch. Add new shape types from the REPL during gameplay.

Valley — voxel survival with procedural terrain, crafting, and mobs. Uses raster.noise for terrain generation.
There's also a Doom-style renderer with WAD loading and BSP traversal.
See examples/README.md for running instructions.
Clojure is great for data-driven applications, but numerical computing has traditionally required dropping to Java or calling out to Python. Raster fills that gap with three ideas:
Typed multiple dispatch (inspired by Julia) — define the same function for different types and the most specific method is selected automatically. The compiler resolves dispatch at call sites when the consumer's argument types are known — either by selecting the closest concrete method or by specializing a parametric template — and emits JVM bytecode with zero overhead.
Parallel combinators (inspired by Futhark) —
express data parallelism with par/map, par/reduce, and par/scan. The
compiler treats these as first-class IR nodes, fuses producer-consumer chains
(SOAC fusion), and lowers them to SIMD loops, OpenCL kernels, or Vulkan
compute — from the same source code.
End-to-end compilation — a transparent nanopass compiler that you own and
can inspect. Because we compile from Clojure source to JVM bytecode directly
(no LLVM dependency), we control every optimization pass and can target
multiple backends: JVM bytecode with SIMD vectorization, OpenCL C, Vulkan
SPIR-V, or Intel Level Zero. The pipeline is fully inspectable via
explain-pipeline. We can also emit C or potentially LLVM IR when needed,
but the default path avoids external toolchain dependencies entirely.
This combination — typed dispatch for abstraction, parallel combinators for structure, and a self-contained compiler for performance — is how Raster aims to match JAX-level deep learning performance while remaining a regular Clojure library with full REPL interactivity.
deftm is a macro, not a DSL. Your functions are regular Clojure values that work at the REPL, in tests, and with your editor.

compile-aot inlines entire chains of deftm functions into a single JVM method with zero heap allocations in the hot path.

The same par/map expression runs as a sequential loop at the REPL, a SIMD-vectorized loop in compiled JVM code, or a GPU kernel on OpenCL/Vulkan/Level Zero.

All benchmarks on Valhalla JDK 27, single-threaded CPU unless noted. GPU benchmarks on Intel Arc A770 (Level Zero).
| Workload | Raster | Reference | vs Reference |
|---|---|---|---|
| ODE solve (DP5 Lorenz, t=100) | 432 µs | Julia DiffEq 583 µs | 1.4x faster |
| MLP 784-128-10 train step (f64) | 136 µs | JAX CPU 86 µs | JAX 1.6x faster |
| MLP 784-128-10 train step (f32) | 77 µs | JAX CPU 50 µs | JAX 1.5x faster |
| LeNet-5 train step (f64) | 222 µs | JAX CPU 370 µs | 1.7x faster |
| LeNet-5 train step (f32) | 148 µs | JAX CPU 356 µs | 2.4x faster |
| AD sensitivity (Lotka-Volterra, Dual4) | 15 µs | Julia ForwardDiff 16 µs | 1.1x faster |
| ABM 10M agents/period (GPU) | 172 ms | CPU-parallel 459 ms | 2.7x faster |
The DL numbers include the full compiled pipeline: forward pass, reverse-mode AD
backward pass, and SGD parameter update — compiled to a single JVM method via
compile-aot. Zero heap allocations in the hot path. Dense layers use fused
par/map + par/reduce SOACs (no BLAS calls) — the same strategy as Futhark
and XLA's CPU backend. BLAS via Panama FFI (cblas_dgemv/cblas_dgemm) is
available for explicit use in large batched GEMM workloads.
deftm and ftm

deftm (typed multi) defines functions with type annotations and multiple dispatch:
;; Multiple dispatch — same name, different types
(deftm norm [x :- Double, y :- Double] :- Double
(sqrt (+ (* x x) (* y y))))
(deftm norm [x :- Float, y :- Float] :- Float
(sqrt (+ (* x x) (* y y))))
(norm 3.0 4.0) ;; => 5.0 (Double path)
(norm (float 3) (float 4)) ;; => 5.0 (Float path, no boxing)
;; Typed lambdas
(deftm apply-twice [f :- (Fn [Double] Double), x :- Double] :- Double
(f (f x)))
(apply-twice (ftm [x :- Double] :- Double (* x x)) 3.0) ;; => 81.0
Use deftm/ftm for all numerical code. Use plain defn for glue code,
configuration, and I/O.
raster.par provides declarative parallel primitives that the compiler can
fuse, vectorize, and lower to different backends:
(require '[raster.par :as par])
;; Element-wise map — returns a new array
(deftm relu [xs :- (Array double)] :- (Array double)
(par/map [i (alength xs)]
(Math/max 0.0 (aget xs i))))
;; Dot product — fused map + reduce
(deftm dot [xs :- (Array double), ys :- (Array double)] :- Double
(par/reduce + 0.0 [i (alength xs)]
(* (aget xs i) (aget ys i))))
When composed, the compiler fuses adjacent par/map and par/reduce operations
into single loops — no intermediate arrays are allocated. This is the same
SOAC (Second-Order Array Combinator) fusion strategy pioneered by
Futhark, applied here within a general-purpose
host language rather than a standalone array language.
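For intuition about what fusion buys, here is a hand-fused equivalent of summing over relu — a single pass with no intermediate array. This is a plain-Clojure sketch of the shape of the fused loop, not actual compiler output:

```clojure
;; Hand-fused equivalent of a sum over relu: one loop,
;; the intermediate relu array is never allocated.
(defn fused-relu-sum ^double [^doubles xs]
  (let [n (alength xs)]
    (loop [i 0, acc 0.0]
      (if (< i n)
        (recur (inc i) (+ acc (Math/max 0.0 (aget xs i))))
        acc))))

(fused-relu-sum (double-array [-1.0 2.0 -3.0 4.0])) ;; => 6.0
```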
raster.numeric provides +, -, *, / and friends that dispatch on
concrete types — Long stays Long, Float stays Float:
(+ 1 2) ;; Long + Long -> Long
(+ 1.0 2.0) ;; Double + Double -> Double
(+ (float 1) (float 2)) ;; Float + Float -> Float
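For contrast, stock clojure.core arithmetic treats Float operands as doubles, so Float-typed kernels silently widen — that is the gap the type-preserving operators close. Plain Clojure, no raster needed:

```clojure
;; clojure.core/+ keeps longs as Long but widens floats to Double
(type (clojure.core/+ 1 2))                 ;; => java.lang.Long
(type (clojure.core/+ (float 1) (float 2))) ;; => java.lang.Double
```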
Forward-mode (Dual numbers) and reverse-mode (IR source transformation):
(require '[raster.ad :refer [value+grad grad]])
(deftm rosenbrock [x :- Double, y :- Double] :- Double
(let [a (- y (* x x))
b (- 1.0 x)]
(+ (* 100.0 a a) (* b b))))
;; value+grad returns [f(x), ∇f(x)]
(value+grad rosenbrock 1.0 1.0) ;; => [0.0 [0.0 0.0]]
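As a quick sanity check that needs nothing from raster, central finite differences over the same Rosenbrock expression confirm the zero gradient at (1, 1). Plain Clojure sketch:

```clojure
(defn rosenbrock ^double [^double x ^double y]
  (let [a (- y (* x x))
        b (- 1.0 x)]
    (+ (* 100.0 a a) (* b b))))

;; Central finite differences with step h — approximates [df/dx df/dy].
(defn fd-grad [f ^double x ^double y]
  (let [h 1e-6]
    [(/ (- (f (+ x h) y) (f (- x h) y)) (* 2.0 h))
     (/ (- (f x (+ y h)) (f x (- y h))) (* 2.0 h))]))

(fd-grad rosenbrock 1.0 1.0) ;; each component ≈ 0.0
```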
For maximum performance, compile-aot inlines an entire call chain —
forward pass, AD, optimizer update — into a single JVM method:
(require '[raster.compiler.pipeline :as pipeline])
(def fast-step (pipeline/compile-aot #'train-step!))
The compiler performs buffer fusion (reusing dead arrays), SOAC fusion
(merging parallel combinator chains), and emits SIMD-vectorized loops — all
automatically from your deftm source.
Use (pipeline/explain-pipeline #'my-fn) to see what each compiler pass does.
| Module | Description |
|---|---|
| raster.ode | ODE solvers (Euler, RK4, DP5, Tsit5), PDE (method-of-lines), SDE |
| raster.sci.optim | L-BFGS, Nelder-Mead, Newton, gradient descent |
| raster.linalg | Dense linear algebra: LU, Cholesky, SVD, QR, eigendecomposition |
| raster.linalg.iterative | Krylov methods (CG, GMRES, BiCGSTAB) |
| raster.linalg.sparse | Sparse vectors and matrices |
| raster.sci.interpolation | Linear, cubic spline, Akima, PCHIP, 2D bilinear/bicubic |
| raster.sci.fft | Fast Fourier Transform |
| raster.sci.quadrature | Numerical integration (Gauss-Kronrod, Simpson) |
| raster.sci.roots | Root finding (bisection, Brent, Newton) |
| raster.sci.distributions | Normal, Uniform, Exponential, Gamma, Poisson |
| raster.sci.stats | t-tests, chi-squared, KS test, correlation |
| raster.sci.special | Gamma, beta, error functions |
| Module | Description |
|---|---|
| raster.dl.nn | Linear, conv1d/2d, maxpool, normalization, activations |
| raster.dl.attention | Scaled dot-product and multi-head attention |
| raster.dl.loss | MSE, cross-entropy, Huber, L1 (with AD rules) |
| raster.dl.optim | SGD, Adam, AdamW, learning rate schedulers |
| raster.dl.einsum | Einstein summation and einops-style rearrangement |
| raster.dl.diffusion | DDPM noise schedules and sampling |
All layers are deftm functions on flat arrays — the compiler sees through
layer boundaries and fuses operations end-to-end. See the
LeNet-5 and GPT-2
examples.
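To make the "layers are functions on flat arrays" convention concrete, here is a hypothetical dense-layer forward pass in plain Clojure. The name dense-forward and the row-major [out in] weight layout are illustrative assumptions, not the raster.dl.nn API:

```clojure
;; Hypothetical sketch: y = W x + b on flat double arrays.
;; W is stored row-major with logical shape [out in].
(defn dense-forward [^doubles w ^doubles b ^doubles x out in]
  (let [y (double-array out)]
    (dotimes [o out]
      (let [base (* o in)
            dot  (loop [i 0, acc 0.0]
                   (if (< i in)
                     (recur (inc i) (+ acc (* (aget w (+ base i)) (aget x i))))
                     acc))]
        (aset y o (+ (aget b o) dot))))
    y))

;; 2x2 identity weights, bias [1 2], input [3 4] -> [4.0 6.0]
(vec (dense-forward (double-array [1 0 0 1])
                    (double-array [1 2])
                    (double-array [3 4])
                    2 2))
```

Because each layer is an ordinary function over flat arrays like this, the compiler can inline and fuse across layer boundaries instead of treating layers as opaque objects.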
| Module | Description |
|---|---|
| raster.sym | Symbolic expressions, differentiation, Taylor series |
| raster.ga | Geometric algebra Cl(p,q,r) with compiled multivector types |
| raster.types.complex | Complex number arithmetic |
The same parallel combinators (par/map, par/reduce, par/scan) that run as
SIMD-vectorized loops on CPU compile to GPU kernels through backend passes.
Backends: OpenCL (NVIDIA, AMD, Intel), Intel Level Zero, Vulkan compute, JDK
Vector API SIMD.
See GPU Computing for the session API and backend details.
raster.abm — a GPU-accelerated agent-based
model of firm formation and labor markets. Same Clojure source runs on CPU and
GPU. Includes a differentiable variant for gradient-based parameter calibration.
Raster's compiler is a nanopass pipeline: each pass transforms a small, well-defined IR dialect. Dispatch resolution, type inference, AD expansion, SOAC fusion, buffer allocation, and backend code generation are all separate passes that you can inspect individually.
| Document | Description |
|---|---|
| Compiler Pipeline | Nanopass architecture, passes, compile-aot, diagnostics |
| Automatic Differentiation | Forward/reverse AD, rrules, sensitivity analysis |
| GPU Computing | Parallel primitives, session API, backends, SoA layout |
| Deep Learning | Layers, loss, optimizers, compiled training |
Built on TypedClojure for type inference and beichte for purity analysis.
src/raster/
core.clj -- deftm, ftm, defvalue, specialize macros
numeric.clj -- polymorphic +, -, *, /, comparisons
math.clj -- sin, cos, exp, log, sqrt, fma, ...
arrays.clj -- polymorphic aget/aset/alength
par.clj -- parallel combinators (map, reduce, scan)
ad/ -- automatic differentiation
compiler/ -- nanopass compiler pipeline
ode/ -- ODE/PDE/SDE solvers
linalg/ -- linear algebra, LAPACK via Panama FFI
sci/ -- special functions, distributions, optimization
gpu/ -- unified GPU session (OpenCL ICD + Level Zero)
dl/ -- deep learning layers, optimizers, training
sym/ -- symbolic computation
ga/ -- geometric algebra
vk/ -- Vulkan rendering engine
Dependencies are declared in deps.edn. For Valhalla value types (Dual, Float16), use a JDK 27 early-access build.
# Tests
clojure -M:test
# REPL
clojure -M:nREPL
# With Valhalla JDK 27
source valhalla-env.sh
clojure -J--enable-preview \
-J--add-exports=java.base/jdk.internal.vm.annotation=ALL-UNNAMED \
-J--enable-native-access=ALL-UNNAMED \
-M:test:valhalla
MIT License. See LICENSE for the full text.
Third-party notices are in THIRD_PARTY_NOTICES.md.