Local text-embedding inference for Clojure. Runs sentence-transformers ONNX exports on the JVM via ONNX Runtime.
Part of a local RAG substrate stack: pdfplumber-clj (extract) → chunk-clj (split) → tokenizers-clj (tokenize) → embeddings-clj (embed).
deps.edn:
net.clojars.savya/embeddings-clj {:mvn/version "0.1.0"}
Leiningen:
[net.clojars.savya/embeddings-clj "0.1.0"]
Any sentence-transformers-style ONNX export works: a directory containing
model.onnx and tokenizer.json. For example, all-MiniLM-L6-v2:
mkdir -p models/all-MiniLM-L6-v2
curl -fL -o models/all-MiniLM-L6-v2/model.onnx \
  https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx
curl -fL -o models/all-MiniLM-L6-v2/tokenizer.json \
  https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
Compatible model families include all-MiniLM, all-mpnet, BGE, GTE, and E5
(anything whose ONNX graph takes input_ids/attention_mask/token_type_ids
and outputs token embeddings or a pre-pooled sentence embedding).
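Before loading, it can help to confirm that a model directory has the two files described above. The following is a minimal sketch, not part of embeddings-clj: the helper name `missing-model-files` is hypothetical, and only the file names come from this README.

```clojure
(require '[clojure.java.io :as io])

(defn missing-model-files
  "Returns a vector of the required file names that are absent from model-dir.
  Hypothetical helper for illustration; not an embeddings-clj function."
  [model-dir]
  (->> ["model.onnx" "tokenizer.json"]
       ;; keep only the names that do not resolve to a regular file
       (remove #(.isFile (io/file model-dir %)))
       vec))

;; An empty vector means both files are present.
(missing-model-files "models/all-MiniLM-L6-v2")
```

Running a check like this first lets a setup script point at the missing download rather than surfacing a load-time error.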
(require '[embeddings.core :as emb]
         '[embeddings.math :as emb.math])

;; load-model returns an AutoCloseable-style handle; with-model scopes it
(emb/with-model [model "models/all-MiniLM-L6-v2" {:pooling :mean :normalize? true}]
  (emb/dimension model) ;; => 384
  (let [a (emb/embed model "A cat sits on the mat")
        b (emb/embed model "A kitten rests on the rug")
        c (emb/embed model "The stock market crashed today")]
    (emb.math/cosine-similarity a b)   ;; => ~0.6+ (similar)
    (emb.math/cosine-similarity a c))) ;; => ~0.1 (unrelated)

;; batch (padding is attention-mask aware; results match single embeds)
(emb/with-model [model "models/all-MiniLM-L6-v2"]
  (emb/embed-batch model ["first text" "second text" "third text"]))
;; => [float[384] float[384] float[384]]
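To illustrate what "attention-mask aware" padding means for a batch, here is a rough, self-contained sketch. It is not the library's implementation: `pad-batch` is a hypothetical name, the token ids are illustrative, and a pad id of 0 is an assumption.

```clojure
(defn pad-batch
  "Pads each token-id sequence to the longest length in the batch.
  Returns {:input-ids [[...]] :attention-mask [[...]]}; mask entries are
  1 for real tokens and 0 for padding. The default pad-id 0 is an assumption."
  ([seqs] (pad-batch seqs 0))
  ([seqs pad-id]
   (let [width (apply max 0 (map count seqs))
         ;; right-pad xs with fill until it is exactly width long
         pad   (fn [xs fill] (vec (take width (concat xs (repeat fill)))))]
     {:input-ids      (mapv #(pad % pad-id) seqs)
      :attention-mask (mapv #(pad (repeat (count %) 1) 0) seqs)})))

(pad-batch [[101 7592 102] [101 7592 2088 999 102]])
;; => {:input-ids      [[101 7592 102 0 0] [101 7592 2088 999 102]]
;;     :attention-mask [[1 1 1 0 0] [1 1 1 1 1]]}
```

Because pooling can consult the mask, the padded positions need not contribute to the sentence vector, which is why a batched result can match a single embed.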
Options to load-model (defaults shown):
| option | default | meaning |
|---|---|---|
| :pooling | :mean | :mean (mask-weighted), :cls, or :max over token embeddings |
| :normalize? | true | L2-normalize output vectors (unit length, ready for cosine) |
| :max-length | 512 | truncate inputs to this many tokens |
Models whose ONNX graph already outputs a pooled [batch, hidden] sentence
embedding are detected automatically and used as-is (:pooling is ignored).
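The three pooling modes in the options table can be sketched on plain Clojure vectors. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, at least one token is assumed to have mask 1, and having :max skip padded positions is an assumption.

```clojure
(defn- real-tokens
  "Token vectors whose attention-mask entry is 1."
  [token-vecs mask]
  (for [[v m] (map vector token-vecs mask) :when (= 1 m)] v))

(defn mean-pool
  "Mask-weighted mean: averages only the unpadded token vectors."
  [token-vecs mask]
  (let [kept (real-tokens token-vecs mask)
        n    (count kept)]
    (mapv #(/ % (double n)) (apply mapv + kept))))

(defn cls-pool
  "Takes the first token's vector (the [CLS] position in BERT-style models)."
  [token-vecs _mask]
  (vec (first token-vecs)))

(defn max-pool
  "Element-wise maximum over the unpadded token vectors."
  [token-vecs mask]
  (apply mapv max (real-tokens token-vecs mask)))

(let [tokens [[1.0 2.0] [3.0 4.0] [9.0 9.0]]
      mask   [1 1 0]]                      ; third token is padding
  {:mean (mean-pool tokens mask)
   :cls  (cls-pool tokens mask)
   :max  (max-pool tokens mask)})
;; => {:mean [2.0 3.0], :cls [1.0 2.0], :max [3.0 4.0]}
```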
embeddings.math ships a small vector toolkit (dot, norm,
l2-normalize, cosine-similarity), all operating on primitive float[] arrays.
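For readers unfamiliar with these operations, the following sketch shows one way they can be written over float[] in plain Clojure. It mirrors the names listed above but is an independent illustration, not the source of embeddings.math; the zero-vector handling in `l2-normalize` is an assumption.

```clojure
(defn dot
  "Sum of element-wise products, accumulated as a double."
  ^double [^floats a ^floats b]
  (areduce a i acc 0.0 (+ acc (* (aget a i) (aget b i)))))

(defn norm
  "Euclidean (L2) length of a."
  ^double [^floats a]
  (Math/sqrt (dot a a)))

(defn l2-normalize
  "Returns a new float[] scaled to unit length.
  A zero vector is returned as an unchanged copy (an assumption)."
  [^floats a]
  (let [n (norm a)]
    (if (zero? n)
      (aclone a)
      (amap a i out (float (/ (aget a i) n))))))

(defn cosine-similarity
  "Cosine of the angle between a and b."
  ^double [^floats a ^floats b]
  (/ (dot a b) (* (norm a) (norm b))))

(norm (float-array [3 4]))                                      ;; => 5.0
(cosine-similarity (float-array [1 0]) (float-array [0 1]))     ;; => 0.0
```

When both vectors are already unit length, as with :normalize? true, the cosine similarity reduces to the dot product.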
Errors are thrown as ex-info exceptions whose ex-data is keyed by
:embeddings/error, with one of the values :model-not-found,
:tokenizer-not-found, :model-closed, :unsupported-input,
:unsupported-output, or :dim-mismatch.
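A caller might branch on these keywords roughly as follows. This sketch assumes the keyword sits under :embeddings/error in the exception's ex-data; `error-kind` and `try-embedding` are hypothetical helpers, and the thrown exception below is a stand-in for a failing library call.

```clojure
(defn error-kind
  "Returns the :embeddings/error keyword carried by exception e, or nil."
  [e]
  (:embeddings/error (ex-data e)))

(defn try-embedding
  "Calls the zero-argument function f. Returns {:ok value} on success or
  {:error kind} for an embeddings error; any other exception is rethrown."
  [f]
  (try
    {:ok (f)}
    (catch clojure.lang.ExceptionInfo e
      (if-let [kind (error-kind e)]
        {:error kind}
        (throw e)))))

;; Stand-in for a failing call; the ex-data shape is an assumption.
(try-embedding
  #(throw (ex-info "model missing" {:embeddings/error :model-not-found})))
;; => {:error :model-not-found}
```

Returning a tagged map keeps error handling as data, so callers can use `case` on the keyword instead of nesting catch clauses.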
clojure -M:test
The unit suite runs against tiny deterministic ONNX fixtures generated by
python3 dev/gen_fixture.py (requires pip install onnx; tests skip cleanly
when fixtures are absent, so the suite is green either way).
The opt-in integration suite exercises a real all-MiniLM-L6-v2 model:
./dev/fetch-model.sh # ~90MB download from HuggingFace
clojure -M:test --focus-meta :integration
Copyright © 2026 Savyasachi.
Distributed under the Eclipse Public License 2.0.