
Proximum


⚠️ Early Beta: Proximum is under active development. APIs may change before 1.0 release. Feedback welcome!

📋 Help shape Proximum! We'd love your input. Please fill out our 2-minute feedback survey.

A high-performance, embeddable vector database for Clojure and Java with Git-like versioning and zero-cost branching.

Why Proximum?

Unlike traditional vector databases, Proximum brings persistent data structure semantics to vector search:

  • Time Travel: Query any historical snapshot
  • 🌿 Zero-Cost Branching: Fork indices for experiments without copying data
  • 🔒 Immutability: All operations return new versions, enabling safe concurrency
  • 💾 True Persistence: Durable storage with structural sharing
  • 🚀 High Performance: SIMD-accelerated search with competitive recall
  • 📦 Pure JVM: No native dependencies, works everywhere

Perfect for RAG applications, semantic search, and ML experimentation where you need to track versions, A/B test embeddings, or maintain reproducible search results.


Quick Start

Clojure

(require '[proximum.core :as prox])

;; Identifier for the underlying store; generate your own with (random-uuid)
(def store-id #uuid "465df026-fcd3-4cb3-be44-29a929776250")

;; Create an index - feels like Clojure!
(def idx (prox/create-index {:type :hnsw
                              :dim 384
                              :store-config {:backend :memory
                                             :id store-id}
                              :capacity 10000}))

;; Use collection protocols
(def idx2 (assoc idx "doc-1" (float-array (repeatedly 384 rand))))
(def idx3 (assoc idx2 "doc-2" (float-array (repeatedly 384 rand))))

;; Search for nearest neighbors
(def results (prox/search idx3 (float-array (repeatedly 384 rand)) 5))
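; => ({:id "doc-1", :distance 0.234} {:id "doc-2", :distance 0.456})
; Only two vectors were added, so at most two results come back; distances vary with the random vectors.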

;; Git-like branching
(prox/sync! idx3)
(def experiment (prox/branch! idx3 "experiment"))

📖 Full Clojure Guide

Java

import org.replikativ.proximum.*;

import java.util.List;
import java.util.Map;
import java.util.UUID;

// embedding1..embedding3 and queryVector are 384-dimensional vectors from your embedding model.

// Create index with builder pattern.
// A try-with-resources variable is implicitly final and cannot be reassigned,
// so keep the current version in a plain variable and close it in finally.
ProximumVectorStore store = ProximumVectorStore.builder()
        .dimensions(384)
        .storagePath("/tmp/vectors")
        .build();
try {
    // Add vectors (immutable - returns new store)
    store = store.add(embedding1, "doc-1");
    store = store.add(embedding2, "doc-2");

    // Search for nearest neighbors
    List<SearchResult> results = store.search(queryVector, 5);
    // => [SearchResult{id=doc-1, distance=0.234}, ...]

    // Git-like versioning
    store.sync();  // Create commit
    UUID snapshot1 = store.getCommitId();

    store = store.add(embedding3, "doc-3");
    store.sync();

    // Time travel: Query historical state
    ProximumVectorStore historical = ProximumVectorStore.connectCommit(
        Map.of("backend", ":file", "path", "/tmp/vectors"), snapshot1);
    historical.search(queryVector, 5);  // Only sees doc-1, doc-2!

    // Branch for experiments
    ProximumVectorStore experiment = store.branch("experiment");
} finally {
    store.close();
}

📖 Full Java Guide


Installation


Clojure (deps.edn)

{:deps {org.replikativ/proximum {:mvn/version "LATEST"}}}

Leiningen (project.clj)

[org.replikativ/proximum "LATEST"]

Maven

<dependency>
  <groupId>org.replikativ</groupId>
  <artifactId>proximum</artifactId>
  <version>LATEST</version>
</dependency>

Gradle

implementation 'org.replikativ:proximum:LATEST'

Key Features

🔄 Versioning & Time Travel

Every sync() creates a commit. Query any historical state:

index.sync();  // Snapshot 1
// ... make changes ...
index.sync();  // Snapshot 2

// Time travel to earlier state
ProximumVectorStore historical = index.asOf(commitId);

Use Cases: Audit trails, debugging, A/B testing, reproducible results

🌿 Zero-Cost Branching

Fork an index for experiments without copying data:

index.sync();
ProximumVectorStore experiment = index.branch("new-model");

// Test different embeddings (add returns a new store, so keep the result)
experiment = experiment.add(newEmbedding, "doc-1");

// Merge or discard - original unchanged

Use Cases: A/B testing, staging, parallel experiments
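
Why branching can avoid copying data is easiest to see on a toy persistent structure. The sketch below is an illustration only, not Proximum's implementation or API: `ToyVersionedIndex` and all of its methods are hypothetical names, and it uses a simple immutable linked list where a real index would use an HNSW graph. It shows the two properties described above: every `add` returns a new version, and a branch shares all existing nodes with its parent.

```java
// Toy model only: this is NOT Proximum's implementation or API.
final class ToyVersionedIndex {
    /** Immutable list node; a newer version prepends a node and reuses the older chain as its tail. */
    private static final class Node {
        final String id;
        final float[] vector;
        final Node next;

        Node(String id, float[] vector, Node next) {
            this.id = id;
            this.vector = vector;
            this.next = next;
        }
    }

    private final Node head; // null for the empty index

    private ToyVersionedIndex(Node head) {
        this.head = head;
    }

    static ToyVersionedIndex empty() {
        return new ToyVersionedIndex(null);
    }

    /** Returns a new version containing the entry; this version is left untouched. */
    ToyVersionedIndex add(String id, float[] vector) {
        return new ToyVersionedIndex(new Node(id, vector.clone(), head));
    }

    /** Branching copies nothing: the branch starts as another handle on the same nodes. */
    ToyVersionedIndex branch() {
        return new ToyVersionedIndex(head);
    }

    boolean contains(String id) {
        for (Node n = head; n != null; n = n.next) {
            if (n.id.equals(id)) {
                return true;
            }
        }
        return false;
    }

    /** True if the nodes of the older version are reachable, as the same objects, from this version. */
    boolean sharesStorageWith(ToyVersionedIndex older) {
        if (older.head == null) {
            return true;
        }
        for (Node n = head; n != null; n = n.next) {
            if (n == older.head) {
                return true;
            }
        }
        return false;
    }
}
```

With `base = ToyVersionedIndex.empty().add("doc-1", v1)` and `experiment = base.branch().add("doc-2", v2)`, the experiment sees both documents while `base` still sees only `doc-1`, and no existing node was copied to create the branch.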

🔍 Advanced Features

  • Filtered Search: Multi-tenant search with ID filtering
  • Metadata: Attach arbitrary metadata to vectors
  • Compaction: Reclaim space from deleted vectors
  • Garbage Collection: Clean up unreachable commits
  • Crypto-Hash: Tamper-evident audit trail with SHA-512

Integrations

Spring AI

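import org.replikativ.proximum.spring.ProximumVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.context.annotation.Bean;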

@Bean
public VectorStore vectorStore() {
    return ProximumVectorStore.builder()
        .dimensions(1536)
        .storagePath("/data/vectors")
        .build();
}

📖 Spring AI Integration Guide | Spring Boot RAG Example

LangChain4j

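import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingStore;
import org.replikativ.proximum.langchain4j.ProximumEmbeddingStore;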

EmbeddingStore<TextSegment> store = ProximumEmbeddingStore.builder()
    .dimensions(1536)
    .storagePath("/data/embeddings")
    .build();

📖 LangChain4j Integration Guide


Performance

SIFT-1M (1M vectors, 128-dim, Intel Core Ultra 7):

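| Implementation | Search QPS  | Insert (vec/s) | p50 Latency | Recall@10 |
|----------------|-------------|----------------|-------------|-----------|
| hnswlib (C++)  | 7,849       | 18,205         | 131 µs      | 98.32%    |
| Proximum       | 3,750 (48%) | 9,621          | 262 µs      | 98.66%    |
| lucene-hnsw    | 3,095 (39%) | 2,347          | 333 µs      | 98.53%    |
| jvector        | 1,844 (23%) | 6,095          | 557 µs      | 95.95%    |
| hnswlib-java   | 1,004 (13%) | 4,329          | 1,041 µs    | 98.30%    |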
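
Two columns in the table are derived quantities, and a short sketch may help when reading them. The percentages next to Search QPS match each implementation's throughput as a share of the hnswlib (C++) figure, and Recall@10 is conventionally the fraction of the true 10 nearest neighbors that appear among the 10 returned results. The class and method names below are hypothetical helpers for illustration, not part of Proximum or its benchmark harness.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helpers for reading the benchmark table; not part of Proximum.
final class BenchmarkMath {
    /** Throughput as a rounded whole-number percentage of a baseline, e.g. 3,750 QPS vs 7,849 QPS. */
    static long percentOfBaseline(double qps, double baselineQps) {
        return Math.round(100.0 * qps / baselineQps);
    }

    /**
     * Recall@k: the fraction of the true k nearest neighbors (from an exact search)
     * that appear among the first k results returned by the approximate index.
     */
    static double recallAtK(List<String> returned, List<String> groundTruth, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Set<String> truth = new HashSet<>(groundTruth.subList(0, Math.min(k, groundTruth.size())));
        long hits = returned.stream().limit(k).distinct().filter(truth::contains).count();
        return (double) hits / k;
    }
}
```

For example, `BenchmarkMath.percentOfBaseline(3750, 7849)` gives 48, the figure shown for Proximum, and a query that returns three of the four true neighbors at k = 4 has a recall of 0.75.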

Proximum metrics:

  • Storage: 762.8 MB
  • Heap usage: 545.7 MB

Key features:

  • Pure JVM with SIMD acceleration (Java Vector API)
  • No native dependencies, works on all platforms
  • Persistent storage with zero-cost branching

Documentation

API Guides:

  • Clojure Guide - Complete Clojure API with collection protocols
  • Java Guide - Builder pattern, immutability, and best practices

Integration Guides:

Advanced Topics:

Examples:


Examples

Browse working examples in examples/:

  • Clojure: Semantic search, RAG, collection protocols
  • Java: Quick start, auditable index, metadata usage

Demo Projects:

  • Einbetten: Wikipedia semantic search with Datahike + FastEmbed (2,000 articles, ~8,000 chunks)

Requirements

  • Java: 22+ (the Foreign Function & Memory API was finalized in Java 22)
  • OS: Linux, macOS, Windows
  • CPU: AVX2 recommended, AVX-512 for best performance

Required JVM options (the Vector API is still an incubator module, so it must be enabled explicitly):

--add-modules=jdk.incubator.vector
--enable-native-access=ALL-UNNAMED
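
A missing flag usually shows up only when the library is first used, so it can help to check the runtime at startup. The sketch below relies only on standard JDK APIs (`Runtime.version()` and `ModuleLayer.boot()`); the class name `JvmRequirementsCheck` is a hypothetical helper, not something Proximum ships. It does not check `--enable-native-access`, only the Java version and whether the incubator module was resolved.

```java
// Hypothetical startup check using only standard JDK APIs; not part of Proximum.
final class JvmRequirementsCheck {
    static final int MIN_JAVA_FEATURE = 22;
    static final String VECTOR_MODULE = "jdk.incubator.vector";

    /** True if the running JVM is Java 22 or newer. */
    static boolean javaVersionOk() {
        return Runtime.version().feature() >= MIN_JAVA_FEATURE;
    }

    /** True if the Vector API module was resolved, i.e. --add-modules=jdk.incubator.vector was passed. */
    static boolean vectorModuleResolved() {
        return ModuleLayer.boot().findModule(VECTOR_MODULE).isPresent();
    }

    /** Human-readable summary suitable for a startup log line. */
    static String describe() {
        String moduleStatus = vectorModuleResolved()
                ? "resolved"
                : "missing (pass --add-modules=" + VECTOR_MODULE + ")";
        return "java=" + Runtime.version().feature()
                + " (need " + MIN_JAVA_FEATURE + "+), "
                + VECTOR_MODULE + " " + moduleStatus;
    }
}
```

Logging `JvmRequirementsCheck.describe()` before creating an index gives a quick signal about whether the JVM was launched with the options listed above.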

License

EPL-2.0 (Eclipse Public License 2.0) - see LICENSE


Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Code of conduct
  • Development workflow
  • Testing requirements
  • Licensing (DCO/EPL-2.0)

Support


Built with ❤️ by the replikativ team
