Liking cljdoc? Tell your friends :D

Stratum Architecture

Stratum is a SIMD-accelerated columnar analytics engine for the JVM, written in Clojure with performance-critical paths in Java. It uses the Java Vector API (JDK 21+ incubator) for SIMD operations and runs entirely on heap memory managed by the JVM garbage collector.

System Overview

                       ┌──────────────┐
                       │  User Input  │
                       └──────┬───────┘
                              │
                 ┌────────────┴────────────┐
                 │                         │
          SQL string              Query map (EDN)
                 │                         │
        ┌────────▼────────┐                │
        │    sql.clj      │                │
        │  (JSqlParser)   │                │
        └────────┬────────┘                │
                 │ Stratum query map       │
                 └────────────┬────────────┘
                              │
                    ┌─────────▼──────────┐
                    │     query.clj      │
                    │  Dispatch + Compile │
                    └─────────┬──────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
  ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
  │ Fused SIMD    │  │ Dense Group   │  │ Hash Group    │
  │ filter+agg    │  │ (array-idx)   │  │ (radix-part)  │
  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
          │                   │                   │
          └───────────────────┼───────────────────┘
                              │
                    ┌──────────────▼──────────────┐
                    │       Java SIMD Layer       │
                    │  ColumnOps.java             │
                    │  ColumnOpsExt.java          │
                    │  ColumnOpsChunked.java      │
                    │  ColumnOpsChunkedSimd.java  │
                    │  ColumnOpsAnalytics.java    │
                    └─────────────────────────────┘

Data Representations

Stratum operates on three data representations:

Raw Heap Arrays

The simplest input format. Columns are long[] or double[] arrays on the JVM heap. Dictionary-encoded string columns are represented as a long[] of codes plus a String[] dictionary.

{:price  (double-array [10.0 20.0 30.0])
 :qty    (long-array [1 2 3])
 :region (q/encode-column (into-array String ["US" "EU" "US"]))}

PersistentColumnIndex (Chunked B-Tree)

A persistent sorted set (PSS) tree of ChunkEntry records, each containing:

  • chunk-id: position range [start, end]
  • PersistentColChunk: CoW wrapper around a long[] or double[] (8192 elements default)
  • ChunkStats: per-chunk count, sum, sum-of-squares, min, max

Indices support O(1) fork via structural sharing and copy-on-write on mutation. The query engine can stream over chunks without materializing the full array (64KB per chunk fits L2 cache). When persisted, the PSS tree is stored in konserve and lazy-loaded on demand — opening a billion-row index costs nothing until chunks are actually accessed.

Dictionary-Encoded Strings

encode-column maps String[] to sequential long[] codes plus a reverse String[] dictionary. This enables numeric SIMD operations on string group-by keys, and fast LIKE pattern matching via dictionary pre-filtering.

Module Map

FileResponsibilitySize
src/stratum/api.cljPublic API (q, explain, from-csv, from-parquet, server, iforest)~235 LOC
src/stratum/query.cljQuery compilation, dispatch, execution~6600 LOC
src/stratum/sql.cljJSqlParser AST → query map / DDL translation (SELECT, INSERT, UPDATE, DELETE, UPSERT)~1570 LOC
src/stratum/server.cljPostgreSQL wire protocol (pgwire) server with DML execution~720 LOC
src/stratum/csv.cljCSV import with auto type detection~160 LOC
src/stratum/parquet.cljParquet import via parquet-java~190 LOC
src/stratum/index.cljPersistentColumnIndex (PSS tree)~1340 LOC
src/stratum/chunk.cljPersistentColChunk (CoW arrays)~390 LOC
src/stratum/stats.cljChunkStats, zone map predicates~400 LOC
src/stratum/storage.cljKonserve storage backend, GC, commit/branch management~250 LOC
src/stratum/cached_storage.cljPSS IStorage impl: LRU cache, Fressian handlers, lazy loading~310 LOC
src/stratum/dataset.cljStratumDataset (deftype, persistence)~640 LOC
src/stratum/iforest.cljIsolation forest anomaly detection (train, score, predict, rotate)~290 LOC
src/stratum/specification.cljcMalli schemas for API validation (query, iforest, SQL types)~550 LOC
src-java/.../ColumnOps.javaCore SIMD: filter, aggregate, group-by, join, date/string ops~76KB bytecode
src-java/.../ColumnOpsExt.javaJIT-isolated: VARIANCE/CORR, LIKE, extract+count, LongVector, COUNT DISTINCT~26KB bytecode
src-java/.../ColumnOpsChunked.javaChunked dense group-by for index streaming~12KB bytecode
src-java/.../ColumnOpsChunkedSimd.javaChunked fused filter+aggregate SIMD, chunked COUNT~15KB bytecode
src-java/.../ColumnOpsAnalytics.javaT-digest, isolation forest, window functions, top-N~24KB bytecode

Walkthrough: TPC-H Q6

Query: Sum revenue where shipdate in 1994, discount between 0.05-0.07, quantity < 24.

(q/q {:from {:shipdate shipdate-arr :discount discount-arr
                  :quantity quantity-arr :price price-arr}
          :where [[:between :shipdate 8766 9131]     ;; 1994 epoch-days
                  [:between :discount 0.05 0.07]
                  [:< :quantity 24]]
          :agg [[:sum [:* :price :discount]]]})

Step-by-step execution:

  1. prepare-columns: Resolve column references to typed arrays. Detect 2 long predicates + 1 double predicate, 1 SUM_PRODUCT aggregation.

  2. Dispatch: Single aggregation with ≤4L+4D predicates on ≥1000 rows → fused SIMD path.

  3. query-compiler: Build parallel arrays for Java: longPredTypes=[PRED_RANGE, PRED_LT], longCols=[shipdate, quantity], bounds arrays, aggType=AGG_SUM_PRODUCT, aggCol1=price, aggCol2=discount.

  4. fusedSimdParallel (Java): Morsel-driven parallel execution:

    • Split 6M rows into 64K morsels across N threads
    • Each morsel: SIMD predicate evaluation (DoubleVector/LongVector broadcasts + compare → VectorMask AND-chain) → masked accumulation (sum += price[i] * discount[i] where all predicates pass)
    • Thread-local results merged via double[2] addition (sum + count)
  5. Result: [{:sum 1234567.89 :_count 114160}]

Total time: ~4ms single-threaded, ~1ms multi-threaded (6M rows).

Related Documentation

Can you improve this documentation?Edit on GitHub

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts
Ctrl+kJump to recent docs
Move to previous article
Move to next article
Ctrl+/Jump to the search field
× close