Liking cljdoc? Tell your friends :D
One Tree, Many Forests

ordered-collections

Fast, modern, ropes and ordered collections that do more than sort.

Drop-in replacements for sorted-set and sorted-map. With inherent parallelism, work-optimal set algebra, positional access, parallel fold, and specialized collections for problems you didn't know you could solve efficiently:

  • Ropes — concat, split, splice, insert: 10–1300x faster than native baselines at scale, in three flavors: generic, StringRope (CharSequence), and ByteRope (unsigned bytes)
  • Sets and Maps work exactly as you're used to, but do more, up to 60x faster
  • Interval maps for overlap queries ("what's scheduled at 3pm?")
  • Range maps for non-overlapping regions ("which subnet owns this IP?")
  • Segment trees for range aggregation ("total sales from day 10 to 50?")
  • Fuzzy collections for nearest-neighbor lookup ("snap 9.3 to the closest valid size")
  • Priority queues, multisets, and more

All built from a modular, extensible, weight-balanced tree platform with shared foundation for splitting, joining, and parallel operations.

tests Clojars Project

Documentation


Quick Start

Use ordered-set and ordered-map exactly like clojure.core/sorted-set and clojure.core/sorted-map. All the functions you know work the same way. The difference is under the hood — and in the new things you can do.

(require '[ordered-collections.core :as oc])

;; Ropes — O(log n) split, splice, concat

(def r (oc/rope [:a :b :c :d :e]))     ;=> #vec/rope [:a :b :c :d:e]

(apply oc/rope-concat
   (reverse (oc/rope-split r 2)))      ;=> #vec/rope [:c :d :e :a :b]

(oc/rope-insert r 2 [:x :y])           ;=> #vec/rope [:a :b :x :y :c :d :e]

;; Sets

(def s (oc/ordered-set [3 1 4 1 5 9 2 6]))  ;=> #ordered/set [1 2 3 4 5 6 9]

(s 4)           ;=> 4
(s 7)           ;=> nil
(conj s 0)      ;=> #{0 1 2 3 4 5 6 9}

;; Maps

(def m (oc/ordered-map {:b 2 :a 1 :c 3}))  ;=> #ordered/map [[:a 1] [:b 2] [:c 3]]

(m :b)                  ;=> 2
(assoc m :d 4)          ;=> {:a 1, :b 2, :c 3, :d 4}


Performance

Across the measured workloads, ordered-collections is faster than both clojure.core/sorted-set and clojure.data.avl at every cardinality. Set algebra is the standout, with 18-60x wins at 500K. Even against unordered clojure.core/set the benchmarks still show roughly 4-28x wins.

Set algebra

vs clojure.core/sorted-set

OperationN=10KN=100KN=500K
Union12.9x23.4x59.6x
Intersection9.3x15.5x34.6x
Difference10.8x24.3x54.7x

vs clojure.data.avl

OperationN=10KN=100KN=500K
Union9.7x20.0x51.3x
Intersection6.9x13.1x29.9x
Difference7.9x15.0x37.2x

vs clojure.core/set

OperationN=10KN=100KN=500K
Union3.7x7.2x20.5x
Intersection3.5x6.6x17.8x
Difference4.6x9.8x28.4x

Other operations

Operationvs sorted-setvs data.avl
Construction2.1x / 2.4x / 2.8x1.1x / 1.2x / 1.5x
Lookup1.1x / 1.0x / 0.9x1.0x / 0.9x / 0.9x
Split4.9x / 6.1x / 6.8x
Fold2.5x / 4.0x / 3.5x3.1x / 5.4x / 4.8x

Rope vs PersistentVector

WorkloadN=1KN=5KN=10KN=100KN=500K
200 random edits4.7x14x26x261x1237x
Single splice4.8x13x106x762x863x
Concat pieces169x22x29x39x36x
Reduce (sum)1.0x1.7x1.4x1.5x1.3x
Fold (sum)2.9x1.4x1.2x1.3x1.6x
Random nth (1000)0.5x0.2x0.2x0.2x0.2x

StringRope vs String

WorkloadN=1KN=5KN=10KN=100KN=500K
200 random edits0.6x2.6x5.7x38x130x
Single splice0.4x3.2x5.9x42x349x
Single insert0.4x2.7x6.2x40x154x
Single remove1.5x3.6x7.1x44x412x
Concat halves0.9x0.5x2.5x20x29x
Reduce (sum chars)0.4x0.5x0.5x0.5x0.5x
re-find / re-seq0.6-1.3x0.1-0.2x0.1-0.2x0.1-0.2x0.1-0.2x

The rope family wins decisively on structural editing at scale; the advantage grows with collection size. See Ropes for the full tutorial.

Benchmarked on Apple M2 (aarch64), OpenJDK 25.0.2, Clojure 1.12.4. See report.txt for full results and benchmarks.md for methodology.


How It Works

The core is a weight-balanced binary tree. Each node knows its subtree size, enabling O(log n) positional access and efficient parallel decomposition.

Split and join are the fundamental primitives — splitting at a key produces two trees in O(log n); joining is also O(log n). Set operations, subrange extraction, and parallel fold all reduce to split/join. Set operations use Adams' divide-and-conquer algorithm (1992) extended with the parallel forkl-join based approach from Blelloch, Ferizovic & Sun (2016).

Collection constructors provide the comparator and node-construction hooks, so the same tree algorithms can back generic, primitive-specialized, and augmented variants.

Augmented trees extend the basic structure: interval trees store max-endpoint per subtree for O(log n + k) overlap queries; segment trees store aggregates for O(log n) range queries.

See Algorithms for implementation details and Why Weight-Balanced Trees? for comparison with red-black and AVL trees.


Collections

The fundamental collection types currently implemented are: ordered-set, ordered-map, rope, interval-set, interval-map, range-map, segment-tree, priority-queue, ordered-multiset, fuzzy-set, and fuzzy-map.

ConstructorDescription
Ordered Set
(oc/ordered-set coll)Sorted set (drop-in for sorted-set)
(oc/ordered-set-by pred coll)Sorted set with custom comparator
(oc/long-ordered-set coll)Sorted set optimized for Long keys
(oc/string-ordered-set coll)Sorted set optimized for String keys
Ordered Map
(oc/ordered-map coll)Sorted map (drop-in for sorted-map)
(oc/ordered-map-by pred coll)Sorted map with custom comparator
(oc/long-ordered-map coll)Sorted map optimized for Long keys
(oc/string-ordered-map coll)Sorted map optimized for String keys
Interval Collections
(oc/interval-set coll)Set supporting interval overlap queries
(oc/interval-map coll)Map supporting interval overlap queries
Range Map
(oc/range-map)Non-overlapping ranges (Guava TreeRangeMap)
Segment Tree
(oc/segment-tree f identity coll)O(log n) range aggregate queries
(oc/segment-tree-by pred f identity coll)Segment tree with custom ordering predicate
(oc/segment-tree-with cmp f identity coll)Segment tree with custom Comparator
Priority Queue
(oc/priority-queue pairs)Priority queue from [priority value] pairs
Ordered Multiset
(oc/ordered-multiset coll)Sorted multiset (allows duplicates)
Rope
(oc/rope coll)Persistent sequence for structural editing
(oc/rope-concat a b)Concatenate two ropes — O(log n)
(oc/rope-splice r start end items)Replace a range — O(log n)
StringRope
(oc/string-rope s)Persistent text sequence (implements CharSequence)
(oc/string-rope-concat a b)Concatenate two string ropes — O(log n)
ByteRope
(oc/byte-rope bs)Persistent memory — structural editing, zero-cost snapshots, structure sharing
(oc/byte-rope-concat a b)Concatenate two byte ropes — O(log n)
Fuzzy Collections
(oc/fuzzy-set coll)Returns closest element to query
(oc/fuzzy-map coll)Returns value for closest key to query

Ropes

A rope is a persistent sequence optimized for structural editing — concatenation, splitting, splicing, and insertion in the middle of large sequences. Where PersistentVector is O(n) for mid-sequence edits, the rope is O(log n). Three variants share the same kernel:

  • rope — arbitrary Clojure values (vector-compatible)
  • string-rope — UTF-16 text, implements CharSequence for re-find/clojure.string
  • byte-rope — persistent memory with structural editing, zero-cost snapshots, and structure sharing. Think of it as a byte buffer with the safety properties of a persistent data structure: splice at any offset in O(log n), keep old versions for free, let the GC reclaim what's unreachable
(def r (oc/rope (range 100000)))

;; Splice into the middle — O(log n), not O(n)
(def edited (oc/rope-splice r 50000 50010 [:new :data]))

;; Split — O(log n)
(let [[left right] (oc/rope-split r 50000)]
  [(count left) (count right)])  ;=> [50000 50000]

;; StringRope — drops into regex and clojure.string
(def text (oc/string-rope "hello world"))
(re-find #"wor" text)  ;=> "wor"

;; ByteRope — binary protocols, streaming digest
(def packet (oc/byte-rope [0x48 0x45 0x4C 0x4C 0x4F]))
(oc/byte-rope-get-int packet 0)  ;=> 1212501068

See Ropes for the full tutorial.


Capabilities

Operations that sorted-set and sorted-map don't provide — at any collection size. For the full collection-by-collection surface area, see Collections API.

Positional Access & Rank

(def s (oc/ordered-set [50 30 20 40 10]))

(nth s 2)              ;=> 30     O(log n)
(oc/rank s 30)         ;=> 2      O(log n)
(oc/median s)          ;=> 30     O(log n)
(oc/percentile s 90)   ;=> 50     O(log n)
(oc/slice s 1 4)       ;=> (20 30 40)

Nearest / Floor / Ceiling

(def s (oc/ordered-set [200 200 400 300 500]))

(oc/nearest s :<= 350)  ;=> 300  (floor)
(oc/nearest s :>= 350)  ;=> 400  (ceiling)
(oc/nearest s :< 300)   ;=> 200  (predecessor)
(oc/nearest s :> 300)   ;=> 400  (successor)

Split & Subrange

(def s (oc/ordered-set [5 4 3 1 2]))

(oc/split-key 3 s)  ;=> [#{1 2} 3 #{4 5}]    O(log n)
(oc/split-at 2 s)   ;=> [#{1 2} #{3 4 5}]     O(log n)

;; subrange returns a collection, not a seq
(oc/subrange s :>= 2 :<= 4)  ;=> #{2 3 4}

Interval Queries

 meeting: +-------+
   lunch:       +-------+
  review:                 +-------+
         9==10==11==12==13==14==15==16==17
(def schedule
  (oc/interval-map
    {[9 12] "meeting" [14 17] "review" [11 15] "lunch"}))

(schedule 11)       ;=> ("meeting" "lunch")       point query
(schedule [10 14])  ;=> ("meeting" "lunch" "review")  range query
(oc/span schedule)  ;=> [9 17]

Range Maps

Non-overlapping ranges — each point maps to exactly one value. Inserting a new range automatically carves out whatever it overlaps.

(def tiers
  (-> (oc/range-map)
      (assoc [0 100] :bronze)
      (assoc [100 500] :silver)
      (assoc [500 5000] :gold)))

(tiers 250)                      ;=> :silver
(oc/get-entry tiers 250)         ;=> [[100 500] :silver]

Insert a flash-sale range — bronze and silver are automatically split:

(oc/ranges (assoc tiers [50 200] :flash))

;; => ([[0 50]    :bronze]        ← auto-trimmed
;;     [[50 200]  :flash]         ← inserted
;;     [[200 500] :silver]        ← auto-trimmed
;;     [[500 5000] :gold])

Segment Trees

(def sales (oc/sum-tree {1 100, 2 200, 3 150, 4 300, 5 250}))

(oc/query sales 2 4)     ;=> 650    O(log n)
(oc/aggregate sales)      ;=> 1000   O(1)

;; Update and re-query

(def sales' (assoc sales 3 500))
(oc/query sales' 2 4)     ;=> 1000

;; Also: min-tree, max-tree, or any associative operation

(def peaks (oc/segment-tree max 0 {1 100, 2 200, 3 150}))
(oc/query peaks 1 3)      ;=> 200

Fuzzy Lookup

(def sizes (oc/fuzzy-set [6 7 8 9 10 11 12 13]))
(sizes 9.3)                    ;=> 9
(oc/fuzzy-nearest sizes 9.3)   ;=> [9 0.30]

(def tiers (oc/fuzzy-map {0 :bronze 500 :silver 1000 :gold}))
(tiers 480)                    ;=> :silver

Priority Queue & Multiset

;; Priority queue (min-heap)

(def pq (oc/priority-queue [[3 :medium] [1 :urgent] [5 :low]]))
(peek pq)       ;=> [1 :urgent]
(peek (pop pq)) ;=> [3 :medium]

;; Multiset (sorted bag, allows duplicates)

(def ms (oc/ordered-multiset [3 1 4 1 5 9 2 6 5 3 5]))
(oc/multiplicity ms 5)  ;=> 3

Parallel Fold

All collection types implement CollFold for efficient r/fold:

(require '[clojure.core.reducers :as r])

(def combinef (fn ([] {}) ([m1 m2] (merge-with + m1 m2))))
(def reducef  (fn [m x] (update m (mod x 100) (fnil inc 0))))

(r/fold combinef reducef (oc/ordered-set (range 1000000)))

;; 1M-element frequency-map workload from the benchmark suite:
;; ~6.9x faster than hash-set reduce, ~4.5x faster than sorted-set fold,
;; ~3.4x faster than data.avl fold

Testing

$ lein test

Ran 690 tests containing 471,000+ assertions.
0 failures, 0 errors.

The test suite includes generative tests via test.check and equivalence tests against sorted-set, sorted-map, and clojure.data.avl.

Tooling

$ lein codox                 # Generate API docs in doc/api
$ lein stats                 # Print project statistics

Benchmarks

$ lein bench                  # Criterium, N=100K (~5 min)
$ lein bench --full           # Criterium, N=1K,5K,10K,100K,500K (~60 min)
$ lein bench --readme --full  # README tables only (~10 min)
$ lein bench --sizes 50000    # Custom sizes

$ lein bench-simple           # Quick iteration bench (100 to 100K)
$ lein bench-simple --full    # Full suite (100 to 1M)
$ lein bench-range-map        # Range-map vs Guava TreeRangeMap
$ lein bench-parallel         # Parallel threshold crossover analysis
$ lein bench-report           # Analyze latest benchmark results

Criterium results are written to bench-results/<timestamp>.edn.


Inspiration

The implementation of this weight-balanced binary tree data structure library was inspired by the following:


License

The use and distribution terms for this software are covered by the Eclipse Public License 1.0, which can be found in the file LICENSE.txt at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.


For extended examples featuring Zorp, Kevin the sentient flip-flop, and Big Toe Tony's 47 feet, see Zorp's Sneaker Emporium.

Can you improve this documentation? These fine people already did:
Dan Lentz & dco-hsu
Edit on GitHub

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts
Ctrl+kJump to recent docs
Move to previous article
Move to next article
Ctrl+/Jump to the search field
× close