Liking cljdoc? Tell your friends :D

ordered-collections

Fast, modern, ropes and ordered collections that do more than sort.

Drop-in replacements for sorted-set and sorted-map. With inherent parallelism, work-optimal set algebra, positional access, parallel fold, and specialized collections for problems you didn't know you could solve efficiently:

Ropes — concat, split, splice, insert: 10–5000x faster than Clojure vector at scale
Sets and Maps work exactly as you're used to, but do more, up to 50x faster
Interval maps for overlap queries ("what's scheduled at 3pm?")
Range maps for non-overlapping regions ("which subnet owns this IP?")
Segment trees for range aggregation ("total sales from day 10 to 50?")
Fuzzy collections for nearest-neighbor lookup ("snap 9.3 to the closest valid size")
Priority queues, multisets, and more

All built from a modular, extensible, weight-balanced tree platform with shared foundation for splitting, joining, and parallel operations.

Documentation

Zorp's Sneaker Emporium — Narrative guide with extended examples
Cookbook — Practical patterns: leaderboards, time-series, scheduling, queues, multisets, and more
Collections API — Collection-by-collection constructor and operation reference
Ropes — Rope tutorial, use cases, and design
Benchmarks — Detailed performance measurements
Competitive Analysis — Comparison with other libraries
vs clojure.data.avl — For data.avl users considering a switch
Algorithms — Tree structure, rotations, split/join, interval augmentation
Why Weight-Balanced Trees? — Comparison with red-black and AVL trees
One Tree, Many Forests — Conceptual architecture and design rationale

Quick Start

Use ordered-set and ordered-map exactly like clojure.core/sorted-set and clojure.core/sorted-map. All the functions you know work the same way. The difference is under the hood — and in the new things you can do.

(require '[ordered-collections.core :as oc])

;; Ropes — O(log n) split, splice, concat

(def r (oc/rope [:a :b :c :d :e]))     ;=> #ordered/rope [:a :b :c :d:e]

(apply oc/rope-concat
   (reverse (oc/rope-split r 2)))      ;=> #ordered/rope [:c :d :e :a :b]

(oc/rope-insert r 2 [:x :y])           ;=> #ordered/rope [:a :b :x :y :c :d :e]

;; Sets

(def s (oc/ordered-set [3 1 4 1 5 9 2 6]))  ;=> #ordered/set [1 2 3 4 5 6 9]

(s 4)           ;=> 4
(s 7)           ;=> nil
(conj s 0)      ;=> #{0 1 2 3 4 5 6 9}

;; Maps

(def m (oc/ordered-map {:b 2 :a 1 :c 3}))  ;=> #ordered/map [[:a 1] [:b 2] [:c 3]]

(m :b)                  ;=> 2
(assoc m :d 4)          ;=> {:a 1, :b 2, :c 3, :d 4}

Performance

Across the measured workloads, ordered-collections is faster than both clojure.core/sorted-set and clojure.data.avl at every cardinality Set algebra is the standout, with 28-57x wins at 500K. Even against unordered clojure.core/setthe benchmarks still show roughly 4-19x wins.

Rope vs PersistentVector

Workload	N=10K	N=100K	N=500K
200 random edits	43x	498x	1968x
Single splice	6x	116x	584x
Concat many pieces	3.4x	5.4x	9.5x
Chunk iteration	58x	83x	117x
Fold (sum)	5.6x	1.5x	1.3x
Reduce (sum)	0.4x	1.7x	1.3x
Random nth (1000)	0.7x	0.5x	0.4x

The rope wins on 6 of 7 workloads at scale and the advantage grows with collection size. Concat improves with N because the rope collects chunks in O(k) while the vector copies O(n) elements. Reduce beats vectors at N ≥ 100K thanks to 256-element chunk locality. Random access is slower (O(log n) vs O(1)) but bounded. See Ropes for the full tutorial.

Set algebra

vs clojure.core/sorted-set

Operation	N=10K	N=100K	N=500K
Union	15.4x	26.4x	56.6x
Intersection	9.0x	17.0x	36.2x
Difference	9.6x	22.1x	50.2x

vs clojure.data.avl

Operation	N=10K	N=100K	N=500K
Union	10.9x	20.5x	42.1x
Intersection	7.2x	13.0x	28.1x
Difference	7.2x	12.7x	32.0x

vs clojure.core/set

Operation	N=10K	N=100K	N=500K
Union	4.2x	7.2x	16.3x
Intersection	3.8x	6.1x	12.9x
Difference	4.4x	7.6x	18.6x

Set equality

| | vs hash-set | vs sorted-set | vs data.avl | |--|------------:|--------------:|------------:| | N=10K | 2.8x | 12.0x | 14.1x | | N=100K | 2.3x | 9.3x | 9.5x |

Other operations

Operation	vs sorted-set	vs data.avl
Construction	3.0x / 2.8x / 3.1x	1.5x / 1.3x / 1.7x
Lookup	1.1x / 1.1x / 1.1x	1.0x / 1.0x / 1.0x
Split	—	6.8x / 7.2x / 7.8x
Fold	2.5x / 4.1x / 4.1x	5.8x / 5.9x / 8.8x

Benchmarked on a 2023 MacBook Pro (M2). See Benchmarks for full results.

How It Works

The core is a weight-balanced binary tree. Each node knows its subtree size, enabling O(log n) positional access and efficient parallel decomposition.

Split and join are the fundamental primitives — splitting at a key produces two trees in O(log n); joining is also O(log n). Set operations, subrange extraction, and parallel fold all reduce to split/join. Set operations use Adams' divide-and-conquer algorithm (1992) extended with the parallel forkl-join based approach from Blelloch, Ferizovic & Sun (2016).

Collection constructors provide the comparator and node-construction hooks, so the same tree algorithms can back generic, primitive-specialized, and augmented variants.

Augmented trees extend the basic structure: interval trees store max-endpoint per subtree for O(log n + k) overlap queries; segment trees store aggregates for O(log n) range queries.

See Algorithms for implementation details and Why Weight-Balanced Trees? for comparison with red-black and AVL trees.

Collections

The fundamental collection types currently implemented are: ordered-set, ordered-map, rope, interval-set, interval-map, range-map, segment-tree, priority-queue, ordered-multiset, fuzzy-set, and fuzzy-map.

Constructor	Description
Ordered Set
`(oc/ordered-set coll)`	Sorted set (drop-in for `sorted-set`)
`(oc/ordered-set-by pred coll)`	Sorted set with custom comparator
`(oc/long-ordered-set coll)`	Sorted set optimized for Long keys
`(oc/string-ordered-set coll)`	Sorted set optimized for String keys
Ordered Map
`(oc/ordered-map coll)`	Sorted map (drop-in for `sorted-map`)
`(oc/ordered-map-by pred coll)`	Sorted map with custom comparator
`(oc/long-ordered-map coll)`	Sorted map optimized for Long keys
`(oc/string-ordered-map coll)`	Sorted map optimized for String keys
Interval Collections
`(oc/interval-set coll)`	Set supporting interval overlap queries
`(oc/interval-map coll)`	Map supporting interval overlap queries
Range Map
`(oc/range-map)`	Non-overlapping ranges (Guava TreeRangeMap)
Segment Tree
`(oc/segment-tree f identity coll)`	O(log n) range aggregate queries
`(oc/segment-tree-by pred f identity coll)`	Segment tree with custom ordering predicate
`(oc/segment-tree-with cmp f identity coll)`	Segment tree with custom Comparator
Priority Queue
`(oc/priority-queue pairs)`	Priority queue from `[priority value]` pairs
Ordered Multiset
`(oc/ordered-multiset coll)`	Sorted multiset (allows duplicates)
Rope
`(oc/rope coll)`	Persistent sequence for structural editing
`(oc/rope-concat a b)`	Concatenate two ropes — O(log n)
`(oc/rope-splice r start end items)`	Replace a range — O(log n)
Fuzzy Collections
`(oc/fuzzy-set coll)`	Returns closest element to query
`(oc/fuzzy-map coll)`	Returns value for closest key to query

Ropes

A rope is a persistent sequence optimized for structural editing — concatenation, splitting, splicing, and insertion in the middle of large sequences. Where PersistentVector is O(n) for mid-sequence edits, the rope is O(log n).

(def r (oc/rope (range 100000)))

;; Splice into the middle — O(log n), not O(n)
(def edited (oc/rope-splice r 50000 50010 [:new :data]))

;; Split — O(log n)
(let [[left right] (oc/rope-split r 50000)]
  [(count left) (count right)])  ;=> [50000 50000]

;; Concatenate — O(log n)
(oc/rope-concat (oc/rope [1 2 3]) (oc/rope [4 5 6]))
;=> #ordered/rope [1 2 3 4 5 6]

Capabilities

Operations that sorted-set and sorted-map don't provide — at any collection size. For the full collection-by-collection surface area, see Collections API.

Positional Access & Rank

(def s (oc/ordered-set [50 30 20 40 10]))

(nth s 2)              ;=> 30     O(log n)
(oc/rank s 30)         ;=> 2      O(log n)
(oc/median s)          ;=> 30     O(log n)
(oc/percentile s 90)   ;=> 50     O(log n)
(oc/slice s 1 4)       ;=> (20 30 40)

Nearest / Floor / Ceiling

(def s (oc/ordered-set [200 200 400 300 500]))

(oc/nearest s :<= 350)  ;=> 300  (floor)
(oc/nearest s :>= 350)  ;=> 400  (ceiling)
(oc/nearest s :< 300)   ;=> 200  (predecessor)
(oc/nearest s :> 300)   ;=> 400  (successor)

Split & Subrange

(def s (oc/ordered-set [5 4 3 1 2]))

(oc/split-key 3 s)  ;=> [#{1 2} 3 #{4 5}]    O(log n)
(oc/split-at 2 s)   ;=> [#{1 2} #{3 4 5}]     O(log n)

;; subrange returns a collection, not a seq
(oc/subrange s :>= 2 :<= 4)  ;=> #{2 3 4}

Interval Queries

 meeting: +-------+
   lunch:       +-------+
  review:                 +-------+
         9==10==11==12==13==14==15==16==17

(def schedule
  (oc/interval-map
    {[9 12] "meeting" [14 17] "review" [11 15] "lunch"}))

(schedule 11)       ;=> ("meeting" "lunch")       point query
(schedule [10 14])  ;=> ("meeting" "lunch" "review")  range query
(oc/span schedule)  ;=> [9 17]

Range Maps

Non-overlapping ranges — each point maps to exactly one value. Inserting a new range automatically carves out whatever it overlaps.

(def tiers
  (-> (oc/range-map)
      (assoc [0 100] :bronze)
      (assoc [100 500] :silver)
      (assoc [500 5000] :gold)))

(tiers 250)                      ;=> :silver
(oc/get-entry tiers 250)         ;=> [[100 500] :silver]

Insert a flash-sale range — bronze and silver are automatically split:

(oc/ranges (assoc tiers [50 200] :flash))

;; => ([[0 50]    :bronze]        ← auto-trimmed
;;     [[50 200]  :flash]         ← inserted
;;     [[200 500] :silver]        ← auto-trimmed
;;     [[500 5000] :gold])

Segment Trees

(def sales (oc/sum-tree {1 100, 2 200, 3 150, 4 300, 5 250}))

(oc/query sales 2 4)     ;=> 650    O(log n)
(oc/aggregate sales)      ;=> 1000   O(1)

;; Update and re-query

(def sales' (assoc sales 3 500))
(oc/query sales' 2 4)     ;=> 1000

;; Also: min-tree, max-tree, or any associative operation

(def peaks (oc/segment-tree max 0 {1 100, 2 200, 3 150}))
(oc/query peaks 1 3)      ;=> 200

Fuzzy Lookup

(def sizes (oc/fuzzy-set [6 7 8 9 10 11 12 13]))
(sizes 9.3)                    ;=> 9
(oc/fuzzy-nearest sizes 9.3)   ;=> [9 0.30]

(def tiers (oc/fuzzy-map {0 :bronze 500 :silver 1000 :gold}))
(tiers 480)                    ;=> :silver

Priority Queue & Multiset

;; Priority queue (min-heap)

(def pq (oc/priority-queue [[3 :medium] [1 :urgent] [5 :low]]))
(peek pq)       ;=> [1 :urgent]
(peek (pop pq)) ;=> [3 :medium]

;; Multiset (sorted bag, allows duplicates)

(def ms (oc/ordered-multiset [3 1 4 1 5 9 2 6 5 3 5]))
(oc/multiplicity ms 5)  ;=> 3

Parallel Fold

All collection types implement CollFold for efficient r/fold:

(require '[clojure.core.reducers :as r])

(def combinef (fn ([] {}) ([m1 m2] (merge-with + m1 m2))))
(def reducef  (fn [m x] (update m (mod x 100) (fnil inc 0))))

(r/fold combinef reducef (oc/ordered-set (range 1000000)))

;; 1M-element frequency-map workload from the benchmark suite:
;; ~6.9x faster than hash-set reduce, ~4.5x faster than sorted-set fold,
;; ~3.4x faster than data.avl fold

Testing

$ lein test

Ran 454 tests containing 466,000+ assertions.
0 failures, 0 errors.

The test suite includes generative tests via test.check and equivalence tests against sorted-set, sorted-map, and clojure.data.avl.

Tooling

$ lein codox                 # Generate API docs in doc/api
$ lein stats                 # Print project statistics

Benchmarks

$ lein bench                  # Criterium, N=100K (~5 min)
$ lein bench --full           # Criterium, N=10K,100K,500K (~40 min)
$ lein bench --readme --full  # README tables only (~10 min)
$ lein bench --sizes 50000    # Custom sizes

$ lein bench-simple           # Quick iteration bench (100 to 100K)
$ lein bench-simple --full    # Full suite (100 to 1M)
$ lein bench-range-map        # Range-map vs Guava TreeRangeMap
$ lein bench-parallel         # Parallel threshold crossover analysis
$ lein bench-report           # Analyze latest benchmark results

Criterium results are written to bench-results/<timestamp>.edn.

Inspiration

The implementation of this weight-balanced binary tree data structure library was inspired by the following:

Adams (1992) 'Implementing Sets Efficiently in a Functional Language' Technical Report CSTR 92-10, University of Southampton. http://groups.csail.mit.edu/mac/users/adams/BB/92-10.ps
Hirai and Yamamoto (2011) 'Balancing Weight-Balanced Trees' Journal of Functional Programming / 21 (3): Pages 287-307 https://yoichihirai.com/bst.pdf
Oleg Kiselyov 'Towards the best collection API, A design of the overall optimal collection traversal interface' https://okmij.org/ftp/papers/LL3-collections-enumerators.txt
Nievergelt and Reingold (1972) 'Binary Search Trees of Bounded Balance' STOC '72 Proceedings 4th Annual ACM symposium on Theory of Computing Pages 137-142 https://dl.acm.org/doi/abs/10.1145/800152.804906
Driscoll, Sarnak, Sleator, and Tarjan (1989) 'Making Data Structures Persistent' Journal of Computer and System Sciences Volume 38 Issue 1, February 1989 18th Annual ACM Symposium on Theory of Computing Pages 86-124 https://www.sciencedirect.com/science/article/pii/0022000089900342
MIT Scheme weight balanced tree as reimplemented by Yoichi Hirai and Kazuhiko Yamamoto using the revised non-variant algorithm recommended integer balance parameters from (Hirai/Yamamoto 2011). https://www.cambridge.org/core/journals/journal-of-functional-programming/article/balancing-weightbalanced-trees/7281C4DE7E56B74F2D13F06E31DCBC5B
Wikipedia 'Interval Tree' https://en.wikipedia.org/wiki/Interval_tree
Wikipedia 'Segment Tree' https://en.wikipedia.org/wiki/Segment_tree
Google Guava 'RangeMap' https://guava.dev/releases/snapshot/api/docs/com/google/common/collect/RangeMap.html
Wikipedia 'Weight Balanced Tree' https://en.wikipedia.org/wiki/Weight-balanced_tree
Andrew Baine (2007) 'Purely Functional Data Structures in Common Lisp' Google Summer of Code 2007, mentored by Rahul Jain https://funds.common-lisp.dev/funds.pdf https://developers.google.com/open-source/gsoc/2007/
Scott L. Burson 'Functional Set-Theoretic Collections for Common Lisp' https://fset.common-lisp.dev/
Adams (1993) 'Efficient sets—a balancing act' Journal of Functional Programming 3(4): 553-562 https://www.cambridge.org/core/journals/journal-of-functional-programming/article/functional-pearls-efficient-setsa-balancing-act/0CAA1C189B4F7C15CE9B8C02D0D4B54E
Blelloch, Ferizovic, and Sun (2016) 'Just Join for Parallel Ordered Sets' ACM SPAA 2016 https://dl.acm.org/doi/10.1145/2935764.2935768
Boehm, Atkinson, and Plass (1995) 'Ropes: an Alternative to Strings' Software: Practice and Experience 25(12) https://www.cs.rit.edu/usr/local/pub/jeh/courses/QUARTERS/FP/Labs/CesswordsII/boehm-ropes.pdf
Haskell containers library (Data.Set, Data.Map) https://hackage.haskell.org/package/containers
SLIB Weight-Balanced Trees (Aubrey Jaffer) https://people.csail.mit.edu/jaffer/slib/Weight_002dBalanced-Trees.html
PAM: Parallel Augmented Maps https://cmuparlay.github.io/PAMWeb/

License

The use and distribution terms for this software are covered by the Eclipse Public License 1.0, which can be found in the file LICENSE.txt at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.

For extended examples featuring Zorp, Kevin the sentient flip-flop, and Big Toe Tony's 47 feet, see Zorp's Sneaker Emporium.

Can you improve this documentation? These fine people already did:
Dan Lentz & dco-hsuEdit on GitHub

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts

`Ctrl`+`k`	Jump to recent docs
`←`	Move to previous article
`→`	Move to next article
`Ctrl`+`/`	Jump to the search field

Raise an issue Browse cljdoc source Chat on Slack

× close