Liking cljdoc? Tell your friends :D

TOON: Token-Oriented Object Notation

Clojars Project CI SPEC v1.3 License: MIT

A Clojure/ClojureScript implementation of Token-Oriented Object Notation – a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.

TOON achieves 49% fewer tokens than formatted JSON (28% vs compact JSON) while maintaining explicit structure that helps LLMs parse and validate data reliably. It's intended for LLM input as a lossless, drop-in representation of JSON data.

Specification: This library implements TOON v1.3 specification Reference Implementation: TypeScript/JavaScript

Why TOON?

When working with Large Language Models, token efficiency directly impacts cost, context window usage, and processing speed. LLM tokens still cost money – and standard JSON is verbose and token-expensive.

TOON's sweet spot is uniform arrays of objects – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.

Token Efficiency

Based on benchmarks using the GPT-5 o200k_base tokenizer:

  • 49.1% reduction vs formatted JSON (2-space indentation)
  • 28.0% reduction vs compact JSON (minified)
  • 39.4% reduction vs YAML
  • 56.0% reduction vs XML

Real-world examples:

  • GitHub repositories (100 items): 42.3% fewer tokens than JSON
  • Daily analytics (180 days): 58.9% fewer tokens than JSON
  • E-commerce orders: 35.4% fewer tokens than JSON

Key Features

  • 💸 Token-efficient: Eliminates redundant punctuation and repeated keys
  • 🤿 LLM-friendly guardrails: Explicit lengths and fields enable validation
  • 🍱 Minimal syntax: Removes braces, brackets, and most quotes
  • 📐 Indentation-based: Uses whitespace like YAML instead of braces
  • 🧺 Tabular arrays: Declare keys once, stream data as rows

When to Use TOON

TOON excels at:

  • Uniform arrays of objects (same fields, primitive values)
  • Large datasets with consistent structure
  • Tabular data with multiple rows

JSON is better for:

  • Non-uniform data with varying field sets
  • Deeply nested structures
  • Mixed-type collections

CSV is more compact for:

  • Flat, uniform tables without any nesting
  • Data without nested objects or arrays

Installation

Clojure CLI/deps.edn

com.vadelabs/toon {:mvn/version "2025.11.05-43"}

Leiningen/Boot

[com.vadelabs/toon "2025.11.05-43"]

Quick Start

(require '[com.vadelabs.toon.interface :as toon])

;; Encode Clojure data to TOON
(toon/encode {:name "Alice" :age 30 :tags ["dev" "rust"]})
;=> "name: Alice\nage: 30\ntags[2]: dev,rust"

;; Decode TOON to Clojure data
(toon/decode "name: Alice\nage: 30\ntags[2]: dev,rust")
;=> {"name" "Alice", "age" 30.0, "tags" ["dev" "rust"]}

Format Examples

Objects

JSON:

{
  "name": "Alice",
  "age": 30,
  "active": true
}

TOON:

name: Alice
age: 30
active: true

Nested Objects

JSON:

{
  "user": {
    "name": "Alice",
    "email": "alice@example.com"
  }
}

TOON:

user:
  name: Alice
  email: alice@example.com

Arrays of Primitives (Inline)

JSON:

{
  "tags": ["reading", "gaming", "coding"]
}

TOON:

tags[3]: reading,gaming,coding

Arrays of Objects (Tabular Format)

This is TOON's sweet spot – uniform arrays of objects with consistent fields:

JSON:

{
  "users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"}
  ]
}

TOON:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

The tabular format eliminates repeated keys, providing significant token savings for large datasets.

Arrays of Mixed Items (List Format)

For non-uniform data, TOON uses list format:

TOON:

items[3]:
  - name: Laptop
    price: 999
  - name: Mouse
    price: 29
  - name: Keyboard
    price: 79

API Reference

encode

Encodes Clojure data structures to TOON format.

(encode input)
(encode input options)

Parameters:

  • input - Any Clojure value (normalized to JSON-compatible types)
  • options - Optional map:
    • :indent - Spaces per indentation level (default: 2)
    • :delimiter - Array value delimiter: "," (default), "\t", or "|"
    • :length-marker - Array length marker: "#" or false (default: false)

Returns: String in TOON format

Examples:

;; Basic encoding
(encode {:name "Ada" :tags ["reading" "gaming"]})
;=> "name: Ada\ntags[2]: reading,gaming"

;; Custom delimiter
(encode {:tags ["a" "b" "c"]} {:delimiter "\t"})
;=> "tags[3\t]: a\tb\tc"

;; Length marker prefix
(encode {:items [1 2 3]} {:length-marker "#"})
;=> "items[#3]: 1,2,3"

;; Tabular array format
(encode [{:id 1 :name "Alice"}
         {:id 2 :name "Bob"}])
;=> "[2]{id,name}:\n  1,Alice\n  2,Bob"

decode

Decodes TOON format to Clojure data structures.

(decode input)
(decode input options)

Parameters:

  • input - String in TOON format
  • options - Optional map:
    • :indent - Spaces per indentation level (default: 2)
    • :strict - Enable strict validation (default: true)

Returns: Clojure data structure (maps, vectors, primitives)

Examples:

;; Basic decoding
(decode "name: Ada\ntags[2]: reading,gaming")
;=> {"name" "Ada", "tags" ["reading" "gaming"]}

;; Tabular array
(decode "[2]{id,name}:\n  1,Alice\n  2,Bob")
;=> [{"id" 1.0, "name" "Alice"} {"id" 2.0, "name" "Bob"}]

;; Inline array
(decode "[3]: 1,2,3")
;=> [1.0 2.0 3.0]

;; Relaxed mode (allows tabs, inconsistent indentation)
(decode "name: Ada" {:strict false})
;=> {"name" "Ada"}

Format Specification

Primitives

string: Hello World
number: 42
float: 3.14
boolean: true
nil: null

Quoted Strings

Strings are quoted when they contain special characters:

comma: "a,b"
colon: "key:value"
reserved: "true"
newline: "line1\nline2"

Objects

Key-value pairs separated by colons:

name: Alice
age: 30

Nested objects use indentation:

user:
  name: Alice
  email: alice@example.com

Arrays

Inline format (primitives):

tags[3]: reading,gaming,coding

Tabular format (objects with same keys):

[3]{id,name}:
  1,Alice
  2,Bob
  3,Carol

List format (mixed items):

items[2]:
  - name: Laptop
    price: 999
  - name: Mouse
    price: 29

Options

Custom delimiter:

tags[3|]: a|b|c
tags[3\t]: a\tb\tc

Length marker:

items[#3]: 1,2,3

Type Normalization

TOON normalizes Clojure types to JSON-compatible values:

  • Keywords → Strings: :name"name"
  • Sets → Sorted vectors: #{3 1 2}[1 2 3]
  • All numbers → Doubles: 4242.0
  • Maps → String-keyed maps: {:a 1}{"a" 1.0}

Testing

# Run all Clojure tests
bb test

# Run all tests (Clojure + Babashka)
bb test:all

# Run CI pipeline with tests
bb ci

# Generate test coverage report
bb coverage

The library includes:

  • 340+ unit tests with 90%+ code coverage
  • Property-based tests using test.check
  • Comprehensive roundtrip testing
  • Edge case coverage

Coverage reports are generated in target/coverage/ including:

  • HTML report: target/coverage/index.html
  • Codecov JSON: target/coverage/codecov.json

Contributing

We welcome contributions! Please see CONTRIBUTING.md for:

  • Development setup
  • Coding guidelines
  • Testing requirements
  • Pull request process

Quick Contribution Guide

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/my-feature
  3. Make your changes with tests
  4. Run tests: bb test
  5. Commit with clear messages: git commit -m "add feature X"
  6. Push and create a pull request

Specification

This implementation follows the TOON v1.3 specification (2025-10-31).

For detailed format rules, edge cases, and conformance requirements, see:

Benchmarks

Detailed benchmarks comparing TOON against JSON, YAML, XML, and CSV across multiple datasets and LLM models are available in the reference implementation repository.

Key findings:

  • Token efficiency: 49% fewer tokens than formatted JSON on average
  • Retrieval accuracy: 70.1% (TOON) vs 65.4% (JSON) across 4 LLMs
  • Best case: 58.9% reduction for uniform tabular data (daily analytics)

Token counts are measured using the GPT-5 o200k_base tokenizer. Actual savings vary by model and tokenizer.

Other Implementations

Official Implementations

Community Implementations

Note: When implementing TOON in other languages, follow the specification to ensure compatibility. The conformance tests provide language-agnostic validation.

Roadmap

  • [ ] Conformance test suite integration
  • [ ] Performance benchmarks vs JSON for Clojure
  • [ ] ClojureScript browser optimization
  • [ ] Streaming encoder/decoder
  • [ ] Custom type handlers

License

Copyright © 2025 Vade Labs Pvt. Ltd.

Distributed under the MIT License. See LICENSE for details.

Links

Can you improve this documentation?Edit on GitHub

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts
Ctrl+kJump to recent docs
Move to previous article
Move to next article
Ctrl+/Jump to the search field
× close