
Fast EDN parser

EDN format is very similar to JSON, thus it should parse as fast as JSON.

Fast EDN is a drop-in replacement for clojure.edn/read-string that is roughly 6 times faster:

Test file          clojure.edn   fast-edn.core   speed up, times
basic_10                 0.504           0.277             × 1.8
basic_100                3.040           0.534             × 5.7
basic_1000              19.495           2.733             × 7.1
basic_10000            221.773          36.887             × 6.0
basic_100000          2138.255         356.772             × 6.0
nested_100000         2585.372         441.200             × 5.9
ints_1400              431.432          27.000            × 16.0
keywords_10              3.961           0.634             × 6.2
keywords_100            34.980           4.848             × 7.2
keywords_1000          369.404          53.942             × 6.8
keywords_10000        4168.732         654.090             × 6.4
strings_1000           651.043          42.335            × 15.4
strings_uni_250        641.900         102.268             × 6.3

Fast EDN achieves JSON parsing speeds (json + keywordize keys vs EDN of the same size):

File size       cheshire   jsonista   charred   fast-edn
basic_10           0.588      0.137     0.328      0.277
basic_100          1.043      0.594     0.721      0.534
basic_1000         4.224      2.999     3.016      2.733
basic_10000       37.793     34.374    32.623     36.887
basic_100000     359.558    327.997   313.280    356.772

Speed of EDN parsing makes Transit obsolete on JVM:

file            clojure.edn   transit+msgpack   transit+json   fast-edn
basic_10              0.481             2.832          1.474      0.273
basic_100             2.799             4.242          2.297      0.527
basic_1000           17.548            14.738          6.583      2.695
basic_10000         211.536           125.741         46.849     38.214
basic_100000       2016.885          1167.972        447.013    363.691

All execution times above are in µs, measured on an M1 Pro with 16 GB of RAM, single thread, JDK Zulu23.30+13-CA.

To run benchmarks yourself:

./script/bench_json.sh
./script/bench_edn.sh
./script/bench_transit.sh

Other benefits

Fast EDN has more consistent error reporting. Clojure:

(clojure.edn/read-string "1a")
; => NumberFormatException: Invalid number: 1a

(clojure.edn/read-string "{:a 1 :b")
; => RuntimeException: EOF while reading

(clojure.edn/read-string "\"{:a 1 :b")
; => RuntimeException: EOF while reading string

(clojure.edn/read-string "\"\\u123\"")
; => IllegalArgumentException: Invalid character length: 3, should be: 4

Fast EDN includes location information in exceptions:

(fast-edn.core/read-string "1a")
; => NumberFormatException: For input string: "1a", offset: 2, context:
;    1a
;     ^

(fast-edn.core/read-string "{:a 1 :b")
; => RuntimeException: Map literal must contain an even number of forms: {:a 1, :b, offset: 8, context:
;    {:a 1 :b
;           ^

(fast-edn.core/read-string "\"{:a 1 :b")
; => RuntimeException: EOF while reading string: "{:a 1 :b, offset: 9, context:
;    "{:a 1 :b
;            ^

(fast-edn.core/read-string "\"\\u123\"")
; => RuntimeException: Unexpected digit: ", offset: 7, context:
;    "\u123"
;          ^

Optionally, you can include line number/column information at the cost of a little performance:

(fast-edn.core/read-string {:count-lines true} "\"abc")
; => RuntimeException: EOF while reading string: "abc, line: 1, column: 5, offset: 4, context:
;    "abc
;       ^

Using

Add this to deps.edn:

io.github.tonsky/fast-edn {:mvn/version "1.1.0"}

read-string works exactly the same as in clojure.edn:

(require '[fast-edn.core :as edn])

;; Read from string
(edn/read-string "{:a 1}")

;; Options
(edn/read-string
  {:eof     ::eof
   :readers {'inst #(edn/parse-timestamp edn/construct-instant %)}
   :default (fn [tag value]
              (clojure.core/tagged-literal tag value))}
  "{:a 1}")

In addition to strings, fast-edn.core/read-once allows you to read from InputStream, File, byte[], char[] and String:

(require '[clojure.java.io :as io])

(edn/read-once (io/file "data.edn"))

Note that read-once closes the Reader/InputStream you pass to it, so it’s not a direct analogue of clojure.edn/read.

Consuming multiple sequential objects from the same Reader/InputStream is possible but looks slightly different. In Clojure:

(let [r (java.io.PushbackReader. reader)]
  (take-while #(not= ::eof %)
    (repeatedly #(clojure.edn/read {:eof ::eof} r))))

In Fast EDN:

(let [p (fast-edn.core/parser {:eof ::eof} reader)]
  (take-while #(not= ::eof %)
    (repeatedly #(fast-edn.core/read-next p))))
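Both snippets above build a lazy sequence, so the forms are only read when the sequence is consumed. If the reader is closed first (for example by an enclosing with-open), later reads will likely fail. A minimal sketch of an eager variant, assuming only the parser and read-next functions shown above; the helper name read-all is hypothetical and not part of Fast EDN:

```clojure
(require '[fast-edn.core :as edn])

(defn read-all
  "Eagerly reads every EDN form from `reader` into a vector.
   Hypothetical helper, not part of fast-edn."
  [reader]
  (let [p (edn/parser {:eof ::eof} reader)]
    ;; `into` with a transducer consumes the sequence immediately
    ;; and stops at the first ::eof marker.
    (into []
      (take-while #(not= ::eof %))
      (repeatedly #(edn/read-next p)))))

;; All forms are realized before with-open closes the reader.
(with-open [r (java.io.StringReader. "{:a 1} [2 3] :done")]
  (read-all r))
```

Because the result is a fully realized vector, it is safe to return it from inside with-open.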

Compatibility

Fast EDN is 100% compatible with clojure.edn. It will read everything that clojure.edn would.

Most cases that clojure.edn rejects, Fast EDN will reject too. There are some minor exceptions though: Fast EDN is a tiny bit more permissive than clojure.edn. We tried to follow intent and just simplify/streamline edge cases where it made sense.

In Fast EDN, the N suffix and ratios can be used with integers written in any notation (radix, hex):

(clojure.edn/read-string "2r1111N")
; => NumberFormatException: For input string: "1111N" under radix 2

(fast-edn.core/read-string "2r1111N")
; => 15N

(clojure.edn/read-string "0xFF/0x02")
; => NumberFormatException: Invalid number: 0xFF/0x02

(fast-edn.core/read-string "0xFF/0x02")
; => 255/2

Symbols/keywords can have slashes anywhere; the first slash is the namespace separator. Clojure allows them almost anywhere, but the rules for when it doesn’t are weird:

(clojure.edn/read-string ":ns/sym/")
; => RuntimeException: Invalid token: :ns/sym/

(fast-edn.core/read-string ":ns/sym/")
; => :ns/sym/

Same goes for keywords starting with a number. Clojure allows :1a but not :ns/1a and it seems like an oversight rather than a deliberate design decision:

(clojure.edn/read-string ":ns/1a")
; => RuntimeException: Invalid token: :ns/1a

(fast-edn.core/read-string ":ns/1a")
; => :ns/1a

We also support vectors in metadata, since Clojure supports them and the EDN parser was probably just not updated in time:

(clojure.edn/read-string "^[tag] {}")
; => IllegalArgumentException: Metadata must be Symbol,Keyword,String or Map

(fast-edn.core/read-string "^[tag] {}")
; => {:param-tags ['tag]} {}

According to github.com/edn-format/edn, metadata should not be handled by EDN at all, but clojure.edn supports it and so do we.
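If your code relies on clojure.edn rejecting certain input, it may help to check specific strings against both readers before switching. A minimal sketch, assuming both libraries are on the classpath; read-or-error and compare-readers are hypothetical helper names, not part of either library:

```clojure
(require '[clojure.edn :as cedn]
         '[fast-edn.core :as fedn])

(defn read-or-error
  "Returns [:ok value] on success, or [:error exception-class-name]
   if `read-fn` throws."
  [read-fn s]
  (try
    [:ok (read-fn s)]
    (catch Exception e
      [:error (.getName (class e))])))

(defn compare-readers
  "Reads `s` with both parsers and returns the outcomes side by side."
  [s]
  {:clojure.edn (read-or-error cedn/read-string s)
   :fast-edn    (read-or-error fedn/read-string s)})

;; One of the permissive cases listed above.
(compare-readers ":ns/1a")
```

An input where the two entries differ is one of the edge cases where Fast EDN is more permissive.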

Test coverage

Fast EDN is extensively tested by the test suite from clojure.core, by our own generative test suite, and by a set of hand-crafted test cases.

To run tests yourself:

./script/test.sh

What’s the secret?

Fast EDN achieves its speed mainly by avoiding two things clojure.edn does:

  • reading from Reader one char at a time,
  • using regexps.
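To illustrate the two points above, here is a minimal sketch of the general technique: fill a char buffer with one bulk read, then parse a number from it with plain arithmetic. This is not Fast EDN's actual code; read-chunk and parse-int are hypothetical names made up for the example.

```clojure
(import '(java.io Reader StringReader))

(defn read-chunk
  "Fills `buf` with one bulk call instead of calling .read once per char.
   Returns the number of chars read (possibly fewer than the buffer size),
   or -1 at end of stream."
  [^Reader rdr ^chars buf]
  (.read rdr buf 0 (alength buf)))

(defn parse-int
  "Parses a decimal integer from buf[start, end) digit by digit,
   with no regex match and no intermediate String."
  [^chars buf start end]
  (let [negative? (= \- (aget buf start))]
    (loop [i   (if negative? (inc start) start)
           acc 0]
      (if (< i end)
        (let [d (Character/digit (aget buf i) 10)]
          (if (neg? d)
            (throw (NumberFormatException.
                     (str "Unexpected char at offset " i)))
            (recur (inc i) (+ (* acc 10) d))))
        (if negative? (- acc) acc)))))

(let [buf (char-array 64)
      n   (read-chunk (StringReader. "-12345") buf)]
  (parse-int buf 0 n))
;; => -12345
```

A real parser would also have to refill the buffer when a token spans two chunks; the sketch leaves that out.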

Appreciation

License

Copyright © 2024 Nikita Prokopov

Licensed under MIT.
