This namespace provides a reader that combines our grammar and Clojure's reader to turn a string of prose text into data Clojure can then evaluate.

The reader starts by parsing the text using our grammar. This gives a first data representation, from which the data that Clojure can evaluate is computed. The different syntactic elements are processed as follows:

- text -> string
- clojure call -> itself
- symbol -> itself
- tag -> clojure fn call
- verbatim block -> string containing the verbatim block's content.
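To make that mapping concrete, here is a small, self-contained sketch of the parse-tree-to-data step described above. This is purely illustrative: the function and tag names (`node->data`, `:text`, `:tag`, etc.) are hypothetical and do not come from the library itself.

```clojure
;; Hypothetical sketch of turning a parsed node into evaluable Clojure data.
;; None of these names are the library's actual API.
(defn node->data
  "Turn one parsed node into data Clojure can evaluate."
  [[tag & content]]
  (case tag
    :text     (apply str content)                 ; text -> string
    :code     (first content)                     ; clojure call -> itself (already data)
    :symbol   (symbol (first content))            ; symbol -> itself
    :tag      (list* (symbol (first content))     ; tag -> clojure fn call
                     (map node->data (rest content)))
    :verbatim (apply str content)))               ; verbatim block -> string

(map node->data
     [[:text "Hello "]
      [:tag "em" [:text "world"]]])
;; => ("Hello " (em "world"))
```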
# Prose's grammar

The grammar proposed here is heavily inspired by Pollen's. We construct it in two parts:

- a lexical part, or lexer, made of regular expressions.
- a set of grammatical rules tying the lexer together into the grammar.

## The lexer

Our lexer is made of regular expressions constructed with the
[[fr.jeremyschoffen.prose.alpha.reader.grammar.utils/def-regex]]
macro. It uses the Regal library under the covers.

Then, to assemble these regexes into a grammar, we use the
[[fr.jeremyschoffen.prose.alpha.reader.grammar.utils/make-lexer]]
macro.
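For illustration, Regal describes regexes as plain data that compile to platform regex objects. A rough sketch, assuming the usual `lambdaisland.regal` namespace from the Regal library:

```clojure
(require '[lambdaisland.regal :as regal])

;; A Regal form is a vector describing a regex as data...
(def number-form [:* :digit])

;; ...that compiles to a platform regex
;; (a java.util.regex.Pattern on the JVM):
(def number-re (regal/regex number-form))

(re-matches number-re "123")
;; => "123"
```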
For instance, we could construct the following two-rule lexer:

```clojure
(def-regex number [:* :digit])
(def-regex word [:* [:class ["a" "z"]]])

(def lexer (make-lexer number word))

lexer
;=> {:number {:tag :regexp
              :regexp #"\d*"}
     :word {:tag :regexp
            :regexp #"[a-z]*"}}
```
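The `:regexp` entries are ordinary regular expressions, so the patterns above behave as you would expect:

```clojure
;; The patterns generated for the lexer are plain regexes:
(re-matches #"\d*" "123")    ;=> "123"
(re-matches #"[a-z]*" "abc") ;=> "abc"
(re-matches #"[a-z]*" "ab1") ;=> nil
```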
## The grammatical rules

Most of the grammatical rules are created using the EBNF notation, as follows:
```clojure
(def rules
  (instac/ebnf
    "
    doc = (token <':'>)*
    token = (number | word)
    "))

rules
;=> {:doc {:tag :star
           :parser {:tag :cat
                    :parsers ({:tag :nt :keyword :token}
                              {:tag :string :string ":" :hide true})}}
     :token {:tag :alt
             :parsers ({:tag :nt :keyword :number}
                       {:tag :nt :keyword :word})}}
```
## The combining trick

Now that we have both the lexer and the grammatical rules, we can simply merge them to get the full grammar.
```clojure
(def parser
  (insta/parser (merge lexer rules)
                :start :doc))

(parser "abc:1:def:2:3:")
;=> [:doc
     [:token [:word "abc"]]
     [:token [:number "1"]]
     [:token [:word "def"]]
     [:token [:number "2"]]
     [:token [:number "3"]]]
```
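From there, instaparse's `transform` function (in the same `instaparse.core` namespace aliased as `insta` above) can rewrite such a parse tree into ordinary Clojure values. This is the same kind of step the reader performs to produce evaluable data; a sketch:

```clojure
(require '[instaparse.core :as insta])

;; Rewrite the parse tree bottom-up: each keyword maps to a function
;; applied to the (already transformed) children of that node.
(insta/transform
  {:number #(Long/parseLong %)
   :word   symbol
   :token  identity
   :doc    list}
  [:doc
   [:token [:word "abc"]]
   [:token [:number "1"]]])
;=> (abc 1)
```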
With the exception of some details, this is how this namespace is organized.