
# fr.jeremyschoffen.textp.alpha.reader.core


This namespace provides a reader that combines our grammar and clojure's reader to turn a string of text into
data clojure can then evaluate.

## Reader results
The reader starts by parsing the text using our grammar, then returns a *clojurized* version of the parse tree.

The different syntactic elements are processed as follows (a sketch of the results follows the list):

- text -> string
- tag -> clojure fn call
- verbatim block -> string containing the verbatim block's content
- comment -> empty string, or a special map containing the comment, depending on
  [[textp.reader.alpha.core/*keep-comments*]]
- embedded clojure -> the clojure code dropped in place, or a map containing the code, depending on
  [[textp.reader.alpha.core/*wrap-embedded*]]
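
As a purely illustrative sketch, here is the kind of clojurized data the first few elements could turn into. The tag name `em` and the exact shapes are assumptions made for the example, not textp's actual syntax:

```clojure
;; Hypothetical clojurized forms, one per syntactic element described above.
;; `em` is an invented tag name used only for illustration.
"Some plain text"            ; text           -> string
'(em "some emphasized text") ; tag            -> clojure fn call (shown quoted here)
"(+ 1 2 3)"                  ; verbatim block -> string of its raw content
```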

## Special maps
The reader can wrap comments and embedded clojure in maps when instructed to. These maps have two keys, illustrated in the sketch after this list:
- `type`: a marker indicating the kind of special value the map represents
- `data`: the actual value being wrapped, i.e. the content of a comment or the embedded clojure code
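
For instance, a kept comment could be wrapped like this (the exact keyword values are assumptions made for the sketch; only the `type`/`data` structure comes from the description above):

```clojure
;; Sketch of a special map wrapping a comment.
{:type :comment
 :data "some comment text"}
```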

This model is consistent with the way [enlive](https://github.com/cgrand/enlive) treats dtd elements, for instance,
and may allow for uniform processing when generating html.

# fr.jeremyschoffen.textp.alpha.reader.grammar

# Textp's grammar.

We construct textp's grammar here using instaparse. The grammar is built in two parts:
- a lexical part, or lexer, made of regular expressions
- a set of grammatical rules tying the lexer together into the grammar

## The lexer
Our lexer is made of regular expressions constructed with the [[textp.reader.alpha.grammar/defregex]] macro,
which uses the Regal library under the covers. We then assemble a lexer from these regular expressions
with the [[textp.reader.alpha.grammar/make-lexer]] macro.

For instance, we could construct the following two-rule lexer:

```clojure
(def-regex number [:* :digit])

(def-regex word [:* [:class ["a" "z"]]])

(def lexer (make-lexer number word))

lexer
;=> {:number {:tag :regexp
              :regexp #"\d*"}
     :word {:tag :regexp
            :regexp #"[a-z]*"}}
```
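
For reference, these entries have the same shape as the maps instaparse's own combinators produce, so the lexer above could equivalently be written by hand. A sketch, assuming `instac` aliases `instaparse.combinators`:

```clojure
(require '[instaparse.combinators :as instac])

;; Hand-written equivalent of the lexer map produced by make-lexer above.
(def lexer-by-hand
  {:number (instac/regexp #"\d*")
   :word   (instac/regexp #"[a-z]*")})
```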

## The grammatical rules
We use the [[instaparse.combinators/ebnf]] function to produce grammatical rules. This allows us
to write these rules in EBNF format.

For instance, we could write the following:
```clojure
(def rules
  (instac/ebnf
     "
     doc = (token <':'>)*
     token = (number | word)
     "))

rules
;=>{:doc {:tag :star
          :parser {:tag :cat
                   :parsers ({:tag :nt :keyword :token}
                            {:tag :string :string ":" :hide true})}}
    :token {:tag :alt
            :parsers ({:tag :nt :keyword :number}
                      {:tag :nt :keyword :word})}}
```

This way of writing the grammatical rules is much easier than using function combinators directly,
and it still gives us the rules in map form.
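
For comparison, here is a sketch of the same two rules written with instaparse's function combinators; the EBNF string above is noticeably terser:

```clojure
;; The same :doc and :token rules built with function combinators.
(def rules-by-hand
  {:doc   (instac/star
            (instac/cat (instac/nt :token)
                        (instac/hide (instac/string ":"))))
   :token (instac/alt (instac/nt :number)
                      (instac/nt :word))})
```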

## The combining trick
Now that we have both a lexer and grammatical rules, we can simply merge them to get the full grammar.

We actually get an instaparse parser this way (here `insta` aliases `instaparse.core`):

```clojure
(def parser
  (insta/parser (merge lexer rules)
                :start :doc))

(parser "abc:1:def:2:3:")
;=> [:doc
     [:token [:word "abc"]]
     [:token [:number "1"]]
     [:token [:word "def"]]
     [:token [:number "2"]]
     [:token [:number "3"]]]
```
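
Since this is an ordinary instaparse parser, input the grammar can't consume yields an instaparse failure object instead of a parse tree. A quick sketch:

```clojure
(def result (parser "ABC:")) ; uppercase letters match neither number nor word

(insta/failure? result)
;=> true
```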

With the exception of a few details, this is how this namespace is built.
