
fr.jeremyschoffen.prose.alpha.reader.grammar

# Prose's grammar.

The grammar proposed here is heavily inspired by Pollen's.

We construct it in two parts:
- a lexical part, or lexer, made of regular expressions;
- a set of grammatical rules tying the lexer together into the grammar.

## The lexer.
Our lexer is made of regular expressions constructed with the
[[fr.jeremyschoffen.prose.alpha.reader.grammar.utils/def-regex]] macro. It uses the Regal library under the covers.

Then, to assemble these regexes into a grammar we use the
[[fr.jeremyschoffen.prose.alpha.reader.grammar.utils/make-lexer]] macro.

For instance, we could construct the following two-rule lexer:

```clojure
(def-regex number [:* :digit])

(def-regex word [:* [:class ["a" "z"]]])

(def lexer (make-lexer number word))

lexer
;=> {:number {:tag :regexp
              :regexp #"\d*"}
     :word {:tag :regexp
            :regexp #"[a-z]*"}}
```
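
For readers without the Prose macros at hand, roughly the same kind of map can be written by hand with instaparse's `regexp` combinator. This is only a sketch: `hand-made-lexer` is a hypothetical name, and the exact regex objects produced by `def-regex` and `make-lexer` may differ.

```clojure
(require '[instaparse.combinators :as instac])

;; Hypothetical hand-written stand-in for the lexer above:
;; a map from rule names to regexp combinators.
(def hand-made-lexer
  {:number (instac/regexp #"\d*")
   :word   (instac/regexp #"[a-z]*")})

;; Every rule is a plain map tagged :regexp.
(map :tag (vals hand-made-lexer))
;=> (:regexp :regexp)
```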

## The grammatical rules
Most of the grammatical rules are created using EBNF notation, as follows:
```clojure
;; Assumed alias: instac refers to instaparse.combinators.
(require '[instaparse.combinators :as instac])

(def rules
  (instac/ebnf
    "
    doc = (token <':'>)*
    token = (number | word)
    "))

rules
;=> {:doc {:tag :star
           :parser {:tag :cat
                    :parsers ({:tag :nt :keyword :token}
                              {:tag :string :string ":" :hide true})}}
     :token {:tag :alt
             :parsers ({:tag :nt :keyword :number}
                       {:tag :nt :keyword :word})}}
```
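
The EBNF string is a convenience. As a rough illustration, the same two rules can be spelled out with instaparse's combinator functions, which makes the map structure shown above easier to relate to the notation. The name `combinator-rules` is hypothetical, and this is not necessarily how the namespace builds its remaining rules.

```clojure
(require '[instaparse.combinators :as instac])

;; The doc and token rules, built from combinators instead of an EBNF string.
(def combinator-rules
  {:doc   (instac/star
            (instac/cat (instac/nt :token)
                        (instac/hide (instac/string ":"))))
   :token (instac/alt (instac/nt :number)
                      (instac/nt :word))})

(get-in combinator-rules [:doc :tag])
;=> :star
```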

## The combining trick
Now that we have both a lexer and grammatical rules, we can simply merge them to obtain the full grammar.

```clojure
;; Assumed alias: insta refers to instaparse.core.
(require '[instaparse.core :as insta])

(def parser
  (insta/parser (merge lexer rules)
                :start :doc))

(parser "abc:1:def:2:3:")
;=> [:doc
      [:token [:word "abc"]]
      [:token [:number "1"]]
      [:token [:word "def"]]
      [:token [:number "2"]]
      [:token [:number "3"]]]
```
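
The whole trick can also be tried with instaparse alone. The sketch below assumes hand-written regexp rules in place of `def-regex` and `make-lexer`, and uses `+` rather than `*` so that a token is never empty. All the `-sketch` names are hypothetical.

```clojure
(require '[instaparse.core :as insta]
         '[instaparse.combinators :as instac])

;; Lexer part: regex rules written by hand.
(def lexer-sketch
  {:number (instac/regexp #"\d+")
   :word   (instac/regexp #"[a-z]+")})

;; Grammatical part: EBNF rules referring to the lexer's rule names.
(def rules-sketch
  (instac/ebnf
    "doc = (token <':'>)*
     token = (number | word)"))

;; Both parts are plain maps, so merging them yields the full grammar.
(def parser-sketch
  (insta/parser (merge lexer-sketch rules-sketch)
                :start :doc))

(parser-sketch "abc:1:")
;=> [:doc [:token [:word "abc"]] [:token [:number "1"]]]

;; A token without its trailing colon does not parse.
(insta/failure? (parser-sketch "abc"))
;=> true
```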

With the exception of some details, this is how this namespace is organized.

all-delimitors (clj/s)

all-grammatical-rules (clj/s)

Merging of the lexer rules and the grammatical rules.

any-char (clj/s)

Any character whatsoever.

anything (clj/s)

braces (clj/s)

brackets (clj/s)

clojure-call-text (clj/s)

Clojure text found in clojure calls (enclosed in parens).

clojure-string (clj/s)

Text found in clojure strings.

close-brace (clj/s)

close-bracket (clj/s)

close-paren (clj/s)

complex-symbol (clj/s)

double-quote (clj/s)

enclosed-g (clj/s)

Grammatical rules describing text enclosed in balanced markers: quotes, parentheses, etc.

escape (clj/s)

escape-char (clj/s)

The escaping character.

general-g (clj/s)

The general grammar, tying the lexer and the enclosed rules together with the top grammatical rules.

grammar (clj/s)

The proper grammar in map form.

It is [[all-grammatical-rules]] updated to take [[hidden-tags]] and [[hidden-results]] into account.

hidden-results (clj/s)

Names of rules whose productions are completely hidden.

hidden-tags (clj/s)

Names of additional grammatical rules that have their tag hidden.

lexer (clj/s)

The proper lexer, a version of [[lexer*]] where each rule has its tag hidden.
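
To see what hiding a tag changes, here is a minimal sketch using instaparse's `hide-tag` combinator on a small stand-in lexer. The `-sketch` names and the two regex rules are hypothetical, and the namespace may well derive its lexer differently.

```clojure
(require '[instaparse.core :as insta]
         '[instaparse.combinators :as instac])

;; Stand-in for lexer*: regex rules only.
(def lexer*-sketch
  {:number (instac/regexp #"\d+")
   :word   (instac/regexp #"[a-z]+")})

;; Stand-in for lexer: the same rules, each with its tag hidden.
(def lexer-sketch
  (into {}
        (map (fn [[rule-name rule]]
               [rule-name (instac/hide-tag rule)]))
        lexer*-sketch))

(def parser-sketch
  (insta/parser (merge lexer-sketch
                       (instac/ebnf "doc = (token <':'>)*
                                     token = (number | word)"))
                :start :doc))

;; The :word and :number tags no longer wrap the matched text.
(parser-sketch "abc:1:")
;=> [:doc [:token "abc"] [:token "1"]]
```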

lexer* (clj/s)

Our lexer, an incomplete instaparse grammar in map form containing only regex rules.

macro-reader-char (clj/s)

ns-end (clj/s)

open-brace (clj/s)

open-bracket (clj/s)

open-paren (clj/s)

parens (clj/s)

parser (clj/s)

Instaparse parser made from [[grammar]].

pipe (clj/s)

pipe-char (clj/s)

The pipe character found between the special character and a symbol when directly using a symbol in text.

plain-text (clj/s)

Text to be interpreted as plain text, neither clojure code nor special blocks of text.
Basically any character except the diamond, which has a special meaning.

simple-symbol (clj/s)

Regex for the ns name of a symbol; parses dot-separated names until a final name.

special (clj/s)

special-char (clj/s)

The special character that denotes embedded code.

symbol-excluded-charset (clj/s)

symbol-first-char (clj/s)

In the case of the first character of a symbol name, there are more forbidden chars:
- digits aren't allowed as first character
- the macro reader char `#` isn't allowed either.

symbol-regular-char (clj/s)

Characters that are always forbidden in symbol names:
- spaces
- the diamond char, since it starts another grammatical rule
- delimiters: parens, brackets, braces and double quotes
- `/` since it has the special meaning of separating the namespace from the symbol name
- `.` since it has the special meaning of separating symbol names
- `\` since it is reserved by clojure to identify a literal character.
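
As a rough illustration of the two exclusion lists above, the character classes can be approximated with plain Clojure regexes. This is a sketch, not the regexes Prose actually builds with Regal: the `-sketch` names are hypothetical, and it assumes the diamond is `◊` (U+25CA), as in Pollen.

```clojure
;; Assumption: the "diamond" is ◊ (U+25CA), as in Pollen.
(def regular-char-sketch
  ;; Anything but whitespace, ◊, parens, brackets, braces,
  ;; double quote, /, . and \
  #"[^\s◊()\[\]{}\"/.\\]")

(def first-char-sketch
  ;; Same exclusions, plus digits and the macro reader char #
  #"[^\s◊()\[\]{}\"/.\\\d#]")

;; One allowed first char followed by any number of regular chars.
(def symbol-name-sketch
  (re-pattern (str first-char-sketch regular-char-sketch "*")))

(re-matches symbol-name-sketch "my-fn")  ;=> "my-fn"
(re-matches symbol-name-sketch "1abc")   ;=> nil
(re-matches symbol-name-sketch "a/b")    ;=> nil
```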

symbol-text (clj/s)

Regex used when parsing a symbol.

tag-clj-arg-text (clj/s)

Clojure text found in clojure arguments to tag fns (enclosed in brackets).

tag-spaces (clj/s)

Spaces found in-between tag args.

tag-text-arg-text (clj/s)

Regular text found in text arguments to tag fns (enclosed in braces).

verbatim-text (clj/s)

Text found in verbatim blocks.
