Configure the Stanford CoreNLP parser.

This provides a plugin architecture for natural language processing tasks in a pipeline. A parser takes either a human language utterance or previously annotated data parsed from an utterance.

### Parser Libraries

Each parser provides a set of *components* that make up the pipeline. Each component (e.g. [[tokenize]]) is a function that returns a map containing the keys:

* **component** a key that's the name of the component to create
* **parser** a key that is the name of the parser it belongs to

For example, the Stanford CoreNLP word tokenizer has the following return map:

* **:component** :tokenize
* **:lang** *lang-code* (e.g. `en`)
* **:parser** :stanford

The map also has additional key/value pairs that represent the remaining configuration given to the parser library used to create its pipeline components. All parse library names (keys) are given in [[all-parsers]]. Use [[register-library]] to add your library with the key name of your parser.

### Usage

You can either create your own custom parser configuration with [[create-parse-config]] and then create its respective context with [[create-context]], or let a default context be created and used for each parse invocation. If you create your own, then each parse call needs to be in a [[with-context]] lexical context.

Once configured, use [[zensols.nlparse.parse/parse]] to invoke the parsing pipeline.
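The two flows described above can be sketched as follows. This is a hedged example: the utterance text and the choice of pipeline components are illustrative, and the `:require` forms assume the namespace layout documented on this page.

```clojure
(ns example.core
  (:require [zensols.nlparse.config :as conf :refer [with-context]]
            [zensols.nlparse.parse :refer [parse]]))

;; default flow: no explicit configuration, so a default context is
;; created and used for this (and each subsequent) parse invocation
(parse "I am Paul Landes.")

;; custom flow: build a configuration, create its context, then bind
;; the context lexically around each parse call
(let [config (conf/create-parse-config
              :pipeline [(conf/tokenize)
                         (conf/sentence)
                         (conf/part-of-speech)])
      context (conf/create-context config)]
  (with-context context
    (parse "I am Paul Landes.")))
```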
All parsers available in this package (jar).
(component-documentation)
Return documentation maps with the keys `:name` and `:doc`.
(component-from-config config name)
Return a component by **name** from parse **config**.
(component-from-context context name)
Return a component by **name** from parse **context**.
(components-as-string)
Return all available components as a string.
(context lib-name)
Return the context created with [[create-context]]. See the [usage section](#usage).
(coreference)
Create an annotator to build the coreference tree structure.
(create-context)
(create-context parse-config & {:keys [timeout-millis]})
Return a context used during parsing. This calls the create functions of all registered ([[register-library]]) parse libraries and returns an object to be used with the parse function [[zensols.nlparse.parse/parse]].

The **parse-config** parameter is either a parse configuration created with [[create-parse-config]] or a string. If a string is used, the pipeline is created from component names separated by commas. See [[zensols.nlparse.config-parse]] for more information on this DSL.

Using the output of [[components-as-string]] would create all components. However, the easier way to utilize all components is to call this function with no parameters.

See the [usage section](#usage).

Keys
---

* **:timeout-millis** number of milliseconds to allow the parser to complete before `java.util.concurrent.TimeoutException` is thrown, or `nil` for no timeout; no timeout is the default
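A sketch of the string form of **parse-config** (the component names and timeout value are illustrative; `conf` is an assumed alias for this namespace):

```clojure
;; comma-separated component names are parsed by the
;; zensols.nlparse.config-parse DSL into pipeline components
(conf/create-context "tokenize,sentence,part-of-speech"
                     :timeout-millis 5000)

;; no arguments: configure all components with no parse timeout
(conf/create-context)
```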
(create-parse-config &
{:keys [parsers only-tokenize? pipeline]
:or {parsers all-parsers}})
Create a parse configuration given as input to [[create-context]]. If no keys are given all components are configured (see [[components-as-string]]).

Keys
----

* **:only-tokenize?** create a parse configuration that only utilizes the tokenization of the Stanford CoreNLP library
* **:pipeline** a list of components created with one of the many component create functions (e.g. [[tokenize]]) or from a roll-your-own add-on library; this renders the `:parsers` key unused
* **:parsers** a set of parser library names (keys) used to indicate which components to return (e.g. `:stanford`); see [[all-parsers]]
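For instance, a tokenize-only configuration might be built like this (a sketch; `conf` is an assumed alias for this namespace):

```clojure
;; only word/sentence segmentation from Stanford CoreNLP
(def tok-only-context
  (-> (conf/create-parse-config :only-tokenize? true)
      conf/create-context))
```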
(dependency-parse-tree)
Create an annotator to create a dependency parse tree. See the [dependencies manual](https://nlp.stanford.edu/software/dependencies_manual.pdf) for definitions.
(morphology)
Create a morphology annotator, which adds the lemmatization of a word. This adds the `:lemma` keyword to each token.
(named-entity-recognizer)
(named-entity-recognizer paths)
(named-entity-recognizer paths lang)
Create annotator to do named entity recognition. All models in the **paths** sequence are loaded. The **lang** is the language parameter, which can be either `ENGLISH` or `CHINESE` and defaults to `ENGLISH`. See the [NERClassifierCombiner Javadoc](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERClassifierCombiner.html) for more information.

By default, the [English CoNLL 4 class](http://www.cnts.ua.ac.be/conll2003/ner/) model is used. See the [Stanford NER](http://nlp.stanford.edu/software/CRF-NER.shtml) page for more information.
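A hedged sketch of loading an explicit model: the classpath below is the stock Stanford CoreNLP location of the CoNLL 4 class CRF model, shown for illustration only (`conf` is an assumed alias for this namespace).

```clojure
;; load a specific CRF model; several paths may be given and all are loaded
(conf/named-entity-recognizer
 ["edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz"])
```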
(natural-logic)
Create a natural logic annotator. See the [Stanford CoreNLP documentation](https://stanfordnlp.github.io/CoreNLP/natlog.html) for more information.
(parse-functions)
Return all registered parse functions in the order they are to be called. See the [usage section](#usage).
(parse-timeout)
Return the number of milliseconds to timeout the parse or `nil` if none. See [[create-context]].
(parse-tree)
(parse-tree {:keys [include-score? maxtime use-shift-reduce? language]
:as conf})
Create annotator to create head and parse trees.

Keys
----

* **:include-score?** `true` if computed per-node accuracy scores are included in the parse tree
* **:maxtime** the maximum time in milliseconds to wait for the tree parser to complete (per sentence)
* **:use-shift-reduce?** if `true` use the faster and smaller shift reduce model, but the model must be present and model load time is slower (see the [shift reduce doc](https://nlp.stanford.edu/software/srparser.shtml))
* **:language** the parse language model (currently only used for shift reduce), defaults to `english`
(part-of-speech)
(part-of-speech pos-model-resource)
Create annotator to do part of speech tagging. You can set the model with a resource identified with the **pos-model-resource** string, which defaults to the [English WSJ trained corpus](http://www-nlp.stanford.edu/software/pos-tagger-faq.shtml).
(print-component-documentation)
Print the formatted component documentation; see [[component-documentation]].
(register-library lib-name lib-cfg & {:keys [force?]})
Register plugin library **lib-name** with **lib-cfg**, a map containing:

* **:create-fn** a function that takes a parse configuration (see [[create-parse-config]]) to create a context later returned with [[context]]
* **:reset-fn** a function that takes the parse context to `null` out any atoms or cached data structures; this is called by [[reset]]
* **:parse-fn** a function that takes a single human language utterance string or the output of another parse library
* **:component-fns** all component creating functions from this library

*Implementation note*: this forces re-creation of the default context (see the [usage section](#usage)) to allow [[create-context]] to be invoked by the calling library at the next invocation of [[context]] for newly registered libraries.
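A skeletal registration matching the keys above. Everything here is a placeholder: the `:my-parser` library, its component function, and the no-op behaviors are hypothetical (`conf` is an assumed alias for this namespace).

```clojure
;; a component create function for the hypothetical :my-parser library
(defn my-component []
  {:component :my-component
   :parser :my-parser})

(conf/register-library
 :my-parser
 {:create-fn     (fn [parse-config] (atom parse-config))
  :reset-fn      (fn [context] (reset! context nil))
  :parse-fn      (fn [anon] anon)
  :component-fns [my-component]})
```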
(reset & {:keys [hard?] :or {hard? true}})
Reset the cached data structures and configuration in the default (or currently bound [[with-context]]) context. This is also called by [[zensols.actioncli.dynamic/purge]].
(semantic-role-labeler)
(semantic-role-labeler lang-code)
Create a semantic role labeler annotator. You can configure the language with the **lang-code**, which is a two letter language code and defaults to English.

Keys
----

* **:lang** language used to create the SRL pipeline
* **:model-type** model type used to create the SRL pipeline
* **:first-label-token-threshold** token minimum position that contains a label to help decide the best SRL labeled sentence to choose
(sentence)
Create annotator to group tokens into sentences per configured language.
(sentiment)
(sentiment aggregate?)
Create annotator for sentiment analysis. The **aggregate?** parameter tells the parser to create a top (root) sentiment level score for the entire parse utterance.
(stopword)
Create annotator to annotate stop words (boolean).
(token-regex)
(token-regex paths)
Create an annotator for token regular expressions. You can configure an array of strings identifying either resources or files using the **paths** parameter, which defaults to `token-regex.txt`; that file is included in the resources of this package as an example and is used with the test cases.

The `:tok-re-resources` is a sequence of string paths to create a single annotator, or a sequence of sequences of string paths. If more than one annotator is created the output of an annotator can be used in the patterns of the next.
(tokenize)
(tokenize lang-code)
Create annotator to split words per configured language. The tokenization language is set with the **lang-code** parameter, which is a two letter language code and defaults to `en` (English).
(with-context context & forms)
Use the parser with a context created with [[create-context]]. This context is optionally configured. Without this macro the default context is used as described in the [usage section](#usage).