opennlp.tools.train


Clojure only.

build-dictionary
build-posdictionary
train-document-categorization
train-name-finder
train-pos-tagger
train-sentence-detector
train-tokenizer
train-treebank-chunker
train-treebank-parser
write-model

This namespace contains tools used to train OpenNLP models

build-dictionary^clj

(build-dictionary in)

Builds a Dictionary from a file in the appropriate format.

build-posdictionary^clj

(build-posdictionary in)

Builds a POSDictionary from a file in the appropriate format.

A POSDictionary records which part-of-speech tags a word may be assigned

train-document-categorization^clj

(train-document-categorization in)

(train-document-categorization lang in)

(train-document-categorization lang in cut)

(train-document-categorization lang in cut iter)

Returns a classification model based on a given training file

train-name-finder^clj

(train-name-finder in)

(train-name-finder lang in)

(train-name-finder lang
                   in
                   iter
                   cut
                   &
                   {:keys [entity-type feature-gen classifier]
                    :or {entity-type "default" classifier "MAXENT"}})

Returns a trained name finder based on a given training file. Uses a non-deprecated train() method that allows for perceptron training with minimal modification. Optional arguments include the type of entity (e.g. "person"), custom feature generation, and a knob for switching to perceptron training (maxent is the default). Prefer a cutoff of 0 for perceptron training and 5 for maxent.
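A minimal sketch of the shortest and longest arities listed above, assuming the library is on the classpath. The training file path, the iteration count of 100, the entity type, and the "PERCEPTRON" classifier string are illustrative assumptions; only "MAXENT" and "default" appear in the signature on this page.

```clojure
(require '[opennlp.tools.train :as train])

;; Hypothetical path to an annotated name-finder training file.
(def training-file "training/named_org.train")

;; Shortest arity: only the training file is supplied.
(def default-model
  (train/train-name-finder training-file))

;; Longest arity: lang, in, iter, cut, then keyword options.
;; Cutoff 0 follows the docstring's advice for perceptron training.
;; "PERCEPTRON" is an assumed value for :classifier, not confirmed here.
(def perceptron-model
  (train/train-name-finder "en" training-file 100 0
                           :entity-type "organization"
                           :classifier "PERCEPTRON"))
```

Note that this function takes `iter` before `cut`, so the cutoff is the fourth positional argument.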

train-pos-tagger^clj

(train-pos-tagger in)

(train-pos-tagger lang in)

(train-pos-tagger lang in tagdict)

(train-pos-tagger lang in tagdict iter cut)

Returns a POS tagger based on a given training file.
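One way the `tagdict` argument might be supplied, sketched under the assumption that it accepts the value returned by `build-posdictionary`. Both file paths are hypothetical.

```clojure
(require '[opennlp.tools.train :as train])

;; Hypothetical path to a tag dictionary file in the format
;; build-posdictionary expects.
(def tagdict
  (train/build-posdictionary "training/tagdict"))

;; Three-argument arity from the signature list: lang, in, tagdict.
;; Assumes tagdict can be the POSDictionary built above.
(def pos-model
  (train/train-pos-tagger "en" "training/postagger.train" tagdict))
```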

train-sentence-detector^clj

(train-sentence-detector in)

(train-sentence-detector lang in)

Returns a sentence model based on a given training file

train-tokenizer^clj

(train-tokenizer in)

(train-tokenizer lang in)

(train-tokenizer lang in iter cut)

Returns a tokenizer based on a given training file.

train-treebank-chunker^clj

(train-treebank-chunker in)

(train-treebank-chunker lang in)

(train-treebank-chunker lang in iter cut)

Returns a treebank chunker based on a given training file.

train-treebank-parser^clj

(train-treebank-parser in headrules)

(train-treebank-parser lang in headrules)

(train-treebank-parser lang in headrules iter cut)

Returns a treebank parser based on a training file and a set of head rules.
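A short sketch of the two-argument arity, assuming both `in` and `headrules` can be given as file paths. The paths themselves are hypothetical.

```clojure
(require '[opennlp.tools.train :as train])

;; Hypothetical paths: a file of treebank-style parses and a head rules file.
;; Passing headrules as a path string is an assumption, not confirmed here.
(def parser-model
  (train/train-treebank-parser "training/parser.train"
                               "training/headrules"))
```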

write-model^clj

(write-model model out-stream)

Write a model to disk
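A minimal train-then-save sketch. It assumes `out-stream` accepts anything `clojure.java.io/output-stream` can open, such as a path string; the file names are hypothetical.

```clojure
(require '[opennlp.tools.train :as train])

;; Train a tokenizer from a hypothetical training file.
(def token-model
  (train/train-tokenizer "training/tokenizer.train"))

;; Serialize the trained model. Passing a path string as out-stream
;; is an assumption about how the argument is coerced.
(train/write-model token-model "my-tokenizer-model.bin")
```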
