This namespace contains tools used to train OpenNLP models
(build-dictionary in)
Build a Dictionary based on a file in the appropriate format
(build-posdictionary in)
Build a POSDictionary based on a file in the appropriate format.
A POSDictionary records which part-of-speech tags a word may be assigned.
(train-document-categorization in)
(train-document-categorization lang in)
(train-document-categorization lang in cut)
(train-document-categorization lang in cut iter)
Returns a classification model based on a given training file
(train-name-finder in)
(train-name-finder lang in)
(train-name-finder lang in iter cut & {:keys [entity-type feature-gen classifier] :or {entity-type "default" classifier "MAXENT"}})
Returns a trained name finder based on a given training file. Uses a non-deprecated train() method that allows for perceptron training with minimal modification. Optional arguments include the type of entity (e.g. "person"), custom feature generation, and a switch for selecting perceptron training (maxent is the default). For perceptron training prefer a cutoff of 0; for maxent prefer a cutoff of 5.
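The cutoff advice above can be sketched as follows. This is a minimal illustration, not the library's own example: the training file path, the iteration count of 100, and the classifier string "PERCEPTRON" are assumptions (only "MAXENT" appears in the signature), and the helper names are hypothetical.

```clojure
(require '[opennlp.tools.train :as train])

(defn recommended-cutoff
  "Cutoff suggested by the docstring: 0 for perceptron, 5 for maxent.
  The string \"PERCEPTRON\" is an assumed classifier name."
  [classifier]
  (if (= classifier "PERCEPTRON") 0 5))

(defn train-person-finder
  "Trains a name finder for \"person\" entities from training-file
  (a hypothetical path) using the given classifier."
  [training-file classifier]
  (let [iterations 100] ; illustrative value, not taken from the docs
    (train/train-name-finder "en" training-file
                             iterations
                             (recommended-cutoff classifier)
                             :entity-type "person"
                             :classifier classifier)))

(comment
  ;; Evaluate at a REPL once a training file exists at this path.
  (train-person-finder "training/named_person.train" "MAXENT")
  (train-person-finder "training/named_person.train" "PERCEPTRON"))
```

Note that this arity takes iter before cut, and that the options after cut are passed as keyword/value pairs.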
(train-pos-tagger in)
(train-pos-tagger lang in)
(train-pos-tagger lang in tagdict)
(train-pos-tagger lang in tagdict iter cut)
Returns a pos-tagger based on a given training file
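A rough sketch of how the tagdict argument might be supplied, assuming it accepts the value returned by build-posdictionary (the page does not state this explicitly). The function name and both file paths are hypothetical.

```clojure
(require '[opennlp.tools.train :as train])

(defn train-tagger-with-dictionary
  "Builds a POSDictionary from tagdict-file and passes it to
  train-pos-tagger. Assumes tagdict is a POSDictionary as produced
  by build-posdictionary."
  [training-file tagdict-file]
  (let [tagdict (train/build-posdictionary tagdict-file)]
    (train/train-pos-tagger "en" training-file tagdict)))

(comment
  ;; Hypothetical paths; evaluate at a REPL once the files exist.
  (train-tagger-with-dictionary "training/pos.train" "training/tagdict"))
```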
(train-sentence-detector in)
(train-sentence-detector lang in)
Returns a sentence model based on a given training file
(train-tokenizer in)
(train-tokenizer lang in)
(train-tokenizer lang in iter cut)
Returns a tokenizer based on a given training file
(train-treebank-chunker in)
(train-treebank-chunker lang in)
(train-treebank-chunker lang in iter cut)
Returns a treebank chunker based on a given training file
(train-treebank-parser in headrules)
(train-treebank-parser lang in headrules)
(train-treebank-parser lang in headrules iter cut)
Returns a treebank parser based on a training file and a set of head rules
(write-model model out-stream)
Write a model to disk
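A typical workflow is to train a model and then persist it with write-model. The sketch below is one way this might look: it passes an explicit output stream, since the parameter is named out-stream, and the function name and file paths are hypothetical.

```clojure
(require '[clojure.java.io :as io]
         '[opennlp.tools.train :as train])

(defn train-and-save-tokenizer!
  "Trains an English tokenizer from training-file, writes the model to
  model-file, and returns the model. Both paths are supplied by the caller."
  [training-file model-file]
  (let [model (train/train-tokenizer "en" training-file)]
    ;; with-open closes the stream even if writing throws.
    (with-open [out (io/output-stream model-file)]
      (train/write-model model out))
    model))

(comment
  ;; Hypothetical paths; evaluate at a REPL once the training file exists.
  (train-and-save-tokenizer! "training/tokenizer.train" "models/my-tokenizer.bin"))
```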