zensols.model.classifier

A utility library that wraps the Weka library. This library works with zensols.model.weka to do the following:

  • Cross validate models
  • Manage and sort results (i.e. cross validations)
  • Train models
  • Read/write ARFF files

This namespace uses the resource location system (https://github.com/plandes/clj-actioncli#resource-location) to configure the location of files and output analysis files. For more information about the configuration specifics see model-read-resource and analysis-dir, which both use resource-path (https://plandes.github.io/clj-actioncli/codox/zensols.actioncli.resource.html#var-resource-path).

You probably don't want to use this library directly. Please look at zensols.model.eval-classifier and zensols.model.execute-classifier.

*arff-file*clj

File to read from or write to for any operation that accesses the ARFF file(s) on the file system.

*best-result-criteria*clj

Key used to sort results by their most optimal performance statistic. Valid values are: :accuracy, :wprecision, :wrecall, :wfmeasure, :kappa, :rmse
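
As a rough illustration of what sorting by a criteria key means, the following self-contained sketch ranks plain maps of statistics. The result maps, the best-first helper and the local *criteria* var are hypothetical stand-ins and not the library's implementation. It also assumes that :rmse is better when lower while the other statistics are better when higher.

```clojure
;; Hypothetical stand-in for *best-result-criteria*; not the library's var.
(def ^:dynamic *criteria* :accuracy)

(defn best-first
  "Sort result maps so the best value for *criteria* comes first.
  Assumes :rmse is an error (lower is better) and that every other
  statistic is better when higher."
  [results]
  (let [sorted (sort-by *criteria* results)]
    (if (= *criteria* :rmse)
      sorted
      (reverse sorted))))

;; Made-up statistics purely for illustration.
(def results
  [{:name "a" :accuracy 0.81 :rmse 0.20}
   {:name "b" :accuracy 0.93 :rmse 0.25}
   {:name "c" :accuracy 0.88 :rmse 0.31}])

;; Ranking with the default criteria (:accuracy).
(def by-accuracy (mapv :name (best-first results)))

;; Rebinding the dynamic var changes the ranking for the enclosed call.
(def by-rmse
  (binding [*criteria* :rmse]
    (mapv :name (best-first results))))
```

With this data, by-accuracy ranks "b" first and by-rmse ranks "a" first, which shows how the chosen key changes which result counts as best.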

*class-feature-meta*clj

The class feature metadata (see zensols.model.weka/create-attrib).

*classifier-class*clj

Class name for the classifier used. This defaults to J48.

*create-classifier-fn*clj

Function used to create a classifier. Takes as input a weka.core.Instances.

*cross-fold-count*clj

The default number of folds to use during cross fold validation (see compile-results).

*cross-val-fns*clj

If this is non-nil then two pass validation is used. This is a map with the following keys:

  • :train-fn a function that is called during training for each fold to stitch in partial feature-sets to get better results; almost always set to zensols.model.eval-classifier/two-pass-train-instances
  • :test-fn just like :train-fn but called during testing; almost always set to zensols.model.eval-classifier/two-pass-train-instances

See zensols.model.eval-classifier/*two-pass-config*

*get-data-fn*clj

A function that generates a weka.core.Instances for cross validation, training, etc.

*operation-write-instance-fns*clj

A map whose values are functions that return a java.io.File for the operation represented by the respective key. An ARFF file is created at the file location. The keys are one of:

  • :train-classifier called when the classifier is training a model
  • :test-classifier called when the classifier is testing a model

*output-class-feature-meta*clj

Default attribute name for the predicted label.

*rand-fn*clj

A function that returns a java.util.Random used to randomize the train/test dataset.
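
One reason to supply such a function is reproducibility: a fixed seed gives the same shuffling on every run. The sketch below only uses java.util.Random. The seeded-rand-fn helper is a hypothetical name, and the commented binding form assumes the var is dynamic, as its earmuffs suggest.

```clojure
(import 'java.util.Random)

(defn seeded-rand-fn
  "Return a no-arg function that creates a java.util.Random with a fixed seed."
  [seed]
  (fn [] (Random. (long seed))))

(def rand-fn (seeded-rand-fn 42))

(defn draw-three
  "Create a fresh generator and take three bounded integers from it."
  []
  (let [^Random r (rand-fn)]
    (vec (repeatedly 3 #(.nextInt r 100)))))

;; Generators created with the same seed yield the same sequence.
(def first-draws (draw-three))
(def second-draws (draw-three))

;; Assumed usage, with the library on the classpath:
;; (binding [zensols.model.classifier/*rand-fn* rand-fn]
;;   ...)
```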

analysis-report-resourceclj

(analysis-report-resource)

Return the model directory on the file system as defined by the :analysis-report. See namespace documentation on how to configure.

classifier-nameclj

(classifier-name classifier-instance)

Return a decent human readable name of a classifier instance.

classify-instanceclj

(classify-instance classifier unlabeled return-keys)

Make predictions for all instances.

  • classifier instance of weka.classifiers.Classifier
  • unlabeled contains feature set data with an empty class label as a weka.core.Instances
  • return-keys what data to return
    • :label the classified label
    • :distributions the probability distribution over the label

compile-resultsclj

(compile-results results)

Return an easier to use map of the result data produced by cross-validate-tests. The map contains all the performance statistics and:

  • :feature-metadata the feature metadata
  • :result weka.classifiers.Evaluation instance
  • :all-results a sorted list of weka.classifiers.Evaluation instances

See cross-validate-tests for where the results data is created.

cross-validate-testsclj

(cross-validate-tests classifier attributes feature-metadata)

Run the cross validation for classifier and attributes (symbol set).

excel-resultsclj

(excel-results sheet-name-results out-file)

Save the results in Excel format.

excel-results-precisionclj

An integer specifying the length of the mantissa when creating the results spreadsheet in excel-results.

filter-attribute-dataclj

(filter-attribute-data unfiltered attributes)

Create a filtered data set (weka.core.Instances) from unfiltered Instances. Parameter attributes is a set of string attribute names.

initializeclj

(initialize)

Initialize model resource locations.

This needs the system property clj.nlp.parse.model set to a directory that has the POS tagger model english-left3words-distsim.tagger (or whatever you configure in zensols.nlparse.stanford/create-context) in a directory called pos.

See the source documentation (https://github.com/plandes/clj-nlp-parse) for more information.

model-exists?clj

(model-exists? name)

Return whether the model with name exists.

See model-read-resource.

model-read-resourceclj

(model-read-resource name)

Return a file pointing to the model with name using the :model-read resource path (see zensols.actioncli.resource/resource-path).

model-write-resourceclj

(model-write-resource name)

Return a file pointing to the model with name using the :model-write resource path (see zensols.actioncli.resource/resource-path).

print-eval-resultsclj

(print-eval-results eval)

Print the results, confusion matrix and class details of a weka.classifiers.Evaluation to standard out.

print-resultsclj

(print-results results & {:keys [title]})

Print the results, confusion matrix and class details of a single weka.classifiers.Evaluation, or of a sequence of them, to standard out.

read-arffclj

(read-arff input-file)

Return a weka.core.Instances from an ARFF file.
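
For readers unfamiliar with the format, the sketch below writes a tiny ARFF file using only plain Clojure. The data set is made up for illustration, and the commented call at the end assumes that read-arff accepts a java.io.File, which the docstring does not state.

```clojure
(require '[clojure.string :as str])

;; A tiny, made-up data set in ARFF format: a header of @attribute
;; declarations followed by comma separated rows under @data.
(def arff-text
  (str/join "\n"
            ["@relation weather"
             ""
             "@attribute temperature numeric"
             "@attribute windy {true,false}"
             "@attribute play {yes,no}"
             ""
             "@data"
             "30,false,yes"
             "12,true,no"
             ""]))

;; Write the text to a temporary file.
(def arff-file (java.io.File/createTempFile "example" ".arff"))
(spit arff-file arff-text)

;; Read the file back with plain Clojure and keep only the data rows.
(def data-rows
  (->> (str/split-lines (slurp arff-file))
       (drop-while #(not= "@data" %))
       rest
       (remove str/blank?)))

;; Assumed usage, with the library on the classpath:
;; (zensols.model.classifier/read-arff arff-file)
```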

read-modelclj

(read-model name & {:keys [fail-if-not-exists?] :or {fail-if-not-exists? true}})

Get a saved model (classifier and attributes used). If name is a string, use model-read-resource to calculate the file name. Otherwise, it should be the file where the model is stored.

See model-read-resource.

Keys

  • :fail-if-not-exists? if true then throw an exception if the model file is missing

test-classifierclj

(test-classifier classifier attributes train-data test-data)

Test/evaluate classifier (weka.classifiers.Classifier).

train-classifierclj

(train-classifier classifier attributes)

Train classifier (weka.classifiers.Classifier).

train-test-classifierclj

(train-test-classifier classifier
                       feature-meta-sets
                       feature-metadata
                       train-instances
                       test-instances)

write-arffclj

(write-arff instances)

Write a weka.core.Instances to an ARFF file and return that file.

write-modelclj

(write-model name model)

Save a model (classifier and attributes used). If name is a string, use model-write-resource to calculate the file name. Otherwise, it should be the file to write the model to.

See model-write-resource.
