
zensols.model.weka

Wraps the Weka Java API. This is probably the wrong library for most uses; instead, take a look at zensols.model.eval-classifier and zensols.model.execute-classifier.

*classifiers* (clj)

An (incomplete) set of Weka classifiers keyed by speed, by type, or (for singletons) by name.

  • fast train quickly
  • slow train slowly
  • really-slow train very very slowly
  • lazy lazy category
  • meta meta classifiers (e.g. boosting)
  • tree tree based classifiers (typically train quickly)

Each singleton classifier entry is a list like the others but has only a single element. They include: zeror, svm, j48, random-forest, naivebays, logit, logitboost, smo, kstar.

*cross-fold-info* (clj)

When two-pass cross fold validations are used this is bound to the following map during the validation (see clone-instances):

  • :train? true if creating folds for the train phase; otherwise the test phase is used
  • :fold the number of the fold
  • :state state shared between training and testing (i.e. context)
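
As a rough illustration of reading this map, the sketch below destructures the current binding and turns it into a log-friendly string. The helper name `describe-fold` is hypothetical, and the sketch assumes the var carries no `:fold` value outside a two-pass validation.

```clojure
(require '[zensols.model.weka :as weka])

(defn describe-fold
  "Hypothetical helper: summarize the current binding of
  weka/*cross-fold-info* as a string."
  []
  ;; Destructure only the keys documented above.
  (let [{:keys [train? fold]} weka/*cross-fold-info*]
    (if (nil? fold)
      ;; Assumption: no :fold is available outside a validation run.
      "no two-pass cross-fold validation in progress"
      (format "fold %s (%s phase)" fold (if train? "train" "test")))))
```

Such a helper would typically be called from inside a `train-fn` or `test-fn` passed to clone-instances, where the binding is in effect.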

*missing-values-ok* (clj)

Whether the classifier can handle missing values; otherwise an exception is thrown for missing values.
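
A minimal sketch of toggling this flag for a limited scope, assuming the var is dynamic (as its earmuffed name suggests). The wrapper name `call-allowing-missing-values` is hypothetical.

```clojure
(require '[zensols.model.weka :as weka])

(defn call-allowing-missing-values
  "Hypothetical wrapper: call the zero-argument function f with
  weka/*missing-values-ok* bound to true."
  [f]
  ;; Assumption: the var is dynamic, so `binding` applies to this thread
  ;; for the duration of the call only.
  (binding [weka/*missing-values-ok* true]
    (f)))
```

Because the binding ends when `f` returns, any lazy sequences produced inside `f` should be realized (for example with `doall`) before returning.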

append-instances (clj)

(append-instances src dst)

Merge two instances row-wise by adding dst to src.

attribute-by-name (clj)

(attribute-by-name instances name)

Return a weka.core.Attribute instance by name from a weka.core.Instances.

attributes-for-instances (clj)

(attributes-for-instances insts & {:keys [sort?] :or {sort? true}})

Return a map with :name and :type for each attribute in a weka.core.Instances.
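
The two attribute helpers might be used together roughly as follows. This is a sketch, not the library's own example: the dataset is built directly with the Weka API (assuming Weka 3.7 or later, where `Instances` accepts an `ArrayList` of attributes), and attributes-for-instances is assumed to return a sequence of maps.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.core Attribute Instances)
        '(java.util ArrayList))

;; An empty two-column dataset built directly with the Weka API.
(def toy
  (Instances. "toy" (ArrayList. [(Attribute. "len") (Attribute. "score")]) 0))

;; Look up a single attribute by its name.
(def len-attrib (weka/attribute-by-name toy "len"))
(println (.name len-attrib) (.isNumeric len-attrib))

;; Describe every attribute. Assumption: :sort? false keeps dataset order.
(doseq [attrib-info (weka/attributes-for-instances toy :sort? false)]
  (println (:name attrib-info) (:type attrib-info)))
```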

clone-classifier (clj)

(clone-classifier classifier)

clone-instances (clj)

(clone-instances inst & {:keys [train-fn test-fn randomize-fn] :as opts})

Return a deep clone of inst, optionally with a specific training and test set. See *cross-fold-info* to get information during the validation for debugging and analysis.

  • inst an (object) instance of weka.core.Instances (the whole dataset)

Keys

  • train-fn a function that takes the following arguments: a weka.core.Instances created for the training set, number of folds, the fold number and a java.util.Random to pass to the Weka layer to shuffle the dataset

  • test-fn just like train-fn but used to create the test data set and it doesn't take the java.util.Random instance
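
The argument lists described above line up with Weka's own `Instances.trainCV` and `Instances.testCV` methods, so one plausible pair of fold functions simply delegates to them. This is a sketch under assumptions: the names `train-fold`, `test-fold` and `clone-with-custom-folds` are hypothetical, and each function is assumed to return the `weka.core.Instances` for its fold.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.core Instances)
        '(java.util Random))

(defn train-fold
  "Hypothetical train-fn: return the training split for one fold."
  [^Instances insts folds fold ^Random rnd]
  ;; trainCV shuffles the training split with the supplied Random.
  (.trainCV insts (int folds) (int fold) rnd))

(defn test-fold
  "Hypothetical test-fn: return the test split for one fold."
  [^Instances insts folds fold]
  (.testCV insts (int folds) (int fold)))

(defn clone-with-custom-folds
  "Clone data so that later cross validation uses the functions above."
  [^Instances data]
  (weka/clone-instances data :train-fn train-fold :test-fn test-fold))
```

Replacing either function with one that filters or reorders rows is where custom fold logic would go.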

create-attrib (clj)

(create-attrib att-name type)

Create a Weka Attribute instance with att-name.

type is the type of attribute, which can be string, boolean, numeric, or a sequence of strings representing possible enumeration values (nominals in Weka speak).
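
A brief sketch of the three kinds of type argument. It assumes the type tokens are passed as quoted symbols (`'numeric`, `'string`, `'boolean`), which the docstring does not state explicitly, and that attribute names are plain strings.

```clojure
(require '[zensols.model.weka :as weka])

;; Assumption: type tokens are quoted symbols.
(def len-attrib (weka/create-attrib "len" 'numeric))
(def text-attrib (weka/create-attrib "text" 'string))

;; A sequence of strings yields a nominal (enumerated) attribute.
(def lang-attrib (weka/create-attrib "lang" ["en" "fr" "de"]))

(println (.name lang-attrib) (.numValues lang-attrib))
```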

instances (clj)

(instances inst-name feature-sets feature-metas)
(instances inst-name
           feature-sets
           feature-metas
           class-feature-meta
           &
           {:keys [clone?] :or {clone? true}})

Create a new weka.core.Instances instance.

  • inst-name used to identify the model data set

  • feature-sets a sequence of maps with each map having key/value pairs of the features of the model to be populated in the returned weka.core.Instances

  • feature-metas a map of key/value pairs describing the features (they become weka.core.Attribute instances) where the values are string, boolean, numeric, or a sequence of strings representing possible enumeration values (nominals in Weka speak)

  • class-feature-meta just like a (single) feature-metas but describes the class
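
The three-argument form might be called along these lines. The sketch assumes feature keys are keywords and type tokens are quoted symbols; neither is confirmed by the docstring, and the feature names are made up for illustration.

```clojure
(require '[zensols.model.weka :as weka])

;; Assumption: keyword feature names and quoted-symbol type tokens.
(def feature-metas
  {:len 'numeric
   :lang ["en" "fr"]})

;; One map per row, using the same keys as feature-metas.
(def feature-sets
  [{:len 5 :lang "en"}
   {:len 12 :lang "fr"}])

(def data (weka/instances "toy-dataset" feature-sets feature-metas))

(println (.numInstances data) (.numAttributes data))
```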

let-classifier (clj/s, macro)

(let-classifier fdef-expr & forms)

fnspec ==> (classifier-name [insts] exprs)

Define a classifier that uses Clojure code to evaluate insts instances and evaluate body exprs.

Example:

(let-classifier
  (langid-baseline [inst]
     (let [attrib (weka/attribute-by-name inst "langid-1-id")
           val (.stringValue inst attrib)
           rval (= "en" val)]
       (log/infof "langid: %s for: %s: res: %s" val inst rval)
       (if rval 1 0)))
  (terse-results langid-baseline meta-set))

make-classifiers (clj)

(make-classifiers)
(make-classifiers set-name-or-instance)

Make classifiers from either a key in *classifiers* or an instance of weka.classifiers.Classifier (meaning an already constructed instance). All classifiers are returned for the 0-arg option.
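
The three ways of calling this function could look as follows. This is a sketch: it assumes the keys of `*classifiers*` are keywords such as `:fast`, and it uses Weka's `J48` only as a convenient already constructed classifier.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.classifiers.trees J48))

;; 0-arg form: every classifier known to *classifiers*.
(def all-classifiers (weka/make-classifiers))

;; A named set. Assumption: set names are keywords.
(def fast-classifiers (weka/make-classifiers :fast))

;; An already constructed weka.classifiers.Classifier instance.
(def from-instance (weka/make-classifiers (J48.)))
```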

populate-instances (clj)

(populate-instances insts feature-metas feature-sets)

Populate a weka.core.Instances instance from Clojure data structures.

  • insts a weka.core.Instances that will be populated
  • feature-metas a map of key/value pairs describing the features (they become weka.core.Attribute instances) where the values are described as types in create-attrib
  • feature-sets a sequence of maps with each map having key/value pairs of the features of the model to be populated in the returned weka.core.Instances

remove-attributes (clj)

(remove-attributes inst attrib-names & {:keys [invert-selection?]})

Remove a set of attributes from inst (weka.core.Instances) by (string) name.

sparse-instances (clj)

(sparse-instances maps
                  dim
                  &
                  {:keys [pattern class-attribute-name instance-name add-class?
                          default-value]
                   :or {pattern "f%d"
                        class-attribute-name "class"
                        instance-name "inst"
                        add-class? true}})

Create a sparse weka.core.Instances using a sequence of maps (maps). The keys of each map are the classes and the values are maps, each with the index as the key and the weight as the value. The dim parameter is the dimension of each instance.

Keys

  • :pattern a format string taking one integer as the index (default: f%d)
  • :class-attribute-name the name of the output class attribute (values are taken from the keys of maps; default: class)
  • :instance-name the name of the created Instances object (default: inst)
  • :add-class? if true (the default), add the class that comes from the key in maps
  • :default-value if a double, replace values missing from the map with this value; otherwise missing values are used
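
One reading of the input shape described above is a sequence of single-entry maps from class label to an index/weight map. The sketch below follows that reading, which is an assumption rather than something the docstring spells out; the class labels and weights are made up.

```clojure
(require '[zensols.model.weka :as weka])

;; Assumption: each element maps a class label to {index weight}.
(def rows
  [{"pos" {0 1.0, 3 0.5}}
   {"neg" {1 2.0}}])

;; Four dimensions per row; indexes absent from a row become 0.0
;; instead of Weka missing values.
(def sparse
  (weka/sparse-instances rows 4
                         :instance-name "toy-sparse"
                         :default-value 0.0))

(println (.relationName sparse) (.numInstances sparse))
```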

value (clj)

(value insts n name)

Return the value of the attribute called name for instance n in the weka.core.Instances insts.
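
A small sketch of a lookup, using a dataset built directly with the Weka API (Weka 3.7 or later assumed). It assumes n is a zero-based row index, which the docstring does not state.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.core Attribute DenseInstance Instances)
        '(java.util ArrayList))

;; Two rows with numeric attributes "len" and "score".
(def toy
  (doto (Instances. "toy" (ArrayList. [(Attribute. "len") (Attribute. "score")]) 0)
    (.add (DenseInstance. 1.0 (double-array [3.0 0.5])))
    (.add (DenseInstance. 1.0 (double-array [7.0 0.9])))))

;; Assumption: n is zero-based, so this reads "len" from the first row.
(println (weka/value toy 0 "len"))
```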

value-for-instance (clj)

(value-for-instance val)
(value-for-instance type val)

Return a Java variable that plays nicely with the Weka framework. If no type is given it tries to determine the type on its own.

  • val is a Java primitive (wrapper)
  • type if given, is the type of val (see create-attrib)
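
Both arities might be exercised as follows. As with create-attrib, the sketch assumes type tokens are quoted symbols, and it makes no claim about the exact Java types returned.

```clojure
(require '[zensols.model.weka :as weka])

;; One-argument form: the type is inferred from the value.
(def inferred (weka/value-for-instance 3.5))

;; Two-argument form. Assumption: type tokens are quoted symbols.
(def as-numeric (weka/value-for-instance 'numeric 3))
(def as-boolean (weka/value-for-instance 'boolean true))

(println inferred as-numeric as-boolean)
```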
