clj-ml — cc.artifice/clj-ml 0.8.7

clj-ml.classifiers

This namespace contains several functions for building classifiers using different classification algorithms: Bayes networks, multilayer perceptrons, decision trees and support vector machines are available. Some of these classifiers have incremental versions, so they can be built without holding all the dataset instances in memory.

Functions for evaluating the built classifiers, using either cross-validation or an explicit dataset, are also provided.

A sample use of the API for classifiers is shown below:

  ; dataset-set-class and make-instance live in clj-ml.data
  (use 'clj-ml.classifiers 'clj-ml.data)

  ; Building a classifier using a C4.5 decision tree
  (def classifier (make-classifier :decision-tree :c45))

  ; We set the class attribute for the loaded dataset.
  ; dataset is supposed to contain a set of instances.
  (dataset-set-class dataset 4)

  ; Training the classifier
  (classifier-train classifier dataset)

  ; We evaluate the classifier using a test dataset
  ; (here bound to trainingset, which is supposed to be loaded already)
  (def evaluation (classifier-evaluate classifier :dataset dataset trainingset))

  ; We retrieve some data from the evaluation result
  (:kappa evaluation)
  (:root-mean-squared-error evaluation)
  (:precision evaluation)

  ; A trained classifier can be used to classify new instances
  (def to-classify (make-instance dataset {:class :Iris-versicolor
                                           :petalwidth 0.2
                                           :petallength 1.4
                                           :sepalwidth 3.5
                                           :sepallength 5.1}))

  ; We retrieve the index of the class value assigned by the classifier
  (classifier-classify classifier to-classify)

  ; We retrieve a symbol with the value assigned by the classifier
  ; and assign it to the instance
  (classifier-label classifier to-classify)

A classifier can also be evaluated using cross-validation (here with 10 folds):

  (classifier-evaluate classifier :cross-validation dataset 10)
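
To illustrate what the fold count means, the following is a minimal plain-Clojure sketch of k-fold partitioning. It does not call clj-ml or Weka; `k-folds` is a hypothetical helper, and the simple round-robin split is an assumption made for readability (a real implementation would typically shuffle, and possibly stratify, the data first).

```clojure
(defn k-folds
  "Splits `items` into k folds. Returns one {:train [...] :test [...]} map
  per fold; every item appears in exactly one :test vector."
  [k items]
  (let [indexed (map-indexed vector items)]
    (for [fold (range k)]
      {:test  (vec (for [[i x] indexed :when (= fold (mod i k))] x))
       :train (vec (for [[i x] indexed :when (not= fold (mod i k))] x))})))

;; Three folds over six items; the first fold tests on items 0 and 3.
(first (k-folds 3 (range 6)))
;; => {:test [0 3], :train [1 2 4 5]}
```

Each fold trains on the `:train` part and is scored on the `:test` part, and the per-fold scores are then combined into a single evaluation.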

Finally, a classifier can be stored in a file for later use:

  (use 'clj-ml.utils)

  (serialize-to-file classifier "/Users/antonio.garrote/Desktop/classifier.bin")

clj-ml.clusterers

This namespace contains several functions for building clusterers using different clustering algorithms. K-means, Cobweb and expectation-maximization (EM) algorithms are currently supported.

Some of these algorithms support incremental building of the clustering without having the full data set in main memory. Functions for evaluating the clusterer and for clustering new instances are also provided.
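
As a rough illustration of what clustering new instances involves in the K-means case, here is a plain-Clojure sketch of the assignment step: each point is attached to its nearest centroid. It does not use clj-ml or Weka, and `squared-distance`, `nearest-centroid` and `assign-clusters` are hypothetical names introduced only for this example.

```clojure
(defn squared-distance
  "Sum of squared coordinate differences between two numeric vectors."
  [xs ys]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) xs ys)))

(defn nearest-centroid
  "Returns the centroid closest to `point`."
  [centroids point]
  (apply min-key #(squared-distance % point) centroids))

(defn assign-clusters
  "Groups `points` by their nearest centroid."
  [centroids points]
  (group-by #(nearest-centroid centroids %) points))

(assign-clusters [[0 0] [10 10]] [[1 1] [9 9] [0 2]])
;; => {[0 0] [[1 1] [0 2]], [10 10] [[9 9]]}
```

A full K-means run would alternate this step with recomputing each centroid as the mean of its assigned points until the assignment stops changing.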

clj-ml.data

This namespace contains several functions for creating and manipulating data sets and instances. The format of a data set, as well as its class attribute, can be modified and assigned to the instances. Finally, data sets can be converted into Clojure sequences and processed with the usual Clojure functions such as map and reduce.
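
Once the instances are available as a Clojure sequence, ordinary sequence functions apply. The sketch below assumes the instances have already been converted into plain maps; the conversion itself is not shown, and the rows, keys and the `mean-of` helper are made up for illustration rather than taken from the library.

```clojure
;; Illustrative rows only, not real data.
(def instances
  [{:petallength 1.5 :class :a}
   {:petallength 4.5 :class :b}
   {:petallength 1.5 :class :a}])

(defn mean-of
  "Arithmetic mean of numeric attribute `attr` over `rows`."
  [attr rows]
  (/ (reduce + (map attr rows)) (count rows)))

;; How many instances fall into each class
(frequencies (map :class instances))
;; => {:a 2, :b 1}

;; Mean of one numeric attribute
(mean-of :petallength instances)
;; => 2.5
```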

clj-ml.distance-functions

Functions for generating distance metrics that can be passed as parameters to certain classifiers and clusterers, such as K-means.

Euclidean, Manhattan and Chebyshev distance functions are supported.
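
For reference, the three metrics can be written out in plain Clojure as follows. This is only a sketch of the textbook definitions over numeric vectors of equal, non-zero length; it is not the library's implementation, and the function names are hypothetical.

```clojure
(defn- abs-diff
  "Absolute difference of two numbers, as a double."
  [x y]
  (Math/abs (double (- x y))))

(defn euclidean-distance
  "Square root of the sum of squared coordinate differences."
  [xs ys]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) xs ys))))

(defn manhattan-distance
  "Sum of absolute coordinate differences."
  [xs ys]
  (reduce + (map abs-diff xs ys)))

(defn chebyshev-distance
  "Largest absolute coordinate difference."
  [xs ys]
  (reduce max (map abs-diff xs ys)))

(euclidean-distance [0 0] [3 4]) ;; => 5.0
(manhattan-distance [0 0] [3 4]) ;; => 7.0
(chebyshev-distance [0 0] [3 4]) ;; => 4.0
```

The choice of metric changes which points count as neighbours: Manhattan distance adds up every coordinate difference, while Chebyshev distance only looks at the largest one.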

make-distance-function

clj-ml.filters

This namespace defines a set of functions that can be applied to data sets to modify them in some way: transforming nominal attributes into binary attributes, removing attributes, etc.

There are a number of ways to use the filtering API. The most straightforward and idiomatic Clojure way is to use the provided filter functions:

  ;; ds is the dataset
  (def ds (make-dataset :test [:a :b {:c [:g :m]}]
                        [[1 2 :g]
                         [2 3 :m]
                         [4 5 :g]]))
  (def filtered-ds
    (-> ds
        (add-attribute {:type :nominal, :column 1, :name "pet", :labels ["dog" "cat"]})
        (remove-attributes {:attributes [:a :c]})))

The above functions rely on lower-level functions that create and apply the filters, which you may also use if you need more control over the actual filter objects:

  ;; Named remove-filter rather than filter to avoid shadowing clojure.core/filter
  (def remove-filter (make-filter :remove-attributes {:dataset-format ds :attributes [:a :c]}))

  ;; We apply the filter to the original data set and obtain the new one
  (def filtered-ds (filter-apply remove-filter ds))

The previous sample of code could be rewritten with the make-apply-filter function:

  (def filtered-ds (make-apply-filter :remove-attributes {:attributes [:a :c]} ds))
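
The nominal-to-binary transformation mentioned at the start of this section can be pictured with a small plain-Clojure sketch over a map-shaped row. It does not use the filtering API; `nominal->binary` and the `attr=label` naming of the generated keys are assumptions made for this example only.

```clojure
(require '[clojure.string :as str])

(defn nominal->binary
  "Replaces nominal attribute `attr` in `row` with one 0/1 indicator
  per label in `labels`."
  [attr labels row]
  (reduce (fn [m label]
            (assoc m
                   (keyword (str/join "=" [(name attr) (name label)]))
                   (if (= label (get row attr)) 1 0)))
          (dissoc row attr)
          labels))

;; The nominal attribute :c with labels :g and :m becomes two indicators.
(nominal->binary :c [:g :m] {:a 1 :b 2 :c :g})
;; => a map with :a 1, :b 2, the "c=g" indicator set to 1
;;    and the "c=m" indicator set to 0
```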

clj-ml.io

Functions for loading datasets, classifiers and clusterers from, and saving them to, files and other persistence mechanisms.

clj-ml.kernel-functions

Kernel functions that can be passed as parameters to support vector machine classifiers.

Polynomial, radial basis function (RBF) and string kernels are supported.
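
As a reminder of what the two numeric kernels compute, here is a plain-Clojure sketch of their textbook forms. The parameter names (`c`, `d`, `gamma`) and function names are assumptions for illustration; the parameterization used by the underlying SVM implementation may differ.

```clojure
(defn dot
  "Dot product of two numeric vectors."
  [xs ys]
  (reduce + (map * xs ys)))

(defn polynomial-kernel
  "K(x, y) = (x . y + c)^d"
  [c d xs ys]
  (Math/pow (+ (dot xs ys) c) d))

(defn rbf-kernel
  "K(x, y) = exp(-gamma * ||x - y||^2)"
  [gamma xs ys]
  (let [sq-dist (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) xs ys))]
    (Math/exp (- (* gamma sq-dist)))))

(polynomial-kernel 1 2 [1 2] [3 4]) ;; (3 + 8 + 1)^2 => 144.0
(rbf-kernel 0.5 [1 2] [1 2])        ;; identical points => 1.0
```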

clj-ml.attribute-selection

clj-ml.classifiers

clj-ml.clusterers

clj-ml.data

clj-ml.distance-functions

clj-ml.filters

clj-ml.io

clj-ml.kernel-functions

clj-ml.options-utils

clj-ml.public-datasets

clj-ml.utils