tech.ml.dataset.math


compute-centroid-and-global-means clj

(compute-centroid-and-global-means dataset row-major-centroids)

Return a map of:
centroid-means - centroid-index -> (double array) column means.
global-means - global means (double array) for the dataset.

correlation-table clj

(correlation-table dataset & {:keys [correlation-type colname-seq]})

Return a map of colname -> sorted list of [colname, coefficient] tuples. The sort is: (sort-by (comp #(Math/abs (double %)) second) >)

Each column correlates perfectly with itself, so the first entry is: [colname, 1.0]

There are three possible correlation types: :pearson :spearman :kendall

:pearson is the default.
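
The sort order described above can be tried in isolation. This is a minimal sketch in plain Clojure that does not call the library; the `coefficients` data and the helper name `sort-by-abs-coefficient` are made up for illustration.

```clojure
;; Hypothetical [colname coefficient] tuples for a single column :a.
;; The values are invented for illustration only.
(def coefficients
  [[:b -0.9] [:a 1.0] [:c 0.25] [:d 0.6]])

(defn sort-by-abs-coefficient
  "Sort [colname coefficient] tuples by descending absolute coefficient,
  using the same sort expression quoted in the docstring."
  [tuples]
  (sort-by (comp #(Math/abs (double %)) second) > tuples))

(sort-by-abs-coefficient coefficients)
;; => ([:a 1.0] [:b -0.9] [:d 0.6] [:c 0.25])
```

Strong negative correlations rank next to strong positive ones because the sort key is the absolute value of the coefficient.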

find-static clj

g-means clj

(g-means dataset & [max-k error-on-missing?])

g-means clustering. Not NaN-aware; missing values are an error. Returns an array of centroids in row-major array-of-array-of-doubles format.

group-rows-by-nearest-centroid clj

(group-rows-by-nearest-centroid dataset
                                row-major-centroids
                                &
                                [error-on-missing?])

impute-missing-by-centroid-averages clj

(impute-missing-by-centroid-averages dataset
                                     row-major-centroids
                                     {:keys [centroid-means global-means]})

Impute missing values by first grouping rows by nearest centroid and then computing the per-group column mean. If the group for a given centroid contains only NaNs in a column, use the global dataset mean instead. If the global mean is itself NaN, this algorithm cannot replace the missing values with meaningful values. Returns a new dataset.
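
The fallback order described above (centroid mean first, then global mean) can be sketched for a single row. This is an illustration in plain Clojure, not the library's implementation; `fallback-mean` and `impute-row` are hypothetical names, and the sketch assumes the per-column means have already been computed.

```clojure
(defn fallback-mean
  "Return the centroid's column mean unless it is NaN, else the global mean.
  Hypothetical helper for illustration."
  [centroid-mean global-mean]
  (if (Double/isNaN (double centroid-mean))
    global-mean
    centroid-mean))

(defn impute-row
  "Replace NaN entries of `row` with the matching column mean.
  `centroid-means` and `global-means` are per-column sequences of doubles."
  [row centroid-means global-means]
  (mapv (fn [value c-mean g-mean]
          (if (Double/isNaN (double value))
            (fallback-mean c-mean g-mean)
            value))
        row centroid-means global-means))

;; Column 1 is filled from the centroid mean; column 2 falls back to the
;; global mean because the centroid mean for that column is NaN.
(impute-row [1.0 Double/NaN Double/NaN]
            [0.5 2.0 Double/NaN]
            [9.0 9.0 7.0])
;; => [1.0 2.0 7.0]
```

If both means are NaN for a column, the value stays NaN, which matches the failure case the docstring warns about.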

interpolate-loess clj

(interpolate-loess ds x-colname y-colname)
(interpolate-loess ds
                   x-colname
                   y-colname
                   {:keys [bandwidth iterations accuracy result-name]
                    :or {bandwidth 0.75
                         iterations 4
                         accuracy LoessInterpolator/DEFAULT_ACCURACY}})

Interpolate using the LOESS regression engine. Useful for smoothing out graphs.

k-means clj

(k-means dataset & [k max-iterations num-runs error-on-missing? tolerance])

NaN-aware k-means. Returns an array of centroids in row-major array-of-array-of-doubles format.

nan-aware-mean clj

(nan-aware-mean col-data)

nan-aware-squared-distance clj

(nan-aware-squared-distance lhs rhs)

NaN-aware squared distance.
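
The docstring does not say how NaN entries are treated. One plausible reading, shown here purely as an assumption, is that any coordinate pair containing a NaN contributes nothing to the sum. The function name `nan-skipping-squared-distance` is hypothetical and the sketch is plain Clojure, not the library's code.

```clojure
(defn nan-skipping-squared-distance
  "Squared Euclidean distance between `lhs` and `rhs`, skipping any
  coordinate where either side is NaN. The skipping rule is an assumption
  about what 'NaN-aware' means, not confirmed library behaviour."
  [lhs rhs]
  (reduce +
          0.0
          (map (fn [l r]
                 (let [l (double l)
                       r (double r)]
                   (if (or (Double/isNaN l) (Double/isNaN r))
                     0.0
                     (let [d (- l r)]
                       (* d d)))))
               lhs rhs)))

;; The middle coordinate is ignored: (1-4)^2 + (3-1)^2 = 9 + 4
(nan-skipping-squared-distance [1.0 Double/NaN 3.0] [4.0 2.0 1.0])
;; => 13.0
```

A plain squared distance would return NaN for this input, which is why clustering over data with missing values needs some rule of this kind.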

to-column-major-double-array-of-arrays clj

(to-column-major-double-array-of-arrays dataset & [error-on-missing?])

Convert a dataset to a column-major array of arrays. Note that if error-on-missing? is false, missing values will appear as NaN.

to-row-major-double-array-of-arrays clj

(to-row-major-double-array-of-arrays dataset & [error-on-missing?])

Convert a dataset to a row-major array of arrays. Note that if error-on-missing? is false, missing values will appear as NaN.
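
The difference between the two layouts is easiest to see on a small example. The sketch below uses plain Clojure and Java double arrays rather than the library functions; `->double-arrays` and `transpose-arrays` are hypothetical helpers, and the input is assumed to be non-empty and rectangular.

```clojure
(defn ->double-arrays
  "Build an array of double arrays from nested vectors."
  [nested]
  (into-array (map double-array nested)))

(defn transpose-arrays
  "Swap the row-major and column-major layouts of an array of double arrays.
  Assumes a non-empty, rectangular input."
  [arrays]
  (->double-arrays (apply map vector (map seq arrays))))

;; Row major: each inner array is one row of a 2-row, 3-column dataset.
(def row-major (->double-arrays [[1.0 2.0 3.0]
                                 [4.0 5.0 6.0]]))

;; Column major: each inner array is one column of the same dataset.
(mapv vec (transpose-arrays row-major))
;; => [[1.0 4.0] [2.0 5.0] [3.0 6.0]]
```

Centroids returned by the clustering functions on this page are in the row-major form, one inner array per centroid.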

transpose-double-array-of-arrays clj

(transpose-double-array-of-arrays input-data)

x-means clj

(x-means dataset & [max-k error-on-missing?])

x-means clustering. Not NaN-aware; missing values are an error. Returns an array of centroids in row-major array-of-array-of-doubles format.
