
tech.ml.dataset.modelling


->k-fold-datasets clj

(->k-fold-datasets dataset k)
(->k-fold-datasets
  dataset
  k
  {:keys [randomize-dataset?] :or {randomize-dataset? true} :as options})

Given 1 dataset, prepare K datasets using the k-fold algorithm. randomize-dataset? defaults to true, which realizes the entire dataset, so use with care if you have large datasets.
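The k-fold idea can be sketched over a plain vector of rows. This is an illustration only, not the library's implementation: `k-fold-splits`, the `:randomize?` option, and the `:train`/`:test` keys are names assumed for this sketch. It also shows why randomizing forces the whole collection to be realized, since `shuffle` must see every row.

```clojure
;; Hypothetical helper, not part of tech.ml.dataset.
;; Splits `rows` into k folds and returns k maps, each holding one fold
;; as :test and the remaining folds concatenated as :train.
(defn k-fold-splits
  ([rows k] (k-fold-splits rows k {}))
  ([rows k {:keys [randomize?] :or {randomize? true}}]
   (let [;; shuffle realizes the entire collection in memory
         rows  (if randomize? (shuffle rows) (vec rows))
         ;; row i lands in fold (mod i k), so fold sizes differ by at most one
         folds (mapv (fn [offset] (vec (take-nth k (drop offset rows))))
                     (range k))]
     (mapv (fn [test-idx]
             {:test  (nth folds test-idx)
              :train (vec (apply concat
                                 (keep-indexed
                                  (fn [i fold] (when (not= i test-idx) fold))
                                  folds)))})
           (range k)))))

;; Deterministic usage (randomization disabled):
(first (k-fold-splits (range 6) 3 {:randomize? false}))
;; => {:test [0 3], :train [1 4 2 5]}
```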


->row-major clj

(->row-major dataset)
(->row-major dataset options)
(->row-major dataset
             key-colname-seq-map
             {:keys [datatype] :or {datatype :float64}})

Given a dataset and a map of desired key names to sequences of columns, produce a sequence of maps where each key name points to a contiguous vector composed of the column values concatenated. If key-colname-seq-map is not provided, then each row defaults to {:features [feature-columns] :label [label-columns]}
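A rough sketch of the column-major to row-major conversion, using a plain map of column name to values in place of a real dataset. `rows-from-columns` is a hypothetical helper; plain Clojure vectors stand in for whatever contiguous container the library produces, and the coercion to `double` mirrors the :float64 default in the signature above.

```clojure
;; Hypothetical helper, not part of tech.ml.dataset.
;; `columns` maps column name -> vector of values (all the same length).
;; `key->colnames` maps an output key -> sequence of column names.
(defn rows-from-columns
  [columns key->colnames]
  (let [n-rows (count (val (first columns)))]
    (mapv (fn [row-idx]
            (into {}
                  (map (fn [[k colnames]]
                         ;; concatenate this row's values for the named columns
                         [k (mapv (fn [colname]
                                    (double (nth (get columns colname) row-idx)))
                                  colnames)]))
                  key->colnames))
          (range n-rows))))

(rows-from-columns {:a [1 2] :b [3 4] :y [0 1]}
                   {:features [:a :b] :label [:y]})
;; => [{:features [1.0 3.0], :label [0.0]}
;;     {:features [2.0 4.0], :label [1.0]}]
```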


->train-test-split clj

(->train-test-split dataset)
(->train-test-split dataset
                    {:keys [randomize-dataset? train-fraction]
                     :or {randomize-dataset? true train-fraction 0.7}
                     :as options})
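The signature above carries no docstring, so the following is only a guess at the general shape of a train/test split, shown on a plain vector of rows. `train-test-split-rows`, its `:randomize?` option, the `:train`/`:test` keys, and the rounding rule are all assumptions of this sketch; the 0.7 default mirrors train-fraction in the signature.

```clojure
;; Hypothetical helper, not part of tech.ml.dataset.
;; Optionally shuffles `rows`, then cuts them at train-fraction.
(defn train-test-split-rows
  ([rows] (train-test-split-rows rows {}))
  ([rows {:keys [randomize? train-fraction]
          :or   {randomize? true train-fraction 0.7}}]
   (let [rows    (if randomize? (shuffle rows) (vec rows))
         ;; rounding to the nearest row count is an assumption of this sketch
         n-train (Math/round (double (* train-fraction (count rows))))]
     {:train (subvec rows 0 n-train)
      :test  (subvec rows n-train)})))

(train-test-split-rows (range 10) {:randomize? false})
;; => {:train [0 1 2 3 4 5 6], :test [7 8 9]}
```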

column-label-map clj

(column-label-map dataset column-name)

column-values->categorical clj

(column-values->categorical dataset src-column)

Given a column encoded via either string->number or one-hot, reverse map to a sequence of the original string column values.
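For the one-hot case, the reverse mapping can be pictured as picking, for each row, the indicator column that holds a 1. The sketch below works on a plain map of label to indicator values; `one-hot->labels` and this data layout are assumptions made for illustration and say nothing about how the library stores one-hot columns.

```clojure
;; Hypothetical helper, not part of tech.ml.dataset.
;; `one-hot-columns` maps each original label -> vector of 0/1 indicators.
(defn one-hot->labels
  [one-hot-columns]
  (let [labels (keys one-hot-columns)
        n-rows (count (val (first one-hot-columns)))]
    (mapv (fn [row-idx]
            ;; return the first label whose indicator is set for this row
            (some (fn [label]
                    (when (== 1 (nth (get one-hot-columns label) row-idx))
                      label))
                  labels))
          (range n-rows))))

(one-hot->labels {"cat" [1 0 0] "dog" [0 1 1]})
;; => ["cat" "dog" "dog"]
```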


dataset-label-map clj

(dataset-label-map dataset)

feature-ecount clj

(feature-ecount dataset)

Returns the number of feature columns. This will change once columns can hold non-scalar values.


has-column-label-map? clj

(has-column-label-map? dataset column-name)

inference-target-column-names clj

(inference-target-column-names ds)

inference-target-label-inverse-map clj

(inference-target-label-inverse-map dataset & [label-columns])

Given options generated during ETL operations and annotated with a :label-columns sequence containing 1 label column, generate a reverse map that maps from a dataset value back to the label that generated that value.
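The reverse map described here amounts to inverting a label map. As a minimal sketch, assume a label map from string labels to integer codes (the map shape, `label-map`, and `encoded->labels` are illustrative assumptions, not the library's data structures); `clojure.set/map-invert` then gives the value-to-label direction.

```clojure
(require '[clojure.set :as set])

;; Assumed shape for illustration: original label -> encoded integer.
(def label-map {"cat" 0 "dog" 1 "bird" 2})

;; Encoded integer -> original label.
(def inverse-label-map (set/map-invert label-map))

;; Hypothetical helper: decode a sequence of encoded values, which may
;; arrive as doubles after numeric processing.
(defn encoded->labels
  [inverse-map encoded-values]
  (mapv (fn [v] (get inverse-map (long v))) encoded-values))

(encoded->labels inverse-label-map [1.0 0.0 2.0])
;; => ["dog" "cat" "bird"]
```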


inference-target-label-map clj

(inference-target-label-map dataset & [label-columns])

model-type clj

(model-type dataset & [column-name-seq])

Check the label column after dataset processing. Returns either :regression or :classification.


num-inference-classes clj

(num-inference-classes dataset)

Given a dataset and correctly built options from pipeline operations, return the number of classes used for the label. Throws an error if the dataset is not a classification dataset.


reduce-column-names clj

(reduce-column-names dataset colname-seq)

Reverse map from the one-hot encoded columns to the original source column.


set-inference-target clj

(set-inference-target dataset target-name-or-target-name-seq)
