tech.ml.dataset.base

Clojure only.

->dataset^clj

(->dataset dataset)

(->dataset dataset {:keys [table-name] :or {table-name "_unnamed"} :as options})

add-column^clj

(add-column dataset column)

Add a new column. Throws an error if the column name collides with an existing column.

add-or-update-column^clj

(add-or-update-column dataset column)

If the column exists, replace it. Otherwise, append it as a new column.

column^clj

(column dataset column-name)

Return the column or throw if it doesn't exist.

column-map^clj

(column-map datatypes)

Return a Clojure map of column-name->column.

column-names^clj

(column-names dataset)

In-order sequence of column names

columns^clj

(columns dataset)

Return sequence of all columns in dataset.

columns-with-missing-seq^clj

(columns-with-missing-seq dataset)

Return a sequence of maps of the form:
{:column-name column-name
 :missing-count missing-count}
or nil if no columns are missing data.
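
The return shape described above can be modelled in plain Clojure. This is only a sketch of the documented behavior, not the library's implementation: the dataset is stood in for by a hypothetical vector of [column-name values] pairs, `toy-columns-with-missing-seq` is an illustrative name, and treating nil as the missing-value marker is an assumption.

```clojure
;; Hypothetical stand-in for a dataset: a vector of [column-name values] pairs.
;; Assumption: nil marks a missing value.
(def toy-ds
  [[:a [1 nil 3]]
   [:b ["x" "y" "z"]]
   [:c [nil nil 2.0]]])

(defn toy-columns-with-missing-seq
  "Model of the documented result: one map per column that has missing
  values, or nil when no column has any."
  [ds]
  ;; `seq` turns an empty result into nil, matching the docstring.
  (seq (for [[col-name values] ds
             :let [missing-count (count (filter nil? values))]
             :when (pos? missing-count)]
         {:column-name col-name :missing-count missing-count})))

(toy-columns-with-missing-seq toy-ds)
;; => ({:column-name :a, :missing-count 1} {:column-name :c, :missing-count 2})
```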

dataset-name^clj

(dataset-name dataset)

ds-column-map^clj

(ds-column-map map-fn first-ds & ds-seq)

Map a function columnwise across datasets and produce a new column sequence. Note that this does not produce a new dataset, as that would preclude using remove or filter on nil values.

ds-concat^clj

(ds-concat dataset & other-datasets)

ds-filter^clj

(ds-filter predicate dataset & [column-name-seq])

dataset->dataset transformation

ds-group-by^clj

(ds-group-by key-fn dataset & [column-name-seq])

Produce a map of key-fn-value->dataset. key-fn is a function taking Y values where Y is the count of column-name-seq or :all.
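
To make the key-fn arity concrete, here is a minimal plain-Clojure model of the grouping described above. It is a sketch under assumptions, not the library's code: the dataset is a hypothetical vector of [column-name values] pairs, `toy-group-by` is an illustrative name, and keeping every column in each resulting group is an assumption the docstring does not confirm.

```clojure
;; Hypothetical stand-in for a dataset: a vector of [column-name values] pairs.
(def toy-ds
  [[:kind   [:cat :dog :cat]]
   [:weight [4 20 5]]])

(defn toy-group-by
  "Model of ds-group-by: key-fn receives one argument per selected column."
  [key-fn ds column-name-seq]
  (let [by-name   (into {} ds)
        names     (if (= :all column-name-seq) (map first ds) column-name-seq)
        n-rows    (count (second (first ds)))
        ;; Call key-fn with the row's values from the selected columns.
        row-key   (fn [idx]
                    (apply key-fn (map #(nth (get by-name %) idx) names)))
        ;; Assumption: each group keeps all columns, restricted to its rows.
        take-rows (fn [idxs]
                    (mapv (fn [[col-name values]]
                            [col-name (mapv #(nth values %) idxs)])
                          ds))]
    (into {}
          (map (fn [[k idxs]] [k (take-rows idxs)]))
          (group-by row-key (range n-rows)))))

(toy-group-by identity toy-ds [:kind])
;; => {:cat [[:kind [:cat :cat]] [:weight [4 5]]],
;;     :dog [[:kind [:dog]] [:weight [20]]]}
```

With `:all` or a two-name column-name-seq, key-fn would need to accept two arguments here, one per column.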

ds-map-values^clj

(ds-map-values dataset map-fn & [column-name-seq])

Note this returns a sequence, not a dataset.

ds-sort-by^clj

(ds-sort-by key-fn dataset)

(ds-sort-by key-fn compare-fn dataset)

(ds-sort-by key-fn compare-fn dataset column-name-seq)

ds-take-nth^clj

(ds-take-nth n-val dataset)

from-prototype^clj

(from-prototype dataset table-name column-seq)

Create a new dataset that is the same type as this one but with a potentially different table name and column sequence. Take care that the columns are all of the correct type.

index-value-seq^clj

(index-value-seq dataset)

Get a sequence of tuples: [idx col-value-vec]

Values are in order of column-name-seq. Duplicate names are allowed and result in duplicate values.

map-seq->dataset^clj

(map-seq->dataset map-seq
                  {:keys [scan-depth column-definitions table-name
                          dataset-constructor]
                   :or {scan-depth 100
                        table-name "_unnamed"
                        dataset-constructor
                          (quote tech.libs.tablesaw/map-seq->tablesaw-dataset)}
                   :as options})

Given a sequence of maps, construct a dataset. Defaults to a tablesaw-based dataset.

maybe-column^clj

(maybe-column dataset column-name)

Return the column if it exists, otherwise nil.

metadata^clj

(metadata dataset)

new-column^clj

(new-column dataset column-name values)

(new-column dataset
            column-name
            values
            {:keys [datatype container-type]
             :or {container-type :tablesaw-column}
             :as options})

Create a new column from some values.

order-column-names^clj

(order-column-names dataset colname-seq)

Order a sequence of column names so they match the order in the original dataset. Missing columns are placed last.
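
The ordering rule can be sketched in plain Clojure. This is an illustrative model, not the library's implementation: `toy-order-column-names` is a hypothetical name, and it takes the dataset's column names directly rather than a dataset.

```clojure
(defn toy-order-column-names
  "Model of order-column-names. ds-column-names is the dataset's in-order
  column names; names absent from it sort last."
  [ds-column-names colname-seq]
  (let [position (zipmap ds-column-names (range))]
    ;; Unknown names get the largest possible rank; sort-by is stable,
    ;; so they keep their relative input order at the end.
    (sort-by #(get position % Long/MAX_VALUE) colname-seq)))

(toy-order-column-names [:a :b :c] [:z :c :a])
;; => (:a :c :z)
```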

remove-column^clj

(remove-column dataset col-name)

Remove the column named col-name. Fails quietly.

remove-columns^clj

(remove-columns dataset colname-seq)

select^clj

(select dataset colname-seq index-seq)

Reorder/trim dataset according to this sequence of indexes. Returns a new dataset.
colname-seq - either keyword :all or list of column names with no duplicates.
index-seq - either keyword :all or list of indexes. May contain duplicates.
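
The two arguments can be illustrated with a small plain-Clojure model. This is a sketch of the documented semantics only, not the library's implementation: the dataset is a hypothetical vector of [column-name values] pairs and `toy-select` is an illustrative name.

```clojure
;; Hypothetical stand-in for a dataset: a vector of [column-name values] pairs.
(def toy-ds
  [[:a [10 20 30]]
   [:b ["x" "y" "z"]]
   [:c [1.0 2.0 3.0]]])

(defn toy-select
  "Model of select: colname-seq and index-seq are each :all or a sequence."
  [ds colname-seq index-seq]
  (let [by-name (into {} ds)
        names   (if (= :all colname-seq) (map first ds) colname-seq)]
    (mapv (fn [col-name]
            (let [values (get by-name col-name)]
              [col-name (if (= :all index-seq)
                          values
                          ;; Duplicate indexes repeat the corresponding row.
                          (mapv #(nth values %) index-seq))]))
          names)))

;; Columns come back in colname-seq order; row 0 appears twice.
(toy-select toy-ds [:c :a] [2 0 0])
;; => [[:c [3.0 1.0 1.0]] [:a [30 10 10]]]
```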

select-columns^clj

(select-columns dataset col-name-seq)

set-metadata^clj

(set-metadata dataset meta-map)

supported-column-stats^clj

(supported-column-stats dataset)

Return the set of natively supported stats for the dataset. This must be at least #{:mean :variance :median :skew}.

unordered-select^clj

(unordered-select dataset colname-seq index-seq)

Perform a selection but use the order of the columns in the existing table; do not reorder the columns based on colname-seq. Useful when doing selection based on sets.
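
A small plain-Clojure model shows why this is convenient with sets, whose iteration order is unspecified. It is a sketch under assumptions, not the library's code: the dataset is a hypothetical vector of [column-name values] pairs and `toy-unordered-select` is an illustrative name.

```clojure
;; Hypothetical stand-in for a dataset: a vector of [column-name values] pairs.
(def toy-ds
  [[:a [10 20 30]]
   [:b ["x" "y" "z"]]
   [:c [1.0 2.0 3.0]]])

(defn toy-unordered-select
  "Model of unordered-select: keeps the dataset's own column order."
  [ds colname-seq index-seq]
  (let [wanted? (if (= :all colname-seq)
                  (constantly true)
                  (set colname-seq))]
    (->> ds
         ;; Walk the dataset's columns in their existing order.
         (filter (fn [[col-name _]] (wanted? col-name)))
         (mapv (fn [[col-name values]]
                 [col-name (if (= :all index-seq)
                             values
                             (mapv #(nth values %) index-seq))])))))

;; The set's iteration order does not matter; :a still precedes :c.
(toy-unordered-select toy-ds #{:c :a} [1 2])
;; => [[:a [20 30]] [:c [2.0 3.0]]]
```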

update-column^clj

(update-column dataset col-name update-fn)

Update a column returning a new dataset. update-fn is a column->column transformation. Error if column does not exist.

update-columns^clj

(update-columns dataset column-name-seq update-fn)

Update a sequence of columns.
