(->dataframe dataframe)
(->dataframe dataframe {:keys [table-name dataset-name] :as options})
(->ndarray ndm dataframe)
Convert dataframe to NDArray
(add-column dataset column)
Add a new column. Throws an error if the column name already exists.
(add-or-update-column dataset column)
(add-or-update-column dataset colname column)
If the column exists, replace it. Otherwise append it as a new column.
(assoc-ds dataset cname cdata & args)
If dataset is not nil, calls `clojure.core/assoc`. Otherwise creates a new empty dataset and then calls `clojure.core/assoc`. Guaranteed to return a dataset (unlike `assoc`).
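The "unlike assoc" remark can be seen with `clojure.core/assoc` alone. The short sketch below uses only core Clojure and no dataset library; it shows that calling `assoc` on nil yields a plain map, which is the case `assoc-ds` is described as guarding against.

```clojure
;; Core Clojure only: assoc on nil builds a plain persistent map,
;; not a dataset. The key and values here are made up for illustration.
(def result (assoc nil :a [1 2 3]))

(map? result)   ; the result is an ordinary map
(get result :a) ; the associated data is stored unchanged
```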
(brief ds)
(brief ds options)
Get a brief description of a dataset in mapseq form. A brief description is the mapseq form of descriptive stats.
(categorical->one-hot dataset filter-fn-or-ds)
(categorical->one-hot dataset filter-fn-or-ds table-args)
(categorical->one-hot dataset filter-fn-or-ds table-args result-datatype)
Convert string columns to numeric columns. See tech.v3.dataset.categorical/fit-one-hot
(column-names dataset)
In-order sequence of column names
(columns dataset)
Return sequence of all columns in dataset.
(columns-with-missing-seq dataset)
Return a sequence of:
```clojure
{:column-name column-name
 :missing-count missing-count}
```
or nil if no columns are missing data.
(concat dataset & datasets)
Concatenate datasets in place. See also concat-copying as it may be more efficient for your use case.
(concat-copying dataset & datasets)
Concatenate datasets into a new dataset copying data. Respects missing values. Datasets must all have the same columns. Result column datatypes will be a widening cast of the datatypes.
(concat-inplace dataset & datasets)
Concatenate datasets in place. Respects missing values. Datasets must all have the same columns. Result column datatypes will be a widening cast of the datatypes.
(drop-columns dataset col-name-seq)
Same as remove-columns
(drop-missing dataset-or-col)
Remove missing entries by simply selecting out the missing indexes
(drop-rows dataset-or-col row-indexes)
Drop rows from dataset or column
(ensure-array-backed ds)
(ensure-array-backed ds {:keys [unpack?] :or {unpack? true}})
Ensure the column data in the dataset is stored in pure Java arrays. This is sometimes necessary for interop with other libraries, and this operation will force any lazy computations to complete. This also clears the missing set for each column and writes the missing values to the new arrays.
Columns that are already array backed and have no missing values are returned unchanged.
The postcondition is that dtype/->array will return a Java array of the appropriate datatype for each column.
options - :unpack? - unpack packed datetime types. Defaults to true.
(filter dataset predicate)
dataset->dataset transformation. Predicate is passed a map of colname->column-value.
(filter-column dataset colname predicate)
Filter a given column by a predicate. Predicate is passed column values. If predicate is *not* an instance of IFn, it is treated as a value and used as if the predicate were #(= value %). Returns a dataset.
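The value-versus-function rule can be sketched in plain Clojure. The helper name `->predicate` is hypothetical and this is not the library's code; it only models the IFn check described above, with a vector standing in for column values. One consequence worth noting: keywords, sets, maps, and vectors are themselves IFn instances in Clojure, so under this rule they would be invoked as functions rather than compared by equality.

```clojure
;; Hypothetical helper (not part of the library): models the rule that a
;; non-IFn predicate is treated as a value to compare against.
(defn ->predicate
  [predicate]
  (if (ifn? predicate)
    predicate
    #(= predicate %)))

;; A plain vector stands in for the values of a column (an assumption).
(def column-values [1 2 3 2])

(filterv (->predicate even?) column-values) ; function predicate, called as-is
(filterv (->predicate 2) column-values)     ; a number is not an IFn, so equality is used

;; Keywords implement IFn while strings do not.
(ifn? :a)
(ifn? "a")
```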
(group-by dataset key-fn)
Produce a map of key-fn-value->dataset. key-fn is a function taking a map of colname->column-value. Selecting which columns are used in the key-fn using column-name-seq is optional but will greatly improve performance.
(group-by->indexes dataset key-fn)
(Non-lazy) - Group a dataset and return a map of key-fn-value->indexes where indexes is an in-order contiguous group of indexes.
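The shape of the returned map can be illustrated with a plain-Clojure model. This is a sketch under assumptions, not the library's implementation: `group-rows->indexes` is a hypothetical name, and a vector of maps stands in for the dataset rows that key-fn receives.

```clojure
;; Hypothetical model: group row indexes by the value key-fn returns.
;; `rows` is a vector of maps standing in for dataset rows (an assumption).
(defn group-rows->indexes
  [rows key-fn]
  (reduce (fn [acc [idx row]]
            ;; Append each row index to the group for its key, preserving order.
            (update acc (key-fn row) (fnil conj []) idx))
          {}
          (map-indexed vector rows)))

(def sample-rows [{:city "Oslo" :n 1}
                  {:city "Rome" :n 2}
                  {:city "Oslo" :n 3}])

(group-rows->indexes sample-rows :city)
;; => {"Oslo" [0 2], "Rome" [1]}
```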
(group-by-column dataset colname)
Return a map of column-value->dataset.
(group-by-column->indexes dataset colname)
(Non-lazy) - Group a dataset by a column and return a map of column-val->indexes where indexes is an in-order contiguous group of indexes.
(head dataset)
(head dataset n)
Get the first n rows of a dataset. Equivalent to `(select-rows ds (range n))`. Arguments are reversed, however, so this can be used in ->> operators.
(missing dataset-or-col)
Given a dataset or a column, return the missing set as a roaring bitmap
(new-column name data)
(new-column name data metadata)
(new-column name data metadata missing)
Create a new column. Data will be scanned for missing values unless the full 4-argument pathway is used.
(order-column-names dataset colname-seq)
Order a sequence of column names so they match the order in the original dataset. Missing columns are placed last.
(remove-column dataset col-name)
Same as:
```clojure
(dissoc dataset col-name)
```
(remove-columns dataset colname-seq)
Same as drop-columns
(remove-rows dataset-or-col row-indexes)
Same as drop-rows.
(rename-columns dataset colname-map)
Rename columns using a map. Does not reorder columns.
(replace-missing df)
(replace-missing df strategy)
(replace-missing df col-sel strategy)
Replace missing with:
- builtin strategies: `:mid`, `:up`, `:down`, and `:lerp`
- a value
- or a column function, with missing slots dropped
(select dataset colname-seq index-seq)
Reorder/trim dataset according to this sequence of indexes. Returns a new dataset.
colname-seq - one of:
- :all - all the columns
- sequence of column names - those columns in that order.
- implementation of java.util.Map - column order is dictated by map iteration order; selected columns are subsequently named after the corresponding value in the map. Similar to `rename-columns` except this trims the result to be only the columns in the map.
index-seq - either keyword :all or list of indexes. May contain duplicates.
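The map form of colname-seq both trims and renames. A rough plain-Clojure analogy on a single row map is shown below; `select-and-rename` is a hypothetical name, and the sketch does not model column ordering or the index-seq argument. It only shows the "rename, but keep only the columns in the map" behavior.

```clojure
(require '[clojure.set :as set])

;; Hypothetical model on one row map (an assumption; the library itself
;; operates on columns). Keeps only the keys named in colname-map, then
;; renames each kept key to its corresponding value.
(defn select-and-rename
  [row colname-map]
  (-> row
      (select-keys (keys colname-map))
      (set/rename-keys colname-map)))

(select-and-rename {:a 1 :b 2 :c 3} {:a :x :c :z})
;; => {:x 1, :z 3}
```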
(select-by-index dataframe row-index col-index)
Select a sub-dataframe by seq of row index and column index
(select-columns dataset col-name-seq)
Select columns from the dataset by seq of column names or :all.
(select-columns-by-index dataset col-index)
Select columns from the dataset by seq of indexes (negative indexes are allowed) or :all. See documentation for `select-by-index`.
(select-rows dataset-or-col row-indexes)
Select rows from the dataset or column.
(select-rows-by-index dataset-or-col row-index)
Select rows from the dataset or column by seq of indexes (negative indexes are allowed) or :all. See documentation for `select-by-index`.
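The text does not spell out how a negative index is interpreted. The sketch below assumes the common convention that a negative index counts back from the end, so -1 is the last element; that convention, and the helper names `resolve-index` and `select-by-indexes`, are assumptions for illustration rather than documented library behavior.

```clojure
;; ASSUMPTION: a negative index i refers to position (+ n i), so -1 is the
;; last element. This convention is not confirmed by the docstring.
(defn resolve-index
  [n i]
  (if (neg? i) (+ n i) i))

;; Hypothetical model: a vector stands in for the rows (or columns).
(defn select-by-indexes
  [items indexes]
  (let [v (vec items)
        n (count v)]
    (if (= :all indexes)
      v
      (mapv #(nth v (resolve-index n %)) indexes))))

(select-by-indexes [:a :b :c :d] [0 -1 -2])
;; => [:a :d :c]
(select-by-indexes [:a :b :c :d] :all)
;; => [:a :b :c :d]
```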
(shape dataframe)
Get the shape of the dataframe in row-major order.
(sort-by dataset key-fn)
(sort-by dataset key-fn compare-fn)
Sort a dataset by a key-fn and compare-fn.
(sort-by-column dataset colname)
(sort-by-column dataset colname compare-fn)
Sort a dataset by a given column using the given compare fn.
(tail dataset)
(tail dataset n)
Get the last n rows of a dataset. Equivalent to `(select-rows ds (range ...))`. Argument order is dataset-last, however, so this can be used in ->> operators.
(unique-by dataset map-fn)
(unique-by dataset {:keys [keep-fn] :or {keep-fn #(first %2)} :as _options} map-fn)
map-fn is passed a map for each row, and rows are grouped by its return value. keep-fn decides which index to keep from each group.
:keep-fn - Function from key,idx-seq->idx. Defaults to #(first %2).
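The interplay of map-fn and keep-fn can be modeled in plain Clojure. This is a sketch under assumptions, not the library's implementation: `unique-rows-by` is a hypothetical name, a vector of maps stands in for the dataset, and returning the kept rows in ascending index order is an assumption of the sketch.

```clojure
;; Hypothetical model: group row indexes by (map-fn row), then let keep-fn
;; pick one index per group. keep-fn receives the key and the index seq.
(defn unique-rows-by
  ([rows map-fn] (unique-rows-by rows {} map-fn))
  ([rows {:keys [keep-fn] :or {keep-fn #(first %2)}} map-fn]
   (let [key->idxs (reduce (fn [acc [idx row]]
                             (update acc (map-fn row) (fnil conj []) idx))
                           {}
                           (map-indexed vector rows))
         ;; ASSUMPTION: kept rows come back in ascending index order.
         kept (sort (map (fn [[k idxs]] (keep-fn k idxs)) key->idxs))]
     (mapv #(nth rows %) kept))))

(def sample-rows [{:id 1 :v "a"}
                  {:id 2 :v "b"}
                  {:id 1 :v "c"}])

(unique-rows-by sample-rows :id)
;; => [{:id 1, :v "a"} {:id 2, :v "b"}]

;; Keep the last index of each group instead of the first.
(unique-rows-by sample-rows {:keep-fn #(last %2)} :id)
;; => [{:id 2, :v "b"} {:id 1, :v "c"}]
```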
(unique-by-column dataset colname)
(unique-by-column dataset {:keys [keep-fn] :or {keep-fn #(first %2)} :as _options} colname)
Map-fn function gets passed map for each row, rows are grouped by the return value. Keep-fn is used to decide the index to keep.
:keep-fn - Function from key, idx-seq->idx. Defaults to #(first %2).
(unordered-select dataset colname-seq index-seq)
Perform a selection but use the order of the columns in the existing table; do not reorder the columns based on colname-seq. Useful when doing selection based on sets or persistent hash maps.
(update-column dataset col-name update-fn)
Update a column returning a new dataset. update-fn is a column->column transformation. Error if column does not exist.
(update-columns dataframe col-name-seq-or-fn update-fn)
Update a sequence of columns selected by column name seq or column selector function.