clj-djl.dataframe

Clojure only.

->dataframe^clj

(->dataframe dataframe)

(->dataframe dataframe {:keys [table-name dataset-name] :as options})

->ndarray^clj

(->ndarray ndm dataframe)

Convert dataframe to NDArray

add-column^clj

(add-column dataset column)

Add a new column. It is an error if the column name already exists.

add-or-update-column^clj

(add-or-update-column dataset column)

(add-or-update-column dataset colname column)

If column exists, replace. Else append new column.

assoc-ds^clj

(assoc-ds dataset cname cdata & args)

If dataset is not nil, calls clojure.core/assoc. Else creates a new empty dataset and then calls clojure.core/assoc. Guaranteed to return a dataset (unlike assoc).

brief^clj

(brief ds)

(brief ds options)

Get a brief description of a dataset in mapseq form. A brief description is the mapseq form of descriptive stats.

categorical->one-hot^clj

(categorical->one-hot dataset filter-fn-or-ds)

(categorical->one-hot dataset filter-fn-or-ds table-args)

(categorical->one-hot dataset filter-fn-or-ds table-args result-datatype)

Convert string columns to numeric columns. See tech.v3.dataset.categorical/fit-one-hot

column^clj

(column dataset colname)

column-count^clj

(column-count dataset)

column-names^clj

(column-names dataset)

In-order sequence of column names

columns^clj

(columns dataset)

Return sequence of all columns in dataset.

columns-with-missing-seq^clj

(columns-with-missing-seq dataset)

Return a sequence of:

  {:column-name column-name
   :missing-count missing-count
  }

or nil if no columns are missing data.
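
To make the return shape concrete, here is a minimal plain-Clojure model; it is not the library's implementation. It assumes a dataset can be stood in for by a vector of `[column-name values]` pairs and that `nil` marks a missing value. The name `columns-with-missing-seq*` is hypothetical.

```clojure
;; Hypothetical model: a "dataset" is a vector of [column-name values] pairs,
;; and nil stands in for a missing value (an assumption of this sketch).
(defn columns-with-missing-seq*
  [ds]
  ;; `seq` turns an empty result into nil, matching the documented contract.
  (seq (for [[cname col] ds
             :let [n (count (filter nil? col))]
             :when (pos? n)]
         {:column-name cname :missing-count n})))

(columns-with-missing-seq* [[:a [1 nil 3]]
                            [:b [1 2 3]]
                            [:c [nil nil "x"]]])
;; => ({:column-name :a, :missing-count 1} {:column-name :c, :missing-count 2})

(columns-with-missing-seq* [[:a [1 2 3]]])
;; => nil
```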

concat^clj

(concat dataset & datasets)

Concatenate datasets in place. See also concat-copying as it may be more efficient for your use case.

concat-copying^clj

(concat-copying dataset & datasets)

Concatenate datasets into a new dataset copying data. Respects missing values. Datasets must all have the same columns. Result column datatypes will be a widening cast of the datatypes.

concat-inplace^clj

(concat-inplace dataset & datasets)

Concatenate datasets in place. Respects missing values. Datasets must all have the same columns. Result column datatypes will be a widening cast of the datatypes.

drop-columns^clj

(drop-columns dataset col-name-seq)

Same as remove-columns

drop-missing^clj

(drop-missing dataset-or-col)

Remove missing entries by simply selecting out the missing indexes

drop-rows^clj

(drop-rows dataset-or-col row-indexes)

Drop rows from dataset or column

ensure-array-backed^clj

(ensure-array-backed ds)

(ensure-array-backed ds {:keys [unpack?] :or {unpack? true}})

Ensure the column data in the dataset is stored in pure java arrays. This is sometimes necessary for interop with other libraries and this operation will force any lazy computations to complete. This also clears the missing set for each column and writes the missing values to the new arrays.

Columns that are already array backed and that have no missing values are returned unchanged.

The postcondition is that dtype/->array will return a java array in the appropriate datatype for each column.

options - :unpack? - unpack packed datetime types. Defaults to true

filter^clj

(filter dataset predicate)

dataset->dataset transformation. Predicate is passed a map of colname->column-value.

filter-column^clj

(filter-column dataset colname predicate)

Filter a given column by a predicate. Predicate is passed column values. If predicate is not an instance of IFn, it is treated as a value and used as if the predicate were #(= value %). Returns a dataset.
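
The value-or-predicate rule can be sketched in plain Clojure. This is an illustration of the rule described above, not the library's code; `value->predicate` is a hypothetical helper and the filtering is done on an ordinary vector. One consequence worth knowing: keywords, sets and maps all implement IFn in Clojure, so under this rule they would be called as functions rather than compared by equality.

```clojure
;; Hypothetical helper modelling the documented rule:
;; anything that is not an IFn is compared by equality.
(defn value->predicate
  [predicate-or-value]
  (if (ifn? predicate-or-value)
    predicate-or-value
    #(= predicate-or-value %)))

(def ages [31 45 31 62])

;; A number is not an IFn, so it is matched by equality.
(filterv (value->predicate 31) ages)          ;; => [31 31]

;; A function is used as-is.
(filterv (value->predicate #(> % 40)) ages)   ;; => [45 62]

;; Keywords are IFn, so a keyword would be invoked, not compared.
(ifn? :red)                                   ;; => true
(ifn? 31)                                     ;; => false
```

To match a keyword value under this rule, wrap it explicitly, for example `#(= :red %)`.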

group-by^clj

(group-by dataset key-fn)

Produce a map of key-fn-value->dataset. key-fn is a function taking a map of colname->column-value. Selecting which columns are used in the key-fn using column-name-seq is optional but will greatly improve performance.

group-by->indexes^clj

(group-by->indexes dataset key-fn)

(Non-lazy) - Group a dataset and return a map of key-fn-value->indexes where indexes is an in-order contiguous group of indexes.
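
As a rough picture of the result, the sketch below groups row indexes with `clojure.core/group-by`. It models a dataset as a vector of row maps, which is an assumption of this example, and `group-by->indexes*` is a hypothetical name; the library's own return types may differ.

```clojure
;; Hypothetical model: rows are plain maps of colname -> value.
(defn group-by->indexes*
  [rows key-fn]
  ;; Group the row indexes (not the rows) by the key computed from each row.
  (group-by #(key-fn (nth rows %)) (range (count rows))))

(def rows [{:city "Oslo"   :n 1}
           {:city "Bergen" :n 2}
           {:city "Oslo"   :n 3}])

(group-by->indexes* rows :city)
;; => {"Oslo" [0 2], "Bergen" [1]}
```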

group-by-column^clj

(group-by-column dataset colname)

Return a map of column-value->dataset.

group-by-column->indexes^clj

(group-by-column->indexes dataset colname)

(Non-lazy) - Group a dataset by a column and return a map of column-val->indexes where indexes is an in-order contiguous group of indexes.

has-column?^clj

(has-column? dataset column-name)

head^clj

(head dataset)

(head dataset n)

Get the first n rows of a dataset. Equivalent to `(select-rows ds (range n))`. Arguments are reversed, however, so this can be used in ->> operators.

missing^clj

(missing dataset-or-col)

Given a dataset or a column, return the missing set as a roaring bitmap

new-column^clj

(new-column name data)

(new-column name data metadata)

(new-column name data metadata missing)

Create a new column. Data will be scanned for missing values unless the full 4-argument pathway is used.

order-column-names^clj

(order-column-names dataset colname-seq)

Order a sequence of column names so they match the order in the original dataset. Missing columns are placed last.
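
The ordering rule can be modelled in a few lines of plain Clojure. This is only a sketch of the described behaviour: `order-column-names*` is a hypothetical name, and it takes the dataset's column names directly instead of a dataset.

```clojure
;; Hypothetical model: `ds-colnames` is the dataset's column order.
(defn order-column-names*
  [ds-colnames colname-seq]
  (let [position (zipmap ds-colnames (range))]
    ;; Names absent from the dataset get the largest sort key, so they go last.
    ;; sort-by is stable, so absent names keep their relative input order.
    (sort-by #(get position % Long/MAX_VALUE) colname-seq)))

(order-column-names* [:a :b :c] [:z :c :a])
;; => (:a :c :z)
```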

remove-column^clj

(remove-column dataset col-name)

Same as:

(dissoc dataset col-name)

remove-columns^clj

(remove-columns dataset colname-seq)

Same as drop-columns

remove-rows^clj

(remove-rows dataset-or-col row-indexes)

Same as drop-rows.

rename-columns^clj

(rename-columns dataset colname-map)

Rename columns using a map. Does not reorder columns.

replace-missing^clj

(replace-missing df)

(replace-missing df strategy)

(replace-missing df col-sel strategy)

Replace missing with:

- built-in strategies: :mid, :up, :down and :lerp
- value
- or column function with missing slots dropped

row-count^clj

(row-count dataset-or-col)

select^clj

(select dataset colname-seq index-seq)

Reorder/trim dataset according to this sequence of indexes. Returns a new dataset.

colname-seq - one of:

- :all - all the columns.
- sequence of column names - those columns in that order.
- implementation of java.util.Map - column order is dictated by map iteration order; selected columns are subsequently named after the corresponding value in the map. Similar to rename-columns, except this trims the result to only the columns in the map.

index-seq - either the keyword :all or a list of indexes. May contain duplicates.
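
The three forms of colname-seq, and an index-seq with duplicates, can be illustrated with a small plain-Clojure model. It is a sketch of the rules listed above and not the library's implementation: the dataset is modelled as a vector of `[column-name values]` pairs, and `select*` is a hypothetical name.

```clojure
;; Hypothetical model: a "dataset" is a vector of [column-name values] pairs.
(defn select*
  [ds colname-seq index-seq]
  (let [by-name  (into {} ds)
        ;; Build [old-name new-name] pairs for the requested columns.
        old->new (cond
                   (= :all colname-seq) (map (fn [[cname _]] [cname cname]) ds)
                   (map? colname-seq)   (seq colname-seq)
                   :else                (map (fn [cname] [cname cname]) colname-seq))
        ;; Pick rows by index; duplicates in index-seq repeat the row.
        pick     (fn [col]
                   (if (= :all index-seq)
                     col
                     (mapv #(nth col %) index-seq)))]
    (mapv (fn [[old new]] [new (pick (by-name old))]) old->new)))

(def ds [[:a [1 2 3]]
         [:b [10 20 30]]
         [:c ["x" "y" "z"]]])

;; Sequence of names: those columns, in that order; row 0 is selected twice.
(select* ds [:c :a] [2 0 0])
;; => [[:c ["z" "x" "x"]] [:a [3 1 1]]]

;; Map: the result keeps only :b and names it after the map value.
(select* ds {:b :bee} :all)
;; => [[:bee [10 20 30]]]
```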

select-by-index^clj

(select-by-index dataframe row-index col-index)

Select a sub-dataframe by seq of row index and column index
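
The related select-rows-by-index and select-columns-by-index functions mention negative indexes. The sketch below shows one plausible reading, assuming (this page does not state it) that a negative index counts back from the end, so -1 is the last element. Both function names are hypothetical and the selection runs over an ordinary vector.

```clojure
;; Assumption of this sketch: negative indexes count from the end.
(defn normalize-index
  [n idx]
  (if (neg? idx) (+ n idx) idx))

(defn select-by-index*
  [items idxs]
  (if (= :all idxs)
    (vec items)
    (mapv #(nth items (normalize-index (count items) %)) idxs)))

(select-by-index* [:a :b :c :d] [0 -1])   ;; => [:a :d]
(select-by-index* [:a :b :c :d] :all)     ;; => [:a :b :c :d]
```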

select-columns^clj

(select-columns dataset col-name-seq)

Select columns from the dataset by seq of column names or :all.

select-columns-by-index^clj

(select-columns-by-index dataset col-index)

Select columns from the dataset by a seq of indexes (negative indexes included) or :all.

See documentation for select-by-index.

select-rows^clj

(select-rows dataset-or-col row-indexes)

Select rows from the dataset or column.

select-rows-by-index^clj

(select-rows-by-index dataset-or-col row-index)

Select rows from the dataset or column by a seq of indexes (negative indexes included) or :all.

See documentation for select-by-index.

set-dataframe-name^clj

shape^clj

(shape dataframe)

Get the shape of the dataframe in row-major form.

sort-by^clj

(sort-by dataset key-fn)

(sort-by dataset key-fn compare-fn)

Sort a dataset by a key-fn and compare-fn.

sort-by-column^clj

(sort-by-column dataset colname)

(sort-by-column dataset colname compare-fn)

Sort a dataset by a given column using the given compare fn.

tail^clj

(tail dataset)

(tail dataset n)

Get the last n rows of a dataset. Equivalent to `(select-rows ds (range ...))`. Argument order is dataset-last, however, so this can be used in ->> operators.

take-nth^clj

(take-nth dataset n-val)

unique-by^clj

(unique-by dataset map-fn)

(unique-by dataset
           {:keys [keep-fn]
            :or {keep-fn #(first %2)}
            :as _options}
           map-fn)

Map-fn function gets passed map for each row, rows are grouped by the return value. Keep-fn is used to decide the index to keep.

:keep-fn - Function from key,idx-seq->idx. Defaults to #(first %2).
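
How map-fn and keep-fn interact can be sketched in plain Clojure. The model below is an illustration, not the library's implementation: rows are a vector of maps, `unique-by*` is a hypothetical name, and returning the kept rows in their original order is an assumption of this sketch.

```clojure
;; Hypothetical model: rows are plain maps of colname -> value.
(defn unique-by*
  ([rows map-fn]
   (unique-by* rows {} map-fn))
  ([rows {:keys [keep-fn] :or {keep-fn #(first %2)}} map-fn]
   (->> (range (count rows))
        ;; Group row indexes by the value map-fn returns for each row.
        (group-by #(map-fn (nth rows %)))
        ;; keep-fn receives the key and that key's indexes, and returns one index.
        (map (fn [[k idxs]] (keep-fn k idxs)))
        ;; Assumption: kept rows are returned in their original order.
        sort
        (mapv #(nth rows %)))))

(def rows [{:id 1 :v "a"}
           {:id 2 :v "b"}
           {:id 1 :v "c"}])

;; Default keep-fn keeps the first row seen for each key.
(unique-by* rows :id)
;; => [{:id 1, :v "a"} {:id 2, :v "b"}]

;; A custom keep-fn can keep the last row for each key instead.
(unique-by* rows {:keep-fn #(last %2)} :id)
;; => [{:id 2, :v "b"} {:id 1, :v "c"}]
```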

unique-by-column^clj

(unique-by-column dataset colname)

(unique-by-column dataset
                  {:keys [keep-fn]
                   :or {keep-fn #(first %2)}
                   :as _options}
                  colname)

Map-fn function gets passed map for each row, rows are grouped by the return value. Keep-fn is used to decide the index to keep.

:keep-fn - Function from key, idx-seq->idx. Defaults to #(first %2).

unordered-select^clj

(unordered-select dataset colname-seq index-seq)

Perform a selection but use the order of the columns in the existing table; do not reorder the columns based on colname-seq. Useful when doing selection based on sets or persistent hash maps.

update-column^clj

(update-column dataset col-name update-fn)

Update a column returning a new dataset. update-fn is a column->column transformation. Error if column does not exist.

update-columns^clj

(update-columns dataframe col-name-seq-or-fn update-fn)

Update a sequence of columns selected by column name seq or column selector function.
