scicloj.tcutils.api


between

(between ds col-name low high)
(between ds col-selector low high {:keys [missing-default]})

Keep the rows whose value in a numeric column falls within a specified range. This is a shortcut for (< low x high), so both bounds are exclusive.

Usage

(between ds col-name low high)
(between ds col-name low high {:missing-default val})

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)
  • col-name - Name of the column to use in the comparison
  • low - Lower bound for values of col-name
  • high - Upper bound for values of col-name
  • options - Optional. A map containing the key :missing-default, which specifies the value to use when (col-name row) is nil. If the column has any missing values and this option is not provided, an error is thrown.

Returns

A dataset containing only the rows whose value in column col-name lies between low and high.
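
To make the exclusive bounds and the missing-value option concrete, here is a minimal sketch that models the documented behavior on a plain vector of row maps. It is not the library's implementation: `between-rows` and the sample `rows` are hypothetical names introduced only for illustration, and only `clojure.core` is used.

```clojure
;; Hypothetical plain-data model of the documented behavior; this is not the
;; library's implementation and does not use tech.ml.dataset.
(defn between-rows
  "Keep the row maps whose col-name value x satisfies (< low x high)."
  ([rows col-name low high]
   (between-rows rows col-name low high {}))
  ([rows col-name low high {:keys [missing-default]}]
   (filterv (fn [row]
              (let [v (col-name row)
                    x (if (nil? v) missing-default v)]
                ;; Mirror the documented error for missing values without a default.
                (when (nil? x)
                  (throw (ex-info "Missing value and no :missing-default given"
                                  {:row row})))
                (< low x high)))
            rows)))

(def rows [{:id 1 :x 1} {:id 2 :x 3} {:id 3 :x 5} {:id 4 :x nil}])

(between-rows rows :x 1 5 {:missing-default 0})
;; => [{:id 2, :x 3}]
```

Rows with `:x` equal to 1 or 5 are dropped because `<` is strict, and the `nil` row is compared as 0. Calling the four-argument form on the same `rows` throws, matching the documented behavior for columns with missing values.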


clean-column-names

(clean-column-names ds)

Convert column names of a dataset into ASCII-only, kebab-cased keywords. Throws an error if any column would be left with no name, e.g. one whose name consists entirely of non-ASCII characters.

Usage

(clean-column-names ds)

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)

Returns

A dataset with the column names converted to ASCII-only, kebab-cased keywords.
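
As a rough illustration of what "ASCII-only, kebab-cased keywords" can mean for a single name, the sketch below cleans one string at a time. The exact rules the library applies are not stated in the source, so the steps here (drop non-ASCII characters, split camelCase, lower-case, collapse other characters into hyphens) are assumptions, and `clean-name` is a hypothetical helper.

```clojure
(require '[clojure.string :as str])

;; Hypothetical approximation; the library's actual cleaning rules may differ.
(defn clean-name
  "Turn one column name into an ASCII-only, kebab-cased keyword."
  [s]
  (let [cleaned (-> (str s)
                    (str/replace #"[^\x00-\x7F]" "")           ; drop non-ASCII characters
                    (str/replace #"([a-z0-9])([A-Z])" "$1-$2") ; split camelCase boundaries
                    str/lower-case
                    (str/replace #"[^a-z0-9]+" "-")            ; other runs become one hyphen
                    (str/replace #"^-+|-+$" ""))]              ; trim leading/trailing hyphens
    ;; Mirror the documented error when nothing usable is left.
    (when (str/blank? cleaned)
      (throw (ex-info "Column would be left with no name" {:column s})))
    (keyword cleaned)))

(clean-name "Total Sales ($)") ;; => :total-sales
(clean-name "firstName")       ;; => :first-name
```

A name made up only of non-ASCII characters, such as "日本語", leaves an empty string after the first step, so this sketch throws, in line with the error described above.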


cumsum

(cumsum ds column-name)
(cumsum ds new-column-name column-name)

Compute the cumulative sum of a column.

Usage

(cumsum ds column-name)
(cumsum ds new-column-name column-name)

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)
  • new-column-name - Optional. Name of the column that will hold the newly computed values. When omitted, the new column name defaults to the keyword <old-column-name>-cumulative-sum
  • column-name - Name of the column to use to compute the cumulative sum

Returns

A dataset with the additional column containing the cumulative sum.
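
The running total and the default column name can be modeled on plain row maps as follows. This is a sketch under the assumption that rows are maps keyed by keyword; `with-cumsum` is a hypothetical name and the code does not call the library.

```clojure
;; Hypothetical plain-data model; not the library's implementation.
(defn with-cumsum
  "Add a running total of column-name to each row map under new-column-name."
  ([rows column-name]
   ;; Default name follows the documented <old-column-name>-cumulative-sum pattern.
   (with-cumsum rows
                (keyword (str (name column-name) "-cumulative-sum"))
                column-name))
  ([rows new-column-name column-name]
   (mapv (fn [row total] (assoc row new-column-name total))
         rows
         (reductions + (map column-name rows)))))

(with-cumsum [{:sales 10} {:sales 5} {:sales 20}] :sales)
;; => [{:sales 10, :sales-cumulative-sum 10}
;;     {:sales 5, :sales-cumulative-sum 15}
;;     {:sales 20, :sales-cumulative-sum 35}]
```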


duplicate-rows

(duplicate-rows ds)

Filter a dataset for only duplicated rows.

Usage

(duplicate-rows ds)

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)

Returns

A dataset containing only rows that are exact duplicates.
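
One way to picture "exact duplicates" is to count identical row maps and keep those seen more than once. The source does not say whether every copy of a repeated row is returned or only the extra ones; the sketch below assumes every copy is kept, and `duplicated` is a hypothetical name.

```clojure
;; Hypothetical plain-data model; assumes all copies of a repeated row are kept.
(defn duplicated
  "Return the row maps that occur more than once, in their original order."
  [rows]
  (let [counts (frequencies rows)]
    (filterv #(> (counts %) 1) rows)))

(duplicated [{:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 2}])
;; => [{:a 1, :b 2} {:a 1, :b 2}]
```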


lag

(lag ds column-name lag-size)
(lag ds new-column-name column-name lag-size)

Compute previous (lagged) values from one column in a new column. This can be used, e.g., to compare each value with the values behind it.

Usage

(lag ds column-name lag-size)
(lag ds new-column-name column-name lag-size)

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)
  • new-column-name - Optional. Name of the column that will hold the newly computed values. When omitted, the new column name defaults to the keyword <old-column-name>-lag-<lag-size>
  • column-name - Name of the column to use to compute the lagged values
  • lag-size - positive integer indicating how many rows to skip over to compute the lag

Returns

A dataset with the new column populated with the lagged values.
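
The shift itself can be sketched on a plain vector of column values. This assumes the conventional meaning of a lag (a lag-size of 1 pairs each row with the value one row earlier) and uses nil as a stand-in for the positions that have no earlier value; how the library marks those positions is not stated in the source. `lag-values` is a hypothetical name.

```clojure
;; Hypothetical plain-data model; nil stands in for positions with no earlier value.
(defn lag-values
  "Shift xs down by lag-size positions, keeping the original length."
  [xs lag-size]
  (vec (take (count xs) (concat (repeat lag-size nil) xs))))

(lag-values [10 20 30 40] 1) ;; => [nil 10 20 30]
(lag-values [10 20 30 40] 2) ;; => [nil nil 10 20]
```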


lead

(lead ds column-name lead-size)
(lead ds new-column-name column-name lead-size)

Compute next (lead) values from one column in a new column. This can be used, e.g., to compare each value with the values ahead of it.

Usage

(lead ds column-name lead-size)
(lead ds new-column-name column-name lead-size)

Arguments

  • ds - A tech.ml.dataset (i.e. a tablecloth dataset)
  • new-column-name - Optional. Name of the column that will hold the newly computed values. When omitted, the new column name defaults to the keyword <old-column-name>-lead-<lead-size>
  • column-name - Name of the column to use to compute the lead values
  • lead-size - positive integer indicating how many rows to skip over to compute the lead

Returns

A dataset with the new column populated with the lead values.
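
A lead is the mirror image of a lag: each row is paired with a value from later in the column. The sketch below models that on a plain vector, assuming a lead-size of 1 means "the next row" and using nil for the trailing positions that have no later value; the library's own marker for those positions is not stated in the source. `lead-values` is a hypothetical name.

```clojure
;; Hypothetical plain-data model; nil stands in for positions with no later value.
(defn lead-values
  "Shift xs up by lead-size positions, keeping the original length."
  [xs lead-size]
  (vec (take (count xs) (concat (drop lead-size xs) (repeat nil)))))

(lead-values [10 20 30 40] 1) ;; => [20 30 40 nil]
(lead-values [10 20 30 40] 2) ;; => [30 40 nil nil]
```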

