grafter-2.tabular

Clojure only.

Functions for processing tabular data.

_^clj

An alias for the identity function, used for providing positional arguments to mapc.

add-column^clj

(add-column dataset new-column value)

Add a new column to a dataset with the supplied value lazily copied into every row within it.

add-columns^clj

(add-columns dataset hash)

(add-columns dataset source-cols f)

(add-columns dataset new-col-ids source-cols f)

Add several new columns to a dataset at once. There are a number of different parameterisations:

(add-columns ds {:foo 10 :bar 20})

Calling with two arguments, where the second argument is a hash map, creates a new column in the dataset for each of the hash map's keys and copies the hash map's values lazily down all the rows. This parameterisation is designed to work well with build-lookup-table.

When given either a single column id or many, along with a function which returns a hash map, add-columns will pass each cell from the specified columns into the given function and then associate its returned map back into the dataset, e.g.

(add-columns ds "a" (fn [a] {:b (inc a) :c (inc a)} )) ; =>

a	:b	:c
0	1	1
1	2	2

As a dataset needs to know its columns, in this case it will infer them from the return value of the first row. If you don't want to infer them from the first row, you can also supply them like so:

(add-columns ds [:b :c] "a" (fn [a] {:b (inc a) :c (inc a)} )) ; =>

a	:b	:c
0	1	1
1	2	2
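
The two parameterisations above can be modelled in plain Clojure over a vector of row maps. This is only an illustrative sketch of the described behaviour, not grafter's implementation: `add-constant-columns` and `add-derived-columns` are hypothetical helper names, and rows are assumed to be ordinary maps keyed by column id.

```clojure
;; Hypothetical stand-ins for the two add-columns parameterisations,
;; operating on plain row maps rather than a real grafter dataset.

(defn add-constant-columns
  "Copies every key/value of consts into each row (hash-map parameterisation)."
  [rows consts]
  (map #(merge % consts) rows))

(defn add-derived-columns
  "Passes the cells of source-cols to f and merges the map f returns into the row."
  [rows source-cols f]
  (map (fn [row]
         (merge row (apply f (map #(get row %) source-cols))))
       rows))

(def example-rows [{"a" 0} {"a" 1}])

;; Constant values copied lazily down every row.
(add-constant-columns example-rows {:foo 10 :bar 20})

;; New columns :b and :c computed from column "a".
(add-derived-columns example-rows ["a"] (fn [a] {:b (inc a) :c (inc a)}))
```

Both helpers return lazy sequences, mirroring the lazy copying the docstring describes.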

apply-columns^clj

(apply-columns dataset fs)

Like mapc in that you associate functions with particular columns, though it differs in that the functions given to mapc should receive and return values for individual cells.

With apply-columns, the function receives a collection of cell values from the column and should return a collection of values for the column.

It is also possible to create new columns with apply-columns. For example, to assign row ids you can do:

(apply-columns ds {:row-id (fn [_] (grafter.sequences/integers-from 0))})
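
To make the column-at-a-time behaviour concrete, the following plain-Clojure sketch applies a function to a whole column of values. It is an illustration under assumptions, not grafter's implementation: `apply-column-sketch` is a hypothetical helper, rows are modelled as ordinary maps, and `(iterate inc 0)` stands in for grafter.sequences/integers-from.

```clojure
;; Hypothetical model of apply-columns for a single column.
(defn apply-column-sketch
  "Calls f once with the seq of all values in col, then writes the
  returned values back row by row. Extra returned values are ignored."
  [rows col f]
  (let [new-vals (f (map #(get % col) rows))]
    (mapv (fn [row v] (assoc row col v)) rows new-vals)))

(def example-rows [{"a" 1} {"a" 2} {"a" 3}])

;; A column-level function: running total over column "a".
(apply-column-sketch example-rows "a" #(reductions + %))

;; Creating a new column of row ids from an infinite sequence.
(apply-column-sketch example-rows :row-id (fn [_] (iterate inc 0)))
```

Because the function sees the entire column, it can compute values that depend on neighbouring rows, which a cell-level function cannot do.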

build-lookup-table^clj

(build-lookup-table dataset key-cols)

(build-lookup-table dataset key-cols return-keys)

Takes a dataset, a vector of any number of column names corresponding to key columns, and a column name corresponding to the value column. Returns a function that takes a vector of keys as its argument and returns the wanted value.
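
The idea of a lookup function built from key columns can be sketched in plain Clojure as follows. This is an assumption-laden model rather than grafter's code: `lookup-table-sketch` is a hypothetical name, rows are ordinary maps, and only a single value column is modelled, so the exact shape of what the real function returns may differ.

```clojure
;; Hypothetical model: index rows by the vector of their key-column values.
(defn lookup-table-sketch
  "Returns a function from a vector of keys to the matching value-col cell."
  [rows key-cols value-col]
  (let [index (into {}
                    (map (fn [row]
                           [(mapv #(get row %) key-cols) (get row value-col)]))
                    rows)]
    (fn [ks] (get index ks))))

(def population-rows
  [{"country" "UK" "year" 2001 "population" 59}
   {"country" "UK" "year" 2011 "population" 63}
   {"country" "FR" "year" 2011 "population" 65}])

(def population-for
  (lookup-table-sketch population-rows ["country" "year"] "population"))

;; Look up by the same key order that was used to build the table.
(population-for ["UK" 2011])
```

Keys that are not present in the table yield nil in this sketch.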

column-names^clj

If given a dataset, it returns its column names. If given a dataset and a sequence of column names, it returns a dataset with the given column names.

columns^clj

(columns dataset cols)

Given a dataset and a sequence of column identifiers, columns narrows the dataset to just the supplied columns.

Columns specified in the selection that are not included in the Dataset will be silently ignored.

The order of the columns in the returned dataset will be determined by the order of matched columns in the selection.

The supplied sequence of columns is first cropped to the number of columns in the dataset before being selected; this means that infinite sequences can safely be supplied to this function.
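
The cropping and silent-ignore rules described above can be modelled in a few lines of plain Clojure. This is a sketch under assumptions, not the library's implementation: `columns-sketch` is a hypothetical helper, and a dataset is modelled as a header vector plus a vector of row maps.

```clojure
;; Hypothetical model of the selection rules described for columns.
(defn columns-sketch
  [header rows cols]
  (let [cropped  (take (count header) cols)     ; crop first, so infinite seqs are safe
        selected (filterv (set header) cropped)] ; unknown columns are dropped silently
    {:header selected
     :rows   (mapv #(select-keys % selected) rows)}))

;; An infinite selection containing a column ("z") that does not exist.
(columns-sketch ["a" "b" "c"]
                [{"a" 1 "b" 2 "c" 3}]
                (cycle ["c" "z" "a"]))
```

Note that the resulting header follows the order of the selection ("c" before "a"), not the order of the original dataset.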

dataset?^clj

(dataset? ds)

Predicate function to test whether the supplied argument is a dataset or not.

derive-column^clj

(derive-column dataset new-column-name from-cols)

(derive-column dataset new-column-name from-cols f)

Adds a new column to the end of the row which is derived from the column with position col-n. f should just return the cell's value.

If no f is supplied the identity function is used, which results in the specified column being cloned.
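
A plain-Clojure sketch of this behaviour, including the identity default, might look like the following. It is illustrative only: `derive-column-sketch` is a hypothetical name, rows are ordinary maps (so column ordering is not modelled), and from-cols is assumed to be a vector of column ids.

```clojure
;; Hypothetical model of derive-column over plain row maps.
(defn derive-column-sketch
  ([rows new-col from-cols]
   ;; With no f, identity clones the single source column.
   (derive-column-sketch rows new-col from-cols identity))
  ([rows new-col from-cols f]
   (mapv (fn [row]
           (assoc row new-col (apply f (map #(get row %) from-cols))))
         rows)))

;; Derive a column from two source columns.
(derive-column-sketch [{"a" 1 "b" 2}] "sum" ["a" "b"] +)

;; Clone column "a" by omitting f.
(derive-column-sketch [{"a" 1}] "a-copy" ["a"])
```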

drop-rows^clj

(drop-rows dataset n)

Drops the first n rows from the dataset, retaining the rest.

graph-fn^cljmacro

(graph-fn [row-bindings] & forms)

A macro that defines an anonymous function to convert a tabular dataset into a graph of RDF quads. Ultimately it converts a lazy-seq of rows inside a dataset, into a lazy-seq of RDF Statements.

The function body should be composed of any number of forms, each of which should return a sequence of RDF quads. These will then be concatenated together into a flattened lazy-seq of RDF statements.

Rows are passed to the function one at a time as hash-maps, which can be destructured via Clojure's standard destructuring syntax.

Additionally, destructuring can be done on row indices (when a vector form is supplied) or column names (when a hash-map form is supplied).
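
The shape of such a function can be approximated with an ordinary `fn` and `mapcat`. The sketch below is not the macro itself and makes several assumptions: quads are plain four-element vectors rather than RDF Statement objects, the example.org URIs and the column names "id" and "name" are invented for illustration, and only hash-map destructuring is shown because a plain `fn` cannot destructure a map positionally.

```clojure
;; Hypothetical row function: each form in the body yields a seq of quads,
;; and the seqs are concatenated, as described for graph-fn.
(def row->quads-sketch
  (fn [{:strs [id name]}]
    (let [subject (str "http://example.org/person/" id)
          graph   "http://example.org/graph/people"]
      (concat
       [[subject "http://xmlns.com/foaf/0.1/name" name graph]]
       [[subject
         "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
         "http://xmlns.com/foaf/0.1/Person"
         graph]]))))

(defn rows->quads-sketch
  "Lazily flattens the quads produced for every row into one sequence."
  [rows]
  (mapcat row->quads-sketch rows))

(rows->quads-sketch [{"id" 1 "name" "Alice"}
                     {"id" 2 "name" "Bob"}])
```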

grep^cljmultimethod

Filters rows in the table for matches. This multimethod dispatches on the type of its second argument. It also takes any number of column numbers as the final set of arguments. These narrow the scope of the grep to only those columns. If no columns are specified then grep operates on all columns.

make-dataset^clj

(make-dataset)

(make-dataset data)

(make-dataset data columns-or-f)

Like incanter's dataset function except it can take a lazy-sequence of column names which will get mapped to the source data.

Works by inspecting the number of columns in the first row, and taking that many column names from the sequence.

Inspects the first row of data to determine the number of columns, and creates an incanter dataset with columns named alphabetically as by grafter.sequences/column-names-seq.
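
The first-row inspection described above can be sketched in plain Clojure. This is a simplified model, not the real constructor: `make-dataset-sketch` and `alphabet` are hypothetical names, the result is a plain map rather than an incanter dataset, and `alphabet` only covers A to Z as a stand-in for grafter.sequences/column-names-seq.

```clojure
;; Hypothetical model: take as many names as the first row has cells.
(defn make-dataset-sketch
  [data col-names]
  (let [header (vec (take (count (first data)) col-names))]
    {:header header
     :rows   (mapv #(zipmap header %) data)}))

;; Stand-in for an alphabetical column-name sequence ("A", "B", ...).
(def alphabet
  (map (comp str char) (range (int \A) (inc (int \Z)))))

;; Two cells in the first row, so only "A" and "B" are taken.
(make-dataset-sketch [[1 2] [3 4]] alphabet)
```

Because only `(count (first data))` names are consumed, col-names may be a lazy or infinite sequence.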

mapc^clj

(mapc dataset fs)

Takes a vector or a hash map of functions and maps each to the key column for every row. Each function should be from a cell to a cell, whereas with apply-columns it should be from a column to a column, i.e. a function from a collection of cells to a collection of cells.

If the specified column does not exist in the source data a new column will be created, though the supplied function will need to either ignore its argument or handle a nil argument.
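
The cell-level mapping can be modelled in plain Clojure as below. This is a sketch under assumptions rather than grafter's implementation: `mapc-sketch` is a hypothetical helper that takes the header explicitly, rows are ordinary maps, and `_` is defined locally as identity to mirror the positional placeholder.

```clojure
;; Local stand-in for the positional placeholder: leaves a cell unchanged.
(def _ identity)

(defn mapc-sketch
  "Applies each function to the cell of its column in every row.
  fs is either a map of column id to function, or a positional vector."
  [header rows fs]
  (let [col->f (if (map? fs) fs (zipmap header fs))]
    (mapv (fn [row]
            (reduce-kv (fn [r col f] (assoc r col (f (get r col))))
                       row
                       col->f))
          rows)))

;; Positional form: keep "a", increment "b".
(mapc-sketch ["a" "b"] [{"a" 1 "b" 2}] [_ inc])

;; Map form: "c" does not exist, so its function must cope with nil.
(mapc-sketch ["a" "b"] [{"a" 1 "b" 2}] {"a" str "c" (constantly :new)})
```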

melt^clj

(melt dataset pivot-keys)

Melt an object into a form suitable for easy casting, like a melt function in R. It accepts multiple pivot keys (identifier variables that are reproduced for each row in the output).

(use '(incanter core charts datasets))

(view (with-data (melt (get-dataset :flow-meter) :Subject)
        (line-chart :Subject :value :group-by :variable :legend true)))

See http://www.statmethods.net/management/reshape.html for more examples.
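
For readers unfamiliar with melting, the reshaping itself can be sketched in plain Clojure without incanter. This is an illustrative model, not the library's implementation: `melt-sketch` is a hypothetical name, rows are ordinary maps, and the order of the output rows may differ from what the real function produces.

```clojure
;; Hypothetical model: one output row per (input row, non-pivot column) pair.
(defn melt-sketch
  [rows pivot-keys]
  (for [row   rows
        [k v] (apply dissoc row pivot-keys)]
    (assoc (select-keys row pivot-keys) :variable k :value v)))

(def wide-rows
  [{:subject 1 :before 10 :after 12}
   {:subject 2 :before 20 :after 18}])

;; :subject is repeated on every output row; the other columns become
;; :variable / :value pairs.
(melt-sketch wide-rows [:subject])
```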

move-first-row-to-header^clj

(move-first-row-to-header [first-row & other-rows])

For use with make-dataset. Moves the first row of data into the header, removing it from the source data.

read-dataset^clj

(read-dataset source & {:as opts})

read-datasets^clj

(read-datasets dataset & {:keys [format] :as opts})

Opens a lazy sequence of datasets from something that returns multiple datasets, e.g. all the worksheets in an Excel workbook.

rename-columns^clj

(rename-columns dataset col-map-or-fn)

Renames the columns in the dataset. Takes either a map or a function. If a map is passed it will rename the specified keys to the corresponding values.

If a function is supplied it will apply the function to all of the column-names in the supplied dataset. The return values of this function will then become the new column names in the dataset returned by rename-columns.
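
The two renaming modes can be modelled on a bare header vector. The helper below is a hypothetical plain-Clojure sketch, not grafter's implementation, and it assumes that columns missing from the map keep their existing names.

```clojure
;; Hypothetical model of the map and function forms of rename-columns.
(defn rename-columns-sketch
  [header col-map-or-fn]
  (if (map? col-map-or-fn)
    ;; Map form: rename listed keys, leave every other column name alone.
    (mapv #(get col-map-or-fn % %) header)
    ;; Function form: every column name is passed through the function.
    (mapv col-map-or-fn header)))

(rename-columns-sketch ["a" "b"] {"a" "id"})
(rename-columns-sketch ["a" "b"] keyword)
```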

reorder-columns^clj

(reorder-columns {:keys [column-names] :as ds} cols)

Reorder the columns in a dataset to the supplied order. An error will be raised if the supplied set of columns is different from the set of columns in the dataset.

resolve-column-id^clj

(resolve-column-id dataset column-key)

(resolve-column-id dataset column-key not-found)

Finds and resolves the column id by converting between symbols and strings. If column-key is not found in the dataset's headers then not-found is returned.

resolve-key-cols^clj

(resolve-key-cols dataset key-cols)

rows^clj

(rows dataset row-numbers)

Takes a dataset and a seq of row-numbers and returns a dataset consisting of just the supplied rows. If a row number is not found the function will assume it has consumed all the rows and return normally.
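
One reading of the "row number not found" rule is that selection simply stops at the first out-of-range index, which also makes infinite sequences of row numbers safe. The plain-Clojure sketch below models that reading only; `rows-sketch` is a hypothetical helper and the real function may treat unordered row numbers differently.

```clojure
;; Hypothetical model: stop as soon as a requested row number is missing.
(defn rows-sketch
  [all-rows row-numbers]
  (let [all-rows (vec all-rows)]
    (->> row-numbers
         (take-while #(contains? all-rows %)) ; contains? checks the vector index
         (mapv all-rows))))

;; Pick specific rows.
(rows-sketch [:r0 :r1 :r2] [0 2])

;; An infinite range terminates once the data runs out.
(rows-sketch [:r0 :r1 :r2] (range))
```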

swap^clj

(swap dataset first-col second-col)

(swap dataset first-col second-col & more)

Takes an even number of column names and swaps each pair of columns.

take-rows^clj

(take-rows dataset n)

Takes only the first n rows from the dataset, discarding the rest.

test-dataset^clj

(test-dataset r c)

Constructs a test dataset of r rows by c cols e.g.

(test-dataset 2 2) ;; =>

A	B
0	0
1	1

with-metadata-columns^clj

(with-metadata-columns [context data])

Takes a pair of [context, data] and returns a dataset in which the metadata context is merged into the dataset itself.

without-metadata-columns^clj

(without-metadata-columns [context data])

Ignores any possible metadata and leaves the dataset as is.

write-dataset^clj

(write-dataset destination dataset & {:keys [format] :as opts})
