
datajure.index

Keyed lookup indexes — explicit, immutable prepared row-index structures over a source dataset. An index is built once and reused for fast access; the source dataset is never reordered or mutated (this is data.table's setindex(), never setkey()), and the index carries a reference to the exact dataset value it was built from so it can never be applied to a mismatched table.

Two kinds:

  • :hash (default) — equality point-lookups. All key columns form the lookup tuple. lookup resolves a key to that key's rows in O(1):

    (def by-tic (idx/index-by panel :tic))
    (idx/lookup by-tic "AAPL")            ;; => dataset of AAPL's rows
    (idx/lookup by-firm-date [1690 date])   ;; multi-column tuple key
    
  • :asof — the prepared right-side structure for as-of / window joins. The LAST key column is the asof column, the rest are exact-match keys; each exact tuple maps to its asof values (sorted ascending, nils last) plus original row ids. An :asof index is not point-lookable via lookup (the access pattern needs direction/tolerance/window semantics) — it is consumed by datajure.asof / a :how :asof join:

    (def right-idx (idx/index-by compustat [:gvkey :rdq] {:kind :asof}))
    (idx/asof-index compustat [:gvkey :rdq])   ;; convenience alias
    

lookup-indices returns raw row indices for callers gathering from a row-aligned projection themselves.
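
The :hash behaviour described above can be pictured with plain Clojure data. This is a minimal sketch, not datajure's implementation: `rows`, `build-hash-index`, and `lookup-rows` are hypothetical names, the "dataset" is just a vector of maps, and keys are always written as tuples (the real `lookup` also accepts a scalar for a single-column index).

```clojure
;; Illustrative sketch only: plain Clojure data, not datajure's internals.
;; `rows`, `build-hash-index`, and `lookup-rows` are hypothetical names.

(def rows
  [{:tic "AAPL" :year 2020 :ret 0.82}
   {:tic "MSFT" :year 2020 :ret 0.43}
   {:tic "AAPL" :year 2021 :ret 0.35}])

(defn build-hash-index
  "One pass over `rows`: map each key tuple to the vector of row indices
  holding it, in ascending original order. `rows` itself is never reordered,
  and the index keeps a reference to it under :source."
  [rows key-cols]
  {:source   rows
   :key-cols key-cols
   :groups   (reduce (fn [m i]
                       (let [k (mapv (nth rows i) key-cols)]
                         (update m k (fnil conj []) i)))
                     {}
                     (range (count rows)))})

(defn lookup-rows
  "Rows whose key tuple equals `k`, in original order; empty vector if absent."
  [{:keys [source groups]} k]
  (mapv source (get groups k [])))

(def by-tic (build-hash-index rows [:tic]))

(lookup-rows by-tic ["AAPL"])   ;; the two AAPL rows, 2020 first, then 2021
(lookup-rows by-tic ["TSLA"])   ;; => []
```

Because the sketch stores row indices rather than copies of rows, the same structure can serve both a row-returning lookup and an index-returning one.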


asof-groups (clj)

(asof-groups index)

Low-level: the exact-key -> {:reader :orig :n} table of an :asof index. Used by datajure.asof to consume a prebuilt index.


asof-index (clj)

(asof-index dataset key-cols)

Convenience for (index-by dataset key-cols {:kind :asof}).


index-by (clj)

(index-by dataset key-cols)
(index-by dataset key-cols opts)

Build an immutable index over dataset keyed by key-cols (a keyword or a vector of keywords). opts may set :kind to :hash (default) or :asof.

:hash — all key columns form the equality tuple; rows are grouped by key, so a key may map to many rows (ascending original order). O(row-count) to build, O(1) per lookup.

:asof — the last key column is the asof column, the rest are exact-match keys; the prepared structure for datajure.asof (see ns doc). Consumed by as-of / window joins, not by lookup.

The index holds a reference to dataset.
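
As a rough picture of the :asof layout described above (exact-match keys, then asof values sorted ascending with nils last, plus original row ids), here is a minimal sketch in plain Clojure. It does not reflect datajure's actual internals: every name in it is hypothetical, the data is made up for illustration, and only a backward ("latest value at or before t") search is shown, whereas the real joins also handle direction, tolerance, and windows.

```clojure
;; Illustrative sketch only; not datajure's internals. All names are hypothetical.

(def filings
  [{:gvkey 1690 :rdq 20200430}
   {:gvkey 1690 :rdq 20200130}
   {:gvkey 1690 :rdq nil}
   {:gvkey 2000 :rdq 20200215}])

(defn nils-last-compare
  "Ascending comparator that sorts nil after every non-nil value."
  [a b]
  (cond
    (and (nil? a) (nil? b)) 0
    (nil? a) 1
    (nil? b) -1
    :else (compare a b)))

(defn build-asof-groups
  "Map each exact-key tuple (all key columns but the last) to
  {:asof-vals sorted-asof-values, :orig original-row-ids-in-that-order}."
  [rows key-cols]
  (let [exact-cols (vec (butlast key-cols))
        asof-col   (last key-cols)
        asof-of    (fn [i] (get (nth rows i) asof-col))]
    (->> (range (count rows))
         (group-by (fn [i] (mapv (nth rows i) exact-cols)))
         (reduce-kv
          (fn [m k ids]
            (let [sorted (vec (sort-by asof-of nils-last-compare ids))]
              (assoc m k {:asof-vals (mapv asof-of sorted)
                          :orig      sorted})))
          {}))))

(defn backward-match
  "Original row id of the latest non-nil asof value <= `t` in the group for
  `exact-key`, or nil if there is none. A linear scan for clarity; a sorted
  vector would normally be binary-searched."
  [groups exact-key t]
  (let [{:keys [asof-vals orig]} (get groups exact-key)]
    (->> (map vector asof-vals orig)
         (take-while (fn [[v _]] (and (some? v) (<= v t))))
         last
         second)))

(def groups (build-asof-groups filings [:gvkey :rdq]))

(get groups [1690])
;; => {:asof-vals [20200130 20200430 nil], :orig [1 0 2]}

(backward-match groups [1690] 20200301)   ;; => 1
```

With a single key column the exact-key tuple is `[]`, so all rows fall into one group and the structure degenerates to a plain sorted column.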


index? (clj)

(index? x)

True if x is a datajure index value.


key-columns (clj)

(key-columns index)

The vector of key columns this index is built on.


kind (clj)

(kind index)

The kind of index, either :hash or :asof.


lookup (clj)

(lookup index k)

Return the rows of the index's source dataset whose key equals k, as a dataset in original row order. Empty (0-row) dataset if the key is absent. :hash indexes only.


lookup-indices (clj)

(lookup-indices index k)

Return the vector of row indices (into the index's source dataset) whose key equals k — a scalar for a single-column index, a tuple/vector for a multi-column one. Empty vector if the key is absent. :hash indexes only.
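
To illustrate what "gathering from a row-aligned projection" might look like, here is a small sketch in plain Clojure. `row-ids` stands in for a result of `lookup-indices`, and `prices` and `gather` are hypothetical names; nothing here calls datajure itself.

```clojure
;; Hypothetical stand-ins, not datajure calls:
;; `prices` is a column vector row-aligned with the indexed dataset,
;; `row-ids` plays the role of a lookup-indices result.
(def prices  [132.7 222.4 177.6 336.3])
(def row-ids [0 2])

(defn gather
  "Select the elements of the row-aligned vector `column` at `row-ids`,
  preserving the order of `row-ids`."
  [column row-ids]
  (mapv column row-ids))

(gather prices row-ids)   ;; => [132.7 177.6]
(gather prices [])        ;; => [] (absent key: nothing to gather)
```

Working with raw indices avoids materialising a dataset of matching rows when only one or two aligned columns are needed.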


row-asof-val (clj)

(row-asof-val readers i)

Extract the asof-key value (last key column) from row i. Low-level.


row-exact-key (clj)

(row-exact-key readers i)

Extract the exact-key tuple (all key columns except the last) from row i given readers (one reader per key column). Returns [] when there is only one key column (pure asof, no exact grouping). Low-level — shared with the as-of search layer for probing left rows.
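
The split between exact-key tuple and asof value can be sketched as follows. This assumes, purely for illustration, that a "reader" behaves like an indexable vector of column values; `exact-key-of` and `asof-val-of` are hypothetical stand-ins, not the library's functions.

```clojure
;; Illustrative sketch: readers are modelled as plain vectors, one per key column.

(defn exact-key-of
  "Tuple of every key column except the last, read at row `i`."
  [readers i]
  (mapv (fn [reader] (nth reader i)) (butlast readers)))

(defn asof-val-of
  "Value of the last key column (the asof column) at row `i`."
  [readers i]
  (nth (last readers) i))

(def gvkey-col [1690 1690 2000])
(def rdq-col   [20200130 20200430 20200215])

(exact-key-of [gvkey-col rdq-col] 2)   ;; => [2000]
(asof-val-of  [gvkey-col rdq-col] 2)   ;; => 20200215
(exact-key-of [rdq-col] 0)             ;; => [] (pure asof, no exact grouping)
```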


source-dataset (clj)

(source-dataset index)

The dataset value this index was built from.

