(path->dataset-master-token-table path bag-of-words-colname)
(path->dataset-master-token-table path
                                  bag-of-words-colname
                                  {:keys [tokenizer]
                                   :or {tokenizer (simple-tokenizer-fn)}})
Parses a file and returns a map with the keys :dataset and :token-table, where the token table maps each token to its count. In the returned dataset, a sha-256 hash stands where the original text once was.
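The token-to-count map described above can be illustrated with a minimal sketch in plain Clojure. It does not call the library: `sketch-tokenizer` and `sketch-token-table` are hypothetical names, and the whitespace-based, lower-casing tokenizer is an assumption that may differ from what `simple-tokenizer-fn` actually does.

```clojure
(require '[clojure.string :as str])

;; Hypothetical tokenizer: lower-cases the text and splits on whitespace.
;; This is an assumption, not the behavior of the library's simple-tokenizer-fn.
(defn sketch-tokenizer
  [text]
  (->> (str/split (str/lower-case text) #"\s+")
       (remove str/blank?)))

;; Builds a map of token -> count over a collection of text strings.
(defn sketch-token-table
  [texts]
  (frequencies (mapcat sketch-tokenizer texts)))

(sketch-token-table ["the cat sat" "The cat ran"])
;; => {"the" 2, "cat" 2, "sat" 1, "ran" 1}
```

A table of this shape is what a caller would typically filter (for example, by minimum count) before assigning each surviving token an index.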
(path-token-map->bag-of-words path bag-of-words-colname token->idx-map)
(path-token-map->bag-of-words path
                              bag-of-words-colname
                              token->idx-map
                              {:keys [tokenizer]
                               :or {tokenizer (simple-tokenizer-fn)}})
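No docstring accompanies this function here, so the following is only a sketch of what a `token->idx-map` is commonly used for: turning one text into a map of token index to count. The name `sketch-bag-of-words`, the choice to skip tokens absent from the map, and the whitespace tokenizer are all assumptions, not the library's documented behavior.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: tokenizes `text`, looks each token up in `token->idx`,
;; drops tokens that have no index, and counts occurrences per index.
(defn sketch-bag-of-words
  [tokenize token->idx text]
  (->> (tokenize text)
       (keep token->idx)
       frequencies))

(sketch-bag-of-words #(str/split % #"\s+")
                     {"cat" 0 "sat" 1}
                     "cat sat cat dog")
;; => {0 2, 1 1}
```

Dropping unknown tokens keeps the index space fixed, which matters when the map was built from a separate pass over the corpus.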