
tech.ml.dataset.text


data->enc-builder (clj)

(data->enc-builder data)

default-charset (clj)


default-tokenizer (clj)


enc-builder->data (clj)

(enc-builder->data enc)

encoded-text-builder (clj)

(encoded-text-builder)
(encoded-text-builder encoding)

Encoding (clj)


make-tokenized-text-builder (clj)

(make-tokenized-text-builder)
(make-tokenized-text-builder str-table)
(make-tokenized-text-builder str-table options)

PEncodingToFn (clj, protocol)

encoding->decode-fn (clj)

(encoding->decode-fn encoding)

encoding->encode-fn (clj)

(encoding->encode-fn encoding)

string->tokenized-text! (clj)

(string->tokenized-text! str-data)
(string->tokenized-text! str-table str-data)
(string->tokenized-text! str-table
                         {:keys [token tokenizer]
                          :or {token " " tokenizer default-tokenizer}}
                         str-data)

Mutates the string table by adding tokens, and records the offset and length in a text object.
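
To make the docstring concrete, here is a minimal sketch in plain Clojure of one plausible reading of "adds tokens to a table and records offset and length". It does not call tech.ml.dataset.text and is not the library's implementation: make-table, add-token!, and tokenize-into! are hypothetical names, the table is an ordinary atom rather than the library's string table type, and splitting on a single space is assumed to mirror the default :token " ".

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for a string table (NOT the library's type).
;; :tokens      unique tokens in insertion order
;; :token->idx  lookup from token to its position in :tokens
;; :indices     flat buffer of token indexes for every text added so far
(defn make-table []
  (atom {:token->idx {} :tokens [] :indices []}))

(defn add-token!
  "Interns token in the table if it is new and returns its index."
  [table token]
  (let [state (swap! table
                     (fn [{:keys [token->idx tokens] :as s}]
                       (if (contains? token->idx token)
                         s
                         (-> s
                             (assoc-in [:token->idx token] (count tokens))
                             (update :tokens conj token)))))]
    (get-in state [:token->idx token])))

(defn tokenize-into!
  "Splits s on single spaces (an assumption), interns each token, appends the
  token indexes to the flat buffer, and returns where they were written."
  [table s]
  (let [idxs   (mapv #(add-token! table %) (str/split s #" "))
        offset (count (:indices @table))]
    (swap! table update :indices into idxs)
    {:offset offset :length (count idxs)}))

(def table (make-table))
(tokenize-into! table "the cat sat") ;=> {:offset 0, :length 3}
(tokenize-into! table "the dog")     ;=> {:offset 3, :length 2}
(:tokens @table)                     ;=> ["the" "cat" "sat" "dog"]
```

In this sketch the repeated token "the" is stored once, and each text is only a small offset/length record pointing into the shared index buffer.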
