(add! builder values)
Populate an FST with `<input,output>` tuples. This function can be called multiple times before `(create-fst!)` is called to actually create the FST.

* `[builder]`: builder to populate
* `[values]`: map of inputs->outputs. The keys of the map are the inputs, and its values are the outputs.

**Note:** if `(add!)` is called iteratively, make sure that the structure being iterated over has been sorted by the input keys beforehand.
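The sorting requirement for `(add!)` can be met with plain Clojure before the builder is ever touched. The sketch below is illustrative only: `raw-pairs` and `sorted-pairs` are hypothetical names, and it assumes that ascending key order as produced by `sorted-map` (that is, `String.compareTo` order for string keys) is the order the builder expects.

```clojure
;; Hypothetical, unsorted input->output pairs (illustration only).
(def raw-pairs
  {"pear" 3, "apple" 1, "fig" 2})

;; Copy the pairs into a sorted map so that iterating over it yields
;; the entries in ascending key order.
(def sorted-pairs
  (into (sorted-map) raw-pairs))

(keys sorted-pairs)
;; => ("apple" "fig" "pear")

;; Assumption: `builder` was obtained from (create-builder!) and the
;; library namespace has been required. The call would then be:
;; (add! builder sorted-pairs)
```

When the data arrives in several batches, sorting the complete set once and then splitting it into batches keeps the overall order intact across successive `(add!)` calls.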
(builder! type
&
{:keys [min-suffix-count-1 min-suffix-count-2 share-suffix
share-non-singleton-nodes share-max-tail-length pack-fst
acceptable-overhead-ratio allow-array-arcs bytes-page-bits]
:or {share-non-singleton-nodes true
min-suffix-count-2 0
min-suffix-count-1 0
share-max-tail-length Integer/MAX_VALUE
pack-fst false
share-suffix true
bytes-page-bits 15
acceptable-overhead-ratio
org.apache.lucene.util.packed.PackedInts/COMPACT
allow-array-arcs true}})
Create a builder object. You can use this function directly instead of `(create-builder!)` if you require really specific settings.

* `[type]`: type of the output. Can be `:int` or `:char`
* `[min-suffix-count-1]` (optional): If pruning the input graph during construction, this threshold determines whether a node is kept or pruned. If transition_count(node) >= minSuffixCount1, the node is kept.
* `[min-suffix-count-2]` (optional): (Note: only Mike McCandless knows what this one is really doing...)
* `[share-suffix]` (optional): If true, the shared suffixes will be compacted into unique paths. This requires an additional RAM-intensive hash map for lookups in memory. Setting this parameter to false creates a single suffix path for all input sequences. This results in a larger FST, but requires substantially less memory and CPU during building.
* `[share-non-singleton-nodes]` (optional): Only used if share-suffix is true. Set this to true to ensure the FST is fully minimal, at the cost of more CPU and more RAM during building.
* `[share-max-tail-length]` (optional): Only used if share-suffix is true. Set this to Integer.MAX_VALUE to ensure the FST is fully minimal, at the cost of more CPU and more RAM during building.
* `[allow-array-arcs]` (optional): Pass false to disable the array arc optimization while building the FST; this will make the resulting FST smaller but slower to traverse.
* `[bytes-page-bits]` (optional): How many bits wide to make each byte[] block in the BytesStore; if you know the FST will be large, make this larger. For example, 15 bits gives 32768-byte pages.
(bytes-ref-builder)
Create a BytesRefBuilder
(char-outputs)
Create a CharSequenceOutputs
(chars-ref-builder)
Create a CharsRefBuilder
(create-builder! & {:keys [type] :or {type :char}})
Create a new FST builder map.

* `[type]` *(optional)*: Output type of the FST. Can be `:int` or `:char` (default)
(create-fst! builder)
Create a new FST from a builder that has been populated with inputs/outputs.

* `[builder]`: builder that has been created and populated
(get-output input fst)
Get the output for a given input.

* `[input]`: input for which we want the output
* `[fst]`: FST object in which to look up the output
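The functions documented so far can be chained into a build-then-query flow. The following is a minimal sketch rather than an authoritative usage guide: it assumes the functions live in a namespace named `clj-fst.core` (the namespace is not stated on this page) and that the library is on the classpath. The keys and values are made-up sample data.

```clojure
;; Assumption: the documented functions are exposed by clj-fst.core.
(require '[clj-fst.core :refer [create-builder! add! create-fst! get-output]])

;; Builder whose outputs are integers instead of the default :char.
(def builder (create-builder! :type :int))

;; Inputs must be added in sorted key order, hence the sorted map.
(add! builder (sorted-map "apple" 1 "fig" 2 "pear" 3))

;; Freeze the populated builder into an FST.
(def fst (create-fst! builder))

;; Look up the output stored for one input (note the argument order:
;; input first, FST second).
(get-output "fig" fst)
```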
(int-outputs)
Create a PositiveIntOutputs
(ints-ref-builder)
Create an IntsRefBuilder
(load! file & {:keys [output-type] :or {output-type :char}})
Load an FST from a file on the file system.

* `[file]` is the file path on the file system
* `[output-type]` (optional): `:int` when the outputs of the FST file are integers, `:char` (the default, per the signature above) when the outputs of the FST file are characters.

Returns the loaded FST.
(save! file fst)
Save an FST to a file on the file system.

* `[file]` is the file path on the file system
* `[fst]` is the FST instance