
clj-fst.core


add!

(add! builder values)

Populate an FST with <input,output> tuples. This function can be called multiple times before (create-fst!) is called to actually create the FST.

  • [builder]: the builder used to populate the FST
  • [values]: map of inputs->outputs. The keys of the map are the inputs, and the values are the outputs.

Note: if (add!) is called repeatedly, the inputs must arrive in sorted order, so make sure the structure you iterate over has been sorted by input key beforehand.
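The sorting requirement in the note can be sketched as follows. This is a minimal illustration, not the library's canonical usage: the sample data and the var names (`entries`, `sorted-entries`, `animals-fst`) are hypothetical, and it assumes that plain string keys and integer values are accepted for an `:int` builder.

```clojure
(require '[clj-fst.core :as fst])

;; Hypothetical sample data, deliberately out of order.
(def entries {"mice" 3, "cat" 1, "dog" 2})

;; Sort the pairs by input key so that every add! call receives
;; a key greater than the one before it.
(def sorted-entries (sort-by key entries))

(def builder (fst/create-builder! :type :int))

;; One add! call per pair, in ascending key order.
(doseq [[input output] sorted-entries]
  (fst/add! builder {input output}))

;; Only now is the FST actually created.
(def animals-fst (fst/create-fst! builder))
```

Sorting once up front keeps the loop itself trivial; if the data already lives in a `sorted-map`, the explicit `sort-by` step should be unnecessary.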


builder!

(builder! type
          &
          {:keys [min-suffix-count-1 min-suffix-count-2 share-suffix
                  share-non-singleton-nodes share-max-tail-length pack-fst
                  acceptable-overhead-ratio allow-array-arcs bytes-page-bits]
           :or {share-non-singleton-nodes true
                min-suffix-count-2 0
                min-suffix-count-1 0
                share-max-tail-length Integer/MAX_VALUE
                pack-fst false
                share-suffix true
                bytes-page-bits 15
                acceptable-overhead-ratio
                  org.apache.lucene.util.packed.PackedInts/COMPACT
                allow-array-arcs true}})

Create a builder object.

You can directly use this function instead of the (create-builder!) function if you require really specific settings.

  • [type]: type of the output. Can be :int or :char
  • [min-suffix-count-1] (optional): If pruning the input graph during construction, this threshold is used for telling if a node is kept or pruned. If transition_count(node) >= minSuffixCount1, the node is kept.
  • [min-suffix-count-2] (optional): (Note: only Mike McCandless knows what this one is really doing...)
  • [share-suffix] (optional): If true, the shared suffixes will be compacted into unique paths. This requires an additional RAM-intensive hash map for lookups in memory. Setting this parameter to false creates a single suffix path for all input sequences. This will result in a larger FST, but requires substantially less memory and CPU during building.
  • [share-non-singleton-nodes] (optional): Only used if share-suffix is true. Set this to true to ensure FST is fully minimal, at cost of more CPU and more RAM during building.
  • [share-max-tail-length] (optional): Only used if share-suffix is true. Set this to Integer.MAX_VALUE to ensure FST is fully minimal, at cost of more CPU and more RAM during building.
  • [allow-array-arcs] (optional): Pass false to disable the array arc optimization while building the FST; this will make the resulting FST smaller but slower to traverse.
  • [bytes-page-bits] (optional): How many bits wide to make each byte[] block in the BytesStore; if you know the FST will be large then make this larger. For example 15 bits = 32768 byte pages.
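As a rough sketch of how these options might be passed, the snippet below calls `(builder!)` with keyword arguments matching the signature above. The helper `page-size-bytes` and the var name `low-memory-builder` are hypothetical, and the chosen option values are only an illustration of the memory/size trade-off described in the list, not a recommendation.

```clojure
(require '[clj-fst.core :as fst])

;; Hypothetical helper: the page size implied by a bytes-page-bits
;; value is 2^bits bytes.
(defn page-size-bytes [bits]
  (bit-shift-left 1 bits))

(page-size-bytes 15) ; => 32768, the figure quoted for the default

;; Accept a larger FST in exchange for cheaper construction,
;; and use 2^16-byte pages instead of the default 2^15.
(def low-memory-builder
  (fst/builder! :int
                :share-suffix false
                :bytes-page-bits 16))
```

Any option left out falls back to the default shown in the `:or` map of the signature.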

bytes-ref

(bytes-ref)

Create a BytesRef


bytes-ref-builder

(bytes-ref-builder)

Create a BytesRefBuilder


char-outputs

(char-outputs)

Create a CharSequenceOutputs


chars-ref

(chars-ref)

Create a CharsRef


chars-ref-builder

(chars-ref-builder)

Create a CharsRefBuilder


create-builder!

(create-builder! & {:keys [type] :or {type :char}})

Create a new FST builder map.

  • [type] (optional): Output type of the FST. Can be :int or :char (default)

create-fst!

(create-fst! builder)

Create a new FST based on a builder that has been populated with inputs/outputs

  • [builder]: builder object that has been created and populated

get-output

(get-output input fst)

Get the output for a given input.

  • [input]: input for which we want its output
  • [fst]: FST object where to look for the output
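A minimal lookup sketch, assuming a `:char` FST built from a small hypothetical map (the data and the var name `animals` are illustrative only). Note the argument order: the input comes first and the FST second.

```clojure
(require '[clj-fst.core :as fst])

(def builder (fst/create-builder! :type :char))

;; A sorted-map yields its entries in ascending key order,
;; which satisfies the ordering requirement of add!.
(fst/add! builder (sorted-map "cat" "feline" "dog" "canine"))

(def animals (fst/create-fst! builder))

;; Input first, FST second.
(fst/get-output "cat" animals)
```

What is returned for an input that was never added is not stated in the docstring, so it is worth checking that case before relying on it.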

int-outputs

(int-outputs)

Create a PositiveIntOutputs


ints-ref

(ints-ref)

Create an IntsRef


ints-ref-builder

(ints-ref-builder)

Create an IntsRefBuilder


load!

(load! file & {:keys [output-type] :or {output-type :char}})

Load an FST from a file on the file system.

  • [file]: the file path on the file system
  • [output-type] (optional): :char (default) when the outputs of the FST file are characters; :int when the outputs of the FST file are integers.

Returns the loaded FST.


save!

(save! file fst)

Save an FST to a file on the file system

  • [file] is the file path on the file system
  • [fst] is the FST instance
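A save/load round trip might look like the sketch below. It assumes that `[file]` can be given as a string path; the temporary file location and the var names (`original`, `path`, `restored`) are hypothetical. The output type is passed explicitly to `(load!)` so that it matches the type the FST was built with.

```clojure
(require '[clj-fst.core :as fst])

(def builder (fst/create-builder! :type :int))
(fst/add! builder (sorted-map "cat" 1 "dog" 2))
(def original (fst/create-fst! builder))

;; Hypothetical temporary location for the serialized FST.
(def path
  (.getAbsolutePath (java.io.File/createTempFile "animals" ".fst")))

(fst/save! path original)

;; State the output type explicitly rather than relying on the default.
(def restored (fst/load! path :output-type :int))
```

Lookups against `restored` would then go through `(get-output)` exactly as with the original FST.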
