(doc->graph doc analyzer)
Each field is analyzed into a graph. Params:
* doc: flat associative data structure
* analyzer: a Lucene Analyzer, but most likely you want a PerFieldAnalyzerWrapper
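A minimal usage sketch (the `analysis` alias for this library's namespace is hypothetical; StandardAnalyzer is a standard Lucene analyzer):

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; `analysis` is a hypothetical alias for this library's namespace.
(analysis/doc->graph {:title "Brown Fox" :body "jumps over"}
                     (StandardAnalyzer.))
;; expected shape: a map with the same keys, where each value is a
;; dot-language graph string describing that field's token stream
```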
(doc->token-strings doc analyzer)
Given a document, iterates through all its fields, applies an analyzer to each field, and returns a map with the same keys and the analyzed text. TIP: the analyzer is most likely a PerFieldAnalyzerWrapper.
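A sketch showing the per-field setup the TIP suggests (the `analysis` alias is hypothetical; the Lucene classes are real):

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer)
        '(org.apache.lucene.analysis.core KeywordAnalyzer)
        '(org.apache.lucene.analysis.miscellaneous PerFieldAnalyzerWrapper))

;; Keep the "id" field intact (KeywordAnalyzer emits the whole value
;; as one token); analyze every other field with StandardAnalyzer.
(def analyzer
  (PerFieldAnalyzerWrapper. (StandardAnalyzer.)
                            {"id" (KeywordAnalyzer.)}))

;; `analysis` is a hypothetical alias for this library's namespace.
(analysis/doc->token-strings {:id "DOC-1" :body "Quick Brown Fox"} analyzer)
;; expected shape: same keys, with each value replaced by the analyzed text
```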
(doc->tokens doc analyzer)
Each field is analyzed into tokens. Params:
* doc: flat associative data structure
* analyzer: a Lucene Analyzer, but most likely you want a PerFieldAnalyzerWrapper
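A usage sketch (hypothetical `analysis` alias):

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; `analysis` is a hypothetical alias for this library's namespace.
(analysis/doc->tokens {:body "pre post"} (StandardAnalyzer.))
;; expected shape: a map with the same keys, where each value is a
;; sequence of token maps (:token, :type, :start_offset, :end_offset,
;; :position, :positionLength)
```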
(normalize text)
(normalize text analyzer)
(normalize text analyzer field-name)
Given a text, invokes Analyzer::normalize on it. Returns a String representation of a BytesRef.
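A sketch (hypothetical `analysis` alias). Normalization applies only the token filters that make sense for whole-string matching, e.g. lower-casing:

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; `analysis` is a hypothetical alias for this library's namespace.
;; StandardAnalyzer normalization lower-cases, so this likely
;; returns "hello" — no tokenization is applied.
(analysis/normalize "HeLLo" (StandardAnalyzer.) "some-field")
```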
(normalize-doc doc analyzer)
Normalizes each field with an analyzer. Params:
* doc: flat associative data structure
* analyzer: a Lucene Analyzer, but most likely you want a PerFieldAnalyzerWrapper
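A sketch (hypothetical `analysis` alias) normalizing a whole document at once:

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; `analysis` is a hypothetical alias for this library's namespace.
(analysis/normalize-doc {:title "HeLLo" :body "WoRLD"} (StandardAnalyzer.))
;; expected shape: same keys, each value run through Analyzer::normalize
;; (with StandardAnalyzer, lower-cased)
```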
(text->graph text)
(text->graph text analyzer)
(text->graph text analyzer field-name)
Given a text (and an optional analyzer) turns the text into a TokenStream that is then converted to a `dot` language program as a string, e.g.:
`digraph tokens {
graph [ fontsize=30 labelloc="t" label="" splines=true overlap=false rankdir = "LR" ];
// A2 paper size
size = "34.4,16.5";
edge [ fontname="Helvetica" fontcolor="red" color="#606060" ]
node [ style="filled" fillcolor="#e8e8f0" shape="Mrecord" fontname="Helvetica" ]
0 [label="0"] -1 [shape=point color=white] -1 -> 0 [] 0 -> 1 [ label="foobarbazs / fooBarBazs"] -2 [shape=point color=white] 1 -> -2 [] }`
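Since the return value is a `dot` program string, it can be written to a file and rendered with Graphviz (hypothetical `analysis` alias):

```clojure
;; `analysis` is a hypothetical alias for this library's namespace.
;; Write the dot program to a file...
(spit "tokens.dot" (analysis/text->graph "fooBarBazs"))
;; ...then render it from a shell with Graphviz:
;;   dot -Tsvg tokens.dot -o tokens.svg
```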
(text->token-strings text)
(text->token-strings text analyzer)
(text->token-strings text analyzer field-name)
Given a text (and an optional analyzer) returns a vector of tokens as strings. Params:
* text: String
* analyzer: Lucene Analyzer
* field-name: either a String or a clojure.lang.Named (e.g. a keyword)
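A sketch (hypothetical `analysis` alias) passing the field name as a keyword, which implements clojure.lang.Named:

```clojure
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; `analysis` is a hypothetical alias for this library's namespace.
;; StandardAnalyzer lower-cases and drops punctuation, so something
;; like ["foo" "bar" "baz"] is the expected result.
(analysis/text->token-strings "Foo, Bar & Baz!" (StandardAnalyzer.) :my-field)
```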
(text->tokens text)
(text->tokens text analyzer)
(text->tokens text analyzer field-name)
Given a text (and an optional analyzer) returns a list of tokens as maps of shape: {:token "pre", :type "<ALPHANUM>", :start_offset 0, :end_offset 3, :position 0, :positionLength 1} Params:
* text: String
* analyzer: Lucene Analyzer
* field-name: either a String or a clojure.lang.Named (e.g. a keyword)
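A sketch using the single-arity form with the default analyzer (hypothetical `analysis` alias):

```clojure
;; `analysis` is a hypothetical alias for this library's namespace.
(analysis/text->tokens "pre post")
;; per the docstring, the first token map looks like:
;; {:token "pre", :type "<ALPHANUM>", :start_offset 0, :end_offset 3,
;;  :position 0, :positionLength 1}
```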