
lucene.custom.text-analysis


doc->graph

(doc->graph doc analyzer)

Each field is analyzed into a token graph. Params:

  • doc: flat associative data type (e.g. a map of field name to text)
  • analyzer: a Lucene Analyzer, but probably you want a PerFieldAnalyzerWrapper
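A minimal usage sketch, assuming Lucene and this namespace are on the classpath (the `analysis` alias and the sample doc are illustrative assumptions; `StandardAnalyzer` is a standard Lucene class):

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; A doc is just a flat map of field name to text.
(def doc {:title "Hello Lucene" :body "token graphs"})

;; Returns a map with the same keys, each value describing
;; that field's token graph.
(analysis/doc->graph doc (StandardAnalyzer.))
```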

doc->token-strings

(doc->token-strings doc analyzer)

Given a document, iterates through all its fields, applies an analyzer to each field, and returns a map with the same keys and the analyzed text. TIP: the analyzer should probably be a PerFieldAnalyzerWrapper.
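A sketch of the PerFieldAnalyzerWrapper tip, assuming Lucene on the classpath; the field-to-analyzer mapping shown here is an illustrative assumption:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer)
        '(org.apache.lucene.analysis.core WhitespaceAnalyzer)
        '(org.apache.lucene.analysis.miscellaneous PerFieldAnalyzerWrapper))

;; Analyze the "title" field with a whitespace analyzer,
;; and every other field with the StandardAnalyzer default.
(def per-field
  (PerFieldAnalyzerWrapper.
    (StandardAnalyzer.)
    {"title" (WhitespaceAnalyzer.)}))

;; Returns a map with the same keys and vectors of token strings.
(analysis/doc->token-strings {:title "Foo Bar" :body "BAZ quux"} per-field)
```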

doc->tokens

(doc->tokens doc analyzer)

Each field is analyzed into tokens. Params:

  • doc: flat associative data type (e.g. a map of field name to text)
  • analyzer: a Lucene Analyzer, but probably you want a PerFieldAnalyzerWrapper
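Same calling convention as `doc->token-strings`, but yielding token data rather than plain strings; a hedged sketch assuming a StandardAnalyzer:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; Returns a map with the same keys, each value a sequence of
;; token maps (term, type, offsets, position).
(analysis/doc->tokens {:title "Hello Lucene"} (StandardAnalyzer.))
```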

normalize

(normalize text)
(normalize text analyzer)
(normalize text analyzer field-name)

Given a text, invokes Analyzer::normalize on it. Returns a String representation of a BytesRef.
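A sketch, assuming Lucene on the classpath. Lucene's `Analyzer::normalize` applies only the analyzer's normalizing filters (e.g. lowercasing for StandardAnalyzer) without tokenizing the input:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; Normalizes the whole text as a single term; with StandardAnalyzer
;; this should lowercase it without splitting on whitespace.
(analysis/normalize "HeLLo World" (StandardAnalyzer.))
```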

normalize-doc

(normalize-doc doc analyzer)

Normalizes each field with an analyzer. Params:

  • doc: flat associative data type (e.g. a map of field name to text)
  • analyzer: a Lucene Analyzer, but probably you want a PerFieldAnalyzerWrapper
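The per-document counterpart of `normalize`; a hedged sketch assuming a StandardAnalyzer for all fields:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; Returns a map with the same keys and each field's normalized text.
(analysis/normalize-doc {:title "HeLLo" :body "WoRLD"} (StandardAnalyzer.))
```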

text->graph

(text->graph text)
(text->graph text analyzer)
(text->graph text analyzer field-name)

Given a text (and an optional analyzer) turns the text into a TokenStream that is rendered as a program in the `dot` graph language, e.g.:

digraph tokens {
  graph [ fontsize=30 labelloc="t" label="" splines=true overlap=false rankdir = "LR" ];
  // A2 paper size
  size = "34.4,16.5";
  edge [ fontname="Helvetica" fontcolor="red" color="#606060" ]
  node [ style="filled" fillcolor="#e8e8f0" shape="Mrecord" fontname="Helvetica" ]

  0 [label="0"]
  -1 [shape=point color=white]
  -1 -> 0 []
  0 -> 1 [ label="foobarbazs / fooBarBazs"]
  -2 [shape=point color=white]
  1 -> -2 []
}
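Since the return value is a `dot` program as a string, it can be written to a file and rendered with Graphviz; a sketch (file name is an assumption):

```clojure
(require '[lucene.custom.text-analysis :as analysis])

;; Write the token graph of a text to a .dot file,
;; using the namespace's default analyzer (single-arity call).
(spit "tokens.dot" (analysis/text->graph "fooBarBazs"))

;; Then render it with Graphviz from a shell:
;;   dot -Tpng tokens.dot -o tokens.png
```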

text->token-strings

(text->token-strings text)
(text->token-strings text analyzer)
(text->token-strings text analyzer field-name)

Given a text (and an optional analyzer) returns a vector of tokens as strings. Params:

  • text: String
  • analyzer: a Lucene Analyzer
  • field-name: either a String or a clojure.lang.Named (e.g. a keyword or symbol)
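A sketch of the three arities, assuming Lucene on the classpath. The `field-name` argument matters when the analyzer is a PerFieldAnalyzerWrapper, since Lucene analysis is field-aware:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; Default analyzer.
(analysis/text->token-strings "Hello, World!")

;; Explicit analyzer.
(analysis/text->token-strings "Hello, World!" (StandardAnalyzer.))

;; Explicit analyzer and field name; a keyword works because
;; field-name may be a clojure.lang.Named.
(analysis/text->token-strings "Hello, World!" (StandardAnalyzer.) :title)
```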

text->tokens

(text->tokens text)
(text->tokens text analyzer)
(text->tokens text analyzer field-name)

Given a text (and an optional analyzer) returns a list of tokens as maps of shape:

{:token "pre",
 :type "<ALPHANUM>",
 :start_offset 0,
 :end_offset 3,
 :position 0,
 :positionLength 1}

Params:

  • text: String
  • analyzer: a Lucene Analyzer
  • field-name: either a String or a clojure.lang.Named (e.g. a keyword or symbol)
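A sketch showing how the token maps can be inspected, assuming Lucene on the classpath; the offsets in the docstring's example correspond to the token's character span in the input:

```clojure
(require '[lucene.custom.text-analysis :as analysis])
(import '(org.apache.lucene.analysis.standard StandardAnalyzer))

;; Pull out just the term and its character span from each token map.
(->> (analysis/text->tokens "pre post" (StandardAnalyzer.))
     (map (juxt :token :start_offset :end_offset)))
```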

cljdoc is a website building & hosting documentation for Clojure/Script libraries
