(index-keys-from-string s)
(index-keys-from-string s valid-word-fn)
Creates a collection of prepped and valid words from a string. Users may provide their own `valid-word-fn`.
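A custom `valid-word-fn` might look like the sketch below. The name `long-alpha-word?` and its acceptance rule are hypothetical, chosen only for illustration. The one contract taken from this documentation (see `text-index`) is that the function takes a single word as a string and returns a boolean.

```clojure
;; Hypothetical validator (not part of the library): accepts only purely
;; alphabetic words that are at least three characters long.
(defn long-alpha-word?
  [word]
  (boolean (and (string? word)
                (re-matches #"[A-Za-z]{3,}" word))))

(long-alpha-word? "world") ;; => true
(long-alpha-word? "1")     ;; => false

;; Per the signature above, such a function would be passed as the second
;; argument: (index-keys-from-string s long-alpha-word?)
```

A stricter or looser rule can be swapped in freely, as long as the function keeps the one-argument, boolean-returning shape.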
(index-map-from-doc {:keys [id content]} & opts-map)
Builds an index map from a document. A document is a map with two keys: `:id` and `:content`.

The `:id` is the unique identifier for the document, which users can use during search to retrieve the actual document. The `:content` key holds the string whose words will be indexed.

Users may provide an opts-map with the keys `:maintain-actual?` and `:valid-word-fn`.

- When `:maintain-actual?` is `true`, the actual indexed words are saved along with the encoded form of the words.
- The `:valid-word-fn` is a custom word validator that users may provide.

Note that maintaining actual words will consume additional space.

Sample input:

```
(index-map-from-doc {:id 1 :content "World War 1"} {:maintain-actual? true})
```

Sample output:

```
{"W643" [{:id 1, :actuals #{"world"}, :frequency 1}]
 "W600" [{:id 1, :actuals #{"war"}, :frequency 1}]}
```

The `:id` is the same as supplied by the user. The value of `:frequency` is the frequency of the word in the `:content` string.
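Because the result is a plain Clojure map from encoded keys to vectors of entry maps, it can be inspected with ordinary core functions. The following is a minimal sketch that uses the sample output as literal data; the names `sample-index` and `doc-ids-for-code` are hypothetical and not part of the library.

```clojure
;; The sample output shown above, bound to a name for illustration.
(def sample-index
  {"W643" [{:id 1, :actuals #{"world"}, :frequency 1}]
   "W600" [{:id 1, :actuals #{"war"}, :frequency 1}]})

;; Hypothetical helper: the ids of all documents listed under an encoded key.
;; Returns an empty vector when the key is absent from the index.
(defn doc-ids-for-code
  [index code]
  (mapv :id (get index code)))

(doc-ids-for-code sample-index "W643") ;; => [1]
(doc-ids-for-code sample-index "X000") ;; => []
```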
(prep-string s)
This is the default string sanitization function.
(prep-string-coll str-coll & valid-word-fn)
Creates a collection of prepped and valid words. Input is a collection of strings. Users may provide their own `valid-word-fn`.
(text-index doc-coll & opts-map)
Builds the final index map from a collection of documents. A document is a map with two keys: `:id` and `:content`.

The `:id` is the unique identifier for the document, which users can use during search to retrieve the actual document. The `:content` key holds the string whose words will be indexed.

Users may provide an opts-map with the keys `:maintain-actual?` and `:valid-word-fn`.

- When `:maintain-actual?` is `true`, the actual indexed words are saved along with the encoded form of the words.
- The value of `:valid-word-fn` is a custom word validator that users may provide. It is a single-arity fn that takes one word (a string) and returns a boolean.

Note that maintaining actual words will consume additional space.

Sample input:

```
(text-index [{:id 1 :content "World war 1"}
             {:id 2 :content "Independence for the world"}]
            {:maintain-actual? true})
```

Sample output:

```
{"W643" [{:id 1, :actuals #{"world"}, :frequency 1}
         {:id 2, :actuals #{"world"}, :frequency 1}]
 "W600" [{:id 1, :actuals #{"war"}, :frequency 1}]
 "I531" [{:id 2, :actuals #{"independence"}, :frequency 1}]}
```

The `:id` is the same as supplied by the user. The value of `:frequency` is the frequency of the word in the `:content` string.
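The relationship between per-document index maps and the final index can be pictured with `merge-with`. This is only a sketch of one way maps of this shape could be combined, not necessarily how the library does it internally. The names `doc-1-index`, `doc-2-index` and `combine-index-maps` are hypothetical, and the two input maps are written by hand to mirror the sample data above.

```clojure
;; Hand-written per-document maps in the shape that index-map-from-doc returns.
(def doc-1-index
  {"W643" [{:id 1, :actuals #{"world"}, :frequency 1}]
   "W600" [{:id 1, :actuals #{"war"}, :frequency 1}]})

(def doc-2-index
  {"W643" [{:id 2, :actuals #{"world"}, :frequency 1}]
   "I531" [{:id 2, :actuals #{"independence"}, :frequency 1}]})

;; Hypothetical helper: when two maps share an encoded key, their entry
;; vectors are concatenated in argument order; other keys pass through.
(defn combine-index-maps
  [index-maps]
  (apply merge-with into {} index-maps))

(combine-index-maps [doc-1-index doc-2-index])
;; => {"W643" [{:id 1, :actuals #{"world"}, :frequency 1}
;;             {:id 2, :actuals #{"world"}, :frequency 1}]
;;     "W600" [{:id 1, :actuals #{"war"}, :frequency 1}]
;;     "I531" [{:id 2, :actuals #{"independence"}, :frequency 1}]}
```

The combined value has the same shape as the `text-index` sample output: one vector of entries per encoded key, with one entry per document that contains a matching word.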
(valid-word? s)
This is the default word validation function.