zensols.nlparse.stopword

This namespace provides ways of filtering stop word tokens.

To avoid double negatives in function names, go words are defined as the complement of the stop words: every token in the vocabulary that is not a stop word. Functions like go-word? tell whether or not a token is a go word. A token is a stop word if it is any of:

  • stopwords (predefined list)
  • punctuation
  • numbers
  • non-alphabetic characters
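
The idea of a go word can be illustrated with a minimal, self-contained sketch. It does not use this library's implementation: the `stopwords` set, the `:text` token key, and the name `sketch-go-word?` are all assumptions made for illustration, and the ASCII-only regular expression is a simplification.

```clojure
(require '[clojure.string :as s])

;; Hypothetical stop word list; the library has its own predefined list.
(def stopwords #{"the" "a" "an" "of" "is" "to"})

(defn sketch-go-word?
  "Illustrative predicate: true when the token text is purely alphabetic
  (which excludes punctuation, numbers and other non-alphabetic tokens)
  and is not in the stop word list."
  [token]
  (let [text (s/lower-case (:text token))]
    (boolean (and (re-matches #"[a-z]+" text)
                  (not (contains? stopwords text))))))

;; Tokens are assumed to be maps with a :text key.
(def tokens [{:text "The"} {:text "dog"} {:text ","} {:text "42"} {:text "runs"}])

(filter sketch-go-word? tokens)
;; => ({:text "dog"} {:text "runs"})
```

Naming the predicate for what is kept, rather than what is dropped, lets it be passed straight to `filter` without wrapping it in `complement`.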

*stopword-config* (clj)

Configuration for filtering stop words.

Keys

  • :post-tags POS tags for go words (see namespace docs)
  • :word-form-fn function run on the token in go-word-form; for example, if #(-> % :lemma s/lower-case) is given, then lemmatization is used (e.g. Running -> run)
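
The effect of rebinding :word-form-fn can be sketched with a stand-in dynamic var. This is an illustration only: `*sketch-config*`, `sketch-word-form`, the `:text` token key, and the default function shown here are assumptions, not the library's definitions.

```clojure
(require '[clojure.string :as s])

;; Hypothetical stand-in for the library's dynamic configuration var.
;; The default used here (lower-cased :text) is an assumption.
(def ^:dynamic *sketch-config*
  {:word-form-fn #(-> % :text s/lower-case)})

(defn sketch-word-form
  "Apply the currently bound :word-form-fn to a token."
  [token]
  ((:word-form-fn *sketch-config*) token))

;; A token is assumed to be a map carrying both :text and :lemma.
(def token {:text "Running" :lemma "run"})

(sketch-word-form token)
;; => "running"

;; Rebinding the config for the dynamic scope switches to lemmatization.
(binding [*sketch-config*
          (assoc *sketch-config* :word-form-fn #(-> % :lemma s/lower-case))]
  (sketch-word-form token))
;; => "run"
```

Because the var is dynamic, the override applies only inside the `binding` form and on the current thread; code outside it keeps the default.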

go-word-form (clj)

(go-word-form token)

Canonical string form of a token, as used for word counts.


go-word-forms (clj)

(go-word-forms tokens)

Filter tokens per go-word? and return their form based on go-word-form.
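
The filter-then-map shape described above, and how the resulting forms feed a word count, might look like the following sketch. All names prefixed with `sketch-`, the `stopwords` set, and the `:text` and `:lemma` token keys are hypothetical stand-ins rather than the library's own code.

```clojure
(require '[clojure.string :as s])

;; Hypothetical stop word list for the example.
(def stopwords #{"the" "a"})

(defn sketch-go-word?
  "Stand-in predicate: alphabetic text that is not a stop word."
  [token]
  (let [text (s/lower-case (:text token))]
    (boolean (and (re-matches #"[a-z]+" text)
                  (not (contains? stopwords text))))))

(defn sketch-go-word-form
  "Stand-in form function: the lower-cased lemma."
  [token]
  (s/lower-case (:lemma token)))

(defn sketch-go-word-forms
  "Keep only go words, then map each one to its form."
  [tokens]
  (->> tokens
       (filter sketch-go-word?)
       (map sketch-go-word-form)))

(def tokens
  [{:text "The" :lemma "the"} {:text "Dogs" :lemma "dog"}
   {:text "chased" :lemma "chase"} {:text "a" :lemma "a"}
   {:text "dog" :lemma "dog"} {:text "." :lemma "."}])

(frequencies (sketch-go-word-forms tokens))
;; => {"dog" 2, "chase" 1}
```

Mapping to a single form before counting is what makes "Dogs" and "dog" fall into the same bucket.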


go-word? (clj)

(go-word? token)

Return whether a token is a go word, i.e. not a stop word.

