Functions for building a CoreNLP pipeline and extracting text annotations.

The functions are designed to be chained using the threading macro or through function composition. Please note that *any* annotation can be accessed using the basic `annotation` function; you are not limited to the convenience functions otherwise provided in this namespace.

The functions here mirror the annotation system of Stanford CoreNLP: once the return value isn't an instance of TypesafeMap or a seq of TypesafeMap objects, the annotation functions cannot retrieve anything from it. One example of this is `dependency-graph`, which returns a SemanticGraph object.

As a general rule, functions with pluralised names have a seqable output, e.g. `sentences` or `tokens`. This does not matter when chaining these functions, as all of the annotation functions will implicitly map to seqs.
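
The implicit mapping over seqs can be pictured with a toy model. The sketch below is not the library's implementation: plain Clojure maps stand in for TypesafeMap objects, and `annotation-fn`, `sentences*`, `tokens*` and `text*` are hypothetical names invented purely for illustration.

```clojure
;; Toy model only: plain maps stand in for CoreNLP's TypesafeMap objects.
;; All names defined here are hypothetical, not part of the library.
(defn annotation-fn
  "Return a function that looks up k in a map, or in every element of a
  (possibly nested) seq of maps."
  [k]
  (fn look-up [x]
    (if (sequential? x)
      (map look-up x)   ; implicit mapping over seqs
      (get x k))))

(def sentences* (annotation-fn :sentences))
(def tokens*    (annotation-fn :tokens))
(def text*      (annotation-fn :text))

;; Made-up example data shaped like a document with two sentences.
(def document
  {:sentences [{:tokens [{:text "Hello"} {:text "world"}]}
               {:tokens [{:text "Bye"}]}]})

;; Chaining with the threading macro: each step maps over the seqs
;; produced by the previous one.
(def token-texts
  (->> document sentences* tokens* text*))
```

In this model, `token-texts` is a seq of seqs of strings, one inner seq per sentence, even though none of the chained functions was explicitly mapped.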
Functions dealing with dependency grammar graphs, AKA Semantic Graphs.

CoreNLP contains some duplicate field and method names, e.g. governor is the same as source. This namespace only retains a single name for these terms.

Some easily replicated convenience function cruft has also not been retained:

- matchPatternToVertex
- variations on basic graph functionality, e.g. getChildList
- isNegatedVerb, isNegatedVertex, isInConditionalContext, etc.
- getSubgraphVertices, yield seem equal in functionality to descendants

Nor have any useless utility functions that are easily replicated:

- toRecoveredSentenceString and the like
- empty, size
- sorting methods; just use Clojure sort, e.g. (sort (vertices g))

The methods in SemanticGraphUtils are mostly meant for internal consumption, though a few are useful enough to warrant wrapping here, e.g. subgraph.

Functions dealing with semgrex in CoreNLP (dependency grammar patterns) have been wrapped so as to mimic the existing Clojure Core regex functions. The `sem-result` function also mimics re-groups and serves a similar purpose, although rather than returning groups it returns named nodes/relations defined in the pattern.

Additionally, any mutating functions have deliberately not been wrapped!
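
For readers less familiar with the Clojure Core regex functions being mimicked, the sketch below walks through the pattern, matcher, find and groups workflow on an ordinary string. It uses only `clojure.core` and made-up input; it does not call the semgrex wrappers themselves, which are described above only as analogous.

```clojure
;; Plain clojure.core regex workflow (no CoreNLP involved).
(def pattern (re-pattern "(\\w+)=(\\d+)"))
(def matcher (re-matcher pattern "width=10 height=20"))

;; re-find advances the matcher to the next match and returns it.
(def first-match (re-find matcher))

;; re-groups returns the groups of the most recent match.
(def groups (re-groups matcher))

;; re-seq returns a lazy seq of all matches in the string.
(def all-matches (re-seq pattern "width=10 height=20"))
```

Here `first-match` and `groups` are both `["width=10" "width" "10"]`. In the analogy, `sem-result` plays the role of `re-groups`, returning named nodes/relations instead of positional groups.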
This namespace contains an implementation of `loom.io/view` that uses `dk.simongray.datalinguist.dependency/formatted-string` instead of `loom.io/dot-str`.

The main reason for this is that `loom.io/render-to-bytes` explicitly requires graphs made of pure data, so the function won't work with e.g. SemanticGraph even though the class satisfies loom's Graph protocol.

See `loom.io` for documentation of the functions.
Everything to do with trees, chiefly of the constituency grammar kind.

Functions dealing with tregex in CoreNLP (constituency grammar patterns) have been wrapped so as to mimic the existing Clojure Core regex functions. The `tregex-result` function also mimics re-groups and serves a similar purpose, although rather than returning groups it returns named nodes defined in the pattern.
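
The difference between positional groups and named results can be illustrated with named capture groups in an ordinary Java regex. This is only an analogy built on `java.util.regex`, with made-up input and hypothetical group names; it does not use tregex and says nothing about the shape of what `tregex-result` actually returns.

```clojure
;; Plain java.util.regex analogy (no CoreNLP involved): named groups are
;; looked up by name rather than by position.
(def pattern (re-pattern "(?<key>\\w+)=(?<value>\\d+)"))
(def ^java.util.regex.Matcher matcher (re-matcher pattern "width=10"))

;; Collect the named parts of the first match into a map.
(def named-result
  (when (.find matcher)
    {:key   (.group matcher "key")
     :value (.group matcher "value")}))
```

Here `named-result` is `{:key "width", :value "10"}`: the parts of the match are addressed by the names given in the pattern, much as named nodes are in a tregex pattern.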
Functions dealing with (subject; relation; object) triples.
Various utility functions used from the other namespaces, along with collections of more or less static data.