A Clojure library providing the natural language processing functions of Stava. It currently provides access to the Swedish compound analyser and the Swedish part-of-speech tagger.
It depends entirely on a native library, which is supplied pre-compiled only for GNU/Linux. You may be able to build the native library for other operating systems.
This library wraps the Stava program with a Java and Clojure API. Stava was created by Viggo Kann (viggo@csc.kth.se) and Joachim Hollman (joachim@algoritmica.se). Stava is licensed under the GPL.
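The basic idea of this library is:
1. Build the native library with the GNU Makefile (src/c/Makefile).
2. Wrap Stava in a minimal C wrapper (src/c/libwrapper.c).
3. Generate Java classes with SWIG (definition src/c/JavaStava.i and target directory src/java/jobtech/).
4. Wrap the Java classes in Clojure (src/clj/).
├── src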
│ ├── c - a minimal C-wrapper and a SWIG definition, with a GNU Makefile.
│ ├── java - SWIG-generated Java classes, wrapping the C functions.
│ └── clj - Clojure wrappers for the Java classes.
├── test
│ └── clj - Clojure tests.
The compiled native library and the Stava resources are both placed in resources/ by the build system. This directory is included in the target JAR artefact.
First, edit src/c/Makefile and check the include paths for OpenJDK. They should point to the folders containing jni.h and jni_md.h.
Then invoke the build system like this:
lein build-lib
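If it is unclear which folders the include paths in src/c/Makefile should point to, a small shell helper can list them. The sketch below is not part of this project: jni_include_dirs is a hypothetical name, and it assumes the usual OpenJDK layout, where jni.h sits in include/ and jni_md.h in a platform subfolder such as include/linux/.

```shell
# Hypothetical helper (not part of this project): print the directories
# under a JDK home that contain jni.h and jni_md.h.
# Assumes the usual OpenJDK layout: <jdk>/include and <jdk>/include/<platform>.
jni_include_dirs() {
    jdk_home="$1"
    # Find both header files, print the directory of each, drop duplicates.
    find "$jdk_home/include" \( -name jni.h -o -name jni_md.h \) -type f \
        -exec dirname {} \; | sort -u
}
```

Calling it as jni_include_dirs "$JAVA_HOME" (assuming JAVA_HOME points at the OpenJDK installation) should print one directory per line. Compare these with the include paths in the Makefile before running the build.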
To release a new version, first update the version number wherever it occurs in this project's files (a future version will provide an automatic way to do this).
Then re-build the library. Finally, upload it to Clojars:
lein deploy clojars
Invoke the automatic tests like this:
lein test
Developers with a local Docker host can also use an extra test script, which runs all tests in a clean environment:
sh bin/test-repl.sh
Example REPL session (after building the library with lein build-lib):
$ lein repl
user=> (load-file "src/clj/jobtech_nlp_stava/compound_splitter/stava.clj")
#'jobtech-nlp-stava.compound-splitter.stava/split
user=> (jobtech-nlp-stava.compound-splitter.stava/split "turbofläkt")
(["turbo" "fläkt"])
user=> (load-file "src/clj/jobtech_nlp_stava/tagger/stava.clj")
#'jobtech-nlp-stava.tagger.stava/tag
user=> (jobtech-nlp-stava.tagger.stava/tag "turbofläkten")
(nn.utr.sin.def.nom)
Copyright © 2019
This project is licensed under the terms of the GPL v2 license.