jobtech-nlp-stava

A Clojure library providing the natural language processing functions of Stava. It currently provides access to the Swedish compound analyser and the Swedish part-of-speech tagger.

It depends entirely on a native library, which is supplied pre-compiled only for GNU/Linux. You may be able to build the native library yourself for other operating systems.
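
Because the bundled binary targets GNU/Linux only, callers may want to check the platform before relying on the library. The following is a minimal sketch, assuming the JVM's standard `os.name` system property is a good enough signal. The helper names `linux-name?` and `linux?` are hypothetical and not part of this library:

```clojure
(require '[clojure.string :as str])

(defn linux-name?
  "True when the given os.name value looks like Linux."
  [os-name]
  (str/includes? (str/lower-case (str os-name)) "linux"))

(defn linux?
  "True when the current JVM reports a Linux operating system."
  []
  (linux-name? (System/getProperty "os.name")))

;; Warn early instead of failing later when the native library cannot load.
(when-not (linux?)
  (println "Warning: no pre-compiled Stava native library for this platform."))
```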

Components

This library wraps the Stava program with a Java and Clojure API. Stava was created by Viggo Kann (viggo@csc.kth.se) and Joachim Hollman (joachim@algoritmica.se). Stava is licensed under the GPL.

Structure

The basic idea of this library is:

  1. Stava is downloaded and compiled (by src/c/Makefile).
  2. A thin C wrapper defines the functions to export (in src/c/libwrapper.c).
  3. SWIG generates Java wrappers for the exported C functions (the definitions are in src/c/JavaStava.i, and the generated classes are written to src/java/jobtech/).
  4. A Clojure API loads the Java classes and initialises Stava with its resources (src/clj/).

├── src
│   ├── c    - a minimal C wrapper and a SWIG definition, with a GNU Makefile.
│   ├── clj  - Clojure wrappers for the Java classes.
│   └── java - SWIG-generated Java classes, wrapping the C functions.
└── test
    └── clj  - Clojure tests.

The build system places both the compiled native library and the Stava resources in resources/. This directory is included in the target JAR artefact.

Build

Requirements

  • GNU Wget
  • OpenJDK
  • GNU Make
  • GCC
  • SWIG
  • Leiningen

Invoking the build system

First, edit src/c/Makefile, and check the include paths for OpenJDK. They should point to the folders containing jni.h and jni_md.h.

Then invoke the build system like this:

lein build-lib

Publishing a new version on Clojars

First, update the version number wherever it occurs in this project's files (a future version will provide an automatic way to do this).

Then re-build the library. Finally, upload it to Clojars:

lein deploy clojars

Testing

Invoke the automatic tests like this:

lein test

Developers with a local Docker host can also use an extra test script that runs all tests in a clean environment:

sh bin/test-repl.sh

Usage

Example usage from a Clojure REPL (assumes a prior lein build-lib):

$ lein repl
user=> (load-file "src/clj/jobtech_nlp_stava/compound_splitter/stava.clj")
#'jobtech-nlp-stava.compound-splitter.stava/split
user=> (jobtech-nlp-stava.compound-splitter.stava/split "turbofläkt")
(["turbo" "fläkt"])
user=> (load-file "src/clj/jobtech_nlp_stava/tagger/stava.clj")
#'jobtech-nlp-stava.tagger.stava/tag
user=> (jobtech-nlp-stava.tagger.stava/tag "turbofläkten")
(nn.utr.sin.def.nom)
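
The results above are plain Clojure data, so they can be post-processed with ordinary sequence and string functions. The following is a minimal sketch that does not call the native library. It assumes, based only on the example output, that a tag is a dot-separated string (or symbol) whose first element is the part of speech. The helper names `tag-features` and `hyphenate` are hypothetical and not part of this library:

```clojure
(require '[clojure.string :as str])

(defn tag-features
  "Splits a dot-separated tag such as \"nn.utr.sin.def.nom\" into its parts.
  Assumes the first element is the part of speech."
  [tag]
  (let [[pos & features] (str/split (str tag) #"\.")]
    {:pos pos :features (vec features)}))

(defn hyphenate
  "Joins the parts of one compound analysis with hyphens."
  [parts]
  (str/join "-" parts))

(tag-features "nn.utr.sin.def.nom")
;; => {:pos "nn", :features ["utr" "sin" "def" "nom"]}

(map hyphenate [["turbo" "fläkt"]])
;; => ("turbo-fläkt")
```

Converting the tag with `str` first lets the same helper accept either a string or a symbol, since the REPL output alone does not show which type the tagger returns.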

License

Copyright © 2019

This project is licensed under the terms of the GPL v2 license.
