WARNING: Alpha Software Subject to Change
A Clojure DSL to query in-memory triple models with a SPARQL-like language. Matcha provides simple BGP (Basic Graph Pattern) style queries on in-memory graphs of linked data triples.
Whilst Matcha is intended to query RDF models, it can also be used to query arbitrary Clojure data, so long as it consists of Clojure values stored in 3-tuple vectors. Each element of a triple is assumed to follow Clojure value equality semantics.
The primary use case for Matcha is to make handling graphs of RDF data easy by querying them with SPARQL-like queries. A typical workflow is to CONSTRUCT data from a backend SPARQL query, and then use Matcha to query this graph locally.
You can index your data ahead of time with index-triples. Matcha needs data to be indexed before it can be queried; if your data is unindexed, Matcha will index it before running the query and then dispose of the index. This can lead to poor performance when you want to query the same set of data multiple times.
The initial implementation is macro heavy. This means use cases where you want to dynamically create in-memory queries may be more awkward.
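To illustrate the indexing point above, here is a minimal sketch of indexing a dataset once and reusing the index across several queries. It assumes that the query functions and index-triples live in the grafter.matcha.alpha namespace, and that index-triples takes a collection of triples and returns a value the query functions accept in place of the raw triples; check both against the version you have installed. The data and the query names are purely illustrative.

```clojure
;; Assumption: Matcha's public API is in grafter.matcha.alpha.
(require '[grafter.matcha.alpha :refer [select ask index-triples]])

(def friends [[:rick :rdfs/label "Rick"]
              [:martin :rdfs/label "Martin"]
              [:katie :rdfs/label "Katie"]
              [:rick :foaf/knows :martin]
              [:rick :foaf/knows :katie]])

;; Build the index once, up front (assumed call shape: one collection argument).
(def friends-index (index-triples friends))

;; Two illustrative queries that will share the same index.
(def rick-knows (select [?name]
                  [[:rick :foaf/knows ?p2]
                   [?p2 :rdfs/label ?name]]))

(def any-labels? (ask [[?s :rdfs/label ?label]]))

;; Both calls reuse friends-index rather than re-indexing friends each time.
(rick-knows friends-index)
(any-labels? friends-index)
```

Passing the raw friends vector to either query should give the same answers, at the cost of building and discarding an index on every call.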
Currently there is no support for the following SPARQL-like features: BIND or VALUES clauses.
Matcha defines some primary query functions: select, select-1, construct, construct-1 and ask.
First let's define an in-memory database of triples. In reality this could come from a SPARQL CONSTRUCT query, but here we'll just define some RDF-like data inline.
Triples can be vectors of Clojure values or any data structure that supports positional destructuring via clojure.lang.Indexed, which allows Matcha to work with grafter.rdf.protocols.Statement records. Matcha works with any Clojure values in the triples, be they Java URIs or Clojure keywords.
(def friends [[:rick :rdfs/label "Rick"]
              [:martin :rdfs/label "Martin"]
              [:katie :rdfs/label "Katie"]
              [:julie :rdfs/label "Julie"]
              [:rick :foaf/knows :martin]
              [:rick :foaf/knows :katie]
              [:katie :foaf/knows :julie]])
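As a rough illustration of what "positional destructuring via clojure.lang.Indexed" means, the sketch below defines a hypothetical Triple record (not Grafter's Statement, and not part of Matcha) that implements Indexed, so it destructures positionally just like a 3-element vector. It uses only clojure.core and does not show Matcha consuming the record.

```clojure
;; Hypothetical helper: positional lookup over the three fields of a triple.
(defn- triple-nth [s p o i not-found]
  (case (long i)
    0 s
    1 p
    2 o
    not-found))

;; Hypothetical record for illustration only.
;; For brevity the one-argument nth returns nil when out of range
;; instead of throwing.
(defrecord Triple [s p o]
  clojure.lang.Indexed
  (nth [_ i] (triple-nth s p o i nil))
  (nth [_ i not-found] (triple-nth s p o i not-found)))

(def statement (->Triple :rick :foaf/knows :martin))

;; Sequential destructuring works because the record is Indexed.
(let [[s p o] statement]
  [s p o])
;; => [:rick :foaf/knows :martin]

;; Records also follow Clojure value equality semantics.
(= statement (->Triple :rick :foaf/knows :martin))
;; => true
```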
Now we can build our query functions:
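There are two main concepts to Matcha queries. They typically define a projection, which states which variables to return and in what structure, and a Basic Graph Pattern (BGP), which describes the pattern to match in the graph.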
BGPs have some semantics you need to be aware of: symbols beginning with ? are treated specially as query variables.
select
select compiles a query function from your arguments that returns results as a sequence of tuples. It is directly analogous to SPARQL's SELECT query:
(def rick-knows (select [?name]
                  [[:rick :foaf/knows ?p2]
                   [?p2 :rdfs/label ?name]]))
When called with two arguments, select expects the first argument to be a vector of variables to project into the solution sequence. The second argument is analogous to a SPARQL WHERE clause and should be a BGP.
We can then run the query like so:
(rick-knows friends) ;; ["Martin" "Katie"]
There is also select-1, which is just like select but returns just the first solution.
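As a small sketch of select-1, the query below asks for the label of someone Katie knows. It assumes the query macros live in the grafter.matcha.alpha namespace; the query name katie-knows-1 is illustrative. In this data Katie knows only one person, so there is a single solution to return.

```clojure
;; Assumption: Matcha's public API is in grafter.matcha.alpha.
(require '[grafter.matcha.alpha :refer [select-1]])

(def friends [[:katie :rdfs/label "Katie"]
              [:julie :rdfs/label "Julie"]
              [:katie :foaf/knows :julie]])

;; Same argument shape as select: a projection vector, then a BGP.
(def katie-knows-1 (select-1 [?name]
                     [[:katie :foaf/knows ?p]
                      [?p :rdfs/label ?name]]))

;; Returns one solution rather than a sequence of solutions.
(katie-knows-1 friends)
```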
construct
CONSTRUCTs are the most powerful query type, as they allow you to construct arbitrary Clojure data structures directly from your query results, and position the projected query variables wherever you want within the structure.
(def query (construct {:grafter.rdf/uri :rick
                       :foaf/knows {:grafter.rdf/uri ?p
                                    :rdfs/label ?name}}
             [[:rick :foaf/knows ?p]
              [?p :rdfs/label ?name]]))
Produces:
{:grafter.rdf/uri :rick
 :foaf/knows #{{:grafter.rdf/uri :martin, :rdfs/label "Martin"}
               {:grafter.rdf/uri :katie, :rdfs/label "Katie"}}}
Maps in a projection that contain the special key :grafter.rdf/uri trigger extra behaviour: the query engine groups solutions by subject and merges values into Clojure sets. For example, in the above query you'll notice that foaf:knows groups its solutions. If you don't want these maps to be grouped, don't include the magic key :grafter.rdf/uri in the top level projection.
There is also construct-1, which is just like construct but returns only the first solution.
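A minimal sketch of construct-1 follows, again assuming the grafter.matcha.alpha namespace. The projection is a plain map without the :grafter.rdf/uri key, so no grouping is requested; the :name key and the rick-card name are illustrative choices, not part of Matcha.

```clojure
;; Assumption: Matcha's public API is in grafter.matcha.alpha.
(require '[grafter.matcha.alpha :refer [construct-1]])

(def friends [[:rick :rdfs/label "Rick"]
              [:martin :rdfs/label "Martin"]
              [:rick :foaf/knows :martin]])

;; Project the matched label into an arbitrary map shape.
(def rick-card (construct-1 {:name ?name}
                 [[:rick :rdfs/label ?name]]))

;; Only one triple matches the pattern, so there is one solution to return.
(rick-card friends)
```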
See the unit tests for more examples, including examples that use Matcha with Grafter Statements and vocabularies.
ask
ask is the only query that doesn't specify an explicit projection. It accepts a BGP, like the other query types, and returns a boolean indicating whether any matches were found.
(def any-triples? (ask [[?s ?p ?o]]))
(any-triples? friends) ;; => true
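To show both outcomes of ask, the sketch below runs one pattern that matches the data and one that does not. It assumes the grafter.matcha.alpha namespace; the query names are illustrative.

```clojure
;; Assumption: Matcha's public API is in grafter.matcha.alpha.
(require '[grafter.matcha.alpha :refer [ask]])

(def friends [[:rick :rdfs/label "Rick"]
              [:julie :rdfs/label "Julie"]
              [:katie :rdfs/label "Katie"]
              [:rick :foaf/knows :katie]
              [:katie :foaf/knows :julie]])

;; Does Rick know anyone? The data contains a matching triple.
(def rick-knows-anyone? (ask [[:rick :foaf/knows ?anyone]]))

;; Does Julie know anyone? No triple has :julie as the subject of :foaf/knows.
(def julie-knows-anyone? (ask [[:julie :foaf/knows ?anyone]]))

(rick-knows-anyone? friends)
(julie-knows-anyone? friends)
```

Given the description above, the first call should report a match and the second should not.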
Matcha is intended to be used on modest sizes of data, typically thousands of triples, and usually no more than a few hundred thousand triples. Proper benchmarking hasn't yet been done but finding all solutions on a database of a million triples can be done on a laptop in less than 10 seconds. Query time scaling seems to be roughly linear with the database size.
Copyright © Swirrl IT Ltd 2018
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.