Design

Philosophy

Crux is an unbundled database, where parts are pluggable and can be swapped out for alternative implementations. We are attempting to follow the Unix philosophy of each part doing one thing well.

This document walks through some of the high-level Crux components.

ObjectStore

Crux contains an ObjectStore for storing and retrieving documents.

link:../src/crux/db.clj[role=include]

The main implementation is KvObjectStore, wrapped by CachedObjectStore.

Objects are stored against their hashed value.
The KvObjectStore wraps the lower-level KvStore.
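
The idea of storing objects against their hash, with a cache in front, can be sketched as follows. This is a minimal Python illustration and not Crux's Clojure API: the class and method names, the choice of SHA-1, JSON serialisation, and a plain dict standing in for the KvStore are all assumptions made for the example.

```python
import hashlib
import json


class KvObjectStore:
    """Hypothetical object store: keeps each document under the hash of its content."""

    def __init__(self, kv):
        self.kv = kv  # stand-in for a KvStore; any dict-like mapping works here

    def put(self, doc):
        # Serialise deterministically so that equal documents hash to the same key.
        payload = json.dumps(doc, sort_keys=True).encode("utf-8")
        content_hash = hashlib.sha1(payload).hexdigest()
        self.kv[content_hash] = payload
        return content_hash

    def get(self, content_hash):
        payload = self.kv.get(content_hash)
        return None if payload is None else json.loads(payload)


class CachedObjectStore:
    """Hypothetical read-through cache wrapped around another object store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def put(self, doc):
        return self.store.put(doc)

    def get(self, content_hash):
        if content_hash in self.cache:
            return self.cache[content_hash]
        doc = self.store.get(content_hash)
        if doc is not None:
            # Only cache hits, so a later put is never hidden by a cached miss.
            self.cache[content_hash] = doc
        return doc


store = CachedObjectStore(KvObjectStore({}))
key = store.put({"name": "Ivan"})
```

Because the key is derived from the content, storing the same document twice yields the same key and does not create a second copy.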

KvStore

link:../src/crux/kv.clj[role=include]

The KvStore exposes the operations Crux needs to use a Key Value store for indexing purposes, and for storing Objects.

Implementations of the KvStore include:

Table 1. KvStores

Implementation | Description
crux.kv.rocksdb/RocksKv | Uses RocksDB and the standard Java API that ships with RocksDB
crux.kv.rocksdb.jnr/RocksJNRKv | Uses RocksDB and a custom-built JNR bridge
crux.kv.lmdb/LMDBKv | Uses LMDB
crux.kv.memdb/MemKv | An in-memory KvStore
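
To give a feel for what an in-memory key-value store used for indexing might need to offer, here is a small Python sketch. It is not the Crux KvStore protocol: the method names (`store`, `value`, `delete`, `seek`) and the sorted-list representation are assumptions chosen for illustration.

```python
import bisect


class MemKv:
    """Hypothetical in-memory key-value store with keys kept in sorted order."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def store(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def value(self, key):
        return self._values.get(key)

    def delete(self, key):
        if key in self._values:
            del self._values[key]
            self._keys.remove(key)

    def seek(self, key):
        """Return the first stored key >= key, or None if there is none."""
        i = bisect.bisect_left(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None


kv = MemKv()
kv.store(b"b", b"2")
kv.store(b"a", b"1")
kv.store(b"d", b"4")
```

Keeping keys ordered is what makes a store usable for indexing: `seek` can land on the nearest key at or after a requested position rather than only on exact matches.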

TxLog

Crux uses a replayable transaction log from which indices are derived.

link:../src/crux/db.clj[role=include]

The transaction log is an appendable log to which both transactions and documents are written.

Table 2. TxLogs

Implementation | Description
crux.kafka.KafkaTxLog | The default. Uses Kafka, with separate Kafka topics for documents and transactions
crux.tx.KvTxLog | Uses the KvStore as the transaction log. This is useful if Kafka isn’t desired, and the topology of having distributed Crux nodes feeding off a centralised transaction log isn’t required
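
The relationship between a replayable log and the indices derived from it can be sketched in a few lines of Python. This is an illustration under assumptions, not Crux code: `TxLog`, `submit`, `replay` and `build_index` are hypothetical names, and the "index" here is just a mapping from entity to its latest document.

```python
class TxLog:
    """Hypothetical appendable log; entries are never modified once written."""

    def __init__(self):
        self._entries = []

    def submit(self, entry):
        self._entries.append(entry)
        return len(self._entries) - 1  # position in the log, used as a tx id here

    def replay(self, from_offset=0):
        # Yield (tx_id, entry) pairs in the order they were appended.
        for tx_id in range(from_offset, len(self._entries)):
            yield tx_id, self._entries[tx_id]


def build_index(log):
    """Derive an entity -> latest document mapping by replaying the whole log."""
    index = {}
    for _tx_id, (entity, doc) in log.replay():
        index[entity] = doc
    return index


log = TxLog()
log.submit(("ivan", {"name": "Ivan"}))
log.submit(("petr", {"name": "Petr"}))
log.submit(("ivan", {"name": "Ivan", "age": 40}))

# Two nodes replaying the same log derive identical indices.
index_a = build_index(log)
index_b = build_index(log)
```

Since the index is a pure function of the log, it can be thrown away and rebuilt at any time, which is what allows several nodes to feed off one centralised log.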

Indexer

The indexer indexes documents and transactions for query purposes. Indexing might be driven by subscriptions to Kafka topics, or by direct calls in a single-node topology.

link:../src/crux/db.clj[role=include]

The single implementation is crux.tx.KvIndexer, which makes use of the KvStore to persist indices.

Query Design

Index

Crux has the fundamental notion of an index.

link:../src/crux/db.clj[role=include]

The two operations are seek and next, referred to later in this document as seek-values and next-values.
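
A minimal Python sketch of an index with these two operations, for illustration only. The class `SortedIndex` is hypothetical, the method names mirror the seek-values and next-values calls mentioned elsewhere in this document, and returning `None` at the end of the index is an assumption.

```python
import bisect


class SortedIndex:
    """Hypothetical index over a sorted list of keys, with a cursor position."""

    def __init__(self, keys):
        self._keys = sorted(keys)
        self._pos = 0

    def seek_values(self, key):
        # Position the cursor at the first key >= the requested key.
        self._pos = bisect.bisect_left(self._keys, key)
        return self._current()

    def next_values(self):
        # Advance the cursor by one key.
        self._pos += 1
        return self._current()

    def _current(self):
        return self._keys[self._pos] if self._pos < len(self._keys) else None


idx = SortedIndex([1, 3, 4, 7])
first = idx.seek_values(2)   # lands on 3, the first key >= 2
second = idx.next_values()   # 4
third = idx.next_values()    # 7
end = idx.next_values()      # None: the index is exhausted
```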

Layered Index

The layered index exists to facilitate navigating up and down an index in a tree-like manner.

link:../src/crux/db.clj[role=include]

For example, the index attribute+value+entity+content-hash forms the following tree:

digraph G {
  attribute -> value
  value -> entity
  entity -> content
}

open-level gives the instruction to open the next level and move down into it. In the above example, it could move the index down to point at the values within a given attribute. That is to say, if we have an attribute :name, the index will iterate across all values for that attribute until there are no more name values.

close-level moves the index back up so that, in the above example, we can iterate at the higher level of attribute.
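
The up-and-down navigation can be sketched in Python over a nested mapping. This is an illustration, not the Crux LayeredIndex protocol: the class is hypothetical, and passing the key to `open_level` explicitly (rather than opening at the current cursor position) is a simplifying assumption.

```python
class LayeredIndex:
    """Hypothetical layered index over a nested dict: attribute -> value -> entity."""

    def __init__(self, tree):
        self._stack = [tree]  # one dict per open level; the last is the current level

    def keys(self):
        # Keys visible at the current level, in sorted order.
        return sorted(self._stack[-1])

    def open_level(self, key):
        # Move down into the subtree under `key`.
        self._stack.append(self._stack[-1][key])

    def close_level(self):
        # Move back up to the parent level.
        self._stack.pop()


tree = {
    "name": {"Ivan": {"ivan": "hash-1"}, "Petr": {"petr": "hash-2"}},
    "last-name": {"Ivanov": {"ivan": "hash-1"}},
}

idx = LayeredIndex(tree)
top = idx.keys()          # the attributes
idx.open_level("name")
names = idx.keys()        # the values stored under the name attribute
idx.close_level()
back_at_top = idx.keys()  # iterating at the attribute level again
```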

Virtual Index

A Virtual Index combines multiple child indices. It joins the indices together, returning key/value pairs where they match.

A join condition in a query could be reflected by a Virtual Index. A Virtual Index maintains state recording where the index is currently positioned.

UnaryJoinVirtualIndex comprises multiple child indices and implements both Index and LayeredIndex. Calling seek-values on it advances the child indices internally until they all contain the same key. This involves calling seek-values on each child index until the indices match at the same level. Calling next-values moves all the indices along to the next key that they all share.

Table 3. Virtual Index Example

Index | Values
a | 0, 1, 3, 4, 5, 6
b | 0, 2, 3, 5
c | 2, 3, 4, 5, 6

In the above example, UnaryJoinVirtualIndex joins three RelationVirtualIndex instances (a, b, c). Calling seek-values would return 3, the first value present in all three.

Calling next-values would jump ahead to the value 5.
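
The example above can be reproduced with a short Python sketch. It is a simplified illustration rather than the Crux implementation: `RelationIndex` and `UnaryJoin` are hypothetical stand-ins for RelationVirtualIndex and UnaryJoinVirtualIndex, and the alignment strategy (seek every child up to the largest key currently seen) is an assumption.

```python
import bisect


class RelationIndex:
    """Hypothetical child index over a sorted list of values."""

    def __init__(self, values):
        self._values = sorted(values)
        self._pos = 0

    def seek_values(self, key):
        self._pos = bisect.bisect_left(self._values, key)
        return self.current()

    def next_values(self):
        self._pos += 1
        return self.current()

    def current(self):
        return self._values[self._pos] if self._pos < len(self._values) else None


class UnaryJoin:
    """Hypothetical join: returns only the keys present in every child index."""

    def __init__(self, children):
        self._children = children

    def seek_values(self, key):
        return self._align([child.seek_values(key) for child in self._children])

    def next_values(self):
        # Step the first child past the last match, then re-align the others.
        current = [child.current() for child in self._children]
        current[0] = self._children[0].next_values()
        return self._align(current)

    def _align(self, current):
        # Seek lagging children up to the largest key seen until all
        # children agree, or until one of them is exhausted.
        while True:
            if any(k is None for k in current):
                return None
            target = max(current)
            if all(k == target for k in current):
                return target
            current = [child.seek_values(target) for child in self._children]


a = RelationIndex([0, 1, 3, 4, 5, 6])
b = RelationIndex([0, 2, 3, 5])
c = RelationIndex([2, 3, 4, 5, 6])

join = UnaryJoin([a, b, c])
first = join.seek_values(0)   # 3
second = join.next_values()   # 5
third = join.next_values()    # None: b has no values beyond 5
```

Each child only ever moves forward, so the join skips over runs of non-matching keys instead of visiting them one at a time.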

TODO: JP: I need to understand the Nary vs Unary testing index-test.

Indexes

The Indexer writes to these indices when indexing documents:

Table 4. Crux Document Indices

Index | Description
attribute+value+entity+content-hash-key | For querying against attribute values
attribute+entity+value+entity+content-hash-key | For querying against attribute values within entities
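
How a composite key ordering supports querying by attribute value can be sketched in Python. The tuples below stand in for encoded keys and the data is made up for the example; Crux's actual key encoding is not shown in this document, so the representation is an assumption.

```python
import bisect

# Hypothetical composite keys: (attribute, value, entity, content_hash) tuples,
# kept sorted so that all entries sharing a prefix sit next to each other.
avec_index = sorted([
    ("last-name", "Ivanov", "ivan", "hash-1"),
    ("name", "Ivan", "ivan", "hash-1"),
    ("name", "Petr", "petr", "hash-2"),
])


def entities_with(index, attribute, value):
    """Return the entities whose `attribute` equals `value`, using a prefix scan."""
    prefix = (attribute, value)
    start = bisect.bisect_left(index, prefix)  # seek to the first key >= prefix
    result = []
    for key in index[start:]:
        if key[:2] != prefix:
            break  # left the range of keys that share the prefix
        result.append(key[2])
    return result


ivans = entities_with(avec_index, "name", "Ivan")
nobody = entities_with(avec_index, "name", "Zed")
```

Putting attribute and value first in the key means a single seek followed by a short scan finds every matching entity.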

The Indexer writes to these indices when indexing transactions:

Table 5. Crux Transaction Indices

Index | Description
entity+valid-time+transaction-time+transaction-id | Used for as-of document retrieval
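
As-of retrieval over such keys can be illustrated with a small Python sketch. The entries, the integer timestamps, and the resolution rule (latest valid time wins, then latest transaction time) are assumptions made for the example; the linear scan stands in for what would be an index seek.

```python
# Hypothetical transaction index entries:
# (entity, valid_time, transaction_time, transaction_id) -> content hash.
tx_index = {
    ("ivan", 10, 10, 1): "hash-v1",
    ("ivan", 20, 20, 2): "hash-v2",
    ("ivan", 15, 30, 3): "hash-v3",  # back-dated write made at transaction time 30
}


def as_of(index, entity, valid_time, transaction_time):
    """Return the content hash visible for `entity` at the given times, or None."""
    candidates = [
        key for key in index
        if key[0] == entity
        and key[1] <= valid_time
        and key[2] <= transaction_time
    ]
    if not candidates:
        return None
    # Latest valid time wins; ties are broken by transaction time, then id.
    return index[max(candidates, key=lambda k: (k[1], k[2], k[3]))]


before_correction = as_of(tx_index, "ivan", 17, 25)  # the back-dated write is not yet known
after_correction = as_of(tx_index, "ivan", 17, 35)   # the back-dated write is now visible
latest = as_of(tx_index, "ivan", 25, 35)
```

Asking the same valid-time question at two different transaction times gives two different answers, which is the point of keeping both time dimensions in the key.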
