
What is Crux?

Introduction

Crux—to use Martin Kleppmann’s phrase—is an unbundled database.

"What do we have to gain from turning the database inside out? Simpler code, better scalability, better robustness, lower latency, and more flexibility for doing interesting things with data." (Martin Kleppmann, 2014)

It is a database turned inside out, using:

  • Apache Kafka for the primary storage of transactions and documents as semi-immutable logs.

  • RocksDB or LMDB to host indexes for rich query support.

This decoupled design enables Crux to be maintained as a small and efficient core that can be scaled and evolved to support a large variety of use-cases.

[Figure: Crux Kafka node diagram]

Crux is an open source document database with bitemporal graph queries.

Document database with graph queries

Bitemporal

Crux is a bitemporal database that stores both transaction time and valid time histories. A [uni]temporal database enables "time travel" querying through the transactional sequence of database states, from the moment of database creation to its current state. Crux additionally provides "time travel" querying along a separate valid time axis, without unnecessary design complexity or performance impact. This means a Crux user can populate the database with past and future information regardless of the order in which the information arrives, and can correct past recordings to build an ever-improving temporal model of a given domain.
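The two time axes can be illustrated with a small in-memory model. This is only a sketch in Python under stated assumptions: the names `Version`, `BitemporalStore`, `put` and `entity` are hypothetical and are not the Crux API, and the lookup rule (latest valid time at or before the requested one, then latest transaction time at or before the requested one) is a simplification of what a real bitemporal index does.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Version:
    valid_time: datetime  # when the fact holds in the modelled domain
    tx_time: datetime     # when the store recorded the fact
    doc: Any


class BitemporalStore:
    """Toy in-memory model (hypothetical; not Crux's storage or API)."""

    def __init__(self):
        self._history = {}  # entity id -> list of Version

    def put(self, entity_id, doc, valid_time, tx_time):
        self._history.setdefault(entity_id, []).append(
            Version(valid_time, tx_time, doc))

    def entity(self, entity_id, valid_time, tx_time):
        """Return the document visible at (valid_time, tx_time), or None."""
        visible = [v for v in self._history.get(entity_id, [])
                   if v.valid_time <= valid_time and v.tx_time <= tx_time]
        if not visible:
            return None
        # Latest valid time wins; for equal valid times, the latest write wins.
        return max(visible, key=lambda v: (v.valid_time, v.tx_time)).doc


jan10, feb1, mar1 = datetime(2020, 1, 10), datetime(2020, 2, 1), datetime(2020, 3, 1)
store = BitemporalStore()
store.put("account-1", {"balance": 100}, valid_time=jan10, tx_time=jan10)
# A correction recorded on 1 March about the state of the world on 10 January.
store.put("account-1", {"balance": 120}, valid_time=jan10, tx_time=mar1)

known_on_feb1 = store.entity("account-1", valid_time=feb1, tx_time=feb1)
known_on_mar1 = store.entity("account-1", valid_time=feb1, tx_time=mar1)
```

Both lookups ask about the same valid time (1 February). Only the transaction time differs, so the first sees the original recording and the second sees the later correction.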

Bitemporal modelling is broadly useful for event-based architectures and is a critical requirement for systems in any industry with strong auditing regulations, where you need to be able to answer the question "what did you know and when did you know it?".

Read more about Bitemporality in Crux or specifically the known uses for these capabilities.

Query

Crux supports a Datalog query interface for reading data and traversing relationships across all documents. Queries are executed so that the results are lazily streamed from the underlying indexes.

Crux is ultimately a store of versioned EDN documents. The fields within these documents are automatically indexed as Entity-Attribute-Value triples to support efficient graph queries.
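To make the Entity-Attribute-Value idea concrete, the following Python sketch flattens documents into triples and answers a one-hop graph question from an index over them. It is an illustration only: the `id` key, the function names and the sample data are hypothetical, the index layout is not Crux's, and splitting collection-valued fields into one triple per element is an assumption made for the example. The query is written as a generator to mirror the lazy streaming of results described above.

```python
def to_triples(doc, id_key="id"):
    """Flatten one document into (entity, attribute, value) triples."""
    entity = doc[id_key]
    triples = []
    for attribute, value in doc.items():
        if attribute == id_key:
            continue
        # Assumption: a collection-valued field yields one triple per element.
        values = value if isinstance(value, (list, set, tuple)) else [value]
        triples.extend((entity, attribute, v) for v in values)
    return triples


def build_index(docs):
    """Index entities by (attribute, value) so lookups avoid scanning documents."""
    index = {}
    for doc in docs:
        for entity, attribute, value in to_triples(doc):
            index.setdefault((attribute, value), set()).add(entity)
    return index


def followers_of(index, name):
    """Lazily yield entities that follow any entity with the given name."""
    for target in index.get(("name", name), ()):
        yield from index.get(("follows", target), ())


docs = [
    {"id": "ivan", "name": "Ivan", "follows": ["petr"]},
    {"id": "petr", "name": "Petr", "follows": []},
    {"id": "anna", "name": "Anna", "follows": ["petr", "ivan"]},
]
index = build_index(docs)
followers = set(followers_of(index, "Petr"))
```

The query first resolves the name to an entity and then follows the `follows` attribute backwards, which is the kind of join a Datalog query expresses declaratively.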

Schemaless

Crux does not enforce any schema for the documents it stores. One reason for this is that data might come from many different places, and may not ultimately be owned by the service using Crux to query the data. This design enables schema-on-write and/or schema-on-read to be implemented outside the core of Crux, to meet the exact application requirements.
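As a rough illustration of keeping schema concerns outside the store, the Python sketch below wraps a schemaless store with an application-level check on write, and separately filters on read. Everything here is hypothetical (the `SchemalessStore` class, `validate_person`, and the field names); it shows the pattern, not a Crux feature.

```python
class SchemalessStore:
    """Stand-in for a store that accepts any document."""

    def __init__(self):
        self.docs = []

    def put(self, doc):
        self.docs.append(doc)


def validate_person(doc):
    """Hypothetical application-level schema; returns a list of problems."""
    errors = []
    if not isinstance(doc.get("name"), str):
        errors.append("name must be a string")
    if "age" in doc and not isinstance(doc["age"], int):
        errors.append("age must be an integer")
    return errors


def checked_put(store, doc, validator):
    """Schema-on-write: reject the document before it reaches the store."""
    errors = validator(doc)
    if errors:
        raise ValueError("; ".join(errors))
    store.put(doc)


def read_valid(store, validator):
    """Schema-on-read: accept everything on write, filter when reading."""
    return [doc for doc in store.docs if not validator(doc)]


store = SchemalessStore()
checked_put(store, {"name": "Ivan", "age": 40}, validate_person)
store.put({"name": 42})  # written by another service, bypassing the check
valid_people = read_valid(store, validate_person)
```

The store itself never inspects the documents, so each consuming service can apply whichever of the two approaches suits its requirements.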

Distributed

Nodes can come and go. Each node keeps its indexes in a local Key/Value store such as RocksDB, and reads and writes master data via central log topics (only Kafka is currently supported). Queries are not distributed and there is no sharding of documents across nodes.
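The relationship between the central log and per-node indexes can be sketched as follows. This Python model is an assumption-laden illustration: a plain list stands in for a Kafka topic, a dict stands in for a RocksDB index, and the `Node` class with its `submit`, `sync` and `get` methods is hypothetical rather than part of Crux.

```python
class Node:
    """A node whose index is purely local and rebuilt by replaying a shared log."""

    def __init__(self, log):
        self._log = log    # shared append-only list (stand-in for a Kafka topic)
        self._offset = 0   # how much of the log this node has indexed
        self.index = {}    # local key/value index (stand-in for RocksDB)

    def submit(self, entity_id, doc):
        # Writes go to the central log only; the local index changes on sync().
        self._log.append((entity_id, doc))

    def sync(self):
        """Index any log entries this node has not consumed yet."""
        for entity_id, doc in self._log[self._offset:]:
            self.index[entity_id] = doc
        self._offset = len(self._log)

    def get(self, entity_id):
        # Queries are answered from the local index alone.
        return self.index.get(entity_id)


log = []
node_a = Node(log)
node_a.submit("ivan", {"name": "Ivan"})
node_a.submit("petr", {"name": "Petr"})
node_a.sync()

node_b = Node(log)  # a node that joins later
before_sync = node_b.get("ivan")
node_b.sync()
after_sync = node_b.get("ivan")
```

Because every node holds a full index derived from the same log, a node that joins late catches up by replaying the log, and no query needs to contact another node.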

Crux can also run in a non-distributed "standalone" mode, where the transaction and document logs exist only inside a local Key/Value store such as RocksDB. This is appropriate for non-critical usage where availability and durability requirements do not justify the need for Kafka.

Eviction

Crux supports eviction of active and historical data to assist with technical compliance for information privacy regulations.

The main transaction log contains only hashes and is immutable. All document content is stored in a dedicated document log, from which evicted content can be removed by compaction.
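The split between an immutable log of hashes and a separate, evictable document log can be modelled in a few lines of Python. This is a sketch under assumptions, not Crux's implementation: the `Store` class, its methods and the SHA-1-over-JSON hashing are hypothetical choices for the example, and deleting a dict entry stands in for writing a tombstone that log compaction later acts on.

```python
import hashlib
import json


def content_hash(entity_id, doc):
    """Hash the entity id together with the content (illustrative scheme)."""
    payload = json.dumps([entity_id, doc], sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


class Store:
    def __init__(self):
        self.tx_log = []   # append-only (entity_id, hash) pairs; never rewritten
        self.doc_log = {}  # hash -> document; the only place content lives

    def put(self, entity_id, doc):
        digest = content_hash(entity_id, doc)
        self.doc_log[digest] = doc
        self.tx_log.append((entity_id, digest))
        return digest

    def evict(self, entity_id):
        """Remove the content of every version, active and historical."""
        for logged_id, digest in self.tx_log:
            if logged_id == entity_id:
                # Stand-in for a tombstone followed by log compaction.
                self.doc_log.pop(digest, None)

    def history(self, entity_id):
        """Return each version's content, or None where it has been evicted."""
        return [self.doc_log.get(digest)
                for logged_id, digest in self.tx_log if logged_id == entity_id]


store = Store()
store.put("ivan", {"email": "ivan@example.com"})
store.put("ivan", {"email": "ivan@example.org"})
store.put("petr", {"email": "petr@example.com"})
tx_log_before = list(store.tx_log)
store.evict("ivan")
```

After eviction the transaction log is unchanged, so the sequence of transactions remains auditable, while the personal data those hashes pointed to can no longer be recovered from the store.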

Contributors: Jeremy Taylor, Daniel Mason, Jon Pither, Tom Taylor, Mike Thompson, Ivan Fedorov & Håkan Råberg
