Liking cljdoc? Tell your friends :D

Configuration

To start a Crux node, use the Java API or the Clojure crux.api.

Within Clojure, we call start-node from within crux.api, passing it a set of options for the node. There are a number of different configuration options a Crux node can have, grouped into topologies.

Table 1. Crux Topologies
Name	Transaction Log	Topology
Standalone	Uses local event log	`:crux.standalone/topology`
Kafka	Uses Kafka	`:crux.kafka/topology`
JDBC	Uses JDBC event log	`:crux.jdbc/topology`

Use a Kafka node when horizontal scalability is required or when you want the guarantees that Kafka offers in terms of resiliency, availability and retention of data.

Multiple Kafka nodes participate in a cluster with Kafka as the primary store and as the central means of coordination.

The JDBC node is useful when you don’t want the overhead of maintaining a Kafka cluster. Read more about the motivations of this setup here.

The Standalone node is a single Crux instance which has everything it needs locally. This is good for experimenting with Crux and for small to medium sized deployments, where running a single instance is permissible.

Crux nodes implement the ICruxAPI interface and are the starting point for making use of Crux. Nodes also implement java.io.Closeable and can therefore be lifecycle managed.

Properties

The following properties are within the topology used as a base for the other topologies, crux.node:

Table 2. `crux.node` configuration
Property	Default Value
`:crux.node/object-store`	`'crux.object-store/kv-object-store`

From version 20.01-1.7.0-alpha-SNAPSHOT the kv-store should be specified by including an extra module in the node’s topology vector. For example a rocksdb backend looks like {:crux.node/topology '[crux.standalone/topology crux.kv.rocksdb/kv-store]}

The following set of options are used by KV backend implementations, defined within crux.kv:

Table 3. `crux.kv` options
Property	Description	Default Value
`:crux.kv/db-dir`	Directory to store K/V files	data
`:crux.kv/sync?`	Sync the KV store to disk after every write?	false
`:crux.kv/check-and-store-index-version`	Check and store index version upon start?	true

Standalone Node

Using a Crux standalone node is the best way to get started. Once you’ve started a standalone Crux instance as described below, you can then follow the getting started example.

Table 4. Standalone configuration
Property	Description	Default Value
`:crux.standalone/event-log-kv-store`	Key/Value store to use for standalone event-log persistence	'crux.kv.rocksdb/kv
`:crux.standalone/event-log-dir`	Directory used to store the event-log and used for backup/restore, i.e. `"data/eventlog-1"`
`:crux.standalone/event-log-sync?`	Sync the event-log backend KV store to disk after every write?	false

Project Dependency

link:./deps.edn[role=include]

Getting started

The following code creates a default crux.standalone node which runs completely within memory (with both the event-log store and db store using crux.kv.memdb/kv):

link:./src/docs/examples.clj[role=include]

link:./src/docs/examples.clj[role=include]

You can later stop the node if you wish:

link:./src/docs/examples.clj[role=include]

RocksDB

RocksDB is often used as Crux’s primary store (in place of the in-memory kv store in the example above). In order to use RocksDB within Crux, however, you must first add RocksDB as a project dependency:

Project Dependency

./deps.edn

Starting a node using RocksDB

link:./src/docs/examples.clj[role=include]

You can create a node with custom RocksDB options by passing extra keywords in the topology. These are:

:crux.kv.rocksdb/disable-wal?, which takes a boolean (if true, disables the write ahead log)
:crux.kv.rocksdb/db-options, which takes a RocksDB 'Options' object (see more here, from the RocksDB javadocs)

To include rocksdb metrics in monitoring crux.kv.rocksdb/kv-store-with-metrics should be included in the topology map instead of the above.

LMDB

An alternative to RocksDB, LMDB provides faster queries in exchange for a slower ingest rate.

Project Dependency

./deps.edn

Starting a node using LMDB

link:./src/docs/examples.clj[role=include]

Kafka Nodes

When using Crux at scale it is recommended to use multiple Crux nodes connected via a Kafka cluster.

Kafka nodes have the following properties:

Table 5. Kafka node configuration
Property	Description	Default value
`:crux.kafka/bootstrap-servers`	URL for connecting to Kafka	localhost:9092
`:crux.kafka/tx-topic`	Name of Kafka transaction log topic	crux-transaction-log
`:crux.kafka/doc-topic`	Name of Kafka documents topic	crux-docs
`:crux.kafka/create-topics`	Option to automatically create Kafka topics if they do not already exist	true
`:crux.kafka/doc-partitions`	Number of partitions for the document topic	1
`:crux.kafka/replication-factor`	Number of times to replicate data on Kafka	1
`:crux.kafka/kafka-properties-file`	File to supply Kafka connection properties to the underlying Kafka API
`:crux.kafka/kafka-properties-map`	Map to supply Kafka connection properties to the underlying Kafka API

Project Dependencies

link:./deps.edn[role=include]
link:./deps.edn[role=include]

Getting started

Use the API to start a Kafka node, configuring it with the bootstrap-servers property in order to connect to Kafka:

link:./src/docs/examples.clj[role=include]

If you don’t specify kv-store then by default the Kafka node will use RocksDB. You will need to add RocksDB to your list of project dependencies.

You can later stop the node if you wish:

link:./src/docs/examples.clj[role=include]

Embedded Kafka

Crux is ready to work with an embedded Kafka for when you don’t have an independently running Kafka available to connect to (such as during development).

Project Depencies

./deps.edn
./deps.edn

Getting started

link:./src/docs/examples.clj[role=include]

link:./src/docs/examples.clj[role=include]

You can later stop the Embedded Kafka if you wish:

link:./src/docs/examples.clj[role=include]

JDBC Nodes

JDBC Nodes use next.jdbc internally and pass through the relevant configuration options that you can find here.

Below is the minimal configuration you will need:

Table 6. Minimal JDBC Configuration
Property	Description
`:crux.jdbc/dbtype`	One of: postgresql, oracle, mysql, h2, sqlite
`:crux.jdbc/dbname`	Database Name

Depending on the type of JDBC database used, you may also need some of the following properties:

Table 7. Other JDBC Properties
Property	Description
`:crux.kv/db-dir`	For h2 and sqlite
`:crux.jdbc/host`	Database Host
`:crux.jdbc/user`	Database Username
`:crux.jdbc/password`	Database Password

Project Dependencies

link:./deps.edn[role=include]
link:./deps.edn[role=include]

Getting started

Use the API to start a JDBC node, configuring it with the required parameters:

link:./src/docs/examples.clj[role=include]

HTTP

Crux can be used programmatically as a library, but Crux also ships with an embedded HTTP server, that allows clients to use the API remotely via REST.

Set the server-port configuration property on a Crux node to expose a HTTP port that will accept REST requests:

Table 8. HTTP Nodes Configuration
Component	Property	Description
crux.http-server	`port`	Port for Crux HTTP Server e.g. `8080`

Visit the guide on using the REST api for examples of how to interact with Crux over HTTP.

Starting a HTTP Server

Project Dependency

link:./deps.edn[role=include]

You can start up a HTTP server on a node by including crux.http-server/module in your topology, optionally passing the server port:

link:./src/docs/examples.clj[role=include]

Using a Remote API Client

Project Dependency

link:./deps.edn[role=include]

To connect to a pre-existing remote node, you need a URL to the node and the above on your classpath. We can then call crux.api/new-api-client, passing the URL. If the node was started on localhost:3000, you can connect to it by doing the following:

link:./src/docs/examples.clj[role=include]

The remote client requires valid and transaction time to be specified for all calls to crux/db.

Docker

If you wish to use Crux with Docker (no JVM/JDK/Clojure install required!) we have the following:

Crux HTTP Node: An image of a standalone Crux node (using a in memory kv-store by default) & HTTP server, useful if you wish to a freestanding Crux node accessible over HTTP, only having to use Docker.

Artifacts

Alongside the various images available on Dockerhub, there are a number of artifacts available for getting started quickly with Crux. These can be found on the latest release of Crux. Currently, these consist of a number of common configuration uberjars and a custom artifact builder.

To create your own custom artifacts for crux, do the following:

Download and extract the crux-builder.tar.gz from the latest release
You can build an uberjar using either Clojure’s deps.edn or Maven (whichever you’re more comfortable with)
- For Clojure, you can add further Crux dependencies in the deps.edn file, set the node config in crux.edn, and run build-uberjar.sh
- For Maven, it’s the same, but dependencies go in pom.xml
Additionally, you can build a Docker image using the build-docker.sh script in the docker directory.

Backup and Restore

Crux provides utility APIs for local backup and restore when you are using the standalone mode.

An additional example of backup and restore is provided that only applies to a stopped standalone node here.

In a clustered deployment, only Kafka’s official backup and restore functionality should be relied on to provide safe durability. The standalone mode’s backup and restore operations can instead be used for creating operational snapshots of a node’s indexes for scaling purposes.

Monitoring

Crux can display metrics through a variety of interfaces. Internally, it uses dropwizard’s metrics library to register all the metrics and then passes the registry around to reporters to display the data in a suitable application.

Project Dependency

In order to use any of the crux-metrics reporters, you will need to include the following dependency on crux-metrics:

link:./deps.edn[role=include]

The various types of metric reporters bring in their own sets of dependencies, so we expect these to be provided by the user in their own project (in order to keep the core of crux-metrics as lightweight as possible). Reporters requiring further dependencies will have an 'additional dependencies' section.

Getting Started

By default indexer and query metrics are included. It is also possible to add rocksdb metrics when it is being used. These arguments can be used whenever any of the topologies to display metrics are included.

Table 9. Registry arguments
Field	Property	Default	Description
`:crux.metrics/with-indexer-metrics?`	`boolean`	`true`	Includes indexer metrics in the metrics registry
`:crux.metrics/with-query-metrics?`	`boolean`	`true`	Includes query metrics in the metrics registry

RocksDB metrics

To include the RocksDB metrics when monitoring the 'crux.kv.rocksdb/kv-store-with-metrics module should be included in the topology map (in place of 'crux.kv.rocksdb/kv-store):

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.kv.rocksdb/kv-store-with-metrics
                                      ...]
                 ...})

Reporters

Crux currently supports the following outputs:

Console stdout
CSV file
JMX
Prometheus (reporter & http exporter)
AWS Cloudwatch metrics

Console

This component logs metrics to sysout at regular intervals.

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.console/reporter]
                 ...
                 })

Table 10. Console metrics arguments
Field	Property	Description
`:crux.metrics.dropwizard.console/report-frequency`	`int`	Interval in seconds between output dump
`:crux.metrics.dropwizard.console/rate-unit`	`time-unit`	Unit which rates are displayed
`:crux.metrics.dropwizard.console/duration-unit`	`time-unit`	Unit which durations are displayed

CSV

This component logs metrics to a csv file at regular intervals. Only filename is required.

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.csv/reporter]
                 :crux.metrics.dropwizard.csv/file-name "csv-out"
                 ...
                 })

Table 11. CSV metrics arguments
Field	Property	Required	Description
`:crux.metrics.dropwizard.csv/file-name`	`string`	`true`	Output folder name (must already exist)
`:crux.metrics.dropwizard.csv/report-frequency`	`int`	`false`	Interval in seconds between file write
`:crux.metrics.dropwizard.csv/rate-unit`	`time-unit`	`false`	Unit which rates are displayed
`:crux.metrics.dropwizard.csv/duration-unit`	`time-unit`	`false`	Unit which durations are displayed

JMX

Provides JMX mbeans output.

Additional Dependencies

You will need to add the following dependencies, alongside crux-metrics, in your project:

link:../project.clj[role=include]

Getting Started

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.jmx/reporter]
                 ...
                 })

Table 12. JMX metrics arguments
Field	Property	Description
`:crux.metrics.dropwizard.jmx/domain`	`string`	Change metrics domain group
`:crux.metrics.dropwizard.jmx/rate-unit`	`time-unit`	Unit which rates are displayed
`:crux.metrics.dropwizard.jmx/duration-unit`	`time-unit`	Unit which durations are displayed

Prometheus

Additional Dependencies

You will need to add the following dependencies, alongside crux-metrics, in your project:

link:../project.clj[role=include]

HTTP-Exporter

The prometheus http exporter starts a standalone server hosting prometheus metrics by default at http://localhost:8080/metrics. The port can be changed with an argument, and jvm metrics can be included in the dump.

Getting Started

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.prometheus/http-exporter]
                 ...
                 })

Table 13. Prometheus exporter metrics arguments
Field	Property	Description
`:crux.metrics.dropwizard.prometheus/port`	`int`	Desired port number for prometheus client server. Defaults to `8080`
`:crux.metrics.dropwizard.prometheus/jvm-metrics?`	`boolean`	If `true` jvm metrics are included in the metrics dump

Reporter

This component pushes prometheus metrics to a specified pushgateway at regular durations (by default 1 second).

Getting Started

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.prometheus/reporter]
                 :crux.metric.dropwizard.prometheus/pushgateway "localhost:9090"
                 ...
                 })

Table 14. Prometheus reporter metrics arguments
Field	Property	Description
`:crux.metrics.dropwizard.prometheus/push-gateway`	`string`	Address of the prometheus server. This field is required
`:crux.metrics.dropwizard.prometheus/report-frequency`	`duration`	Time in ISO-8601 standard between metrics push. Defaults to "PT1S".
`:crux.metrics.dropwizard.prometheus/prefix`	`string`	Prefix all metric titles with this string

AWS Cloudwatch metrics

Pushes metrics to Cloudwatch. This is indented to be used with a crux node running inside a EBS/Fargate instance. It attempts to get the relevant credentials through system variables. Crux uses this in its aws benchmarking system which can be found here.

Additional Dependencies

You will need to add the following dependencies, alongside crux-metrics, in your project:

link:../project.clj[role=include]

Getting Started

(api/start-node {:crux.node/topology ['crux.standalone/topology
                                      'crux.metrics.dropwizard.cloudwatch/reporter]
                 ...
                 })

Table 15. Cloudwatch metrics arguments
Field	Property	Description
`:crux.metrics.dropwizard.prometheus/duration`	`duration`	Time between metrics push
`:crux.metrics.dropwizard.prometheus/dry-run?`	`boolean`	When `true` the reporter outputs to `cloujure.logging/log*`
`:crux.metrics.dropwizard.prometheus/jvm-metrics?`	`boolean`	Should jvm metrics be included in the pushed metrics?
`:crux.metrics.dropwizard.prometheus/jvm-dimensions`	`string-map`	Should jvm metrics be included in the pushed metrics?
`:crux.metrics.dropwizard.prometheus/region`	`string`	Cloudwatch region for uploading metrics. Not required inside a EBS/Fargate instance but needed for local testing.
`:crux.metrics.dropwizard.prometheus/ignore-rules`	`string-list`	A list of strings to ignore specific metrics, in gitignore format. e.g. `["crux.tx" "!crux.tx.ingest"]` would ignore crux.tx.*, except crux.tx.ingest

Tips for running

To upload metrics to Cloudwatch locally the desired region needs to be specified with :crux.metrics.dropwizard.prometheus/region, and your aws credentials at ~/.aws/credentials need to be visible (If ran in docker, mount these as a volume).

When ran on aws if using cloudformation the node needs to have the permission 'cloudwatch:PutMetricData'. For a example see Crux’s benchmarking system here.

❮(ns crux.api)Thanks❯

Can you improve this documentation? These fine people already did:
Daniel Mason, Tom Taylor, Jeremy Taylor, James Henderson, Jon Pither, Dan Mason, Antonelli712, Ivan Fedorov, Ben Gerard & Alex DavisEdit on GitHub

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts

`Ctrl`+`k`	Jump to recent docs
`←`	Move to previous article
`→`	Move to next article
`Ctrl`+`/`	Jump to the search field

Raise an issue Browse cljdoc source Chat on Slack

× close