HTTP — juxt/crux-test 20.09-1.11.0

HTTP

Crux offers a REST API layer in the crux-http-server module that allows you to send transactions and run queries over HTTP.

Table of Contents

Starting a HTTP Server
- Project Dependency
Using a Remote API Client
- Project Dependency
Using the HTTP API

Using Crux in this manner is a valid use-case but it cannot support all of the features and benefits that running the Crux node inside of your application provides, in particular the ability to efficiently combine custom code with multiple in-process Datalog queries.

Your application only needs to communicate with one Crux node when using the REST API. Multiple Crux nodes can be placed behind a HTTP load balancer to spread the writes and reads over a horizontally-scaled cluster transparently to the application. Each Crux node in such a cluster independently catches up with the head of the transaction log, and since different queries might go to different nodes, you need to keep read consistency in mind when designing your application to use Crux in this way. Fortunately, you can readily achieve read-your-writes consistency by querying consistent point-in-time snapshots using specific temporal coordinates.

The REST API also provides an experimental endpoint for SPARQL 1.1 Protocol queries under /sparql/, rewriting the query into the Crux Datalog dialect. Only a small subset of SPARQL is supported and no other RDF features are available.

Starting a HTTP Server

Project Dependency

link:example$deps.edn[role=include]

You can start up a HTTP server on a node by including crux.http-server/module in your topology, optionally passing the server port:

link:example$src/docs/examples.clj[role=include]

Using a Remote API Client

In addition to calling the HTTP endpoints directly you can also use the remote API client, which implements the same interfaces/protocols as a local Crux node, where possible.

Project Dependency

link:example$deps.edn[role=include]

To connect to a pre-existing remote node, you need a URL to the node and the dependency above on your classpath. You can then call crux.api/new-api-client, passing the URL. If the node was started on localhost:3000, you can connect to it as follows:

link:example$src/docs/examples.clj[role=include]

Using the HTTP API

Table 1. API
URI	Method	Description
`/`	GET	Returns various details about the state of the database
`/entity/[:key]`	GET	Returns an entity for a given ID and optional valid-time/transaction-time co-ordinates
`/entity-tx/[:key]`	GET	Returns the transaction that most recently set a key
`/entity-history/[:key]`	GET	Returns the history of the given entity and optional valid-time/transaction-time co-ordinates
`/query`	POST	Takes a Datalog query and returns its results
`/sync`	GET	Waits until the Kafka consumer’s lag is back to 0
`/tx-log`	GET	Returns a list of all transactions
`/tx-log`	POST	The "write" endpoint, used to post transactions

GET `/`

Returns various details about the state of the database. Can be used as a health check.

curl -X GET $nodeURL/

{:crux.kv/kv-store "crux.kv.rocksdb/kv",
 :crux.kv/estimate-num-keys 92,
 :crux.kv/size 72448,
 :crux.tx/last-completed-tx
   {:crux.tx/tx-id 19,
    :crux.tx/tx-time #inst "2019-01-08T11:06:41.869-00:00"}
 :crux.zk/zk-active? true,
 :crux.tx-log/consumer-state
   {:crux.kafka.topic-partition/crux-docs-0
      {:next-offset 25,
       :time #inst "2019-01-08T11:06:41.867-00:00",
       :lag 0},
    :crux.kafka.topic-partition/crux-transaction-log-0
      {:next-offset 19,
       :time #inst "2019-01-08T11:06:41.869-00:00",
       :lag 0}}}

estimate-num-keys is an (over)estimate of the number of transactions in the log (each of which is a key in RocksDB). RocksDB does not provide an exact key count.
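Since the root endpoint can double as a health check, a client-side probe might look like the following Python sketch. The function name `is_healthy`, the injectable `opener` parameter, and the rule that only an HTTP 200 response counts as healthy are illustrative assumptions, not part of Crux.

```python
import urllib.request


def is_healthy(node_url, opener=urllib.request.urlopen, timeout=2.0):
    """Return True if GET <node_url>/ answers with HTTP 200.

    Treating 200 as "healthy" is an assumption made for this sketch.
    `opener` is injectable so the probe can be exercised without a node.
    """
    try:
        with opener(node_url.rstrip("/") + "/", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
```

A caller could run `is_healthy("http://localhost:3000")` periodically; a load balancer would normally do the equivalent with its own health-check configuration.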

GET `/entity/[:key]`

Takes a key and, optionally, a :valid-time and/or :transact-time (defaulting to now). Returns the value stored under that key at those times.

See Bitemporality for more information.

curl -X GET \
     -H "Content-Type: application/edn" \
     $nodeURL/entity/:tommy

{:crux.db/id :tommy, :name "Tommy", :last-name "Petrov"}

curl -X GET \
     -H "Content-Type: application/edn" \
     "$nodeURL/entity/:tommy?valid-time=1999-01-08T14%3A03%3A27.254-00%3A00"

nil
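The timestamp in a request like the one above has to be percent-encoded before it goes into the URL. As a minimal sketch, the Python standard library can do the encoding; `entity_url` is a hypothetical helper name, and the timestamp used here is borrowed from the status example earlier on this page.

```python
from urllib.parse import quote, urlencode


def entity_url(node_url, key, valid_time=None):
    """Build a GET /entity/[:key] URL, percent-encoding the optional valid-time."""
    # Keep the leading colon of the key readable, as in the curl examples.
    url = f"{node_url.rstrip('/')}/entity/{quote(key, safe=':')}"
    if valid_time is not None:
        # urlencode escapes the colons in the timestamp (':' becomes '%3A').
        url += "?" + urlencode({"valid-time": valid_time})
    return url


print(entity_url("http://localhost:3000", ":tommy", "2019-01-08T11:06:41.869-00:00"))
```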

GET `/entity-tx/[:key]`

Takes a key and, optionally, :valid-time and/or :transact-time (defaulting to now). Returns the :put transaction that most recently set that key at those times.

See Bitemporality for more information.

curl -X GET \
     -H "Content-Type: application/edn" \
     $nodeURL/entity-tx/:foobar

{:crux.db/id "8843d7f92416211de9ebb963ff4ce28125932878",
 :crux.db/content-hash "7af0444315845ab3efdfbdfa516e68952c1486f2",
 :crux.db/valid-time #inst "2019-01-08T16:34:47.738-00:00",
 :crux.tx/tx-id 0,
 :crux.tx/tx-time #inst "2019-01-08T16:34:47.738-00:00"}

GET `/entity-history/[:key]`

Returns the history of the given entity.

curl -X GET "$nodeURL/entity-history/:ivan?sort-order=desc"

Also accepts the following optional query parameters:

* with-corrections - includes bitemporal corrections in the response, inline, sorted by valid-time then transaction-time (default false)
* with-docs - includes the documents in the response sequence, under the :crux.db/doc key (default false)
* start-valid-time, start-transaction-time - bitemporal co-ordinates to start at (inclusive, default unbounded)
* end-valid-time, end-transaction-time - bitemporal co-ordinates to stop at (exclusive, default unbounded)

[{:crux.db/id "a15f8b81a160b4eebe5c84e9e3b65c87b9b2f18e",
  :crux.db/content-hash "c28f6d258397651106b7cb24bb0d3be234dc8bd1",
  :crux.db/valid-time #inst "2019-01-07T14:57:08.462-00:00",
  :crux.tx/tx-id 14,
  :crux.tx/tx-time #inst "2019-01-07T16:51:55.185-00:00"
  :crux.db/doc {...}}

 {...}]
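With several optional parameters in play, it can help to assemble the history URL programmatically. The sketch below is one way to do that in Python. The helper name `entity_history_url` is hypothetical, and sending boolean flags as the literal string `true` is an assumption about how the server reads them.

```python
from urllib.parse import quote, urlencode


def entity_history_url(node_url, key, sort_order, with_docs=False,
                       with_corrections=False, **bounds):
    """Build a GET /entity-history/[:key] URL from the documented parameters.

    `bounds` takes the time limits as keyword arguments, for example
    start_valid_time="..."; underscores are turned into hyphens.
    """
    params = [("sort-order", sort_order)]
    if with_docs:
        params.append(("with-docs", "true"))  # assumed boolean encoding
    if with_corrections:
        params.append(("with-corrections", "true"))  # assumed boolean encoding
    for name, value in bounds.items():
        params.append((name.replace("_", "-"), value))
    base = f"{node_url.rstrip('/')}/entity-history/{quote(key, safe=':')}"
    return base + "?" + urlencode(params)


print(entity_history_url("http://localhost:3000", ":ivan", "desc", with_docs=True))
```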

POST `/query`

Takes a Datalog query and returns its results.

curl -X POST \
     -H "Content-Type: application/edn" \
     -d '{:query {:find [e] :where [[e :last-name "Petrov"]]}}' \
     $nodeURL/query

#{[:boris][:ivan]}

Add :full-results? true to the query map to retrieve the source documents for the entities in the result set. For instance, to retrieve all documents in a single query:

curl -X POST \
     -H "Content-Type: application/edn" \
     -d '{:query {:find [e] :where [[e :crux.db/id _]] :full-results? true}}' \
     $nodeURL/query
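The same query request can be built from application code. The following Python sketch mirrors the first curl example using only the standard library; `build_query_request` is a hypothetical helper name, and the request is constructed but not sent, so it runs without a node.

```python
import urllib.request


def build_query_request(node_url, edn_query):
    """Build (but do not send) a POST /query request carrying an EDN body."""
    return urllib.request.Request(
        node_url.rstrip("/") + "/query",
        data=edn_query.encode("utf-8"),
        headers={"Content-Type": "application/edn"},
        method="POST",
    )


request = build_query_request(
    "http://localhost:3000",
    '{:query {:find [e] :where [[e :last-name "Petrov"]]}}',
)
# To send it against a running node:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

The response body is EDN text, so the caller still needs an EDN reader (or simple string handling) to interpret the result set.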

GET `/sync`

Waits until the Kafka consumer’s lag is back to 0 (i.e. until it no longer has pending transactions to write). The timeout is 10 seconds by default, but can be specified in milliseconds with the timeout parameter. Returns the transaction time of the most recent transaction.

curl -X GET "$nodeURL/sync?timeout=500"

#inst "2019-01-08T11:06:41.869-00:00"
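Because /sync returns the transaction time of the most recent transaction, a client behind a load balancer could compare that value with the tx-time of its own write before reading. The Python sketch below shows the idea. The names `parse_inst` and `wait_for_tx` are hypothetical, `fetch_sync` stands in for whatever performs the HTTP call, and the parser assumes the millisecond timestamp format shown in the examples on this page.

```python
import re
import time
from datetime import datetime

_INST = re.compile(r'#inst\s+"([^"]+)"')


def parse_inst(edn_text):
    """Extract the first #inst literal from EDN text as an aware datetime."""
    match = _INST.search(edn_text)
    if match is None:
        raise ValueError("no #inst literal found")
    # Assumes fractional seconds and a numeric offset, e.g. ...41.869-00:00
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")


def wait_for_tx(fetch_sync, tx_time, attempts=5, pause=0.1):
    """Poll until the node reports a transaction time of at least tx_time.

    fetch_sync is any zero-argument callable returning the /sync response body.
    """
    for _ in range(attempts):
        if parse_inst(fetch_sync()) >= tx_time:
            return True
        time.sleep(pause)
    return False
```

Here `tx_time` would come from the response to an earlier write, parsed with the same `parse_inst` helper.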

GET `/tx-log`

Returns a list of all transactions, from oldest to newest transaction time.

curl -X GET $nodeURL/tx-log

({:crux.tx/tx-time #inst "2019-01-07T15:11:13.411-00:00",
  :crux.api/tx-ops [[
    :crux.tx/put "c28f6d258397651106b7cb24bb0d3be234dc8bd1"
    #inst "2019-01-07T14:57:08.462-00:00"]],
  :crux.tx/tx-id 0}

 {:crux.tx/tx-time #inst "2019-01-07T15:11:32.284-00:00",
  ...})

POST `/tx-log`

Takes a vector of transaction operations (any combination of :put, :delete, :match, and :evict) and executes them in order. This is the only "write" endpoint.

curl -X POST \
     -H "Content-Type: application/edn" \
     -d '[[:crux.tx/put {:crux.db/id :ivan, :name "Ivan" :last-name "Petrov"}],
          [:crux.tx/put {:crux.db/id :boris, :name "Boris" :last-name "Petrov"}],
          [:crux.tx/delete :maria  #inst "2012-05-07T14:57:08.462-00:00"]]' \
     $nodeURL/tx-log

{:crux.tx/tx-id 7, :crux.tx/tx-time #inst "2019-01-07T16:14:19.675-00:00"}
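Clients that are not written in Clojure need to produce the EDN request body themselves. As a rough sketch, a few lines of Python can serialize simple transaction operations. The `Keyword` class and `to_edn` function are hypothetical helpers covering only keywords, strings, numbers, booleans, nil, maps and vectors; tagged literals such as #inst are not handled.

```python
import json


class Keyword(str):
    """Marks a string that should be written as an EDN keyword."""


def to_edn(value):
    """Serialize a small subset of Python values to EDN text."""
    if isinstance(value, Keyword):
        return ":" + value
    if isinstance(value, bool):  # checked before int, since bool is an int
        return "true" if value else "false"
    if value is None:
        return "nil"
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)  # double-quoted, escaped
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, dict):
        pairs = (f"{to_edn(k)} {to_edn(v)}" for k, v in value.items())
        return "{" + ", ".join(pairs) + "}"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_edn(v) for v in value) + "]"
    raise TypeError(f"cannot serialize {type(value).__name__} to EDN")


K = Keyword
tx_ops = [[K("crux.tx/put"), {K("crux.db/id"): K("ivan"), K("name"): "Ivan"}]]
print(to_edn(tx_ops))
```

The resulting string can be sent as the body of a POST to /tx-log with the application/edn content type.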
