Liking cljdoc? Tell your friends :D

Checkpointing

Crux nodes can save checkpoints of their query indices on a regular basis, so that new nodes can start to service queries faster.

Crux nodes that join a cluster have to obtain a local set of query indices before they can service queries. These can be built by replaying the transaction log from the beginning, although this may be slow for clusters with a lot of history. Checkpointing allows the nodes in a cluster to share checkpoints into a central 'checkpoint store', so that nodes joining a cluster can retrieve a recent checkpoint of the query indices, rather than replaying the whole history.

The checkpoint store is a pluggable module - there are a number of officially supported implementations:

  • Java’s NIO FileSystem (below)

  • AWS’s S3

  • GCP’s Cloud Storage

Crux nodes in a cluster don’t explicitly communicate regarding which one is responsible for creating a checkpoint - instead, they check at random intervals to see whether any other node has recently created a checkpoint, and create one if necessary. The desired frequency of checkpoints can be set using approx-frequency.

Setting up

You can enable checkpoints on your index-store by adding a :checkpointer dependency to the underlying KV store:

JSON
{
  "crux/index-store": {
    "kv-store": {
      "crux/module": "crux.rocksdb/->kv-store",
      ...
      "checkpointer": {
        "crux/module": "crux.checkpoint/->checkpointer",
        "store": {
          "crux/module": "crux.checkpoint/->filesystem-checkpoint-store",
          "path": "/path/to/cp-store"
        },
        "approx-frequency": "PT6H"
      }
    }
  },
  ...
}
Clojure
{:crux/index-store {:kv-store {:crux/module 'crux.rocksdb/->kv-store
                               ...
                               :checkpointer {:crux/module 'crux.checkpoint/->checkpointer
                                              :store {:crux/module 'crux.checkpoint/->filesystem-checkpoint-store
                                                      :path "/path/to/cp-store"}
                                              :approx-frequency (Duration/ofHours 6)}}}
 ...}
EDN
{:crux/index-store {:kv-store {:crux/module crux.rocksdb/->kv-store
                               ...
                               :checkpointer {:crux/module crux.checkpoint/->checkpointer
                                              :store {:crux/module crux.checkpoint/->filesystem-checkpoint-store
                                                      :path "/path/to/cp-store"}
                                              :approx-frequency "PT6H"}}}
 ...}

Checkpointer parameters

  • approx-frequency (required, Duration): approximate frequency for the cluster to save checkpoints

  • store: (required, CheckpointStore): see the individual store for more details.

  • checkpoint-dir (string/File/Path): temporary directory to store checkpoints in before they’re uploaded

  • keep-dir-between-checkpoints? (boolean, default true): whether to keep the temporary checkpoint directory between checkpoints

  • keep-dir-on-close? (boolean, default false): whether to keep the temporary checkpoint directory when the node shuts down

FileSystem Checkpoint Store parameters

  • path (required, string/File/Path/URI): path to store checkpoints.

Can you improve this documentation? These fine people already did:
James Henderson & Dan Mason
Edit on GitHub

cljdoc is a website building & hosting documentation for Clojure/Script libraries

× close