
Datomic Cloud Backup and Restore


This small library allows you to incrementally save the transaction log of a Datomic Cloud database (it may also work with on-prem, though on-prem already has an official backup story). Datomic Cloud (at the time of this writing) only provides a way to pull data into dev-local, but no way to store a proper incremental backup for business-continuity purposes.

Usage

This library supports creating a clone of a Datomic database by reading the transaction log and saving it incrementally to durable storage. Restoring the database involves playing this transaction log back into a new database. The one major complication is that the new database will assign different IDs to the entities as they are transacted, so we must keep track of these new IDs in the new database in order to remap them in later incremental steps.

Thus, there is a need for two kinds of durable storage (they have different characteristics, so they are represented as two protocols):

  • BackupStore - A place to store the actual transactions

  • IDMapper - A fast, durable K/V store that can track ID mappings in the new database over a potentially long period of time.

This library comes with a sample BackupStore based on AWS S3, and an IDMapper that uses Redis (via Carmine).

Extra Dependencies May Be Required

If you want to use these, then you MUST add the following libraries to your dependencies as needed (they are not included as transitive dependencies in case you want to use something else); a sample deps.edn fragment follows this list:

  • com.datomic/dev-local - if you want to play with it in dev mode

  • com.taoensso/carmine - if you want to use the RedisIDMapper

  • com.amazonaws/aws-java-sdk-s3 - if you want to use the S3BackupStore

  • com.taoensso/nippy - for S3BackupStore
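
A sample deps.edn fragment covering all four (the version strings are placeholders, not real releases; substitute the current versions):

;; Add to your deps.edn alongside this library. Versions shown are placeholders.
{:deps {com.datomic/dev-local         {:mvn/version "CURRENT-VERSION"}
        com.taoensso/carmine          {:mvn/version "CURRENT-VERSION"}
        com.amazonaws/aws-java-sdk-s3 {:mvn/version "CURRENT-VERSION"}
        com.taoensso/nippy            {:mvn/version "CURRENT-VERSION"}}}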

You will then need to add this library to the application you deploy to the cloud, and run an incremental backup on a periodic basis.
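
For example, here is a minimal sketch of such a periodic task using Java's ScheduledExecutorService and the backup-next-segment! function described below (the namespace, interval, and per-run transaction count are illustrative choices, not recommendations):

(ns example.backup-schedule
  (:require
    [com.fulcrologic.datomic-cloud-backup.cloning :as cloning])
  (:import (java.util.concurrent Executors TimeUnit)))

(defn start-periodic-backup!
  "Schedule an incremental backup every `minutes` minutes. Returns the executor
   so the caller can shut it down later."
  [dbname source-connection target-store minutes]
  (doto (Executors/newSingleThreadScheduledExecutor)
    (.scheduleAtFixedRate
      ;; Try up to 1000 transactions per run; it stops early at the end of the log.
      (fn [] (cloning/backup-next-segment! dbname source-connection target-store 1000))
      0 minutes TimeUnit/MINUTES)))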

Sample Backup Strategy

Here’s a very naive implementation that would allow you to clone an entire database using resources in the cloud.

(ns com.fulcrologic.datomic-cloud-backup.cloning-test
  (:require
    [datomic.client.api :as d]
    [com.fulcrologic.datomic-cloud-backup.protocols :as dcbp]
    [com.fulcrologic.datomic-cloud-backup.cloning :as cloning]
    [com.fulcrologic.datomic-cloud-backup.ram-stores :refer [new-ram-store new-ram-mapper]]
    [com.fulcrologic.datomic-cloud-backup.s3-backup-store :refer [new-s3-store aws-credentials?]]
    [com.fulcrologic.datomic-cloud-backup.redis-id-mapper :refer [new-redis-mapper available? clear-mappings!]]
    [fulcro-spec.core :refer [specification behavior component assertions =>]])
  (:import (java.util UUID)))

(defonce client (d/client {:server-type :dev-local
                           :storage-dir :mem
                           :system      "test"}))

(defn backup! [dbname source-connection target-store]
  ;; Normally, you'd not write this as a tight loop, but would instead monitor if it is returning
  ;; positive numbers and pause when it isn't, as an infinite loop to keep streaming.
  (loop [n (cloning/backup-next-segment! dbname source-connection target-store 2)]
    (when (pos? n)
      (recur (cloning/backup-next-segment! dbname source-connection target-store 2)))))

(defn restore! [dbname target-conn db-store mapper]
  ;; Normally you'd have some hot standby continuously running this loop. The `next-start` may not yet
  ;; be available, so if it stays the same, delay for a bit to see if a new one arrives.
  (loop [start-t 0]
    (let [next-start (cloning/restore-segment! dbname target-conn db-store mapper start-t {})]
      (when (< start-t next-start)
        (recur next-start)))))

(comment
  ;; Clone a db using RAM resources. Probably runs out of memory for anything
  ;; but small dbs.
  (let [_ (d/create-database client {:db-name "restored"}) ; the target db must exist before connecting
        c (d/connect client {:db-name "some-database"})    ; assumed to already exist
        target-c (d/connect client {:db-name "restored"})
        store (new-ram-store)
        mapper (new-ram-mapper)]
    (backup! :some-database c store)
    (restore! :some-database target-c store mapper))

  ;; Same thing, but uses S3 resources and Redis. Should handle any db size as
  ;; long as Redis has space for all of the ID mappings
  (let [_ (d/create-database client {:db-name "restored"})
        c (d/connect client {:db-name "some-database"})
        target-c (d/connect client {:db-name "restored"})
        store (new-s3-store "my-s3-bucket")
        mapper (new-redis-mapper {:spec {:host "localhost"}})]
    (backup! :some-database c store)
    (restore! :some-database target-c store mapper)))

The Redis connection options are as described in the Carmine docs.
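
For example (values are illustrative; see the Carmine documentation for the full set of supported keys):

;; Illustrative Redis connection options, passed through to Carmine as-is.
(def mapper
  (new-redis-mapper {:spec {:host     "redis.internal.example.com"
                            :port     6379
                            :password "my-redis-password"}}))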

Notes:

  • The call to backup-next-segment! figures out where to start automatically. You tell it how many transactions to attempt (it stops early if there are no more), and it returns how many were done. Thus, if it returns 0 (or fewer than you asked for), it is at or very near the end. If you sleep for some amount of time and try again, you'll be doing hot streaming replication of your database.

  • The call to restore-segment! returns the next segment start time that it needs in order to continue. If you call it with a start-t that does not yet exist, then it will return that same start-t. Thus, you could sleep for some time and try again. Doing so in an infinite loop results in streaming restoration of your database.

The Redis mapper is only needed, as you can see, on the restore side. The store needs to be visible on both sides. Thus, an S3 bucket with proper permissions is the perfect way to share the streaming data across regions/zones.
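
Putting these notes together, a streaming restore is just the restore! loop above plus a delay whenever no new segment has arrived. A minimal sketch, reusing the requires from the example above (the 10-second delay is an arbitrary illustrative choice):

(defn streaming-restore!
  "Continuously restore segments as they arrive in the store. Never returns."
  [dbname target-conn db-store mapper]
  (loop [start-t 0]
    (let [next-start (cloning/restore-segment! dbname target-conn db-store mapper start-t {})]
      (when (= next-start start-t)
        ;; The segment starting at start-t has not been backed up yet; wait for it.
        (Thread/sleep 10000))
      (recur next-start))))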

Low-level Backups

The naive implementation above uses the store itself to find the next place to resume the backup.

The base utility function backup-segment! is a little more general-purpose. It allows you to explicitly specify the transaction range to back up. This would allow you to track your progress in an alternate fashion, or build more complex approaches to backing up large amounts of data.
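
For example, backing up a fixed range might look like the following sketch. Note that the exact argument order of backup-segment! is an assumption here; consult its docstring before relying on it:

;; ASSUMPTION: the argument order (dbname, source connection, store, start-t,
;; end-t) is inferred from the higher-level API, not confirmed; check the docstring.
(cloning/backup-segment! :some-database c store 0 1000)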

One such approach is documented in the next section.

Parallel Backups

The function parallel-backup! is useful when you want to run an initial backup of a large database. It runs a complete backup of the database using pmap to provide concurrency that adjusts to the hardware at hand.

My preliminary tests, using a t3.medium instance with an S3 store in an alternate region while backing up 1000 transactions per S3 object, can back up about 40k transactions per minute using the parallel backup utility. This speed, of course, could vary dramatically based on your average datom size, and might change as the I/O subsystems scale out.
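
Usage might look like the following sketch. The argument list shown is an assumption based on the description above; consult the docstring of parallel-backup!:

;; ASSUMPTION: arguments (dbname, source connection, store, transactions per
;; S3 object) are inferred, not confirmed; check the docstring before use.
(cloning/parallel-backup! :some-database c store 1000)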

NOTES

  • The :db/id of entities in the new database will not match the :db/id of the old entities in the old database. Refs will be fixed up accordingly, but if you store external data keyed by :db/id (e.g. S3 data indexed by :db/id) then those things will need to be remapped as well (which the IDMapper can help with).

  • So far this library is very lightly tested (soon to be remedied).

  • You cannot write transactions to the target database during a restore and expect the restore to be able to continue. The restore tries to include the reified transactions in order to maintain any auditing data you may have added to the database. As such, it also restores the original transaction time, which it cannot do if you transact something whose time is close to "now".

  • Using an elision predicate on the restore can cause transactions to become empty. Those transactions will be skipped.

License

The MIT License (MIT) Copyright (c) 2017-2019, Fulcrologic, LLC

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
