Copy a database with history via transaction logs. Based on code from Cognitect.
*id-cache-max-size*
Maximum size for ID caches. Default is 1,000,000 entries (~48MB per cache). Can be bound to a different value before calling restore functions.
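A minimal sketch of tuning the cache before a restore. The namespace alias `dcb` is an assumption; use whatever alias you require this namespace under:

```clojure
;; A sketch: shrink the per-database ID cache on a memory-constrained node.
(binding [dcb/*id-cache-max-size* 100000]
  (dcb/restore-segment! "prod-copy" target-conn backup-store {}))
```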
(-load-transactions backup-store dbname start-t)
Protocols don't mock well. This is just a wrapper to facilitate a testing scenario.
(all-refs db)
Return a set of all :db.type/ref entity IDs from the provided DB.
(avet-lookup db old-id)
Performs the actual AVET index lookup for an original ID. Returns the new entity ID or nil if not found.
(backup! dbname
         connection
         store
         {:keys [txns-per-segment starting-segment parallel?]
          :or {txns-per-segment 10000 starting-segment 0 parallel? true}})
Run a backup.

* dbname - What to call this backup in the store. Does not have to match the actual database name on the connection.
* connection - The connection of the database to back up.
* store - The TransactionStore to write to.
* options:
  * :txns-per-segment - The number of transactions to try to put in each segment of the backup. Defaults to 10,000.
  * :starting-segment - The starting segment of the backup. Use this to continue a previous backup. The starting-segment times the txns-per-segment determines the transaction on which the backup will start. Defaults to 0.
  * :parallel? - Run segment backups in parallel. Defaults to true. WARNING: This can stress the DynamoDB layer. The backup utility will retry on capacity exceptions, but beware that your application could also receive capacity exceptions.

This function retries if there is any kind of transient failure on any segment. A persistent failure can cause the backup to terminate early with an exception.
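A sketch of a one-shot backup call; `conn` and `store` are assumed to come from your own setup:

```clojure
;; A sketch: back up under the name "prod-copy", with smaller segments
;; and no parallelism to go easy on DynamoDB provisioning.
(backup! "prod-copy" conn store {:txns-per-segment 5000
                                 :parallel?        false})
```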
(backup-gaps all-segments)
Looks for gaps in the given segments (maps of :start-t and :end-t inclusive, which is what the db store's `saved-segment-info` returns). Returns a possibly empty sequence of maps containing the :start-t (inclusive) and :end-t (exclusive) of any gaps that are found in the segment list. These are what can be passed to `backup-segment!` to fill that gap.
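A sketch of listing gaps, assuming `saved-segment-info` takes the store and database name as the docstring suggests:

```clojure
;; A sketch: list any holes in the "prod-copy" backup.
(backup-gaps (saved-segment-info store "prod-copy"))
;; => e.g. ({:start-t 50001 :end-t 60000}) when a segment is missing
```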
(backup-next-segment! database-name source-conn backup-store max-txns)
Computes the next transaction range that is missing from `backup-store` and writes up to `max-txns` to the store. Returns the number of transactions written (if that is `max-txns`, then there may be more transactions that need backing up). This call has to analyze the names of all segments, and therefore should not be called too frequently if that is expensive on your backup-store. See also `backup-segment!` if you want to implement your own segmentation strategy.
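A sketch of driving incremental backups in a loop until the store is caught up:

```clojure
;; A sketch: keep writing up to 1,000 transactions per call; a smaller
;; return value means the store has caught up with the database.
(loop []
  (when (= 1000 (backup-next-segment! "prod-copy" conn store 1000))
    (recur)))
```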
(backup-segment! database-name source-conn backup-store start-t end-t)
Takes an explicit transaction range, and copies it from the database to the `backup-store`. `end-t` is exclusive. Returns the real {:start-t a :end-t b} that was actually stored, which may be nil if nothing was stored.
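A sketch of using this with `backup-gaps` to fill holes by hand (`saved-segment-info`'s argument order is assumed):

```clojure
;; A sketch: fill every reported gap; backup-gaps already returns
;; an exclusive :end-t, so the values can be passed straight through.
(doseq [{:keys [start-t end-t]} (backup-gaps (saved-segment-info store "prod-copy"))]
  (backup-segment! "prod-copy" conn store start-t end-t))
```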
(bookkeeping-txn {:keys [db] :as env} {:keys [t data]})
Generates a transaction that includes a CAS that verifies the current basis t of the database is the intended target, and then adds ::original-id datoms for every tempid necessary in the transaction to track the original source (as in source database) entity. This allows future transactions to simply look up the current ID of an original ID during ID resolution.
(cache-lookup {:keys [cache max-eidx]} old-id)
Look up old-id in the cache. Uses monotonic detection to skip the cache for definitely-new IDs. Returns the new-id if found, nil otherwise.
(cache-store! {:keys [cache max-eidx]} old-id new-id)
Store the old-id -> new-id mapping and update max-eidx if needed.
(eid->eidx eid)
Extract the 42-bit entity index from a Datomic entity ID.
Mask for extracting the 42-bit entity index from an entity ID.
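A sketch of the arithmetic involved. Datomic entity IDs pack a partition in the high bits and a monotonically assigned index in the low 42 bits; the var names below are assumptions for illustration:

```clojure
;; A sketch: the low 42 bits of an entity ID are the entity index.
(def eidx-mask (dec (bit-shift-left 1 42)))  ; 0x3FFFFFFFFFF

(defn eid->eidx [eid]
  (bit-and eid eidx-mask))

;; 17592186045418 is 4 * 2^42 + 1002: partition 4, entity index 1002.
(eid->eidx 17592186045418)  ;; => 1002
```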
(find-segment-start-t store dbname desired-start-t)
When resuming a backup, the last t in the database likely won't be on a segment boundary. This scans the real segments to find the correct start-t for loading.
(get-id-cache db-name)
Get or create an ID cache state for the given database name. Returns a map with :cache (the LRU cache) and :max-eidx (atom tracking max entity index). Uses *id-cache-max-size* for the cache size limit.
Global ID cache state. Maps db-name -> {:cache LRUCache :max-eidx atom}
(id-cache-stats {:keys [cache max-eidx]})
Get stats for an ID cache state, including both cache stats and max-eidx.
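A sketch of inspecting the cache during a long restore; the exact keys in the returned map depend on the implementation:

```clojure
;; A sketch: check how the ID cache for "prod-copy" is doing mid-restore.
(id-cache-stats (get-id-cache "prod-copy"))
;; => a map of cache stats plus the current :max-eidx (exact keys vary)
```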
(is-new-id? {:keys [max-eidx]} old-id)
Returns true if old-id's entity index is greater than the max entity index seen so far (max-eidx), meaning it MUST be a new entity (never been restored before).
A map from database name -> tx data that should be moved to the next txn.
(record-new-ids! cache-state tempids)
After a successful transaction, record all the new entity ID mappings. The tempids map has string keys (original IDs as strings) and new DB IDs as values.
(repair-backup! dbname conn db-store)
Looks for problems with the backup of `dbname` in `db-store` and repairs them. This means it:

* Finds gaps in the backup segments and runs backups of them.

WARNING: repairing large gaps can cause a lot of I/O on the database. You may want to check the gaps using `backup-gaps` before running this blindly.
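A sketch of the cautious workflow the warning suggests (again assuming `saved-segment-info` takes the store and database name): look before you repair.

```clojure
;; A sketch: only run the repair after eyeballing the gaps it would fill.
(let [gaps (backup-gaps (saved-segment-info db-store "prod-copy"))]
  (println "gaps to repair:" gaps)
  (when (seq gaps)
    (repair-backup! "prod-copy" conn db-store)))
```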
(reset-id-cache! db-name)
Reset the ID cache for a database. Useful for testing.
(resolve-id {:keys [db tx-id id->attr cache-state verify?]} old-id)
Finds the new database's :db/id for the given :db/id, or returns a stringified version of the ID for use as a new tempid.

When a cache-state is provided, uses the cache to avoid AVET lookups:

* If the entity index > max-seen-eidx, it's definitely NEW (no lookup needed).
* Otherwise checks the cache, falling back to AVET lookup on a cache miss.

When verify? is true and we detect a 'new' ID via the monotonic check, we verify the assumption ~1% of the time by actually checking the database.
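Not the library's implementation, but a sketch of the decision cascade the docstring describes, composed from the helpers documented on this page:

```clojure
;; A sketch of the resolution order (illustrative, not the actual source):
(defn resolve-id-sketch [{:keys [db cache-state verify?]} old-id]
  (if (is-new-id? cache-state old-id)
    (do
      ;; monotonic check says the ID was never restored; spot-check ~1% of the time
      (when (and verify? (should-verify?))
        (verify-new-id-assertion! db old-id))
      (str old-id))                          ; stringified ID becomes a tempid
    (or (cache-lookup cache-state old-id)    ; fast path: LRU cache
        (avet-lookup db old-id)              ; slow path: AVET index lookup
        (str old-id))))                      ; not restored yet -> new tempid
```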
(resolved-txn {:keys [db id->attr source-refs] :as env}
              {:keys [t data] :as tx-entry})
This function rewrites an incoming transaction log `tx-entry` into a Datomic transaction that remaps the necessary IDs to maintain referential integrity. The transaction will include what `bookkeeping-txn` outputs, and all of the transaction data as new transaction operations that have had their IDs remapped according to what has already been restored (e and a, and v when a is a ref). This function requires the current target db (for resolving original IDs), `id->attr` for finding the mappings from the base Datomic schema in the old database to the new one, and a set of attribute db ids (in the source db) that represent ref attributes in the source database (`source-refs`).
(restore!! source-database-name
           target-conn
           backup-store
           {:keys [blacklist rewrite verify?] :or {blacklist #{} verify? true}})
Restore as much of the database as possible. This is an interruptible call (you can reboot and nothing will be harmed). This function does as much pipelining and optimization as possible to try to restore the database as quickly as possible. You should check DynamoDB write provisioning to make sure it is not throttling writes, and make sure you're using a large enough node that you are not CPU bound.

This function can run for a *very* long time, depending on database size (days or even weeks).

See `restore-segment!` for a slower version that returns after each segment is restored.

The arguments are:

* source-database-name - The name used for this backup.
* target-conn - The connection of the target db on which to restore.
* backup-store - A durable store that was used to save the database.
* options - A map of other options:
  * :blacklist - A set of keywords (attributes) to elide from all transactions.
  * :rewrite - A map from attribute keyword to a `(fn [attr value] new-value)` that can be used to obfuscate the original value to a new one.
  * :verify? - Enable 1% verification of new ID assertions (default true). Set to false for maximum speed.
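A sketch of a full restore with obfuscation; both attribute names are hypothetical:

```clojure
;; A sketch: restore everything, dropping a sensitive attribute and
;; rewriting emails (attribute names made up for illustration).
(restore!! "prod-copy" target-conn backup-store
  {:blacklist #{:user/ssn}
   :rewrite   {:user/email (fn [_attr v] (str (hash v) "@example.com"))}
   :verify?   true})
```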
(restore-segment! source-database-name
                  target-conn
                  backup-store
                  {:keys [blacklist rewrite verify?]
                   :or {blacklist #{} verify? true}})
Restore the next segment of a backup. Auto-detects (from the target database) where to resume.

Returns one of the following values:

* :restored-segment - Real work was done to restore more data.
* :nothing-new-available - The backup is restored to the current end. There was nothing to do.
* :transaction-failed! - The restore tried to restore data, but something went wrong. Check the logs. MAY work if re-attempted (e.g. if it was a temporary Datomic outage).
* :partial-segment - The restore found a segment that was supposed to have more data in it than it really had.

The arguments are:

* source-database-name - The name used for this backup.
* target-conn - The connection of the target db on which to restore.
* backup-store - A durable store that was used to save the database.
* options - A map of other options:
  * :blacklist - A set of keywords (attributes) to elide from all transactions.
  * :rewrite - A map from attribute keyword to a `(fn [attr value] new-value)` that can be used to obfuscate the original value to a new one.
  * :verify? - Enable 1% verification of new ID assertions (default true). Set to false for maximum speed.
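A sketch of a resumable restore loop built on the return values above:

```clojure
;; A sketch: restore segment by segment until caught up or something breaks.
(loop []
  (case (restore-segment! "prod-copy" target-conn backup-store {})
    :restored-segment      (recur)
    :nothing-new-available :done
    :partial-segment       :wait-for-more-backup
    :transaction-failed!   :check-the-logs))
```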
(rewrite-and-filter-txn {:keys [to-one? id->attr rewrite blacklist]} transaction)

(should-verify?)
Returns true approximately `*verification-rate*` of the time.
(tx-time {:keys [data]})
Returns the transaction time of a tx-entry.
(verify-new-id-assertion! db old-id)
Verifies that an ID we asserted was NEW actually doesn't exist in the database. Throws an exception if the assertion was wrong (the ID exists when we thought it was new). This is called on ~1% of 'new' IDs to catch any bugs in the monotonic assumption.