This document captures the design decisions behind stratum's bitemporal write and read surface.
We follow Snodgrass / SQL:2011 with two consistent renames for context:
| Concept | Where | Term |
|---|---|---|
| When a fact was true in the modelled world | everywhere | valid-time |
| When the DB recorded the fact (Clojure / EDN APIs) | stratum query DSL, Clojure Datalog surfaces | tx-time |
| Same concept, surfaced through SQL | stratum SQL, PG-wire SQL surfaces | system-time |
| Reading "two axes at once" | everywhere | bitemporal |
The axis values use SQL:2011 column names verbatim:
_valid_from, _valid_to, _system_from, _system_to. All are
:int64 :temporal-unit :micros. Open-ended intervals use
Long/MAX_VALUE as the sentinel for _to columns.
A dataset opts into bitemporal semantics via :metadata:
(st/make-dataset
cols
{:metadata
{:bitemporal
{:valid {:from-col :_valid_from
:to-col :_valid_to
:unit :micros}
:system {:from-col :_system_from
:to-col :_system_to
:unit :micros}}}})
Either axis is optional. Datasets with only :valid are valid-only;
datasets with only :system are system-versioned; datasets with
both are fully bitemporal. Datasets with neither are non-temporal —
existing semantics, no overhead.
The two axes are processed identically by every code path. The following all apply per-axis:
:temporal-unit at make-dataset / load_from and _toFOR VALID_TIME / FOR SYSTEM_TIME loweringstratum.dataset/bitemporal-config returns the validated config map
or nil. Helpers valid-time-config and system-time-config return
the per-axis subset.
Configured axis columns are stamped :nullable? false in their
column metadata. Every write primitive (append!, upsert!,
retract!, bounded-update!) rejects nil row-map entries for axis
columns at runtime, and the SQL surface refuses an INSERT /
INSERT FOR PORTION OF whose period bound or axis-column value
would land NULL.
This frees the int64 NULL sentinel value (Long/MIN_VALUE) on axis
columns to mean START_OF_TIME unambiguously — the SQL
preprocessor at sql/rewrite.clj lowers the START_OF_TIME literal
to Long/MIN_VALUE, and END_OF_TIME to Long/MAX_VALUE. Allen
predicates over these sentinels work as expected: _valid_from < END_OF_TIME is always true; _valid_from >= START_OF_TIME is
always true. There is no overlap with NULL semantics on axis cols
because axes can't be NULL.
If you need "unknown valid-time" in your domain, model it as an absent row (the row doesn't appear until the valid-from is known) rather than as a NULL axis bound.
System-time is already implicit at the konserve commit layer — every
sync! produces a commit with a timestamp. So why store
_system_from / _system_to per row?
Granularity. Commit-level system-time answers "what did the whole
dataset look like as of commit C?". Row-level system-time answers
"what did the DB know about this row at instant S?" — the
distinction matters whenever a single commit touches many entities
(common in batch loads) or when readers want to ask (vt, st) 2D
queries directly off the row.
The cost is two extra long columns per row when the dataset opts
in. Adapters that don't need row-level system-time leave the config
unset, and stratum falls back to commit-time. The choice is per-table,
not global.
_system_from is set at write time and never overwritten — it pins
when the database first knew the row. _system_to starts at
Long/MAX_VALUE ("still open") and is closed only when an SCD2
correction supersedes the row. Together the pair forms a half-open
period [from, to) describing the system-time interval during which
the row's valid-time window was the database's current belief.
Concretely, after a backdated correction:
_valid_from and _valid_to
untouched. Its _system_to is closed to now. It still represents
"what the DB used to believe", reachable by FOR SYSTEM_TIME AS OF <past> queries._system_from = now, carrying the new valid-time window(s) and any
updated values.A reader filters with _system_from <= s AND _system_to > s to get
the rows the DB believed at system-instant s, plus the standard
_valid_from <= v AND _valid_to > v for the valid-time slice. Two
half-open intervals, two AND'd predicates, four columns — that's the
whole model.
_system_to instead of keeping rows append-only?Stratum stores rows in a classical SCD2 layout: every row is a tuple
that physically exists at one location in its columnar dataset. To
distinguish "row was the DB's belief at time s" from "row has been
superseded", we need some per-row marker of the supersession point.
The two natural shapes are:
_system_to on the
old row when it's superseded. One predicate at read time
(_system_to > s), one column write per correction._system_from <= s for
each (eid, vt-slice).The closed-period approach is the right fit for stratum's read-heavy analytical workload: a single predicate filter is cheaper than a per-entity merge pass over event history. The append-only approach is what event-sourced bitemporal stores do; it trades read-time work for write-time simplicity.
_system_to at past system-timeThe closed-period design has one subtlety. If you query
SELECT _system_to FROM t FOR SYSTEM_TIME AS OF <past>, the value
you'll see for the old row is the closure instant, not the
Long/MAX_VALUE it held at the queried system-time. The
(from <= s < to) predicate still answers visibility correctly —
the row's row-visibility at past system-time is preserved — but the
literal column value is the post-mutation one.
Practical impact for accounting / audit workloads: low. Queries ask
"which rows were the DB's belief at time s?", not "what was each
row's _system_to value at s?". The visibility filter is the
audit-relevant question, and it works. Workloads that need the
literal historical _system_to (regulatory reconstructions of "what
did the DB show at exactly that instant") should derive it from
the next-later row's _system_from for the same entity — or use a
storage layer that's append-only by construction.
Three primitives on IDataset (called on a transient dataset, mutate
in-place, return the transient for threading):
(append! tds rows-or-cols
{:valid-from inst ; defaults to now (wall-clock or :db/txInstant override)
:valid-to inst ; defaults to Long/MAX_VALUE (open)
:system-from inst ; defaults to now (shared with valid-from default)
:system-to inst ; defaults to Long/MAX_VALUE (open)})
(upsert! tds {:where [pred ...] ; stratum WHERE clause
:set {col val}
:auto-split? false}
{:valid-from inst ; defaults to now
:valid-to inst}) ; defaults to Long/MAX_VALUE
(retract! tds {:where [pred ...]
:auto-split? false}
{:valid-from inst ; defaults to now (close-point)
:valid-to inst}) ; explicit presence selects bounded
; surgery (FOR PORTION OF semantic);
; absence = open-window close
All three accept an optional tx-meta second arg with the temporal
keys (:valid-from, :valid-to, :system-from, :system-to); any
key absent falls back to wall-clock now for :_from and
Long/MAX_VALUE for :_to. Pinning the clock for deterministic
test runs is done via :db/txInstant in tx-meta or by binding the
dynamic var the dataset reads (*clock-time-millis*).
append! writes new rows with the supplied or defaulted vt /
system windows. No close. Pure insertion.
upsert! is the SCD2 close-and-reopen primitive:
:where whose _valid_to (or _system_to)
still spans the new window._valid_to to the new :valid-from (write the close
intervals).:set values and the new vt /
system windows.retract! has two shapes determined by tx-meta presence:
:valid-to in tx-meta): "close without reopen"
— find matching rows whose _valid_to is Long/MAX_VALUE, set
_valid_to to :valid-from. Logical retraction at the close point.FOR PORTION OF (:valid-to in tx-meta): per-row
surgery — drop / truncate-vt / truncate-vf / split — preserving
non-overlap parts. Implements SQL:2011 FOR PORTION OF VALID_TIME … DELETE semantics.Physical purge stays on ds-delete-rows! — never auto-triggered by
retract!.
Stratum core doesn't know what "an entity" is. The :eid convention
is application-specific (one consumer uses :eid, a text-search
adapter would use a doc-id, a vector-search adapter a vector-id). A
predicate-driven primitive stays generic and reuses the WHERE-clause
DSL the planner already understands — zone-map pruning and parallel
scan come for free.
Callers that do have an entity key wrap the primitive trivially:
(defn upsert-entity! [ds eid attrs vt-from]
(-> ds transient
(upsert! {:where [[:= :eid eid]] :set attrs}
{:valid-from vt-from})
persistent!))
INSERT and DELETE both have a "no close needed / no reopen needed"
half-shape. UPDATE has both. Folding them into one vt-write! with
mode flags conflates ergonomics and obscures the SCD2 invariant. Three
primitives mirror SQL:2011 DML and SCD2 textbook patterns.
A write whose [vf, vt) window overlaps an already-closed historical
window on the same matching rows is a backdated correction. There
are three reasonable behaviors:
:auto-split? true, implemented). Reshape the
matching rows so the new write slots in without violating the
without-overlaps invariant. Per row, the action depends on its
position relative to the new write's [new-vf, MAX):
row-vf < new-vf < row-vt — partial left overlap; truncate
the row's vt down to new-vf. The historical prefix
[row-vf, new-vf) is preserved.row-vf >= new-vf — row is entirely superseded by the new
write. How the supersession is recorded depends on the
table's bitemporal mode:
:bitemporal {:valid …}, no :system):
the superseded row is dropped physically via
ds-delete-rows! (fans idx-delete! across every
column). No system-time view of the prior state is kept.:bitemporal {:valid … :system …}):
the superseded row is closed logically — its
_system_to is shifted from MAX to the current sys-now
and the row stays in storage. The system-time view at
t < sys-now continues to see the prior state; only
the as-of-now view treats it as gone. This is the
standard SCD2-on-both-axes behaviour and is what makes
the audit trail walkable.
Then upsert! appends the new merged row (or degenerates to a
plain insert if no :close-safe rows matched). retract! runs
the same reshape with no append step. SQL:2011
"application-time-period tables" allow this surgical behavior
under the non-sequenced UPDATE mode.INSERT FOR PORTION OF VALID_TIME and UPDATE FOR PORTION OF VALID_TIME are append-only with respect to pre-existing data.
Stratum follows SQL:2011 non-sequenced semantics: each matching row
gets independently updated (or, for INSERT, the new row is appended
without checking whether the slice was already populated).
If two pre-existing rows already overlap each other within the
target slice (a data-quality issue), bounded UPDATE produces two
overlapping output slices — both with fresh _system_from. Reads
at the new system-time will see both. Stratum does not enforce
WITHOUT OVERLAPS at write time. Event-sourced bitemporal stores
can defer this to a read-time merge step that resolves which
event "wins" at each (vt-time, system-time) point; stratum's
mutate-in-place storage leaves the responsibility on the writer.
To get surgical replacement (drop the slice + assert one new row),
compose DELETE FOR PORTION OF VALID_TIME … and INSERT FOR PORTION OF VALID_TIME … as two statements in your transaction.
Reject-by-default matches kontor's audit constraints (an accountant filing a backdated invoice shouldn't have one write produce three rows of audit trail without explicit opt-in). Auto-split is opt-in and per-call, not per-dataset, so the policy decision lives at the edit site.
Classification of matching rows vs the new write's [new-vf, MAX)
window (implemented in upsert! / retract!):
:close-safe — open row (vt = MAX) AND vf < new-vf. Closing
its vt to new-vf produces a valid, non-empty window. The SCD2
close-and-reopen path applies.:no-conflict — closed row entirely before the new write
(vt <= new-vf). Ignored; no overlap.:overlaps — any other matching row. Reject (default) or
auto-split (:auto-split? true). Includes:
vf >= new-vf (would create a backwards window) —
auto-split drops it.vt > new-vf (the new write would land inside
an already-closed historical period) — auto-split truncates if
row-vf < new-vf, drops otherwise.Right-partial and middle-overlap classes don't appear in the current
API because the new write's vt is always MAX in upsert! /
retract!. They become reachable once SQL FOR PORTION OF VALID_TIME FROM x TO y lowers to bounded windows in Phase D.
Cost: one extra full scan over matching rows per write (predicate eval in pure Clojure). Future Phase C+ work can push the predicate through the planner's zone-map pruner for sublinear scaling.
upsert! degenerate-to-insert. When no rows match the :where
predicate at all, upsert! falls through to a plain append! of the
:set payload — matching the spirit of SQL INSERT … ON CONFLICT UPDATE. The new row's axis columns are auto-stamped just like a
direct append!.
Stratum's SQL recognises SQL:2011 temporal forms directly:
-- per-query SQL:2011 SELECT forms (both axes supported and
-- composable on the same SELECT)
SELECT * FROM employees
FOR VALID_TIME AS OF '2024-04-15'
FOR SYSTEM_TIME AS OF '2024-04-20';
SELECT * FROM employees
FOR VALID_TIME BETWEEN '2024-01-01' AND '2024-12-31';
-- system-time-only: "what did the DB believe about this entity
-- before the correction at sys-instant S?" — the audit demo of
-- closed-period SCD2.
SELECT salary FROM employees
FOR SYSTEM_TIME AS OF '2024-04-10'
WHERE id = 42;
-- bitemporal DML
INSERT INTO employees (id, salary, _valid_from, _valid_to)
VALUES (1, 100000, '2024-01-01', '2024-07-01');
UPDATE employees FOR PORTION OF VALID_TIME FROM '2024-07-01' TO '2024-12-01'
SET salary = 110000
WHERE id = 1;
DELETE FROM employees FOR PORTION OF VALID_TIME FROM '2024-09-01' TO '2024-10-01'
WHERE id = 1;
-- clock pinning for repeatable runs (session-scoped)
SET datahike.clock_time = '2024-04-15T00:00:00Z';
JSqlParser 5.2 doesn't grok these tokens, so the path is a pre-tokenise
rewriter in stratum.sql.rewrite/preprocess-sql: strip
FOR PORTION OF VALID_TIME FROM x TO y to a side-channel :period
map before JSqlParser sees the SQL, then attach :period to the
parsed :ddl for the translator.
DELETE FOR PORTION OF VALID_TIME lowers to dataset/retract!
with :valid-from + :valid-to in tx-meta. retract! performs
the bounded surgery (truncate / shift / split / drop) per
overlapping row.INSERT FOR PORTION OF VALID_TIME lowers to dataset/append! with
the period values stamping the configured _valid_from /
_valid_to columns. The SQL must provide an explicit column list
(INSERT INTO t (a, b) VALUES (...)) — the period fills in the
temporal columns.UPDATE FOR PORTION OF VALID_TIME lowers to dataset/bounded-update!
(SQL:2011 non-sequenced UPDATE): per row, the overlap portion gets
the merged :set values, the non-overlap parts retain the
original values. Internally decomposes into retract! over the
slice + an append! per captured overlap.INSERT … ON CONFLICT … FOR PORTION OF VALID_TIME is explicitly
rejected — the conflict target's semantics under non-sequenced
semantics are not well-defined; users can compose DELETE FOR PORTION OF + INSERT FOR PORTION OF (or INSERT FOR PORTION OF
UPDATE FOR PORTION OF) instead.FOR (ALL | PORTION OF) SYSTEM_TIME DML is not exposed via SQLFOR PORTION OF SYSTEM_TIME and FOR ALL SYSTEM_TIME on
INSERT / UPDATE / DELETE are rejected with a clear error
(:sql/for-portion-of-system-time-unsupported) rather than
silently dropped. This is a deliberate design choice, not a
deferred feature.
Rationale: system-time is the auto-stamped recording-fact axis.
The whole audit story stratum's closed-period SCD2 preserves
(superseded rows kept in storage with _system_to shifted to
sys-now; queryable via the SELECT-side FOR SYSTEM_TIME AS OF
grammar above) depends on the engine being the sole authority
on what the DB recorded and when. Allowing SQL clients to rewrite
past system-time slices would turn the audit trail into
mutable storage, undermining its evidentiary value.
This matches the posture of every major SQL system that
implements WITH SYSTEM VERSIONING:
| System | FOR PORTION OF SYSTEM_TIME DML? | Stance |
|---|---|---|
| SQL Server | ✗ Rejected | System-time is engine-controlled |
| MariaDB | ✗ Rejected | Same |
| PostgreSQL (17/18 syntax landing) | ✗ Not exposed | Same |
| Oracle Flashback | ✗ N/A | Undo-log replay; not user-writable |
| IBM Db2, Teradata | ✓ Supported | Allow per-row regulated tampering |
Stratum takes the audit-integrity stance.
Escape hatch for legitimate historical-import use cases: the Clojure DSL exposes the underlying primitive directly —
;; Append a row with a user-pinned past system-from (e.g. for
;; migrating data we *knew* on 2020-03-15 but are only now
;; recording in stratum):
(-> ds
transient
(dataset/append! {:eid 42 :salary 100000
:_valid_from <inst> :_valid_to <inst>}
{:system-from #inst "2020-03-15"})
persistent!)
upsert! / retract! / bounded-update! all honor
:system-from in tx-meta via system-now-from-tx-meta. This is
appropriate for trusted internal tooling (data migrations,
regulator replay) — code that's already running with full
authority over the dataset. SQL is the line where untrusted /
external write paths cross, and that's where the rejection
applies.
Plain DELETE WHERE / INSERT VALUES / UPDATE WHERE (without
FOR PORTION OF) on an index-backed table also route through
ds-delete-rows! / append! / array-rebuild appropriately.
SELECT … FROM t FOR (VALID_TIME | SYSTEM_TIME) (AS OF x | BETWEEN x AND y | FROM x TO y | ALL) is rewritten by the
preprocessor into equivalent WHERE predicates over the table's
axis columns. Convention is the SQL:2011 default naming
(_valid_from/_valid_to, _system_from/_system_to); tables
that opt into custom axis column names via :bitemporal {axis {:from-col …}} would need a view to expose the canonical names
for SELECT-side use (or extend the preprocessor to consult the
registry — deferred).
-- valid-time: point-in-vt
SELECT salary FROM salaries
FOR VALID_TIME AS OF '2024-04-01'
WHERE eid = 1;
-- valid-time: range overlap
SELECT eid, salary FROM salaries
FOR VALID_TIME BETWEEN '2024-01-01' AND '2024-07-01';
-- valid-time: explicit half-open
SELECT eid, salary FROM salaries
FOR VALID_TIME FROM '2024-01-01' TO '2024-07-01';
-- no temporal filter (full vt-history)
SELECT eid, salary FROM salaries FOR ALL VALID_TIME;
-- system-time: "what did the DB believe at instant S?"
-- The audit demo of closed-period SCD2 — superseded rows are
-- preserved with `_system_to` shifted to sys-now, so this query
-- can reconstruct the pre-correction view.
SELECT salary FROM salaries
FOR SYSTEM_TIME AS OF '2024-04-10'
WHERE eid = 1;
-- bitemporal: one clause per axis on the same SELECT (composable)
SELECT salary FROM salaries
FOR VALID_TIME AS OF '2024-04-15'
FOR SYSTEM_TIME AS OF '2024-04-20'
WHERE eid = 1;
The rewriter strips each clause and injects the equivalent
predicate into WHERE. Both predicates are ANDed when a SELECT
carries both axes.
Multi-clause-per-axis limit: two FOR VALID_TIME (or two
FOR SYSTEM_TIME) clauses on the same SELECT are rejected with
:sql/multi-for-valid-time-unsupported or
:sql/multi-for-system-time-unsupported. The emitted predicate
uses unqualified column names; two clauses on the same axis on a
joined SELECT would both reference the same columns and the
planner cannot disambiguate them. The proper fix is
qualifier-aware predicate emission; until then, callers needing
per-table temporal filters on a join should write the WHERE
predicates explicitly with <table>._valid_from /
<table>._system_from refs (which the planner DOES handle since
the qualifier is part of the column identifier the parser sees).
One clause per axis composes freely — that's the canonical
bitemporal cut.
Note on _system_to literal value at past system-time
queries: stratum's closed-period SCD2 mutates _system_to in
place when a row is superseded. A query SELECT _system_to FROM t FOR SYSTEM_TIME AS OF '<past>' returns the closure value
(the sys-now at which the surgery fired), not the Long/MAX_VALUE
the column literally held at the queried instant. Row visibility
is correct — the (from <= s < to) predicate the rewriter emits
filters correctly. Workloads that need the literal historical
_system_to ("regulatory reconstruction of exactly what the DB
displayed at instant S") should derive it from the next-later
row's _system_from for the same entity, or use an append-only
storage layer instead. For accounting / audit, the visibility
filter is the relevant question and it works.
OVERLAPS, EQUALS_PERIOD, CONTAINS_PERIOD, PRECEDES /
STRICTLY_PRECEDES / IMMEDIATELY_PRECEDES, SUCCEEDS /
STRICTLY_SUCCEEDS / IMMEDIATELY_SUCCEEDS, and MEETS are
available as 4-arg WHERE-clause functions taking
(a_from, a_to, b_from, b_to). Generic over any int64 column
pair — works for the bitemporal axis, application-domain date
ranges, or arbitrary intervals.
DELETE FROM events
WHERE OVERLAPS(event_start, event_end, blackout_start, blackout_end);
Allen predicates work in both DML and SELECT WHERE clauses. The
DML evaluator handles col-vs-col natively; the SELECT planner
resolves keyword RHS args via eval-pred-scalar's resolve-arg
and column-pruning retains RHS keyword refs (via
plan/pred-columns's RHS scan), so an Allen predicate referencing
two columns evaluates correctly through the non-SIMD mask path.
SIMD-acceleration of col-vs-col is the residual deferred item —
small datasets and bitemporal-pattern queries (where the planner
falls back to the scalar evaluator) are unaffected.
Both temporal axes are first-class columns; composing constraints
across them is just two predicates in the same :where. Via the
stratum query DSL:
(st/q {:from dataset
:where [[:<= :_valid_from query-vt] ; valid-time AS OF query-vt
[:> :_valid_to query-vt]
[:<= :_system_from query-st] ; system-time AS OF query-st
[:> :_system_to query-st]]
:select [:eid :salary]})
Via SQL:
SELECT eid, salary FROM salaries
FOR VALID_TIME AS OF '2024-04-15'
FOR SYSTEM_TIME AS OF '2024-04-20';
The two-axis case is just the one-axis case with two more predicates. The columns are stored side-by-side in the same row; the zone-map pruner reads both pairs and the SIMD scan filters in one pass. There is no per-axis HashMap or wrapper indirection — the axes are columns the planner already knows how to handle.
Consumers that embed stratum in a higher-level system (Datalog or
otherwise) typically expose temporal navigation as that system's
native API (e.g. as-of / valid-at-style wrappers); those wrappers
translate down to the same per-column predicates the SQL/DSL forms
produce here.
Stratum is a columnar analytics library that doubles as the storage
substrate for higher-level systems (a Datalog secondary index, a SQL
front-end, application code in other repos). Without a canonical
write-side bitemporal primitive in stratum core, every consumer that
needed SCD2 had to re-invent close-and-reopen on top of
assoc-column and append! — duplicating ~100-200 LOC each, with no
guarantee they implemented the system-time-symmetry invariant
correctly.
With this work in place:
replace-row-bitemporal!) lives in
stratum core, exercised by append! / upsert! / retract! /
bounded-update!.:bitemporal config through make- dataset and the primitives do the right thing per-axis.FOR PORTION OF VALID_TIME directly without a custom translator.Bitemporal storage shows up in three architectural shapes today; the one you pick should match your workload, not your taste in databases. This section sketches the alternatives and where Stratum sits.
A longer engine-by-engine comparison — including non-SQL bitemporal systems like Datomic and the bitemporal capabilities of Datahike — will live in the Datahike docs once Datahike's own bitemporal support ships. This section focuses on the picking decision for users evaluating Stratum specifically.
| If your workload is mainly… | Pick |
|---|---|
| Analytics / OLAP on a recent snapshot with occasional valid-time corrections (accounting, claims, ERP back-office, IoT late-telemetry, audit-grade SCD2) | Stratum |
| Mixed read/write where most reads are "as of now" on a wide table and the bitemporal axes are columns you'd touch anyway | Stratum |
| Bulk supersession of thousands of rows in one transaction (period-end re-pricing, regulatory restatement) | Stratum |
| Distributed write path with append-only event log, multi-writer coordination handled by the engine, long per-entity histories accessed by primary key | XTDB v2 |
| Schemaless / "document" model with a bitemporal substrate | XTDB v2 |
Off-the-shelf relational DB with only system-time versioning (no _valid_from) and you can live without true bitemporal | PostgreSQL temporal_tables / MariaDB / IBM Db2 / SQL Server |
| Cloud-warehouse "Time Travel" (one-axis, system-time only, fixed retention window) | Snowflake / BigQuery / Delta Lake — only if a single axis is enough |
Per-column primitive arrays (long[] / double[]) wrapped in
chunked persistent indices with per-chunk zone-map stats. Bitemporal
axes are just four extra int64 columns
(_valid_from/_valid_to/_system_from/_system_to) stored
side-by-side with user columns. Writes use mutate-in-place SCD2
surgery: close the matched row's _system_to, append the new row
under a shared sys-now stamp. Reads compose VT and ST predicates as
two extra terms in the same SIMD pass that already evaluates the
user's WHERE. JVM-embedded; one writer per dataset; readers scale
out over the underlying Konserve store (tiered backends — S3 +
konserve-sync for local caches — give independent reader processes
their own latency-shaped view, each pointing at the latest commit
root).
Bitemporality is per-dataset opt-in. The :bitemporal field in
dataset metadata is what tells write primitives to do SCD2 surgery
and the SQL rewriter which columns the FOR VALID_TIME clause should
emit predicates against. A dataset configured without it is just a
flat columnar dataset; system-time history is still available via the
Konserve commit graph (each sync! is an audit-grade checkpoint of
the dataset), but there's no per-row _system_to. The four shapes —
no bitemporality, valid-time only, system-time only, full bitemporal
— all use the same storage and query engine; only the write
primitives and SQL rewriter behave differently.
Heterogeneous tables compose freely. Because the bitemporal
columns are just int64 columns, a Parquet file loaded as a flat
dataset can be joined against a bitemporally-configured dataset in
the same query, and FOR VALID_TIME AS OF t applies to the side
that has those columns. Verified with both the Clojure DSL and the
SQL surface:
SELECT o.order_id, o.amount, c.name
FROM orders o -- flat (e.g. from Parquet)
JOIN customers c FOR VALID_TIME AS OF 150 -- bitemporal
ON o.customer_id = c.customer_id;
The bet: most read workloads are "as-of-now" or "as-of-recent" on a wide table, where one materialised row per logical entity is dramatically more cache- and SIMD-friendly than reconstructing visibility from an event log. Filter-aggregate, group-by, and SCD2 update workloads inherit Stratum's existing OLAP performance for free; bitemporal isn't a tax on the query path.
Apache Arrow trie files on a BufferPool over local disk or object
storage. A bitemporal update is one immutable event row in an
append-only log; visibility at a queried (system-time, valid-time)
instant is reconstructed at read time per entity. The on-disk schema
is fixed (_iid, _system_from, _valid_from, _valid_to, op);
_system_to is derived, never persisted. A single-writer indexer
thread serialises ops, flushes blocks of a log-structured trie, and
compactors merge them; readers fan out across replicas.
Where it beats Stratum: multi-writer coordination handled by the
engine, deep per-entity histories (the _iid trie gives O(log N)
navigation to the entity's events), schemaless documents, log-tailing
reads. Where it loses: bulk filter-aggregate over recent data — every
read still has to project per-entity polygons before the user
predicate fires, with no SIMD-fused path. Note that Stratum's
reader fanout is comparable in shape (multiple JVMs reading the
same content-addressed Konserve store); the gap is on the write side
— Stratum core has one writer per dataset, while Datahike layers a
distributed-writer protocol on top and the same primitive could be
upstreamed into Stratum if needed.
PostgreSQL (via the temporal_tables extension or recent
postgres_temporal work), MariaDB 10.3+, IBM Db2, and Microsoft SQL
Server all implement system-versioned tables — one row per logical
entity in the "current" table plus a history table that records the
old row on UPDATE. The SQL surface is similar to Stratum's
(FOR SYSTEM_TIME AS OF, FROM…TO, ALL), but none of them
implement valid-time out of the box. If your domain truly needs
both axes (you care what the system believed at time T about what
was true at time T'), you're either layering valid-time on top in
the application, or you're picking a bitemporal-native engine.
These systems shine when you already operate a mainstream RDBMS and the only thing you need is "show me yesterday's row state"; they struggle when valid-time corrections are routine or when the audit-window is unbounded (history tables grow without natural compaction).
A snapshot-based system-time facility with a bounded retention
window (typically 1–90 days). Useful for accidental-delete recovery
and short-term temporal queries; not a bitemporal substrate. No
valid-time, no audit-grade unbounded history, no SQL:2011 FOR PORTION OF DML.
| Capability | Stratum | XTDB v2 | SQL:2011 system-versioned RDBMSes | Time Travel |
|---|---|---|---|---|
| Valid-time axis | ✔ | ✔ | ✗ | ✗ |
| System-time axis | ✔ | ✔ | ✔ | ✔ (bounded) |
FOR (SYSTEM\|VALID)_TIME AS OF / BETWEEN / FROM…TO / ALL | ✔ | ✔ | system-time only | system-time only |
FOR PORTION OF VALID_TIME DML | ✔ | ✔ | ✗ | ✗ |
| Allen predicates (OVERLAPS, CONTAINS, …) | ✔ (10 fns) | ✔ (period-typed) | ◐ (some have OVERLAPS) | ✗ |
MERGE / upsert with VT portion | ✗ (two-statement workaround) | ✔ (PATCH INTO) | n/a (no VT) | n/a |
ERASE / hard purge | Clojure DSL only | ✔ SQL ERASE FROM | varies | retention-based |
| Reader fanout over shared storage | ✔ (Konserve content-addressed; tiered + konserve-sync caches) | ✔ | RDBMS-dependent | ✔ |
| Multi-writer coordination in-engine | ✗ (one writer per dataset; Datahike adds it on top) | ✔ | ✔ | n/a (writes go through the warehouse) |
| Log-tailing / streaming reads | ✗ | ✔ | RDBMS-dependent | ✗ |
| Per-entity primary-key index | ✗ (column scan + zone maps) | ✔ (_iid hash trie) | ✔ (B-tree) | n/a |
| SIMD-fused filter+aggregate over temporal columns | ✔ | ✗ | depends on engine | depends |
| Embedded / single-JAR | ✔ | ✗ (indexer + log + replicas) | ✗ | ✗ (managed service) |
| Schemaless / dynamic columns | ✗ | ✔ | ✗ | ◐ |
So you know precisely when not to pick Stratum:
konserve-sync gives a
locally-cached copy when read latency matters more than
freshness). Datahike layers a distributed writer protocol on top,
and the same primitive could be upstreamed into Stratum — but
today it lives a layer up._system_to mutate-in-place caveat (see Caveat: reading
_system_to at past system-time above).:eid column with zone-map
pruning. Asymptotically beaten by XTDB v2's _iid hash trie for
highly selective WHERE eid = ? on tables with many entities
— likely irrelevant for analytical workloads, relevant for
OLTP-flavoured key-value access.sync! boundaries. Real-time consumers need a different
pattern (or a thin layer above Stratum that polls).MERGE / SQL upsert with VT portion isn't supported.
DELETE FOR PORTION OF + INSERT FOR PORTION OF as two
statements covers the same ground; a fused single-statement form
is rejected at parse time as ambiguous.The work landed in phases for review-ability:
| Phase | Scope |
|---|---|
| A | Schema redesign: :bitemporal {:valid :system} |
| B | Write primitives: append! / upsert! / retract! |
| C | Overlap detection: reject by default |
| C+ | Auto-split: truncate partial-left + supersede (valid-only: drop physically via ds-delete-rows!; fully-bitemporal: close _system_to to sys-now, keep in storage for audit) |
| D | SQL grammar FOR PORTION OF VALID_TIME (DELETE) + bounded retract! + index-backed SQL DELETE |
| D+ | FOR PORTION OF VALID_TIME on UPDATE + INSERT (via bounded-update!); UPSERT rejected with clear error |
Downstream consumers (secondary-index adapters, PG-wire SQL passthrough) live in their own repos and land independently.
The shipped suite covers:
append! round-trip (insert with explicit window, with defaulted
window, on both axes).upsert! SCD2 (close + reopen) — :where predicate matching,
:set value merging.retract! close-only and bounded FOR PORTION OF surgery.replace-row-bitemporal!).bounded-update! 3-way split with shared sys-now stamp.{:valid-at t} annotation in :where so a single
query can join two relations at different vt instants. Useful for
reconciliation queries. Open API design call.range-bins-style: "as of each Friday
of 2024, what was the trial balance?"). Useful for reporting; out
of scope for the substrate._system_from on writes; bitemporally correcting system-time
the way valid-time can be corrected is a documented limitation.SET col = <expression> in UPDATE … FOR PORTION OF VALID_TIME: only literal values today. Expression evaluation
requires SELECT/UPDATE engine convergence.FOR VALID_TIME clauses: the
rewriter emits unqualified _valid_from / _valid_to refs that
the planner can't disambiguate across joins. Rejected at parse
time with a hint to use explicit <table>.col qualifiers. Real
fix requires qualifier-aware column resolution in the planner._valid_from / _system_from column naming convention.Can you improve this documentation?Edit on GitHub
cljdoc builds & hosts documentation for Clojure/Script libraries
| Ctrl+k | Jump to recent docs |
| ← | Move to previous article |
| → | Move to next article |
| Ctrl+/ | Jump to the search field |