Where pg-datahike's architecture aligns with — and where it intentionally
diverges from — real PostgreSQL (compared against the ../postgres checkout).
The goal is a faithful PostgreSQL interface for realistic use: covering the
features that real clients (drivers, ORMs, migration tools, analysts) depend on,
not passing every adversarial conformance string.
This document describes the design state: the execution model, what that model covers, the capabilities that fall outside it, and the boundaries that are intrinsic to Datahike.
pg-datahike compiles each SQL statement into a single Datalog query
(:find / :where / :in / :with / :order-by) over one immutable
Datahike snapshot, plus light post-processing (HAVING, window functions,
ORDER BY, FOR UPDATE locks). PostgreSQL, by contrast, uses a tuple-at-a-time iterator
(volcano) executor with arbitrary nesting over MVCC storage.
That single-query model is elegant and covers the bulk of CRUD and analytics. Within it:
:where with shared logic vars; OUTER joins via or-join; correlated EXISTS via not-join.
d/db-with).

A few capabilities live outside the single-query model (structural gaps below); most remaining differences are incremental type/catalog coverage, or boundaries intrinsic to Datahike.
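For contrast with the single-query model, the tuple-at-a-time (volcano) style attributed to PostgreSQL above can be sketched with Python generators. This is only an illustration of the iterator idea, not PostgreSQL's executor or anything in pg-datahike; the node names (`seq_scan`, `filter_node`, `project`, `limit`) and the `users` data are hypothetical.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]

scanned: List[object] = []  # records which rows the leaf node actually produced


def seq_scan(table: Iterable[Row]) -> Iterator[Row]:
    """Leaf node: hand out stored rows one at a time, on demand."""
    for row in table:
        scanned.append(row["id"])
        yield row


def filter_node(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pull rows from the child and pass along only those matching the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row."""
    for row in child:
        yield {c: row[c] for c in columns}


def limit(child: Iterator[Row], n: int) -> Iterator[Row]:
    """Stop pulling from the child once n rows have been produced."""
    return islice(child, n)


users = [
    {"id": 1, "name": "ada", "age": 36},
    {"id": 2, "name": "bob", "age": 17},
    {"id": 3, "name": "cy", "age": 52},
    {"id": 4, "name": "di", "age": 41},
]

# Roughly: SELECT name FROM users WHERE age >= 18 LIMIT 2
plan = limit(project(filter_node(seq_scan(users), lambda r: r["age"] >= 18), ["name"]), 2)
result = list(plan)
```

Each node pulls rows from its child only when asked, so the scan stops as soon as the limit is satisfied and the fourth row is never read. A single set-oriented query over a snapshot has no equivalent pull point, which is why row caps and per-row nesting need extra machinery.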
Capabilities implemented and exercised by the conformance suites:
FOR UPDATE.
CREATE; INSERT; SELECT in one string: the SELECT sees the INSERT (implicit-transaction model across the batch).
generate_series, unnest) materialised into the virtual-table path.
CREATE TYPE … AS (…), ROW(...), named and anonymous records, text and binary codecs, composite parameters.
::T and ::T[] casts.
pg_type, pg_class, pg_attribute, pg_namespace, and the recursive type-info queries asyncpg/JDBC issue to build codecs.

These need execution beyond one d/q over one snapshot:
:where cannot express a per-outer-row inner evaluation; this requires a nested-loop step on the SELECT path. (LATERAL generate_series(1, t.n), JOIN LATERAL (subquery).)
Execute with a row cap (PortalSuspended): driver fetchSize, server-side cursors over large results. Today all rows materialise from one d/q.
SELECT generate_series(1,3) → N rows; PG's ProjectSet). Currently serialised rather than expanded to rows.
int4range, tsrange, …): text round-trips, but the binary Range codec and range constructor functions are not yet implemented.
interval, the internal "char", aclitem, full numeric precision edges, integer-range overflow enforcement (an int4 column currently accepts values > 2³¹; PG errors).
pg_stat_activity, pg_index, pg_constraint, pg_proc, etc., added as tools require them.
42P01): a missing table reads as empty (EAV) rather than erroring.
CREATE FUNCTION / plpgsql / DO: not implemented.

These are inherent to the storage model and documented as boundaries, not gaps to close:
FOR UPDATE / advisory locks / savepoints are approximated; genuinely concurrent isolation levels are not the model. This is the most important boundary for pooled workloads.
pg_temp isolation: drop-on-disconnect covers sequential use;
concurrent same-name temp tables across sessions are not namespaced.
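Returning to the structural gaps: the per-outer-row inner evaluation that LATERAL needs can be sketched as a plain nested loop. The following is a minimal illustration in Python, assuming a table `t` with an integer column `n`; the helper names `generate_series` and `lateral_join` are hypothetical and are not pg-datahike functions.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def generate_series(start: int, stop: int) -> range:
    """Inclusive integer series, mirroring SQL generate_series(start, stop)."""
    return range(start, stop + 1)


def lateral_join(
    outer_rows: Iterable[Row],
    inner_fn: Callable[[Row], Iterable[object]],
) -> Iterator[Tuple[Row, object]]:
    """Nested loop: re-evaluate the inner side once for every outer row."""
    for outer in outer_rows:
        for inner in inner_fn(outer):  # the inner side may depend on the outer row
            yield outer, inner


t: List[Row] = [{"id": "a", "n": 2}, {"id": "b", "n": 0}, {"id": "c", "n": 3}]

# Roughly: SELECT t.id, g FROM t, LATERAL generate_series(1, t.n) AS g
rows = [(o["id"], g) for o, g in lateral_join(t, lambda o: generate_series(1, o["n"]))]
```

The row with `n = 0` produces an empty series and therefore contributes no output rows. The inner side cannot be computed once and joined afterwards, because its contents change with each outer row.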