Virtual PostgreSQL catalog materialization + system-query routing.
Datahike has no real `pg_catalog.*` or `information_schema.*` tables;
pgwire synthesizes them on demand from the user schema. Every
catalog table the pg_* ecosystem expects is a
`{:schema [...] :data-fn (fn [user-schema cte-db] ...)}` entry that
builds its row set lazily when a query references it.
The registry is extensible at runtime: libraries built on top of
pg-datahike can add their own virtual catalog tables via
`register-catalog-table!` (e.g. Odoo's internal metadata tables,
a `pg_stat_activity`-style probe, etc.) without modifying this
namespace.
Two public entry points:
- `register-catalog-table!` / `unregister-catalog-table!`:
the extension seam
- `extract-empty-catalog-shape` / `system-query?`:
called by the wire handler to short-circuit common boot probes
(pgjdbc's field-metadata, Hibernate's feature detection) into
fast paths before JSqlParser even runs.
When bound (by the server's handler factory), a seq of strings: the
names of databases registered at server-start time. Surfaced as rows
in the virtual pg_database catalog so tools that enumerate databases
(psql \l, pg_dump --list, pgjdbc's DatabaseMetaData) see them.
Unbound (nil) falls back to the legacy single-database shape — ["template0" "template1" "datahike"] — so tests that start a bare handler without a registry still discover the expected row set.
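
The bound/unbound behaviour described above can be sketched as follows. This is a minimal illustration, not the library's code: `*database-names*` and `pg-database-names` are hypothetical names (the real var's name is not shown here), and whether the real catalog adds the template databases when the var is bound is not stated in the docstring.

```clojure
;; Hypothetical stand-in for the dynamic var documented above.
(def ^:dynamic *database-names* nil)

;; The legacy single-database shape quoted in the docstring.
(def legacy-database-names ["template0" "template1" "datahike"])

(defn pg-database-names
  "Database names to surface as pg_database rows: the bound seq if
  there is one, otherwise the legacy shape."
  []
  (or *database-names* legacy-database-names))

;; Unbound: a bare handler still sees the legacy rows.
(pg-database-names)
;; => ["template0" "template1" "datahike"]

;; Bound, as a handler factory might do for a multi-database server:
(binding [*database-names* ["sales" "inventory"]]
  (pg-database-names))
;; => ["sales" "inventory"]
```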
(catalog-data-for table-name user-schema cte-db)
Build one catalog table's row data. Same precedence as catalog-schema-for.
(catalog-data-for* table-name user-schema cte-db)
Built-in catalog data; see catalog-schema-for*. Dispatches a per-table body; library consumers add more via the registry.
(catalog-schema-for table-name)
Resolve a catalog table's Datahike schema. Checks extensions first (allowing userland overrides in theory, though we don't rely on that), then falls back to the built-ins.
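
The extensions-first precedence can be sketched like this. Everything here is a stand-in under stated assumptions: `extension-registry`, `builtin-schema`, and `schema-for` are hypothetical names, and the one-attribute schemas are placeholders rather than the real catalog definitions.

```clojure
;; Hypothetical stand-ins: a registry atom for extensions and a
;; `case`-based lookup for built-ins, mirroring the docstrings.
(def extension-registry (atom {}))

(defn builtin-schema [table-name]
  (case table-name
    "pg_namespace" [{:db/ident :pg_namespace/nspname}]
    nil))

(defn schema-for
  "Extensions win; built-ins are the fallback; nil if neither knows
  the table."
  [table-name]
  (if-let [entry (get @extension-registry table-name)]
    (:schema entry)
    (builtin-schema table-name)))

(swap! extension-registry assoc "app_meta"
       {:schema [{:db/ident :app_meta/key}]})

(schema-for "app_meta")      ;; => [{:db/ident :app_meta/key}]
(schema-for "pg_namespace")  ;; => [{:db/ident :pg_namespace/nspname}]
(schema-for "no_such_table") ;; => nil
```

Because the registry is consulted first, registering an entry under a built-in name would shadow the built-in in this sketch, which is the "userland override" the docstring mentions but does not rely on.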
(catalog-schema-for* table-name)
Built-in catalog schema: every table-name is a key in a case expression. Library consumers register additional tables via register-catalog-table! (above); the consolidated lookup goes through catalog-schema-for.
(catalog-table-name t)
Normalize a JSqlParser Table node to the internal catalog key used
by `catalog-tables` / `catalog-schema-for` / `catalog-data-for`.
Returns nil if the node isn't a recognized catalog.
Handles:
  pg_type → "pg_type"
  pg_catalog.pg_type → "pg_type"
  information_schema.columns → "information_schema_columns"
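
The three mappings listed above can be reproduced with a small pure function. This sketch works on plain strings rather than a JSqlParser Table node, ignores identifier case and quoting, and uses hypothetical names (`known-catalogs`, `catalog-key`); the real function may differ in all of these respects.

```clojure
;; Hypothetical set of known catalog keys; in the real namespace this
;; role is played by catalog-tables.
(def known-catalogs
  #{"pg_type" "pg_attribute" "information_schema_columns"})

(defn catalog-key
  "Map an optional schema qualifier plus a table name to the internal
  catalog key, or nil when the table is not a recognized catalog."
  [schema-name table-name]
  (let [candidate (if (= "information_schema" schema-name)
                    (str "information_schema_" table-name)
                    ;; unqualified or pg_catalog-qualified names map to themselves
                    (when (contains? #{nil "pg_catalog"} schema-name)
                      table-name))]
    ;; set lookup returns the key itself, or nil if unknown
    (known-catalogs candidate)))

(catalog-key nil "pg_type")                    ;; => "pg_type"
(catalog-key "pg_catalog" "pg_type")           ;; => "pg_type"
(catalog-key "information_schema" "columns")   ;; => "information_schema_columns"
(catalog-key nil "users")                      ;; => nil
```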
(catalog-tables)
Set of every recognized catalog table: built-ins + runtime-registered extensions.
(catalog-tables-in-stmt stmt)
Same as catalog-tables-used but accepts any Statement (PlainSelect, ParenthesedSelect, SetOperationList). Used by sql/parse-sql's top-level dispatch to pick up catalog refs under UNIONs and top-level ParenthesedSelects.
(catalog-tables-used stmt)
Walk a PlainSelect's FROM + JOIN items (plus any nested derived
tables, UNION branches, and WHERE subqueries) and return the set of
catalog table names referenced. Mutates matching Table nodes to the
normalized internal name so the SQL translator sees `pg_attribute`
instead of `pg_catalog.pg_attribute`.
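
The collect-and-normalize walk can be illustrated on a toy statement tree. This is a sketch under loud assumptions: the tree is made of plain Clojure maps rather than JSqlParser nodes, the node shapes and the names `known-catalogs` and `normalize-tables` are hypothetical, and because maps are immutable the sketch returns a rewritten tree instead of mutating nodes in place as the real function does.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical set of recognized catalog names.
(def known-catalogs #{"pg_attribute" "pg_class"})

(defn normalize-tables
  "Return {:tables #{...} :stmt rewritten-tree}. Table nodes are maps
  like {:type :table :schema \"pg_catalog\" :name \"pg_attribute\"}."
  [stmt]
  (let [found (atom #{})
        stmt' (walk/postwalk
               (fn [node]
                 (if (and (map? node)
                          (= :table (:type node))
                          (contains? #{nil "pg_catalog"} (:schema node))
                          (known-catalogs (:name node)))
                   ;; record the catalog and drop the schema qualifier
                   (do (swap! found conj (:name node))
                       (assoc node :schema nil))
                   node))
               stmt)]
    {:tables @found :stmt stmt'}))

(def stmt
  {:type  :select
   :from  {:type :table :schema "pg_catalog" :name "pg_attribute"}
   :joins [{:type :table :schema nil :name "users"}]
   :where {:type :subselect
           :from {:type :table :schema nil :name "pg_class"}}})

(:tables (normalize-tables stmt))
;; => #{"pg_attribute" "pg_class"}
```

The walk visits every nested node, so the catalog reference inside the WHERE subquery is found without any special casing, while the user table `users` is left untouched.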
Classifier :kind values that route to the system-type dispatch in server.clj. Kinds NOT in this set fall through to JSqlParser or the complex-pattern catalog probes below.
(collect-in-stmt! stmt)
(collect-in-stmt! acc stmt)
Walk a SELECT / ParenthesedSelect / SetOperationList statement and collect the set of catalog table names it references (anywhere: top-level FROM/JOINs, derived tables, UNION branches, WHERE subqueries, CTE bodies). Mutates matching Table nodes to the normalized internal catalog name. Returns the accumulated set.
(extract-empty-catalog-shape sql)
Parse a SELECT that was classified as :empty-catalog and return
{:names [String…] :oids [int…]} matching the projection shape the
client expects in RowDescription. Used when we respond to a known-
empty catalog query with zero rows — clients like pgJDBC's
DatabaseMetaData.getTables issue 12-column SELECTs and will raise
column index out of range if the RowDescription doesn't match.
For SELECT * and anything we can't parse, returns nil so callers
can fall back to a minimal 1-column shape (which is wrong, but
harmless for psycopg2 / asyncpg that always go by column name).
Types are all OID_TEXT. That's the honest answer (we don't know),
and it matches PG's unknown-to-text coercion at the wire
boundary for untyped columns.
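
The returned shape can be sketched as below. This illustration skips the SQL parsing step and starts from an already-extracted list of projection labels; `empty-shape` is a hypothetical name, and the only fact taken from PostgreSQL itself is that the type OID of `text` is 25.

```clojure
(def OID_TEXT 25) ; PostgreSQL's type OID for text

(defn empty-shape
  "Given the projection labels of a SELECT, build the
  {:names [...] :oids [...]} map for a zero-row RowDescription.
  Returns nil for SELECT * or an empty projection, so the caller can
  fall back to its minimal shape."
  [labels]
  (when (and (seq labels) (not-any? #{"*"} labels))
    {:names (vec labels)
     :oids  (vec (repeat (count labels) OID_TEXT))}))

(empty-shape ["table_cat" "table_schem" "table_name"])
;; => {:names ["table_cat" "table_schem" "table_name"], :oids [25 25 25]}

(empty-shape ["*"])
;; => nil
```

Keeping `:names` and `:oids` the same length is the point: a client that reads columns by index gets a RowDescription as wide as the projection it asked for.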
(register-catalog-table! table-name entry)
Register an additional virtual catalog table. `entry` must supply
:schema (a seq of Datahike schema entry maps — include a row-marker
attr) and :data-fn (a fn of [user-schema cte-db] returning a seq of
row entity-maps).
Init-time only. The registry is a process-local atom — it is
not persisted to Datahike and does not survive a server restart.
Call this from your application's startup code, before
start-server, not lazily on first query. If your host process
restarts (deploy, crash), each registration must be re-applied
before the first connection lands; otherwise clients hitting an
un-re-registered table see "relation does not exist".
Built-in catalogs (pg_class, pg_attribute, …) are unaffected by restart — they're derived from the user schema in Datahike, which is durable on file/kv backends.
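
An entry of the documented shape, together with a process-local registry, might look like the sketch below. `register!` and `unregister!` are local stand-ins for `register-catalog-table!` / `unregister-catalog-table!`, not the real functions; the `app_settings` table, its attributes, and the boolean row-marker attr are hypothetical (the docstring does not say what form the row marker takes). The `:db/ident`, `:db/valueType`, and `:db/cardinality` keys are ordinary Datahike schema keys.

```clojure
;; Process-local registry: lost on restart, so it must be refilled at startup.
(def registry (atom {}))

(defn register! [table-name entry]
  (assert (and (seq (:schema entry)) (fn? (:data-fn entry)))
          "entry needs :schema and :data-fn")
  (swap! registry assoc table-name entry)
  table-name)

(defn unregister! [table-name]
  ;; dissoc on a missing key changes nothing, so this is a no-op then
  (swap! registry dissoc table-name)
  nil)

(def app-settings-entry
  {:schema  [{:db/ident       :app_settings/row   ; hypothetical row-marker attr
              :db/valueType   :db.type/boolean
              :db/cardinality :db.cardinality/one}
             {:db/ident       :app_settings/key
              :db/valueType   :db.type/string
              :db/cardinality :db.cardinality/one}]
   ;; Called lazily, only when a query references the table.
   :data-fn (fn [_user-schema _cte-db]
              [{:app_settings/row true :app_settings/key "theme"}])})

(register! "app_settings" app-settings-entry)

((:data-fn (get @registry "app_settings")) {} nil)
;; => [{:app_settings/row true, :app_settings/key "theme"}]

(unregister! "app_settings")
(unregister! "app_settings") ; second call is a no-op
```

In an application, the registration call would sit in the startup path ahead of `start-server`, so that it is re-applied on every process start.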
(system-query? sql)
Check if a SQL string is a system/catalog query and return the handler keyword for the dispatch in server.clj; nil otherwise.
Delegates the leading-keyword routing to datahike.pg.classify for structural correctness (keyword-inside-string, keyword-inside-comment, case mix). A handful of complex pgjdbc / Odoo catalog probes still use substring matching on deep SELECT bodies; those stay here until we grow an AST-shape matcher.
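
The three failure modes named above (keyword inside a string, keyword inside a comment, mixed case) are easy to see by comparing a naive substring test with a leading-keyword extractor. This is an illustrative sketch only: `naive-show?` and `leading-keyword` are hypothetical helpers, and the regex is a simplification, not how datahike.pg.classify works.

```clojure
(require '[clojure.string :as str])

(defn naive-show?
  "Substring test: true whenever the text SHOW appears anywhere."
  [sql]
  (str/includes? (str/upper-case sql) "SHOW"))

(defn leading-keyword
  "First keyword of the statement, upper-cased, after skipping leading
  whitespace, -- line comments and /* */ block comments."
  [sql]
  (some-> (re-find #"(?i)^\s*(?:--[^\n]*\n\s*|/\*[\s\S]*?\*/\s*)*([a-z]+)" sql)
          second
          str/upper-case))

;; Keyword inside a string literal: the substring test misfires.
(naive-show? "SELECT 'show me'")                      ;; => true
(leading-keyword "SELECT 'show me'")                  ;; => "SELECT"

;; Keyword inside a comment: skipped by the extractor.
(leading-keyword "-- show\nselect 1")                 ;; => "SELECT"

;; Case mix after a block comment.
(leading-keyword "/* SHOW */ sHoW server_version")    ;; => "SHOW"
```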
(system-query?* sql cls-info)
Inner implementation: takes an already-computed classify result so callers that have one (parse-sql) can avoid classifying twice per statement.
Two-stage routing: kinds from the classifier's whitelist go straight through; everything else is a candidate SELECT body that shape/catalog-probe inspects structurally (see shape.clj for the probe catalogue).
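
The two-stage routing can be sketched as a whitelist check followed by a probe. The kinds in `system-kinds`, the `:settings-probe` keyword, and the `route` function are hypothetical; the probe is passed in as a plain function so the sketch makes no claim about how shape/catalog-probe inspects a statement.

```clojure
;; Hypothetical whitelist of classifier kinds that route directly.
(def system-kinds #{:show :set :begin})

(defn route
  "Stage 1: a whitelisted :kind from the precomputed classify result is
  returned as the handler keyword. Stage 2: anything else is handed to
  `probe`, which returns a handler keyword or nil."
  [sql {:keys [kind]} probe]
  (if (contains? system-kinds kind)
    kind
    (probe sql)))

;; Whitelisted kind: the probe is never consulted.
(route "SHOW server_version" {:kind :show} (constantly nil))
;; => :show

;; Non-whitelisted kind: the probe decides.
(route "SELECT 1" {:kind :select} (constantly nil))
;; => nil
(route "SELECT ..." {:kind :select} (constantly :settings-probe))
;; => :settings-probe
```

Taking the classify result as an argument mirrors the docstring's point: a caller that has already classified the statement passes the result in instead of paying for classification twice.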
(unregister-catalog-table! table-name)
Remove a previously-registered catalog table. No-op if unregistered.