Token-driven SQL source rewrites. Normalize SQL before JSqlParser sees it by excising or injecting source-level spans, all based on positions captured by the `datahike.pg.sql.classify` tokenizer.

Each rule is a pure function `(tokens) -> seq of spans`, where a span is `[start end replacement]`. The rewriter applies all non-overlapping spans right-to-left (so earlier offsets stay stable) and returns the new SQL string.

Why this exists: the previous preprocess-sql was a pile of regex `str/replace` calls that could false-positive on keywords inside string literals, dollar-quotes, or comments (`SELECT 'REFERENCES'` was vulnerable to the inline-REFERENCES stripper). Token-based rules only match tokens of the right kind, so a keyword inside a `:string` or `:comment` token is invisible to them.

Callers: `sql/preprocess-sql`.
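The right-to-left application of spans can be illustrated with a minimal sketch. The function name `apply-spans` is hypothetical and not taken from the namespace, and the sketch assumes the spans it receives are already disjoint.

```clojure
(defn apply-spans
  "Apply [start end replacement] spans to s, rightmost span first.
  Assumes the spans do not overlap (an assumption of this sketch)."
  [s spans]
  (reduce (fn [acc [start end replacement]]
            (str (subs acc 0 start) replacement (subs acc end)))
          s
          ;; Descending by start offset: editing the tail first leaves
          ;; every offset to the left of the edit unchanged.
          (sort-by first > spans)))

;; Two deletions; both spans use offsets into the ORIGINAL string.
(apply-spans "a COLLATE x, b COLLATE y"
             [[1 11 ""] [14 24 ""]])
;; => "a, b"
```

An insertion is simply a span whose `start` equals its `end`, so the same function covers both excising and injecting.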
(alter-column-drop-default-rule toks)

Match `ALTER COLUMN [<quote>]<name>[<quote>] DROP DEFAULT [,]` anywhere in the token stream and remove it. JSqlParser would parse the form, but our schema can't honour it (we don't store DEFAULT values that could be dropped); the regex tail in preprocess-sql used to handle this. The token rule is immune to the substring appearing in literals or comments.
(boolean-is-rule toks)

Wrap the LHS of `<expr> IS [NOT] (TRUE|FALSE|UNKNOWN)` in parens when it isn't already a single parenthesised group. JSqlParser's grammar requires a `boolean_primary` on the left; an `IN (…)` or `EXISTS (…)` parses standalone but trips when followed by `IS`.

Walking back: a balanced-paren walk runs leftwards from `IS` until it reaches a clause-boundary keyword (WHERE / AND / OR / …), a comma, a semicolon, or an unmatched `(`; the token immediately after that boundary is where the LHS starts.
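The backward walk can be sketched as follows. This is an illustration under stated assumptions, not the namespace's code: the token map shape (`:kind`, `:text`, `:start`, `:end`), the `:punct` and `:number` kinds, the helper names, and the keyword subset in `boundary-words` are all hypothetical.

```clojure
(require '[clojure.string :as str])

(defn tok [kind text start]
  {:kind kind :text text :start start :end (+ start (count text))})

(defn upper [t] (str/upper-case (:text t)))

;; Illustrative subset of clause-boundary keywords (assumption).
(def boundary-words #{"WHERE" "AND" "OR" "ON" "HAVING" "WHEN"})
(def truth-words #{"TRUE" "FALSE" "UNKNOWN"})

(defn lhs-start-index
  "Walk left from index i; return the index of the first LHS token."
  [toks i]
  (loop [j i, depth 0]
    (if (neg? j)
      0
      (let [t (nth toks j), s (upper t)]
        (cond
          (= ")" s) (recur (dec j) (inc depth))
          (= "(" s) (if (zero? depth)
                      (inc j)                    ; unmatched "(" is a boundary
                      (recur (dec j) (dec depth)))
          (and (zero? depth)
               (or (contains? #{"," ";"} s)
                   (and (= :ident (:kind t)) (contains? boundary-words s))))
          (inc j)
          :else (recur (dec j) depth))))))

(defn wrapped?
  "True when toks[from..to] is one group: the paren opened at from closes at to."
  [toks from to]
  (and (= "(" (:text (nth toks from)))
       (= to (loop [j from, depth 0]
               (let [s (:text (nth toks j))
                     d (cond (= "(" s) (inc depth)
                             (= ")" s) (dec depth)
                             :else depth)]
                 (cond (zero? d) j
                       (>= j to) nil
                       :else (recur (inc j) d)))))))

(defn boolean-is-spans [toks]
  (let [toks (vec toks)]
    (apply concat
           (for [i (range 1 (dec (count toks)))
                 :let [t (nth toks i)
                       a (upper (nth toks (inc i)))
                       b (some-> (nth toks (+ i 2) nil) upper)]
                 :when (and (= :ident (:kind t))
                            (= "IS" (upper t))
                            (or (truth-words a)
                                (and (= "NOT" a) (truth-words b))))
                 :let [last-i (dec i)
                       from   (lhs-start-index toks last-i)]
                 :when (and (<= from last-i)
                            (not (wrapped? toks from last-i)))]
             (let [open  (:start (nth toks from))
                   close (:end (nth toks last-i))]
               ;; Two zero-width spans: inject "(" and ")".
               [[open open "("] [close close ")"]])))))

;; Hand-built tokens for: WHERE a IN (1, 2) IS TRUE
(def is-toks
  [(tok :ident "WHERE" 0) (tok :ident "a" 6) (tok :ident "IN" 8)
   (tok :punct "(" 11) (tok :number "1" 12) (tok :punct "," 13)
   (tok :number "2" 15) (tok :punct ")" 16)
   (tok :ident "IS" 18) (tok :ident "TRUE" 21)])

(boolean-is-spans is-toks)
;; => ([6 6 "("] [17 17 ")"])
```

Applied to the source, those two spans turn `WHERE a IN (1, 2) IS TRUE` into `WHERE (a IN (1, 2)) IS TRUE`.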
(collate-rule toks)

Strip `COLLATE <name>`, `COLLATE <qual>.<name>`, `COLLATE "<name>"`. Token-driven: the matcher only fires on `:ident COLLATE` tokens, so the substring `COLLATE` inside a string literal or a comment is invisible to it (those are `:string` / `:comment` tokens).
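A small sketch of why kind-gated matching is literal-safe. The token map shape, the `:punct` and `:quoted-ident` kinds, and the names `tok`, `kw?`, and `collate-spans` are assumptions made for illustration; the sketch handles only the unqualified `COLLATE <name>` and `COLLATE "<name>"` forms and replaces the match with an empty string.

```clojure
(require '[clojure.string :as str])

(defn tok [kind text start]
  {:kind kind :text text :start start :end (+ start (count text))})

(defn kw?
  "True when tok is an :ident whose text equals word, ignoring case."
  [tok word]
  (and (= :ident (:kind tok))
       (= word (str/upper-case (:text tok)))))

(defn collate-spans [toks]
  (for [[a b] (partition 2 1 toks)
        :when (and (kw? a "COLLATE")
                   (contains? #{:ident :quoted-ident} (:kind b)))]
    [(:start a) (:end b) ""]))

;; Hand-built tokens for: SELECT 'COLLATE x', a COLLATE "C"
(def collate-toks
  [(tok :ident "SELECT" 0)
   (tok :string "'COLLATE x'" 7)   ; the keyword lives inside a :string token
   (tok :punct "," 18)
   (tok :ident "a" 20)
   (tok :ident "COLLATE" 22)
   (tok :quoted-ident "\"C\"" 30)])

(collate-spans collate-toks)
;; => ([22 33 ""])
```

Only the real clause at offsets 22 to 33 is reported; the `COLLATE` inside the string literal never produces a span because its token kind is `:string`.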
(create-index-anonymous-rule toks)

`CREATE [UNIQUE] INDEX ON …` → inject `idx_auto_<N>` between INDEX and ON. PG allows unnamed indexes; JSqlParser doesn't. The counter is process-wide and monotonic; collisions across handler sessions are harmless because the name is thrown away by the `:create-index` no-op handler anyway.
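The injection can be pictured as a zero-width span placed at the end of the INDEX token. In this sketch the atom `index-counter`, the token map shape, and the function names are hypothetical; only the `idx_auto_<N>` naming comes from the description above.

```clojure
(require '[clojure.string :as str])

(defn tok [kind text start]
  {:kind kind :text text :start start :end (+ start (count text))})

(defn kw? [tok word]
  (and (= :ident (:kind tok))
       (= word (str/upper-case (:text tok)))))

;; Process-wide, monotonic counter (hypothetical name).
(defonce index-counter (atom 0))

(defn anonymous-index-spans [toks]
  ;; vec forces evaluation so each counter bump happens exactly once.
  (vec
   (for [[a b] (partition 2 1 toks)
         :when (and (kw? a "INDEX") (kw? b "ON"))]
     [(:end a) (:end a) (str " idx_auto_" (swap! index-counter inc))])))

;; Hand-built tokens for: CREATE UNIQUE INDEX ON t (a)
(def index-toks
  [(tok :ident "CREATE" 0) (tok :ident "UNIQUE" 7) (tok :ident "INDEX" 14)
   (tok :ident "ON" 20) (tok :ident "t" 23)
   (tok :punct "(" 25) (tok :ident "a" 26) (tok :punct ")" 27)])
```

Calling `(anonymous-index-spans index-toks)` yields one span at offset 19, which rewrites the statement to `CREATE UNIQUE INDEX idx_auto_<N> ON t (a)` with `<N>` taken from the counter.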
(create-sequence-no-clause-rule toks)

Strip `NO MINVALUE`, `NO MAXVALUE`, `NO CYCLE` token pairs anywhere they appear (typically inside a CREATE SEQUENCE statement). Replaces each with a single space.

We don't restrict to CREATE SEQUENCE context because a `NO MINVALUE` pair would only appear there in well-formed SQL, and the token-driven matcher is comment- and string-literal-safe via classify.
(default-fn-call-paren-rule toks)

Match `DEFAULT <fn>(...)` for fn ∈ {nextval, currval, lastval} where the call isn't already wrapped in extra parens, and inject parens around the call. Same semantics, parser-friendly form.

Skipped when the token after DEFAULT is already `(`: we assume the user already wrapped the call and leave it alone.

Token-driven rewrites applied in `datahike.pg.sql/preprocess-sql` before JSqlParser sees the SQL. Each rule operates on tokens, not on raw source, so a keyword sitting inside a string literal or comment is invisible to it. Order matters only for rules that target the same source span; all rules here are disjoint.
(inline-references-rule toks)

Inline `col TYPE … REFERENCES name [(cols)] [ON (DELETE|UPDATE) action]`. JSqlParser doesn't accept the inline form, so we rewrite it. Three paths:

1. **NO ACTION / RESTRICT / no action clause**: just strip the `REFERENCES …` span. Our existing FK plumbing only enforces table-level `FOREIGN KEY (col) REFERENCES …`, and NO ACTION / RESTRICT have no operational consequence beyond blocking the parent delete (which we then can't enforce, but Odoo's `_auto_init` flow doesn't depend on it).
2. **CASCADE on DELETE**: lift to a table-level `FOREIGN KEY` so our FK plumbing tracks it and enforces cascade at runtime. We strip the inline span AND inject a synthetic `, FOREIGN KEY (col) REFERENCES name(cols) ON DELETE CASCADE` just before the closing `)` of the CREATE TABLE column list.
3. **SET NULL / SET DEFAULT / ON UPDATE non-trivial**: raise 0A000; not yet implemented at the runtime side.

Distinguishes inline from table-level by checking the previous non-comment token: if it's `)` (from `FOREIGN KEY (col)`), we leave the whole REFERENCES alone so JSqlParser parses it natively as a ForeignKeyIndex.
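The choice between the paths can be sketched as a small decision function. Everything here is hypothetical scaffolding: the input map with `:on-delete` / `:on-update` (upper-cased action strings, or nil when the clause is absent), the keyword results, and the use of `ex-info` with a `:sqlstate` key to carry 0A000 are assumptions, not the rule's actual interface.

```clojure
(defn trivial-action?
  "nil means the ON clause was absent. contains? is used because a set
  lookup of nil would itself return nil."
  [action]
  (contains? #{nil "NO ACTION" "RESTRICT"} action))

(defn fk-path
  "Pick a rewrite path for one inline REFERENCES clause."
  [{:keys [on-delete on-update]}]
  (cond
    (and (trivial-action? on-delete) (trivial-action? on-update))
    :strip

    (and (= "CASCADE" on-delete) (trivial-action? on-update))
    :lift-to-table-level

    :else
    (throw (ex-info "inline REFERENCES action not supported"
                    {:sqlstate  "0A000"
                     :on-delete on-delete
                     :on-update on-update}))))

(defn table-level-fk
  "Text injected before the closing paren of the column list."
  [col ref-table ref-cols]
  (str ", FOREIGN KEY (" col ") REFERENCES " ref-table
       "(" ref-cols ") ON DELETE CASCADE"))

(fk-path {})                       ;; => :strip
(fk-path {:on-delete "CASCADE"})   ;; => :lift-to-table-level
(table-level-fk "parent_id" "parent" "id")
;; => ", FOREIGN KEY (parent_id) REFERENCES parent(id) ON DELETE CASCADE"
```

`(fk-path {:on-delete "SET NULL"})` throws in this sketch, mirroring the third path.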
(operator-paren-rule toks)

Match `OPERATOR ( <ident> . <op> )` and replace the whole span with the bare operator symbol. The op token may be any `:op`, single or multi-char (`~`, `!~`, `~*`, `||`, etc.).
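A six-token sliding window is enough to illustrate this match. The token map shape, the `:punct` kind for parens and the dot, and the function names are assumptions of the sketch; only the `:op` kind is named in the description above.

```clojure
(require '[clojure.string :as str])

(defn tok [kind text start]
  {:kind kind :text text :start start :end (+ start (count text))})

(defn operator-paren-spans [toks]
  (for [[kw open schema dot op close] (partition 6 1 toks)
        :when (and (= :ident (:kind kw))
                   (= "OPERATOR" (str/upper-case (:text kw)))
                   (= "(" (:text open))
                   (= :ident (:kind schema))
                   (= "." (:text dot))
                   (= :op (:kind op))
                   (= ")" (:text close)))]
    ;; Replace everything from OPERATOR through ")" with the operator text.
    [(:start kw) (:end close) (:text op)]))

;; Hand-built tokens for: a OPERATOR(pg_catalog.~) b
(def op-toks
  [(tok :ident "a" 0)
   (tok :ident "OPERATOR" 2)
   (tok :punct "(" 10)
   (tok :ident "pg_catalog" 11)
   (tok :punct "." 21)
   (tok :op "~" 22)
   (tok :punct ")" 23)
   (tok :ident "b" 25)])

(operator-paren-spans op-toks)
;; => ([2 24 "~"])
```

Applying that span rewrites `a OPERATOR(pg_catalog.~) b` to `a ~ b`.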
(partition-by-rule toks)

Strip `PARTITION BY <strategy> (<expr>)` after a top-level CREATE TABLE body. Replaces the matched span with a single space; the trailing `;` stays in place so statement boundaries are unaffected.

Walks the token stream looking for `partition` `by` `<ident>` followed by a `(...)` group. The clause belongs to a CREATE TABLE, not a CREATE INDEX or other DDL, but the rule doesn't need that context: PARTITION BY only appears in CREATE TABLE in any well-formed PG SQL, and the pre-parse rewrite is conservative (at worst we'd delete a syntactically similar but semantically absurd substring elsewhere).
(primary-key-only-body-rule toks)

Match `( PRIMARY KEY ( <ident> [, <ident>]* ) )` as a complete parenthesised body (i.e. nothing else inside the outer `(...)`) and replace it with `(id serial)`. Used by Odoo's bootstrap DDL, where tables declared with `INHERITS (parent)` carry only a PK.
(quote-reserved-alias-rule toks)

Find `AS <reserved-kw>` outside of `CAST(... AS ...)` contexts and replace `<reserved-kw>` with `"<reserved-kw>"` so JSqlParser accepts it as an identifier. PG already treats the two forms equivalently (both produce the same column label).
(reserved-column-name-rule toks)

Match `<INDEX|KEY> varchar` and quote the first ident: `"INDEX" varchar`.
(rewrite sql rules)

Apply `rules` to `sql` and return the rewritten string. Each rule is a `(tokens) -> seq of [start end replacement]` fn. Exceptions thrown by rules propagate to the caller (callers rely on this for unsupported-feature detection, e.g. an inline FK with ON DELETE SET NULL, which `inline-references-rule` rejects with 0A000).
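An end-to-end sketch of how a rewrite driver might compose rules. None of this is the namespace's implementation: `toy-tokenize` is a deliberately crude stand-in for the real tokenizer, and `rewrite-sketch`, `no-clause-rule`, and `reject-set-null-rule` are hypothetical names chosen for the example.

```clojure
(require '[clojure.string :as str])

(defn toy-tokenize
  "Crude stand-in tokenizer: single-quoted strings, -- line comments,
  word runs, and any other non-space character as :punct."
  [sql]
  (let [m (re-matcher #"'(?:[^']|'')*'|--[^\n]*|\w+|\S" sql)]
    (loop [acc []]
      (if (.find m)
        (let [text (.group m)]
          (recur (conj acc {:kind  (cond (str/starts-with? text "'")  :string
                                         (str/starts-with? text "--") :comment
                                         (re-matches #"\w+" text)     :ident
                                         :else                        :punct)
                            :text  text
                            :start (.start m)
                            :end   (.end m)})))
        acc))))

(defn no-clause-rule
  "Replace NO MINVALUE / NO MAXVALUE / NO CYCLE pairs with one space."
  [toks]
  (for [[a b] (partition 2 1 toks)
        :when (and (= :ident (:kind a) (:kind b))
                   (= "NO" (str/upper-case (:text a)))
                   (contains? #{"MINVALUE" "MAXVALUE" "CYCLE"}
                              (str/upper-case (:text b))))]
    [(:start a) (:end b) " "]))

(defn reject-set-null-rule
  "Produces no spans; throws when it sees ON DELETE SET NULL."
  [toks]
  (doseq [window (partition 4 1 toks)
          :when (= ["ON" "DELETE" "SET" "NULL"]
                   (map #(str/upper-case (:text %)) window))]
    (throw (ex-info "ON DELETE SET NULL is not supported"
                    {:sqlstate "0A000"})))
  [])

(defn rewrite-sketch [sql rules]
  (let [toks  (toy-tokenize sql)
        ;; sort-by realises the lazy seq, so a throwing rule fails here.
        spans (sort-by first > (mapcat #(% toks) rules))]
    (reduce (fn [acc [start end replacement]]
              (str (subs acc 0 start) replacement (subs acc end)))
            sql
            spans)))

(rewrite-sketch "CREATE SEQUENCE s NO CYCLE; -- NO CYCLE stays"
                [no-clause-rule reject-set-null-rule])
```

The `NO CYCLE` inside the trailing comment is left untouched because the toy tokenizer hands it to the rules as a single `:comment` token, and a statement containing `ON DELETE SET NULL` makes the whole call throw.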
(select-from-rule toks)

`SELECT FROM …` (empty projection) → `SELECT 1 FROM …`. PG allows projection-less SELECT in EXISTS subqueries; JSqlParser doesn't.
(type-using-rule toks)

Match `TYPE <ident>[(...)] [<more idents>]* USING <anything>` and strip from `USING` to the end of the statement (next `;` at depth 0 or end of input).

The `TYPE`-prefixed gating is what makes this safe: a bare `USING` keyword (e.g. `JOIN ... USING (col)`) never has a preceding `TYPE` token in the same clause, so it isn't matched.
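The gating and the depth-0 statement boundary can be sketched like this. It is a simplification under stated assumptions: the token map shape and all names are hypothetical, and the sketch accepts any `USING` between `TYPE` and the end of the statement, without checking the exact `<ident>[(...)] [<more idents>]*` shape in between.

```clojure
(require '[clojure.string :as str])

(defn tok [kind text start]
  {:kind kind :text text :start start :end (+ start (count text))})

(defn kw? [tok word]
  (and (= :ident (:kind tok))
       (= word (str/upper-case (:text tok)))))

(defn type-using-spans [toks]
  (let [toks     (vec toks)
        n        (count toks)
        ;; Index of the first ";" at paren depth 0 at or after i, else n.
        stmt-end (fn [i]
                   (loop [j i, depth 0]
                     (if (>= j n)
                       n
                       (let [s (:text (nth toks j))]
                         (cond
                           (= "(" s) (recur (inc j) (inc depth))
                           (= ")" s) (recur (inc j) (dec depth))
                           (and (= ";" s) (zero? depth)) j
                           :else (recur (inc j) depth))))))]
    (for [i (range n)
          :when (kw? (nth toks i) "TYPE")
          :let [end (stmt-end i)
                u   (first (filter #(kw? (nth toks %) "USING")
                                   (range (inc i) end)))]
          :when u]
      [(:start (nth toks u))
       (if (< end n) (:start (nth toks end)) (:end (peek toks)))
       ""])))

;; Hand-built tokens for: ALTER COLUMN c TYPE int USING f(c);
(def type-toks
  [(tok :ident "ALTER" 0) (tok :ident "COLUMN" 6) (tok :ident "c" 13)
   (tok :ident "TYPE" 15) (tok :ident "int" 20) (tok :ident "USING" 24)
   (tok :ident "f" 30) (tok :punct "(" 31) (tok :ident "c" 32)
   (tok :punct ")" 33) (tok :punct ";" 34)])

(type-using-spans type-toks)
;; => ([24 34 ""])
```

The span stops just before the `;`, so the statement terminator survives, and a token stream such as `JOIN b USING (id)` produces no span at all because no `TYPE` token precedes the `USING`.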