Structural SQL classifier — routes statements to the right handler
before JSqlParser sees them.
Kills the regex sprawl across `system-query?`, `parse-sql`'s CT6
guard, and the arg-extraction regexes (savepoint names, advisory-
lock keys, temporal SET values) by feeding those sites a real
tokenizer.
The tokenizer handles enough of PG lexical syntax to classify
correctly in the face of keyword-inside-a-string, keyword-inside-
a-comment, dollar-quoted strings, and case mix. It does NOT parse
expressions — it yields a flat token stream for a keyword-dispatch
classifier.
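The lexical handling above can be illustrated with a minimal sketch. This is not the library's implementation — `tokenize-sketch`, `scan-dollar-quote`, and the `{:type ... :text ...}` token shape are invented here for illustration — but it shows the core idea: keywords inside dollar-quoted strings and line comments never surface as word tokens, so a downstream keyword dispatcher cannot be fooled by them.

```clojure
;; Sketch only: a scanner that skips line comments and treats dollar-quoted
;; bodies as single string tokens. Token shape is an assumption.
(defn scan-dollar-quote [^String s i]
  ;; `i` points at the opening $; read the $tag$, then find the matching close.
  (let [end-tag (inc (.indexOf s "$" (inc i)))        ; index just past the tag's closing $
        tag     (subs s i end-tag)
        close   (.indexOf s tag end-tag)]
    [{:type :string :text (subs s i (+ close (count tag)))}
     (+ close (count tag))]))

(defn tokenize-sketch [^String s]
  (loop [i 0 out []]
    (if (>= i (count s))
      out
      (let [c (.charAt s i)]
        (cond
          (Character/isWhitespace c) (recur (inc i) out)
          ;; line comment: skip to end of line
          (and (= c \-) (< (inc i) (count s)) (= (.charAt s (inc i)) \-))
          (let [nl (.indexOf s "\n" (int i))]
            (recur (if (neg? nl) (count s) (inc nl)) out))
          ;; dollar-quoted string: one opaque :string token
          (= c \$)
          (let [[tok j] (scan-dollar-quote s i)]
            (recur j (conj out tok)))
          ;; bare word: lowercase it, as PG treats unquoted identifiers
          (Character/isLetter c)
          (let [j (loop [j i]
                    (if (and (< j (count s))
                             (or (Character/isLetterOrDigit (.charAt s j))
                                 (= (.charAt s j) \_)))
                      (recur (inc j)) j))]
            (recur j (conj out {:type :word :text (.toLowerCase (subs s i j))})))
          :else (recur (inc i) (conj out {:type :punct :text (str c)})))))))

;; The SAVEPOINT inside the dollar quote and the one in the comment
;; never appear as :word tokens:
(map :text (filter #(= :word (:type %))
                   (tokenize-sketch
                    "SELECT $tag$ SAVEPOINT inside $tag$ -- SAVEPOINT x\nFROM t")))
;; => ("select" "from" "t")
```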
Classifier output:
{:kind <statement-kind keyword>
:name <savepoint/variable name if relevant>
:args <vector of literal args — advisory keys, pg_sleep dur>
:var <SET/RESET/SHOW variable name>
:value <SET value (string) when captured>
:reject-kind <opt-out knob key>
:tag <synthetic command tag on silent-accept>}
:kind :generic-sql means 'pass to JSqlParser unchanged'.
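A hedged sketch of the keyword-dispatch idea (the real classifier's internals are not shown on this page; `classify-sketch` and the `{:type :word :text ...}` token shape are assumptions): inspect the leading word tokens of the flat stream, emit a map matching the contract above, and fall through to `:kind :generic-sql` for anything unrecognized.

```clojure
;; Sketch only: dispatch on the leading keyword of a flat token stream.
(defn classify-sketch [tokens]
  (let [words (into [] (comp (filter #(= :word (:type %))) (map :text)) tokens)]
    (case (first words)
      "savepoint" {:kind :savepoint :name (second words)}
      "show"      {:kind :show :var (second words)}
      ;; anything unrecognized passes through to JSqlParser unchanged
      {:kind :generic-sql})))

(classify-sketch [{:type :word :text "savepoint"} {:type :word :text "sp1"}])
;; => {:kind :savepoint :name "sp1"}
```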
Non-goals: full PG lexer, expression parsing. The token-driven
preprocess-sql rewriter lives in datahike.pg.rewrite and consumes
this tokenizer's output.

(classify sql)

Classify a SQL string. See namespace docstring for the output contract.
(tokenize sql)

Lazy seq of non-comment tokens. Skips whitespace and comments. Stops at end-of-input; does not attempt to recover from mid-string EOF (caller will see :generic-sql and JSqlParser will report the real syntax error).
(tokenize-all sql)

Lazy seq of tokens INCLUDING :comment tokens (line and block).
Useful for source-rewriting callers that need to know comment
spans so they don't excise into or across a commented region.
Classification itself uses tokenize, which filters comments
out for speed.
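Why comment spans matter to a rewriter can be sketched as follows, assuming tokens carry `:start`/`:end` character offsets (an assumption for illustration; the real token shape may differ):

```clojure
;; Assumed token shape: {:type :comment :start 10 :end 25}, half-open span.
(defn inside-comment?
  "True when character `offset` falls within any comment token's span."
  [comment-tokens offset]
  (boolean (some (fn [{:keys [start end]}]
                   (and (<= start offset) (< offset end)))
                 comment-tokens)))

;; A source rewriter would refuse to excise text at an offset that lands
;; inside a commented region:
(inside-comment? [{:type :comment :start 10 :end 25}] 12)  ; => true
(inside-comment? [{:type :comment :start 10 :end 25}] 30)  ; => false
```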