Token-driven hand-parser for `COPY ... FROM STDIN` (and `COPY ...
TO STDOUT`, deferred). JSqlParser 5.x doesn't recognise COPY at
all — UnsupportedStatement — so the wire-protocol layer needs
structured access before it can drive the COPY-IN sub-protocol.
Also exposes `row->entity-map`: shared helper used by the COPY-IN
exec handler (server.clj) to turn a vector of decoded
String|::null values into a Datahike entity map keyed by
`:<ns>/<col>` attributes, with per-column string→type coercion
driven by `:db/valueType`.
Mirrors the structure of `datahike.pg.sql.database`: tokenise,
parse the prefix (table + optional column list + FROM/TO target),
then parse the option list in either:
- **Modern paren form** — `WITH (key [=] value [, ...])`
- **Legacy keyword form** — `[WITH] kw1 kw2 ...` (e.g.
`WITH BINARY`, `WITH CSV HEADER`, `DELIMITER '|' NULL '\N' CSV`)
still in the wild from old pg_dump output.
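Both forms reduce to the same raw `[["key" value] ...]` pair list before normalisation. A minimal sketch of the legacy-keyword pass, assuming an illustrative (not exhaustive) keyword set — the helper name `legacy-opts->pairs` is hypothetical, not from this namespace:

```clojure
(require '[clojure.string :as str])

;; Sketch: turn legacy bare keywords into ["key" value] pairs.
;; Bare flags (BINARY, CSV, HEADER) become fixed entries;
;; value-taking keywords (DELIMITER, NULL, ...) consume the next
;; token (kept raw, quotes included).
(defn legacy-opts->pairs [toks]
  (loop [toks toks, acc []]
    (if-let [[t & more] (seq toks)]
      (let [kw (str/lower-case t)]
        (case kw
          ("binary" "csv") (recur more (conj acc ["format" kw]))
          "header"         (recur more (conj acc ["header" "true"]))
          ("delimiter" "null" "quote" "escape")
          (recur (rest more) (conj acc [kw (first more)]))
          (throw (ex-info "unknown COPY option"
                          {:error :syntax-error :token t}))))
      acc)))

(legacy-opts->pairs ["CSV" "HEADER" "DELIMITER" "'|'"])
;; => [["format" "csv"] ["header" "true"] ["delimiter" "'|'"]]
```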
PG syntax (from `../postgres/doc/src/sgml/ref/copy.sgml`):
COPY [schema.]table [ ( col [, ...] ) ]
FROM { 'file' | PROGRAM 'cmd' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
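The tokenise step can be pictured with a simplified tokeniser (an assumption for illustration; the real one handles more edge cases): split on whitespace, keep single-quoted strings as one token, and emit parens and commas as their own tokens.

```clojure
;; Simplified COPY tokeniser sketch. Quoted strings keep their
;; surrounding quotes; a later step would unescape them.
(defn tokenise [s]
  (vec (re-seq #"'(?:[^']|'')*'|[(),]|[^\s(),]+" s)))

(tokenise "COPY users (id, name) FROM STDIN WITH (FORMAT 'csv')")
;; => ["COPY" "users" "(" "id" "," "name" ")" "FROM" "STDIN"
;;     "WITH" "(" "FORMAT" "'csv'" ")"]
```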
Options accepted:
FORMAT 'text' | 'csv' | 'binary'
DELIMITER 'X' — single byte
NULL 'X' — null marker
HEADER BOOL | MATCH — 1st row treatment
QUOTE 'X' — CSV quote char
ESCAPE 'X' — CSV escape char (defaults to QUOTE)
FORCE_NOT_NULL ( col, ... ) | *
FORCE_NULL ( col, ... ) | *
FORCE_QUOTE ( col, ... ) | * — TO-only; we accept for COPY FROM as a no-op
ENCODING 'X' — accepted, ignored (UTF-8 internal)
FREEZE [ BOOL ] — accepted, ignored
DEFAULT 'X' — defaults-marker (PG 16+)
OIDS [ BOOL ] — legacy, removed in PG 12; rejected
Output shape (for COPY FROM STDIN):
{:type :copy-from-stdin
:ns string ;; lowercase table namespace
:table string ;; original-case table name
:columns [string ...] ;; lowercase, or nil if no col-list given
:options {:format :text|:csv|:binary
:delimiter String
:null-marker String
:quote String
:escape String
:header :true|:false|:match
:force-not-null #{string ...} | :all
:force-null #{string ...} | :all
:encoding String
:freeze? boolean
:default-marker String}}
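A hand-written illustration of the shape above (assumed values, not actual parser output), e.g. for `COPY Users (id, name) FROM STDIN WITH (FORMAT csv)`:

```clojure
(def example-parse
  {:type    :copy-from-stdin
   :ns      "users"          ;; lowercase namespace -> attrs like :users/id
   :table   "Users"          ;; original case preserved
   :columns ["id" "name"]
   :options {:format      :csv
             :delimiter   ","
             :null-marker ""
             :quote       "\""
             :escape      "\""
             :header      :false}})
```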
Defaults (filled by `parse-copy-from`):
text: delimiter "\t", null-marker "\N"
csv: delimiter ",", null-marker "",
quote "\"", escape = quote, header :false
binary: rejected at this layer (returns :feature-not-supported)

(options->map opts-vec)

Translate the raw `[["key" value] ...]` option list (from either the
paren or legacy parser) into a normalised map with defaults applied.
Throws `:feature-not-supported` for `FORMAT 'binary'` and
`:syntax-error` for unknown options.
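The defaults logic can be sketched as merging a per-format base map under the user-supplied options (a minimal sketch under stated assumptions; `apply-defaults` is an illustrative name, not this namespace's implementation):

```clojure
(def text-defaults {:format :text :delimiter "\t" :null-marker "\\N"})
(def csv-defaults  {:format :csv  :delimiter ","  :null-marker ""
                    :quote "\"" :header :false})

(defn apply-defaults [opts]
  (let [fmt  (get opts :format :text)
        base (case fmt
               :text   text-defaults
               :csv    csv-defaults
               :binary (throw (ex-info "COPY BINARY not supported"
                                       {:error :feature-not-supported})))]
    (cond-> (merge base opts)
      ;; ESCAPE defaults to QUOTE when not given explicitly
      (and (= fmt :csv) (nil? (:escape opts)))
      (assoc :escape (get opts :quote "\"")))))
```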
(parse-copy-from-stdin toks)
Parse a tokenised `COPY ... FROM STDIN` statement.
Returns:
{:db-name nil ;; for parity with database.clj parse shape
:ns lowercase-string-or-nil
:table original-case-string
:columns [lowercase-string ...] | nil
:options normalised-options-map}
Throws ex-info with `:error :syntax-error` on malformed input,
or `:feature-not-supported` for COPY BINARY / OIDS.(row->entity-map row columns ns row-marker schema row-idx)Build a Datahike entity map from a single COPY data row.
Build a Datahike entity map from a single COPY data row.
row — vector of (String | text-null-sentinel | csv-null-sentinel)
columns — vector of lower-case column-name strings (length
must match row)
ns — table namespace (string)
row-marker — row-existence marker keyword (e.g.
`:users/db-row-exists`) — set true on every
entity so `SELECT *` row-marker filtering finds
them, matching the convention pg-datahike's INSERT
translator uses (stmt.clj:856).
schema — Datahike :schema map
row-idx — sequential row index (used to mint a fresh
`:db/id "copy-row-<n>"` tempid)
NULL values are dropped from the entity (Datahike treats missing
keys as 'no datom for this attr', which is the same as SQL NULL).
Empty fields are kept as empty strings; not-null enforcement
happens later via `apply-column-constraints`.
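The steps above can be sketched as follows (simplified: only `:db.type/long` coercion is shown, `::null` stands in for both null sentinels, and `row->entity-sketch` is an illustrative name, not the real helper):

```clojure
(defn row->entity-sketch [row columns ns-str row-marker schema row-idx]
  (let [coerce (fn [attr v]
                 (if (= :db.type/long (get-in schema [attr :db/valueType]))
                   (Long/parseLong v)
                   v))]
    (reduce (fn [e [col v]]
              (if (= ::null v)
                e                                 ;; NULL: emit no datom
                (let [attr (keyword ns-str col)]  ;; :<ns>/<col>
                  (assoc e attr (coerce attr v)))))
            {:db/id (str "copy-row-" row-idx)     ;; fresh tempid per row
             row-marker true}                     ;; row-existence marker
            (map vector columns row))))

(row->entity-sketch ["42" ::null "Ada"]
                    ["id" "email" "name"]
                    "users" :users/db-row-exists
                    {:users/id {:db/valueType :db.type/long}}
                    0)
;; => {:db/id "copy-row-0", :users/db-row-exists true,
;;     :users/id 42, :users/name "Ada"}
```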