- stratum.parquet/index-parquet!): reads a Parquet file row group by row group into chunked PersistentColumnIndex columns, syncing periodically to konserve so the chunk heap is reclaimable. Memory during reading is bounded by chunk-size × num-cols × 8 B, independent of file size. Wired into stratum.files/index-file-into-store!, so the --index and SQL read_parquet+--index paths use it automatically.
- stratum.parquet/from-parquet (heap path) rewritten on top of pre-allocated primitive arrays and dict-encoded strings at ingest. No more ArrayList<Object> boxing or per-row String materialization. Same public API; ~5× lower peak heap on column-heavy files.
- chunk/chunk-to-bytes cast Double/MAX_VALUE (compute-stats's "no values seen" sentinel) to long, throwing "Value out of range" for chunks where every row is NULL. Now guarded by null-count and encoded as constant-NULL (8 bytes).
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING now correctly routes to the sliding window path instead of falling through to the default running sum.
- SELECT CORR(x, y) FROM t without GROUP BY no longer crashes.
- COALESCE(NULL, 1) and N-ary COALESCE(NULL, NULL, 42) now work correctly. Multi-argument COALESCE is nested into binary pairs.
- WHERE x NOT IN (1, NULL) now correctly returns no rows, per SQL three-valued logic.

Initial public release.
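
The NOT IN and COALESCE fixes listed above both depend on standard SQL NULL semantics. The following is a minimal Python sketch of those semantics, not stratum's implementation: `None` stands in for NULL/UNKNOWN, all function names are hypothetical, and the right-nested pairing for COALESCE is an assumption, since the changelog only says that arguments are nested into binary pairs.

```python
from functools import reduce


def sql_eq(a, b):
    """SQL '=' under three-valued logic; None models NULL / UNKNOWN."""
    if a is None or b is None:
        return None
    return a == b


def sql_not_in(x, values):
    """Evaluate `x NOT IN (values)` to True, False, or None (UNKNOWN)."""
    results = [sql_eq(x, v) for v in values]
    if any(r is True for r in results):
        return False  # x equals some list element
    if any(r is None for r in results):
        return None   # no match, but a NULL comparison leaves the result UNKNOWN
    return True


def where_not_in(rows, values):
    """WHERE keeps a row only when the predicate is TRUE; UNKNOWN is dropped."""
    return [x for x in rows if sql_not_in(x, values) is True]


def coalesce2(a, b):
    """Binary COALESCE: first argument unless it is NULL."""
    return a if a is not None else b


def coalesce(*args):
    """N-ary COALESCE as nested binary pairs (assumed right-nested):
    COALESCE(a, b, c) == coalesce2(a, coalesce2(b, c))."""
    return reduce(lambda acc, a: coalesce2(a, acc), reversed(args))
```

With these definitions, `where_not_in([1, 2, 3], [1, None])` yields an empty list, because every non-matching row compares as UNKNOWN against the NULL element, while `coalesce(None, None, 42)` yields `42`. The sketch evaluates all arguments eagerly, so it models only the resulting values, not any short-circuit evaluation a SQL engine may perform.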