zero-one.geni.arrow


Clojure only.

collect-to-arrow
typed-action

collect-to-arrow (clj)

(collect-to-arrow rdd chunk-size out-dir)

Collects the dataframe on the driver and exports it as Arrow files. The data is transferred partition by partition, so each partition should be small enough to fit in the driver's heap space. The data is then saved to disk as Arrow files in chunks of chunk-size rows.

`rdd` Spark dataset.
`chunk-size` Number of rows each Arrow file will have. Should be small enough for the data to fit in the driver's heap space.
`out-dir` Output directory for the Arrow files.
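
A minimal usage sketch, assuming Geni's `zero-one.geni.core` namespace is available as `g` and a Spark session can be started locally. The input path, partition count, chunk size, and output directory are hypothetical values chosen for illustration. The return value is not documented here, so the sketch does not rely on it.

```clojure
(require '[zero-one.geni.core :as g]
         '[zero-one.geni.arrow :as arrow])

;; Hypothetical input: any Spark dataset works as the first argument.
(def df (g/read-parquet! "data/events.parquet"))

;; Repartition first so that each partition is small enough to fit in
;; the driver's heap (20 is an arbitrary, illustrative partition count).
(def small-partitions (g/repartition df 20))

;; Write Arrow files of 10000 rows each into a hypothetical directory.
(arrow/collect-to-arrow small-partitions 10000 "/tmp/arrow-out")
```

Both the partition size and `chunk-size` affect how much data sits on the driver at once, so lowering either is a reasonable first step if the driver runs out of memory.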


typed-action (clj)

(typed-action action col-type value-info row-info col-name allocator)
