Helpers for bulk operations such as concatenating a large sequence of csv files.
(batch-csv-rows batch-size row-seq)
(batch-csv-rows batch-size options row-seq)
Given a potentially very large sequence of rows, lazily return batches of rows. The returned object has efficient iterator and IReduceInit (3-arg reduce) implementations and a fairly inefficient seq implementation. The previous batch must be completely read before the iterator's .hasNext function will return an accurate result.
Options:
- `:header?` - When true, the header row is returned as the first row of each batch. Defaults to true.
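The shape of the batches described above can be illustrated with a small pure-Clojure model. This is only a sketch of the documented result, not the library's implementation: `model-batches` is a hypothetical name, it is built on `partition-all` rather than an iterator, and it assumes that `batch-size` counts data rows only (the docstring does not say whether the header counts toward the batch size).

```clojure
;; Hypothetical model of the documented batch shape; NOT charred's code.
;; Assumption: batch-size counts data rows only, excluding the header.
(defn model-batches
  "Split rows into batches of at most batch-size rows. When header? is true,
  the first row is treated as a header and prepended to every batch."
  [batch-size header? rows]
  (if header?
    (let [[header & data] rows]
      (map #(cons header %) (partition-all batch-size data)))
    (partition-all batch-size rows)))

;; Illustrative rows: one header row followed by three data rows.
(def rows [["sym" "price"] ["A" "1"] ["B" "2"] ["C" "3"]])

(model-batches 2 true rows)
;; => ((["sym" "price"] ["A" "1"] ["B" "2"]) (["sym" "price"] ["C" "3"]))
```

Unlike this model, the real return value is meant to be consumed through its iterator or a 3-arg `reduce`, reading each batch fully before asking for the next.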
(cat-csv-inputs)
(cat-csv-inputs options)
Stateful transducer that, given a sequence of inputs, produces a single sequence of parsed csv rows. When `:header?` is true, this transducer slices off the header row of every input after the first.
Options:
- `:header?` - defaults to true - assume the first row of each file is a header row.
Options are passed through to read-csv-supplier.
Example:
```clojure
(transduce (comp (bulk/cat-csv-inputs options) (map tfn))
           (charred/write-csv-rf options)
           fseq)
```
(concatenate-csv output fseq)
(concatenate-csv output options fseq)
Given a sequence of csv files, concatenate into a single csv file.
- fseq - a sequence of java.io.File objects or other inputs to read-csv-supplier
- output - an output stream or other closeable stream.
Returns the number of rows written.
Options:
- `:header?` - defaults to true - assume the first row of each file is a header row.
- `:tfn` - function from row->row that receives all output rows (header rows, aside from the first, are elided). If this function returns `nil`, that row is elided from the output.
Example:
```clojure
user> (->> (repeat 10 (java.io.File. "/home/chrisn/dev/tech.all/tech.ml.dataset/test/data/stocks.csv"))
           (bulk/concatenate-csv "test/data/big-stocks.csv" {:header? false}))
5610
user> (->> (repeat 10 (java.io.File. "/home/chrisn/dev/tech.all/tech.ml.dataset/test/data/stocks.csv"))
           (bulk/concatenate-csv "test/data/big-stocks.csv" {:header? true}))
5601
```
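The two row counts in the example are consistent with each input holding 561 rows, one header plus 560 data rows: 10 × 561 = 5610 when headers are kept, and 1 + 10 × 560 = 5601 when every header after the first is elided. The sketch below models that arithmetic in plain Clojure. `model-concat` and `fake-file` are hypothetical names, the 561-row size is inferred from the counts above rather than stated in the source, and the sketch does not touch the filesystem or model `:tfn`.

```clojure
;; Hypothetical model of the documented header handling; NOT the library's code.
(defn model-concat
  "Concatenate row sequences. When header? is true, keep the first input's
  header row and drop the first row of every input."
  [header? inputs]
  (if header?
    (concat (take 1 (first inputs)) (mapcat rest inputs))
    (apply concat inputs)))

;; Assumed stand-in for one input file: 1 header row + 560 data rows.
(def fake-file
  (cons ["symbol" "date" "price"]
        (for [i (range 560)] ["X" (str i) "0"])))

(count (model-concat false (repeat 10 fake-file))) ;; => 5610
(count (model-concat true (repeat 10 fake-file)))  ;; => 5601
```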