charred.bulk

Helpers for bulk operations such as concatenating a large sequence of csv files.


batch-csv-rows (clj)

(batch-csv-rows batch-size row-seq)
(batch-csv-rows batch-size options row-seq)

Given a potentially very large sequence of rows, lazily return batches of rows. The returned object has efficient iterator and IReduceInit (3-arg reduce) implementations and a fairly inefficient seq implementation. Each batch must be read completely before the iterator's .hasNext function will return an accurate result.

Options:

  • :header? - When true, the header row will be returned as the first row of each batch. Defaults to true.
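A minimal usage sketch, assuming charred is on the classpath and using made-up in-memory rows in place of a real CSV reader. It reduces over the batches with the 3-arg reduce and reads each batch to completion inside the reducing function, as the note about .hasNext requires. The names `header`, `data-rows` and `batches` are illustrative only.

```clojure
(require '[charred.bulk :as bulk])

;; Illustrative in-memory rows; in practice these would come from a CSV reader.
(def header ["id" "squared"])
(def data-rows (mapv (fn [i] [(str i) (str (* i i))]) (range 10)))

;; Each batch is copied into a vector inside the reducing function, so it is
;; fully consumed before the outer reduce asks for the next batch.
(def batches
  (reduce (fn [acc batch] (conj acc (into [] batch)))
          []
          (bulk/batch-csv-rows 4 (cons header data-rows))))
```

With the default `:header? true`, the header row should appear at the start of every collected batch.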

cat-csv-inputs (clj)

(cat-csv-inputs)
(cat-csv-inputs options)

Stateful transducer that, given a sequence of inputs, produces a single sequence of parsed csv rows. This transducer slices off the header rows of downstream inputs when :header? is true.

Options:

  • :header? - defaults to true - assume first row of each file is a header row.

Options are passed through to read-csv-supplier.

Example:

```clojure
(transduce (comp (bulk/cat-csv-inputs options) (map tfn))
           (charred/write-csv-rf options)
           fseq)
```
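The header-slicing behaviour can be pictured with plain Clojure. This is a conceptual sketch over already-parsed rows, not charred's implementation: `cat-dropping-headers`, `input-a` and `input-b` are hypothetical names, and no file reading or CSV parsing is involved.

```clojure
;; Stateful via map-indexed: keep the first input whole, drop the first
;; (header) row of every later input, then concatenate all rows with cat.
(def cat-dropping-headers
  (comp (map-indexed (fn [idx rows] (if (zero? idx) rows (rest rows))))
        cat))

(def input-a [["id" "v"] ["1" "a"] ["2" "b"]])
(def input-b [["id" "v"] ["3" "c"]])

(def combined (into [] cat-dropping-headers [input-a input-b]))
;; => [["id" "v"] ["1" "a"] ["2" "b"] ["3" "c"]]
```

Only the first header survives, which matches the row counts a header-aware concatenation is expected to produce.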

concatenate-csv (clj)

(concatenate-csv output fseq)
(concatenate-csv output options fseq)

Given a sequence of csv files, concatenate into a single csv file.

  • fseq - a sequence of java.io.File objects or other inputs to read-csv-supplier
  • output - an output stream or other closeable stream.

Returns the number of rows written.

Options:

  • :header? - defaults to true - assume first row of each file is a header row.
  • :tfn - function from row->row that receives all output rows (header rows, aside from the first, are elided). If this function returns nil, that row is elided from the output.

Example:

```clojure
user> (->> (repeat 10 (java.io.File. "/home/chrisn/dev/tech.all/tech.ml.dataset/test/data/stocks.csv"))
           (bulk/concatenate-csv "test/data/big-stocks.csv" {:header? false}))
5610
user> (->> (repeat 10 (java.io.File. "/home/chrisn/dev/tech.all/tech.ml.dataset/test/data/stocks.csv"))
           (bulk/concatenate-csv "test/data/big-stocks.csv" {:header? true}))
5601
```
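The `:tfn` option is not exercised above, so here is a hedged sketch of how it might be used. It assumes charred is on the classpath, that a string path is accepted as `output` (as in the example above), and that rows reach `:tfn` as ordered collections of strings. The helper `temp-csv`, the function `drop-id-2` and the sample data are hypothetical.

```clojure
(require '[charred.bulk :as bulk])

;; Hypothetical helper: write content to a temporary CSV file and return the File.
(defn temp-csv [content]
  (let [f (java.io.File/createTempFile "bulk-example" ".csv")]
    (spit f content)
    f))

(def inputs [(temp-csv "id,name\n1,alpha\n2,beta\n")
             (temp-csv "id,name\n3,gamma\n")])

(def output-path
  (.getPath (java.io.File/createTempFile "bulk-out" ".csv")))

;; Returning nil elides the row; the surviving header row is passed through
;; unchanged because its first column is "id", not "2".
(defn drop-id-2 [row]
  (when-not (= "2" (first row)) row))

(bulk/concatenate-csv output-path {:header? true :tfn drop-id-2} inputs)
```

Because `:tfn` also receives the header row of the first input, the function has to return that row unchanged (or transformed) rather than nil if the output should keep its header.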
