Clojask

Clojure data frame with parallel computing on larger-than-memory datasets

Features

  • Unlimited size

    In theory, it supports datasets larger than memory, with no fixed upper bound on size.

  • Fast

    Faster than Dask in most operations; the larger the dataframe, the bigger the advantage.

  • All native types

    All the datatypes used to store data are native Clojure (or Java) types.

  • From file to file

    IO is integrated into the dataframe. No need to write your own read-in and output functions!

  • Parallel

    Most operations can be executed across multiple threads or even multiple machines. See the underlying principles in Onyx.

  • Lazy operations

    Most operations are not executed immediately. The dataframe pipelines them together and runs them in one pass when the computation is triggered.
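
To make the "lazy operations" and "from file to file" ideas concrete, here is a minimal sketch that uses only clojure.core and clojure.java.io. It does not use Clojask's API and is not how Clojask is implemented. The helper `transform-file!`, the `pipeline` var, and the two-column name,city row layout are all hypothetical names chosen for illustration.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Composing the steps does no work yet: `pipeline` is only a description
;; of what to do with each row (split, filter, transform).
(def pipeline
  (comp (map #(str/split % #","))
        (filter #(= "NY" (nth % 1 nil)))
        (map (fn [[name city]] [(str/upper-case name) city]))))

;; Hypothetical helper (not part of Clojask): streams rows from in-path to
;; out-path, applying the transducer xf one row at a time, so the whole
;; file never has to fit in memory. Returns the number of rows written.
(defn transform-file!
  [in-path out-path xf]
  (with-open [rdr (io/reader in-path)
              wtr (io/writer out-path)]
    (transduce xf
               (completing
                 (fn [n row]
                   (.write ^java.io.Writer wtr (str (str/join "," row) "\n"))
                   (inc n)))
               0
               (line-seq rdr))))
```

Calling `(transform-file! "in.csv" "out.csv" pipeline)` is the point at which all the composed steps actually run, in a single pass over the input. This single-threaded sketch leaves out the parallel execution described above.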

Installation

Available on Clojars.

Insert this line into your project.clj if using Leiningen.

[com.github.clojure-finance/clojask "1.1.0"]
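
For orientation, a minimal project.clj might look like the sketch below. The project name `my-app`, its version, and the Clojure version are placeholders, not requirements stated by Clojask.

```clojure
;; project.clj (hypothetical minimal project)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.11.1"]
                 ;; the Clojask dependency line from above
                 [com.github.clojure-finance/clojask "1.1.0"]])
```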

Insert this line into your deps.edn if using CLI.

com.github.clojure-finance/clojask {:mvn/version "1.1.0"}
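
As a sketch, the entry goes inside the `:deps` map of deps.edn. The Clojure version shown is a placeholder, not a requirement stated by Clojask.

```clojure
;; deps.edn (hypothetical minimal file)
{:deps {org.clojure/clojure                {:mvn/version "1.11.1"}
        com.github.clojure-finance/clojask {:mvn/version "1.1.0"}}}
```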

Documentation

The detailed doc for every API can be found here.

Examples

A separate repository for some typical usage of Clojask can be found here.

Problem Feedback

If your question is not answered in the existing issues, feel free to create a new one.

Documentation contributors: Yuchen Liu, awoo424, c-sungho, Angel Woo & clojure-finance