zero-one.geni.storage


disk-only

Flag for controlling the storage of an RDD.

The DataFrame is stored only on disk. Access is slower than with the in-memory levels because every read involves I/O.

disk-only-2

Flag for controlling the storage of an RDD.

Same as the disk-only storage level, but each partition is replicated on two cluster nodes.

memory-and-disk

Flag for controlling the storage of an RDD.

The default storage level for a DataFrame or Dataset. At this storage level, the DataFrame is stored in JVM memory as deserialized objects. When the required storage exceeds the available memory, the excess partitions are written to disk and read back from disk when they are needed. Access to those partitions is slower because it involves I/O.
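
The spill behaviour described above can be pictured with a small toy model. This is only a sketch in plain Clojure: `place-partitions`, the memory budget, and the partition sizes are hypothetical names and numbers made up for illustration, and the greedy rule it uses is a simplification, not Spark's actual memory manager or eviction policy.

```clojure
(defn place-partitions
  "Toy model of memory-and-disk placement (hypothetical helper, not part
  of geni or Spark). Walks the partitions in order, keeps each one in
  memory while it fits in the remaining budget, and sends the others to
  disk. Sizes and the budget share one arbitrary unit."
  [memory-budget partition-sizes]
  (:placements
   (reduce (fn [{:keys [free] :as state} size]
             (if (<= size free)
               ;; The partition fits: reserve the space and keep it in memory.
               (-> state
                   (update :free - size)
                   (update :placements conj {:size size :location :memory}))
               ;; Not enough free memory: the partition goes to disk.
               (update state :placements conj {:size size :location :disk})))
           {:free memory-budget :placements []}
           partition-sizes)))

(place-partitions 100 [40 40 40 10])
;; => [{:size 40, :location :memory}
;;     {:size 40, :location :memory}
;;     {:size 40, :location :disk}
;;     {:size 10, :location :memory}]
```

In this model, only the third partition would pay the extra I/O cost on later reads; the other three stay in memory.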

memory-and-disk-2

Flag for controlling the storage of an RDD.

Same as the memory-and-disk storage level, but each partition is replicated on two cluster nodes.

memory-and-disk-ser

Flag for controlling the storage of an RDD.

Same as the memory-and-disk storage level, except that the DataFrame objects are stored in serialized form, both in memory and on disk when memory runs out.

memory-and-disk-ser-2

Flag for controlling the storage of an RDD.

Same as the memory-and-disk-ser storage level, but each partition is replicated on two cluster nodes.

memory-only

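Flag for controlling the storage of an RDD.

The RDD is stored in JVM memory as deserialized objects. Partitions that do not fit in memory are not cached and are recomputed when they are needed.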

memory-only-2

Flag for controlling the storage of an RDD.

Same as the memory-only storage level, but each partition is replicated on two cluster nodes.

memory-only-ser

Flag for controlling the storage of an RDD.

Same as memory-only, except that the RDD is stored as serialized objects in JVM memory. It uses less memory than memory-only because the objects are kept in serialized form, at the cost of a few extra CPU cycles to deserialize them on access.

memory-only-ser-2

Flag for controlling the storage of an RDD.

Same as the memory-only-ser storage level, but each partition is replicated on two cluster nodes.

none

Flag for controlling the storage of an RDD.

No caching.

off-heap

Flag for controlling the storage of an RDD.

Off-heap storage holds objects, serialised to byte arrays, in native memory that is managed by the operating system and sits outside the process heap, so they are not processed by the garbage collector. Accessing this data is slightly slower than accessing on-heap storage but still faster than reading from or writing to disk. The downside is that the allocated memory has to be managed manually.
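
To compare the flags in this namespace side by side, the following sketch restates them as plain Clojure data. It assumes each var maps one-to-one onto the identically named constant of Spark's `StorageLevel` class, and the field values are taken from those Spark definitions (which is why `:off-heap` also has its disk and memory flags set). The names `level`, `base-levels`, and `storage-levels` are hypothetical and exist only for this illustration; none of this is geni's implementation.

```clojure
(defn level
  "Builds a plain map mirroring the fields of Spark's StorageLevel.
  Hypothetical helper, for illustration only."
  [disk? memory? off-heap? deserialized?]
  {:disk? disk? :memory? memory? :off-heap? off-heap?
   :deserialized? deserialized? :replication 1})

(def base-levels
  ;;                           disk? memory? off-heap? deserialized?
  {:none                (level false false   false     false)
   :disk-only           (level true  false   false     false)
   :memory-only         (level false true    false     true)
   :memory-only-ser     (level false true    false     false)
   :memory-and-disk     (level true  true    false     true)
   :memory-and-disk-ser (level true  true    false     false)
   :off-heap            (level true  true    true      false)})

(def storage-levels
  "base-levels plus the -2 variants, which differ only in replication."
  (into base-levels
        (for [k [:disk-only :memory-only :memory-only-ser
                 :memory-and-disk :memory-and-disk-ser]]
          [(keyword (str (name k) "-2"))
           (assoc (base-levels k) :replication 2)])))

;; Which levels keep data in memory in serialized form?
(sort (for [[k v] storage-levels
            :when (and (:memory? v) (not (:deserialized? v)))]
        k))
;; => (:memory-and-disk-ser :memory-and-disk-ser-2
;;     :memory-only-ser :memory-only-ser-2 :off-heap)
```

Keeping the summary as data makes it easy to filter by a single property, such as replication or disk use, when choosing a level.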
