(split ds)
(split ds split-type)
(split ds split-type {:keys [seed parallel?] :as opts})
Split the given dataset into train and test datasets, returned as a lazy sequence of maps with `:train` and `:test` keys.

`split-type` can be one of the following:

* `:kfold` - k-fold strategy; `:k` defines the number of folds (defaults to `5`), produces `k` splits
* `:bootstrap` - `:ratio` defines the ratio of observations put into the result (defaults to `1.0`), produces `1` split
* `:holdout` - split into two parts with the given ratio (defaults to `2/3`), produces `1` split
* `:loo` - leave one out, produces as many splits as there are observations

Additionally you can provide:

* `:seed` - seed for the random number generator
* `:repeats` - repeat the procedure `:repeats` times
* `:partition-selector` - same as in `group-by`, for stratified splitting so that splits reflect the dataset structure

Rows are shuffled before splitting.

For a grouped dataset, each group is processed separately and pairs of grouped datasets are returned.

See [more](https://www.mitpressjournals.org/doi/pdf/10.1162/EVCO_a_00069)
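To make the shape of the result concrete, here is a minimal plain-Clojure sketch of the `:kfold` and `:holdout` strategies applied to a vector of rows. It is not the library's implementation: `seeded-shuffle`, `kfold-splits`, and `holdout-split` are hypothetical helper names, and the way rows are dealt into folds and the rounding of the holdout boundary are assumptions made for illustration. It only mirrors the documented behaviour: rows are shuffled with a seed, and each split is a map with `:train` and `:test` keys.

```clojure
;; Illustrative sketch only; these helpers are hypothetical, not library functions.

(defn seeded-shuffle
  "Shuffle coll reproducibly using java.util.Random seeded with seed."
  [coll seed]
  (let [al (java.util.ArrayList. ^java.util.Collection (vec coll))]
    (java.util.Collections/shuffle al (java.util.Random. (long seed)))
    (vec al)))

(defn kfold-splits
  "Return k maps with :train and :test keys.
  Assumption: shuffled rows are dealt round-robin into k folds,
  and each fold serves as :test exactly once."
  [rows k seed]
  (let [shuffled (seeded-shuffle rows seed)
        ;; fold i holds the rows at shuffled positions i, i+k, i+2k, ...
        folds    (mapv (fn [i] (vec (take-nth k (drop i shuffled))))
                       (range k))]
    (for [i (range k)]
      {:train (vec (mapcat folds (remove #{i} (range k))))
       :test  (folds i)})))

(defn holdout-split
  "Return a single-element sequence holding one :train/:test map.
  Assumption: the first (round (* ratio n)) shuffled rows go to :train."
  [rows ratio seed]
  (let [shuffled (seeded-shuffle rows seed)
        n-train  (Math/round (double (* ratio (count shuffled))))]
    [{:train (subvec shuffled 0 n-train)
      :test  (subvec shuffled n-train)}]))

;; Usage: every row lands in exactly one :test fold across the k splits.
(def example-folds (kfold-splits (vec (range 10)) 5 42))
(map (comp count :test) example-folds)
```

Going by the signatures above, the equivalent call on a dataset would presumably look like `(split ds :kfold {:k 5 :seed 42})`, with each element of the result being a map of `:train` and `:test` datasets rather than vectors.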