reorder-columns can work on grouped dataset now
Deps updated
Documentation changed to be generated by Clay instead of RMarkdown
Deps updated to fix j/left-join issue.
nil as missing value only, discussion
:nil-missing? in more places needed (group-by operations), discussion
group-by documentation PR115, thanks to Marshall
Collections/shuffle removed
dataset (copied from TMD), #112
rows accepts :nil-missing? (default: true) and copying? (default: false) options.
Deps updated
:hashing is available for single column joins too
:hashing option determines method of creating an index for multicolumn joins (was hash, is identity)
Deps updated
map-rows to map each row and produce new columns
rows can return sequence of vectors (:as-vecs)
Updated to TMD v7
Differences:
Clojure upgraded to 1.11.1
separate-column infers column names when function is used and target-columns is nil, #78
separate-column replaces source column with target in every case
clojure.core/pmap with dtype-next version (related to #325)
get-entry introduced
anti-join and semi-join bugs when tables contain missing values
crosstab - cross tabulation
pivot->longer :coerce-to-number option added
pivot->wider no longer coerces column names to strings, it's up to user
TMD version bump
[breaking]
replace-missing up/down strategies clarified. :down is replaced by :downup and :up is replaced by :updown. :down and :up work only in one direction now.
https://github.com/techascent/tech.ml.dataset/issues/305
data frame term in the title of docs (discussion)
cross-join, expand and complete introduced
*warn-on-reflection*
Version bump
unroll and fold-by by @holyjak (#60 and #61)
select-rows accepts IFn for row selection.
pipeline namespace is stripped, all functions are moved to metamorph library. This is a temporary solution before removing this namespace completely. Pipelined versions of functions will be moved to metamorph as well later.
add-column
api, is: tc)
replace-missing on grouped dataset has swapped arguments
update-columns on grouped dataset
:as-rows now
add-column default strategy is :strict now.
TMD upgrade, no changes in TC
TMD upgrade
reorder-columns on empty dataset returns nil
aggregate-columns didn't keep column order (#35)
pipeline functions have doc copied from original ones
split can turn off shuffling now (:shuffle? option)
split :holdouts - sequence of consecutive holdouts
tech.ml.dataset version bump, this introduces the change of the order of the groups after group-by operation
split :holdout supports any number of splits (minimum 2) [#28]
split supports split-names to provide custom names for subdatasets
concat and concat-copying are working with grouped datasets
kfold split failed on small number of rows (due to partition-all behaviour)
split->seq to return train/test splits as a sequence of datasets or as a map of sequences for grouped datasets
tablecloth.pipeline returns a map with dataset under :metamorph/data key (see metamorph)
split now returns a dataset or grouped dataset with two new columns indicating train/test and split id. See split->seq for previous behaviour.
without-grouping-> threading macro which allows operations on a grouped dataset treated as a regular one.
group-by accepts any java.util.Map for a collection of indexes (use LinkedHashMap to persist an order)
tablecloth.api.group-by functions moved to tablecloth.api.utils, no changes to API
add-or-replace-column(s) replaced by add-column(s) (add-or-replace-column(s) is marked as deprecated) (#16)
mark-as-group wasn't visible in API (#18)
map-columns didn't propagate new-type for grouped case (#20)
let-dataset - to simulate tibble from R
rows and columns new result: :as-double-arrays - convert rows to 2d double array
tablecloth.pipeline for pipeline operations
concat-copying exposed.
split function for splitting into train-test pairs with :kfold, :bootstrap, :loo and holdout strategies + stratified versions
replace-missing with new strategy :midpoint
t.m.d update
t.m.d update
t.m.d update
write-nippy! and read-nippy are deprecated, replaced by write! and dataset
tech.ml.dataset version 5.0-alpha*
map-columns accepts optional target datatype
ds/column->dataset functionality introduced in separate-column
:text among others)
write-csv! replaced by write! (write-csv! is marked as deprecated)
info field :size is replaced by :n-elems
separate-column 3-arity version accepts separator instead of target-columns now
tech.ml.dataset version 4.04
tech.ml.dataset version 4.03
parallel? option set to true). These are: aggregate, unique-by, order-by, join-columns, separate-columns, ungroup
aggregation now uses in-place ungrouping, which is much faster
tech.ml.dataset version 3.06
fill-range-replace to inject data to make a continuous sequence in a column
write-nippy! and read-nippy
tech.ml.dataset version 2.13
replace-missing new strategies: :mid and :lerp, working also for dates.
replace-missing has a different contract and default strategy :mid. value argument is the last argument now.
replace-missing :up and :down strategies, when value is nil, fill border missing values with the nearest value.
tech.ml.dataset version 2.06
asof-join added
reshape tests
pivot->wider accepts :drop-missing? option (default: true)
pivot->wider drops missing rows by default
pivot->wider order of concatenated column names is reversed (first: colnames, last: value), was opposite.
pivot->longer :splitter accepts string used for splitting column name