A useful top-level enumeration of the functionality to target is the sklearn top-level API doc.
cross validation
provide helpers to split a dataset into train, test, and validation sets
k-fold cross-validation helpers (including stratified variants that keep the same class balance as the full set)
sklearn can be verbose and painful here; it would be nice to destructure the folds by position or keys, as in a doseq or let, within a scope defined by the k-fold (macro?).
Manual indexing in particular is painful; a let-style destructuring over the train/validation indices would hide it (see the sketch below for what the manual version looks like in sklearn).
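For comparison, a minimal sketch of the split-plus-k-fold workflow as it looks in sklearn today, with the manual index bookkeeping that a let/doseq-style destructuring would hide (the dataset and model here are placeholders chosen only for illustration):

```python
# Explicit train/test split plus stratified k-fold, with the manual index
# juggling that a let/doseq-style destructuring macro would remove.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out a test set; the remainder is used for cross validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stratified folds keep the same class balance as the full set.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, valid_idx) in enumerate(skf.split(X_train, y_train)):
    # The manual indexing step that has to be repeated everywhere.
    X_tr, X_va = X_train[train_idx], X_train[valid_idx]
    y_tr, y_va = y_train[train_idx], y_train[valid_idx]
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"fold {fold}: validation accuracy {model.score(X_va, y_va):.3f}")
```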
grid search
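For the grid search item, sklearn's GridSearchCV is the reference point; a minimal sketch (the estimator and parameter grid are arbitrary choices for illustration):

```python
# Exhaustive search over a small parameter grid with 5-fold cross validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)
```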
```python
# Multi-output regression: wrap a single-output regressor in
# MultiOutputRegressor to fit one estimator per target.
from sklearn.datasets import make_regression
from sklearn.multioutput import MultiOutputRegressor
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=10, n_targets=3, random_state=1)
MultiOutputRegressor(GradientBoostingRegressor(random_state=0)).fit(X, y).predict(X)
```
A meta-ensembler that supports boosting/bagging/blending methods, as described here.
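In sklearn terms, the nearest building blocks are BaggingClassifier, boosting estimators such as GradientBoostingClassifier, and StackingClassifier as the stacking/blending piece; a minimal stacking sketch (the base estimators and dataset are arbitrary choices):

```python
# Stacking: base estimators feed their predictions into a final meta-estimator.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
)
print(cross_val_score(stack, X, y, cv=5).mean())
```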
sklearn uses (fit) and (estimate) style stages: transformers provide fit, transform, and fit_transform, while estimators provide fit, predict, and predict_proba (goofy name, class probability). You want parameter search methods to have access to these stages as well.
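In sklearn these stages are just duck-typed methods; a minimal sketch of a custom transformer plus the estimator methods a parameter search would call (the class and data below are made up for illustration):

```python
# The sklearn stage protocol: a transformer implements fit/transform (and
# inherits fit_transform), an estimator implements fit/predict/predict_proba.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression


class CenterColumns(BaseEstimator, TransformerMixin):
    """Subtract the per-column mean learned during fit."""

    def fit(self, X, y=None):
        self.means_ = np.asarray(X).mean(axis=0)
        return self

    def transform(self, X):
        return np.asarray(X) - self.means_


X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y = np.array([0, 0, 1, 1])

Xt = CenterColumns().fit_transform(X)   # fit + transform
clf = LogisticRegression().fit(Xt, y)   # fit
print(clf.predict(Xt))                  # predict
print(clf.predict_proba(Xt))            # class probabilities
```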
These abstractions could be protocol/interface level, data descriptors a la spec, etc. Ideally
you could pipe any of this data flow description into a DAG:
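sklearn's closest analogue is Pipeline, which chains fit/transform stages into a single estimator and is effectively a linear DAG; a minimal sketch (the particular stages and parameters are placeholders):

```python
# A Pipeline is a linear DAG of fit/transform stages ending in an estimator;
# grid search can then address any node's parameters via "<step>__<param>".
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=5).fit(X, y)
print(search.best_params_)
```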