The main clj-boost namespace. Requiring this namespace gives you all the functions for serializing data and for training and predicting with XGBoost models.
(cross-validation dmatrix {:keys [params rounds nfold metrics] :as config})
Perform cross-validation on the training set.

The training set must be a **DMatrix** instance (see [[dmatrix]]) while `config` is a regular map.

It returns a sequence of maps containing train and test error for every round as defined in the `config` map.

`config` fields:

- `:params` -> training parameters (see https://xgboost.readthedocs.io/en/latest/parameter.html)
- `:rounds` -> number of boosting iterations
- `:nfold` -> number of folds for cross-validation
- `:metrics` -> metrics to evaluate goodness of fit, must be a vector

Example:

```clojure
(cross-validation (dmatrix x y)
                  {:params {:eta 1.0}
                   :rounds 5
                   :nfold 3})
```
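The example above leaves out `:metrics`. The sketch below adds it on a small made-up dataset. It assumes that metric names are given as XGBoost evaluation-metric strings such as `"error"`, and that `"binary:logistic"` is an appropriate objective for the toy labels; neither is stated in the docstring, so treat both as assumptions.

```clojure
(require '[clj-boost.core :refer [dmatrix cross-validation]])

;; Toy binary-classification data, made up for illustration only.
(def x [[1 0] [0 1] [1 1] [0 0] [1 0] [0 1] [1 1] [0 0]])
(def y [1 0 1 0 1 0 1 0])

(def cv-results
  (cross-validation (dmatrix x y)
                    {:params  {:eta 0.3 :objective "binary:logistic"}
                     :rounds  5
                     :nfold   2
                     ;; Assumption: metrics are XGBoost metric names as strings.
                     :metrics ["error"]}))

;; Print whatever the library reports for each round.
(doseq [round-result cv-results]
  (println round-result))
```

Comparing the train and test values round by round is a common way to pick a value for `:rounds` before calling `fit`.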
Serializes given data to **DMatrix**.

The XGBoost API requires data structures to be serialized before the library can use them (https://xgboost.readthedocs.io/en/latest/jvm/java_intro.html). `dmatrix` tries to make the process as painless as possible by dealing internally with types and variable arguments.

It is possible to pass a map with `:x` and optionally `:y` keys. The value of `:x` must be either a sequence of sequences or a vector of vectors, and the value of `:y` a flat vector or sequence.

The input can also be a vector of vectors or sequence of sequences for *x* and, optionally, a flat vector or sequence for *y*.

If given a string as input, `dmatrix` loads a **DMatrix** from the path that the string names.

*y* is required only for training, not for prediction.

Examples:

```clojure
(def map-with-y {:x [[1 0] [0 1]] :y [1 0]})
(dmatrix map-with-y)

(def map-without-y {:x [[1 0] [0 1]]})
(dmatrix map-without-y)

(def vec-x [[1 0] [0 1]])
(def vec-y [1 0])
(dmatrix vec-x vec-y)
(dmatrix vec-x)

(def seq-x '((1 0) (0 1)))
(def seq-y '(1 0))
(dmatrix seq-x seq-y)
(dmatrix seq-x)

(dmatrix "path/to/stored/dmatrix")
```
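Datasets often arrive as rows with the label in the last column rather than as separate *x* and *y* collections. The following is a minimal sketch of reshaping such rows into the `{:x ... :y ...}` map described above. `split-features-label` is a hypothetical helper written for this illustration; it is not part of clj-boost.

```clojure
;; Hypothetical helper, not part of clj-boost.
(defn split-features-label
  "Splits rows whose last element is the label into an {:x ... :y ...} map."
  [rows]
  {:x (mapv (comp vec butlast) rows)   ; every column except the last
   :y (mapv last rows)})               ; the last column

(def rows [[1 0 1]
           [0 1 0]])

(split-features-label rows)
;; => {:x [[1 0] [0 1]], :y [1 0]}
```

The resulting map has the same shape as `map-with-y` in the examples above, so it can be handed to `dmatrix` directly.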
(fit dmatrix {:keys [params rounds watches early-stopping booster] :as config})
Train an **XGBoost** model on the given data.

The training set must be a **DMatrix** instance (see [[dmatrix]]) while `config` is a regular map.

It returns a trained model (a **Booster** instance) that can be used for prediction or as a base margin for further training.

`config` fields:

- `:params` -> training parameters (see https://xgboost.readthedocs.io/en/latest/parameter.html)
- `:rounds` -> number of boosting iterations
- `:watches` -> a map of data to be evaluated during training. Usually either `{:train train-set}` to evaluate only on the training set or `{:train train-set :valid validation-set}`
- `:early-stopping` -> training stops after this number of rounds of consecutive increases in any evaluation metric
- `:booster` -> optionally set an existing model to use as base margin

Example:

```clojure
(fit (dmatrix x y)
     {:params {:eta 1.0}
      :rounds 2
      :watches {:train (dmatrix x y)}
      :early-stopping 10})
```
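The example above watches only the training set and does not use `:booster`. The sketch below shows the two remaining options from the field list: a separate validation set under `:valid`, and a second `fit` call that passes the first model as `:booster`. The data is made up, and the `"binary:logistic"` objective is an assumption chosen to suit the toy labels.

```clojure
(require '[clj-boost.core :refer [dmatrix fit]])

;; Toy data, made up for illustration only.
(def train-set (dmatrix [[1 0] [0 1] [1 1] [0 0]] [1 0 1 0]))
(def valid-set (dmatrix [[1 0] [0 1]] [1 0]))

(def config
  {:params         {:eta 0.3 :objective "binary:logistic"}
   :rounds         10
   ;; Evaluate on both sets during training.
   :watches        {:train train-set :valid valid-set}
   :early-stopping 3})

(def model (fit train-set config))

;; Further training that starts from the existing model.
(def refined (fit train-set (assoc config :booster model)))
```

Reusing the same `config` map with `assoc` keeps the second run identical to the first apart from the starting model.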
(load-model path)
Loads a saved **XGBoost** model.

Given a path to a saved **XGBoost** model, loads it and makes it usable. Returns a **Booster** instance.

Example:

```clojure
(load-model "path/to/model")
```
(nrow dmatrix)
Returns the number of rows in a **DMatrix**.

Example:

```clojure
(nrow (dmatrix train))
;; 50
```
Save datasets and models in a format suited for **XGBoost**.

Save either a **DMatrix** or a **Booster** instance to retrieve it for later use.

Example:

```clojure
(persist (dmatrix dataset) "path/to/dataset")
(persist (fit (dmatrix dataset) config) "path/to/model")
```
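Saving is only half of the story, so here is a hedged sketch of a full round trip: a model is persisted and read back with `load-model`, and a dataset is persisted and read back by passing its path to `dmatrix`. The toy data, the `"binary:logistic"` objective, and the temporary file names are assumptions made for this illustration.

```clojure
(require '[clj-boost.core :refer [dmatrix fit persist load-model predict nrow]])

;; Toy data, made up for illustration only.
(def train-set (dmatrix [[1 0] [0 1] [1 1] [0 0]] [1 0 1 0]))
(def test-set  (dmatrix [[1 0] [0 1]]))

(def model
  (fit train-set
       {:params         {:eta 0.3 :objective "binary:logistic"}
        :rounds         2
        :watches        {:train train-set}
        :early-stopping 10}))

;; Temporary files stand in for real storage locations.
(def model-path (.getPath (java.io.File/createTempFile "clj-boost-model" ".bin")))
(def data-path  (.getPath (java.io.File/createTempFile "clj-boost-data" ".bin")))

;; Model round trip: persist, then load-model.
(persist model model-path)
(def restored (load-model model-path))
(predict restored test-set)

;; Dataset round trip: persist, then dmatrix with the path as a string.
(persist train-set data-path)
(def reloaded (dmatrix data-path))
(nrow reloaded)
```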
(pipe train-dmatrix test-dmatrix config & [path])
Train-test-persist pipeline.

This tries to reproduce a typical workflow: train a model on training data, test the model on the test dataset and optionally save the trained model.

Example:

```clojure
(pipe (dmatrix train) (dmatrix test) config "path/to/save/model")
```
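To make the workflow concrete, the three steps can be written out by hand with the other functions in this namespace. This is only a sketch of the idea, not the library's implementation of `pipe`: `train-test-persist` is a hypothetical name, and returning the predictions is a choice made for this illustration rather than documented behaviour of `pipe`.

```clojure
(require '[clj-boost.core :refer [fit predict persist]])

;; Hypothetical stand-in that mirrors the described workflow.
(defn train-test-persist
  [train-dmatrix test-dmatrix config & [path]]
  (let [model       (fit train-dmatrix config)          ; 1. train
        predictions (predict model test-dmatrix)]       ; 2. test
    (when path
      (persist model path))                             ; 3. optionally save
    predictions))
```

The optional `path` argument follows the same `& [path]` shape as the `pipe` signature, so omitting it skips the save step.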
(predict model dmatrix)
Use a trained model to make predictions.

Given a **Booster** model and a **DMatrix** dataset, uses the former to make predictions on the latter.

Example:

```clojure
(predict (fit (dmatrix train) config)
         test-dmatrix)
```
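Predictions usually need a post-processing step before they can be compared with known labels. The sketch below assumes a binary objective under which `predict` yields one probability per row as a flat collection; that shape is an assumption, not something the docstring states. `->labels` and `accuracy` are hypothetical helpers, and `probs` is a hand-written stand-in for real `predict` output.

```clojure
;; Hypothetical helpers, not part of clj-boost.
(defn ->labels
  "Turns probabilities into 0/1 labels using a cutoff (default 0.5)."
  ([probs] (->labels probs 0.5))
  ([probs cutoff] (mapv #(if (>= % cutoff) 1 0) probs)))

(defn accuracy
  "Fraction of predicted labels that match the actual labels."
  [predicted actual]
  (/ (count (filter true? (map = predicted actual)))
     (double (count actual))))

(def probs [0.91 0.12 0.55 0.40])   ; stand-in for the output of predict
(def labels (->labels probs))       ; => [1 0 1 0]

(accuracy labels [1 0 0 0])         ; => 0.75
```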