Model evaluation metrics for classification and regression tasks.
This namespace provides functions to compute standard machine learning metrics
on model predictions vs. ground truth labels, with support for both binary and
multiclass classification as well as regression tasks.
Key Functions:
- `classification-metric`: Evaluate classification model predictions
- `regression-metric`: Evaluate regression model predictions
Classification Metrics (from fastmath.stats):
Supports binary and multiclass metrics including accuracy, precision, recall,
F1-score, and more. Multiclass metrics can be averaged using:
- `:macro` - Unweighted mean of per-class metrics
- `:micro` - Aggregated true/false positives globally
Also supports `:roc-auc` for multiclass AUC scoring.
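The two averaging modes can be illustrated with a small hand computation in plain Clojure (the counts below are hypothetical; the real computation happens inside `fastmath.stats`):

```clojure
;; Hypothetical per-class true-positive / false-positive counts
;; for a 3-class problem.
(def per-class
  [{:tp 8 :fp 2}   ; class :a
   {:tp 1 :fp 1}   ; class :b
   {:tp 1 :fp 3}]) ; class :c

(defn precision [{:keys [tp fp]}] (/ tp (+ tp fp)))

;; :macro - unweighted mean of the per-class precisions
(def macro-precision
  (/ (reduce + (map precision per-class)) (count per-class)))
;; => (4/5 + 1/2 + 1/4) / 3 = 31/60

;; :micro - pool the counts globally, then compute the metric once
(def micro-precision
  (precision (apply merge-with + per-class)))
;; => 10/16 = 5/8
```

Note how :micro weighs the large class :a more heavily, while :macro treats all three classes equally.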
Regression Metrics (from fastmath.stats):
Distance and similarity metrics such as MAE, MSE, RMSE, R², etc.
Data Format:
- Input datasets must be tech.ml.dataset (TMD) format
- Must have appropriate column metadata (:prediction, :target, etc.)
- Support categorical mappings via :categorical-map metadata
- Missing values and NaNs are detected and rejected appropriately
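As a sketch of how such metadata can be set up manually (assuming the standard `tech.v3.dataset` and `tech.v3.dataset.modelling` namespaces; column and dataset names are illustrative):

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.modelling :as ds-mod])

;; y-true: ground-truth labels, with the column marked as inference target
(def y-true
  (-> (ds/->dataset {:species ["a" "b" "a" "a"]})
      (ds-mod/set-inference-target :species)))

;; y-pred would normally come from `ml/predict`, which attaches the
;; :column-type :prediction and :categorical-map metadata automatically.
```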
Validation:
The functions perform extensive validation including:
- Column metadata correctness
- Missing values and NaN detection
- Type and datatype uniformity
- Row count alignment between datasets
- Single-label assumption (multi-label not yet supported)
Example:
(classification-metric y-true y-pred :f1 :macro {})
(regression-metric y-true y-pred :mse)
See also: `fastmath.stats` documentation for available metric names

(classification-metric y-true y-pred metric averaging)
(classification-metric y-true y-pred metric averaging options)
Calculates various classification metrics, supporting binary and multiclass data.
Returns a single float number.
* `y-true` A TMD dataset, having the truth
* `y-pred` A TMD dataset, having the prediction
* `metric` A keyword, supports any metric from: https://generateme.github.io/fastmath/clay/stats.html#binary-classification-metrics
and :roc-auc
* `averaging` How the (mostly binary) metrics get averaged; supports :macro and :micro
* `options` Options for the :metric-fn
Multi-label data is so far not supported.
Both datasets need to have columns containing the appropriate column metadata
as foreseen by TMD, see here:https://techascent.github.io/tech.ml.dataset/tech.v3.dataset.column-filters.html
, eg:
* :column-type being :prediction, :probability-distribution
* :inference-target true
* :categorical-map column metadata is explicitly supported and handled when present,
so it is taken into consideration when comparing columns
The `ml/predict` fn produces these types of datasets.
The function validates various aspects and rejects data which has:
* wrong column metadata
* missing values or NaNs
* non-discrete values in :prediction column
* non-uniform datatypes
* multi-label data (having > 1 :inference-target column)
* mismatch in shape between `y-true` and `y-pred`
* others
This might depend on the concrete metric-fn used.
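A hedged usage sketch (assuming `y-true` and `y-pred` are TMD datasets carrying the metadata described above, e.g. as produced by a train/predict round trip):

```clojure
;; Multiclass accuracy, macro-averaged, with an empty options map
(classification-metric y-true y-pred :accuracy :macro {})

;; Micro-averaged F1 on the same data
(classification-metric y-true y-pred :f1 :micro {})
;; each call returns a single float
```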
(insist x)
(insist x message)

Evaluates expression x and throws an AssertionError with optional message if x does not evaluate to logical true.
Assertion checks are omitted from compiled code if `*assert*` is false.
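For example, `insist` can guard a precondition such as the row-count alignment mentioned above (the datasets and the `ds/row-count` call are illustrative):

```clojure
(insist (= (ds/row-count y-true) (ds/row-count y-pred))
        "y-true and y-pred must have the same number of rows")
;; throws an AssertionError with the given message if the check fails
```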
(regression-metric y-true y-pred metric-fn)

Calculates various regression metrics and returns a single float number.
* `y-true` A TMD dataset, having the truth
* `y-pred` A TMD dataset, having the prediction
* `metric` A keyword, supports any metric from: https://generateme.github.io/fastmath/clay/stats.html#distance-and-similarity-metrics
Both datasets need to have columns containing the appropriate column metadata
as foreseen by TMD, see here: https://techascent.github.io/tech.ml.dataset/tech.v3.dataset.column-filters.html
, eg:
* :column-type being :prediction
* :inference-target true
The `ml/predict` fn produces these types of datasets.
The function validates various aspects and rejects data which has:
* wrong column metadata
* missing values or NaNs
* non-continuous values in :prediction column
* non-uniform datatypes
* multi-label data (having > 1 :inference-target column)
* mismatch in shape between `y-true` and `y-pred`
* others
This might depend on the concrete metric-fn used.
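A minimal usage sketch, assuming `y-true` and `y-pred` are TMD datasets with a single continuous :inference-target column; the metric keywords are taken from the fastmath list linked above:

```clojure
(regression-metric y-true y-pred :mae)
(regression-metric y-true y-pred :rmse)
;; each call returns a single float
```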