
scicloj.metamorph.ml.preprocessing

Feature scaling and normalization transformers for metamorph pipelines.

This namespace provides metamorph-compatible transformers for standardizing and normalizing numeric features. These preprocessing steps are essential for many machine learning algorithms to perform well.

Available Transformers:

  • std-scale: Standardization (z-score normalization)
  • min-max-scale: Min-max scaling to a specified range

Standard Scaling (std-scale): Centers each numeric column (subtracts the mean) and/or scales it by its standard deviation. With both options enabled, the result has zero mean and unit variance. Useful for:

  • Algorithms sensitive to feature magnitude (SVMs, neural networks, KNN)
  • Distance-based models

Options:

  • :mean? (default true): Center by subtracting the column mean
  • :stddev? (default true): Scale by the column standard deviation
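
As a reference for the two options above, the textbook z-score transform can be written as follows. This is the standard definition, not a statement about the library's internals; in particular, the source does not say whether the standard deviation is the sample or the population estimate. Here $\mu_j$ and $\sigma_j$ denote the mean and standard deviation of column $j$ as learned during fitting.

```latex
% Both options enabled (:mean? true, :stddev? true)
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}

% Centering only (:mean? true, :stddev? false)
z_{ij} = x_{ij} - \mu_j

% Scaling only (:mean? false, :stddev? true)
z_{ij} = \frac{x_{ij}}{\sigma_j}
```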

Min-Max Scaling (min-max-scale): Rescales each numeric column to a specified range (default [-0.5, 0.5]). Options:

  • :min (default -0.5): Target minimum value
  • :max (default 0.5): Target maximum value
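
The usual min-max formula that matches this description is sketched below, with $a$ standing for :min, $b$ for :max, and $x_j^{\min}$, $x_j^{\max}$ for the smallest and largest values of column $j$ seen during fitting. These symbols are introduced here for illustration; how the library treats a constant column (where the denominator is zero) is not stated in the source.

```latex
x'_{ij} = a + (b - a)\,\frac{x_{ij} - x_j^{\min}}{x_j^{\max} - x_j^{\min}}

% With the defaults a = -0.5 and b = 0.5:
x'_{ij} = \frac{x_{ij} - x_j^{\min}}{x_j^{\max} - x_j^{\min}} - 0.5
```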

Metamorph Integration: Both transformers follow the metamorph pipeline pattern:

  • :fit mode: Learn scaling parameters from training data
  • :transform mode: Apply learned parameters to new data
  • Stores transformation parameters in context under their assigned :metamorph/id
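
The fit/transform pattern described above can be sketched in plain Clojure. This is a minimal illustration, not the library's implementation: `min-max-step` and `scale` are hypothetical names, and the data is a plain vector of numbers rather than a dataset. It assumes only that a step is a function from context map to context map and that the context carries `:metamorph/data`, `:metamorph/mode` and `:metamorph/id`.

```clojure
;; Hypothetical helper: rescale xs into [lo, hi] using learned bounds.
;; A constant column (data-min = data-max) is not handled in this sketch.
(defn- scale [xs {:keys [data-min data-max]} lo hi]
  (let [span (- data-max data-min)]
    (mapv #(+ lo (* (- hi lo) (/ (- % data-min) span))) xs)))

;; Hypothetical metamorph-style step working on a vector of numbers.
(defn min-max-step [lo hi]
  (fn [{:metamorph/keys [data mode id] :as ctx}]
    (case mode
      ;; :fit learns the bounds, stores them under the step's id,
      ;; and scales the training data.
      :fit
      (let [model {:data-min (apply min data)
                   :data-max (apply max data)}]
        (-> ctx
            (assoc id model)
            (assoc :metamorph/data (scale data model lo hi))))
      ;; :transform reuses the stored bounds on new data.
      :transform
      (assoc ctx :metamorph/data (scale data (get ctx id) lo hi)))))

(def step (min-max-step -0.5 0.5))

(def fitted
  (step {:metamorph/data [10.0 20.0 30.0]
         :metamorph/mode :fit
         :metamorph/id   :scaler-1}))

(def transformed
  (step (assoc fitted
               :metamorph/data [15.0 40.0]
               :metamorph/mode :transform)))

(:metamorph/data fitted)      ;; => [-0.5 0.0 0.5]
(:metamorph/data transformed) ;; => [-0.25 1.0]
```

Note that the value 40.0 lies outside the fitted range and therefore maps outside [-0.5, 0.5]: the bounds come from the training data only.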

Example Usage (in metamorph pipeline):

  (preprocessing/std-scale [:age :income] {:mean? true :stddev? true})


min-max-scale (clj)

(min-max-scale columns-selector
               {:keys [min max] :or {min -0.5 max 0.5} :as options})

Metamorph transformer that scales the column data into a given range.

`columns-selector` tablecloth columns-selector used to choose the columns to work on
`meta-field` tablecloth meta-field working with `columns-selector`

`options` Options for the scaler. Can take:
    `min` Minimum value to scale to (default -0.5)
    `max` Maximum value to scale to (default 0.5)

metamorph                            | .
-------------------------------------|----------------------------------------------------------------------------
Behaviour in mode :fit               | Scales the dataset at key `:metamorph/data` and stores the trained model in ctx under the key given by `:metamorph/id`
Behaviour in mode :transform         | Reads the trained min-max-scale model from ctx and applies it to the data in `:metamorph/data`
Reads keys from ctx                  | In mode `:transform`: reads the trained model from the key given by `:metamorph/id`
Writes keys to ctx                   | In mode `:fit`: stores the trained model under the key given by `:metamorph/id`

std-scale (clj)

(std-scale columns-selector options)
(std-scale columns-selector
           meta-field
           {:keys [mean? stddev?] :or {mean? true stddev? true} :as options})

Metamorph transformer that centers and scales the dataset per column.

`columns-selector` tablecloth columns-selector used to choose the columns to work on
`meta-field` tablecloth meta-field working with `columns-selector`

`options` Options for the scaler. Can take:
    `mean?` If true (default), the data is shifted by the column mean, so that it is centered on 0
    `stddev?` If true (default), the data is scaled by the standard deviation of the column

metamorph                            | .
-------------------------------------|----------------------------------------------------------------------------
Behaviour in mode :fit               | Centers and scales the dataset at key `:metamorph/data` and stores the trained model in ctx under the key given by `:metamorph/id`
Behaviour in mode :transform         | Reads the trained std-scale model from ctx and applies it to the data in `:metamorph/data`
Reads keys from ctx                  | In mode `:transform`: reads the trained model from the key given by `:metamorph/id`
Writes keys to ctx                   | In mode `:fit`: stores the trained model under the key given by `:metamorph/id`
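
To make the effect of `mean?` and `stddev?` concrete, here is a small plain-Clojure sketch on a single column of numbers. It is an illustration under stated assumptions, not the library's code: `fit-std` and `apply-std` are hypothetical names, and the sample standard deviation (dividing by n - 1) is an assumption, since the docstring does not say which estimate is used.

```clojure
;; Hypothetical: learn mean and sample standard deviation of one column.
(defn fit-std [xs]
  (let [n        (count xs)
        mean     (/ (reduce + xs) n)
        variance (/ (reduce + (map #(let [d (- % mean)] (* d d)) xs))
                    (dec n))] ; assumption: sample variance (n - 1)
    {:mean mean :stddev (Math/sqrt variance)}))

;; Hypothetical: apply learned parameters, honouring mean? and stddev?.
(defn apply-std
  [xs {:keys [mean stddev]}
   {:keys [mean? stddev?] :or {mean? true stddev? true}}]
  (mapv (fn [x]
          (cond-> x
            mean?   (- mean)      ; center: x - mean
            stddev? (/ stddev)))  ; scale: divide by stddev
        xs))

(def train [2.0 4.0 6.0])
(def model (fit-std train)) ;; {:mean 4.0, :stddev 2.0}

(apply-std train model {})                ;; => [-1.0 0.0 1.0]
(apply-std train model {:stddev? false})  ;; => [-2.0 0.0 2.0]
(apply-std train model {:mean? false})    ;; => [1.0 2.0 3.0]
(apply-std [8.0] model {})                ;; => [2.0]
```

The last call mirrors `:transform` mode: new data is scaled with the parameters learned from the training column, not with its own statistics.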
