Liking cljdoc? Tell your friends :D
Clojure only.

scicloj.metamorph.ml.r-model-matrix

R-style formula-based feature engineering and linear regression.

This namespace provides tools to leverage R's powerful formula syntax for feature engineering and linear modeling within Clojure. R formulas enable expressive specification of interactions, transformations, and categorical expansions without manual column manipulation.

Key Functions:

  • r-model-matrix: Convert dataset + R formula to design matrix
  • lm: Simplified linear regression using R formulas

Implementation Backends: The namespace supports multiple R execution backends:

  • :ocpu Remote R via OpenCPU (cloud.opencpu.org) - no local R needed
  • :renjine Java-based R implementation (https://renjin.org/)
  • :clojisr Local R via clojisr (requires R installation)

Model Matrix Capabilities: R formulas handle:

  • Basic features: y ~ x1 + x2
  • Interactions: y ~ x1 * x2 (expands to x1 + x2 + x1:x2)
  • Polynomial terms: y ~ x + I(x^2)
  • Categorical encoding: Automatic dummy variable creation
  • Intercept control: y ~ x - 1 (remove intercept)
  • Exclusions: y ~ . - x3 (all columns except x3)

Linear Regression (lm): Combines formula-based feature engineering with OLS regression training. Returns a ready-to-use trained model for predictions.

Example Usage: (r-model-matrix iris-data "~ Sepal.Length + Sepal.Width" :renjine) (lm iris-data "Sepal.Width ~ Sepal.Length * Petal.Length" :Sepal.Width :ocpu)

Notes:

  • OpenCPU backend is convenient but requires internet connectivity
  • Renjin is standalone but may have some R incompatibilities
  • clojisr requires a local R installation but offers full R compatibility
  • Returned model matrices exclude row names and intercept columns by default

See also: scicloj.metamorph.ml.design-matrix for Clojure-native feature engineering

R-style formula-based feature engineering and linear regression.

This namespace provides tools to leverage R's powerful formula syntax for
feature engineering and linear modeling within Clojure. R formulas enable
expressive specification of interactions, transformations, and categorical
expansions without manual column manipulation.

Key Functions:
- `r-model-matrix`: Convert dataset + R formula to design matrix
- `lm`: Simplified linear regression using R formulas

Implementation Backends:
The namespace supports multiple R execution backends:
- `:ocpu`    Remote R via OpenCPU (cloud.opencpu.org) - no local R needed
- `:renjine` Java-based R implementation (https://renjin.org/)
- `:clojisr` Local R via clojisr (requires R installation)

Model Matrix Capabilities:
R formulas handle:
- Basic features: `y ~ x1 + x2`
- Interactions: `y ~ x1 * x2` (expands to x1 + x2 + x1:x2)
- Polynomial terms: `y ~ x + I(x^2)`
- Categorical encoding: Automatic dummy variable creation
- Intercept control: `y ~ x - 1` (remove intercept)
- Exclusions: `y ~ . - x3` (all columns except x3)

Linear Regression (lm):
Combines formula-based feature engineering with OLS regression training.
Returns a ready-to-use trained model for predictions.

Example Usage:
(r-model-matrix iris-data "~ Sepal.Length + Sepal.Width" :renjine)
(lm iris-data "Sepal.Width ~ Sepal.Length * Petal.Length" 
    :Sepal.Width :ocpu)

Notes:
- OpenCPU backend is convenient but requires internet connectivity
- Renjin is standalone but may have some R incompatibilities
- clojisr requires a local R installation but offers full R compatibility
- Returned model matrices exclude row names and intercept columns by default

See also: `scicloj.metamorph.ml.design-matrix` for Clojure-native feature engineering
raw docstring

lmclj

(lm ds formula target-var formula-impl)

Train a linear model using an R-style formula.

This function combines R formula-based feature engineering with ordinary least squares (OLS) regression. It creates a design matrix from the input dataset using the specified R formula, then trains a linear model on the resulting features.

Parameters:

  • ds A tech.ml.dataset dataset containing the input data with all variables referenced in the formula and target variable.

  • formula A string containing the R formula (e.g., "y ~ x1 + x2 * x3"). The formula is interpreted by the R backend.

  • target-var A keyword or string naming the target variable for regression. This variable must be present in the input dataset.

  • formula-impl An implementation keyword for formula evaluation:

    • :ocpu Uses OpenCPU (cloud.opencpu.org), no local R needed
    • :renjine Uses Renjin, a Java implementation of R
    • :clojisr Uses clojisr with local R installation

Returns: A trained linear model (OLS from fastmath) ready for predictions. The model excludes the intercept column and row names from the design matrix by default.

Example: (lm iris-data "Sepal.Width ~ Sepal.Length + Petal.Length" :Sepal.Width :renjine)

Train a linear model using an R-style formula.

This function combines R formula-based feature engineering with ordinary least
squares (OLS) regression. It creates a design matrix from the input dataset using
the specified R formula, then trains a linear model on the resulting features.

Parameters:
- `ds`             A tech.ml.dataset dataset containing the input data with all
                   variables referenced in the formula and target variable.
- `formula`        A string containing the R formula (e.g., "y ~ x1 + x2 * x3").
                   The formula is interpreted by the R backend.
- `target-var`     A keyword or string naming the target variable for regression.
                   This variable must be present in the input dataset.
- `formula-impl`   An implementation keyword for formula evaluation:

  - `:ocpu`    Uses OpenCPU (cloud.opencpu.org), no local R needed
  - `:renjine` Uses Renjin, a Java implementation of R
  - `:clojisr` Uses clojisr with local R installation

Returns:
A trained linear model (OLS from fastmath) ready for predictions. The model
excludes the intercept column and row names from the design matrix by default.

Example:
(lm iris-data "Sepal.Width ~ Sepal.Length + Petal.Length" :Sepal.Width :renjine)
sourceraw docstring

r-model-matrixclj

(r-model-matrix ds r-formula impl)

Compute a model matrix from a dataset and an R-style formula.

Parameters:

  • ds A tech.ml.dataset dataset representing the input data.
  • r-formula A string containing the R formula to use for model matrix construction. The formua is interpreted by R itself, so should be full compatible
  • impl An implementation keyword, either

Returns a dataset containing the constructed design matrix. Dispatches to the appropriate backend implementation.

Returns a map with

  • :model-matrix-dataset having the TMD containing the design matrix specified by r-formula
  • :attributes the (R) attributes of the model.matrix object
Compute a model matrix from a dataset and an R-style formula.

Parameters:
 
- `ds`         A tech.ml.dataset dataset representing the input data.
- `r-formula`  A string containing the R formula to use for model matrix construction. The formua is interpreted by R itself, so should be full compatible
- `impl`       An implementation keyword, either 
  - `:ocpu`    Uses an online service https://www.opencpu.org/api.html (server: cloud.opencpu.org)
  - `:renjine` Uses https://renjin.org/   
  - `:clojisr` Uses https://github.com/scicloj/clojisr, which requires a local R installation 


Returns a dataset containing the constructed design matrix.
Dispatches to the appropriate backend implementation.

 
Returns a map with 
- `:model-matrix-dataset` having the TMD containing the design matrix specified by `r-formula`
- `:attributes` the (R) attributes of the model.matrix object
 
 
sourceraw docstring

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts
Ctrl+kJump to recent docs
Move to previous article
Move to next article
Ctrl+/Jump to the search field
× close