fastmath.ml.regression

OLS, WLS and GLM regression models with analysis.

->family

(->family family-map)
(->family variance residual-deviance)
(->family default-link
          variance
          initialize
          residual-deviance
          aic
          quantile-residuals-fun
          dispersion)

Create Family record.

Arguments:

  • default-link - canonical link function, default: :identity
  • variance - variance function in terms of mean
  • initialize - initialization of glm, default: the same as in :gaussian
  • residual-deviance - calculates residual deviance
  • aic - calculates AIC, default (constantly ##NaN)
  • quantile-residuals-fun - calculates quantile residuals, default as in :gaussian
  • dispersion - a numeric value, :estimate (default), :pearson or :mean-deviance

Initialization will be called with ys and weights and should return:

  • ys, possibly changed if any adjustment is necessary
  • init-mu, starting point
  • weights, possibly changed or original
  • (optional) any other data used to calculate AIC

The AIC function should accept: ys, fitted, weights, deviance, observation, rank (fitted parameters) and any additional data created by initialization.

Minimum version should define variance and residual-deviance.
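
As an illustration of that minimal case, the two required pieces for a Poisson-like family can be sketched in plain Clojure. The helper names and the per-observation `[y mu]` shape of the deviance function are assumptions made for this sketch; the docstring does not say which shapes `->family` expects, so check the source before passing such functions in.

```clojure
;; Hypothetical helpers, not part of fastmath.
(defn poisson-variance
  "Variance function of the Poisson family: V(mu) = mu."
  [mu]
  mu)

(defn poisson-unit-deviance
  "Unit deviance 2 * (y * ln(y / mu) - (y - mu)).
  The y * ln(y / mu) term is taken as 0 when y = 0."
  [y mu]
  (* 2.0 (- (if (zero? y) 0.0 (* y (Math/log (/ y mu))))
            (- y mu))))

(defn total-deviance
  "Weighted sum of unit deviances over all observations."
  [ys mus weights]
  (reduce + (map (fn [y mu w] (* w (poisson-unit-deviance y mu)))
                 ys mus weights)))

(comment
  ;; Assumed usage, following the two-argument arity listed above.
  (->family poisson-variance poisson-unit-deviance))
```

The deviance is zero when the fitted mean equals the observation and grows as they move apart, which is the property the fitting procedure relies on.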

->link

(->link link-map)
(->link g mean mean-derivative)

Creates link record.

Args:

  • g - link function
  • mean - mean, inverse link function
  • mean-derivative - derivative of mean
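
To make the three roles concrete, here is a sketch of the logit link written as plain functions. The function names are hypothetical; only the roles (`g`, `mean`, `mean-derivative`) come from the argument list above, and the commented call assumes the three-argument arity takes them in that order.

```clojure
;; Hypothetical names for the three pieces of a logit link.
(defn logit
  "Link function g: maps a mean in (0, 1) to the linear predictor."
  [mu]
  (Math/log (/ mu (- 1.0 mu))))

(defn logistic
  "Mean function (inverse link): maps the linear predictor back to (0, 1)."
  [eta]
  (/ 1.0 (+ 1.0 (Math/exp (- eta)))))

(defn logistic-derivative
  "Derivative of the mean function with respect to eta: mu * (1 - mu)."
  [eta]
  (let [mu (logistic eta)]
    (* mu (- 1.0 mu))))

(comment
  ;; Assumed usage, following the three-argument arity listed above.
  (->link logit logistic logistic-derivative))
```

A quick sanity check for any custom link is that `mean` undoes `g` and that `mean-derivative` agrees with a numerical derivative of `mean`.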

->string (multimethod)

analysis

(analysis model)

Influence analysis, leverage, standardized and studentized residuals, correlation.

dose

(dose glm-model)
(dose glm-model p)
(dose glm-model p coeff-id)
(dose {:keys [link-fun xtxinv coefficients]} p intercept-id coeff-id)

Predict Lethal/Effective dose for given p (default: p=0.5, median).

  • intercept-id - id of intercept, default: 0
  • coeff-id is the coefficient used for calculating dose, default: 1
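
The idea behind a lethal/effective dose can be sketched as follows, assuming a logit link and a linear predictor `eta = b0 + b1 * x`, so that the dose for probability `p` solves `g(p) = b0 + b1 * x`. This is the textbook definition, not necessarily how `dose` computes it; `dose-sketch` is a hypothetical helper that takes plain numeric coefficients rather than a fitted model.

```clojure
(defn logit
  "Logit link, assumed here for illustration."
  [p]
  (Math/log (/ p (- 1.0 p))))

(defn dose-sketch
  "Dose x at which the fitted probability equals p, assuming eta = b0 + b1 * x.
  coefficients is a vector of plain numbers (an assumption of this sketch)."
  ([coefficients] (dose-sketch coefficients 0.5))
  ([coefficients p] (dose-sketch coefficients p 0 1))
  ([coefficients p intercept-id coeff-id]
   (let [b0 (nth coefficients intercept-id)
         b1 (nth coefficients coeff-id)]
     (/ (- (logit p) b0) b1))))

(dose-sketch [-2.0 0.5])      ; median dose, where logit(0.5) = 0
(dose-sketch [-2.0 0.5] 0.9)  ; dose at which 90% respond
```

The default ids mirror the documented defaults: coefficient 0 is the intercept and coefficient 1 is the dose term.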

families

family-with-link

(family-with-link family)
(family-with-link family params)
(family-with-link family link params)

Returns a family with a link as a single map.

glm

(glm ys xss)
(glm ys
     xss
     {:keys [max-iters tol epsilon family link weights alpha offset
             dispersion-estimator intercept? init-mu simple? transformer names]
      :or {max-iters 25
           tol 1.0E-8
           epsilon 1.0E-8
           family :gaussian
           alpha 0.05
           intercept? true
           simple? false}
      :as params})

Fit a generalized linear model using IRLS method.

Arguments:

  • ys - response vector
  • xss - terms of systematic component
  • optional parameters

Parameters:

  • :tol - tolerance for matrix decomposition (SVD and Cholesky), default: 1.0e-8
  • :epsilon - tolerance for IRLS (stopping condition), default: 1.0e-8
  • :max-iters - maximum number of iterations, default: 25
  • :weights - optional weights
  • :offset - optional offset
  • :alpha - significance level, default: 0.05
  • :intercept? - should intercept term be included, default: true
  • :init-mu - initial response vector for IRLS
  • :simple? - returns simplified result
  • :dispersion-estimator - :pearson, :mean-deviance or any number, replaces default one.
  • :family - family, default: :gaussian
  • :link - link
  • :nbinomial-theta - theta for :nbinomial family, default: 1.0.
  • :transformer - an optional function which will be used to transform systematic component xs before fitting and prediction
  • :names - an optional vector of names to use when printing the model
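
To show how `:epsilon` and `:max-iters` interact, here is a sketch of IRLS for the simplest possible case: an intercept-only Poisson model with a log link. It is not the library's implementation. The function name is hypothetical, and the stopping rule (change in the estimate below `epsilon`) is an assumption; the actual criterion used by `glm` may differ.

```clojure
(defn irls-poisson-intercept
  "Fits log(mu) = b0 to counts ys by iteratively reweighted least squares.
  Returns a map with :intercept, :iters and :converged?."
  ([ys] (irls-poisson-intercept ys {}))
  ([ys {:keys [max-iters epsilon] :or {max-iters 25 epsilon 1.0e-8}}]
   (loop [eta 0.0
          iter 1]
     (let [mu    (Math/exp eta)
           ;; working responses: z = eta + (y - mu) * d(eta)/d(mu), where d(eta)/d(mu) = 1/mu
           zs    (map (fn [y] (+ eta (/ (- y mu) mu))) ys)
           ;; working weights: (d(mu)/d(eta))^2 / V(mu) = mu^2 / mu = mu
           ws    (repeat (count ys) mu)
           ;; weighted least squares with only an intercept is a weighted mean
           eta'  (double (/ (reduce + (map * ws zs)) (reduce + ws)))
           done? (< (Math/abs (- eta' eta)) epsilon)]
       (if (or done? (>= iter max-iters))
         {:intercept eta' :iters iter :converged? done?}
         (recur eta' (inc iter)))))))

(irls-poisson-intercept [2 3 4])
```

For this model the iteration settles at the log of the sample mean. With a very small `:max-iters` the loop stops early and reports `:converged? false`, which parallels the `:iters` and `:converged?` keys of the fitted model.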

Family is one of: :gaussian (default), :binomial, :quasi-binomial, :poisson, :quasi-poisson, :gamma, :inverse-gaussian, :nbinomial, a custom Family record (see ->family) or a function returning a Family (accepting a map as an argument).

Link is one of: :probit, :identity, :loglog, :sqrt, :inverse, :logit, :power, :nbinomial, :cauchit, :distribution, :cloglog, :inversesq, :log, :clog, a custom Link record (see ->link) or a function returning a Link (accepting a map as an argument).

Notes:

  • SVD decomposition is used instead of more common QR
  • intercept term is added implicitly if intercept? is set to true (by default)
  • :nbinomial family requires :nbinomial-theta parameter
  • Each family has its own default (canonical) link.

The returned record implements IFn and contains:

  • :model - set to :glm
  • :intercept? - whether intercept term is included or not
  • :xtxinv - (X^T X)^-1
  • :intercept - intercept term value
  • :beta - vector of model coefficients (without intercept)
  • :coefficients - coefficient analysis, a list of maps containing :estimate, :stderr, :t-value, :p-value and :confidence-interval
  • :weights - weights, :weights (working) and :initial
  • :residuals - a map containing :raw, :working, :pearsons and :deviance residuals
  • :fitted - fitted values for xss
  • :df - degrees of freedom: :residual, :null and :intercept
  • :observations - number of observations
  • :deviance - deviances: :residual and :null
  • :dispersion - default or calculated, used in a model
  • :dispersions - :pearson and :mean-deviance
  • :family - family used
  • :link - link used
  • :link-fun - link function, g
  • :mean-fun - mean function, g^-1
  • :q - (1-alpha/2) quantile of T or Normal distribution for residual degrees of freedom
  • :chi2 and :p-value - Chi-squared statistic and respective p-value
  • :ll - a map containing log-likelihood and AIC/BIC
  • :analysis - leverage, residual and influence analysis - a delay
  • :iters and :converged? - number of iterations and convergence indicator
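
The two dispersion estimators named above can be sketched with their usual textbook definitions: the Pearson chi-squared statistic or the residual deviance, each divided by the residual degrees of freedom. Whether the library computes them exactly this way (for example, how prior weights enter) is an assumption; the function names are hypothetical.

```clojure
(defn pearson-dispersion
  "Pearson chi-squared statistic divided by the residual degrees of freedom.
  variance is the family's variance function V(mu)."
  [variance ys mus df-residual]
  (/ (reduce + (map (fn [y mu] (/ (Math/pow (- y mu) 2.0) (variance mu)))
                    ys mus))
     df-residual))

(defn mean-deviance-dispersion
  "Residual deviance divided by the residual degrees of freedom."
  [residual-deviance df-residual]
  (/ residual-deviance df-residual))

;; Poisson-style variance V(mu) = mu, three observations, one fitted parameter.
(pearson-dispersion identity [2.0 3.0 7.0] [4.0 4.0 4.0] 2)
```

Values well above 1 for a Poisson or binomial fit suggest overdispersion, which is the usual reason to switch to a quasi family or supply `:dispersion-estimator`.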

Analysis, delay containing a map:

  • :residuals - :standardized and :studentized residuals (pearsons and deviance)
  • :laverage - :hat, :sigmas and leveraged :coefficients (leave-one-out)
  • :influence - :cooks-distance, :dffits, :dfbetas and :covratio
  • :influential - list of influential observations (ids) for influence measures
  • :correlation - correlation matrix of estimated parameters

glm-nbinomial

(glm-nbinomial ys xss)
(glm-nbinomial ys
               xss
               {:keys [nbinomial-theta max-iters epsilon]
                :or {max-iters 25 epsilon 1.0E-8}
                :as params})

Fits theta for a negative binomial GLM in an iterative process.

Returns fitted model with :nbinomial-theta key.

Arguments and parameters are the same as for glm.

Additional parameters:

  • :nbinomial-theta - initial theta used as a starting point for optimization.
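
For orientation, the role of theta can be sketched through the variance function of the negative binomial distribution. This assumes the common parameterization `V(mu) = mu + mu^2 / theta`; the docstring does not state which one the library uses, and `nbinomial-variance` is a hypothetical helper.

```clojure
(defn nbinomial-variance
  "Variance of a negative binomial response with mean mu and shape theta,
  assuming V(mu) = mu + mu^2 / theta."
  [theta mu]
  (+ mu (/ (* mu mu) theta)))

(nbinomial-variance 1.0 4.0)    ; small theta: variance far above the mean
(nbinomial-variance 1.0e6 4.0)  ; large theta: variance close to the mean, as for Poisson
```

Under this parameterization a smaller theta means stronger overdispersion, so the starting value mainly affects how many outer iterations the fit needs.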

lm

(lm ys xss)
(lm ys
    xss
    {:keys [tol weights alpha intercept? offset transformer names]
     :or {tol 1.0E-8 alpha 0.05 intercept? true}})

Fit a linear model using ordinary (OLS) or weighted (WLS) least squares.

Arguments:

  • ys - response vector
  • xss - terms of systematic component
  • optional parameters

Parameters:

  • :tol - tolerance for matrix decomposition (SVD and Cholesky), default: 1.0e-8
  • :weights - optional weights for WLS
  • :offset - optional offset
  • :alpha - significance level, default: 0.05
  • :intercept? - should intercept term be included, default: true
  • :transformer - an optional function which will be used to transform systematic component xs before fitting and prediction
  • :names - sequence or string, used as name for coefficient when pretty-printing model, default 'X'

Notes:

  • SVD decomposition is used instead of more common QR
  • intercept term is added implicitly if intercept? is set to true (by default)
  • Two variants of AIC/BIC are calculated, one based on log-likelihood, second on RSS/n
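
To make the fitted quantities concrete, here is a sketch of ordinary least squares for a single predictor, using the closed-form solution rather than the SVD-based approach mentioned in the notes. `simple-ols` is a hypothetical helper; its result keys are named after the keys of the returned record only for readability.

```clojure
(defn simple-ols
  "Ordinary least squares for y = intercept + beta * x with a single predictor."
  [xs ys]
  (let [n         (count xs)
        mean      (fn [vs] (/ (reduce + vs) (double n)))
        x-bar     (mean xs)
        y-bar     (mean ys)
        sxx       (reduce + (map (fn [x] (Math/pow (- x x-bar) 2.0)) xs))
        sxy       (reduce + (map (fn [x y] (* (- x x-bar) (- y y-bar))) xs ys))
        beta      (/ sxy sxx)
        intercept (- y-bar (* beta x-bar))
        fitted    (mapv (fn [x] (+ intercept (* beta x))) xs)
        ;; residual and total sums of squares
        rss       (reduce + (map (fn [y f] (Math/pow (- y f) 2.0)) ys fitted))
        tss       (reduce + (map (fn [y] (Math/pow (- y y-bar) 2.0)) ys))]
    {:intercept intercept
     :beta      [beta]
     :fitted    fitted
     :rss       rss
     :regss     (- tss rss)
     :tss       tss
     :r-squared (- 1.0 (/ rss tss))}))

(simple-ols [1 2 3 4] [3.1 4.9 7.2 8.8])
```

The decomposition `tss = rss + regss` holds for models with an intercept, which is why R squared can be written as `1 - rss/tss`.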

The returned record implements IFn and contains:

  • :model - :ols or :wls
  • :intercept? - whether intercept term is included or not
  • :xtxinv - (X^T X)^-1
  • :intercept - intercept term value
  • :beta - vector of model coefficients (without intercept)
  • :coefficients - coefficient analysis, a list of maps containing :estimate, :stderr, :t-value, :p-value and :confidence-interval
  • :weights - initial weights
  • :residuals - a map containing :raw and :weighted residuals
  • :fitted - fitted values for xss
  • :df - degrees of freedom: :residual, :model and :intercept
  • :observations - number of observations
  • :r-squared and :adjusted-r-squared
  • :sigma and :sigma2 - standard deviation and variance
  • :msreg - regression mean square
  • :rss, :regss, :tss - residual, regression and total sum of squares
  • :qt - (1-alpha/2) quantile of T distribution for residual degrees of freedom
  • :f-statistic and :p-value - F statistic and respective p-value
  • :ll - a map containing log-likelihood and AIC/BIC in two variants: based on log-likelihood and RSS
  • :analysis - leverage, residual and influence analysis - a delay

Analysis, delay containing a map:

  • :residuals - :standardized and :studentized weighted residuals
  • :laverage - :hat, :sigmas and leveraged :coefficients (leave-one-out)
  • :influence - :cooks-distance, :dffits, :dfbetas and :covratio
  • :influential - list of influential observations (ids) for influence measures
  • :correlation - correlation matrix of estimated parameters
  • :normality - residuals normality tests: :skewness, :kurtosis, :durbin-watson (for raw and weighted), :jarque-berra and :omnibus (normality)
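
The leverage and influence measures listed above can be sketched for a single-predictor model with an intercept, where the hat values have a simple closed form. These are the standard textbook formulas and the function names are hypothetical; the library's own computation works on the full design matrix and may differ in details such as weighting.

```clojure
(defn hat-values
  "Diagonal of the hat matrix for a single-predictor model with intercept:
  h_i = 1/n + (x_i - x-bar)^2 / Sxx."
  [xs]
  (let [n     (count xs)
        x-bar (/ (reduce + xs) (double n))
        sxx   (reduce + (map (fn [x] (Math/pow (- x x-bar) 2.0)) xs))]
    (mapv (fn [x] (+ (/ 1.0 n) (/ (Math/pow (- x x-bar) 2.0) sxx))) xs)))

(defn standardized-residuals
  "Raw residuals scaled by sigma * sqrt(1 - h_i)."
  [residuals hats sigma]
  (mapv (fn [r h] (/ r (* sigma (Math/sqrt (- 1.0 h))))) residuals hats))

(defn cooks-distances
  "Cook's distance D_i = r_i^2 * h_i / (p * (1 - h_i)), where r_i is the
  standardized residual and p the number of fitted parameters."
  [std-residuals hats p]
  (mapv (fn [r h] (/ (* r r h) (* p (- 1.0 h)))) std-residuals hats))

(hat-values [1 2 3 10]) ; the isolated point x = 10 gets by far the largest leverage
```

The hat values always sum to the number of fitted parameters (two here), which is a convenient check when computing them by hand.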

predict

(predict model xs)
(predict model xs stderr?)

Predict from the given model and data point.

If stderr? is true, standard error and confidence interval are added. If the model was fitted with an offset, the first element of the data point should contain the provided offset.

Expected data point:

  • [x1,x2,...,xn] - when model was trained without offset
  • [offset,x1,x2,...,xn] - when offset was used for training
  • [] or nil - when model was trained with intercept only
  • [offset] - when model was trained with intercept and offset
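
The four data point layouts can be illustrated with a small sketch that computes a linear predictor from an intercept and coefficients. It assumes an identity link and that the offset is simply added to the linear predictor, which is the standard convention but not something the docstring spells out. `linear-predictor` and its `offset?` flag are hypothetical; the real `predict` reads this information from the model.

```clojure
(defn linear-predictor
  "Combines a data point with {:intercept b0 :beta [b1 ... bn]}.
  When offset? is true, the first element of the point is treated as the offset."
  [{:keys [intercept beta]} point offset?]
  (let [point  (vec point)                        ; nil becomes []
        offset (if offset? (first point) 0.0)
        xs     (if offset? (rest point) point)]
    (+ intercept offset (reduce + (map * beta xs)))))

(def model {:intercept 1.0 :beta [2.0 3.0]})

(linear-predictor model [1.0 1.0] false)                 ; [x1 x2]
(linear-predictor model [0.5 1.0 1.0] true)              ; [offset x1 x2]
(linear-predictor {:intercept 1.0 :beta []} nil false)   ; intercept only
(linear-predictor {:intercept 1.0 :beta []} [0.5] true)  ; intercept and offset
```

For a GLM the mean function would then be applied to this value to get a prediction on the response scale.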

quantile-residuals

(quantile-residuals {:keys [quantile-residuals-fun residuals dispersion]
                     :as model})

Quantile residuals for a model, possibly randomized.
