OLS, WLS and GLM regression models with analysis.
(->family family-map)
(->family variance residual-deviance)
(->family default-link
variance
initialize
residual-deviance
aic
quantile-residuals-fun
dispersion)
Create `Family` record.

Arguments:

* `default-link` - canonical link function, default: `:identity`
* `variance` - variance function in terms of mean
* `initialize` - initialization of glm, default: the same as in `:gaussian`
* `residual-deviance` - calculates residual deviance
* `aic` - calculates AIC, default `(constantly ##NaN)`
* `quantile-residuals-fun` - calculates quantile residuals, default as in `:gaussian`
* `dispersion` - value or `:estimate` (default), `:pearson` or `:mean-deviance`

Initialization will be called with `ys` and `weights` and should return:

* `ys`, possibly changed if any adjustment is necessary
* `init-mu`, starting point
* `weights`, possibly changed or original
* (optional) any other data used to calculate AIC

The AIC function should accept `ys`, `fitted`, `weights`, `deviance`, `observation`, `rank` (fitted parameters) and additional data created by initialization.

A minimal family should define `variance` and `residual-deviance`.
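As an illustration of the two functions a minimal family needs, the following plain-Clojure sketch writes out the textbook Poisson variance and residual deviance. The function names and their argument lists are hypothetical; the exact signatures `->family` expects are not stated above, so treat this as a sketch of the math rather than a ready-made family definition.

```clojure
;; Hypothetical helper names; the argument order expected by ->family is an assumption.

(defn poisson-variance
  "Poisson variance function: the variance equals the mean."
  [mu]
  mu)

(defn poisson-unit-deviance
  "Deviance contribution of one observation y with fitted mean mu.
  The y*log(y/mu) term is taken as 0 when y is 0."
  [y mu]
  (* 2.0 (- (if (pos? y) (* y (Math/log (/ y mu))) 0.0)
            (- y mu))))

(defn poisson-residual-deviance
  "Weighted sum of unit deviances over all observations."
  [ys mus ws]
  (reduce + (map (fn [y mu w] (* w (poisson-unit-deviance y mu))) ys mus ws)))
```

A perfectly fitted observation contributes nothing to the deviance, so only the points where `y` and `mu` differ add to the total.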
(->link link-map)
(->link g mean mean-derivative)
Creates `Link` record.

Arguments:

* `g` - link function
* `mean` - mean, inverse link function
* `mean-derivative` - derivative of mean
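The three ingredients of a link can be sketched in plain Clojure for the logit case. The function names below are hypothetical, and it is an assumption that `mean-derivative` means the derivative of the mean function with respect to the linear predictor.

```clojure
;; Hypothetical names; a sketch of the three functions a logit link is built from.

(defn logit
  "Link function g: maps a mean in (0,1) to the linear predictor scale."
  [mu]
  (Math/log (/ mu (- 1.0 mu))))

(defn logistic
  "Mean function, the inverse of g."
  [eta]
  (/ 1.0 (+ 1.0 (Math/exp (- eta)))))

(defn logistic-derivative
  "Derivative of the mean function with respect to eta (assumed meaning of mean-derivative)."
  [eta]
  (let [m (logistic eta)]
    (* m (- 1.0 m))))
```

Under these assumptions, the three functions would be passed as `g`, `mean` and `mean-derivative`.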
(add-penalty xss penalty penalty-param)
(add-penalty xss penalty penalty-param intercept?)
Adds rows with penalty data to a given seq of seqs (a row matrix).

Penalties and parameters map:

* `:ridge` - diagonal matrix with sqrt(`:lambda`) parameter, used in regularized regression.
* `:diffs` - differences of given `:order` multiplied by sqrt(`:lambda`), used in penalized b-splines.

To add more than one penalty, create a seq of penalty types and parameters, e.g. `[:diffs :ridge]`.
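The idea of regularization by data augmentation can be sketched in plain Clojure: penalty rows are appended below the design matrix so that an ordinary least squares solver also minimizes the penalty. The helper names are hypothetical, and the sketch ignores intercept handling and the matching padding of the response, neither of which is described above.

```clojure
;; Hypothetical helpers illustrating the shape of the appended penalty rows.

(defn ridge-rows
  "p rows of a diagonal matrix with sqrt(lambda) on the diagonal."
  [p lambda]
  (let [s (Math/sqrt lambda)]
    (for [i (range p)]
      (vec (for [j (range p)] (if (= i j) s 0.0))))))

(defn diff-rows
  "Rows of the order-th difference matrix for p columns, scaled by sqrt(lambda)."
  [p order lambda]
  (let [s (Math/sqrt lambda)
        identity-rows (for [i (range p)]
                        (vec (for [j (range p)] (if (= i j) 1.0 0.0))))
        ;; difference of consecutive rows, applied `order` times
        diff-once (fn [rows] (map (fn [a b] (mapv - b a)) rows (rest rows)))]
    (->> (nth (iterate diff-once identity-rows) order)
         (map (fn [row] (mapv #(* s %) row))))))

(defn augment
  "Appends penalty rows below the rows of xss."
  [xss penalty-rows]
  (concat xss penalty-rows))
```

For example, `(diff-rows 3 2 1.0)` yields the single second-difference row `[1.0 -2.0 1.0]`, which penalizes curvature in three neighbouring coefficients.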
(analysis model)
Influence analysis: leverage, standardized and studentized residuals, correlation.
(b-spline-transformer xs nseg degree)
(b-spline-transformer xl xr nseg degree)
(cir ys)
(cir xs ys)
(cir xs ys order)
(cir xs ys ws order)
Centered Isotonic Regression.

Returns the shrunken [`xs`, `ys`] pair.

Arguments:

* `xs` - regressor variable
* `ys` - response variable
* `ws` - weights (optional)
* `order` - `:asc` or `:increasing` (default), `:desc` or `:decreasing`, `:non-decreasing` and `:non-increasing`.
(dose glm-model)
(dose glm-model p)
(dose glm-model p coeff-id)
(dose {:keys [link-fun xtxinv coefficients]} p intercept-id coeff-id)
Predict Lethal/Effective dose for given `p` (default: p=0.5, median).

* `intercept-id` - id of intercept, default: 0
* `coeff-id` - id of the coefficient used for calculating dose, default: 1
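For intuition, the textbook point estimate of such a dose under a logit link is the value of the regressor at which the linear predictor equals g(p). The sketch below uses hypothetical names and is not confirmed to be how `dose` is implemented; in particular it leaves out any standard error derived from `xtxinv`.

```clojure
;; Hypothetical sketch of the textbook formula, assuming a logit link.

(defn logit
  "Logit link: log-odds of p."
  [p]
  (Math/log (/ p (- 1.0 p))))

(defn dose-estimate
  "Dose x solving intercept + slope * x = logit(p)."
  [intercept slope p]
  (/ (- (logit p) intercept) slope))
```

With an intercept of -2.0 and a slope of 0.5, the median dose `(dose-estimate -2.0 0.5 0.5)` is 4.0, since the logit of 0.5 is zero.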
(family-with-link family)
(family-with-link family params)
(family-with-link family link params)
Returns family with a link as single map.
(glm ys xss)
(glm ys
xss
{:keys [max-iters tol epsilon family link weights alpha offset
dispersion-estimator intercept? init-mu simple? transformer names
decomposition augmentation augmentation-params]
:or {max-iters 25
tol 1.0E-8
epsilon 1.0E-8
family :gaussian
alpha 0.05
intercept? true
simple? false
decomposition :cholesky}
:as params})
Fit a generalized linear model using the IRLS method.

Arguments:

* `ys` - response vector
* `xss` - terms of systematic component
* optional parameters

Parameters:

* `:tol` - tolerance for matrix decomposition (SVD and Cholesky), default: `1.0e-8`
* `:epsilon` - tolerance for IRLS (stopping condition), default: `1.0e-8`
* `:max-iters` - maximum number of iterations, default: `25`
* `:weights` - optional weights
* `:offset` - optional offset
* `:alpha` - significance level, default: `0.05`
* `:intercept?` - should intercept term be included, default: `true`
* `:init-mu` - initial response vector for IRLS
* `:simple?` - returns simplified result
* `:dispersion-estimator` - `:pearson`, `:mean-deviance` or any number, replaces the default one.
* `:family` - family, default: `:gaussian`
* `:link` - link
* `:nbinomial-theta` - theta for `:nbinomial` family, default: `1.0`.
* `:transformer` - an optional function which will be used to transform systematic component `xs` before fitting and prediction
* `:names` - an optional vector of names to use when printing the model
* `:decomposition` - which matrix decomposition to use to find the solution, `:cholesky` (default), `:rrqr` (rank revealing) or `:qr`
* `:augmentation` and `:augmentation-params` - regularization by data augmentation
  - `:ridge` - adds ridge regression penalty (intercept is not penalized), default parameters `{:lambda 0.1}`
  - `:diffs` - adds differences penalty, use with `b-spline-transformer` for smoothing, default parameters `{:lambda 1.0 :order 2}`

Family is one of: `:gaussian` (default), `:binomial`, `:quasi-binomial`, `:poisson`, `:quasi-poisson`, `:gamma`, `:inverse-gaussian`, `:nbinomial`, a custom `Family` record (see [[->family]]) or a function returning a Family (accepting a map as an argument).

Link is one of: `:probit`, `:identity`, `:loglog`, `:sqrt`, `:inverse`, `:logit`, `:power`, `:nbinomial`, `:cauchit`, `:distribution`, `:cloglog`, `:inversesq`, `:log`, `:clog`, a custom `Link` record (see [[->link]]) or a function returning a Link (accepting a map as an argument).

Notes:

* SVD decomposition is used instead of more common QR
* intercept term is added implicitly if `intercept?` is set to `true` (by default)
* `:nbinomial` family requires `:nbinomial-theta` parameter
* Each family has its own default (canonical) link.

Returned record implements the `IFn` protocol and contains:

* `:model` - set to `:glm`
* `:intercept?` - whether intercept term is included or not
* `:xtxinv` - (X^T X)^-1
* `:intercept` - intercept term value
* `:beta` - vector of model coefficients (without intercept)
* `:coefficients` - coefficient analysis, a list of maps containing `:estimate`, `:stderr`, `:t-value`, `:p-value` and `:confidence-interval`
* `:weights` - weights: `:weights` (working) and `:initial`
* `:residuals` - a map containing `:raw`, `:working`, `:pearsons` and `:deviance` residuals
* `:fitted` - fitted values for xss
* `:df` - degrees of freedom: `:residual`, `:null` and `:intercept`
* `:observations` - number of observations
* `:deviance` - deviances: `:residual` and `:null`
* `:dispersion` - default or calculated, used in a model
* `:dispersions` - `:pearson` and `:mean-deviance`
* `:family` - family used
* `:link` - link used
* `:link-fun` - link function, `g`
* `:mean-fun` - mean function, `g^-1`
* `:q` - (1-alpha/2) quantile of T or Normal distribution for residual degrees of freedom
* `:chi2` and `:p-value` - Chi-squared statistic and respective p-value
* `:ll` - a map containing log-likelihood and AIC/BIC (GLM variant and a variant based on deviance, dev+2ED)
* `:analysis` - leverage, residual and influence analysis - a delay
* `:iters` and `:converged?` - number of iterations and convergence indicator
* `:decomposition` - decomposition used

Analysis, a delay containing a map:

* `:residuals` - `:standardized` and `:studentized` residuals (pearsons and deviance)
* `:laverage` - `:hat`, `:sigmas` and leave-one-out `:coefficients`
* `:influence` - `:cooks-distance`, `:dffits`, `:dfbetas` and `:covratio`
* `:influential` - list of influential observations (ids) for influence measures
* `:correlation` - correlation matrix of estimated parameters
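To make the IRLS method more concrete, the sketch below computes the two quantities a textbook IRLS iteration derives for each observation: the working response and the working weight. Every name here is hypothetical and the code is a generic illustration of the technique, not the library's internal implementation.

```clojure
;; Hypothetical sketch of one IRLS ingredient: working response z and working weight w.

(defn irls-step-terms
  "Working response and weight for one observation.
  g        - link function
  g-prime  - derivative of the link with respect to mu
  variance - variance function of the family"
  [{:keys [g g-prime variance]} y mu prior-weight]
  (let [eta (g mu)
        d (g-prime mu)]
    {:z (+ eta (* (- y mu) d))
     :w (/ prior-weight (* (variance mu) d d))}))

;; Binomial family with logit link, written out by hand for the example.
(def binomial-logit
  {:g (fn [mu] (Math/log (/ mu (- 1.0 mu))))
   :g-prime (fn [mu] (/ 1.0 (* mu (- 1.0 mu))))
   :variance (fn [mu] (* mu (- 1.0 mu)))})
```

Each iteration then solves a weighted least squares problem of `z` on the regressors with weights `w`, recomputes `mu` from the new coefficients, and repeats until the change falls below a tolerance such as `:epsilon` or the iteration limit `:max-iters` is reached.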
(glm-nbinomial ys xss)
(glm-nbinomial ys
xss
{:keys [nbinomial-theta max-iters epsilon]
:or {max-iters 25 epsilon 1.0E-8}
:as params})
Fits theta for negative binomial glm in an iterative process.

Returns fitted model with `:nbinomial-theta` key.

Arguments and parameters are the same as for `glm`.

Additional parameters:

* `:nbinomial-theta` - initial theta used as a starting point for optimization.
(lm ys xss)
(lm ys
xss
{:keys [tol weights alpha intercept? offset transformer names decomposition
augmentation augmentation-params]
:or {tol 1.0E-8 alpha 0.05 intercept? true decomposition :cholesky}})
Fit a linear model using ordinary (OLS) or weighted (WLS) least squares.

Arguments:

* `ys` - response vector
* `xss` - terms of systematic component
* optional parameters

Parameters:

* `:tol` - tolerance for matrix decomposition (SVD and Cholesky/QR decomposition), default: `1.0e-8`
* `:weights` - optional weights for WLS
* `:offset` - optional offset
* `:alpha` - significance level, default: `0.05`
* `:intercept?` - should intercept term be included, default: `true`
* `:transformer` - an optional function which will be used to transform systematic component `xs` before fitting and prediction
* `:names` - sequence or string, used as name for coefficient when pretty-printing model, default `'X'`
* `:decomposition` - which matrix decomposition to use to find the solution, `:cholesky` (default), `:rrqr` (rank revealing) or `:qr`
* `:augmentation` and `:augmentation-params` - regularization by data augmentation
  - `:ridge` - adds ridge regression penalty (intercept is not penalized), default parameters `{:lambda 0.1}`
  - `:diffs` - adds differences penalty, use with `b-spline-transformer` for smoothing, default parameters `{:lambda 1.0 :order 2}`

Notes:

* SVD decomposition is used instead of more common QR
* intercept term is added implicitly if `intercept?` is set to `true` (by default)
* Two variants of AIC/BIC are calculated, one based on log-likelihood, second on RSS/n

Returned record implements the `IFn` protocol and contains:

* `:model` - `:ols` or `:wls`
* `:intercept?` - whether intercept term is included or not
* `:xtxinv` - (X^T X)^-1
* `:intercept` - intercept term value
* `:beta` - vector of model coefficients (without intercept)
* `:coefficients` - coefficient analysis, a list of maps containing `:estimate`, `:stderr`, `:t-value`, `:p-value` and `:confidence-interval`
* `:weights` - initial weights
* `:residuals` - a map containing `:raw` and `:weighted` residuals, also `:loocv`
* `:fitted` - fitted values for xss
* `:df` - degrees of freedom: `:residual`, `:model` and `:intercept`
* `:observations` - number of observations
* `:r-squared` and `:adjusted-r-squared`
* `:sigma` and `:sigma2` - standard deviation and variance
* `:msreg` - regression mean square
* `:rss`, `:regss`, `:tss` - residual, regression and total sum of squares
* `:qt` - (1-alpha/2) quantile of T distribution for residual degrees of freedom
* `:f-statistic` and `:p-value` - F statistic and respective p-value
* `:ll` - a map containing log-likelihood and AIC/BIC in two variants: based on log-likelihood and RSS
* `:analysis` - leverage, residual and influence analysis - a delay
* `:decomposition` - decomposition used
* `:augmentation` - augmentation used
* `:cv` - cross validation statistic
* `:effective-dimension` - effective dimension of the model (sum of hat matrix diagonal)

Analysis, a delay containing a map:

* `:residuals` - `:standardized` and `:studentized` weighted residuals
* `:laverage` - `:hat`, `:sigmas` and leave-one-out `:coefficients`
* `:influence` - `:cooks-distance`, `:dffits`, `:dfbetas` and `:covratio`
* `:influential` - list of influential observations (ids) for influence measures
* `:correlation` - correlation matrix of estimated parameters
* `:normality` - residuals normality tests: `:skewness`, `:kurtosis`, `:durbin-watson` (for raw and weighted), `:jarque-berra` and `:omnibus` (normality)
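The difference between OLS and WLS can be seen in a single-regressor case, where both have a closed form. The function below is a hypothetical plain-Clojure sketch, not the library's solver (which works through a matrix decomposition). The `:intercept` and `:beta` keys in its result only mirror the names listed above.

```clojure
;; Hypothetical closed-form fit of y = intercept + slope * x with weights.

(defn simple-wls
  "Weighted least squares for one regressor.
  With all weights equal to 1 this reduces to ordinary least squares."
  [xs ys ws]
  (let [sw (reduce + ws)
        xbar (/ (reduce + (map * ws xs)) sw)   ; weighted mean of xs
        ybar (/ (reduce + (map * ws ys)) sw)   ; weighted mean of ys
        sxy (reduce + (map (fn [w x y] (* w (- x xbar) (- y ybar))) ws xs ys))
        sxx (reduce + (map (fn [w x] (* w (- x xbar) (- x xbar))) ws xs))
        slope (/ sxy sxx)]
    {:intercept (- ybar (* slope xbar))
     :beta [slope]}))
```

For points lying exactly on a line, for instance `xs` of `[1.0 2.0 3.0]` and `ys` of `[3.0 5.0 7.0]`, any positive weights recover the same intercept and slope. Weights only change the fit once the data are noisy.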
(pava ys)
(pava ys order)
(pava ys ws order)
Isotonic regression, pool-adjacent-violators algorithm with up-and-down-blocks variant.

Isotonic regression minimizes the (weighted) L2 loss function with a constraint that the result should be monotonic (ascending or descending).

Arguments:

* `ys` - response variable data
* `ws` - weights (optional)
* `order` - `:asc` or `:increasing` (default), `:desc` or `:decreasing`, `:non-decreasing` and `:non-increasing`

Returns monotonic predicted values.
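The pooling idea behind pool-adjacent-violators can be sketched in plain Clojure. This is the basic stack-based form of the algorithm producing a non-decreasing fit, not the up-and-down-blocks variant named above, and the function name is hypothetical.

```clojure
;; Hypothetical sketch of basic PAVA for a non-decreasing fit.

(defn pava-increasing
  "Weighted isotonic (non-decreasing) fit of ys.
  Keeps a stack of blocks, each with its weighted mean, total weight and size."
  [ys ws]
  (let [merge-blocks (fn [a b]
                       (let [w (+ (:w a) (:w b))]
                         {:mean (/ (+ (* (:w a) (:mean a)) (* (:w b) (:mean b))) w)
                          :w w
                          :n (+ (:n a) (:n b))}))
        push (fn [stack block]
               ;; pool with earlier blocks while monotonicity is violated
               (loop [stack stack
                      block block]
                 (if (and (seq stack) (> (:mean (peek stack)) (:mean block)))
                   (recur (pop stack) (merge-blocks (peek stack) block))
                   (conj stack block))))
        blocks (reduce push [] (map (fn [y w] {:mean y :w w :n 1}) ys ws))]
    ;; expand each block back to one fitted value per original point
    (vec (mapcat (fn [{:keys [mean n]}] (repeat n mean)) blocks))))
```

In `(pava-increasing [1.0 3.0 2.0 4.0] [1.0 1.0 1.0 1.0])` the violating pair 3.0, 2.0 is pooled into its mean, giving `[1.0 2.5 2.5 4.0]`.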
(polynomial-transformer xs degree)
Creates polynomial transformer for xs
(predict model xs)
(predict model xs stderr?)
Predict from the given model and data point.

If `stderr?` is true, standard error and confidence interval are added.

If the model is fitted with an offset, the first element of the data point should contain the provided offset.

Expected data point:

* `[x1,x2,...,xn]` - when model was trained without offset
* `[offset,x1,x2,...,xn]` - when offset was used for training
* `[]` or `nil` - when model was trained with intercept only
* `[offset]` - when model was trained with intercept and offset
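The four data point layouts can be illustrated with a small plain-Clojure sketch of a linear predictor. The function and the `:offset?` flag are hypothetical and only mimic the layout rules. A GLM would additionally apply its mean function to the result.

```clojure
;; Hypothetical sketch of how a data point with an optional leading offset is read.

(defn linear-predictor
  "Linear predictor for one data point.
  When :offset? is true, the first element of point is the offset."
  [{:keys [intercept beta offset?]} point]
  (let [offset (if offset? (first point) 0.0)
        xs (if offset? (rest point) point)]
    (+ intercept offset (reduce + (map * beta xs)))))
```

With an intercept of 1.0 and `beta` of `[2.0 3.0]`, the point `[1.0 1.0]` gives 6.0 without offset, while `[0.5 1.0 1.0]` gives 6.5 when an offset was used.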
(quantile-residuals {:keys [quantile-residuals-fun residuals dispersion]
:as model})
Quantile residuals for a model, possibly randomized.
(trigonometric-transformer period degree)
Creates trigonometric transformer for xs