This namespace provides a comprehensive collection of functions for performing statistical analysis in Clojure. It focuses on providing efficient implementations for common statistical tasks, leveraging fastmath's underlying numerical capabilities.

This namespace covers a wide range of statistical methods, including:

* **Descriptive Statistics**: Measures of central tendency (mean, median, mode, expectile), dispersion (variance, standard deviation, MAD, SEM), and shape (skewness, kurtosis, L-moments).
* **Quantiles and Percentiles**: Functions for calculating percentiles, quantiles, and the median, including weighted versions and various estimation strategies.
* **Intervals and Extents**: Methods for defining ranges within data, such as span, IQR, standard deviation/MAD/SEM extents, percentile/quantile intervals, prediction intervals (PI, HPDI), and fence boundaries for outlier detection.
* **Outlier Detection**: Functions for identifying data points outside conventional fence boundaries.
* **Data Transformation**: Utilities for scaling, centering, trimming, winsorizing, and applying power transformations (Box-Cox, Yeo-Johnson) to data.
* **Correlation and Covariance**: Measures of the linear and monotonic relationship between two or more variables (Pearson, Spearman, Kendall), and functions for generating covariance and correlation matrices.
* **Distance and Similarity Metrics**: Functions for quantifying differences or likeness between data sequences or distributions, including error metrics (MAE, MSE, RMSE), L-p norms, and various distribution dissimilarity/similarity measures.
* **Contingency Tables**: Functions for creating, analyzing, and deriving measures of association and agreement (Cramer's V, Cohen's Kappa) from contingency tables, including specialized functions for 2x2 tables.
* **Binary Classification Metrics**: Functions for generating confusion matrices and calculating a wide array of performance metrics (Accuracy, Precision, Recall, F1, MCC, etc.).
* **Effect Size**: Measures quantifying the magnitude of statistical effects, including difference-based (Cohen's d, Hedges' g, Glass's delta), ratio-based, ordinal/non-parametric (Cliff's Delta, Vargha-Delaney A), and overlap-based (Cohen's U, p-overlap), as well as measures related to explained variance (Eta-squared, Omega-squared, Cohen's f²).
* **Statistical Tests**: Functions for performing hypothesis tests, including:
  - Normality and Shape tests (Skewness, Kurtosis, D'Agostino-Pearson K², Jarque-Bera, Bonett-Seier).
  - Binomial tests and confidence intervals.
  - Location tests (one-sample and two-sample T/Z tests, paired/unpaired).
  - Variance tests (F-test, Levene's, Brown-Forsythe, Fligner-Killeen).
  - Goodness-of-Fit and Independence tests (Power Divergence family including Chi-squared, G-test; AD/KS tests).
  - ANOVA and Rank Sum tests (One-way ANOVA, Kruskal-Wallis).
  - Autocorrelation tests (Durbin-Watson).
* **Time Series Analysis**: Functions for analyzing the dependence structure of time series data, such as Autocorrelation (ACF) and Partial Autocorrelation (PACF).
* **Histograms**: Functions for computing histograms and estimating optimal binning strategies.

This namespace aims to provide a robust set of statistical tools for data analysis and modeling within the Clojure ecosystem.
(acf data)
(acf data lags)
Calculates the Autocorrelation Function (ACF) for a given time series `data`.

The ACF measures the linear dependence between a time series and its lagged values. It helps identify patterns (like seasonality or trend) and inform the selection of models for time series analysis (e.g., in ARIMA modeling).

Parameters:

* `data` (seq of numbers): The time series data.
* `lags` (long or seq of longs, optional):
  * If a number, calculates ACF for lags from 0 up to this maximum lag.
  * If a sequence of numbers, calculates ACF for each lag specified in the sequence.
  * If omitted (1-arity call), calculates ACF for lags from 0 up to `(dec (count data))`.

Returns a sequence of doubles: the autocorrelation coefficients for the specified lags. The value at lag 0 is always 1.0.

See also [[acf-ci]] (calculates ACF with confidence intervals), [[pacf]], [[pacf-ci]].
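A minimal usage sketch, assuming this namespace is required as `fastmath.stats` (aliased to `stats` below); the data values are made up:

```clojure
(require '[fastmath.stats :as stats])

;; A short, noisy upward-trending series.
(def data [1.2 1.9 3.1 3.9 5.2 5.8 7.1 8.0 8.9 10.1])

;; ACF for lags 0..3; the coefficient at lag 0 is always 1.0.
(stats/acf data 3)

;; ACF for selected lags only.
(stats/acf data [0 2 4])
```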
(acf-ci data)
(acf-ci data lags)
(acf-ci data lags alpha)
variance of the sum of squared sample autocorrelations up to each lag.Calculates the Autocorrelation Function (ACF) for a time series and provides approximate confidence intervals. This function computes the ACF of the input time series `data` for specified lags (see [[acf]]) and includes approximate confidence intervals around the ACF estimates. These intervals help determine whether the autocorrelation at a specific lag is statistically significant (i.e., likely non-zero in the population). Parameters: * `data` (seq of numbers): The time series data. * `lags` (long or seq of longs, optional): * If a number, calculates ACF for lags from 0 up to this maximum lag. * If a sequence of numbers, calculates ACF for each lag specified in the sequence. * If omitted (1-arity call), calculates ACF for lags from 0 up to `(dec (count data))`. * `alpha` (double, optional): The significance level for the confidence intervals. Defaults to `0.05` (for a 95% CI). Returns a map containing: * `:ci` (double): The value of the approximate standard confidence interval bound for lags > 0. If the absolute value of an ACF coefficient at lag `k > 0` exceeds this value, it is considered statistically significant. * `:acf` (seq of doubles): The sequence of autocorrelation coefficients at lags from 0 up to `lags` (or specified lags if `lags` is a sequence), calculated using [[acf]]. * `:cis` (seq of doubles): Cumulative confidence intervals for ACF. These are based on the variance of the sum of squared sample autocorrelations up to each lag. See also [[acf]], [[pacf]], [[pacf-ci]].
(ad-test-one-sample xs)
(ad-test-one-sample xs distribution-or-ys)
(ad-test-one-sample xs
distribution-or-ys
{:keys [sides kernel bandwidth]
:or {sides :right kernel :gaussian}})
Performs the Anderson-Darling (AD) test for goodness-of-fit.

This test assesses the null hypothesis that a sample `xs` comes from a specified theoretical distribution or another empirical distribution. It is sensitive to differences in the tails of the distributions.

Parameters:

- `xs` (seq of numbers): The sample data to be tested.
- `distribution-or-ys` (optional):
  - A `fastmath.random` distribution object to test against. If omitted, defaults to the standard normal distribution (`fastmath.random/default-normal`).
  - A sequence of numbers (`ys`). In this case, an empirical distribution is estimated from `ys` using Kernel Density Estimation (KDE) or an enumerated distribution (see `:kernel` option).
- `opts` (map, optional): Options map:
  - `:sides` (keyword, default `:right`): Specifies the side(s) of the A^2 statistic's distribution used for p-value calculation.
    - `:right` (default): Tests if the observed A^2 statistic is significantly large (standard approach for AD test, indicating poor fit).
    - `:left`: Tests if the observed A^2 statistic is significantly small.
    - `:two-sided`: Tests if the observed A^2 statistic is extreme in either tail.
  - `:kernel` (keyword, default `:gaussian`): Used only when `distribution-or-ys` is a sequence. Specifies the method to estimate the empirical distribution:
    - `:gaussian` (or other KDE kernels): Uses Kernel Density Estimation.
    - `:enumerated`: Creates a discrete empirical distribution from `ys`.
  - `:bandwidth` (double, optional): Bandwidth for KDE (if applicable).

Returns a map containing:

- `:A2`: The Anderson-Darling test statistic (A^2).
- `:stat`: Alias for `:A2`.
- `:p-value`: The p-value associated with the test statistic and the specified `:sides`.
- `:n`: Sample size of `xs`.
- `:mean`: Mean of the sample `xs` (for context).
- `:stddev`: Standard deviation of the sample `xs` (for context).
- `:sides`: The alternative hypothesis side used for p-value calculation.
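A minimal sketch (same `stats` alias as above). With no second argument the sample is tested against the standard normal distribution, per the description:

```clojure
(let [xs [0.1 -0.4 1.2 0.3 -0.8 0.5 -0.2 0.9 -1.1 0.0]]
  ;; A large :A2 (and small :p-value) would indicate a poor fit.
  (select-keys (stats/ad-test-one-sample xs) [:A2 :p-value :sides]))
```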
(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3 m)
Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile; IQR is `(- Q3 Q1)`.

* LAV is the smallest value greater than or equal to the LIF = `(- Q1 (* 1.5 IQR))`.
* UAV is the largest value less than or equal to the UIF = `(+ Q3 (* 1.5 IQR))`.
* The third returned value is the median of the samples.

The optional `estimation-strategy` argument can be set to change the quantile estimation type. See [[estimation-strategies]].
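For example (same `stats` alias as above):

```clojure
;; 100 lies above the upper inner fence, so the UAV ignores it.
(stats/adjacent-values [1 2 3 4 5 6 7 8 9 100])
```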
(ameasure [group1 group2])
(ameasure group1 group2)
Calculates the Vargha-Delaney A measure for two independent samples.

A non-parametric effect size measure quantifying the probability that a randomly chosen value from the first sample (`group1`) is greater than a randomly chosen value from the second sample (`group2`).

Parameters:

- `group1`: The first independent sample.
- `group2`: The second independent sample.

Returns the calculated A measure (a double) in the range [0, 1]. A value of 0.5 indicates stochastic equality (distributions are overlapping). Values > 0.5 mean `group1` tends to be larger; values < 0.5 mean `group2` tends to be larger.

Related to [[cliffs-delta]] and the Wilcoxon-Mann-Whitney U test statistic.

See also [[cliffs-delta]], [[wmw-odds]].
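A quick sketch (same `stats` alias). Here every `group1` value exceeds every `group2` value, so by the definition above A should be 1.0:

```clojure
;; Complete separation: group1 is always larger.
(stats/ameasure [6 7 8 9 10] [1 2 3 4 5])
;; => 1.0
```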
(binary-measures confusion-matrix)
(binary-measures actual prediction)
(binary-measures actual prediction true-value)
(binary-measures tp fn fp tn)
Calculates a selected subset of common evaluation metrics for binary classification results.

This function is a convenience wrapper around [[binary-measures-all]], providing a map containing the most frequently used metrics derived from a 2x2 confusion matrix.

The 2x2 confusion matrix is based on True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN):

|                  | Predicted True | Predicted False |
|:-----------------|:---------------|:----------------|
| **Actual True**  | TP             | FN              |
| **Actual False** | FP             | TN              |

The function accepts the same input formats as [[binary-measures-all]]:

1. `(binary-measures tp fn fp tn)`: Direct input of the four counts.
2. `(binary-measures confusion-matrix)`: Input as a structured representation (map with keys like `:tp`, `:fn`, `:fp`, `:tn`; sequence of sequences `[[TP FP] [FN TN]]`; or flat sequence `[TP FN FP TN]`).
3. `(binary-measures actual prediction)`: Input as two sequences of outcomes.
4. `(binary-measures actual prediction true-value)`: Input as two sequences with a specified encoding for `true` (success).

Parameters:

- `tp`, `fn`, `fp`, `tn` (long): Counts from the confusion matrix.
- `confusion-matrix` (map or sequence): Representation of the confusion matrix.
- `actual`, `prediction` (sequences): Sequences of true and predicted outcomes.
- `true-value` (optional): Specifies how outcomes are converted to boolean `true`/`false`.

Returns a map containing the following selected metrics:

- `:tp` (True Positives)
- `:tn` (True Negatives)
- `:fp` (False Positives)
- `:fn` (False Negatives)
- `:accuracy`
- `:fdr` (False Discovery Rate, 1 - Precision)
- `:f-measure` (F1 Score, harmonic mean of Precision and Recall)
- `:fall-out` (False Positive Rate)
- `:precision` (Positive Predictive Value)
- `:recall` (True Positive Rate / Sensitivity)
- `:sensitivity` (Alias for Recall/TPR)
- `:specificity` (True Negative Rate)
- `:prevalence` (Proportion of positive cases)

See also [[confusion-matrix]], [[binary-measures-all]], [[mcc]], [[contingency-2x2-measures-all]].
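Two hedged call sketches matching the documented input formats (same `stats` alias; counts are illustrative):

```clojure
;; From raw counts: tp=10, fn=2, fp=5, tn=80.
(:accuracy (stats/binary-measures 10 2 5 80))
;; => (10 + 80) / 97, roughly 0.928

;; From actual/predicted outcome sequences (1 = positive).
(stats/binary-measures [1 1 0 0 1] [1 0 0 1 1])
```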
(binary-measures-all confusion-matrix)
(binary-measures-all actual prediction)
(binary-measures-all actual prediction true-value)
(binary-measures-all tp fn fp tn)
Calculates a comprehensive set of evaluation metrics for binary classification results.

This function computes various statistics derived from a 2x2 confusion matrix, summarizing the performance of a binary classifier.

The 2x2 confusion matrix is based on True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN):

|                  | Predicted True | Predicted False |
|:-----------------|:---------------|:----------------|
| **Actual True**  | TP             | FN              |
| **Actual False** | FP             | TN              |

The function supports several input formats:

1. `(binary-measures-all tp fn fp tn)`: Direct input of the four counts as arguments.
   - `tp` (long): True Positive count.
   - `fn` (long): False Negative count.
   - `fp` (long): False Positive count.
   - `tn` (long): True Negative count.
2. `(binary-measures-all confusion-matrix)`: Input as a structured representation of the confusion matrix.
   - `confusion-matrix`: Can be:
     - A map with keys like `:tp`, `:fn`, `:fp`, `:tn` (e.g., `{:tp 10 :fn 2 :fp 5 :tn 80}`).
     - A sequence of sequences representing rows `[[TP FP] [FN TN]]` (e.g., `[[10 5] [2 80]]`).
     - A flat sequence `[TP FN FP TN]` (e.g., `[10 2 5 80]`).
3. `(binary-measures-all actual prediction)`: Input as two sequences of outcomes.
   - `actual` (sequence): Sequence of true outcomes.
   - `prediction` (sequence): Sequence of predicted outcomes. Must have the same length as `actual`.
   - Values in `actual` and `prediction` are converted to boolean `true`/`false`. By default, any non-`nil` or non-zero numeric value is treated as `true`, and `nil` or `0.0` is treated as `false`.
4. `(binary-measures-all actual prediction true-value)`: Input as two sequences with a specified encoding for `true`.
   - `actual`, `prediction`: Sequences as in the previous arity.
   - `true-value` (optional): Specifies how values in `actual` and `prediction` are converted to boolean `true` (success) or `false` (failure).
     - `nil` (default): Non-`nil`/non-zero (for numbers) is true.
     - Any sequence/set: Values found in this collection are true.
     - A map: Values are mapped according to the map; if a key is not found or maps to `false`, the value is false.
     - A predicate function: Returns `true` if the value satisfies the predicate.

Returns a map containing a wide array of calculated metrics. This includes, but is not limited to:

- Basic Counts: `:tp`, `:fn`, `:fp`, `:tn`
- Totals: `:cp` (Actual Positives), `:cn` (Actual Negatives), `:pcp` (Predicted Positives), `:pcn` (Predicted Negatives), `:total` (Grand Total)
- Rates (often ratios of counts):
  - `:tpr` (True Positive Rate, Recall, Sensitivity, Hit Rate)
  - `:fnr` (False Negative Rate, Miss Rate)
  - `:fpr` (False Positive Rate, Fall-out)
  - `:tnr` (True Negative Rate, Specificity, Selectivity)
  - `:ppv` (Positive Predictive Value, Precision)
  - `:fdr` (False Discovery Rate, `1 - ppv`)
  - `:npv` (Negative Predictive Value)
  - `:for` (False Omission Rate, `1 - npv`)
- Ratios/Odds:
  - `:lr+` (Positive Likelihood Ratio)
  - `:lr-` (Negative Likelihood Ratio)
  - `:dor` (Diagnostic Odds Ratio)
- Combined Scores:
  - `:accuracy`
  - `:ba` (Balanced Accuracy)
  - `:fm` (Fowlkes–Mallows index)
  - `:pt` (Prevalence Threshold)
  - `:ts` (Threat Score, Jaccard index)
  - `:f-measure` / `:f1-score` (F1 Score, special case of F-beta score)
  - `:f-beta` (Function to calculate F-beta for any beta)
  - `:mcc` / `:phi` (Matthews Correlation Coefficient, Phi coefficient)
  - `:bm` (Bookmaker Informedness)
  - `:kappa` (Cohen's Kappa, for 2x2 table)
  - `:mk` (Markedness)

Metrics are generally calculated using standard formulas based on the TP, FN, FP, TN counts. For more details on specific metrics, refer to standard classification literature or the Wikipedia page on [Precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall), which covers many of these concepts.

See also [[confusion-matrix]], [[binary-measures]] (for a selected subset of metrics), [[mcc]], [[contingency-2x2-measures-all]] (for a broader set of 2x2 table measures).
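A sketch (same `stats` alias), assuming `:f-beta` is invocable with a single beta argument, as the description above suggests:

```clojure
(let [m (stats/binary-measures-all {:tp 10 :fn 2 :fp 5 :tn 80})]
  ;; F1 should coincide with F-beta at beta = 1.
  [(:mcc m) (:f1-score m) ((:f-beta m) 1.0)])
```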
(binomial-ci number-of-successes number-of-trials)
(binomial-ci number-of-successes number-of-trials method)
(binomial-ci number-of-successes number-of-trials method alpha)
Calculates a confidence interval for a binomial proportion.

Given the number of observed `successes` in a fixed number of `trials`, this function estimates a confidence interval for the true underlying probability of success (`p`). Different statistical methods are available for calculating the interval, as the accuracy and behavior of the interval can vary, especially for small sample sizes or probabilities close to 0 or 1.

Parameters:

- `number-of-successes` (long): The count of successful outcomes.
- `number-of-trials` (long): The total number of independent trials.
- `method` (keyword, optional): The method used to calculate the confidence interval. Defaults to `:asymptotic`.
- `alpha` (double, optional): The significance level (alpha) for the interval. The confidence level is `1 - alpha`. Defaults to `0.05` (yielding a 95% CI).

Available `method` values:

- `:asymptotic`: Normal approximation interval (Wald interval), based on the Central Limit Theorem. Simple but can be inaccurate for small samples or probabilities near 0 or 1.
- `:agresti-coull`: An adjustment to the asymptotic interval, adding 'pseudo-counts' to improve performance for small samples.
- `:clopper-pearson`: An exact method based on inverting binomial tests. Provides guaranteed coverage but can be overly conservative (wider than necessary).
- `:wilson`: Score interval, derived from the score test. Generally recommended as a good balance of accuracy and coverage for various sample sizes.
- `:prop.test`: Interval typically used with `prop.test` in R, applies a continuity correction.
- `:cloglog`: Confidence interval based on the complementary log-log transformation.
- `:logit`: Confidence interval based on the logit transformation.
- `:probit`: Confidence interval based on the probit transformation (inverse of standard normal CDF).
- `:arcsine`: Confidence interval based on the arcsine transformation.
- `:all`: Applies all available methods and returns a map where keys are method keywords and values are their respective confidence intervals (as triplets).

Returns:

- A vector `[lower-bound, upper-bound, estimated-p]`.
  - `lower-bound` (double): The lower limit of the confidence interval.
  - `upper-bound` (double): The upper limit of the confidence interval.
  - `estimated-p` (double): The observed proportion of successes (`number-of-successes / number-of-trials`).
- If `method` is `:all`, returns a map of results from each method.

See also [[binomial-test]] for performing a hypothesis test on a binomial proportion.
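For example (same `stats` alias):

```clojure
;; 95% Wilson score interval for 7 successes in 20 trials.
(stats/binomial-ci 7 20 :wilson)
;; => [lower upper 0.35], since 7/20 = 0.35

;; Compare every available method at once.
(stats/binomial-ci 7 20 :all)
```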
(binomial-test xs)
(binomial-test xs maybe-params)
(binomial-test number-of-successes
number-of-trials
{:keys [alpha p ci-method sides]
:or {alpha 0.05 p 0.5 ci-method :asymptotic sides :two-sided}})
Performs an exact test of a simple null hypothesis about the probability of success in a Bernoulli experiment, based on the binomial distribution.

This test assesses the null hypothesis that the true probability of success (`p`) in the underlying population is equal to a specified value (default 0.5).

The function can be called in two ways:

1. With counts: `(binomial-test number-of-successes number-of-trials params)`
2. With data: `(binomial-test xs params)`, where `xs` is a sequence of outcomes. In this case, the outcomes in `xs` are converted to true/false based on the `:true-false-conv` parameter (if provided, otherwise numeric 1s are true), and the number of successes and total trials are derived from `xs`.

Parameters:

- `number-of-successes` (long): Observed number of successful outcomes.
- `number-of-trials` (long): Total number of trials.
- `xs` (sequence): Sample data (used in the alternative call signature).
- `params` (map, optional): Options map:
  - `:p` (double, default `0.5`): The hypothesized probability of success under the null hypothesis.
  - `:alpha` (double, default `0.05`): Significance level for confidence interval calculation.
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis.
    - `:two-sided` (default): True probability `p` is not equal to the hypothesized `p`.
    - `:one-sided-greater`: True probability `p` is greater than the hypothesized `p`.
    - `:one-sided-less`: True probability `p` is less than the hypothesized `p`.
  - `:ci-method` (keyword, default `:asymptotic`): Method used to calculate the confidence interval for the probability of success. See [[binomial-ci]] and [[binomial-ci-methods]] for available options (e.g., `:wilson`, `:clopper-pearson`).
  - `:true-false-conv` (optional, used only with `xs`): A function, set, or map to convert elements of `xs` into boolean `true` (success) or `false` (failure). See [[binary-measures-all]] documentation for details. If `nil` and `xs` contains numbers, `1.0` is treated as success.

Returns a map containing:

- `:p-value`: The probability of observing a result as extreme as, or more extreme than, the observed number of successes, assuming the null hypothesis is true. Calculated using the binomial distribution.
- `:p`: The hypothesized probability of success used in the test.
- `:successes`: The observed number of successes.
- `:trials`: The total number of trials.
- `:alpha`: Significance level used for the confidence interval.
- `:level`: Confidence level (`1 - alpha`).
- `:sides` / `:test-type`: Alternative hypothesis side used.
- `:stat`: The test statistic (the observed number of successes).
- `:estimate`: The observed proportion of successes (`successes / trials`).
- `:ci-method`: Confidence interval method used.
- `:confidence-interval`: A confidence interval for the true probability of success, calculated using the specified `:ci-method` and adjusted for the `:sides` parameter.
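A minimal sketch (same `stats` alias): testing whether a coin is fair, given 14 heads in 20 flips:

```clojure
(let [result (stats/binomial-test 14 20 {:p 0.5 :ci-method :wilson})]
  (select-keys result [:p-value :estimate :confidence-interval]))
;; :estimate => 0.7
```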
(bonett-seier-test xs)
(bonett-seier-test xs params)
(bonett-seier-test xs geary-kurtosis {:keys [sides] :or {sides :two-sided}})
Performs the Bonett-Seier test for normality based on Geary's 'g' kurtosis measure.

This test assesses the null hypothesis that the data comes from a normally distributed population by checking if the sample Geary's 'g' statistic significantly deviates from the value expected under normality (`sqrt(2/pi)`).

Parameters:

- `xs` (seq of numbers): The sample data. Requires `(count xs) > 3` for variance calculation.
- `geary-kurtosis` (double, optional): A pre-calculated Geary's 'g' kurtosis value. If omitted, it's calculated from `xs`.
- `params` (map, optional): Options map:
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis regarding the deviation from normal kurtosis.
    - `:two-sided` (default): The population kurtosis (measured by 'g') is different from normal.
    - `:one-sided-greater`: Population is leptokurtic ('g' < sqrt(2/pi)). Note Geary's 'g' decreases with peakedness.
    - `:one-sided-less`: Population is platykurtic ('g' > sqrt(2/pi)). Note Geary's 'g' increases with flatness.

Returns a map containing:

- `:Z`: The final test statistic (approximately standard normal under H0).
- `:stat`: Alias for `:Z`.
- `:p-value`: The p-value associated with `Z` and the specified `:sides`.
- `:kurtosis`: The Geary's 'g' kurtosis value used in the test.
- `:n`: The sample size.
- `:sides`: The alternative hypothesis side used.

References:

- Bonett, D. G., & Seier, E. (2002). A test of normality with high uniform power. Computational Statistics & Data Analysis, 40(3), 435-445. (Provides theoretical basis)

See also [[kurtosis]], [[kurtosis-test]], [[normality-test]], [[jarque-bera-test]].
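A sketch (same `stats` alias). Uniform data is platykurtic, so for a sample of this size the test may flag a deviation from normality:

```clojure
(let [xs (repeatedly 100 rand)]
  (select-keys (stats/bonett-seier-test xs) [:Z :p-value :kurtosis]))
```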
(bootstrap vs)
(bootstrap vs samples)
(bootstrap vs samples size)
Generates a set of resampled datasets of a given size from the provided data. `samples` defaults to 200; `size` defaults to the size of the input data.
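A sketch (same `stats` alias), assuming the return value is a collection of resampled sequences, per the description above:

```clojure
;; 50 resamples, each of the default size (the input size, 10).
(let [samples (stats/bootstrap (range 10) 50)]
  [(count samples) (count (first samples))])
;; expected: [50 10]
```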
(bootstrap-ci vs)
(bootstrap-ci vs alpha)
(bootstrap-ci vs alpha samples)
(bootstrap-ci vs alpha samples stat-fn)
Bootstrap method for calculating a confidence interval. `alpha` defaults to 0.98, `samples` to 1000. The last parameter is the statistical function used as the measure, default: [[mean]]. Returns the confidence interval and the value of the statistical function.
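For example (same `stats` alias), swapping the default [[mean]] for [[median]]:

```clojure
(stats/bootstrap-ci [1 2 3 4 5 6 7 8 9 100] 0.98 1000 stats/median)
;; => confidence interval plus the median value, per the description
```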
(box-cox-infer-lambda xs)
(box-cox-infer-lambda xs lambda-range)
(box-cox-infer-lambda xs lambda-range opts)
Finds the optimal lambda (λ) parameter for the Box-Cox transformation of a dataset using the Maximum Likelihood Estimation (MLE) method.

The Box-Cox transformation is a family of power transformations often applied to positive data to make it more closely resemble a normal distribution and stabilize variance. This function estimates the lambda value that maximizes the log-likelihood function of the transformed data, assuming the transformed data is normally distributed.

Parameters:

- `xs` (sequence of numbers): The input numerical data sequence.
- `lambda-range` (vector of two numbers, optional): A sequence `[min-lambda, max-lambda]` defining the closed interval within which the optimal lambda is searched. Defaults to `[-3.0, 3.0]`.
- `opts` (map, optional): Additional options affecting the data used for the likelihood calculation. These options are passed to the internal data preparation step. Key options include:
  - `:alpha` (double, default 0.0): A constant value added to `xs` before estimating lambda. This is often used when `xs` contains zero or negative values and the standard Box-Cox (which requires positive input) is desired, or to explore transformations around a shifted location.
  - `:negative?` (boolean, default `false`): If `true`, indicates that the likelihood is estimated based on the modified Box-Cox transformation (Bickel and Doksum approach) suitable for negative values. The estimation process will work with the absolute values of the data shifted by `:alpha`.

Returns the estimated optimal lambda value as a double.

The inferred lambda value can then be used as the `lambda` parameter for the [[box-cox-transformation]] function to apply the actual transformation to the dataset.

See also [[box-cox-transformation]], [[yeo-johnson-infer-lambda]], [[yeo-johnson-transformation]].
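A sketch (same `stats` alias). For data growing roughly exponentially, the inferred lambda is typically near 0, i.e. close to a log transform:

```clojure
(stats/box-cox-infer-lambda [0.5 1.1 1.8 2.7 4.4 7.3 12.2 20.1 33.1 54.6])
```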
(box-cox-transformation xs)
(box-cox-transformation xs lambda)
(box-cox-transformation xs lambda {:keys [scaled? inverse?] :as opts})
Applies the Box-Cox transformation to data.

The Box-Cox transformation is a family of power transformations used to stabilize variance and make data more normally distributed.

Parameters:

- `xs` (seq of numbers): The input data.
- `lambda` (default `0.0`): The power parameter. If `nil` or `[lambda-min, lambda-max]`, `lambda` is inferred using maximum log-likelihood.
- Options map:
  - `alpha` (optional): A shift parameter applied before transformation.
  - `scaled?` (default `false`): Scale by the geometric mean or any other number.
  - `negative?` (default `false`): Allow negative values.
  - `inverse?` (default `false`): Perform the inverse operation; `lambda` can't be inferred.

Returns transformed data.

Related: [[yeo-johnson-transformation]]
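A round-trip sketch (same `stats` alias): lambda = 0 corresponds to a log transform, and `:inverse?` undoes it:

```clojure
(let [xs [0.5 1.1 1.8 2.7 4.4]
      t  (stats/box-cox-transformation xs 0.0)]
  (stats/box-cox-transformation t 0.0 {:inverse? true}))
;; approximately (0.5 1.1 1.8 2.7 4.4)
```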
(brown-forsythe-test xss)
(brown-forsythe-test xss params)
Brown-Forsythe test for homogeneity of variances.

This test is a modification of Levene's test, using the median instead of the mean for calculating the spread within each group. This makes the test more robust against non-normally distributed data.

Calls [[levene-test]] with `:statistic` set to [[median]]. Accepts the same parameters as [[levene-test]], except for `:statistic`.

Parameters:

- `xss` (sequence of sequences): A collection of data groups.
- `params` (map, optional): Options map (see [[levene-test]]).
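For example (same `stats` alias): checking whether three groups share a common variance:

```clojure
(stats/brown-forsythe-test [[1 2 3 4 5] [2 4 6 8 10] [1 1 2 9 10]])
```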
(chisq-test contingency-table-or-xs)
(chisq-test contingency-table-or-xs params)
Chi-squared test: a power divergence test with `lambda` = 1.0.

Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table.

Usage:

1. **Goodness-of-Fit (GOF):**
   - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights).
   - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins.
2. **Test for Independence:**
   - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored.

Options map:

* `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values:
  * `1.0`: Pearson Chi-squared test ([[chisq-test]]).
  * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]).
  * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]).
  * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]).
  * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]).
  * `2/3`: Cressie-Read test (default, [[cressie-read-test]]).
* `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests.
* `:alpha` (double, default: `0.05`): Significance level for confidence intervals.
* `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`).
* `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`).
* `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation.
* `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom.
* `:bins` (number, keyword, or seq): Used only for a GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation.

Returns a map containing:

- `:stat`: The calculated power divergence test statistic.
- `:chi2`: Alias for `:stat`.
- `:df`: Degrees of freedom for the test.
- `:p-value`: The p-value associated with the test statistic.
- `:n`: Total number of observations.
- `:estimate`: Observed proportions.
- `:expected`: Expected counts or proportions under the null hypothesis.
- `:confidence-interval`: Bootstrap confidence intervals for the observed proportions.
- `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
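Two hedged call sketches (same `stats` alias), one per usage mode:

```clojure
;; Goodness-of-fit: is a six-sided die fair, given observed face counts?
;; Equal weights in :p encode the uniform expectation.
(select-keys (stats/chisq-test [18 22 16 25 19 20] {:p [1 1 1 1 1 1]})
             [:stat :df :p-value])

;; Independence: a 2x2 contingency table of counts.
(select-keys (stats/chisq-test [[10 20] [30 40]]) [:stat :df :p-value])
```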
(ci vs)
(ci vs alpha)
Student's t-based confidence interval for the given data. `alpha` defaults to 0.05. The last returned value is the mean.
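For example (same `stats` alias):

```clojure
;; 95% t-based interval; the last value is the mean.
(stats/ci [2 4 4 4 5 5 7 9])
```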
(cliffs-delta [group1 group2])
(cliffs-delta group1 group2)
Calculates Cliff's Delta (δ), a non-parametric effect size measure for assessing the difference between two groups of ordinal or continuous data.

Cliff's Delta quantifies the degree of overlap between two distributions. It represents the probability that a randomly chosen value from the first group is greater than a randomly chosen value from the second group, minus the reverse probability.

Parameters:

- `group1` (seq of numbers): The first sample.
- `group2` (seq of numbers): The second sample.

Returns the calculated Cliff's Delta value as a double.

Interpretation:

- A value of +1 indicates complete separation where every value in `group1` is greater than every value in `group2`.
- A value of -1 indicates complete separation where every value in `group2` is greater than every value in `group1`.
- A value of 0 indicates complete overlap between the distributions.
- Values between -1 and 1 indicate varying degrees of overlap. Commonly cited guidelines for effect size: |δ| < 0.147 (negligible), 0.147 ≤ |δ| < 0.33 (small), 0.33 ≤ |δ| < 0.474 (medium), |δ| ≥ 0.474 (large).

Cliff's Delta is a robust measure, suitable for ordinal data or when assumptions of parametric tests (like normality or equal variances) are violated. It is closely related to the [[wmw-odds]] (Wilcoxon-Mann-Whitney odds) and the [[ameasure]] (Vargha-Delaney A).

See also [[wmw-odds]], [[ameasure]], [[cohens-d]], [[glass-delta]].
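A sketch (same `stats` alias). The values follow directly from the pairwise definition above:

```clojure
;; Complete separation: every group1 value exceeds every group2 value.
(stats/cliffs-delta [6 7 8 9 10] [1 2 3 4 5])
;; => 1.0

;; Partial overlap yields an intermediate value:
;; 19 favorable pairs, 3 unfavorable, out of 25, so (19 - 3) / 25.
(stats/cliffs-delta [3 4 5 6 7] [1 2 3 4 5])
;; => 0.64
```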
(coefficient-matrix vss)
(coefficient-matrix vss measure-fn)
(coefficient-matrix vss measure-fn symmetric?)
Generates a matrix of pairwise coefficients from a sequence of sequences.

This function calculates a matrix where the element at row `i` and column `j` is the result of applying the provided `measure-fn` to the `i`-th sequence and the `j`-th sequence from the input `vss`.

Parameters:

- `vss` (sequence of sequences of numbers): The collection of data sequences. Each inner sequence is treated as a variable or set of observations. All inner sequences should ideally have the same length if the `measure-fn` expects it.
- `measure-fn` (function, optional): A function of two arguments (sequences) that returns a double representing the coefficient or measure between them. Defaults to [[pearson-correlation]].
- `symmetric?` (boolean, optional): If `true`, the function assumes that `measure-fn(a, b)` is equal to `measure-fn(b, a)`. It calculates the upper (or lower) triangle of the matrix and mirrors the values to the other side. This is an optimization for symmetric measures like correlation and covariance. If `false` (default), all pairwise combinations `(i, j)` are calculated independently.

Returns a sequence of sequences (a matrix) of doubles.

Note: While this function's `symmetric?` parameter defaults to `false`, convenience functions like [[correlation-matrix]] and [[covariance-matrix]] wrap this function and explicitly set `symmetric?` to `true` as their respective measures are symmetric.

See also [[correlation-matrix]], [[covariance-matrix]].
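A sketch (same `stats` alias). Since [[pearson-correlation]] is symmetric, `symmetric?` can be set to `true`:

```clojure
(let [x (range 10)
      y (map #(* 2.0 %) x)   ; perfectly correlated with x
      z (map - x)]           ; perfectly anti-correlated with x
  (stats/coefficient-matrix [x y z] stats/pearson-correlation true))
;; diagonal entries are 1.0; x-y entries are 1.0; x-z entries are -1.0
```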
(cohens-d [group1 group2])
(cohens-d group1 group2)
(cohens-d group1 group2 method)
Calculate Cohen's d effect size between two groups. Cohen's d is a standardized measure used to quantify the magnitude of the difference between the means of two independent groups. It expresses the mean difference in terms of standard deviation units. The most common formula for Cohen's d is: d = (mean(group1) - mean(group2)) / pooled_stddev where `pooled_stddev` is the pooled standard deviation of the two groups, calculated under the assumption of equal variances. Parameters: - `group1` (seq of numbers): The first independent sample. - `group2` (seq of numbers): The second independent sample. - `method` (optional keyword): Specifies the method for calculating the pooled standard deviation, affecting the denominator of the formula. Possible values are `:unbiased` (default), `:biased`, or `:avg`. See [[pooled-stddev]] for details on these methods. Returns the calculated Cohen's d effect size as a double. Interpretation guidelines (approximate for normal distributions): - |d| = 0.2: small effect - |d| = 0.5: medium effect - |d| = 0.8: large effect Assumptions: - The two samples are independent. - Data within each group are approximately normally distributed. - The choice of `:method` implies assumptions about equal variances (the default `:unbiased` and `:biased` assume equal variances, while `:avg` does not, though it may be less standard). See also [[hedges-g]] (a version bias-corrected for small sample sizes), [[glass-delta]] (an alternative effect size measure using the control group standard deviation), [[pooled-stddev]].
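A minimal sketch of the three arities; the `stats` alias and the sample data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

(def group1 [22 25 27 30 31 33])
(def group2 [18 20 21 24 26 28])

(stats/cohens-d group1 group2)          ;; default :unbiased pooling
(stats/cohens-d group1 group2 :biased)  ;; alternative pooled stddev
(stats/cohens-d [group1 group2])        ;; both groups in one argument
```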
(cohens-d-corrected [group1 group2])
(cohens-d-corrected group1 group2)
(cohens-d-corrected group1 group2 method)
Calculates Cohen's d effect size corrected for bias in small sample sizes. This function applies a correction factor (derived from the gamma function) to Cohen's d ([[cohens-d]]) to provide a less biased estimate of the population effect size when sample sizes are small. This corrected measure is sometimes referred to as Hedges' g, though this function specifically implements the correction applied to Cohen's d. The correction factor is `(1 - 3 / (4 * df - 1))` where `df` is the degrees of freedom used in the standard Cohen's d calculation. Parameters: - `group1` (seq of numbers): The first independent sample. - `group2` (seq of numbers): The second independent sample. - `method` (optional keyword): Specifies the method for calculating the pooled standard deviation, affecting the denominator of the formula (passed to [[cohens-d]]). Possible values are `:unbiased` (default), `:biased`, or `:avg`. See [[pooled-stddev]] for details on these methods. Returns the calculated bias-corrected Cohen's d effect size as a double. Note: While this function is named `cohens-d-corrected`, Hedges' g (calculated by [[hedges-g-corrected]]) also applies a similar small-sample bias correction. Differences might exist based on the specific correction formula or degree of freedom definition used. This function uses `(count group1) + (count group2) - 2` as the degrees of freedom for the correction by default (when `:unbiased` method is used for `cohens-d`). See also [[cohens-d]], [[hedges-g]], [[hedges-g-corrected]].
(cohens-f [group1 group2])
(cohens-f group1 group2)
(cohens-f group1 group2 type)
Calculates Cohen's f, a measure of effect size derived as the square root of Cohen's f² ([[cohens-f2]]). Cohen's f is a standardized measure quantifying the magnitude of an effect, often used in the context of ANOVA or regression. It is the square root of the ratio of the variance explained by the effect to the unexplained variance. Parameters: - `group1` (seq of numbers): The dependent variable. - `group2` (seq of numbers): The independent variable (or predictor). Must have the same length as `group1`. - `type` (keyword, optional): Specifies the measure of 'Proportion of Variance Explained' used in the underlying [[cohens-f2]] calculation. Defaults to `:eta`. - `:eta` (default): Uses Eta-squared (sample R²), a measure of variance explained in the sample. - `:omega`: Uses Omega-squared, a less biased estimate of variance explained in the population. - `:epsilon`: Uses Epsilon-squared, another less biased estimate of variance explained in the population. - Any function: A function accepting `group1` and `group2` and returning a double representing the proportion of variance explained. Returns the calculated Cohen's f effect size as a double. Values range from 0 upwards. Interpretation: - Values are positive. Larger values indicate a stronger effect (more variance in `group1` explained by `group2`). - Cohen's guidelines for interpreting the magnitude of f² (and by extension, f) are: - $f = 0.10$ (approx. $f^2 = 0.01$): small effect - $f = 0.25$ (approx. $f^2 = 0.0625$): medium effect - $f = 0.40$ (approx. $f^2 = 0.16$): large effect (Note: Guidelines are often quoted for f², interpret f as $\sqrt{f^2}$) See also [[cohens-f2]], [[eta-sq]], [[omega-sq]], [[epsilon-sq]].
(cohens-f2 [group1 group2])
(cohens-f2 group1 group2)
(cohens-f2 group1 group2 type)
Calculates Cohen's f², a measure of effect size often used in ANOVA or regression. Cohen's f² quantifies the magnitude of the effect of an independent variable or set of predictors on a dependent variable, expressed as the ratio of the variance explained by the effect to the unexplained variance. This function allows calculating f² using different measures for the 'Proportion of Variance Explained', specified by the `type` parameter: - `:eta` (default): Uses [[eta-sq]] (Eta-squared), which in this implementation is equivalent to the sample $R^2$ from a linear regression of `group1` on `group2`. This is a measure of the proportion of variance explained in the sample. - `:omega`: Uses [[omega-sq]] (Omega-squared), a less biased estimate of the proportion of variance explained in the population. - `:epsilon`: Uses [[epsilon-sq]] (Epsilon-squared), another less biased estimate of the proportion of variance explained in the population, similar to adjusted $R^2$. - Any function: A function accepting `group1` and `group2` and returning a double representing the proportion of variance explained. Parameters: - `group1` (seq of numbers): The dependent variable. - `group2` (seq of numbers): The independent variable (or predictor). Must have the same length as `group1`. - `type` (keyword, optional): Specifies the measure of 'Proportion of Variance Explained' to use (`:eta`, `:omega`, `:epsilon` or any function). Defaults to `:eta`. Returns the calculated Cohen's f² effect size as a double. Values range from 0 upwards. Interpretation Guidelines (approximate, often used for F-tests in ANOVA/regression): - $f^2 = 0.02$: small effect - $f^2 = 0.15$: medium effect - $f^2 = 0.35$: large effect See also [[cohens-f]], [[eta-sq]], [[omega-sq]], [[epsilon-sq]].
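A brief sketch of the `type` variants and the documented relation to [[cohens-f]]; the alias and data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

(def y [2.1 2.9 4.2 5.1 5.9 7.2]) ;; dependent variable
(def x [1.0 2.0 3.0 4.0 5.0 6.0]) ;; predictor, same length

(stats/cohens-f2 y x)          ;; default :eta (sample R² based)
(stats/cohens-f2 y x :omega)   ;; less biased population estimate
(stats/cohens-f2 y x :epsilon) ;; adjusted-R²-like estimate

;; cohens-f is documented as the square root of cohens-f2:
(stats/cohens-f y x)           ;; ≈ (Math/sqrt (stats/cohens-f2 y x))
```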
(cohens-kappa contingency-table)
(cohens-kappa group1 group2)
Calculates Cohen's Kappa coefficient (κ), a statistic that measures inter-rater agreement for categorical items, while correcting for chance agreement. It is often used to assess the consistency of agreement between two raters or methods. Its value typically ranges from -1 to +1: - `κ = 1`: Perfect agreement. - `κ = 0`: Agreement is no better than chance. - `κ < 0`: Agreement is worse than chance. The function can be called in two ways: 1. With two sequences `group1` and `group2`: The function will automatically construct a 2x2 contingency table from the unique values in the sequences (assuming they represent two binary variables). The mapping of values to table cells (e.g., what corresponds to TP, TN, FP, FN) depends on how `contingency-table` orders the unique values. For direct control over which cell is which, use the contingency table input. 2. With a contingency table: The contingency table can be provided as: - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] TP, [0 1] FP, [1 0] FN, [1 1] TN}`). This is the output format of [[contingency-table]] with two inputs. The mapping of indices to TP/TN/FP/FN depends on the order of unique values in the original data if generated by [[contingency-table]], or the explicit structure if created manually or via [[rows->contingency-table]]. Standard convention maps `[0 0]` to TP, `[0 1]` to FP, `[1 0]` to FN, and `[1 1]` to TN for binary outcomes. - A sequence of sequences representing the rows of the table (e.g., `[[TP FP] [FN TN]]`). This is equivalent to [[rows->contingency-table]]. Parameters: - `group1` (sequence): The first sequence of binary outcomes/categories. - `group2` (sequence): The second sequence of binary outcomes/categories. Must have the same length as `group1`. - `contingency-table` (map or sequence of sequences): A pre-computed 2x2 contingency table. The cell values should represent counts (e.g., TP, FN, FP, TN). Returns the calculated Cohen's Kappa coefficient as a double. See also [[weighted-kappa]] (for ordinal data with partial agreement), [[contingency-table]], [[contingency-2x2-measures]], [[binary-measures-all]].
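A minimal sketch of both call styles; the alias and rater data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

;; Two raters labelling the same ten items.
(def rater1 [:yes :yes :no :yes :no :no :yes :yes :no :yes])
(def rater2 [:yes :no  :no :yes :no :yes :yes :yes :no :yes])

;; From raw sequences - a 2x2 table is built internally.
(stats/cohens-kappa rater1 rater2)

;; From explicit row counts, [[TP FP] [FN TN]] by convention.
(stats/cohens-kappa [[20 5] [10 15]])
```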
(cohens-q r1 r2)
(cohens-q group1 group2a group2b)
(cohens-q group1a group2a group1b group2b)
Compares two correlation coefficients by calculating the difference between their Fisher z-transformations. The Fisher z-transformation (`atanh`) of a correlation coefficient `r` helps normalize the sampling distribution of correlation coefficients. The difference between two z'-transformed correlations is often used as a test statistic. The function supports comparing correlations in different scenarios via its arities: - `(cohens-q r1 r2)`: Calculates the difference between the Fisher z-transformations of two correlation values `r1` and `r2` provided directly. This is typically used when comparing two *independent* correlation coefficients (e.g., correlations from two separate studies). Returns `atanh(r1) - atanh(r2)`. - `r1`, `r2` (double): Correlation coefficient values (-1.0 to 1.0). - `(cohens-q group1 group2a group2b)`: Calculates the difference between the correlation of `group1` with `group2a` and the correlation of `group1` with `group2b`. This is commonly used for comparing *dependent* correlations (where `group1` is a common variable). Calculates `atanh(pearson-correlation(group1, group2a)) - atanh(pearson-correlation(group1, group2b))`. - `group1`, `group2a`, `group2b` (sequences): Data sequences from which Pearson correlations are computed. - `(cohens-q group1a group2a group1b group2b)`: Calculates the difference between the correlation of `group1a` with `group2a` and the correlation of `group1b` with `group2b`. This is typically used for comparing two *independent* correlations obtained from two distinct pairs of variables (all four sequences are independent). Calculates `atanh(pearson-correlation(group1a, group2a)) - atanh(pearson-correlation(group1b, group2b))`. - `group1a`, `group2a`, `group1b`, `group2b` (sequences): Data sequences from which Pearson correlations are computed. Returns the difference between the Fisher z-transformed correlation values as a double. Note: For comparing dependent correlations (3-arity case), standard statistical tests (e.g., Steiger's test) are more complex than a simple difference of z-transforms and involve the correlation between `group2a` and `group2b`. This function provides the basic difference value.
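A quick sketch of the 2- and 3-arity forms; the alias and data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

;; Two independent correlation values compared directly:
(stats/cohens-q 0.6 0.3)   ;; atanh(0.6) - atanh(0.3)

;; Dependent correlations sharing the common variable x:
(def x [1 2 3 4 5 6])
(def a [2 4 5 4 5 7])
(def b [6 5 4 4 3 1])
(stats/cohens-q x a b)     ;; atanh(r(x,a)) - atanh(r(x,b))
```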
(cohens-u1 [group1 group2])
(cohens-u1 group1 group2)
Calculates a non-parametric measure of difference or separation between two samples. This function computes a value derived from [[cohens-u2]], which internally quantifies a minimal difference between corresponding quantiles of the two empirical distributions. Parameters: - `group1` (seq of numbers): The first sample. - `group2` (seq of numbers): The second sample. Returns the calculated measure as a double. Interpretation: - Values close to -1 indicate high similarity or maximum overlap between the distributions (as the minimal difference between quantiles approaches zero). - Increasing values indicate greater difference or separation between the distributions (as the minimal difference between quantiles is larger). This measure is symmetric, meaning the order of `group1` and `group2` does not affect the result. It is a non-parametric measure applicable to any data samples. See also [[cohens-u2]] (the measure this calculation is based on), [[cohens-u3]] (related non-parametric measure), [[cohens-u1-normal]] (the version applicable to normal data).
(cohens-u1-normal d)
(cohens-u1-normal group1 group2)
(cohens-u1-normal group1 group2 method)
Calculates Cohen's U1, a measure of non-overlap between two distributions assumed to be normal with equal variances. Cohen's U1 quantifies the proportion of the two distributions that does not overlap. A U1 of 0 means complete overlap (the distributions are identical), while a U1 of 1 means no overlap (complete separation). This measure is calculated directly from Cohen's d statistic ([[cohens-d]]) assuming normal distributions and equal variances. Parameters: - `group1` (seq of numbers): The first sample. - `group2` (seq of numbers): The second sample. - `method` (optional keyword): Specifies the method for calculating the pooled standard deviation used in the underlying [[cohens-d]] calculation. Possible values are `:unbiased` (default), `:biased`, or `:avg`. See [[pooled-stddev]] for details. - `d` (double): A pre-calculated Cohen's d value. If provided, `group1`, `group2`, and `method` are ignored. Returns the calculated Cohen's U1 as a double [0, 1]. Assumptions: - Both samples are drawn from normally distributed populations. - The populations have equal variances (homoscedasticity). See also [[cohens-d]], [[cohens-u2-normal]], [[cohens-u3-normal]], [[p-overlap]] (a non-parametric overlap measure).
(cohens-u2 [group1 group2])
(cohens-u2 group1 group2)
Calculates a measure of overlap between two samples, referred to as Cohen's U2. This function quantifies the degree to which the distributions of `group1` and `group2` overlap. It is related to comparing values at corresponding percentile levels across the two groups or the proportion of values in one group that are below the median of the other. A value of 0.5 indicates complete overlap (the distributions are identical), while values approaching 1 indicate no overlap (complete separation). The measure is symmetric, meaning `(cohens-u2 group1 group2)` is equal to `(cohens-u2 group2 group1)`. This is a non-parametric measure, suitable for any data samples, and does not assume normality, unlike [[cohens-u2-normal]]. Parameters: - `group1`, `group2` (sequences): The two samples directly as arguments. Returns the calculated Cohen's U2 value as a double. The value typically ranges from 0.5 to 1. A value closer to 0.5 indicates substantial overlap between the distributions (e.g., the median of one group is near the median of the other); values closer to 1 indicate less overlap (greater separation between the distributions).
(cohens-u2-normal d)
(cohens-u2-normal group1 group2)
(cohens-u2-normal group1 group2 method)
Calculates Cohen's U2, a measure of overlap between two distributions assumed to be normal with equal variances. Cohen's U2 quantifies the proportion of scores in the lower-scoring group that are below the point located halfway between the means of the two groups (or equivalently, the proportion of scores in the higher-scoring group that are above this halfway point). This measure is calculated from Cohen's d statistic ([[cohens-d]]) using the standard normal cumulative distribution function ($\Phi$): $\Phi(0.5 |d|)$. Parameters: - `group1` (seq of numbers): The first sample. - `group2` (seq of numbers): The second sample. - `method` (optional keyword): Specifies the method for calculating the pooled standard deviation used in the underlying [[cohens-d]] calculation. Possible values are `:unbiased` (default), `:biased`, or `:avg`. See [[pooled-stddev]] for details. - `d` (double): A pre-calculated Cohen's d value. If provided, `group1`, `group2`, and `method` are ignored. Returns the calculated Cohen's U2 as a double in [0.5, 1.0] (since $\Phi(0.5|d|) \ge 0.5$). A value closer to 0.5 indicates greater overlap between the distributions; values closer to 1 indicate less overlap. Assumptions: - Both samples are drawn from normally distributed populations. - The populations have equal variances (homoscedasticity). See also [[cohens-d]], [[cohens-u1-normal]], [[cohens-u3-normal]], [[p-overlap]] (a non-parametric overlap measure).
(cohens-u3 [group1 group2])
(cohens-u3 group1 group2)
(cohens-u3 group1 group2 estimation-strategy)
Calculates Cohen's U3 for two samples. In this implementation, Cohen's U3 is defined as the proportion of values in the second sample (`group2`) that are less than the median of the first sample (`group1`). Parameters: - `group1` (seq of numbers): The first sample. The median of this sample is used as the threshold. - `group2` (seq of numbers): The second sample. Values from this sample are counted if they fall below the median of `group1`. - `estimation-strategy` (optional keyword): The strategy used to estimate the median of `group1`. Defaults to `:legacy`. See [[median]] or [[quantile]] for available strategies (e.g., `:r1` through `:r9`). Returns the calculated proportion as a double between 0.0 and 1.0. Interpretation: - A value close to 0 means most values in `group2` are greater than or equal to the median of `group1`. - A value close to 0.5 means approximately half the values in `group2` are below the median of `group1`. - A value close to 1 means most values in `group2` are less than the median of `group1`. Note: This measure is **not symmetric**. `(cohens-u3 group1 group2)` is generally not equal to `(cohens-u3 group2 group1)`. This is a non-parametric measure, suitable for any data samples, and does not assume normality, unlike [[cohens-u3-normal]]. See also [[cohens-u3-normal]] (the version applicable to normal data), [[cohens-u2]] (a related symmetric non-parametric measure), [[median]], [[quantile]].
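A small sketch highlighting the asymmetry; the alias and data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

(def group1 [1 2 3 4 5 6 7 8 9])    ;; median 5
(def group2 [4 5 6 7 8 9 10 11 12])

;; Proportion of group2 below the median of group1:
;; only 4 falls below 5, so roughly 1/9.
(stats/cohens-u3 group1 group2)

;; Swapped arguments give a different answer (not symmetric).
(stats/cohens-u3 group2 group1)

;; With an explicit median estimation strategy.
(stats/cohens-u3 group1 group2 :r7)
```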
(cohens-u3-normal d)
(cohens-u3-normal group1 group2)
(cohens-u3-normal group1 group2 method)
Calculates Cohen's U3, a measure of overlap between two distributions assumed to be normal with equal variances. Cohen's U3 quantifies the proportion of scores in the lower-scoring group that fall below the mean of the higher-scoring group. It is calculated from Cohen's d statistic ([[cohens-d]]) using the standard normal cumulative distribution function ($\Phi$): `U3 = Φ(d)`. The measure is asymmetric: `U3(group1, group2)` is not necessarily equal to `U3(group2, group1)`. The interpretation depends on which group is considered the 'higher-scoring' one based on the sign of d. By convention, the result often represents the proportion of the *first* group (`group1`) that is below the mean of the *second* group (`group2`) if d is negative, or the proportion of the *second* group (`group2`) that is below the mean of the *first* group (`group1`) if d is positive. Parameters: - `group1` (seq of numbers): The first sample. - `group2` (seq of numbers): The second sample. - `method` (optional keyword): Specifies the method for calculating the pooled standard deviation used in the underlying [[cohens-d]] calculation. Possible values are `:unbiased` (default), `:biased`, or `:avg`. See [[pooled-stddev]] for details. - `d` (double): A pre-calculated Cohen's d value. If provided, `group1`, `group2`, and `method` are ignored. Returns the calculated Cohen's U3 as a double [0.0, 1.0]. A value close to 0.5 suggests significant overlap. Values closer to 0 or 1 suggest less overlap (greater separation between the means). Assumptions: - Both samples are drawn from normally distributed populations. - The populations have equal variances (homoscedasticity). See also [[cohens-d]], [[cohens-u1-normal]], [[cohens-u2-normal]], [[p-overlap]] (a non-parametric overlap measure).
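Since all three normal-theory U measures derive from Cohen's d, their single-argument arities can share one pre-computed d. A minimal sketch (the alias is illustrative):

```clojure
(require '[fastmath.stats :as stats])

(let [d 0.8]
  {:u1 (stats/cohens-u1-normal d)    ;; proportion of non-overlap
   :u2 (stats/cohens-u2-normal d)    ;; Phi(0.5*|d|)
   :u3 (stats/cohens-u3-normal d)})  ;; Phi(d)
```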
(cohens-w contingency-table)
(cohens-w group1 group2)
Calculates Cohen's W effect size for the association between two nominal variables represented in a contingency table. Cohen's W is a measure of association derived from the Pearson's Chi-squared statistic. It quantifies the magnitude of the difference between the observed frequencies and the frequencies expected under the assumption of independence between the variables. Its value ranges from 0 upwards: - A value of 0 indicates no association between the variables. - Larger values indicate a stronger association. The function can be called in two ways: 1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences. 2. With a contingency table: The contingency table can be provided as: - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs. - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]]. Parameters: - `group1` (sequence): The first sequence of categorical data. - `group2` (sequence): The second sequence of categorical data. Must have the same length as `group1`. - `contingency-table` (map or sequence of sequences): A pre-computed contingency table. Returns the calculated Cohen's W coefficient as a double. See also [[chisq-test]], [[cramers-v]], [[cramers-c]], [[tschuprows-t]], [[contingency-table]].
(confusion-matrix confusion-mat)
(confusion-matrix actual prediction)
(confusion-matrix actual prediction encode-true)
(confusion-matrix tp fn fp tn)
Creates a 2x2 confusion matrix for binary classification. A confusion matrix summarizes the results of a binary classification problem, showing the counts of True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). TP: Actual is True, Predicted is True FP: Actual is False, Predicted is True (Type I error) FN: Actual is True, Predicted is False (Type II error) TN: Actual is False, Predicted is False The function supports several input formats: 1. `(confusion-matrix tp fn fp tn)`: Direct input of the four counts. - `tp` (long): True Positive count. - `fn` (long): False Negative count. - `fp` (long): False Positive count. - `tn` (long): True Negative count. 2. `(confusion-matrix confusion-matrix-representation)`: Input as a structured representation. - `confusion-matrix-representation`: Can be: - A map with keys like `:tp`, `:fn`, `:fp`, `:tn` (e.g., `{:tp 10 :fn 2 :fp 5 :tn 80}`). - A sequence of sequences representing rows `[[TP FP] [FN TN]]` (e.g., `[[10 5] [2 80]]`). - A flat sequence `[TP FN FP TN]` (e.g., `[10 2 5 80]`). 3. `(confusion-matrix actual prediction)`: Input as two sequences of outcomes. - `actual` (sequence): Sequence of true outcomes. - `prediction` (sequence): Sequence of predicted outcomes. Must have the same length as `actual`. Values in `actual` and `prediction` are compared element-wise. By default, any non-`nil` or non-zero value is treated as `true`, and `nil` or `0.0` is treated as `false`. 4. `(confusion-matrix actual prediction encode-true)`: Input as two sequences with a specified encoding for `true`. - `actual`, `prediction`: Sequences as in the previous arity. - `encode-true`: Specifies how values in `actual` and `prediction` are converted to boolean `true` or `false`. - `nil` (default): Non-`nil`/non-zero is true. - Any sequence/set: Values found in this collection are true. - A map: Values are mapped according to the map; if a key is not found or maps to `false`, the value is false. - A predicate function: Returns `true` if the value satisfies the predicate. Returns a map with keys `:tp`, `:fn`, `:fp`, and `:tn` representing the counts. This function is commonly used to prepare input for binary classification metrics like those provided by [[binary-measures-all]] and [[binary-measures]].
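A minimal sketch of the count-based and sequence-based arities; the alias and data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

;; Direct counts: tp, fn, fp, tn.
(stats/confusion-matrix 10 2 5 80)
;; => {:tp 10 :fn 2 :fp 5 :tn 80}

;; From actual vs. predicted sequences, with a set encoding truth.
(def actual     [:pos :pos :neg :pos :neg :neg])
(def prediction [:pos :neg :neg :pos :pos :neg])
(stats/confusion-matrix actual prediction #{:pos})
```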
(contingency-2x2-measures & args)
Calculates a subset of common statistics and measures for a 2x2 contingency table. This function provides a selection of the most frequently used measures from the more comprehensive [[contingency-2x2-measures-all]]. The function accepts the same input formats as [[contingency-2x2-measures-all]]: 1. `(contingency-2x2-measures a b c d)`: Takes the four counts as arguments. 2. `(contingency-2x2-measures [a b c d])`: Takes a sequence of the four counts. 3. `(contingency-2x2-measures [[a b] [c d]])`: Takes a sequence of sequences representing the rows. 4. `(contingency-2x2-measures {:a a :b b :c c :d d})`: Takes a map of counts (accepts `:a/:b/:c/:d` keys). Parameters: - `a, b, c, d` (long): Counts in the 2x2 table cells. - `map-or-seq` (map or sequence): A representation of the 2x2 table. Returns a map containing a selection of measures: - `:OR`: Odds Ratio - `:chi2`: Pearson's Chi-squared statistic - `:yates`: Yates' continuity corrected Chi-squared statistic - `:cochran-mantel-haenszel`: Cochran-Mantel-Haenszel statistic - `:cohens-kappa`: Cohen's Kappa coefficient - `:yules-q`: Yule's Q measure of association - `:holley-guilfords-g`: Holley-Guilford's G measure - `:huberts-gamma`: Hubert's Gamma measure - `:yules-y`: Yule's Y measure of association - `:cramers-v`: Cramer's V measure of association - `:phi`: Phi coefficient (Matthews Correlation Coefficient) - `:scotts-pi`: Scott's Pi measure of agreement - `:cohens-h`: Cohen's H measure - `:PCC`: Pearson's Contingency Coefficient - `:PCC-adjusted`: Adjusted Pearson's Contingency Coefficient - `:TCC`: Tschuprow's Contingency Coefficient - `:F1`: F1 Score - `:bangdiwalas-b`: Bangdiwala's B statistic - `:mcnemars-chi2`: McNemar's Chi-squared test statistic - `:gwets-ac1`: Gwet's AC1 measure For a more comprehensive set of 2x2 measures and their detailed descriptions, see [[contingency-2x2-measures-all]].
(contingency-2x2-measures-all map-or-seq)
(contingency-2x2-measures-all [a b] [c d])
(contingency-2x2-measures-all a b c d)
Calculates a comprehensive set of statistics and measures for a 2x2 contingency table. A 2x2 contingency table cross-tabulates two categorical variables, each with two levels. The table counts are typically represented as: +---+---+ | a | b | +---+---+ | c | d | +---+---+ Where `a, b, c, d` are the counts in the respective cells. This function calculates numerous measures, including: * Chi-squared statistics (Pearson, Yates' corrected, CMH) and their p-values. * Measures of association (Phi, Yule's Q, Holley-Guilford's G, Hubert's Gamma, Yule's Y, Cramer's V, Scott's Pi, Cohen's H, Pearson/Tschuprow's CC). * Measures of agreement (Cohen's Kappa). * Risk and effect size measures (Odds Ratio (OR), Relative Risk (RR), Risk Difference (RD), NNT, etc.). * Table marginals and proportions. The function can be called with the four counts directly or with a representation of the contingency table: 1. `(contingency-2x2-measures-all a b c d)`: Takes the four counts as arguments. 2. `(contingency-2x2-measures-all [a b c d])`: Takes a sequence of the four counts. 3. `(contingency-2x2-measures-all [[a b] [c d]])`: Takes a sequence of sequences representing the rows. 4. `(contingency-2x2-measures-all {:a a :b b :c c :d d})`: Takes a map of counts (accepts `:a/:b/:c/:d` keys). Parameters: - `a` (long): Count in the top-left cell. - `b` (long): Count in the top-right cell. - `c` (long): Count in the bottom-left cell. - `d` (long): Count in the bottom-right cell. - `map-or-seq` (map or sequence): A representation of the 2x2 table as described above. Returns a map containing a wide range of calculated statistics. Keys include: `:n`, `:table`, `:expected`, `:marginals`, `:proportions`, `:p-values` (map), `:OR`, `:lOR`, `:RR`, `:risk` (map), `:SE`, `:measures` (map). See also [[contingency-2x2-measures]] for a selected subset of these measures, [[mcc]] for the Matthews Correlation Coefficient (Phi), and [[binary-measures-all]] for metrics derived from a confusion matrix (often a 2x2 table in binary classification).
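A sketch of the interchangeable input shapes; the alias is illustrative and all four calls describe the same table:

```clojure
(require '[fastmath.stats :as stats])

;; Table:  | 10  5 |
;;         |  3 12 |
(stats/contingency-2x2-measures-all 10 5 3 12)
(stats/contingency-2x2-measures-all [10 5 3 12])
(stats/contingency-2x2-measures-all [[10 5] [3 12]])
(stats/contingency-2x2-measures-all {:a 10 :b 5 :c 3 :d 12})

;; Picking a few documented keys out of the result map.
(select-keys (stats/contingency-2x2-measures-all 10 5 3 12)
             [:n :OR :RR])
```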
(contingency-table & seqs)
Creates a frequency map (contingency table) from one or more sequences. If one sequence `xs` is provided, it returns a simple frequency map of the values in `xs`. If multiple sequences `s1, s2, ..., sn` are provided, it creates a contingency table of the tuples formed by the corresponding elements `[s1_i, s2_i, ..., sn_i]` at each index `i`. The returned map keys are these tuples, and values are their frequencies. Parameters: - `seqs` (one or more sequences): The input sequences. All sequences should ideally have the same length, as elements are paired by index. Returns a map where keys represent unique combinations of values (or single values if only one sequence is input) and values are the counts of these combinations. See also [[rows->contingency-table]], [[contingency-table->marginals]].
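A small sketch of the one- and two-sequence cases (alias illustrative):

```clojure
(require '[fastmath.stats :as stats])

;; One sequence: a plain frequency map.
(stats/contingency-table [:a :b :a :c :a :b])
;; => {:a 3, :b 2, :c 1}

;; Two sequences: keys are [value1 value2] tuples, values are counts.
(stats/contingency-table [:x :x :y :y :x]
                         [:p :q :p :p :p])
```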
(contingency-table->marginals ct)
Calculates marginal sums (row and column totals) and the grand total from a contingency table. A contingency table represents the frequency distribution of observations for two or more categorical variables. This function summarizes these frequencies along the rows and columns. The function accepts two main input formats for the contingency table: 1. A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This format is produced by [[contingency-table]] when given multiple sequences or by [[rows->contingency-table]]. 2. A sequence of sequences representing the rows of the table, where each inner sequence contains counts for the columns in that row (e.g., `[[10 5] [3 12]]`). The function internally converts this format to the map format. Parameters: - `ct` (map or sequence of sequences): The contingency table input. Returns a map containing: - `:rows`: A sequence of `[row-index, row-total]` pairs. - `:cols`: A sequence of `[column-index, column-total]` pairs. - `:n`: The grand total of all counts in the table. - `:diag`: A sequence of `[[index, index], count]` pairs for cells on the diagonal (where row index equals column index). This is useful for square tables like confusion matrices. See also [[contingency-table]], [[rows->contingency-table]].
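A sketch of both accepted table formats (alias illustrative):

```clojure
(require '[fastmath.stats :as stats])

;; Map form, keyed by [row col] tuples.
(stats/contingency-table->marginals {[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12})

;; Row form is converted internally; both describe the same table,
;; so each returns row/column totals, the grand total :n (here 30),
;; and the diagonal cells.
(stats/contingency-table->marginals [[10 5] [3 12]])
```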
(correlation [vs1 vs2])
(correlation vs1 vs2)
Calculates the correlation coefficient between two sequences. By default, this function calculates the Pearson product-moment correlation coefficient, which measures the linear relationship between two datasets. This function handles the standard deviation normalization based on whether the inputs `vs1` and `vs2` are treated as samples or populations (it uses sample standard deviation derived from [[variance]]). Parameters: - `[vs1 vs2]` (sequence of two sequences): A sequence containing the two sequences of numbers. - `vs1`, `vs2` (sequences): The two sequences of numbers directly as arguments. Both sequences must have the same length. Returns the calculated correlation coefficient (a value between -1.0 and 1.0) as a double. Returns `NaN` if one or both sequences have zero variance (are constant). See also [[covariance]], [[pearson-correlation]], [[spearman-correlation]], [[kendall-correlation]], [[correlation-matrix]].
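A minimal sketch, including the documented `NaN` case; the alias and data are illustrative:

```clojure
(require '[fastmath.stats :as stats])

(def xs [1 2 3 4 5])
(def ys [2 4 5 4 5])

(stats/correlation xs ys)    ;; Pearson by default
(stats/correlation [xs ys])  ;; same result via the packed arity

;; A constant sequence has zero variance, so this yields NaN.
(stats/correlation [1 1 1 1 1] ys)
```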
(correlation-matrix vss)
(correlation-matrix vss measure)
Generates a matrix of pairwise correlation coefficients from a sequence of sequences.

Given a collection of data sequences `vss`, where each inner sequence represents a variable, this function calculates a square matrix where the element at row `i` and column `j` is the correlation coefficient between the `i`-th and `j`-th sequences in `vss`.

Parameters:

- `vss` (sequence of sequences of numbers): The collection of data sequences. Each inner sequence is treated as a variable. All inner sequences must have the same length.
- `measure` (keyword, optional): Specifies the type of correlation coefficient to calculate. Defaults to `:pearson`.
  - `:pearson` (default): Calculates the Pearson product-moment correlation coefficient.
  - `:kendall`: Calculates Kendall's Tau rank correlation coefficient.
  - `:spearman`: Calculates Spearman's rank correlation coefficient.

Returns a sequence of sequences (a matrix) of doubles representing the correlation matrix. The matrix is symmetric, as correlation is a symmetric measure.

See also [[pearson-correlation]], [[spearman-correlation]], [[kendall-correlation]], [[covariance-matrix]], [[coefficient-matrix]].
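A short sketch of both arities, under the same assumed `stats` alias:

```clojure
;; Three variables, four observations each; the result is a 3x3
;; symmetric matrix with 1.0 on the diagonal.
(stats/correlation-matrix [[1 2 3 4]
                           [2 4 6 8]
                           [4 3 2 1]])

;; Rank-based variant selected by keyword.
(stats/correlation-matrix [[1 2 3 4] [2 4 6 8] [4 3 2 1]] :spearman)
```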
(count= [vs1 vs2-or-val])
(count= vs1 vs2-or-val)
Count equal values in both seqs. Same as [[L0]].

Calculates the number of pairs of corresponding elements that are equal between two sequences, or between a sequence and a single scalar value.

Parameters:

- `vs1` (sequence of numbers): The first sequence.
- `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`.

If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times.

Returns the count of equal elements as a long integer.
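Two illustrative calls (assumed `stats` alias):

```clojure
;; Three of the four positions match.
(stats/count= [1 2 3 4] [1 0 3 4])
;; => 3

;; Scalar form: counts the elements of vs1 equal to 1.
(stats/count= [1 2 1 1] 1)
;; => 3
```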
(covariance [vs1 vs2])
(covariance vs1 vs2)
Covariance of two sequences. This function calculates the *sample* covariance.

Parameters:

- `[vs1 vs2]` (sequence of two sequences): A sequence containing the two sequences of numbers.
- `vs1`, `vs2` (sequences): The two sequences of numbers directly as arguments.

Both sequences must have the same length.

Returns the calculated sample covariance as a double.

See also [[correlation]], [[covariance-matrix]].
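A small worked example (assumed `stats` alias): with means 2 and 4, the sum of cross-deviations is 4, divided by n - 1 = 2:

```clojure
;; Sample covariance divides by n - 1.
(stats/covariance [1 2 3] [2 4 6])
;; => 2.0
```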
(covariance-matrix vss)
Generates a matrix of pairwise covariance coefficients from a sequence of sequences.

Given a collection of data sequences `vss`, where each inner sequence represents a variable, this function calculates a square matrix where the element at row `i` and column `j` is the sample covariance between the `i`-th and `j`-th sequences in `vss`.

Parameters:

- `vss` (sequence of sequences of numbers): The collection of data sequences. Each inner sequence is treated as a variable. All inner sequences must have the same length.

Returns a sequence of sequences (a matrix) of doubles representing the covariance matrix. The matrix is symmetric, as covariance is a symmetric measure ($Cov(X,Y) = Cov(Y,X)$).

Internally uses [[coefficient-matrix]] with the [[covariance]] function and `symmetric?` set to `true`.

See also [[covariance]], [[correlation-matrix]], [[coefficient-matrix]].
(cramers-c contingency-table)
(cramers-c group1 group2)
Calculates Cramer's C, a measure of association (effect size) between two nominal variables represented in a contingency table.

Its value ranges from 0 to 1, where 0 indicates no association and 1 indicates a perfect association. It is particularly useful for tables larger than 2x2.

The function can be called in two ways:

1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences.
2. With a contingency table, provided as:
   - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs.
   - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]].

Parameters:

- `group1` (sequence): The first sequence of categorical data.
- `group2` (sequence): The second sequence of categorical data. Must have the same length as `group1`.
- `contingency-table` (map or sequence of sequences): A pre-computed contingency table.

Returns the calculated Cramer's C coefficient as a double.

See also [[chisq-test]], [[cramers-v]], [[cohens-w]], [[tschuprows-t]], [[contingency-table]].
(cramers-v contingency-table)
(cramers-v group1 group2)
Calculates Cramer's V, a measure of association (effect size) between two nominal variables represented in a contingency table.

Its value ranges from 0 to 1, where 0 indicates no association and 1 indicates a perfect association. It is related to the Pearson's Chi-squared statistic and is useful for tables of any size.

The function can be called in two ways:

1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences.
2. With a contingency table, provided as:
   - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs.
   - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]].

Parameters:

- `group1` (sequence): The first sequence of categorical data.
- `group2` (sequence): The second sequence of categorical data. Must have the same length as `group1`.
- `contingency-table` (map or sequence of sequences): A pre-computed contingency table.

Returns the calculated Cramer's V coefficient as a double.

See also [[chisq-test]], [[cramers-c]], [[cohens-w]], [[tschuprows-t]], [[contingency-table]].
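A sketch of both call styles (assumed `stats` alias); the same shapes apply to [[cramers-c]] and [[cramers-v-corrected]]:

```clojure
;; From raw categorical observations.
(stats/cramers-v [:a :a :b :b :b] [:x :y :x :x :y])

;; From a pre-computed contingency table given as rows.
(stats/cramers-v [[10 5] [3 12]])
```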
(cramers-v-corrected contingency-table)
(cramers-v-corrected group1 group2)
Calculates the **corrected Cramer's V**, a measure of association (effect size) between two nominal variables represented in a contingency table, with a correction to reduce bias, particularly for small sample sizes or tables with many cells having small expected counts.

Like the uncorrected Cramer's V ([[cramers-v]]), its value ranges from 0 to 1, where 0 indicates no association and 1 indicates a perfect association. The correction tends to yield a value closer to the true population value in biased situations.

The function can be called in two ways:

1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences.
2. With a contingency table, provided as:
   - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs.
   - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]].

Parameters:

- `group1` (sequence): The first sequence of categorical data.
- `group2` (sequence): The second sequence of categorical data. Must have the same length as `group1`.
- `contingency-table` (map or sequence of sequences): A pre-computed contingency table.

Returns the calculated corrected Cramer's V coefficient as a double.

See also [[chisq-test]], [[cramers-v]] (uncorrected), [[cramers-c]], [[cohens-w]], [[tschuprows-t]], [[contingency-table]].
(cressie-read-test contingency-table-or-xs)
(cressie-read-test contingency-table-or-xs params)
Cressie-Read test, a power divergence test for `lambda` = 2/3.

Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table.

Usage:

1. **Goodness-of-Fit (GOF):**
   - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights).
   - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins.
2. **Test for Independence:**
   - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored.

Options map:

* `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values:
  * `1.0`: Pearson Chi-squared test ([[chisq-test]]).
  * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]).
  * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]).
  * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]).
  * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]).
  * `2/3`: Cressie-Read test (default, [[cressie-read-test]]).
* `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests.
* `:alpha` (double, default: `0.05`): Significance level for confidence intervals.
* `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`).
* `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`).
* `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation.
* `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom.
* `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation.

Returns a map containing:

- `:stat`: The calculated power divergence test statistic.
- `:chi2`: Alias for `:stat`.
- `:df`: Degrees of freedom for the test.
- `:p-value`: The p-value associated with the test statistic.
- `:n`: Total number of observations.
- `:estimate`: Observed proportions.
- `:expected`: Expected counts or proportions under the null hypothesis.
- `:confidence-interval`: Bootstrap confidence intervals for the observed proportions.
- `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
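Two hedged sketches (assumed `stats` alias); the counts are made up for illustration:

```clojure
;; Goodness-of-fit: are the observed counts compatible with a fair die?
(stats/cressie-read-test [18 22 16 25 19 20]
                         {:p [1/6 1/6 1/6 1/6 1/6 1/6]})

;; Independence: a 2x2 contingency table given as rows.
(stats/cressie-read-test [[10 5] [3 12]])
```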
(dissimilarity method P-observed Q-expected)
(dissimilarity method
P-observed
Q-expected
{:keys [bins probabilities? epsilon log-base power remove-zeros?]
:or
{probabilities? true epsilon 1.0E-6 log-base m/E power 2.0}})
Various PDF distances between two histograms (frequencies) or probabilities.

`Q` can be a distribution object; in that case a histogram is created out of `P`.

Arguments:

* `method` - distance method
* `P-observed` - frequencies, probabilities or actual data (when `Q` is a distribution or `:bins` is set)
* `Q-expected` - frequencies, probabilities or a distribution object (when `P` is data or `:bins` is set)

Options:

* `:probabilities?` - should `P`/`Q` be converted to probabilities, default: `true`.
* `:epsilon` - small number which replaces `0.0` when division or logarithm is used
* `:log-base` - base for logarithms, default: `e`
* `:power` - exponent for `:minkowski` distance, default: `2.0`
* `:bins` - number of bins or bins estimation method, see [[histogram]].

The list of methods: `:euclidean`, `:city-block`, `:manhattan`, `:chebyshev`, `:minkowski`, `:sorensen`, `:gower`, `:soergel`, `:kulczynski`, `:canberra`, `:lorentzian`, `:non-intersection`, `:wave-hedges`, `:czekanowski`, `:motyka`, `:tanimoto`, `:jaccard`, `:dice`, `:bhattacharyya`, `:hellinger`, `:matusita`, `:squared-chord`, `:euclidean-sq`, `:squared-euclidean`, `:pearson-chisq`, `:chisq`, `:neyman-chisq`, `:squared-chisq`, `:symmetric-chisq`, `:divergence`, `:clark`, `:additive-symmetric-chisq`, `:kullback-leibler`, `:jeffreys`, `:k-divergence`, `:topsoe`, `:jensen-shannon`, `:jensen-difference`, `:taneja`, `:kumar-johnson`, `:avg`

See more: "Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions" by Sung-Hyuk Cha.
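A couple of illustrative calls (assumed `stats` alias):

```clojure
;; Jensen-Shannon divergence between two frequency vectors;
;; inputs are normalized to probabilities by default.
(stats/dissimilarity :jensen-shannon [10 20 30] [12 18 30])

;; Euclidean distance on values that are already probabilities,
;; so normalization is switched off.
(stats/dissimilarity :euclidean [0.2 0.3 0.5] [0.25 0.25 0.5]
                     {:probabilities? false})
```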
(durbin-watson rs)
Calculates the Durbin-Watson statistic (d) for a sequence of residuals.

This statistic is used to test for the presence of serial correlation, especially first-order (lag-1) autocorrelation, in the residuals from a regression analysis. Autocorrelation violates the assumption of independent errors.

Parameters:

- `rs` (sequence of numbers): The sequence of residuals from a regression model. The sequence should represent observations ordered by time or sequence index.

Returns the calculated Durbin-Watson statistic as a double. The value ranges from 0 to 4.

Interpretation:

- Values near 2 suggest no first-order autocorrelation.
- Values less than 2 suggest positive autocorrelation (residuals tend to be followed by residuals of the same sign).
- Values greater than 2 suggest negative autocorrelation (residuals tend to be followed by residuals of the opposite sign).
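Two hand-checkable sketches, assuming the standard statistic d = Σ(e_t - e_{t-1})² / Σe_t² (assumed `stats` alias):

```clojure
;; Alternating residuals: d = 20/6 ≈ 3.33, suggesting negative
;; autocorrelation.
(stats/durbin-watson [1 -1 1 -1 1 -1])

;; Steadily drifting residuals: d = 5/91 ≈ 0.055, suggesting strong
;; positive autocorrelation.
(stats/durbin-watson [1 2 3 4 5 6])
```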
(epsilon-sq [group1 group2])
(epsilon-sq group1 group2)
Calculates Epsilon squared (ε²), an effect size measure for the simple linear regression of `group1` on `group2`.

Epsilon squared estimates the proportion of variance in the dependent variable (`group1`) that is accounted for by the independent variable (`group2`) in the population. It is considered a less biased alternative to the sample R-squared ([[r2-determination]]). The calculation is based on the sums of squares from the simple linear regression of `group1` on `group2`.

Parameters:

- `group1` (seq of numbers): The dependent variable.
- `group2` (seq of numbers): The independent variable. Must have the same length as `group1`.

Returns the calculated Epsilon squared value as a double. The value typically ranges from 0.0 to 1.0.

Interpretation:

- 0.0 indicates that `group2` explains none of the variance in `group1` in the population.
- 1.0 indicates that `group2` perfectly explains the variance in `group1` in the population.

Note: While often presented in the context of ANOVA, this implementation applies the formula to the sums of squares obtained from a simple linear regression between the two sequences.

See also [[eta-sq]] (Eta-squared, often based on $R^2$), [[omega-sq]] (another adjusted R²-like measure), [[r2-determination]] (R-squared).
(estimate-bins vs)
(estimate-bins vs bins-or-estimate-method)
Estimate number of bins for histogram.

Possible methods are: `:sqrt`, `:sturges`, `:rice`, `:doane`, `:scott`, `:freedman-diaconis` (default).

The number returned is not higher than the number of samples.
List of estimation strategies for [[percentile]]/[[quantile]] functions.
(eta-sq [group1 group2])
(eta-sq group1 group2)
Calculates a measure of association between two sequences, named `eta-sq` (Eta-squared).

*Note*: The current implementation calculates the R-squared coefficient of determination from a simple linear regression where the first input sequence (`group1`) is treated as the dependent variable and the second (`group2`) as the independent variable. In this context, it quantifies the proportion of the variance in `group1` that is linearly predictable from `group2`.

Parameters:

- `group1` (seq of numbers): The first sequence (treated as dependent variable).
- `group2` (seq of numbers): The second sequence (treated as independent variable).

Returns the calculated R-squared value as a double [0.0, 1.0].

Interpretation:

- 0.0 indicates that `group2` explains none of the variance in `group1` linearly.
- 1.0 indicates that `group2` linearly explains all the variance in `group1`.

While Eta-squared ($\eta^2$) is commonly used in ANOVA to quantify the proportion of variance in a dependent variable explained by group membership, this function's calculation method differs from the standard ANOVA $\eta^2$ unless `group2` explicitly represents numeric codes for two groups.

See also [[r2-determination]] (which is equivalent to this function), [[pearson-correlation]], [[omega-sq]], [[epsilon-sq]], [[one-way-anova-test]].
(expectile vs tau)
(expectile vs weights tau)
Calculate the tau-th expectile of a sequence `vs`.

Expectiles are related to quantiles but are determined by minimizing an asymmetrically weighted sum of squared differences, rather than absolute differences. The `tau` parameter controls the asymmetry. A key property is that the expectile for `tau = 0.5` is equal to the [[mean]].

The calculation involves finding the value `t` such that the weighted sum of `w_i * (v_i - t)` is zero, where the effective weights depend on `tau` and whether `v_i` is above or below `t`.

Parameters:

- `vs`: Sequence of data values.
- `weights` (optional): Sequence of corresponding non-negative weights. Must have the same count as `vs`. If omitted, calculates the unweighted expectile.
- `tau`: The expectile level, a value between 0.0 and 1.0 (inclusive).

Returns the calculated expectile as a double.

See also [[quantile]], [[mean]], [[median]].
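A quick check of the mean property (assumed `stats` alias):

```clojure
;; The 0.5 expectile equals the arithmetic mean: (1+2+3+4+100)/5 = 22.0.
(stats/expectile [1 2 3 4 100] 0.5)
;; => 22.0

;; Larger tau values weight the upper tail more heavily.
(stats/expectile [1 2 3 4 100] 0.9)
```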
(extent vs)
(extent vs mean?)
Return extent (min, max, mean) values from a sequence. The mean is optional and controlled by `mean?` (default: true).
(f-test xs ys)
(f-test xs ys {:keys [sides alpha] :or {sides :two-sided alpha 0.05}})
Performs an F-test to compare the variances of two independent samples.

The test assesses the null hypothesis that the variances of the populations from which `xs` and `ys` are drawn are equal. Assumes independence of samples. The test is sensitive to departures from the assumption that both populations are normally distributed.

Parameters:

- `xs` (seq of numbers): The first sample.
- `ys` (seq of numbers): The second sample.
- `params` (map, optional): Options map:
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis regarding the ratio of variances (Var(xs) / Var(ys)).
    - `:two-sided` (default): Variances are not equal (ratio != 1).
    - `:one-sided-greater`: Variance of `xs` is greater than variance of `ys` (ratio > 1).
    - `:one-sided-less`: Variance of `xs` is less than variance of `ys` (ratio < 1).
  - `:alpha` (double, default `0.05`): Significance level for the confidence interval.

Returns a map containing:

- `:F`: The calculated F-statistic (ratio of sample variances: Var(xs) / Var(ys)).
- `:stat`: Alias for `:F`.
- `:estimate`: Alias for `:F`, representing the estimated ratio of variances.
- `:df`: Degrees of freedom as `[numerator-df, denominator-df]`, corresponding to `[(count xs)-1, (count ys)-1]`.
- `:n`: Sample sizes as `[count xs, count ys]`.
- `:nx`: Sample size of `xs`.
- `:ny`: Sample size of `ys`.
- `:sides`: The alternative hypothesis side used (`:two-sided`, `:one-sided-greater`, or `:one-sided-less`).
- `:test-type`: Alias for `:sides`.
- `:p-value`: The p-value associated with the F-statistic and the specified `:sides`.
- `:confidence-interval`: A confidence interval for the true ratio of the population variances (Var(xs) / Var(ys)).
(fligner-killeen-test xss)
(fligner-killeen-test xss {:keys [sides] :or {sides :one-sided-greater}})
Performs the Fligner-Killeen test for homogeneity of variances across two or more groups.

The Fligner-Killeen test is a non-parametric test that assesses the null hypothesis that the variances of the groups are equal. It is robust against departures from normality. The test is based on ranks of the absolute deviations from the group medians.

Parameters:

- `xss` (sequence of sequences): A collection where each element is a sequence representing a group of observations.
- `params` (map, optional): Options map with the following key:
  - `:sides` (keyword, default `:one-sided-greater`): Alternative hypothesis side for the Chi-squared test. Possible values: `:one-sided-greater`, `:one-sided-less`, `:two-sided`.

Returns a map containing:

- `:chi2`: The Fligner-Killeen test statistic (Chi-squared value).
- `:stat`: Alias for `:chi2`.
- `:p-value`: The p-value for the test.
- `:df`: Degrees of freedom for the test (number of groups - 1).
- `:n`: Sequence of sample sizes for each group.
- `:SSt`: Sum of squares between groups (treatment) based on transformed ranks.
- `:SSe`: Sum of squares within groups (error) based on transformed ranks.
- `:DFt`: Degrees of freedom between groups.
- `:DFe`: Degrees of freedom within groups.
- `:MSt`: Mean square between groups.
- `:MSe`: Mean square within groups.
- `:sides`: Test side used.
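A minimal sketch with three made-up groups of differing spread (assumed `stats` alias):

```clojure
(stats/fligner-killeen-test [[10 11 10 12 11]
                             [20 30 15 35 25]
                             [5 6 5 4 6]])
;; => map with :chi2, :df 2, :p-value, ...
```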
(freeman-tukey-test contingency-table-or-xs)
(freeman-tukey-test contingency-table-or-xs params)
Freeman-Tukey test, a power divergence test for `lambda` = -0.5.

Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table.

Usage:

1. **Goodness-of-Fit (GOF):**
   - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights).
   - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins.
2. **Test for Independence:**
   - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored.

Options map:

* `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values:
  * `1.0`: Pearson Chi-squared test ([[chisq-test]]).
  * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]).
  * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]).
  * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]).
  * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]).
  * `2/3`: Cressie-Read test (default, [[cressie-read-test]]).
* `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests.
* `:alpha` (double, default: `0.05`): Significance level for confidence intervals.
* `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`).
* `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`).
* `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation.
* `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom.
* `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation.

Returns a map containing:

- `:stat`: The calculated power divergence test statistic.
- `:chi2`: Alias for `:stat`.
- `:df`: Degrees of freedom for the test.
- `:p-value`: The p-value associated with the test statistic.
- `:n`: Total number of observations.
- `:estimate`: Observed proportions.
- `:expected`: Expected counts or proportions under the null hypothesis.
- `:confidence-interval`: Bootstrap confidence intervals for the observed proportions.
- `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
(geomean vs)
(geomean vs weights)
Calculates the geometric mean of a sequence `vs`.

The geometric mean is suitable for averaging ratios or rates of change and requires all values in the sequence to be positive. It is calculated as the n-th root of the product of n numbers.

Parameters:

- `vs`: Sequence of numbers. Non-positive values will result in `NaN` or `0.0` due to the internal use of `log`.
- `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`.

Returns the calculated geometric mean as a double.

See also [[mean]], [[harmean]], [[powmean]].
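A hand-checkable example (assumed `stats` alias):

```clojure
;; Cube root of 1 * 3 * 9 = 27, i.e. 3.0.
(stats/geomean [1 3 9])
;; => 3.0
```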
(glass-delta [group1 group2])
(glass-delta group1 group2)
Calculates Glass's delta (Δ), an effect size measure for the difference between two group means, using the standard deviation of the control group.

Glass's delta is used to quantify the magnitude of the difference between an experimental group and a control group, specifically when the control group's standard deviation is considered a better estimate of the population standard deviation than a pooled variance.

Parameters:

- `group1` (seq of numbers): The experimental group.
- `group2` (seq of numbers): The control group.

Returns the calculated Glass's delta as a double.

This measure is less common than [[cohens-d]] or [[hedges-g]] but is preferred when the intervention is expected to affect the variance or when `group2` (the control) is clearly the baseline against which variability should be assessed.

See also [[cohens-d]], [[hedges-g]].
(harmean vs)
(harmean vs weights)
Calculates the harmonic mean of a sequence `vs`.

The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the observations.

Parameters:

- `vs`: Sequence of numbers. Values must be non-zero.
- `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`.

Returns the calculated harmonic mean as a double.

See also [[mean]], [[geomean]], [[powmean]].
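A hand-checkable example (assumed `stats` alias):

```clojure
;; 3 / (1/2 + 1/3 + 1/6) = 3.0.
(stats/harmean [2 3 6])
;; => 3.0
```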
(hedges-g [group1 group2])
(hedges-g group1 group2)
Calculates Hedges's g effect size for comparing the means of two independent groups.

Hedges's g is a standardized measure quantifying the magnitude of the difference between the means of two independent groups. It is similar to Cohen's d but uses the *unbiased* pooled standard deviation in the denominator. This implementation calculates g using the unbiased pooled standard deviation as the denominator.

Parameters:

- `group1`, `group2` (sequences): The two independent samples directly as arguments.

Returns the calculated Hedges's g effect size as a double.

Note: This specific function uses the unbiased pooled standard deviation but does *not* apply the small-sample bias correction factor (often denoted as J) sometimes associated with Hedges's g. For a bias-corrected version, see [[hedges-g-corrected]].

This function is equivalent to calling `(cohens-d group1 group2 :unbiased)`.

See also [[cohens-d]], [[hedges-g-corrected]], [[glass-delta]], [[pooled-stddev]].
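A hedged sketch with made-up samples (assumed `stats` alias):

```clojure
;; Standardized mean difference using the unbiased pooled SD;
;; per the docstring, equivalent to (cohens-d group1 group2 :unbiased).
(stats/hedges-g [5.1 5.4 4.9 5.0] [4.2 4.5 4.0 4.3])
```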
(hedges-g* [group1 group2])
(hedges-g* group1 group2)
Calculates a less biased estimate of Hedges's g effect size for comparing the means of two independent groups, using the exact J bias correction.

Hedges's g is a standardized measure of the difference between two means. For small sample sizes, the standard Hedges's g (and Cohen's d) can overestimate the true population effect size. This function applies a specific correction factor, often denoted as J, to mitigate this bias.

The calculation involves:

1. Calculating the standard Hedges's g (equivalent to [[hedges-g]], which uses the unbiased pooled standard deviation).
2. Calculating the J correction factor based on the degrees of freedom (`n1 + n2 - 2`) using the gamma function.
3. Multiplying the standard Hedges's g by the J factor.

The J factor is calculated as `(Gamma(df/2) / (sqrt(df/2) * Gamma((df-1)/2)))`.

Parameters:

- `group1` (seq of numbers): The first independent sample.
- `group2` (seq of numbers): The second independent sample.

Returns the calculated bias-corrected Hedges's g effect size as a double.

This version of Hedges's g is generally preferred over the standard version or Cohen's d when working with small sample sizes, as it provides a more accurate estimate of the population effect size.

Assumptions:

- The two samples are independent.
- Data within each group are approximately normally distributed.
- Equal variances are assumed for calculating the pooled standard deviation.

See also [[cohens-d]], [[hedges-g]] (uncorrected), [[hedges-g-corrected]] (another correction method).
(hedges-g-corrected [group1 group2])
(hedges-g-corrected group1 group2)
Calculates a small-sample bias-corrected effect size for comparing the means of two independent groups, often referred to as a form of Hedges's g.

This function calculates Cohen's d ([[cohens-d]]) using the *unbiased* pooled standard deviation (equivalent to [[hedges-g]]), and then applies a specific correction factor designed to reduce the bias in the effect size estimate for small sample sizes.

The correction factor applied is `(1 - 3 / (4 * df - 1))`, where `df` is the degrees of freedom for the unbiased pooled variance calculation (`n1 + n2 - 2`). This corresponds to calling [[cohens-d-corrected]] with the `:unbiased` method for pooled standard deviation.

Parameters:

- `group1` (seq of numbers): The first independent sample.
- `group2` (seq of numbers): The second independent sample.

Returns the calculated bias-corrected effect size as a double.

Note: This function applies *a* correction factor. For the more standard Hedges's g bias correction using the exact gamma function based correction factor, see [[hedges-g*]].

See also [[cohens-d]], [[cohens-d-corrected]], [[hedges-g]], [[hedges-g*]], [[pooled-stddev]].
(histogram vs)
(histogram vs bins-or-estimate-method)
(histogram vs bins-or-estimate-method [mn mx])
(histogram vs bins-or-estimate-method mn mx)
Calculate histogram.

Estimation method can be a number, a named method (`:sqrt`, `:sturges`, `:rice`, `:doane`, `:scott`, `:freedman-diaconis` (default)) or a sequence of points used as intervals. In the latter case, or when `mn` and `mx` values are provided, data will be filtered to fit in the desired interval(s).

Returns map with keys:

* `:size` - number of bins
* `:step` - average distance between bins
* `:bins` - seq of pairs of range lower value and number of elements
* `:min` - min value
* `:max` - max value
* `:samples` - number of used samples
* `:frequencies` - a map containing counts for bin's average
* `:intervals` - intervals used to create bins
* `:bins-maps` - seq of maps containing:
  * `:min` - lower bound
  * `:mid` - middle value
  * `:max` - upper bound
  * `:step` - actual distance between bins
  * `:count` - number of elements
  * `:avg` - average value
  * `:probability` - probability for bin

If the difference between min and max values is `0`, the number of bins is set to 1.
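A minimal sketch (assumed `stats` alias); the key names follow the return map documented above:

```clojure
(def h (stats/histogram [1.0 2.0 2.0 3.0 3.0 3.0 4.0 4.0 5.0] 4))

(:size h)    ;; => 4
(:samples h) ;; => 9
(:bins h)    ;; seq of [lower-bound count] pairs

;; A named estimation method instead of a fixed bin count.
(stats/histogram [1.0 2.0 2.0 3.0 3.0 3.0 4.0 4.0 5.0] :sturges)
```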
(hpdi-extent vs)
(hpdi-extent vs size)
Highest posterior density interval (HPDI) + median. The `size` parameter is the target probability content of the interval.
(inner-fence-extent vs)
(inner-fence-extent vs estimation-strategy)
Returns the lower inner fence (LIF), the upper inner fence (UIF) and the median.
(iqr vs)
(iqr vs estimation-strategy)
Interquartile range.
(jarque-bera-test xs)
(jarque-bera-test xs params)
(jarque-bera-test xs skew kurt {:keys [sides] :or {sides :one-sided-greater}})
Performs the Jarque-Bera goodness-of-fit test to determine if sample data exhibits skewness and kurtosis consistent with a normal distribution. The test assesses the null hypothesis that the data comes from a normally distributed population (i.e., population skewness is 0 and population excess kurtosis is 0). The test statistic is calculated as: `JB = (n/6) * (S^2 + (1/4)*K^2)` where `n` is the sample size, `S` is the sample skewness (using `:g1` type), and `K` is the excess kurtosis `:g2`. Under the null hypothesis, the JB statistic asymptotically follows a Chi-squared distribution with 2 degrees of freedom. Parameters: - `xs` (seq of numbers): The sample data. - `skew` (double, optional): A pre-calculated sample skewness value (type `:g1`). If omitted, it's calculated from `xs`. - `kurt` (double, optional): A pre-calculated sample *excess* kurtosis value (type `:g2`). If omitted, it's calculated from `xs`. - `params` (map, optional): Options map: - `:sides` (keyword, default `:one-sided-greater`): Specifies the side(s) of the Chi-squared(2) distribution used for p-value calculation. - `:one-sided-greater` (default and standard for JB): Tests if the JB statistic is significantly large, indicating departure from normality. - `:one-sided-less`: Tests if the statistic is significantly small. - `:two-sided`: Tests if the statistic is extreme in either tail. Returns a map containing: - `:Z`: The calculated Jarque-Bera test statistic (labeled `:Z` for consistency, though it follows Chi-squared(2)). - `:stat`: Alias for `:Z`. - `:p-value`: The p-value associated with the test statistic and `:sides`, derived from the Chi-squared(2) distribution. - `:skewness`: The sample skewness (type `:g1`) used in the calculation. - `:kurtosis`: The sample kurtosis (type `:g2`) used in the calculation. See also [[skewness-test]], [[kurtosis-test]], [[normality-test]], [[bonett-seier-test]].
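For instance (a minimal sketch under the assumption that `fastmath.stats` is aliased as `stats`; a uniform sample is used as a clearly non-normal input):

```clojure
(require '[fastmath.stats :as stats])

(def xs (repeatedly 200 rand)) ;; Uniform(0,1), clearly non-normal

(def result (stats/jarque-bera-test xs))
(:stat result)    ;; JB statistic, ~Chi-squared(2) under H0
(:p-value result) ;; small values indicate departure from normality
```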
(jensen-shannon-divergence [vs1 vs2])
(jensen-shannon-divergence vs1 vs2)
Jensen-Shannon divergence of two sequences.
(kendall-correlation [vs1 vs2])
(kendall-correlation vs1 vs2)
Calculates Kendall's rank correlation coefficient (Kendall's Tau) between two sequences. Kendall's Tau is a non-parametric statistic used to measure the ordinal association between two measured quantities. It assesses the degree of similarity between the orderings of data when ranked by each of the quantities. The coefficient value ranges from -1.0 (perfect disagreement in ranking) to 1.0 (perfect agreement in ranking), with 0.0 indicating no monotonic relationship. Unlike Pearson correlation, it does not require the relationship to be linear. Parameters: - `[vs1 vs2]` (sequence of two sequences): A sequence containing the two sequences of numbers. - `vs1`, `vs2` (sequences): The two sequences of numbers directly as arguments. Both input sequences must contain only numbers and must have the same length. Returns the calculated Kendall's Tau coefficient as a double. See also [[pearson-correlation]], [[spearman-correlation]], [[correlation]].
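These properties follow directly from the definition (sketch; the `stats` alias for `fastmath.stats` is an assumption):

```clojure
(require '[fastmath.stats :as stats])

;; perfectly concordant rankings
(stats/kendall-correlation [1 2 3 4 5] [10 20 30 40 50])
;; => 1.0

;; monotonic but non-linear relationship - tau stays at 1.0
(stats/kendall-correlation [1 2 3 4 5] [1 8 27 64 125])
;; => 1.0

;; reversed ordering
(stats/kendall-correlation [1 2 3 4 5] [5 4 3 2 1])
;; => -1.0
```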
(kruskal-test xss)
(kruskal-test xss {:keys [sides] :or {sides :right}})
Performs the Kruskal-Wallis H-test (rank sum test) for independent samples. The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It determines whether there is a statistically significant difference between the distributions of two or more independent groups. It does not assume normality but requires that distributions have a similar shape for the test to be valid. Parameters: - `xss` (vector of sequences): A collection where each element is a sequence representing a group of observations. - `params` (map, optional): a map containing the `:sides` key with values of: `:right` (default), `:left` or `:both`. Returns a map containing: - `:stat`: The Kruskal-Wallis H statistic. - `:n`: Total number of observations across all groups. - `:df`: Degrees of freedom (number of groups - 1). - `:k`: Number of groups. - `:sides`: Test side. - `:p-value`: The p-value for the test (null hypothesis: all groups have the same distribution).
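A usage sketch (the `stats` alias for `fastmath.stats` is an assumption; the groups are illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def group-a [1.2 2.3 1.9 2.8 2.2])
(def group-b [3.1 3.9 2.9 4.2 3.3])
(def group-c [5.0 4.7 5.6 5.1 4.9])

(def result (stats/kruskal-test [group-a group-b group-c]))
(select-keys result [:stat :df :k :p-value])
;; :df => 2 (three groups), :k => 3

;; explicit alternative hypothesis side
(stats/kruskal-test [group-a group-b group-c] {:sides :both})
```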
(ks-test-one-sample xs)
(ks-test-one-sample xs distribution-or-ys)
(ks-test-one-sample xs
distribution-or-ys
{:keys [sides kernel bandwidth distinct?]
:or {sides :two-sided kernel :gaussian distinct? true}})
Performs the one-sample Kolmogorov-Smirnov (KS) test. This test compares the empirical cumulative distribution function (ECDF) of a sample `xs` against a specified theoretical distribution or the ECDF of another empirical sample. It assesses the null hypothesis that `xs` is drawn from the reference distribution. Parameters: - `xs` (seq of numbers): The sample data to be tested. - `distribution-or-ys` (optional): - A `fastmath.random` distribution object to test against. If omitted, defaults to the standard normal distribution (`fastmath.random/default-normal`). - A sequence of numbers (`ys`). In this case, an empirical distribution is estimated from `ys` using Kernel Density Estimation (KDE) or an enumerated distribution (see `:kernel` option). - `opts` (map, optional): Options map: - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis regarding the difference between the ECDF of `xs` and the reference CDF. - `:two-sided` (default): Tests if the ECDF of `xs` is different from the reference CDF. - `:right`: Tests if the ECDF of `xs` is significantly *below* the reference CDF (i.e., `xs` tends to have larger values, stochastically greater). - `:left`: Tests if the ECDF of `xs` is significantly *above* the reference CDF (i.e., `xs` tends to have smaller values, stochastically smaller). - `:kernel` (keyword, default `:gaussian`): Used only when `distribution-or-ys` is a sequence. Specifies the method to estimate the empirical distribution: - `:gaussian` (or other KDE kernels): Uses Kernel Density Estimation. - `:enumerated`: Creates a discrete empirical distribution from `ys`. - `:bandwidth` (double, optional): Bandwidth for KDE (if applicable). - `:distinct?` (boolean or keyword, default `true`): How to handle duplicate values in `xs`. - `true` (default): Removes duplicate values from `xs` before computation. - `false`: Uses all values in `xs`, including duplicates. - `:jitter`: Adds a small amount of random noise to each value in `xs` to break ties. Returns a map containing: - `:n`: Sample size of `xs` (after applying `:distinct?`). - `:dp`: Maximum positive difference (ECDF(xs) - CDF(ref)). - `:dn`: Maximum positive difference (CDF(ref) - ECDF(xs)). - `:d`: The KS test statistic (max absolute difference: `max(dp, dn)`). - `:stat`: The specific statistic used for p-value calculation, depending on `:sides` (`d`, `dp`, or `dn`). - `:p-value`: The p-value associated with the test statistic and the specified `:sides`. - `:sides`: The alternative hypothesis side used.
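An illustrative sketch (the aliases `stats` for `fastmath.stats` and `r` for `fastmath.random` are assumptions, as is the `:uniform-real` distribution key):

```clojure
(require '[fastmath.stats :as stats]
         '[fastmath.random :as r])

(def xs (repeatedly 100 rand))

;; against the default standard normal - expect a tiny p-value
(:p-value (stats/ks-test-one-sample xs))

;; against a matching uniform distribution object
(:p-value (stats/ks-test-one-sample xs (r/distribution :uniform-real)))

;; against another sample via an enumerated empirical distribution
(stats/ks-test-one-sample xs (repeatedly 100 rand) {:kernel :enumerated})
```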
(ks-test-two-samples xs ys)
(ks-test-two-samples xs
ys
{:keys [method sides distinct? correct?]
:or {sides :two-sided distinct? :ties correct? true}})
Performs the two-sample Kolmogorov-Smirnov (KS) test. This test compares the empirical cumulative distribution functions (ECDFs) of two independent samples, `xs` and `ys`, to assess the null hypothesis that they are drawn from the same continuous distribution. Parameters: - `xs` (seq of numbers): The first sample. - `ys` (seq of numbers): The second sample. - `opts` (map, optional): Options map: - `:method` (keyword, optional): Specifies the calculation method for the p-value. - `:exact`: Attempts an exact calculation (suitable for small samples, sensitive to ties). Default if `nx * ny < 10000`. - `:approximate`: Uses the asymptotic Kolmogorov distribution (suitable for larger samples). Default otherwise. - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis. - `:two-sided` (default): Tests if the distributions differ (ECDFs are different). - `:right`: Tests if `xs` is stochastically greater than `ys` (ECDF(xs) is below ECDF(ys)). - `:left`: Tests if `xs` is stochastically smaller than `ys` (ECDF(xs) is above ECDF(ys)). - `:distinct?` (keyword or boolean, default `:ties`): How to handle duplicate values (ties). - `:ties` (default): Includes all points. Passes information about ties to the `:exact` calculation method. Accuracy depends on the exact method's tie handling. - `:jitter`: Adds a small amount of random noise to break ties before comparison. A practical approach if exact tie handling is complex or not required. - `true`: Applies `distinct` to `xs` and `ys` separately before combining. May not resolve all ties between the combined samples. - `false`: Uses the data as-is, without attempting to handle ties explicitly (may lead to less accurate p-values, especially with the exact method). - `:correct?` (boolean, default `true`): Apply continuity correction when using the `:exact` calculation method for a more accurate p-value especially for smaller sample sizes. Returns a map containing: - `:nx`: Number of observations in `xs` (after `:distinct?` processing if applicable). - `:ny`: Number of observations in `ys` (after `:distinct?` processing if applicable). - `:n`: Effective sample size used for asymptotic calculation (`nx*ny / (nx+ny)`). - `:dp`: Maximum positive difference (ECDF(xs) - ECDF(ys)). - `:dn`: Maximum positive difference (ECDF(ys) - ECDF(xs)). - `:d`: The KS test statistic (max absolute difference: `max(dp, dn)`). - `:stat`: The specific statistic used for p-value calculation (`d`, `dp`, or `dn` for exact; scaled version for approximate). - `:KS`: Alias for `:stat`. - `:p-value`: The p-value associated with the test statistic and `:sides`. - `:sides`: The alternative hypothesis side used. - `:method`: The calculation method used (`:exact` or `:approximate`). Note on Ties: The KS test is strictly defined for continuous distributions where ties have zero probability. The presence of ties in sample data affects the p-value calculation. The `:distinct?` option provides ways to manage this, with `:jitter` being a common pragmatic choice.
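For example (sketch; the `stats` alias is an assumption and the samples are synthetic):

```clojure
(require '[fastmath.stats :as stats])

(def xs (repeatedly 60 rand))
(def ys (map #(+ 0.25 %) (repeatedly 60 rand))) ;; shifted to the right

(def result (stats/ks-test-two-samples xs ys))
(select-keys result [:d :stat :p-value :method :nx :ny])

;; jitter any ties and test whether xs is stochastically smaller than ys
(stats/ks-test-two-samples xs ys {:distinct? :jitter :sides :left})
```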
(kullback-leibler-divergence [vs1 vs2])
(kullback-leibler-divergence vs1 vs2)
Kullback-Leibler divergence of two sequences.
(kurtosis vs)
(kurtosis vs typ)
Calculates the kurtosis of a sequence, a measure of the 'tailedness' or 'peakedness' of the distribution compared to a normal distribution. Parameters: - `vs` (seq of numbers): The input sequence. - `typ` (keyword or sequence, optional): Specifies the type of kurtosis measure to calculate. Different types use different algorithms and may have different expected values under normality (e.g., 0 or 3). Defaults to `:G2`. Available `typ` values: - `:G2` (Default): Sample kurtosis based on the fourth standardized moment, as implemented by Apache Commons Math `Kurtosis`. Its value approaches 3 for a large normal sample, but the exact expected value depends on sample size. - `:g2` or `:excess`: Sample excess kurtosis. This is calculated from `:G2` and adjusted for sample bias, such that the expected value for a normal distribution is approximately 0. - `:kurt`: Kurtosis definition where normal = 3. Calculated as `:g2` + 3. - `:b2`: Kurtosis defined as fourth moment divided by standard deviation to the power of 4 - `:geary`: Geary's 'g', a robust measure calculated as `mean_abs_deviation / population_stddev`. Expected value for normal is `sqrt(2/pi) ≈ 0.798`. Lower values indicate leptokurtosis. - `:moors`: Moors' robust kurtosis measure based on octiles. The implementation returns a centered version where the expected value for normal is 0. - `:crow`: Crow-Siddiqui robust kurtosis measure based on quantiles. The implementation returns a centered version where the expected value for normal is 0. Can accept parameters `alpha` and `beta` via sequential type `[:crow alpha beta]`. - `:hogg`: Hogg's robust kurtosis measure based on trimmed means. The implementation returns a centered version where the expected value for normal is 0. Can accept parameters `alpha` and `beta` via sequential type `[:hogg alpha beta]`. - `:l-kurtosis`: L-kurtosis (τ₄), the ratio of the 4th L-moment (λ₄) to the 2nd L-moment (λ₂, L-scale). Calculated directly using [[l-moment]] with the `:ratio?` option set to true. It's a robust measure. Expected value for normal distribution is ≈ 0.1226. Interpretation (for excess kurtosis `:g2`): - Positive values indicate a leptokurtic distribution (heavier tails, more peaked than normal). - Negative values indicate a platykurtic distribution (lighter tails, flatter than normal). - Values near 0 suggest kurtosis similar to a normal distribution. Returns the calculated kurtosis value as a double. See also [[kurtosis-test]], [[bonett-seier-test]], [[normality-test]], [[jarque-bera-test]], [[l-moment]].
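A few of the variants above in use (sketch; the `stats` alias is an assumption, and the `[:hogg 0.05 0.5]` parameters are illustrative values, not recommended defaults):

```clojure
(require '[fastmath.stats :as stats])

(def xs (repeatedly 1000 rand)) ;; uniform data is platykurtic

(stats/kurtosis xs)          ;; default :G2
(stats/kurtosis xs :g2)      ;; excess kurtosis, negative here
(stats/kurtosis xs :geary)   ;; robust; ~0.798 under normality
(stats/kurtosis xs :l-kurtosis)
(stats/kurtosis xs [:hogg 0.05 0.5]) ;; parameterized robust variant
```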
(kurtosis-test xs)
(kurtosis-test xs params)
(kurtosis-test xs kurt {:keys [sides type] :or {sides :two-sided type :kurt}})
Performs a test for normality based on sample kurtosis. This test assesses the null hypothesis that the data comes from a normally distributed population by checking if the sample kurtosis significantly deviates from the kurtosis expected under normality (approximately 3). The test works by: 1. Calculating the sample kurtosis (type configurable via `:type`, default `:kurt`). 2. Standardizing the difference between the sample kurtosis and the expected kurtosis under normality using the theoretical standard error. 3. Applying a further transformation (e.g., Anscombe-Glynn/D'Agostino) to this standardized score to yield a final test statistic `Z` that more closely follows a standard normal distribution under the null hypothesis, especially for smaller sample sizes. Parameters: - `xs` (seq of numbers): The sample data. - `kurt` (double, optional): A pre-calculated kurtosis value. If omitted, it's calculated from `xs`. - `params` (map, optional): Options map: - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis. - `:two-sided` (default): The population kurtosis is different from normal. - `:one-sided-greater`: The population kurtosis is greater than normal (leptokurtic). - `:one-sided-less`: The population kurtosis is less than normal (platykurtic). - `:type` (keyword, default `:kurt`): The type of kurtosis to calculate if `kurt` is not provided. See [[kurtosis]] for options (e.g., `:kurt`, `:G2`, `:g2`). Returns a map containing: - `:Z`: The final test statistic, approximately standard normal under H0. - `:stat`: Alias for `:Z`. - `:p-value`: The p-value associated with `Z` and the specified `:sides`. - `:kurtosis`: The sample kurtosis value used in the test (either provided or calculated). See also [[skewness-test]], [[normality-test]], [[jarque-bera-test]], [[bonett-seier-test]].
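For instance (sketch; the `stats` alias is an assumption):

```clojure
(require '[fastmath.stats :as stats])

(def xs (repeatedly 200 rand)) ;; uniform - lighter tails than normal

(select-keys (stats/kurtosis-test xs) [:Z :p-value :kurtosis])

;; directional alternative: platykurtic (kurtosis less than normal)
(stats/kurtosis-test xs {:sides :one-sided-less})
```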
(l-moment vs order)
(l-moment vs order {:keys [s t sorted? ratio?] :or {s 0 t 0} :as opts})
Calculates L-moment, TL-moment (trimmed) or (T)L-moment ratios. Options: - `:s` (default: 0) - number of left trimmed values - `:t` (default: 0) - number of right trimmed values - `:sorted?` (default: false) - if input is already sorted - `:ratio?` (default: false) - normalized l-moment, l-moment ratio
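For example (sketch; the `stats` alias is an assumption):

```clojure
(require '[fastmath.stats :as stats])

(def xs (repeatedly 500 rand))

(stats/l-moment xs 1)                ;; first L-moment equals the mean
(stats/l-moment xs 2)                ;; L-scale
(stats/l-moment xs 4 {:ratio? true}) ;; tau_4, i.e. L-kurtosis
(stats/l-moment xs 2 {:s 1 :t 1})    ;; trimmed (TL) moment
```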
(l-variation vs)
Coefficient of L-variation (L-CV).
Count equal values in both seqs. Alias for [[count==]]
(L1 [vs1 vs2-or-val])
(L1 vs1 vs2-or-val)
Calculates the L1 distance (Manhattan or City Block distance) between two sequences or a sequence and a constant value. The L1 distance is the sum of the absolute differences between corresponding elements. Parameters: - `vs1` (sequence of numbers): The first sequence. - `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated L1 distance as a double. See also [[L2]], [[L2sq]], [[LInf]], [[mae]] (Mean Absolute Error).
(L2 [vs1 vs2-or-val])
(L2 vs1 vs2-or-val)
Calculates the L2 distance (Euclidean distance) between two sequences or a sequence and a constant value. This is the standard straight-line distance between two points (vectors) in Euclidean space. It is the square root of the [[L2sq]] distance. Parameters: - `vs1` (sequence of numbers): The first sequence. - `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated L2 distance as a double. See also [[L1]], [[L2sq]], [[LInf]], [[rmse]] (Root Mean Squared Error).
(L2sq [vs1 vs2-or-val])
(L2sq vs1 vs2-or-val)
Calculates the Squared Euclidean distance between two sequences or a sequence and a constant value. This is the sum of the squared differences between corresponding elements. It is equivalent to the [[rss]] (Residual Sum of Squares). Parameters: - `vs1` (sequence of numbers): The first sequence. - `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Squared Euclidean distance as a double. See also [[L1]], [[L2]], [[LInf]], [[rss]] (Residual Sum of Squares), [[mse]] (Mean Squared Error).
(levene-test xss)
(levene-test xss
{:keys [sides statistic scorediff]
:or {sides :one-sided-greater statistic mean scorediff abs}})
Performs Levene's test for homogeneity of variances across two or more groups. Levene's test assesses the null hypothesis that the variances of the groups are equal. It calculates an ANOVA on the absolute deviations of the data points from their group center (mean by default). Parameters: - `xss` (sequence of sequences): A collection where each element is a sequence representing a group of observations. - `params` (map, optional): Options map with the following keys: - `:sides` (keyword, default `:one-sided-greater`): Alternative hypothesis side for the F-test. Possible values: `:one-sided-greater`, `:one-sided-less`, `:two-sided`. - `:statistic` (fn, default [[mean]]): Function to calculate the center of each group (e.g., [[mean]], [[median]]). Using [[median]] results in the Brown-Forsythe test. - `:scorediff` (fn, default [[abs]]): Function applied to the difference between each data point and its group center (e.g., [[abs]], [[sq]]). Returns a map containing: - `:W`: The Levene test statistic (which is an F-statistic). - `:stat`: Alias for `:W`. - `:p-value`: The p-value for the test. - `:df`: Degrees of freedom for the F-statistic ([DFt, DFe]). - `:n`: Sequence of sample sizes for each group. - `:SSt`: Sum of squares between groups (treatment). - `:SSe`: Sum of squares within groups (error). - `:DFt`: Degrees of freedom between groups. - `:DFe`: Degrees of freedom within groups. - `:MSt`: Mean square between groups. - `:MSe`: Mean square within groups. - `:sides`: Test side used. See also [[brown-forsythe-test]].
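For example (sketch; the `stats` alias is an assumption and the data is illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def g1 [22.1 23.5 21.8 24.0 22.9])
(def g2 [19.9 26.2 18.5 27.1 20.3]) ;; visibly wider spread
(def g3 [23.0 22.7 23.3 22.9 23.1])

(select-keys (stats/levene-test [g1 g2 g3]) [:W :df :p-value])

;; median as the group center turns this into the Brown-Forsythe test
(stats/levene-test [g1 g2 g3] {:statistic stats/median})
```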
(LInf [vs1 vs2-or-val])
(LInf vs1 vs2-or-val)
Calculates the L-infinity distance (Chebyshev distance) between two sequences or a sequence and a constant value. The Chebyshev distance is the maximum absolute difference between corresponding elements. Parameters: - `vs1` (sequence of numbers): The first sequence. - `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated L-infinity distance as a double. See also [[L1]], [[L2]], [[L2sq]].
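The four norms above ([[L1]], [[L2]], [[L2sq]], [[LInf]]) side by side (sketch; the `stats` alias is an assumption, and the results follow by hand from the definitions):

```clojure
(require '[fastmath.stats :as stats])

(def xs [1.0 2.0 3.0])
(def ys [2.0 2.0 5.0])

(stats/L1   xs ys) ;; => 3.0, |1-2| + |2-2| + |3-5|
(stats/L2sq xs ys) ;; => 5.0, 1 + 0 + 4
(stats/L2   xs ys) ;; => ~2.236, sqrt of L2sq
(stats/LInf xs ys) ;; => 2.0, largest absolute difference

;; a single number is compared against every element
(stats/L1 xs 0.0)  ;; => 6.0
```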
Alias for [[median-absolute-deviation]]
(mad-extent vs)
Returns median -/+ median-absolute-deviation and the median.
(mae [vs1 vs2-or-val])
(mae vs1 vs2-or-val)
Calculates the Mean Absolute Error (MAE) between two sequences or a sequence and constant value. MAE is a measure of the difference between two sequences of values. It quantifies the average magnitude of the errors, without considering their direction. Parameters: - `vs1` (sequence of numbers): The first sequence (often the observed or true values). - `vs2-or-val` (sequence of numbers or single number): The second sequence (often the predicted or reference values), or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Mean Absolute Error as a double. Note: MAE is less sensitive to large outliers than metrics like Mean Squared Error (MSE) because it uses the absolute value of differences rather than the squared difference. See also [[me]] (Mean Error), [[mse]] (Mean Squared Error), [[rmse]] (Root Mean Squared Error).
(mape [vs1 vs2-or-val])
(mape vs1 vs2-or-val)
Calculates the Mean Absolute Percentage Error (MAPE) between two sequences or a sequence and a constant value. MAPE is a measure of prediction accuracy of a forecasting method, for example in time series analysis. It is calculated as the average of the absolute percentage errors. Parameters: - `vs1` (sequence of numbers): The first sequence (conventionally, the actual or true values). - `vs2-or-val` (sequence of numbers or single number): The second sequence (conventionally, the predicted or reference values), or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Mean Absolute Percentage Error as a double. Note: MAPE is scale-independent and useful for comparing performance across different datasets. However, it is undefined if any of the actual values (`x_i`) are zero, and can be skewed by small actual values. See also [[me]] (Mean Error), [[mae]] (Mean Absolute Error), [[mse]] (Mean Squared Error), [[rmse]] (Root Mean Squared Error).
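The error metrics from the preceding entries ([[me]], [[mae]], [[mape]]) compared on the same data (sketch; the `stats` alias is an assumption):

```clojure
(require '[fastmath.stats :as stats])

(def actual    [10.0 12.0  8.0 11.0])
(def predicted [ 9.0 13.0  8.0 10.0])

(stats/me   actual predicted) ;; signed mean error - opposite errors cancel
(stats/mae  actual predicted) ;; average error magnitude
(stats/mape actual predicted) ;; relative percentage error, scale-independent
```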
(maximum vs)
Finds the maximum value in a sequence of numbers.
(mcc ct)
(mcc group1 group2)
Calculates the Matthews Correlation Coefficient (MCC), also known as the Phi coefficient, for a 2x2 contingency table or binary classification outcomes. MCC is a measure of the quality of binary classifications. It is a balanced measure which can be used even if the classes are of very different sizes. Its value ranges from -1 to +1. - A coefficient of +1 represents a perfect prediction. - 0 represents a prediction no better than random. - -1 represents a perfect inverse prediction. The function can be called in two ways: 1. With two sequences `group1` and `group2`: The function will automatically construct a 2x2 contingency table from the unique values in the sequences (assuming they represent two binary variables). The mapping of values to table cells (e.g., what corresponds to TP, TN, FP, FN) depends on how `contingency-table` orders the unique values. For direct control over which cell is which, use the contingency table input. 2. With a contingency table: The contingency table can be provided as: - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] TP, [0 1] FP, [1 0] FN, [1 1] TN}`). This is the output format of [[contingency-table]] with two inputs. - A sequence of sequences representing the rows of the table (e.g., `[[TP FP] [FN TN]]`). This is equivalent to `rows->contingency-table`. Parameters: - `group1` (sequence): The first sequence of binary outcomes/categories. - `group2` (sequence): The second sequence of binary outcomes/categories. Must have the same length as `group1`. - `contingency-table` (map or sequence of sequences): A pre-computed 2x2 contingency table. Returns the calculated Matthews Correlation Coefficient as a double. Note: The implementation uses marginal sums from the contingency table, which is mathematically equivalent to the standard formula but avoids potential division by zero in the denominator product if any marginal sum is zero. See also [[contingency-table]], [[contingency-2x2-measures]], [[binary-measures-all]].
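Both calling conventions in a sketch (the `stats` alias is an assumption; the counts are made up):

```clojure
(require '[fastmath.stats :as stats])

;; rows as [[TP FP]
;;          [FN TN]]
(stats/mcc [[90 10]
            [5  95]])
;; => strongly positive (good classifier)

;; or from two sequences of binary outcomes
(stats/mcc [1 1 0 0 1 0]
           [1 1 0 1 1 0])
```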
(me [vs1 vs2-or-val])
(me vs1 vs2-or-val)
Calculates the Mean Error (ME) between two sequences or a sequence and constant value. Parameters: - `vs1` (sequence of numbers): The first sequence. - `vs2-or-val` (sequence of numbers or single number): The second sequence of numbers, or a single number to compare against each element of `vs1`. Both sequences (`vs1` and `vs2`) must have the same length if both are sequences. If `vs2-or-val` is a single number, it is compared element-wise to `vs1`. Returns the calculated Mean Error as a double. Note: Positive ME indicates that `vs1` values tend to be greater than `vs2` values on average, while negative ME indicates `vs1` values tend to be smaller. ME can be influenced by the magnitude of errors and their signs. It does not directly measure the magnitude of the typical error due to potential cancellation of positive and negative differences. See also [[mae]] (Mean Absolute Error), [[mse]] (Mean Squared Error), [[rmse]] (Root Mean Squared Error).
(mean vs)
(mean vs weights)
Calculates the arithmetic mean (average) of a sequence `vs`. If `weights` are provided, calculates the weighted arithmetic mean. Parameters: - `vs`: Sequence of numbers. - `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`. Returns the calculated mean as a double. See also [[geomean]], [[harmean]], [[powmean]], [[median]].
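For example (sketch; the `stats` alias is an assumption):

```clojure
(require '[fastmath.stats :as stats])

(stats/mean [1.0 2.0 3.0 4.0])
;; => 2.5

;; weighted mean - the last value counts twice as much
(stats/mean [1.0 2.0 3.0] [1.0 1.0 2.0])
;; => 2.25, (1 + 2 + 2*3) / 4
```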
(mean-absolute-deviation vs)
(mean-absolute-deviation vs center)
Calculates the Mean Absolute Deviation of a sequence `vs`. MeanAD is a measure of the variability of a univariate sample of quantitative data. It is defined as the mean of the absolute deviations from a central point, typically the data's mean. `MeanAD = mean(|X_i - center|)` Parameters: - `vs`: Sequence of numbers. - `center` (optional, double): The central point from which to calculate deviations. If `nil` or not provided, the arithmetic [[mean]] of `vs` is used as the center. Returns the calculated Mean Absolute Deviation as a double. Unlike [[median-absolute-deviation]], which uses the median of absolute deviations from the median, the Mean Absolute Deviation uses the mean of absolute deviations from the mean (or specified center). This makes it more sensitive to outliers than [[median-absolute-deviation]] but less sensitive than the standard deviation. See also [[median-absolute-deviation]], [[stddev]], [[mean]].
(means-ratio [group1 group2])
(means-ratio group1 group2)
(means-ratio group1 group2 adjusted?)
Calculates the ratio of the mean of `group1` to the mean of `group2`. This is a measure of effect size in the 'Ratio Family', comparing the central tendency of two groups multiplicatively. Parameters: - `group1` (seq of numbers): The first independent sample. The mean of this group is the numerator. - `group2` (seq of numbers): The second independent sample. The mean of this group is the denominator. - `adjusted?` (boolean, optional): If `true`, applies a small-sample bias correction to the ratio. Defaults to `false`. Returns the calculated ratio of means as a double. A value greater than 1 indicates that `group1` has a larger mean than `group2`. A value less than 1 indicates `group1` has a smaller mean. A value close to 1 indicates similar means. The `adjusted?` version attempts to provide a less biased estimate of the population mean ratio, particularly for small sample sizes, by incorporating variances into the calculation (based on Bickel and Doksum, see also [[means-ratio-corrected]]). See also [[means-ratio-corrected]] (which is equivalent to calling this with `adjusted?` set to `true`).
(means-ratio-corrected [group1 group2])
(means-ratio-corrected group1 group2)
Calculates a bias-corrected ratio of the mean of `group1` to the mean of `group2`. This function applies a correction (based on Bickel and Doksum) to the simple ratio `mean(group1) / mean(group2)` to reduce bias, particularly for small sample sizes. It is equivalent to calling `(means-ratio group1 group2 true)`. Parameters: - `group1` (seq of numbers): The first independent sample. The mean of this group is the numerator. - `group2` (seq of numbers): The second independent sample. The mean of this group is the denominator. Returns the calculated bias-corrected ratio of means as a double. See also [[means-ratio]] (for the simple, uncorrected ratio).
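Both variants of the ratio-of-means effect size in a sketch (the `stats` alias is an assumption; values are chosen so the uncorrected ratio is easy to verify):

```clojure
(require '[fastmath.stats :as stats])

(def g1 [10.0 12.0 14.0]) ;; mean 12
(def g2 [ 5.0  6.0  7.0]) ;; mean 6

(stats/means-ratio g1 g2)
;; => 2.0

;; equivalent bias-corrected calls
(stats/means-ratio g1 g2 true)
(stats/means-ratio-corrected g1 g2)
```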
(median vs)
(median vs estimation-strategy)
Calculates median of a sequence `vs`. An optional `estimation-strategy` keyword can be provided to specify the method used for estimating the quantile, particularly how interpolation is handled when the desired quantile falls between data points in the sorted sequence. Available `estimation-strategy` values: - `:legacy` (Default): The original method used in Apache Commons Math. - `:r1` through `:r9`: Correspond to the nine quantile estimation algorithms recommended by Hyndman and Fan (1996). Each strategy differs slightly in how it calculates the index (e.g., using `np` or `(n+1)p`) and how it interpolates between points. For detailed mathematical descriptions of each estimation strategy, refer to the [Apache Commons Math Percentile documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html). See also [[quantile]], [[median-3]]
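For example (sketch; the `stats` alias is an assumption, and `:r7` is one of the documented Hyndman-Fan strategies):

```clojure
(require '[fastmath.stats :as stats])

(stats/median [5.0 1.0 3.0 2.0 4.0])
;; => 3.0

;; with an even count the estimation strategy decides the interpolation
(stats/median [1.0 2.0 3.0 4.0])      ;; default :legacy
(stats/median [1.0 2.0 3.0 4.0] :r7)
```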
(median-3 a b c)
Median of three values. See [[median]].
(median-absolute-deviation vs)
(median-absolute-deviation vs center-or-estimation-strategy)
(median-absolute-deviation vs center estimation-strategy)
Calculates the Median Absolute Deviation (MAD) of a sequence `vs`. MAD is a robust measure of the variability of a univariate sample of quantitative data. It is defined as the median of the absolute deviations from the data's median (or a specified center). `MAD = median(|X_i - median(X)|)` Parameters: - `vs`: Sequence of numbers. - `center-or-estimation-strategy` (optional): The central point from which to calculate deviations, or an estimation strategy. If `nil` or not provided, the [[median]] of `vs` is used as the center. If a keyword is provided, it is treated as the estimation strategy for the median. - `estimation-strategy` (optional, keyword): The estimation strategy to use for calculating the median(s). This applies to the calculation of the central value (if `center` is not provided) and to the final median of the absolute deviations. See [[median]] or [[quantile]] for available strategies (e.g., `:legacy`, `:r1` through `:r9`). Returns the calculated MAD as a double. MAD is less sensitive to outliers than the standard deviation. See also [[mean-absolute-deviation]], [[stddev]], [[median]], [[quantile]].
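A short hedged sketch (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; deviations from the median 3 are [2 1 0 1 2]; their median is 1
(stats/median-absolute-deviation [1 2 3 4 5])      ;; => 1.0

;; measure deviations from an explicit center instead of the median
(stats/median-absolute-deviation [1 2 3 4 5] 0.0)  ;; median of |x - 0| => 3.0
```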
(minimum vs)
Finds the minimum value in a sequence of numbers.
(minimum-discrimination-information-test contingency-table-or-xs)
(minimum-discrimination-information-test contingency-table-or-xs params)
Minimum discrimination information test, a power divergence test for `lambda` = -1.0. Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table. Usage: 1. **Goodness-of-Fit (GOF):** - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights). - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins. 2. **Test for Independence:** - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored. Options map: * `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values: * `1.0`: Pearson Chi-squared test ([[chisq-test]]). * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]). * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]). * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]). * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]). * `2/3`: Cressie-Read test (default, [[cressie-read-test]]). * `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests. * `:alpha` (double, default: `0.05`): Significance level for confidence intervals. * `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`). * `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`). * `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation. * `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom. * `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation. Returns a map containing: - `:stat`: The calculated power divergence test statistic. - `:chi2`: Alias for `:stat`. - `:df`: Degrees of freedom for the test. - `:p-value`: The p-value associated with the test statistic. - `:n`: Total number of observations. - `:estimate`: Observed proportions. - `:expected`: Expected counts or proportions under the null hypothesis. - `:confidence-interval`: Bootstrap confidence intervals for the observed proportions. - `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
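A hedged goodness-of-fit sketch for this power divergence family (the counts, expected probabilities, and `stats` alias are made up for illustration):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; observed die-roll counts tested against a fair-die hypothesis
(let [{:keys [stat df p-value]}
      (stats/minimum-discrimination-information-test
       [16 18 16 14 12 12]
       {:p [1/6 1/6 1/6 1/6 1/6 1/6]})]
  ;; under the null, stat approximately follows Chi-squared(df)
  [stat df p-value])
```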
(mode vs)
(mode vs method)
(mode vs method opts)
Find the value that appears most often in a dataset `vs`. If multiple values share the same highest frequency (or estimated density/histogram peak), this function returns only the *first* one encountered during processing. The specific mode returned in case of a tie is not guaranteed to be stable. Use [[modes]] if you need all tied modes. For samples potentially drawn from a continuous distribution, several estimation methods are provided via the `method` argument: * `:histogram`: Calculates the mode based on the peak of a histogram constructed from `vs`. Uses interpolation within the bin with the highest frequency. Accepts options via `opts`, primarily `:bins` to control histogram construction (see [[histogram]]). * `:pearson`: Estimates the mode using Pearson's second skewness coefficient formula: `mode ≈ 3 * median - 2 * mean`. Accepts `:estimation-strategy` in `opts` for median calculation (see [[median]]). * `:kde`: Estimates the mode by finding the original data point in `vs` with the highest estimated probability density, based on Kernel Density Estimation (KDE). Accepts KDE options in `opts` like `:kernel`, `:bandwidth`, etc. (passed to `fastmath.kernel.density/kernel-density`). * `:default` (or when `method` is omitted): Finds the exact value that occurs most frequently in `vs`. Suitable for discrete data. The optional `opts` map provides method-specific configuration. See also [[modes]] (returns all modes) and [[wmode]] (for weighted data).
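A brief sketch of the discrete and continuous variants (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

(stats/mode [1 2 2 3 3 3])  ;; => 3 (most frequent exact value)

;; for continuous samples, estimate the density peak instead
(def sample (repeatedly 200 rand))        ; hypothetical continuous data
(stats/mode sample :histogram {:bins 20}) ; histogram-peak estimate
(stats/mode sample :kde {:bandwidth 0.1}) ; KDE-based estimate
```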
(modes vs)
(modes vs method)
(modes vs method opts)
Find the values that appear most often in a dataset `vs`. Returns a sequence of all values tied for the highest frequency. For the default method (discrete data), modes are sorted in increasing order. For samples potentially drawn from a continuous distribution, simply finding the most frequent exact value might not be meaningful. Several estimation methods are provided via the `method` argument: * `:histogram`: Calculates the mode(s) based on the peak(s) of a histogram constructed from `vs`. Uses interpolation within the bin(s) with the highest frequency. Accepts options via `opts`, primarily `:bins` to control histogram construction (see [[histogram]]). * `:pearson`: Estimates the mode using Pearson's second skewness coefficient formula: `mode ≈ 3 * median - 2 * mean`. Accepts `:estimation-strategy` in `opts` for median calculation (see [[median]]). Returns a single estimated mode. * `:kde`: Estimates the mode(s) by finding the original data points in `vs` with the highest estimated probability density, based on Kernel Density Estimation (KDE). Accepts KDE options in `opts` like `:kernel`, `:bandwidth`, etc. (passed to `fastmath.kernel.density/kernel-density`). * `:default` (or when `method` is omitted): Finds the exact value(s) that occur most frequently in `vs`. Suitable for discrete data. The optional `opts` map provides method-specific configuration. See also [[mode]] (returns only the first mode) and [[wmodes]] (for weighted data).
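A one-line sketch of the tie behavior (data is illustrative):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; 1 and 2 tie for the highest count; both are returned, in increasing order
(stats/modes [1 1 2 2 3])  ;; => (1 2)
```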
(modified-power-transformation xs)
(modified-power-transformation xs lambda)
(modified-power-transformation xs lambda alpha)
Applies a modified power transformation (Bickel and Doksum) to data.
(moment vs)
(moment vs order)
(moment vs order {:keys [absolute? center mean? normalize?] :or {mean? true}})
Calculate moment (central and/or absolute) of given order (default: 2). Additional parameters as a map: * `:absolute?` - calculate sum as absolute values (default: `false`) * `:mean?` - returns mean (proper moment) or just sum of differences (default: `true`) * `:center` - value of center (default: `nil` = mean) * `:normalize?` - apply normalization by standard deviation to the order power
(mse [vs1 vs2-or-val])
(mse vs1 vs2-or-val)
Calculates the Mean Squared Error (MSE) between two sequences or a sequence and a constant value. MSE is a measure of the quality of an estimator or predictor. It quantifies the average of the squared differences between corresponding elements of the input sequences. Parameters: - `vs1` (sequence of numbers): The first sequence (often the observed or true values). - `vs2-or-val` (sequence of numbers or single number): The second sequence (often the predicted or reference values), or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Mean Squared Error as a double. Note: MSE penalizes larger errors more heavily than smaller errors because the errors are squared. This makes it sensitive to outliers. It is the average of the [[rss]] (Residual Sum of Squares). Its square root is the [[rmse]]. See also [[rss]] (Residual Sum of Squares), [[rmse]] (Root Mean Squared Error), [[me]] (Mean Error), [[mae]] (Mean Absolute Error), [[r2]] (Coefficient of Determination).
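A minimal sketch (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; squared errors are [0 0 1]; their mean is 1/3
(stats/mse [1.0 2.0 3.0] [1.0 2.0 4.0])  ;; => ~0.333

;; compare a sequence against a single constant value
(stats/mse [1.0 2.0 3.0] 2.0)            ;; squared errors [1 0 1] => ~0.667
```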
(multinomial-likelihood-ratio-test contingency-table-or-xs)
(multinomial-likelihood-ratio-test contingency-table-or-xs params)
Multinomial likelihood ratio test, a power divergence test for `lambda` = 0.0. Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table. Usage: 1. **Goodness-of-Fit (GOF):** - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights). - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins. 2. **Test for Independence:** - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored. Options map: * `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values: * `1.0`: Pearson Chi-squared test ([[chisq-test]]). * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]). * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]). * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]). * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]). * `2/3`: Cressie-Read test (default, [[cressie-read-test]]). * `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests. * `:alpha` (double, default: `0.05`): Significance level for confidence intervals. * `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`). * `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`). * `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation. * `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom. * `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation. Returns a map containing: - `:stat`: The calculated power divergence test statistic. - `:chi2`: Alias for `:stat`. - `:df`: Degrees of freedom for the test. - `:p-value`: The p-value associated with the test statistic. - `:n`: Total number of observations. - `:estimate`: Observed proportions. - `:expected`: Expected counts or proportions under the null hypothesis. - `:confidence-interval`: Bootstrap confidence intervals for the observed proportions. - `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
(neyman-modified-chisq-test contingency-table-or-xs)
(neyman-modified-chisq-test contingency-table-or-xs params)
Neyman modified chi-squared test, a power divergence test for `lambda` = -2.0. Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table. Usage: 1. **Goodness-of-Fit (GOF):** - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights). - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins. 2. **Test for Independence:** - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored. Options map: * `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values: * `1.0`: Pearson Chi-squared test ([[chisq-test]]). * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]). * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]). * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]). * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]). * `2/3`: Cressie-Read test (default, [[cressie-read-test]]). * `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests. * `:alpha` (double, default: `0.05`): Significance level for confidence intervals. * `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`). * `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`). * `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation. * `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom. * `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation. Returns a map containing: - `:stat`: The calculated power divergence test statistic. - `:chi2`: Alias for `:stat`. - `:df`: Degrees of freedom for the test. - `:p-value`: The p-value associated with the test statistic. - `:n`: Total number of observations. - `:estimate`: Observed proportions. - `:expected`: Expected counts or proportions under the null hypothesis. - `:confidence-interval`: Bootstrap confidence intervals for the observed proportions. - `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
(normality-test xs)
(normality-test xs params)
(normality-test xs skew kurt {:keys [sides] :or {sides :one-sided-greater}})
Performs the D'Agostino-Pearson K² omnibus test for normality. This test combines the results of the skewness and kurtosis tests to provide an overall assessment of whether the sample data deviates from a normal distribution in terms of either asymmetry or peakedness/tailedness. The test works by: 1. Calculating a normalized test statistic (Z₁) for skewness using [[skewness-test]]. 2. Calculating a normalized test statistic (Z₂) for kurtosis using [[kurtosis-test]]. 3. Combining these into an omnibus statistic: K² = Z₁² + Z₂². 4. Under the null hypothesis that the data comes from a normal distribution, K² approximately follows a Chi-squared distribution with 2 degrees of freedom. Parameters: - `xs` (seq of numbers): The sample data. - `skew` (double, optional): A pre-calculated skewness value (type `:g1` used by default in underlying test). - `kurt` (double, optional): A pre-calculated kurtosis value (type `:kurt` used by default in underlying test). - `params` (map, optional): Options map: - `:sides` (keyword, default `:one-sided-greater`): Specifies the side(s) of the Chi-squared(2) distribution used for p-value calculation. - `:one-sided-greater` (default and standard): Tests if K² is significantly large, indicating departure from normality in skewness, kurtosis, or both. - `:one-sided-less`: Tests if the K² statistic is significantly small. - `:two-sided`: Tests if the K² statistic is extreme in either tail. Returns a map containing: - `:Z`: The calculated K² omnibus test statistic (labeled `:Z` for consistency, though it follows Chi-squared(2)). - `:stat`: Alias for `:Z`. - `:p-value`: The p-value associated with the K² statistic and `:sides`. - `:skewness`: The sample skewness value used (either provided or calculated). - `:kurtosis`: The sample kurtosis value used (either provided or calculated). See also [[skewness-test]], [[kurtosis-test]], [[jarque-bera-test]].
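A hedged sketch (the uniform sample is a deliberately non-normal example; alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

(def xs (repeatedly 100 rand)) ; uniform data, so departures from normality are expected

(let [{:keys [stat p-value skewness kurtosis]} (stats/normality-test xs)]
  ;; a small p-value suggests rejecting the normality hypothesis
  [stat p-value skewness kurtosis])
```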
(omega-sq [group1 group2])
(omega-sq group1 group2)
(omega-sq group1 group2 degrees-of-freedom)
Calculates Omega squared (ω²), an effect size measure for the simple linear regression of `group1` on `group2`. Omega squared estimates the proportion of variance in the dependent variable (`group1`) that is accounted for by the independent variable (`group2`) in the population. It is considered a less biased alternative to [[r2-determination]]. Parameters: - `group1` (seq of numbers): The dependent variable. - `group2` (seq of numbers): The independent variable. Must have the same length as `group1`. - `degrees-of-freedom` (double, optional): The degrees of freedom for the regression model. Defaults to 1.0, which is standard for simple linear regression and used in the 2-arity version. Providing a different value allows calculating ω² for cases with multiple predictors if the sums of squares are computed for the overall model. Returns the calculated Omega squared value as a double. The value typically ranges from 0.0 to 1.0. Interpretation: - 0.0 indicates that `group2` explains none of the variance in `group1` in the population. - 1.0 indicates that `group2` perfectly explains the variance in `group1` in the population. Note: While often presented in the context of ANOVA, this implementation applies the formula to the sums of squares obtained from a simple linear regression between the two sequences. The 3-arity version allows specifying a custom degrees of freedom for regression, which might be relevant for calculating overall $\omega^2$ in multiple regression contexts (where `degrees-of-freedom` would be the number of predictors). See also [[eta-sq]] (Eta-squared, often based on $R^2$), [[epsilon-sq]] (another adjusted R²-like measure), [[r2-determination]] (R-squared).
(one-way-anova-test xss)
(one-way-anova-test xss {:keys [sides] :or {sides :one-sided-greater}})
Performs a one-way analysis of variance (ANOVA) test. ANOVA tests the null hypothesis that the means of two or more independent groups are equal. It assumes that the data within each group are normally distributed and have equal variances. Parameters: - `xss` (sequence of sequences): A collection where each element is a sequence representing a group of observations. - `params` (map, optional): Options map with the following key: - `:sides` (keyword, default `:one-sided-greater`): Alternative hypothesis side for the F-test. Possible values: `:one-sided-greater`, `:one-sided-less`, `:two-sided`. Returns a map containing: - `:F`: The F-statistic for the test. - `:stat`: Alias for `:F`. - `:p-value`: The p-value for the test. - `:df`: Degrees of freedom for the F-statistic ([DFt, DFe]). - `:n`: Sequence of sample sizes for each group. - `:SSt`: Sum of squares between groups (treatment). - `:SSe`: Sum of squares within groups (error). - `:DFt`: Degrees of freedom between groups. - `:DFe`: Degrees of freedom within groups. - `:MSt`: Mean square between groups. - `:MSe`: Mean square within groups. - `:sides`: Test side used.
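A minimal sketch with three hypothetical groups (alias and numbers are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

(def groups [[4.2 4.8 5.1 4.9]
             [5.6 5.9 6.1 5.8]
             [4.9 5.2 5.0 5.3]])

(let [{:keys [F p-value df]} (stats/one-way-anova-test groups)]
  ;; a large F with a small p-value argues against equal group means
  [F p-value df])
```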
(outer-fence-extent vs)
(outer-fence-extent vs estimation-strategy)
Returns LOF (lower outer fence), UOF (upper outer fence) and the median.
(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)
Find outliers defined as values outside the inner fences. Let Q1 be the 25th percentile and Q3 the 75th percentile; IQR is `(- Q3 Q1)`. * LIF (Lower Inner Fence) equals `(- Q1 (* 1.5 IQR))`. * UIF (Upper Inner Fence) equals `(+ Q3 (* 1.5 IQR))`. Returns a sequence of outliers. The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
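A quick sketch (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; 100 lies far above the upper inner fence (+ Q3 (* 1.5 IQR))
(stats/outliers [1 2 3 4 5 100])  ;; => (100) under the default strategy
```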
(p-overlap [group1 group2])
(p-overlap group1 group2)
(p-overlap group1
group2
{:keys [kde bandwidth min-iterations steps]
:or {kde :gaussian min-iterations 3 steps 500}})
Calculates the overlapping index between the estimated distributions of two samples using Kernel Density Estimation (KDE). This function estimates the probability density function (PDF) for `group1` and `group2` using KDE and then calculates the area of overlap between the two estimated PDFs. The area of overlap is the integral of the minimum of the two density functions. Parameters: - `group1` (seq of numbers): The first sample. - `group2` (seq of numbers): The second sample. - `opts` (map, optional): Options map for KDE and integration: - `:kde` (keyword, default `:gaussian`): The kernel function to use for KDE. See `fastmath.kernel.density/kernel-density+` for options. - `:bandwidth` (double, optional): The bandwidth for KDE. If omitted, it is automatically estimated. - `:min-iterations` (long, default 3): Minimum number of iterations for Romberg integration. - `:steps` (long, default 500): Number of steps (subintervals) for numerical integration over the relevant range. Returns the calculated overlapping index as a double, representing the area of overlap between the two estimated distributions. A value closer to 1 indicates greater overlap, while a value closer to 0 indicates less overlap. This measure quantifies the degree to which two distributions share common values and can be seen as a measure of similarity.
(p-value stat)
(p-value distribution stat)
(p-value distribution stat sides)
Calculates the p-value for a given test statistic based on a reference probability distribution. The p-value represents the probability of observing a test statistic as extreme as, or more extreme than, the provided `stat`, assuming the null hypothesis is true (where the null hypothesis implies `stat` follows the given `distribution`). Parameters: - `distribution` (distribution object, optional): The probability distribution object (from `fastmath.random`) that the test statistic follows under the null hypothesis. Defaults to the standard normal distribution (`fastmath.random/default-normal`) if omitted. - `stat` (double): The observed value of the test statistic. - `sides` (keyword, optional): Specifies the type of alternative hypothesis and how 'extremeness' is defined. Defaults to `:two-sided`. - `:two-sided` or `:both`: Alternative hypothesis is that the true parameter is different from the null value (tests for extremeness in either tail). Calculates `2 * min(CDF(stat), CCDF(stat))` (adjusted for discrete). - `:one-sided-greater` or `:right`: Alternative hypothesis is that the true parameter is greater than the null value (tests for extremeness in the right tail). Calculates `CCDF(stat)` (adjusted for discrete). - `:one-sided-less`, `:left`, or `:one-sided`: Alternative hypothesis is that the true parameter is less than the null value (tests for extremeness in the left tail). Calculates `CDF(stat)`. Note: For discrete distributions, a continuity correction (`stat - 1` for CCDF calculations) is applied when calculating right-tail or two-tail probabilities involving the upper tail. This ensures the probability mass *at* the statistic value is correctly accounted for. Returns the calculated p-value (a double between 0.0 and 1.0).
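A hedged sketch (aliases are assumptions; the chi-squared parameter key follows the usual `fastmath.random` convention and should be checked against that namespace's docs):

```clojure
(require '[fastmath.stats :as stats]   ; assumed namespace alias
         '[fastmath.random :as r])     ; assumed alias for fastmath.random

;; standard normal by default: two-sided p-value of z = 1.96
(stats/p-value 1.96)  ;; => ~0.05

;; right-tail p-value against an explicit distribution;
;; Chi-squared(2) has its upper 5% critical value near 5.99
(stats/p-value (r/distribution :chi-squared {:degrees-of-freedom 2})
               5.99 :one-sided-greater)  ;; => ~0.05
```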
(pacf data)
(pacf data lags)
Calculates the Partial Autocorrelation Function (PACF) for a given time series `data`. The PACF measures the linear dependence between a time series and its lagged values *after removing* the effects of the intermediate lags. It helps identify the direct relationship at each lag and is used to determine the order of autoregressive (AR) components in time series models (e.g., ARIMA). Parameters: * `data` (seq of numbers): The time series data. * `lags` (long, optional): The maximum lag for which to calculate the PACF. If omitted, calculates PACF for lags from 0 up to `(dec (count data))`. Returns a sequence of doubles representing the partial autocorrelation coefficients for the specified lags. The value at lag 0 is always 0.0. See also [[acf]], [[acf-ci]], [[pacf-ci]].
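A short sketch on synthetic AR(1)-like data (alias and generator are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

;; x_t = 0.7 * x_{t-1} + noise
(def series
  (reductions (fn [x _] (+ (* 0.7 x) (- (rand) 0.5))) 0.0 (range 200)))

;; 11 coefficients: lag 0 (always 0.0) through lag 10;
;; for AR(1) data only the lag-1 coefficient should stand out
(stats/pacf series 10)
```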
(pacf-ci data)
(pacf-ci data lags)
(pacf-ci data lags alpha)
Calculates the Partial Autocorrelation Function (PACF) for a time series and provides approximate confidence intervals. This function computes the PACF of the input time series `data` for specified lags (see [[pacf]]) and includes approximate confidence intervals around the PACF estimates. These intervals help determine whether the partial autocorrelation at a specific lag is statistically significant (i.e., likely non-zero in the population). Parameters: * `data` (seq of numbers): The time series data. * `lags` (long, optional): The maximum lag for which to calculate the PACF and CI. If omitted, calculates for lags up to `(dec (count data))`. * `alpha` (double, optional): The significance level for the confidence intervals. Defaults to `0.05` (for a 95% CI). Returns a map containing: * `:ci` (double): The value of the approximate standard confidence interval bound for lags > 0. If the absolute value of a PACF coefficient at lag `k > 0` exceeds this value, it is considered statistically significant. * `:pacf` (seq of doubles): The sequence of partial autocorrelation coefficients at lags from 0 up to `lags` (calculated using [[pacf]]). See also [[pacf]], [[acf]], [[acf-ci]].
(pearson-correlation [vs1 vs2])
(pearson-correlation vs1 vs2)
Calculates the Pearson product-moment correlation coefficient between two sequences. This function measures the linear relationship between two datasets. The coefficient value ranges from -1.0 (perfect negative linear correlation) to 1.0 (perfect positive linear correlation), with 0.0 indicating no linear correlation. Parameters: - `[vs1 vs2]` (sequence of two sequences): A sequence containing the two sequences of numbers. - `vs1`, `vs2` (sequences): The two sequences of numbers directly as arguments. Both input sequences must contain only numbers and must have the same length. Returns the calculated Pearson correlation coefficient as a double. Returns `NaN` if either sequence has zero variance (i.e., all elements are the same). See also [[correlation]] (general correlation, defaults to Pearson), [[spearman-correlation]], [[kendall-correlation]], [[correlation-matrix]].
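A minimal sketch (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

(stats/pearson-correlation [1 2 3 4] [2 4 6 8])  ;; => 1.0  (perfect positive linear)
(stats/pearson-correlation [1 2 3 4] [8 6 4 2])  ;; => -1.0 (perfect negative linear)
(stats/pearson-correlation [1 2 3 4] [5 5 5 5])  ;; => NaN  (zero-variance input)
```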
(pearson-r [group1 group2])
(pearson-r group1 group2)
Calculates the Pearson `r` correlation coefficient between two sequences. This function is an alias for [[pearson-correlation]]. See [[pearson-correlation]] for detailed documentation, parameters, and usage examples.
(percentile vs p)
(percentile vs p estimation-strategy)
Calculates the p-th percentile of a sequence `vs`. The percentile `p` is a value between 0 and 100, inclusive. An optional `estimation-strategy` keyword can be provided to specify the method used for estimating the percentile, particularly how interpolation is handled when the desired percentile falls between data points in the sorted sequence. Available `estimation-strategy` values: - `:legacy` (Default): The original method used in Apache Commons Math. - `:r1` through `:r9`: Correspond to the nine quantile estimation algorithms recommended by Hyndman and Fan (1996). Each strategy differs slightly in how it calculates the index (e.g., using `np` or `(n+1)p`) and how it interpolates between points. For detailed mathematical descriptions of each estimation strategy, refer to the [Apache Commons Math Percentile documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html). See also [[quantile]] (which uses a 0.0-1.0 range) and [[percentiles]].
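A brief sketch (alias and data are assumptions):

```clojure
(require '[fastmath.stats :as stats]) ; assumed namespace alias

(def vs [1 2 3 4 5 6 7 8 9 10])

(stats/percentile vs 25)      ; first quartile, default :legacy strategy
(stats/percentile vs 25 :r7)  ; same quartile under R's default type-7 rule
```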
(percentile-bc-extent vs)
(percentile-bc-extent vs p)
(percentile-bc-extent vs p1 p2)
(percentile-bc-extent vs p1 p2 estimation-strategy)
Return bias corrected percentile range and mean for bootstrap samples. See https://projecteuclid.org/euclid.ss/1032280214 `p` - calculates extent of bias corrected `p` and `100-p` (default: `p=2.5`) Set `estimation-strategy` to `:r7` to get the same result as in R `coxed::bca`.
(percentile-bca-extent vs)
(percentile-bca-extent vs p)
(percentile-bca-extent vs p1 p2)
(percentile-bca-extent vs p1 p2 estimation-strategy)
(percentile-bca-extent vs p1 p2 accel estimation-strategy)
Return bias corrected percentile range and mean for bootstrap samples. Also accounts for variance variations through the acceleration parameter. See https://projecteuclid.org/euclid.ss/1032280214 `p` - calculates extent of bias corrected `p` and `100-p` (default: `p=2.5`) Set `estimation-strategy` to `:r7` to get the same result as in R `coxed::bca`.
(percentile-extent vs)
(percentile-extent vs p)
(percentile-extent vs p1 p2)
(percentile-extent vs p1 p2 estimation-strategy)
Return percentile range and median. `p` - calculates extent of `p` and `100-p` (default: `p=25`)
(percentiles vs)
(percentiles vs ps)
(percentiles vs ps estimation-strategy)
Calculates the sequence of p-th percentiles of a sequence `vs`. Percentiles `ps` is a sequence of values between 0 and 100, inclusive. An optional `estimation-strategy` keyword can be provided to specify the method used for estimating the percentile, particularly how interpolation is handled when the desired percentile falls between data points in the sorted sequence. Available `estimation-strategy` values: - `:legacy` (Default): The original method used in Apache Commons Math. - `:r1` through `:r9`: Correspond to the nine quantile estimation algorithms recommended by Hyndman and Fan (1996). Each strategy differs slightly in how it calculates the index (e.g., using `np` or `(n+1)p`) and how it interpolates between points. For detailed mathematical descriptions of each estimation strategy, refer to the [Apache Commons Math Percentile documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html). See also [[quantiles]] (which uses a 0.0-1.0 range) and [[percentile]].
(pi vs)
(pi vs size)
(pi vs size estimation-strategy)
Returns the prediction interval (PI) as a map, with quantile intervals based on the interval `size`. Quantiles are `(1-size)/2` and `1-(1-size)/2`.
(pi-extent vs)
(pi-extent vs size)
(pi-extent vs size estimation-strategy)
Returns the prediction interval (PI) extent, quantile intervals based on the interval `size`, plus the median. Quantiles are `(1-size)/2` and `1-(1-size)/2`.
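A brief usage sketch (the interval size 0.89 is an arbitrary illustrative choice):

```clojure
(require '[fastmath.stats :as stats])

(def vs (range 1 101))

;; 89% interval: quantiles at (1-0.89)/2 = 0.055 and 0.945
(stats/pi vs 0.89)

;; Same interval, returned as an extent together with the median
(stats/pi-extent vs 0.89)
```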
(pooled-mad groups)
(pooled-mad groups const)
Calculate pooled median absolute deviation for samples. `const` is a scaling constant, which defaults to approximately 1.4826.
(pooled-stddev groups)
(pooled-stddev groups method)
Calculate pooled standard deviation for samples using the given method. Methods: * `:unbiased` - sqrt of weighted average of variances (default) * `:biased` - biased version of `:unbiased`, no count correction. * `:avg` - sqrt of average of variances
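A minimal sketch comparing the three methods (group data is illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def groups [[1.0 2.0 3.0 4.0] [2.0 4.0 6.0] [5.0 5.5 6.0 6.5 7.0]])

(stats/pooled-stddev groups)          ;; :unbiased, weighted by group sizes
(stats/pooled-stddev groups :biased)  ;; same but without count correction
(stats/pooled-stddev groups :avg)     ;; sqrt of the plain average of variances
```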
(pooled-variance groups)
(pooled-variance groups method)
Calculate pooled variance for samples using the given method. Methods: * `:unbiased` - weighted average of variances (default) * `:biased` - biased version of `:unbiased`, no count correction. * `:avg` - average of variances
(population-stddev vs)
(population-stddev vs mu)
Calculate population standard deviation of `vs`. See [[stddev]].
(population-variance vs)
(population-variance vs mu)
Calculate population variance of `vs`. See [[variance]].
(population-wstddev vs weights)
Calculate population weighted standard deviation of `vs`.
(population-wvariance vs freqs)
Calculate weighted population variance of `vs`.
(power-divergence-test contingency-table-or-xs)
(power-divergence-test contingency-table-or-xs
{:keys [lambda ci-sides sides p alpha bootstrap-samples
ddof bins]
:or {lambda m/TWO_THIRD
sides :one-sided-greater
ci-sides :two-sided
alpha 0.05
bootstrap-samples 1000
ddof 0}})
Performs a power divergence test, which encompasses several common statistical tests like Chi-squared, G-test (likelihood ratio), etc., based on the lambda parameter. This function can perform either a goodness-of-fit test or a test for independence in a contingency table. Usage: 1. **Goodness-of-Fit (GOF):** - Input: `observed-counts` (sequence of numbers) and `:p` (expected probabilities/weights). - Input: `data` (sequence of numbers) and `:p` (a distribution object). In this case, a histogram of `data` is created (controlled by `:bins`) and compared against the probability mass/density of the distribution in those bins. 2. **Test for Independence:** - Input: `contingency-table` (2D sequence or map format). The `:p` option is ignored. Options map: * `:lambda` (double, default: `2/3`): Determines the specific test statistic. Common values: * `1.0`: Pearson Chi-squared test ([[chisq-test]]). * `0.0`: G-test / Multinomial Likelihood Ratio test ([[multinomial-likelihood-ratio-test]]). * `-0.5`: Freeman-Tukey test ([[freeman-tukey-test]]). * `-1.0`: Minimum Discrimination Information test ([[minimum-discrimination-information-test]]). * `-2.0`: Neyman Modified Chi-squared test ([[neyman-modified-chisq-test]]). * `2/3`: Cressie-Read test (default, [[cressie-read-test]]). * `:p` (seq of numbers or distribution): Expected probabilities/weights (for GOF with counts) or a `fastmath.random` distribution object (for GOF with data). Ignored for independence tests. * `:alpha` (double, default: `0.05`): Significance level for confidence intervals. * `:ci-sides` (keyword, default: `:two-sided`): Sides for bootstrap confidence intervals (`:two-sided`, `:one-sided-greater`, `:one-sided-less`). * `:sides` (keyword, default: `:one-sided-greater`): Alternative hypothesis side for the p-value calculation against the Chi-squared distribution (`:one-sided-greater`, `:one-sided-less`, `:two-sided`). * `:bootstrap-samples` (long, default: `1000`): Number of bootstrap samples for confidence interval estimation. * `:ddof` (long, default: `0`): Delta degrees of freedom. Adjustment subtracted from the calculated degrees of freedom. * `:bins` (number, keyword, or seq): Used only for GOF test against a distribution. Specifies the number of bins, an estimation method (see [[histogram]]), or explicit bin edges for histogram creation. Returns a map containing: - `:stat`: The calculated power divergence test statistic. - `:chi2`: Alias for `:stat`. - `:df`: Degrees of freedom for the test. - `:p-value`: The p-value associated with the test statistic. - `:n`: Total number of observations. - `:estimate`: Observed proportions. - `:expected`: Expected counts or proportions under the null hypothesis. - `:confidence-interval`: Bootstrap confidence intervals for the observed proportions. - `:lambda`, `:alpha`, `:sides`, `:ci-sides`: Input options used.
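A goodness-of-fit sketch under assumed illustrative counts; the `:lambda` values follow the table above:

```clojure
(require '[fastmath.stats :as stats])

;; Goodness-of-fit: observed die-roll counts against a fair-die hypothesis
(def observed [18 24 23 17 22 16])
(def fair (repeat 6 1/6))

;; Default Cressie-Read statistic (lambda = 2/3)
(stats/power-divergence-test observed {:p fair})

;; Pearson Chi-squared (lambda = 1.0) and G-test (lambda = 0.0) variants
(:p-value (stats/power-divergence-test observed {:lambda 1.0 :p fair}))
(:p-value (stats/power-divergence-test observed {:lambda 0.0 :p fair}))
```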
(power-transformation xs)
(power-transformation xs lambda)
(power-transformation xs lambda alpha)
Applies a power transformation to data.
(powmean vs power)
(powmean vs weights power)
Calculates the generalized power mean (also known as the Hölder mean) of a sequence `vs`. The power mean is a generalization of the Pythagorean means (arithmetic, geometric, harmonic) and other means like the quadratic mean (RMS). It is defined for a non-zero real number `power`. Parameters: - `vs`: Sequence of numbers. Constraints depend on the `power`: - For `power > 0`, values should be non-negative. - For `power = 0`, values must be positive (reduces to geometric mean). - For `power < 0`, values must be positive and non-zero. - `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`. - `power` (double): The exponent defining the mean. Special Cases: - `power = 0`: Returns the [[geomean]]. - `power = 1`: Returns the arithmetic [[mean]]. - `power = -1`: Equivalent to the [[harmean]]. (Handled by the general formula) - `power = 2`: Returns the Root Mean Square (RMS) or quadratic mean. - `power = inf`: Returns maximum. - `power = -inf`: Returns minimum. - The implementation includes optimized paths for `power` values 1/3, 0.5, 2, and 3. Returns the calculated power mean as a double. See also [[mean]], [[geomean]], [[harmean]].
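The special cases above can be checked directly; a minimal sketch with illustrative data:

```clojure
(require '[fastmath.stats :as stats])

(def vs [1.0 2.0 4.0 8.0])

(stats/powmean vs 1.0)   ;; arithmetic mean => 3.75
(stats/powmean vs 0.0)   ;; geometric mean => 64^(1/4) ≈ 2.828
(stats/powmean vs -1.0)  ;; harmonic mean => 32/15 ≈ 2.133
(stats/powmean vs 2.0)   ;; quadratic mean (RMS)
```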
(psnr [vs1 vs2-or-val])
(psnr vs1 vs2-or-val)
(psnr vs1 vs2-or-val max-value)
Peak Signal-to-Noise Ratio (PSNR). PSNR is a measure used to quantify the quality of reconstruction of lossy compression codecs (e.g., for images or video). It is calculated using the Mean Squared Error (MSE) between the original and compressed images/signals. A higher PSNR generally indicates a higher quality signal reconstruction (i.e., less distortion). Parameters: - `vs1` (sequence of numbers): The first sequence (conventionally, the original or reference signal/data). - `vs2-or-val` (sequence of numbers or single number): The second sequence (conventionally, the reconstructed or noisy signal/data), or a single number to compare against each element of `vs1`. - `max-value` (optional, double): The maximum possible value of a sample in the data. If not provided, the function automatically determines the maximum value present across both input sequences (`vs1` and `vs2` if a sequence, or `vs1` and the scalar value if `vs2-or-val` is a number). Providing an explicit `max-value` is often more appropriate based on the data type's theoretical maximum range (e.g., 255 for 8-bit). If `vs2-or-val` is a sequence, both `vs1` and `vs2` must have the same length. Returns the calculated Peak Signal-to-Noise Ratio as a double. Returns `-Double/Infinity` if the MSE is zero (perfect match). Returns `NaN` if MSE is non-positive. See also [[mse]], [[rmse]].
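A minimal sketch (the 8-bit samples are illustrative):

```clojure
(require '[fastmath.stats :as stats])

;; Hypothetical 8-bit signal and a slightly degraded reconstruction
(def original [52.0 55.0 61.0 66.0 70.0 61.0 64.0 73.0])
(def degraded [51.0 56.0 60.0 67.0 69.0 63.0 64.0 72.0])

;; Peak value inferred from the data...
(stats/psnr original degraded)

;; ...or fixed to the theoretical 8-bit maximum
(stats/psnr original degraded 255.0)
```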
(quantile vs q)
(quantile vs q estimation-strategy)
Calculates the q-th quantile of a sequence `vs`. The quantile `q` is a value between 0.0 and 1.0, inclusive. An optional `estimation-strategy` keyword can be provided to specify the method used for estimating the quantile, particularly how interpolation is handled when the desired quantile falls between data points in the sorted sequence. Available `estimation-strategy` values: - `:legacy` (Default): The original method used in Apache Commons Math. - `:r1` through `:r9`: Correspond to the nine quantile estimation algorithms recommended by Hyndman and Fan (1996). Each strategy differs slightly in how it calculates the index (e.g., using `np` or `(n+1)p`) and how it interpolates between points. For detailed mathematical descriptions of each estimation strategy, refer to the [Apache Commons Math Percentile documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html). See also [[percentile]] (which uses a 0-100 range) and [[quantiles]].
(quantile-extent vs)
(quantile-extent vs q)
(quantile-extent vs q1 q2)
(quantile-extent vs q1 q2 estimation-strategy)
Return quantile range and median. `q` - calculates extent of `q` and `1.0-q` (default: `q=0.25`)
(quantiles vs)
(quantiles vs qs)
(quantiles vs qs estimation-strategy)
Calculates the sequence of q-th quantiles of a sequence `vs`. Quantiles `qs` is a sequence of values between 0.0 and 1.0, inclusive. An optional `estimation-strategy` keyword can be provided to specify the method used for estimating the quantile, particularly how interpolation is handled when the desired quantile falls between data points in the sorted sequence. Available `estimation-strategy` values: - `:legacy` (Default): The original method used in Apache Commons Math. - `:r1` through `:r9`: Correspond to the nine quantile estimation algorithms recommended by Hyndman and Fan (1996). Each strategy differs slightly in how it calculates the index (e.g., using `np` or `(n+1)p`) and how it interpolates between points. For detailed mathematical descriptions of each estimation strategy, refer to the [Apache Commons Math Percentile documentation](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html). See also [[percentiles]] (which uses a 0-100 range) and [[quantile]].
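A short sketch showing the 0-1 versus 0-100 scales (illustrative data):

```clojure
(require '[fastmath.stats :as stats])

(def vs [1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0])

;; quantiles works on the 0.0-1.0 scale...
(stats/quantiles vs [0.25 0.5 0.75] :r7)

;; ...and agrees with percentiles on the 0-100 scale
(stats/percentiles vs [25 50 75] :r7)
```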
(r2 [vs1 vs2-or-val])
(r2 vs1 vs2-or-val)
(r2 vs1 vs2-or-val no-of-variables)
Calculates the Coefficient of Determination ($R^2$) or adjusted version between two sequences or a sequence and a constant value. $R^2$ is a statistical measure that represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s) in a statistical model. It indicates how well the model fits the observed data. The standard $R^2$ is calculated as $1 - (RSS / TSS)$, where: - $RSS$ (Residual Sum of Squares) is the sum of the squared differences between the observed values (`vs1`) and the predicted/reference values (`vs2` or `vs2-or-val`). See [[rss]]. - $TSS$ (Total Sum of Squares) is the sum of the squared differences between the observed values (`vs1`) and their mean. This is calculated using [[moment]] of order 2 with `:mean?` set to `false`. This function has two arities: 1. `(r2 vs1 vs2-or-val)`: Calculates the standard $R^2$. - `vs1` (seq of numbers): The sequence of observed or actual values. - `vs2-or-val` (seq of numbers or single number): The sequence of predicted or reference values, or a single constant value to compare against. Returns the calculated standard $R^2$ as a double. For simple linear regression, this is equal to the square of the Pearson correlation coefficient ([[r2-determination]]). $R^2$ typically ranges from 0 to 1 in this context, but can be negative if the chosen model fits the data worse than a horizontal line through the mean of the observed data. 2. `(r2 vs1 vs2-or-val no-of-variables)`: Calculates the **Adjusted $R^2$**. The adjusted $R^2$ is a modified version of $R^2$ that has been adjusted for the number of predictors in the model. It increases only if the new term improves the model more than would be expected by chance. The formula for adjusted $R^2$ is: $$ R^2_{adj} = 1 - (1 - R^2) \frac{n-1}{n-p-1} $$ where $n$ is the number of observations (length of `vs1`) and $p$ is the number of independent variables (`no-of-variables`). - `vs1` (seq of numbers): The sequence of observed or actual values. - `vs2-or-val` (seq of numbers or single number): The sequence of predicted or reference values, or a single constant value to compare against. - `no-of-variables` (double): The number of independent variables ($p$) used in the model that produced the `vs2-or-val` predictions. Returns the calculated adjusted $R^2$ as a double. Both `vs1` and `vs2` (if `vs2-or-val` is a sequence) must have the same length. See also [[rss]], [[mse]], [[rmse]], [[pearson-correlation]], [[r2-determination]].
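A minimal sketch of both arities (the observed/predicted pairs are illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def observed  [2.1 3.9 6.2 8.0 9.8])
(def predicted [2.0 4.0 6.0 8.0 10.0])

;; Standard R^2
(stats/r2 observed predicted)

;; Adjusted R^2 for a model with one predictor (p = 1)
(stats/r2 observed predicted 1)
```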
(r2-determination [group1 group2])
(r2-determination group1 group2)
Calculates the Coefficient of Determination ($R^2$) between two sequences. This function computes the square of the Pearson product-moment correlation coefficient ([[pearson-correlation]]) between `group1` and `group2`. $R^2$ measures the proportion of the variance in one variable that is predictable from the other variable in a linear relationship. For a simple linear regression with one independent variable, this value is equivalent to the $R^2$ calculated from the Residual Sum of Squares (RSS) and Total Sum of Squares (TSS). Parameters: - `group1` (seq of numbers): The first sequence. - `group2` (seq of numbers): The second sequence. Both sequences must have the same length. Returns the calculated $R^2$ value (a double between 0.0 and 1.0) as a double. Returns `NaN` if the Pearson correlation cannot be calculated (e.g., one sequence is constant). See also [[r2]] (for general $R^2$ and adjusted $R^2$), [[pearson-correlation]].
(rank-epsilon-sq xss)
Calculates Rank Epsilon-squared (ε²), a measure of effect size for the Kruskal-Wallis H-test. Rank Epsilon-squared is a non-parametric measure quantifying the proportion of the total variability (based on ranks) in the dependent variable that is associated with group membership (the independent variable). It is analogous to Eta-squared or Epsilon-squared in one-way ANOVA but used for the rank-based Kruskal-Wallis test. This function calculates Epsilon-squared based on the Kruskal-Wallis H statistic (`H`) and the total number of observations (`n`) across all groups. Parameters: - `xss` (sequence of sequences): A collection where each element is a sequence representing a group of observations, as used in [[kruskal-test]]. Returns the calculated Rank Epsilon-squared value as a double, ranging from 0 to 1. Interpretation: - A value of 0 indicates no difference in the distributions across groups. - A value closer to 1 indicates that a large proportion of the variability is due to differences between group ranks. Rank Epsilon-squared is a useful supplement to the Kruskal-Wallis test, providing a measure of the magnitude of the group effect that is not sensitive to assumptions about the data distribution shape (beyond having similar shapes for valid interpretation of the Kruskal-Wallis test itself). See also [[kruskal-test]], [[rank-eta-sq]] (another rank-based effect size).
(rank-eta-sq xss)
Calculates the Rank Eta-squared (η²), an effect size measure for the Kruskal-Wallis H-test. Rank Eta-squared is a non-parametric measure quantifying the proportion of the total variability (based on ranks) in the dependent variable that is associated with group membership (the independent variable). It is analogous to Eta-squared in one-way ANOVA but used for the rank-based Kruskal-Wallis test. The statistic is calculated based on the Kruskal-Wallis H statistic, the number of groups (`k`), and the total number of observations (`n`). Parameters: - `xss` (sequence of sequences): A collection where each element is a sequence representing a group of observations, as used in [[kruskal-test]]. Returns the calculated Rank Eta-squared value as a double, ranging from 0 to 1. Interpretation: - A value of 0 indicates no difference in the distributions across groups (all variability is within groups). - A value closer to 1 indicates that a large proportion of the variability is due to differences between group ranks. Rank Eta-squared is a useful supplement to the Kruskal-Wallis test, providing a measure of the magnitude of the group effect that is not sensitive to assumptions about the data distribution shape (beyond having similar shapes for valid interpretation of the Kruskal-Wallis test itself). See also [[kruskal-test]], [[rank-epsilon-sq]] (another rank-based effect size).
(remove-outliers vs)
(remove-outliers vs estimation-strategy)
(remove-outliers vs q1 q3)
Remove outliers defined as values outside inner fences. Let Q1 be the 25th percentile and Q3 the 75th percentile; IQR is `(- Q3 Q1)`. * LIF (Lower Inner Fence) equals `(- Q1 (* 1.5 IQR))`. * UIF (Upper Inner Fence) equals `(+ Q3 (* 1.5 IQR))`. Returns a sequence without outliers. Optional `estimation-strategy` argument can be set to change the quantile estimation type. See [[estimation-strategies]].
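A minimal sketch (illustrative data with one obvious outlier):

```clojure
(require '[fastmath.stats :as stats])

(def vs [1 2 3 4 5 6 7 8 9 100])

;; 100 sits far above the upper inner fence, so it is removed
(stats/remove-outliers vs)

;; Fence positions depend on how Q1/Q3 are estimated
(stats/remove-outliers vs :r7)
```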
(rescale vs)
(rescale vs low high)
Linearly rescale data to a desired range, `[0,1]` by default.
(rmse [vs1 vs2-or-val])
(rmse vs1 vs2-or-val)
Calculates the Root Mean Squared Error (RMSE) between two sequences or a sequence and a constant value. RMSE is the square root of the [[mse]] (Mean Squared Error). It represents the standard deviation of the residuals (prediction errors) and has the same units as the original data, making it more interpretable than MSE. It measures the average magnitude of the errors, penalizing larger errors more than smaller ones due to the squaring involved. Parameters: - `vs1` (sequence of numbers): The first sequence (often the observed or true values). - `vs2-or-val` (sequence of numbers or single number): The second sequence (often the predicted or reference values), or a single number to compare against each element of `vs1`. If both inputs are sequences, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Root Mean Squared Error as a double. See also [[mse]] (Mean Squared Error), [[rss]] (Residual Sum of Squares), [[me]] (Mean Error), [[mae]] (Mean Absolute Error), [[r2]] (Coefficient of Determination).
(robust-standardize vs)
(robust-standardize vs q)
Normalize samples to have median = 0 and MAD = 1. If `q` argument is used, scaling is done by quantile difference (Q_q, Q_(1-q)). Set 0.25 for IQR.
(rows->contingency-table xss)
Converts a sequence of sequences (representing rows of counts) into a map-based contingency table. This function takes a collection where each inner sequence is treated as a row of counts in a grid or matrix. It transforms this matrix representation into a map where keys are `[row-index, column-index]` tuples and values are the non-zero counts at that intersection. This is particularly useful for converting structured count data, like the output of some grouping or tabulation processes, into a format suitable for functions expecting a contingency table map (like `contingency-table->marginals` or chi-squared tests). Parameters: - `xss` (sequence of sequences of numbers): A collection where each inner sequence `xs_i` contains counts for row `i`. Values within `xs_i` are interpreted as counts for columns `0, 1, ...`. Returns a map where keys are `[row-index, column-index]` vectors and values are the corresponding non-zero counts from the input matrix. Zero counts are omitted from the output map. See also [[contingency-table]] (for building tables from raw data), [[contingency-table->marginals]].
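A minimal sketch; the expected shape of the result follows the description above (entry order in the map may vary):

```clojure
(require '[fastmath.stats :as stats])

;; Rows of counts; the zero cell is omitted from the resulting map
(stats/rows->contingency-table [[10 20] [30 0]])
;; => {[0 0] 10, [0 1] 20, [1 0] 30}
```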
(rss [vs1 vs2-or-val])
(rss vs1 vs2-or-val)
Calculates the Residual Sum of Squares (RSS) between two sequences or a sequence and a constant value. RSS is a measure of the discrepancy between data and a model, often used in regression analysis to quantify the total squared difference between observed values and predicted (or reference) values. Parameters: - `vs1` (sequence of numbers): The first sequence (often observed values). - `vs2-or-val` (sequence of numbers or single number): The second sequence (often predicted or reference values), or a single number to compare against each element of `vs1`. If both sequences (`vs1` and `vs2`) are provided, they must have the same length. If `vs2-or-val` is a single number, it is effectively treated as a sequence of that number repeated `count(vs1)` times. Returns the calculated Residual Sum of Squares as a double. See also [[mse]] (Mean Squared Error), [[rmse]] (Root Mean Squared Error), [[r2]] (Coefficient of Determination).
(sem vs)
Calculates the Standard Error of the Mean (SEM) for a sequence `vs`. The SEM estimates the standard deviation of the sample mean, providing an indication of how accurately the sample mean represents the population mean. It is calculated as: `SEM = stddev(vs) / sqrt(count(vs))` where `stddev(vs)` is the sample standard deviation and `count(vs)` is the sample size. Parameters: - `vs`: Sequence of numbers. Returns the calculated SEM as a double. A smaller SEM indicates that the sample mean is likely to be a more precise estimate of the population mean. See also [[stddev]], [[mean]].
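The formula can be verified directly; a minimal sketch with illustrative data:

```clojure
(require '[fastmath.stats :as stats])

(def vs [1.0 2.0 3.0 4.0 5.0])

(stats/sem vs)
;; equivalent to stddev / sqrt(n): here sqrt(2.5) / sqrt(5) ≈ 0.7071
(/ (stats/stddev vs) (Math/sqrt (count vs)))
```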
(similarity method P-observed Q-expected)
(similarity method
P-observed
Q-expected
{:keys [bins probabilities? epsilon]
:or {probabilities? true epsilon 1.0E-6}})
Various PDF similarities between two histograms (frequencies) or probabilities. Q can be a distribution object; in that case, a histogram is created from P. Arguments: * `method` - similarity method * `P-observed` - frequencies, probabilities or actual data (when Q is a distribution) * `Q-expected` - frequencies, probabilities or distribution object (when P is data) Options: * `:probabilities?` - should P/Q be converted to probabilities, default: `true`. * `:epsilon` - small number which replaces `0.0` when division or logarithm is used. * `:bins` - number of bins or bins estimation method, see [[histogram]]. The list of methods: `:intersection`, `:czekanowski`, `:motyka`, `:kulczynski`, `:ruzicka`, `:inner-product`, `:harmonic-mean`, `:cosine`, `:jaccard`, `:dice`, `:fidelity`, `:squared-chord`. See more: Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions by Sung-Hyuk Cha
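A minimal sketch (the frequencies are illustrative; only methods and options documented above are used):

```clojure
(require '[fastmath.stats :as stats])

(def P [10 20 30 40])  ;; observed frequencies
(def Q [15 15 35 35])  ;; expected frequencies

;; Frequencies are normalized to probabilities by default
(stats/similarity :cosine P Q)
(stats/similarity :jaccard P Q)

;; Compare the raw counts instead
(stats/similarity :intersection P Q {:probabilities? false})
```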
(skewness vs)
(skewness vs typ)
Calculate skewness from sequence, a measure of the asymmetry of the probability distribution about its mean. Parameters: - `vs` (seq of numbers): The input sequence. - `typ` (keyword or sequence, optional): Specifies the type of skewness measure to calculate. Defaults to `:G1`. Available `typ` values: - `:G1` (Default): Sample skewness based on the third standardized moment, as implemented by Apache Commons Math `Skewness`. Adjusted for sample size bias. - `:g1` or `:pearson`: Pearson's moment coefficient of skewness (g1), a bias-adjusted version of the third standardized moment. Expected value 0 for symmetric distributions. - `:b1`: Sample skewness coefficient (b1), related to `:g1`. - `:B1` or `:yule`: Yule's coefficient (robust), based on quantiles. Takes an optional quantile `u` (default 0.25) via sequence `[:B1 u]` or `[:yule u]`. - `:B3`: Robust measure comparing the mean and median relative to the mean absolute deviation around the median. - `:skew`: An adjusted skewness definition sometimes used in bootstrap (BCa) calculations. - `:mode`: Pearson's second skewness coefficient: `(mean - mode) / stddev`. Requires calculating the mode. Mode calculation method can be specified via sequence `[:mode method opts]`, see [[mode]]. - `:median`: Robust measure: `3 * (mean - median) / stddev`. - `:bowley`: Bowley's coefficient (robust), based on quartiles (Q1, Q2, Q3). Also known as Yule-Bowley coefficient. Calculated as `(Q3 + Q1 - 2*Q2) / (Q3 - Q1)`. - `:hogg`: Hogg's robust measure based on the ratio of differences between trimmed means. - `:l-skewness`: L-skewness (τ₃), the ratio of the 3rd L-moment (λ₃) to the 2nd L-moment (λ₂, L-scale). Calculated directly using [[l-moment]] with the `:ratio?` option set to true. It's a robust measure of asymmetry. Expected value 0 for symmetric distributions. Interpretation: - Positive values generally indicate a distribution skewed to the right (tail is longer on the right). - Negative values generally indicate a distribution skewed to the left (tail is longer on the left). - Values near 0 suggest relative symmetry. Returns the calculated skewness value as a double. See also [[skewness-test]], [[normality-test]], [[jarque-bera-test]], [[l-moment]].
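A short sketch of several `typ` variants (the sample data is illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def vs [1 1 2 2 2 3 3 4 5 9])  ;; right-skewed sample

(stats/skewness vs)              ;; default :G1, positive for this sample
(stats/skewness vs :bowley)      ;; robust, quartile-based
(stats/skewness vs [:B1 0.1])    ;; Yule's coefficient with u = 0.1
(stats/skewness vs :l-skewness)  ;; L-moment ratio τ₃
```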
(skewness-test xs)
(skewness-test xs params)
(skewness-test xs skew {:keys [sides type] :or {sides :two-sided type :g1}})
Performs the D'Agostino test for normality based on sample skewness. This test assesses the null hypothesis that the data comes from a normally distributed population by checking if the sample skewness significantly deviates from the zero skewness expected under normality. The test works by: 1. Calculating the sample skewness (type configurable via `:type`, default `:g1`). 2. Standardizing the sample skewness relative to its expected value (0) and standard error under the null hypothesis. 3. Applying a further transformation (inverse hyperbolic sine based) to this standardized score to yield a final test statistic `Z` that more closely follows a standard normal distribution under the null hypothesis. Parameters: - `xs` (seq of numbers): The sample data. - `skew` (double, optional): A pre-calculated skewness value. If omitted, it's calculated from `xs`. - `params` (map, optional): Options map: - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis. - `:two-sided` (default): The population skewness is different from 0. - `:one-sided-greater`: The population skewness is greater than 0 (right-skewed). - `:one-sided-less`: The population skewness is less than 0 (left-skewed). - `:type` (keyword, default `:g1`): The type of skewness to calculate if `skew` is not provided. Note that the internal normalization constants are derived based on `:g1`. See [[skewness]] for options. Returns a map containing: - `:Z`: The final test statistic, approximately standard normal under H0. - `:stat`: Alias for `:Z`. - `:p-value`: The p-value associated with `Z` and the specified `:sides`. - `:skewness`: The sample skewness value used in the test (either provided or calculated). See also [[kurtosis-test]], [[normality-test]], [[jarque-bera-test]].
(span vs)
Width of the sample: maximum value minus minimum value.
(spearman-correlation [vs1 vs2])
(spearman-correlation vs1 vs2)
Calculates Spearman's rank correlation coefficient between two sequences. Spearman's rank correlation is a non-parametric measure of the monotonic relationship between two datasets. It assesses how well the relationship between two variables can be described using a monotonic function. It does not require the data to be linearly related or follow a specific distribution. The coefficient is calculated on the ranks of the data rather than the raw values. Parameters: - `[vs1 vs2]` (sequence of two sequences): A sequence containing the two sequences of numbers. - `vs1`, `vs2` (sequences): The two sequences of numbers directly as arguments. Both sequences must have the same length. Returns the calculated Spearman rank correlation coefficient (a value between -1.0 and 1.0) as a double. A value of 1 indicates a perfect monotonic increasing relationship, -1 a perfect monotonic decreasing relationship, and 0 no monotonic relationship. See also [[pearson-correlation]], [[kendall-correlation]], [[correlation]].
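A minimal sketch contrasting monotonic and linear association (illustrative data):

```clojure
(require '[fastmath.stats :as stats])

(def xs [1.0 2.0 3.0 4.0 5.0])
(def ys (map #(* % % %) xs))  ;; monotonic but non-linear (cubes)

(stats/spearman-correlation xs ys)  ;; => 1.0, perfect monotonic relationship
(stats/pearson-correlation xs ys)   ;; < 1.0, since the relation is not linear
```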
(standardize vs)
Normalize samples to have mean = 0 and stddev = 1.
(stats-map vs)
(stats-map vs estimation-strategy)
Calculates a comprehensive set of descriptive statistics for a numerical dataset. This function computes various summary measures and returns them as a map, providing a quick overview of the data's central tendency, dispersion, shape, and potential outliers. Parameters: - `vs` (seq of numbers): The input sequence of numerical data. - `estimation-strategy` (keyword, optional): Specifies the method for calculating quantiles (including median, quartiles, and values used for fences). Defaults to `:legacy`. See [[percentile]] or [[quantile]] for available strategies (e.g., `:r1` through `:r9`). Returns a map where keys are statistic names (as keywords) and values are their calculated measures: - `:Size`: The number of data points in the sequence (count). - `:Min`: The minimum value (see [[minimum]]). - `:Max`: The maximum value (see [[maximum]]). - `:Range`: The difference between the maximum and minimum values (Max - Min). - `:Mean`: The arithmetic average (see [[mean]]). - `:Median`: The middle value (see [[median]] with `estimation-strategy`). - `:Mode`: The most frequent value (see [[mode]] with default method). - `:Q1`: The first quartile (25th percentile) (see [[percentile]] with `estimation-strategy`). - `:Q3`: The third quartile (75th percentile) (see [[percentile]] with `estimation-strategy`). - `:Total`: The sum of all values (see [[sum]]). - `:SD`: The sample standard deviation (see [[stddev]]). - `:Variance`: The sample variance (SD^2, see [[variance]]). - `:MAD`: The Median Absolute Deviation (see [[median-absolute-deviation]]). - `:SEM`: The Standard Error of the Mean (see [[sem]]). - `:LAV`: The Lower Adjacent Value (smallest value within the inner fence, see [[adjacent-values]]). - `:UAV`: The Upper Adjacent Value (largest value within the inner fence, see [[adjacent-values]]). - `:IQR`: The Interquartile Range (Q3 - Q1). - `:LOF`: The Lower Outer Fence (Q1 - 3*IQR, see [[outer-fence-extent]]). - `:UOF`: The Upper Outer Fence (Q3 + 3*IQR, see [[outer-fence-extent]]). - `:LIF`: The Lower Inner Fence (Q1 - 1.5*IQR, see [[inner-fence-extent]]). - `:UIF`: The Upper Inner Fence (Q3 + 1.5*IQR, see [[inner-fence-extent]]). - `:Outliers`: A sequence of data points falling outside the inner fences (see [[outliers]]). - `:Kurtosis`: A measure of tailedness/peakedness (see [[kurtosis]] with default `:G2` type). - `:Skewness`: A measure of asymmetry (see [[skewness]] with default `:G1` type). This function is a convenient way to get a standard set of summary statistics for a dataset in a single call.
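A minimal sketch (illustrative data; `select-keys` just trims the output for display):

```clojure
(require '[fastmath.stats :as stats])

(def vs [1 2 3 4 5 6 7 8 9 100])

;; One call produces the full summary; pick out a few entries
(select-keys (stats/stats-map vs)
             [:Size :Mean :Median :SD :IQR :Outliers])

;; Quartile-based entries honor the estimation strategy
(:Q1 (stats/stats-map vs :r7))
```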
(stddev vs)
(stddev vs mu)
Calculate standard deviation of `vs`. See [[population-stddev]].
(sum vs)
(sum vs compensation-method)
Sum of all `vs` values. Possible compensated summation methods are: `:kahan`, `:neumayer` and `:klein`.
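A minimal sketch of when compensation matters (the data is contrived to expose rounding loss):

```clojure
(require '[fastmath.stats :as stats])

;; Adding 1.0 to 1.0e16 falls below double-precision resolution, so a
;; naive left-to-right sum can lose the small contributions entirely.
(def vs (cons 1.0e16 (repeat 1000 1.0)))

(stats/sum vs)         ;; plain summation
(stats/sum vs :kahan)  ;; Kahan compensated summation
(stats/sum vs :klein)  ;; Klein's second-order compensation
```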
(t-test-one-sample xs)
(t-test-one-sample xs m)
Performs a one-sample Student's t-test to compare the sample mean against a hypothesized population mean. This test assesses the null hypothesis that the true population mean is equal to `mu`. It is suitable when the population standard deviation is unknown and is estimated from the sample. Parameters: - `xs` (seq of numbers): The sample data. - `params` (map, optional): Options map: - `:alpha` (double, default `0.05`): Significance level for the confidence interval. - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis. - `:two-sided` (default): The true mean is not equal to `mu`. - `:one-sided-greater`: The true mean is greater than `mu`. - `:one-sided-less`: The true mean is less than `mu`. - `:mu` (double, default `0.0`): The hypothesized population mean under the null hypothesis. Returns a map containing: - `:t`: The calculated t-statistic. - `:stat`: Alias for `:t`. - `:df`: Degrees of freedom (`n-1`). - `:p-value`: The p-value associated with the t-statistic and `:sides`. - `:confidence-interval`: Confidence interval for the true population mean. - `:estimate`: The calculated sample mean. - `:n`: The sample size. - `:mu`: The hypothesized population mean used in the test. - `:stderr`: The standard error of the mean (calculated from the sample). - `:alpha`: Significance level used. - `:sides`: Alternative hypothesis side used. - `:test-type`: Alias for `:sides`. Assumptions: - The data are independent observations. - The data are drawn from a population that is approximately normally distributed. (The t-test is relatively robust to moderate violations, especially with larger sample sizes). See also [[z-test-one-sample]] for large samples or known population standard deviation.
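A minimal sketch (the sample values and `:mu` are illustrative):

```clojure
(require '[fastmath.stats :as stats])

(def xs [4.8 5.2 5.0 4.9 5.3 5.1 4.7 5.4])

;; H0: the true mean equals 5.0 (two-sided by default)
(def result (stats/t-test-one-sample xs {:mu 5.0}))

(select-keys result [:t :df :p-value :confidence-interval :estimate])
```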
(t-test-two-samples xs ys)
(t-test-two-samples xs
ys
{:keys [paired? equal-variances?]
:or {paired? false equal-variances? false}
:as params})
Performs a two-sample Student's t-test to compare the means of two samples.

This function can perform:

- An **unpaired t-test** (assuming independent samples) using either:
  - **Welch's t-test** (default: `:equal-variances? false`): Does not assume equal population variances. Uses the Satterthwaite approximation for degrees of freedom. Recommended unless variances are known to be equal.
  - **Student's t-test** (`:equal-variances? true`): Assumes equal population variances and uses a pooled variance estimate.
- A **paired t-test** (`:paired? true`): Assumes observations in `xs` and `ys` are paired (e.g., before/after measurements on the same subjects). This performs a one-sample t-test on the differences between paired observations.

The test assesses the null hypothesis that the true difference between the population means (or the mean of the differences for the paired test) is equal to `mu`.

Parameters:

- `xs` (seq of numbers): The first sample.
- `ys` (seq of numbers): The second sample.
- `params` (map, optional): Options map:
  - `:alpha` (double, default `0.05`): Significance level for the confidence interval.
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis.
    - `:two-sided` (default): The true difference in means is not equal to `mu`.
    - `:one-sided-greater`: The true difference (`mean(xs) - mean(ys)` or `mean(diff)`) is greater than `mu`.
    - `:one-sided-less`: The true difference (`mean(xs) - mean(ys)` or `mean(diff)`) is less than `mu`.
  - `:mu` (double, default `0.0`): The hypothesized difference in means under the null hypothesis.
  - `:paired?` (boolean, default `false`): If `true`, performs a paired t-test (requires `xs` and `ys` to have the same length). If `false`, performs an unpaired test.
  - `:equal-variances?` (boolean, default `false`): Used only when `paired?` is `false`. If `true`, assumes equal population variances (Student's). If `false`, does not assume equal variances (Welch's).

Returns a map containing:

- `:t`: The calculated t-statistic.
- `:stat`: Alias for `:t`.
- `:df`: Degrees of freedom used for the t-distribution.
- `:p-value`: The p-value associated with the t-statistic and `:sides`.
- `:confidence-interval`: Confidence interval for the true difference in means.
- `:estimate`: The observed difference between sample means (`mean(xs) - mean(ys)` or `mean(differences)`).
- `:n`: Sample sizes as `[count xs, count ys]` (or `count diffs` if paired).
- `:nx`: Sample size of `xs` (if unpaired).
- `:ny`: Sample size of `ys` (if unpaired).
- `:estimated-mu`: Observed sample means as `[mean xs, mean ys]` (if unpaired).
- `:mu`: The hypothesized difference under the null hypothesis.
- `:stderr`: The standard error of the difference between the means (or of the mean difference if paired).
- `:alpha`: Significance level used.
- `:sides`: Alternative hypothesis side used.
- `:test-type`: Alias for `:sides`.
- `:paired?`: Boolean indicating if a paired test was performed.
- `:equal-variances?`: Boolean indicating the variance assumption used (if unpaired).

Assumptions:

- Independence of observations (within and between groups for unpaired).
- Normality of the underlying populations (or of the differences for paired). The t-test is relatively robust to violations of normality, especially with larger sample sizes.
- Equal variances (only if `:equal-variances? true`).
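A sketch of the three variants on made-up data (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def xs [23.1 25.4 22.8 26.0 24.3 25.1])
(def ys [21.0 22.5 20.8 23.1 21.9 22.7])

;; Welch's unpaired t-test -- the default, no equal-variance assumption.
(stats/t-test-two-samples xs ys)

;; Student's pooled-variance t-test.
(stats/t-test-two-samples xs ys {:equal-variances? true})

;; Paired t-test, e.g. before/after measurements on the same subjects;
;; xs and ys must have equal lengths.
(stats/t-test-two-samples xs ys {:paired? true})
```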
(trim vs)
(trim vs quantile)
(trim vs quantile estimation-strategy)
(trim vs low high nan)
Return trimmed data. Trimming is done using quantiles; by default the quantile is set to 0.2.
(trim-lower vs)
(trim-lower vs quantile)
(trim-lower vs quantile estimation-strategy)
Trim data below the given quantile (default: 0.2).
(trim-upper vs)
(trim-upper vs quantile)
(trim-upper vs quantile estimation-strategy)
Trim data above the given quantile (default: 0.2).
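As a rough sketch, trimming a sequence containing outliers (illustrative data; namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def vs [-50 1 2 3 4 5 6 7 8 9 100])

;; Two-sided trim at the default quantile of 0.2.
(stats/trim vs)

;; One-sided trims at quantile 0.1.
(stats/trim-lower vs 0.1) ; drop the lower tail only
(stats/trim-upper vs 0.1) ; drop the upper tail only
```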
(tschuprows-t contingency-table)
(tschuprows-t group1 group2)
Calculates Tschuprow's T, a measure of association between two nominal variables represented in a contingency table.

Tschuprow's T is derived from the Pearson's Chi-squared statistic and measures the strength of the association. Its value ranges from 0 to 1.

- A value of 0 indicates no association between the variables.
- A value of 1 indicates perfect association, but only when the number of rows (`r`) equals the number of columns (`k`) in the contingency table. If `r != k`, Tschuprow's T cannot reach 1, making Cramer's V ([[cramers-v]]) often preferred as it can reach 1 for any table size.

The function can be called in two ways:

1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences.
2. With a contingency table: The contingency table can be provided as:
   - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs.
   - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]].

Parameters:

- `group1` (sequence): The first sequence of categorical data.
- `group2` (sequence): The second sequence of categorical data. Must have the same length as `group1`.
- `contingency-table` (map or sequence of sequences): A pre-computed contingency table.

Returns the calculated Tschuprow's T coefficient as a double.

See also [[chisq-test]], [[cramers-c]], [[cramers-v]], [[cohens-w]], [[contingency-table]].
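Both calling styles, sketched on toy inputs (the rows-form table is the one from the example above; namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

;; From two sequences of categorical observations...
(stats/tschuprows-t [:a :a :b :b :a] [:x :y :x :y :y])

;; ...or from a pre-computed contingency table in rows form.
(stats/tschuprows-t [[10 5] [3 12]])
```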
(variance vs)
(variance vs mu)
Calculate variance of `vs`. See [[population-variance]].
(variation vs)
Calculates the coefficient of variation (CV) for a sequence `vs`.

The CV is a standardized measure of dispersion of a probability distribution or frequency distribution. It is defined as the ratio of the standard deviation to the mean:

`CV = stddev(vs) / mean(vs)`

This measure is unitless and allows for comparison of variability between datasets with different means or different units.

Parameters:

- `vs`: Sequence of numbers.

Returns the calculated coefficient of variation as a double.

Note: The CV is undefined if the mean is zero, and may be misleading if the mean is close to zero or if the data can take both positive and negative values. All values in `vs` should ideally be positive.

See also [[stddev]], [[mean]].
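Because the CV is scale-free, multiplying every value by a constant leaves it unchanged, as this sketch shows (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

;; Same relative spread at two different scales -- identical CV,
;; since the second dataset is the first multiplied by 100.
(stats/variation [10.0 12.0 9.5 11.2 10.8])
(stats/variation [1000.0 1200.0 950.0 1120.0 1080.0])
```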
(weighted-kappa contingency-table)
(weighted-kappa contingency-table weights)
Calculates Cohen's weighted Kappa coefficient (κ) for a contingency table, allowing for partial agreement between categories, typically used for ordinal data.

Weighted Kappa measures inter-rater agreement, similar to [[cohens-kappa]], but assigns different penalties to disagreements based on their magnitude. Disagreements between closely related categories are penalized less than disagreements between distantly related categories.

The function can be called in two ways:

1. With two sequences `group1` and `group2`: The function will automatically construct a contingency table from the unique values in the sequences. These values are assumed to be ordinal and their position in the sorted unique value list determines their index. The mapping of values to table indices might need verification.
2. With a contingency table: The contingency table can be provided as:
   - A map where keys are `[row-index, column-index]` tuples and values are counts (e.g., `{[0 0] 10, [0 1] 5, [1 0] 3, [1 1] 12}`). This is the output format of [[contingency-table]] with two inputs. Indices are assumed to represent the ordered categories.
   - A sequence of sequences representing the rows of the table (e.g., `[[10 5] [3 12]]`). This is equivalent to [[rows->contingency-table]]. The row and column indices are assumed to correspond to the ordered categories.

Parameters:

- `group1` (sequence): The first sequence of ordinal outcomes/categories.
- `group2` (sequence): The second sequence of ordinal outcomes/categories. Must have the same length as `group1`.
- `contingency-table` (map or sequence of sequences): A pre-computed contingency table where row and column indices correspond to ordered categories.
- `weights` (keyword, function, or map, optional): Specifies the weighting scheme to quantify the difference between categories. Defaults to `:equal-spacing`.
  - `:equal-spacing` (default, linear weights): Penalizes disagreements linearly with the distance between categories. Weight is `1 - |i-j|/R`, where `i` is the row index, `j` is the column index, and `R` is the maximum dimension of the table (`max(max_row_index, max_col_index)`).
  - `:fleiss-cohen` (quadratic weights): Penalizes disagreements quadratically with the distance. Weight is `1 - (|i-j|/R)^2`.
  - Function `(fn [R id1 id2])`: A custom function that takes the maximum dimension `R`, row index `id1`, and column index `id2`, and returns the weight (typically between 0 and 1, where 1 is perfect agreement).
  - Map `{[id1 id2] weight}`: A custom map providing weights for specific `[row-index, column-index]` pairs. Missing pairs default to a weight of 0.0.

Returns the calculated weighted Cohen's Kappa coefficient as a double.

Interpretation:

- `κ_w = 1`: Perfect agreement.
- `κ_w = 0`: Agreement is no better than chance.
- `κ_w < 0`: Agreement is worse than chance.

See also [[cohens-kappa]] (unweighted Kappa).
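A sketch with two raters scoring items on a 3-point ordinal scale (made-up counts; the row/column orientation in the comment is illustrative; namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

;; Rows: rater 1's category; columns: rater 2's category.
(def table [[20  5  1]
            [ 4 15  6]
            [ 1  3 10]])

;; Linear (equal-spacing) weights -- the default.
(stats/weighted-kappa table)

;; Quadratic (Fleiss-Cohen) weights penalize distant disagreements more.
(stats/weighted-kappa table :fleiss-cohen)
```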
(winsor vs)
(winsor vs quantile)
(winsor vs quantile estimation-strategy)
(winsor vs low high nan)
Return winsorized data. Clipping thresholds are determined by quantiles; by default the quantile is set to 0.2.
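Contrasting winsorizing with trimming on the same data (illustrative; namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def vs [-50 1 2 3 4 5 6 7 8 9 100])

;; trim removes the extreme values...
(stats/trim vs 0.1)

;; ...while winsor replaces them with the quantile boundary values,
;; preserving the original element count.
(stats/winsor vs 0.1)
```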
(wmedian vs ws)
(wmedian vs ws method)
Calculates the median of a sequence `vs` with corresponding weights `ws`.

Parameters:

- `vs`: Sequence of data values.
- `ws`: Sequence of corresponding non-negative weights. Must have the same count as `vs`.
- `method` (optional keyword): Specifies the interpolation method used when the target quantile (`q=0.5`) falls between points in the weighted ECDF. Defaults to `:linear`.
  - `:linear`: Performs linear interpolation between the data values corresponding to the cumulative weights surrounding `q=0.5`.
  - `:step`: Uses a step function (specifically, step-before) based on the weighted ECDF. The result is the data value whose cumulative weight range includes `q=0.5`.
  - `:average`: Computes the average of the step-before and step-after interpolation methods.

See also: [[wquantile]], [[quantile]].
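A quick sketch of how weights shift the median (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def vs [1.0 2.0 3.0 4.0 5.0])
(def ws [1.0 1.0 1.0 1.0 10.0]) ; 5.0 carries most of the total weight

;; The heavy weight on 5.0 pulls the weighted median to the right
;; of the unweighted median (3.0).
(stats/wmedian vs ws)

;; Step-before interpolation instead of the default :linear.
(stats/wmedian vs ws :step)
```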
(wmode vs)
(wmode vs weights)
Returns the primary weighted mode of a sequence `vs`.

The mode is the value that appears most often in a dataset. This function generalizes the mode concept by considering weights associated with each value. A value's contribution to the mode calculation is proportional to its weight.

If multiple values share the same highest total weight (i.e., there are ties for the mode), this function returns only the first one encountered during processing. The specific mode returned in case of a tie is not guaranteed to be stable across different runs or environments. Use [[wmodes]] if you need all tied modes.

Parameters:

- `vs`: Sequence of data values. Can contain any data type (numbers, keywords, etc.).
- `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`. Defaults to a sequence of 1.0s if omitted, effectively calculating the unweighted mode.

Returns a single value representing the mode (or one of the modes if ties exist).

See also [[wmodes]] (returns all modes) and [[mode]] (for unweighted numeric data).
(wmodes vs)
(wmodes vs weights)
Returns the weighted mode(s) of a sequence `vs`.

The mode is the value that appears most often in a dataset. This function generalizes the mode concept by considering weights associated with each value. A value's contribution to the mode calculation is proportional to its weight.

Parameters:

- `vs`: Sequence of data values. Can contain any data type (numbers, keywords, etc.).
- `weights` (optional): Sequence of non-negative weights corresponding to `vs`. Must have the same count as `vs`. Defaults to a sequence of 1.0s if omitted, effectively calculating the unweighted modes.

Returns a sequence containing all values that have the highest total weight. If there are ties (multiple values share the same maximum total weight), all tied values are included in the returned sequence. The order of modes in the returned sequence is not guaranteed.

See also [[wmode]] (returns only one mode in case of ties) and [[modes]] (for unweighted numeric data).
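A sketch covering both functions (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

;; :c wins on total weight (5.0) despite occurring only once:
;; :a totals 2.0, :b totals 4.0.
(stats/wmode [:a :b :c :b :a] [1.0 2.0 5.0 2.0 1.0])

;; With tied totals, wmodes returns every tied value
;; (here :a and :b, order not guaranteed); wmode would return one.
(stats/wmodes [:a :b :c] [2.0 2.0 1.0])
```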
(wmw-odds [group1 group2])
(wmw-odds group1 group2)
Calculates the Wilcoxon-Mann-Whitney odds (often denoted as ψ) for two independent samples.

This non-parametric effect size measure quantifies the odds that a randomly chosen observation from the first group (`group1`) is greater than a randomly chosen observation from the second group (`group2`).

The statistic is directly related to [[cliffs-delta]] (δ): ψ = (1 + δ) / (1 - δ).

Parameters:

- `group1` (seq of numbers): The first independent sample.
- `group2` (seq of numbers): The second independent sample.

Returns the calculated WMW odds as a double.

Interpretation:

- A value greater than 1 indicates that values from `group1` tend to be larger than values from `group2`.
- A value less than 1 indicates that values from `group1` tend to be smaller than values from `group2`.
- A value of 1 indicates stochastic equality between the distributions (50/50 odds).

This measure is robust to violations of normality and is suitable for ordinal data. It is closely related to Cliff's Delta (δ) and the Mann-Whitney U test statistic.

See also [[cliffs-delta]], [[ameasure]].
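A minimal sketch on made-up samples (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def treatment [5.1 6.2 7.3 6.8 5.9])
(def control   [4.2 5.0 4.8 5.3 4.6])

;; Odds that a random treatment value exceeds a random control value;
;; results above 1 favor the first group.
(stats/wmw-odds treatment control)
```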
(wquantile vs ws q)
(wquantile vs ws q method)
Calculates the q-th weighted quantile of a sequence `vs` with corresponding weights `ws`.

The quantile `q` is a value between 0.0 and 1.0, inclusive. The calculation involves constructing a weighted empirical cumulative distribution function (ECDF) and interpolating to find the value at quantile `q`.

Parameters:

- `vs`: Sequence of data values.
- `ws`: Sequence of corresponding non-negative weights. Must have the same count as `vs`.
- `q`: The quantile level (0.0 < q <= 1.0).
- `method` (optional keyword): Specifies the interpolation method used when `q` falls between points in the weighted ECDF. Defaults to `:linear`.
  - `:linear`: Performs linear interpolation between the data values corresponding to the cumulative weights surrounding `q`.
  - `:step`: Uses a step function (specifically, step-before) based on the weighted ECDF. The result is the data value whose cumulative weight range includes `q`.
  - `:average`: Computes the average of the step-before and step-after interpolation methods. Useful when `q` corresponds exactly to a cumulative weight boundary.

See also: [[wmedian]], [[wquantiles]], [[quantile]].
(wquantiles vs ws)
(wquantiles vs ws qs)
(wquantiles vs ws qs method)
Calculates the sequence of weighted quantiles of a sequence `vs` with corresponding weights `ws`, one for each quantile level in `qs`.

Each quantile level in `qs` is a value between 0.0 and 1.0, inclusive. The calculation involves constructing a weighted empirical cumulative distribution function (ECDF) and interpolating to find the value at each quantile level.

Parameters:

- `vs`: Sequence of data values.
- `ws`: Sequence of corresponding non-negative weights. Must have the same count as `vs`.
- `qs`: Sequence of quantile levels (0.0 < q <= 1.0).
- `method` (optional keyword): Specifies the interpolation method used when a quantile level `q` falls between points in the weighted ECDF. Defaults to `:linear`.
  - `:linear`: Performs linear interpolation between the data values corresponding to the cumulative weights surrounding `q`.
  - `:step`: Uses a step function (specifically, step-before) based on the weighted ECDF. The result is the data value whose cumulative weight range includes `q`.
  - `:average`: Computes the average of the step-before and step-after interpolation methods. Useful when `q` corresponds exactly to a cumulative weight boundary.

See also: [[wquantile]], [[quantiles]].
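A sketch of single and multiple weighted quantiles (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def vs [1.0 2.0 3.0 4.0 5.0])
(def ws [1.0 1.0 1.0 1.0 10.0])

;; A single weighted quantile...
(stats/wquantile vs ws 0.25)

;; ...or several levels at once, e.g. weighted quartiles.
(stats/wquantiles vs ws [0.25 0.5 0.75])

;; Step-before interpolation instead of the default :linear.
(stats/wquantile vs ws 0.25 :step)
```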
(wstddev vs freqs)
Calculate weighted (unbiased) standard deviation of `vs`.
(wvariance vs freqs)
Calculate weighted (unbiased) variance of `vs`.
(yeo-johnson-infer-lambda xs)
(yeo-johnson-infer-lambda xs lambda-range)
(yeo-johnson-infer-lambda xs lambda-range {:keys [alpha] :or {alpha 0.0}})
Find the optimal `lambda` parameter for the Yeo-Johnson transformation using the maximum log-likelihood method.
(yeo-johnson-transformation xs)
(yeo-johnson-transformation xs lambda)
(yeo-johnson-transformation xs lambda {:keys [alpha inverse?] :or {alpha 0.0}})
Applies the Yeo-Johnson transformation to a dataset.

This transformation is used to stabilize variance and make data more normally distributed. It extends the Box-Cox transformation to allow for zero and negative values.

Parameters:

- `xs`: The input dataset.
- `lambda` (default: 0.0): The power parameter controlling the transformation. If `lambda` is `nil` or a range `[lambda-min, lambda-max]`, it will be inferred using the maximum log-likelihood method.
- Options map:
  - `:alpha` (optional): A shift parameter applied before transformation.
  - `:inverse?` (optional): Performs the inverse operation; `lambda` must be provided (it can't be inferred).

Returns:

- A transformed sequence of numbers.

Related: [[box-cox-transformation]]
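A sketch of transforming skewed data that includes zeros and negatives (which Box-Cox cannot handle), then inverting the transform (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def xs [-2.0 -0.5 0.0 0.3 1.0 2.5 7.0 20.0])

;; Transform with an explicit lambda...
(stats/yeo-johnson-transformation xs 0.5)

;; ...or infer lambda by maximum log-likelihood first.
(def lambda (stats/yeo-johnson-infer-lambda xs))
(def ys (stats/yeo-johnson-transformation xs lambda))

;; Round-trip back to the original scale.
(stats/yeo-johnson-transformation ys lambda {:inverse? true})
```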
(z-test-one-sample xs)
(z-test-one-sample xs m)
Performs a one-sample Z-test to compare the sample mean against a hypothesized population mean.

This test assesses the null hypothesis that the true population mean is equal to `mu`. It typically assumes either a known population standard deviation or relies on a large sample size (e.g., n > 30) where the sample standard deviation provides a reliable estimate. This implementation uses the sample standard deviation to calculate the standard error.

Parameters:

- `xs` (seq of numbers): The sample data.
- `params` (map, optional): Options map:
  - `:alpha` (double, default `0.05`): Significance level for the confidence interval.
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis.
    - `:two-sided` (default): The true mean is not equal to `mu`.
    - `:one-sided-greater`: The true mean is greater than `mu`.
    - `:one-sided-less`: The true mean is less than `mu`.
  - `:mu` (double, default `0.0`): The hypothesized population mean under the null hypothesis.

Returns a map containing:

- `:z`: The calculated Z-statistic.
- `:stat`: Alias for `:z`.
- `:p-value`: The p-value associated with the Z-statistic and the specified `:sides`.
- `:confidence-interval`: Confidence interval for the true population mean.
- `:estimate`: The calculated sample mean.
- `:n`: The sample size.
- `:mu`: The hypothesized population mean used in the test.
- `:stderr`: The standard error of the mean (calculated using sample standard deviation).
- `:alpha`: Significance level used.
- `:sides`: Alternative hypothesis side used.
- `:test-type`: Alias for `:sides`.

See also [[t-test-one-sample]] for smaller samples or when the population standard deviation is unknown.
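For illustration, a one-sample Z-test on a reasonably large random sample (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

;; 100 values uniformly spread between 50 and 60, so the true mean is 55.
(def xs (repeatedly 100 #(+ 50.0 (rand 10.0))))

(def result (stats/z-test-one-sample xs {:mu 55.0}))
(select-keys result [:z :p-value :confidence-interval :estimate])
```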
(z-test-two-samples xs ys)
(z-test-two-samples xs
ys
{:keys [paired? equal-variances?]
:or {paired? false equal-variances? false}
:as params})
Performs a two-sample Z-test to compare the means of two independent or paired samples.

This test assesses the null hypothesis that the difference between the population means is equal to `mu` (default 0). It typically assumes known population variances or relies on large sample sizes where sample variances provide good estimates. This implementation calculates the standard error using the provided sample variances.

Parameters:

- `xs` (seq of numbers): The first sample.
- `ys` (seq of numbers): The second sample.
- `params` (map, optional): Options map:
  - `:alpha` (double, default `0.05`): Significance level for the confidence interval.
  - `:sides` (keyword, default `:two-sided`): Specifies the alternative hypothesis.
    - `:two-sided` (default): The true difference in means is not equal to `mu`.
    - `:one-sided-greater`: The true difference in means (`mean(xs) - mean(ys)`) is greater than `mu`.
    - `:one-sided-less`: The true difference in means (`mean(xs) - mean(ys)`) is less than `mu`.
  - `:mu` (double, default `0.0`): The hypothesized difference in means under the null hypothesis.
  - `:paired?` (boolean, default `false`): If `true`, performs a paired Z-test by applying [[z-test-one-sample]] to the differences between paired observations in `xs` and `ys` (requires `xs` and `ys` to have the same length). If `false`, performs a two-sample test assuming independence.
  - `:equal-variances?` (boolean, default `false`): Used only when `paired?` is `false`. If `true`, assumes population variances are equal and calculates a pooled standard error. If `false`, calculates the standard error without assuming equal variances (Welch's approach adapted for the Z-test). This affects the standard error calculation, but the standard normal distribution is still used for inference.

Returns a map containing:

- `:z`: The calculated Z-statistic.
- `:stat`: Alias for `:z`.
- `:p-value`: The p-value associated with the Z-statistic and the specified `:sides`.
- `:confidence-interval`: Confidence interval for the true difference in means.
- `:estimate`: The observed difference between sample means (`mean(xs) - mean(ys)`).
- `:n`: Sample sizes as `[count xs, count ys]`.
- `:nx`: Sample size of `xs`.
- `:ny`: Sample size of `ys`.
- `:estimated-mu`: The observed sample means as `[mean xs, mean ys]`.
- `:mu`: The hypothesized difference under the null hypothesis.
- `:stderr`: The standard error of the difference between the means.
- `:alpha`: Significance level used.
- `:sides`: Alternative hypothesis side used.
- `:test-type`: Alias for `:sides`.
- `:paired?`: Boolean indicating if a paired test was performed.
- `:equal-variances?`: Boolean indicating the assumption used for standard error calculation (if unpaired).

See also [[t-test-two-samples]] for smaller samples or when population variances are unknown.
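A sketch of the unpaired and paired variants on simulated samples (namespace alias assumed):

```clojure
(require '[fastmath.stats :as stats]) ; namespace name assumed

(def xs (repeatedly 200 #(+ 10.0 (rand 2.0))))
(def ys (repeatedly 180 #(+ 10.3 (rand 2.0))))

;; Unpaired Z-test without the equal-variance assumption (the default).
(def result (stats/z-test-two-samples xs ys))
(select-keys result [:z :p-value :confidence-interval :estimate])

;; Paired variant requires equal-length samples.
(stats/z-test-two-samples (take 180 xs) ys {:paired? true})
```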