criterium.stats — org.hugoduncan/criterium 0.5.245-ALPHA

criterium.stats.autocorrelation

Autocorrelation function (ACF) computation and related statistics.

Provides FFT-based ACF computation for detecting sample non-independence in benchmark results, along with derived statistics for quantifying the impact on statistical reliability.

Main functions:

- `acf` - Compute autocorrelation coefficients for all lags
- `ljung-box` - Ljung-Box Q statistic and p-value for independence testing
- `effective-sample-size` - Adjusted sample size accounting for autocorrelation
- `ci-inflation-factor` - Factor to widen confidence intervals
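
The quantities listed above can be illustrated outside the library. What follows is a minimal Python sketch, not criterium's Clojure API: it uses direct sums rather than an FFT, the function names merely mirror the list above, and the rule for truncating the autocorrelation sum (stop at the first non-positive lag) is an assumption. The p-value step is left out; it would compare Q against a chi-squared distribution with h degrees of freedom.

```python
import math

def acf(xs, max_lag):
    """Sample autocorrelation for lags 0..max_lag, using direct sums."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    c0 = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
            for k in range(max_lag + 1)]

def ljung_box_q(rho, n, h):
    """Ljung-Box Q statistic over lags 1..h."""
    return n * (n + 2) * sum(rho[k] ** 2 / (n - k) for k in range(1, h + 1))

def effective_sample_size(rho, n):
    """n / (1 + 2 * sum of leading positive autocorrelations)."""
    total = 0.0
    for r in rho[1:]:
        if r <= 0:  # assumed truncation rule, not taken from the library
            break
        total += r
    return n / (1 + 2 * total)

def ci_inflation_factor(n, ess):
    """Factor by which a confidence interval for the mean widens."""
    return math.sqrt(n / ess)
```

With positively correlated samples the effective sample size falls below n and the inflation factor rises above 1.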

criterium.stats.bootstrap

criterium.stats.chi-squared

Chi-squared distribution functions.

The chi-squared distribution with k degrees of freedom is the distribution of the sum of the squares of k independent standard normal random variables. It is a special case of the gamma distribution with shape = k/2 and scale = 2.
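
The gamma relationship gives a direct route to the CDF: with shape k/2 and scale 2, the chi-squared CDF at x is the regularized lower incomplete gamma function P(k/2, x/2). The Python sketch below evaluates P with its power series; the function names are illustrative, not the library's, and the series converges slowly for large x, so a production implementation would normally switch to a continued fraction there.

```python
import math

def regularized_gamma_p(a, x, tol=1e-15, max_iter=10_000):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_iter):
        term *= x / (a + n)
        total += term
        if abs(term) < tol * abs(total):
            break
    # Multiply by x^a * e^-x / Gamma(a), computed in log space.
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi_squared_cdf(x, k):
    """CDF of the chi-squared distribution with k degrees of freedom."""
    return regularized_gamma_p(k / 2, x / 2)
```

For k = 2 the result reduces to 1 - exp(-x/2), which makes a convenient sanity check.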

cdf

criterium.stats.core

Core statistical functions: min, max, mean, sum, variance, median, quartiles, quantile.

All functions require typed arrays (ITypedArray) as input. Primitive-optimized implementations avoid boxing overhead.

criterium.stats.fft

Pure Clojure radix-2 Cooley-Tukey FFT implementation.

Provides O(n log n) Fast Fourier Transform for autocorrelation computation. Uses interleaved complex representation [re0 im0 re1 im1 ...] for cache efficiency. All operations use primitive double arrays with zero garbage allocation during transform execution.

Main functions:

- `fft!` / `fft` - Forward FFT (in-place / copying)
- `ifft!` / `ifft` - Inverse FFT (in-place / copying)
- `next-power-of-2` - Find smallest power of 2 >= n
- `zero-pad-real` - Zero-pad real signal to power-of-2 length

Complex arrays use interleaved format: [re0 im0 re1 im1 ...]. Array length is 2*n, where n is the number of complex samples.
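
To make the interleaved layout concrete, here is a minimal Python sketch of an iterative radix-2 Cooley-Tukey transform over a flat [re0 im0 re1 im1 ...] list. It is an illustration only: the function names echo the list above but are not the library's signatures, and Python lists do not give the allocation-free behaviour the namespace describes for primitive double arrays.

```python
import math

def next_power_of_2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def zero_pad_real(xs):
    """Real signal -> interleaved complex buffer of power-of-2 length."""
    n = next_power_of_2(len(xs))
    buf = [0.0] * (2 * n)
    for i, x in enumerate(xs):
        buf[2 * i] = float(x)
    return buf

def fft_inplace(buf, inverse=False):
    n = len(buf) // 2  # number of complex samples, must be a power of 2
    # Bit-reversal permutation of the complex samples.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            buf[2 * i], buf[2 * j] = buf[2 * j], buf[2 * i]
            buf[2 * i + 1], buf[2 * j + 1] = buf[2 * j + 1], buf[2 * i + 1]
    # Butterfly passes over blocks of size 2, 4, ..., n.
    size = 2
    while size <= n:
        angle = (2 if inverse else -2) * math.pi / size
        w_re, w_im = math.cos(angle), math.sin(angle)
        for start in range(0, n, size):
            c_re, c_im = 1.0, 0.0
            for k in range(size // 2):
                a = 2 * (start + k)
                b = 2 * (start + k + size // 2)
                t_re = buf[b] * c_re - buf[b + 1] * c_im
                t_im = buf[b] * c_im + buf[b + 1] * c_re
                buf[b], buf[b + 1] = buf[a] - t_re, buf[a + 1] - t_im
                buf[a], buf[a + 1] = buf[a] + t_re, buf[a + 1] + t_im
                c_re, c_im = (c_re * w_re - c_im * w_im,
                              c_re * w_im + c_im * w_re)
        size *= 2
    if inverse:  # the inverse transform is scaled by 1/n
        for i in range(2 * n):
            buf[i] /= n
    return buf
```

Applying `fft_inplace` and then `fft_inplace(..., inverse=True)` to the same buffer recovers the original samples up to rounding error.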

criterium.stats.histogram

Histogram computation utilities with multiple binning methods.

Supports:

- :freedman-diaconis (default) - Uses IQR-based bin width calculation
- :knuth - Bayesian optimal bin count selection

All functions require typed arrays (DoubleArray, LongArray).
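
As an illustration of the IQR-based rule, the Freedman-Diaconis bin width is 2 * IQR / n^(1/3). The Python sketch below applies it and then counts values into equal-width bins. The function names, the quartile convention (Python's "inclusive" method) and the single-bin fallback for zero IQR are assumptions made for the example; the library may differ on each.

```python
import math
import statistics

def freedman_diaconis_bins(xs):
    """Bin count from the Freedman-Diaconis width 2 * IQR / n^(1/3)."""
    n = len(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    width = 2 * (q3 - q1) / n ** (1 / 3)
    span = max(xs) - min(xs)
    if width <= 0 or span <= 0:  # assumed fallback for degenerate data
        return 1
    return max(1, math.ceil(span / width))

def histogram(xs, bins):
    """Equal-width bin edges and counts over [min, max]."""
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for x in xs:
        counts[min(int((x - lo) / step), bins - 1)] += 1
    edges = [lo + i * step for i in range(bins + 1)]
    return edges, counts
```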

histogram

criterium.stats.kde

Kernel Density Estimation utilities.

Provides ISJ (Improved Sheather-Jones) bandwidth selection, Gaussian kernel density estimation, bootstrap confidence bands, and mode finding.

All functions require typed arrays (DoubleArray, LongArray).
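
A Gaussian kernel density estimate and a grid-based mode search can be sketched in a few lines of Python. Note the substitution: ISJ bandwidth selection is too long to reproduce here, so this example uses Silverman's rule of thumb instead. All names are illustrative and none of this is the library's API.

```python
import math
import statistics

def silverman_bandwidth(xs):
    """Silverman's rule of thumb (a stand-in for ISJ in this sketch)."""
    n = len(xs)
    sd = statistics.stdev(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0:
        spread = sd
    return 0.9 * spread * n ** (-0.2)

def gaussian_kde(xs, h):
    """Return a density function built from Gaussian kernels of width h."""
    norm = 1.0 / (len(xs) * h * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs)
    return density

def find_mode(density, lo, hi, steps=1000):
    """Grid search for the highest-density point in [lo, hi]."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=density)
```

The bandwidth controls smoothness: a smaller h follows the samples more closely, while a larger h merges nearby modes.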

criterium.stats.kernel

Kernel functions for density estimation.

Provides kernel weight functions and basic kernel density estimators for modal estimation and bandwidth selection.

criterium.stats.knuth

Knuth's Bayesian histogram binning algorithm.

Implements optimal bin count selection by maximizing a log-posterior based on Knuth (2019) DOI: 10.1016/j.dsp.2019.102581

The algorithm finds the optimal number of equal-width bins M by maximizing:

F(M|x,I) = n·log(M) + logΓ(M/2) - M·logΓ(1/2) - logΓ((2n+M)/2) + Σₖ₌₁ᴹ logΓ(nₖ + 1/2)

where n = sample count, nₖ = count in bin k.
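
The objective above translates almost term by term into code. The following Python sketch evaluates F(M|x,I) for a given M and picks the maximizer by brute force over 1..max_bins; the search strategy, the `max_bins` limit and the function names are assumptions for illustration, not the library's implementation.

```python
import math

def knuth_log_posterior(xs, m):
    """F(M|x,I) for m equal-width bins over [min(xs), max(xs)]."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    counts = [0] * m
    for x in xs:
        if hi == lo:
            k = 0  # constant data: everything lands in the first bin
        else:
            k = min(int((x - lo) / (hi - lo) * m), m - 1)
        counts[k] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma((2 * n + m) / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))

def knuth_bin_count(xs, max_bins=50):
    """Brute-force argmax of the log-posterior over 1..max_bins."""
    return max(range(1, max_bins + 1),
               key=lambda m: knuth_log_posterior(xs, m))
```

For M = 1 every term cancels and F is 0, so any M with a positive score is preferred over a single bin.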

All functions require typed arrays (DoubleArray, LongArray).

criterium.stats.mle

Maximum Likelihood Estimation for statistical distributions.

Provides MLE fitting functions that return both parameter estimates and log-likelihood values for model comparison via AIC/BIC.

Distributions supported:

- Gamma: Minka's fast fixed-point approximation for shape
- Log-normal: Closed-form MLE
- Inverse Gaussian: Closed-form MLE
- Weibull: Newton-Raphson iteration for shape

All functions return maps with :params and :log-likelihood keys. All functions require typed arrays (DoubleArray, LongArray).
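
The log-normal case shows the general shape of such a fit, since its MLE is closed-form: mu is the mean of log x and sigma² is the mean squared deviation of log x (divisor n). The Python sketch below returns a dict that imitates the :params / :log-likelihood layout; the inner parameter names and the `aic` helper are hypothetical.

```python
import math

def fit_lognormal(xs):
    """Closed-form log-normal MLE with the maximized log-likelihood."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n  # MLE uses divisor n
    ll = -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n
    return {"params": {"mu": mu, "sigma": math.sqrt(var)},
            "log-likelihood": ll}

def aic(fit, k):
    """Akaike information criterion for a fit with k free parameters."""
    return 2 * k - 2 * fit["log-likelihood"]
```

When comparing candidate distributions fitted to the same data, the one with the lower AIC is preferred.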

criterium.stats.moment-match

Moment-based parameter estimation and distribution suitability screening.

Provides method-of-moments initial parameter estimates for distributions and a prefilter to screen out distributions that are unsuitable for a given dataset based on sample statistics.

This is used before MLE fitting to quickly eliminate distributions where moment-based estimates yield invalid parameters (e.g., negative shape).
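
As a small illustration of the idea, method-of-moments estimates for a gamma distribution are shape = mean² / variance and scale = variance / mean. The Python sketch below pairs that with a simple screen that rejects data the gamma family cannot describe. The function name, the use of the sample variance and the rejection rules are assumptions, not the library's actual prefilter.

```python
import statistics

def gamma_moment_estimate(xs):
    """Method-of-moments gamma parameters, or None if unsuitable."""
    if min(xs) <= 0:  # gamma support is strictly positive
        return None
    mean = statistics.fmean(xs)
    var = statistics.variance(xs)
    if var <= 0:  # constant data gives no usable shape estimate
        return None
    return {"shape": mean * mean / var, "scale": var / mean}
```

A `None` result means the candidate can be dropped before spending time on an iterative MLE fit.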

criterium.stats.outliers

Outlier detection using boxplot thresholds.

Provides both standard symmetric boxplot and adjusted boxplot for skewed distributions using the medcouple statistic.

All functions require typed arrays (ITypedArray) as input.
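
The two threshold rules can be sketched as follows. The standard boxplot flags values outside [Q1 - 1.5·IQR, Q3 + 1.5·IQR]; the adjusted boxplot of Hubert and Vandervieren stretches each fence by an exponential factor of the medcouple MC. In this Python sketch the medcouple is passed in as `mc` rather than computed, and the function names and quartile convention are assumptions for illustration.

```python
import math
import statistics

def boxplot_fences(xs, k=1.5):
    """Standard symmetric fences at k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def adjusted_boxplot_fences(xs, mc, k=1.5):
    """Skew-adjusted fences; mc is a precomputed medcouple in [-1, 1]."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    if mc >= 0:
        return (q1 - k * math.exp(-4 * mc) * iqr,
                q3 + k * math.exp(3 * mc) * iqr)
    return (q1 - k * math.exp(-3 * mc) * iqr,
            q3 + k * math.exp(4 * mc) * iqr)

def outliers(xs, fences):
    """Values falling outside the (lower, upper) fences."""
    lo, hi = fences
    return [x for x in xs if x < lo or x > hi]
```

With `mc = 0` the adjusted fences coincide with the standard ones; a positive medcouple (right skew) pushes the upper fence out and pulls the lower fence in.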

criterium.stats.probability

Probability functions: log-gamma, error function, normal distribution, and common statistical distributions (gamma, weibull, lognormal, inverse-gaussian).

criterium.stats.sampling

Sampling utilities: sample functions, confidence intervals.

All sampling functions take mutable uniform RNGs and call next-double! to generate random values.
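
The calling convention described above, a sampler that pulls uniform doubles from a mutable RNG, might look like this in Python. The `UniformRng` class, its `next_double` method (standing in for `next-double!`) and the exponential sampler are all hypothetical illustrations, not the library's types or functions.

```python
import math
import random

class UniformRng:
    """Mutable uniform RNG exposing a single next_double method."""
    def __init__(self, seed):
        self._rng = random.Random(seed)

    def next_double(self):
        return self._rng.random()  # uniform in [0, 1)

def sample_exponential(rng, rate):
    """Inverse-transform sample from an exponential distribution."""
    return -math.log(1.0 - rng.next_double()) / rate

def sample_n(sampler, rng, n):
    """Draw n values, advancing the RNG state on every call."""
    return [sampler(rng) for _ in range(n)]
```

Because the RNG is mutable, two RNGs created with the same seed reproduce the same sequence, which keeps resampling-based results repeatable.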

criterium.stats.t-digest

T-digest streaming quantile estimation. Provides a wrapper API over the merging-digest implementation.

criterium.stats.t-digest.merging-digest

Implementation of the t-digest algorithm for streaming quantile estimation. Based on the MergingDigest variant from https://github.com/tdunning/t-digest

criterium.stats.t-digest.scale

Scale functions for t-digest algorithm. These control how cluster sizes are determined and affect accuracy in different ways.
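
As one concrete example, the k1 scale function from the t-digest paper is k(q) = (δ / 2π) · asin(2q - 1), where δ is the compression parameter. A cluster may span quantiles only while k grows by at most 1, which forces small clusters near q = 0 and q = 1 and allows larger ones near the median. The Python sketch below is illustrative only; the function names are not the library's, and which scale functions the namespace actually provides is not stated here.

```python
import math

def k1(q, delta):
    """k1 scale function: maps quantile q in [0, 1] to [-delta/4, delta/4]."""
    return delta / (2 * math.pi) * math.asin(2 * q - 1)

def k1_inverse(k, delta):
    """Inverse of k1: maps a scale value back to a quantile."""
    return (math.sin(2 * math.pi * k / delta) + 1) / 2

def max_cluster_fraction(q, delta):
    """Largest quantile span allowed for a cluster starting at q."""
    k_limit = min(k1(q, delta) + 1, delta / 4)
    return k1_inverse(k_limit, delta) - q
```

Comparing `max_cluster_fraction(0.5, 100)` with `max_cluster_fraction(0.001, 100)` shows the effect: the permitted span near the median is roughly ten times the span near the tail.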

criterium.stats.tail

Tail statistics for extreme value analysis.

Provides functions for analyzing distribution tails, including:

- Hill estimator for tail index estimation
- Generalized Pareto Distribution (GPD) fitting and functions
- Mean residual life for threshold selection
- Tail ratios from percentiles

All functions requiring sample data accept typed arrays (ITypedArray).

References:

- Hill (1975), A Simple General Approach to Inference About the Tail of a Distribution
- Grimshaw (1993), Computing Maximum Likelihood Estimates for the GPD
- Coles (2001), An Introduction to Statistical Modeling of Extreme Values
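
Two of the tools listed above are short enough to sketch. The Hill estimator averages the log-ratios of the k largest values to the (k+1)-th largest, giving an estimate of γ = 1/α; the mean residual life at threshold u is the average excess over u. This Python sketch uses illustrative names, assumes strictly positive data and k < n, and is not the library's API.

```python
import math

def hill_tail_index(xs, k):
    """Hill estimate of gamma = 1/alpha from the k largest values."""
    ordered = sorted(xs, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    return sum(math.log(x / threshold) for x in ordered[:k]) / k

def mean_residual_life(xs, u):
    """Mean excess over threshold u, or NaN if nothing exceeds u."""
    excess = [x - u for x in xs if x > u]
    return sum(excess) / len(excess) if excess else float("nan")
```

Plotting `mean_residual_life` against u is a common way to pick a GPD threshold: above a suitable threshold the plot is approximately linear.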
