criterium.stats.tail

Tail statistics for extreme value analysis.

Provides functions for analyzing distribution tails, including:
- Hill estimator for tail index estimation
- Generalized Pareto Distribution (GPD) fitting and functions
- Mean residual life for threshold selection
- Tail ratios from percentiles

All functions requiring sample data accept typed arrays (ITypedArray).

References:
- Hill (1975), A Simple General Approach to Inference About the Tail of a Distribution
- Grimshaw (1993), Computing Maximum Likelihood Estimates for the GPD
- Coles (2001), An Introduction to Statistical Modeling of Extreme Values

exceedances-over-threshold

(exceedances-over-threshold samples threshold)

Extract excesses over the given threshold.
Returns a new DoubleArray containing (y - threshold) for each y > threshold.

In extreme value theory, the 'exceedance' or 'excess' over a threshold u
is defined as Y = X - u for observations X > u. These excesses are modeled
by the Generalized Pareto Distribution (GPD).

Parameters:
  samples - typed array of sample values
  threshold - threshold value u

Returns DoubleArray of excesses (y - threshold for y > threshold).
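
As a rough illustration of the definition above, the following sketch computes excesses with plain Clojure. The name `excesses-sketch` is hypothetical and it takes an ordinary sequence rather than an ITypedArray, so it is not the library's implementation.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn excesses-sketch
  "Returns a double-array of (y - u) for every y in samples with y > u."
  [samples u]
  (double-array (for [y samples :when (> y u)] (- y u))))

;; Only 2.5, 4.0 and 7.0 exceed the threshold 2.0.
(vec (excesses-sketch [1.0 2.5 4.0 7.0] 2.0))
;; => [0.5 2.0 5.0]
```

Values equal to the threshold are dropped because the comparison is strict.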

gpd-cdf

(gpd-cdf xi sigma)

Cumulative distribution function for the Generalized Pareto Distribution.

For exceedances y > 0:
  F(y; ξ, σ) = 1 - (1 + ξy/σ)^(-1/ξ)  if ξ ≠ 0
  F(y; 0, σ) = 1 - exp(-y/σ)           if ξ = 0

Parameters:
  xi - shape parameter ξ
  sigma - scale parameter σ (must be positive)

Returns a function F(y) that computes P(Y ≤ y).
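
The two-branch formula can be sketched as a closure over ξ and σ. `gpd-cdf-sketch` is a hypothetical name; the handling of y ≤ 0 and of values past the upper endpoint (for ξ < 0) is an assumption, since the docstring only covers y > 0.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn gpd-cdf-sketch
  "Returns a function F(y) for the GPD with shape xi and scale sigma."
  [xi sigma]
  (let [xi (double xi)
        sigma (double sigma)]
    (fn [y]
      (let [y (double y)
            z (+ 1.0 (/ (* xi y) sigma))]
        (cond
          (<= y 0.0)  0.0   ; assumption: no mass at or below zero
          (zero? xi)  (- 1.0 (Math/exp (- (/ y sigma))))
          (<= z 0.0)  1.0   ; assumption: past the endpoint -sigma/xi when xi < 0
          :else       (- 1.0 (Math/pow z (/ -1.0 xi))))))))

;; With xi = 0.5 and sigma = 1, F(2) = 1 - 2^(-2) = 0.75.
((gpd-cdf-sketch 0.5 1.0) 2.0)
```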

gpd-mle

(gpd-mle exceedances)
(gpd-mle exceedances opts)

Maximum likelihood estimation for the Generalized Pareto Distribution.

Uses Grimshaw's (1993) algorithm with profile likelihood optimization.

Parameters:
  exceedances - typed array of excesses over the threshold (x - u for x > u).
                All values must be positive (excesses, not raw data)
  opts - optional map with:
    :max-iter - maximum iterations (default 100)
    :tol - convergence tolerance (default 1e-8)
    :xi-min - minimum ξ to search (default -0.5)
    :xi-max - maximum ξ to search (default 2.0)

Returns map with:
  :xi - shape parameter estimate
  :sigma - scale parameter estimate
  :log-likelihood - maximized log-likelihood
  :converged? - whether optimization converged
  :n - number of exceedances

Throws if exceedances is empty or contains non-positive values.
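
For orientation, the quantity reported under :log-likelihood is the GPD log-likelihood, ℓ(ξ, σ) = -n log σ - (1 + 1/ξ) Σ log(1 + ξyᵢ/σ), which reduces to -n log σ - Σyᵢ/σ at ξ = 0. The sketch below only evaluates that objective for given parameters; it is not Grimshaw's algorithm, and `gpd-log-likelihood-sketch` is a hypothetical name operating on a plain sequence.

```clojure
;; Hypothetical helper: evaluates the GPD log-likelihood, it does not maximize it.
(defn gpd-log-likelihood-sketch
  "Log-likelihood of excesses ys under a GPD with shape xi and scale sigma.
  Assumes every 1 + xi*y/sigma is positive; otherwise the result is NaN."
  [xi sigma ys]
  (let [xi (double xi)
        sigma (double sigma)
        n (count ys)
        base (* -1.0 n (Math/log sigma))]
    (if (zero? xi)
      (- base (/ (reduce + ys) sigma))
      (- base
         (* (+ 1.0 (/ 1.0 xi))
            (reduce + (map (fn [y] (Math/log (+ 1.0 (/ (* xi (double y)) sigma))))
                           ys)))))))

;; Exponential case (xi = 0, sigma = 1): the value is minus the sum of the excesses.
(gpd-log-likelihood-sketch 0.0 1.0 [1.0 2.0])
;; => -3.0
```

A maximizer would search over ξ (within :xi-min and :xi-max) for the parameters that make this value largest.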

gpd-pdf

(gpd-pdf xi sigma)

Probability density function for the Generalized Pareto Distribution.

For exceedances y > 0:
  f(y; ξ, σ) = (1/σ) * (1 + ξy/σ)^(-1/ξ - 1)  if ξ ≠ 0
  f(y; 0, σ) = (1/σ) * exp(-y/σ)               if ξ = 0

Support:
  y ≥ 0           if ξ ≥ 0
  0 ≤ y ≤ -σ/ξ    if ξ < 0

Parameters:
  xi - shape parameter ξ (can be negative, zero, or positive)
  sigma - scale parameter σ (must be positive)

Returns a function f(y) that computes the density at y.
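
A minimal sketch of the density, including the support rule, might look like this. `gpd-pdf-sketch` is a hypothetical name, and returning 0 outside the support (including exactly at the endpoint -σ/ξ) is an assumption rather than documented behavior.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn gpd-pdf-sketch
  "Returns a function f(y) for the GPD with shape xi and scale sigma."
  [xi sigma]
  (let [xi (double xi)
        sigma (double sigma)]
    (fn [y]
      (let [y (double y)
            z (+ 1.0 (/ (* xi y) sigma))]
        (cond
          (or (neg? y) (<= z 0.0)) 0.0   ; assumption: zero outside the support
          (zero? xi) (/ (Math/exp (- (/ y sigma))) sigma)
          :else (/ (Math/pow z (- (/ -1.0 xi) 1.0)) sigma))))))

;; With xi = -0.5 and sigma = 1 the support ends at -sigma/xi = 2,
;; so the density at y = 3 is zero.
((gpd-pdf-sketch -0.5 1.0) 3.0)
;; => 0.0
```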

gpd-quantile

(gpd-quantile xi sigma)

Quantile function (inverse CDF) for the Generalized Pareto Distribution.

For probability p ∈ [0, 1] (at p = 1 the quantile is the upper endpoint: infinite if ξ ≥ 0, -σ/ξ if ξ < 0):
  Q(p; ξ, σ) = (σ/ξ) * ((1-p)^(-ξ) - 1)  if ξ ≠ 0
  Q(p; 0, σ) = -σ * log(1-p)              if ξ = 0

Parameters:
  xi - shape parameter ξ
  sigma - scale parameter σ (must be positive)

Returns a function Q(p) that computes the p-th quantile.
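
The inverse-CDF formula translates almost directly into code. The sketch below uses the hypothetical name `gpd-quantile-sketch` and does no validation of p; it is an illustration of the formula, not the library's implementation.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn gpd-quantile-sketch
  "Returns a function Q(p) for the GPD with shape xi and scale sigma."
  [xi sigma]
  (let [xi (double xi)
        sigma (double sigma)]
    (fn [p]
      (let [p (double p)]
        (if (zero? xi)
          (- (* sigma (Math/log (- 1.0 p))))
          (* (/ sigma xi) (- (Math/pow (- 1.0 p) (- xi)) 1.0)))))))

;; With xi = 0.5 and sigma = 1, Q(0.75) = 2 * (0.25^(-0.5) - 1), about 2.0,
;; which inverts F(2) = 0.75 for the same parameters.
((gpd-quantile-sketch 0.5 1.0) 0.75)
```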

hill-estimator

(hill-estimator sorted-samples k-range)

Compute the Hill estimator for tail index across a range of k values.

The Hill estimator for the k largest order statistics is:
  H_k = (1/k) * Σᵢ₌₁ᵏ log(X_{(n-i+1)} / X_{(n-k)})

where X_{(i)} is the i-th order statistic (sorted ascending).

The tail index α is estimated as 1/H_k. Smaller α means a heavier tail; for α < 2 the variance is infinite.

Parameters:
  sorted-samples - typed array of samples sorted in ascending order
  k-range - sequence of k values to compute estimates for
            (each k uses the k largest observations)

Returns vector of maps {:k k :estimate H_k :tail-index (1/H_k)}
for each k in k-range where computation is valid.

Notes:
- k must be >= 1 and < n (sample size)
- Returns empty vector if samples has fewer than 2 elements
- Requires sorted input (ascending order)
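
To make the indexing in the formula concrete, here is a sketch for a single k on a plain sorted vector. `hill-sketch` is a hypothetical name; it assumes strictly positive samples and returns nil when k is out of range, which is an assumption about how invalid k values might be skipped.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn hill-sketch
  "Hill estimate for one k on an ascending-sorted sequence of positive values."
  [sorted-samples k]
  (let [xs (vec sorted-samples)
        n (count xs)]
    (when (and (>= k 1) (< k n))
      (let [x-ref (double (nth xs (- n k 1)))   ; X_(n-k), zero-based index n-k-1
            top-k (subvec xs (- n k))           ; the k largest observations
            h (/ (reduce + (map (fn [x] (Math/log (/ (double x) x-ref))) top-k))
                 k)]
        {:k k :estimate h :tail-index (/ 1.0 h)}))))

;; For [1 2 4 8] and k = 2 the reference value is 2, so
;; H_2 = (log(4/2) + log(8/2)) / 2 = 1.5 * log 2, roughly 1.04.
(hill-sketch [1.0 2.0 4.0 8.0] 2)
```

Mapping this over a sequence of k values and dropping the nils gives the vector-of-maps shape described above.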

hill-estimator-default-k-range

(hill-estimator-default-k-range n)

Compute default k range for Hill estimator.
Uses k from 10 to min(n/2, 500) with step size based on n.

Parameters:
  n - sample size

Returns sequence of k values.

mean-residual-life

(mean-residual-life sorted-samples threshold-range)

Compute mean residual life (mean excess) over a range of thresholds.

The mean residual life at threshold u is:
  e(u) = E[X - u | X > u]

For GPD data with ξ < 1, e(u) is linear in u with slope ξ/(1-ξ).
A threshold above which e(u) becomes approximately linear suggests
a good choice for peaks-over-threshold (POT) analysis.

Parameters:
  sorted-samples - typed array of samples sorted in ascending order
  threshold-range - sequence of threshold values to evaluate

Returns vector of maps {:threshold u :mrl e(u) :n-exceed count}
where n-exceed is the number of observations exceeding u.
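
The empirical version of e(u) is just the average excess over each threshold. The sketch below uses the hypothetical name `mean-residual-life-sketch`, works on a plain sequence, and does not rely on the input being sorted. Returning nil for :mrl when nothing exceeds u is an assumption.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn mean-residual-life-sketch
  "Empirical mean excess over each threshold in thresholds."
  [samples thresholds]
  (mapv (fn [u]
          (let [excesses (for [x samples :when (> x u)] (- x u))
                n (count excesses)]
            {:threshold u
             ;; assumption: nil when no observation exceeds u
             :mrl (when (pos? n) (/ (reduce + excesses) (double n)))
             :n-exceed n}))
        thresholds))

;; Over u = 2.0 the excesses are 1, 2 and 3, so the mean excess is 2.0.
(mean-residual-life-sketch [1.0 2.0 3.0 4.0 5.0] [2.0 10.0])
;; => [{:threshold 2.0, :mrl 2.0, :n-exceed 3}
;;     {:threshold 10.0, :mrl nil, :n-exceed 0}]
```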

mean-residual-life-default-thresholds

(mean-residual-life-default-thresholds sorted-samples)
(mean-residual-life-default-thresholds sorted-samples n-points)

Compute default threshold range for MRL plot.
Uses quantiles from 50th to 95th percentile.

Parameters:
  sorted-samples - typed array of samples sorted in ascending order
  n-points - number of threshold points (default 20)

Returns sequence of threshold values.

tail-ratios

(tail-ratios percentiles)

Compute tail ratios from percentile values.

Tail ratios indicate how heavy the distribution tail is.
Higher ratios suggest heavier tails.

Parameters:
  percentiles - map of percentile values with keys like :p95, :p99, :p999
                or numeric keys like 0.95, 0.99, 0.999

Returns map with:
  :p99-p95 - ratio of 99th to 95th percentile
  :p999-p99 - ratio of 99.9th to 99th percentile
  :p999-p95 - ratio of 99.9th to 95th percentile (if all present)

Returns nil for ratios where required percentiles are missing.
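
The result shape can be illustrated with a small sketch. `tail-ratios-sketch` is a hypothetical name and handles only the keyword keys (:p95, :p99, :p999), not the numeric-key form; each ratio is nil when either percentile it needs is absent.

```clojure
;; Hypothetical stand-in, not the criterium implementation.
(defn tail-ratios-sketch
  "Ratios between upper percentiles; nil where an input is missing."
  [{:keys [p95 p99 p999]}]
  (letfn [(ratio [hi lo]
            (when (and hi lo)
              (/ (double hi) (double lo))))]
    {:p99-p95  (ratio p99 p95)
     :p999-p99 (ratio p999 p99)
     :p999-p95 (ratio p999 p95)}))

(tail-ratios-sketch {:p95 10.0 :p99 20.0 :p999 50.0})
;; => {:p99-p95 2.0, :p999-p99 2.5, :p999-p95 5.0}

;; Without :p999 only the first ratio can be computed.
(tail-ratios-sketch {:p95 10.0 :p99 20.0})
;; => {:p99-p95 2.0, :p999-p99 nil, :p999-p95 nil}
```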