Bootstrap methods and confidence intervals
(bootstrap input)
(bootstrap input params-or-statistic)
(bootstrap input statistic {:keys [rng samples size method antithetic? dimensions include? multi?] :or {samples 500} :as params})
Generates bootstrap samples from a given dataset or probabilistic model for resampling purposes.

This function supports both **nonparametric bootstrap** (resampling directly from the data) and **parametric bootstrap** (resampling from a statistical model estimated from or provided for the data). It can optionally apply a statistic function to the original data and each sample, returning summary statistics for the bootstrap distribution.

The primary input can be:

* A sequence of data values (for nonparametric bootstrap).
* A map containing:
    * `:data`: The sequence of data values.
    * `:model`: An optional model for parametric bootstrap. If not provided, a default discrete distribution is built from the data (see `:distribution` and `:smoothing`).

The function offers various parameters to control the sampling process and model generation.

Parameters:

* `input` (sequence or map): The data source. Can be a sequence of numbers or a map containing `:data` and optionally `:model`. Can be a sequence of sequences for multidimensional data (when `:dimensions` is `:multi`).
* `statistic` (function, optional): A function that takes a sequence of data and returns a single numerical value (e.g., `fastmath.stats/mean`, `fastmath.stats/median`). If provided, `bootstrap-stats` is called on the results.
* `params` (map, optional): An options map to configure the bootstrap process. Keys include:
    * `:samples` (long, default: 500): The number of bootstrap samples to generate.
    * `:size` (long, optional): The size of each individual bootstrap sample. Defaults to the size of the original data.
    * `:method` (keyword, optional): Specifies the sampling method.
        * `nil` (default): Standard random sampling with replacement.
        * `:jackknife`: Performs leave-one-out jackknife resampling (ignores `:samples` and `:size`).
        * `:jackknife+`: Performs positive jackknife resampling (duplicates each observation once; ignores `:samples`).
        * Other keywords are passed to `fastmath.random/->seq` for sampling from a distribution (only relevant if a `:model` is used or built).
    * `:rng` (random number generator, optional): An instance of a random number generator (see `fastmath.random/rng`). A default JVM RNG is used if not provided.
    * `:smoothing` (keyword, optional): Applies smoothing to the bootstrap process.
        * `:kde`: Uses Kernel Density Estimation to smooth the empirical distribution before sampling. Takes a `:kernel` (default `:gaussian`) and optionally a `:bandwidth` (auto-estimated by default).
        * `:gaussian`: Adds random noise drawn from N(0, standard error) to each resampled value.
    * `:distribution` (keyword, default: `:real-discrete-distribution`): The type of discrete distribution to build automatically from the data if no explicit `:model` is provided. Other options include `:integer-discrete-distribution` (for integer data) and `:categorical-distribution` (for any data type).
    * `:dimensions` (keyword, optional): If set to `:multi`, treats the input `:data` as a sequence of sequences (multidimensional data). Models are built or used separately for each dimension, and samples are generated as sequences of vectors.
    * `:antithetic?` (boolean, default: `false`): If `true`, uses antithetic sampling for variance reduction (paired samples are generated as `x` and `1-x` from a uniform distribution, then transformed by the inverse CDF of the model). Requires sampling from a distribution model.
    * `:include?` (boolean, default: `false`): If `true`, the original dataset is included as one of the samples in the output collection.

Model for parametric bootstrap:

The `:model` parameter in the input map can be:

* Any `fastmath.random` distribution object (e.g., `(r/distribution :normal {:mu 0 :sd 1})`).
* Any 0-arity function that returns a random sample when called.

If `:model` is omitted from the input map, a default discrete distribution (`:real-discrete-distribution` by default, see `:distribution` param) is built from the `:data`. Smoothing options (`:smoothing`) apply to this automatically built model.

When `:dimensions` is `:multi`, `:model` should be a sequence of models, one for each dimension.

Returns:

* If `statistic` is provided: A map containing the original input map augmented with analysis results from `bootstrap-stats` (e.g., `:t0`, `:ts`, `:bias`, `:mean`, `:stddev`).
* If `statistic` is `nil`: A map containing the original input map augmented with the generated bootstrap samples in the `:samples` key. The `:samples` value is a collection of sequences, where each inner sequence is one bootstrap sample. If `:dimensions` is `:multi`, samples are sequences of vectors.

See also [[jackknife]], [[jackknife+]], [[bootstrap-stats]], [[ci-normal]], [[ci-basic]], [[ci-percentile]], [[ci-bc]], [[ci-bca]], [[ci-studentized]], [[ci-t]].
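To make the default (nonparametric) behaviour concrete, here is a minimal sketch of resampling with replacement in plain Clojure. It does not call the library: `resample` and `naive-bootstrap` are hypothetical names, `java.util.Random` stands in for the `:rng` option, and only the `:data`/`:samples` shape of the result follows the description above.

```clojure
(import 'java.util.Random)

(defn resample
  "Draws `size` values from `data` uniformly at random, with replacement."
  [^Random rng data size]
  (let [v (vec data)
        n (count v)]
    (vec (repeatedly size #(nth v (.nextInt rng n))))))

(defn naive-bootstrap
  "Hypothetical helper: returns the original data plus `n-samples`
  resamples under :samples, each the same size as the data."
  [data n-samples seed]
  (let [rng (Random. (long seed))]
    {:data    data
     :samples (vec (repeatedly n-samples
                               #(resample rng data (count data))))}))

;; 500 resamples of a five-element dataset, seeded for repeatability
(def boot (naive-bootstrap [1.0 2.0 3.0 4.0 5.0] 500 42))
```

Each inner vector in `(:samples boot)` has five elements drawn from the original values, so some values repeat and others are missing in any given resample. Parametric sampling, smoothing and antithetic pairs are not covered by this sketch.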
(bootstrap-stats {:keys [data samples] :as input} statistic)
Calculates summary statistics for bootstrap results.

Takes bootstrap output (typically from [[bootstrap]]) and a statistic function, computes the statistic on the original data (`t0`) and on each bootstrap sample (`ts`), and derives various descriptive statistics from the distribution of `ts`.

Parameters:

* `boot-data` (map): A map containing:
    * `:data`: The original dataset.
    * `:samples`: A collection of bootstrap samples (e.g., from [[bootstrap]]).
    * (optional) other keys like `:model` from bootstrap generation.
* `statistic` (function): A function that accepts a sequence of data and returns a single numerical statistic (e.g., `fastmath.stats/mean`, `fastmath.stats/median`).

Returns a map which is the input `boot-data` augmented with bootstrap analysis results:

* `:statistic`: The statistic function applied.
* `:t0`: The statistic calculated on the original `:data`.
* `:ts`: A sequence of the statistic calculated on each bootstrap sample in `:samples`.
* `:bias`: The estimated bias of the statistic: `mean(:ts) - :t0`.
* `:mean`, `:median`, `:variance`, `:stddev`, `:sem`: Descriptive statistics (mean, median, variance, standard deviation, standard error of the mean) calculated from the distribution of `:ts`.

This function prepares the results for calculating various bootstrap confidence intervals (e.g., [[ci-normal]], [[ci-percentile]], etc.).
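The quantities listed above can be sketched in a few lines of plain Clojure. This is an illustration, not the library code: `mean`, `sample-variance` and `naive-bootstrap-stats` are hypothetical helpers, the variance uses the n-1 denominator as an assumption, and `:median` and `:sem` are left out for brevity.

```clojure
(defn mean [xs] (/ (reduce + 0.0 xs) (count xs)))

(defn sample-variance
  "Unbiased (n-1) variance; the denominator is an assumption of this sketch."
  [xs]
  (let [m (mean xs)]
    (/ (reduce + 0.0 (map #(let [d (- % m)] (* d d)) xs))
       (dec (count xs)))))

(defn naive-bootstrap-stats
  "Augments `boot-data` with :t0, :ts, :bias and a few moments of :ts."
  [{:keys [data samples] :as boot-data} statistic]
  (let [t0 (statistic data)          ; statistic on the original data
        ts (mapv statistic samples)  ; statistic on every bootstrap sample
        m  (mean ts)
        v  (sample-variance ts)]
    (assoc boot-data
           :statistic statistic
           :t0 t0
           :ts ts
           :mean m
           :bias (- m t0)
           :variance v
           :stddev (Math/sqrt v))))

(def result
  (naive-bootstrap-stats
   {:data    [1.0 2.0 3.0]
    :samples [[1.0 1.0 2.0] [2.0 3.0 3.0] [1.0 2.0 3.0]]}
   mean))
```

In this toy input the three replicate means average out to the original mean, so `(:bias result)` is zero up to rounding.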
(ci-basic boot-data)
(ci-basic boot-data alpha)
(ci-basic {:keys [t0 ts]} alpha estimation-strategy)
Calculates the Basic bootstrap confidence interval (also known as the reverse percentile or pivotal interval).

This method assumes that the distribution of the bootstrap replicates around the original statistic (`:ts - :t0`) approximates the distribution of the original statistic around the true parameter value (`:t0 - t`).

The interval is constructed using the quantiles of the bootstrap replicates (`:ts`) reflected around the original statistic (`:t0`). Specifically, the lower bound is `2 * :t0 - q_upper` and the upper bound is `2 * :t0 - q_lower`, where `q_lower` and `q_upper` are the `alpha/2` and `1 - alpha/2` quantiles of `:ts`, respectively.

Parameters:

* `boot-data` (map): A map containing bootstrap results, typically from [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on the `alpha/2` and `1 - alpha/2` quantiles of the `:ts` distribution.
* `estimation-strategy` (keyword, optional): Specifies the quantile estimation strategy used to calculate the quantiles of `:ts`. Defaults to `:legacy`. See [[quantiles]] for available options (e.g., `:r1` through `:r9`).

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-percentile]], [[ci-bc]], [[ci-bca]], [[ci-studentized]], [[ci-t]], [[quantiles]].
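The bound formulas `2 * :t0 - q_upper` and `2 * :t0 - q_lower` can be tried out with a small stand-alone sketch. The `quantile` helper below is hypothetical and uses plain linear interpolation, which is an assumption and not necessarily what the `:legacy` strategy does; `naive-ci-basic` is likewise an illustrative name.

```clojure
(defn quantile
  "Linear-interpolation quantile of `xs` at probability `p` (0 <= p <= 1)."
  [xs p]
  (let [v  (vec (sort xs))
        h  (* p (dec (count v)))
        lo (long (Math/floor h))
        hi (min (inc lo) (dec (count v)))]
    (+ (v lo) (* (- h lo) (- (v hi) (v lo))))))

(defn naive-ci-basic
  "Reflects the alpha/2 and 1-alpha/2 quantiles of :ts around :t0."
  [{:keys [t0 ts]} alpha]
  (let [q-lower (quantile ts (/ alpha 2.0))
        q-upper (quantile ts (- 1.0 (/ alpha 2.0)))]
    [(- (* 2.0 t0) q-upper)
     (- (* 2.0 t0) q-lower)
     t0]))

;; Replicates 0.0, 1.0, ..., 10.0 with an original statistic of 4.0
(def basic-ci (naive-ci-basic {:t0 4.0 :ts (range 0.0 11.0)} 0.2))
```

With these inputs the 10% and 90% quantiles of the replicates are about 1 and 9, so the reflected interval is roughly -1 to 7: the replicates sit mostly above `t0`, and the basic interval compensates by extending below it.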
(ci-bc boot-data)
(ci-bc boot-data alpha)
(ci-bc {:keys [t0 ts]} alpha estimation-strategy)
Calculates the Bias-Corrected (BC) bootstrap confidence interval.

This method adjusts the standard Percentile bootstrap interval ([[ci-percentile]]) to account for potential bias in the statistic's distribution. The correction is based on the proportion of bootstrap replicates of the statistic (`:ts`) that are less than the statistic calculated on the original data (`:t0`).

The procedure involves:

1. Calculating a bias correction factor ($z_0$) based on the empirical cumulative distribution function (CDF) of the bootstrap replicates at the point of the original statistic ($z_0 = \Phi^{-1}(\text{Proportion of } t^* < t_0)$, where $\Phi^{-1}$ is the inverse standard normal CDF).
2. Shifting the standard normal quantiles corresponding to the desired confidence level ($\alpha/2$ and $1-\alpha/2$) by $z_0$.
3. Finding the corresponding quantiles in the distribution of bootstrap replicates (`:ts`) based on these shifted probabilities.

Parameters:

* `boot-data` (map): A map containing bootstrap results, typically from [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on quantiles of the `:ts` distribution, adjusted by the bias correction factor.
* `estimation-strategy` (keyword, optional): Specifies the quantile estimation strategy used to calculate the final interval bounds from `:ts` after applying corrections. Defaults to `:legacy`. See [[quantiles]] for available options (e.g., `:r1` through `:r9`).

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-basic]], [[ci-percentile]], [[ci-bca]], [[ci-studentized]], [[ci-t]], [[quantiles]].
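The three steps of the BC procedure can be written compactly. The form below is the usual textbook construction, in which the normal quantiles are shifted by $2 z_0$; that the implementation matches it term for term is an assumption. The symbols $B$ (number of replicates), $z_p = \Phi^{-1}(p)$ and $Q_{ts}(p)$ (the $p$-quantile of `:ts`) are notation introduced here.

$$
z_0 = \Phi^{-1}\!\left(\frac{\#\{t^*_b < t_0\}}{B}\right), \qquad
p_{\text{lower}} = \Phi\!\left(2 z_0 + z_{\alpha/2}\right), \qquad
p_{\text{upper}} = \Phi\!\left(2 z_0 + z_{1-\alpha/2}\right)
$$

The interval is then $[\,Q_{ts}(p_{\text{lower}}),\; Q_{ts}(p_{\text{upper}})\,]$. When exactly half of the replicates fall below $t_0$, $z_0 = 0$ and the bounds reduce to those of the percentile interval.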
(ci-bca boot-data)
(ci-bca boot-data alpha)
(ci-bca {:keys [t0 ts data statistic]} alpha estimation-strategy)
Calculates the Bias-Corrected and Accelerated (BCa) bootstrap confidence interval.

The BCa interval is a sophisticated method that corrects for both bias and skewness in the distribution of the bootstrap statistic replicates. It is considered a more accurate interval, particularly when the bootstrap distribution is skewed.

The calculation requires two components:

1. A **bias correction factor** ($z_0$) based on the proportion of bootstrap replicates less than the original statistic ($t_0$).
2. An **acceleration factor** ($a$) which quantifies the rate of change of the standard error of the statistic with respect to the true parameter value.

The function uses one of two methods to calculate the acceleration factor:

* **Jackknife method**: If the input `boot-data` map contains the original `:data` and the `:statistic` function used to compute `:t0` and `:ts`, the acceleration factor is estimated using the jackknife method (by computing the statistic on leave-one-out jackknife samples).
* **Empirical method**: If `:data` or `:statistic` are missing from `boot-data`, the acceleration factor is estimated empirically from the distribution of the bootstrap replicates (`:ts`) using its skewness.

Parameters:

* `boot-data` (map): A map containing bootstrap results, typically from [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.

  May optionally include:
    * `:data` (sequence): The original dataset (required for jackknife acceleration).
    * `:statistic` (function): The function used to calculate the statistic (required for jackknife acceleration).
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The BCa method uses quantiles of the normal distribution and the bootstrap replicates, adjusted by the bias and acceleration factors.
* `estimation-strategy` (keyword, optional): Specifies the quantile estimation strategy used to calculate the quantiles of the bootstrap replicates (`:ts`) for the final interval bounds after applying corrections. Defaults to `:legacy`. See [[quantiles]] for available options (e.g., `:r1` through `:r9`).

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-basic]], [[ci-percentile]], [[ci-bc]], [[ci-studentized]], [[ci-t]], [[jackknife]], [[quantiles]].
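The jackknife estimate of the acceleration factor can be sketched as follows. The formula used is the common textbook one, $a = \sum_i (\bar\theta - \theta_i)^3 / \big(6\,[\sum_i (\bar\theta - \theta_i)^2]^{3/2}\big)$, where $\theta_i$ is the statistic with observation $i$ left out; treating it as what the library computes is an assumption, and `leave-one-out` and `jackknife-acceleration` are hypothetical names.

```clojure
(defn mean [xs] (/ (reduce + 0.0 xs) (count xs)))

(defn leave-one-out
  "All subsets of `data` obtained by dropping exactly one observation."
  [data]
  (let [v (vec data)]
    (for [i (range (count v))]
      (concat (subvec v 0 i) (subvec v (inc i))))))

(defn jackknife-acceleration
  "Textbook jackknife acceleration: scaled skewness of the
  leave-one-out values of `statistic`."
  [data statistic]
  (let [thetas (map statistic (leave-one-out data))
        m      (mean thetas)
        ds     (map #(- m %) thetas)
        num    (reduce + (map #(* % % %) ds))   ; sum of cubed deviations
        den    (reduce + (map #(* % %) ds))]    ; sum of squared deviations
    (/ num (* 6.0 (Math/pow den 1.5)))))

;; Symmetric data gives no acceleration; a large outlier gives a positive one.
(def a-symmetric (jackknife-acceleration [1.0 2.0 3.0 4.0 5.0] mean))
(def a-skewed    (jackknife-acceleration [1.0 2.0 3.0 4.0 10.0] mean))
```

In the textbook construction, $a$ and $z_0$ then adjust each nominal probability $p$ to $\Phi\!\left(z_0 + \frac{z_0 + z_p}{1 - a\,(z_0 + z_p)}\right)$ before the quantile of `:ts` is taken; with $a = 0$ this reduces to the bias-corrected shift $\Phi(2 z_0 + z_p)$.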
(ci-normal boot-data)
(ci-normal {:keys [t0 ts stddev bias]} alpha)
Calculates a Normal (Gaussian) approximation bias-corrected confidence interval.

This method assumes the distribution of the bootstrap replicates of the statistic (`:ts`) is approximately normal. It computes a confidence interval centered around the mean of the bootstrap statistics, adjusted by the estimated bias (`mean(:ts) - :t0`), and uses the standard error of the bootstrap statistics for scaling.

Parameters:

* `boot-data` (map): A map containing bootstrap results. Typically produced by [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.

  May optionally include pre-calculated `:stddev` (standard deviation of `:ts`) and `:bias` for efficiency.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on the `alpha/2` and `1 - alpha/2` quantiles of the standard normal distribution.

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-basic]], [[ci-percentile]], [[ci-bc]], [[ci-bca]], [[ci-studentized]], [[ci-t]].
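As a rough illustration, the conventional normal-approximation interval is `t0 - bias ± z * stddev`. The sketch below hard-codes the 95% case; the exact centering used by the library is an assumption here, and `naive-ci-normal-95` and `z-975` are hypothetical names.

```clojure
;; 0.975 quantile of the standard normal distribution (alpha = 0.05)
(def z-975 1.959963984540054)

(defn naive-ci-normal-95
  "95% interval centered at the bias-adjusted statistic,
  with half-width z * stddev of the bootstrap replicates."
  [{:keys [t0 bias stddev]}]
  (let [center (- t0 bias)
        half   (* z-975 stddev)]
    [(- center half) (+ center half) t0]))

;; Bias of 0.5 moves the center from 10.0 down to 9.5
(def normal-ci (naive-ci-normal-95 {:t0 10.0 :bias 0.5 :stddev 1.0}))
```

The interval is symmetric around the bias-adjusted center by construction, which is why this method is only appropriate when the replicates look roughly normal.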
(ci-percentile boot-data)
(ci-percentile boot-data alpha)
(ci-percentile {:keys [t0 ts]} alpha estimation-strategy)
Calculates the Percentile bootstrap confidence interval.

This is the simplest bootstrap confidence interval method. It directly uses the quantiles of the bootstrap replicates of the statistic (`:ts`) as the confidence interval bounds.

For a confidence level of `1 - alpha`, the interval is formed by taking the `alpha/2` and `1 - alpha/2` quantiles of the distribution of bootstrap replicates (`:ts`).

Parameters:

* `boot-data` (map): A map containing bootstrap results, typically from [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on the `alpha/2` and `1 - alpha/2` quantiles of the `:ts` distribution.
* `estimation-strategy` (keyword, optional): Specifies the quantile estimation strategy used to calculate the quantiles of `:ts`. Defaults to `:legacy`. See [[quantiles]] for available options (e.g., `:r1` through `:r9`).

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The `alpha/2` quantile of `:ts`.
* `upper-bound` (double): The `1 - alpha/2` quantile of `:ts`.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-basic]], [[ci-bc]], [[ci-bca]], [[ci-studentized]], [[ci-t]], [[quantiles]].
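A stand-alone sketch of the percentile construction follows. The `quantile` helper is hypothetical and uses linear interpolation, which is an assumption rather than the `:legacy` strategy; `naive-ci-percentile` is an illustrative name.

```clojure
(defn quantile
  "Linear-interpolation quantile of `xs` at probability `p` (0 <= p <= 1)."
  [xs p]
  (let [v  (vec (sort xs))
        h  (* p (dec (count v)))
        lo (long (Math/floor h))
        hi (min (inc lo) (dec (count v)))]
    (+ (v lo) (* (- h lo) (- (v hi) (v lo))))))

(defn naive-ci-percentile
  "Uses the alpha/2 and 1-alpha/2 quantiles of :ts directly as bounds."
  [{:keys [t0 ts]} alpha]
  [(quantile ts (/ alpha 2.0))
   (quantile ts (- 1.0 (/ alpha 2.0)))
   t0])

;; Replicates 0.0, 1.0, ..., 10.0 with an original statistic of 4.0
(def percentile-ci
  (naive-ci-percentile {:t0 4.0 :ts (range 0.0 11.0)} 0.2))
```

Here the bounds come out at about 1 and 9 regardless of `t0`: unlike the basic interval, the percentile interval does not reflect the quantiles around the original statistic.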
(ci-studentized boot-data)
(ci-studentized boot-data alpha)
(ci-studentized {:keys [t0 ts data samples]} alpha estimation-strategy)
Calculates the Studentized (or Bootstrap-t) confidence interval.

This method is based on the distribution of the studentized pivotal quantity `(statistic(sample) - statistic(data)) / standard_error(statistic(sample))`. It estimates the quantiles of this distribution using bootstrap replicates and then uses them to construct a confidence interval around the statistic calculated on the original data (`:t0`), scaled by the standard error of the statistic calculated on the original data (`stddev(:data)`).

Parameters:

* `boot-data` (map): A map containing bootstrap results and necessary inputs. This map typically comes from [[bootstrap-stats]] and is augmented with `:data` and `:samples` from the original [[bootstrap]] call if they are not already present. Requires the following keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.
    * `:data` (sequence): The original dataset used for bootstrapping. Needed to estimate the standard error of the statistic for scaling the interval.
    * `:samples` (collection of sequences): The collection of bootstrap samples. Needed to calculate the standard error of the statistic for each bootstrap sample.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on the `alpha/2` and `1 - alpha/2` quantiles of the studentized bootstrap replicates.
* `estimation-strategy` (keyword, optional): Specifies the quantile estimation strategy used to calculate the quantiles of the studentized replicates. Defaults to `:legacy`. See [[quantiles]] for available options (e.g., `:r1` through `:r9`).

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-basic]], [[ci-percentile]], [[ci-bc]], [[ci-bca]], [[ci-t]], [[stats/stddev]], [[stats/quantiles]].
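A minimal plain-Clojure sketch of the bootstrap-t construction described above may help. It makes several assumptions that the text does not confirm: the standard error is taken to be the plain sample standard deviation (as `stddev(:data)` suggests), quantiles use simple linear interpolation rather than the `:legacy` strategy, and the bounds follow the textbook form `t0 - se * q`. All `*-sketch` names are hypothetical.

```clojure
(defn mean-sketch [xs]
  (/ (reduce + 0.0 xs) (count xs)))

(defn stddev-sketch
  "Sample standard deviation (n - 1 denominator)."
  [xs]
  (let [m  (mean-sketch xs)
        ss (reduce + 0.0 (map #(let [d (- % m)] (* d d)) xs))]
    (Math/sqrt (/ ss (dec (count xs))))))

(defn quantile-sketch
  "Linear-interpolation quantile; assumed for illustration only."
  [xs p]
  (let [sorted (vec (sort xs))
        n      (count sorted)
        h      (* p (dec n))
        lo     (long (Math/floor h))
        hi     (min (dec n) (inc lo))
        frac   (- h lo)]
    (+ (* (- 1.0 frac) (sorted lo))
       (* frac (sorted hi)))))

(defn ci-studentized-sketch
  "Returns [lower upper t0] from a map with :t0, :ts, :data and :samples."
  ([boot-data] (ci-studentized-sketch boot-data 0.05))
  ([{:keys [t0 ts data samples]} alpha]
   (let [;; one studentized replicate per bootstrap sample
         zs   (map (fn [t s] (/ (- t t0) (stddev-sketch s))) ts samples)
         a2   (/ alpha 2.0)
         q-lo (quantile-sketch zs a2)
         q-hi (quantile-sketch zs (- 1.0 a2))
         se   (stddev-sketch data)]
     ;; the upper quantile produces the lower bound, and vice versa
     [(- t0 (* se q-hi)) (- t0 (* se q-lo)) t0])))

;; Toy data and hand-written "bootstrap samples", made up for illustration.
(let [data    [1 2 3 4 5]
      samples [[1 1 2 3 5] [2 3 3 4 5] [1 2 4 4 5] [1 3 3 5 5]]]
  (ci-studentized-sketch {:t0      (mean-sketch data)
                          :ts      (map mean-sketch samples)
                          :data    data
                          :samples samples}
                         0.1))
```

The sketch shows why both `:data` and `:samples` are required: each replicate is divided by a spread computed from its own sample, while the final interval is scaled by the spread of the original data.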
(ci-t boot-data)
(ci-t {:keys [t0 ts stddev]} alpha)
Calculates a confidence interval based on Student's t-distribution, centered at the original statistic value.

This method constructs a confidence interval centered at the statistic calculated on the original data (`:t0`). The width of the interval is determined by the standard deviation of the bootstrap replicates (`:ts`), scaled by a critical value from a Student's t-distribution. The degrees of freedom for the t-distribution are based on the number of bootstrap replicates (`count(:ts) - 1`).

This interval does not explicitly use the Studentized bootstrap pivotal quantity. Instead, it applies a standard t-interval structure using components derived from the bootstrap results and the original data.

Parameters:

* `boot-data` (map): A map containing bootstrap results, typically from [[bootstrap-stats]]. Requires keys:
    * `:t0` (double): The statistic calculated on the original data.
    * `:ts` (sequence of numbers): The statistic calculated on each bootstrap sample.

  May optionally include pre-calculated `:stddev` (standard deviation of `:ts`) for efficiency.
* `alpha` (double, optional): The significance level for the interval. Defaults to `0.05` (for a 95% CI). The interval is based on the `alpha/2` and `1 - alpha/2` quantiles of the Student's t-distribution with `count(:ts) - 1` degrees of freedom.

Returns a vector `[lower-bound, upper-bound, t0]`.

* `lower-bound` (double): The lower limit of the confidence interval.
* `upper-bound` (double): The upper limit of the confidence interval.
* `t0` (double): The statistic calculated on the original data (from `boot-data`).

See also [[bootstrap-stats]] for input preparation and other confidence interval methods: [[ci-normal]], [[ci-basic]], [[ci-percentile]], [[ci-bc]], [[ci-bca]], [[ci-studentized]].
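Read literally, the description above corresponds to the following interval. The symbols are introduced here for illustration: $B$ is `count(:ts)`, $s_{ts}$ is the standard deviation of `:ts`, and $t_{p,\nu}$ is the $p$-quantile of Student's t-distribution with $\nu$ degrees of freedom. Whether the library applies any further scaling is not stated in the text, so treat this as a sketch of the structure rather than the exact computation.

```latex
\[
  \text{lower} = t_0 - t_{1-\alpha/2,\;B-1}\, s_{ts},
  \qquad
  \text{upper} = t_0 + t_{1-\alpha/2,\;B-1}\, s_{ts}
\]
```

Because the t-distribution is symmetric, $t_{\alpha/2,\nu} = -t_{1-\alpha/2,\nu}$, so the two quantiles named in the `alpha` parameter give bounds equidistant from $t_0$.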
(jackknife vs)
Generates a set of samples from a given sequence using the jackknife leave-one-out method.

For an input sequence `vs` of size `n`, this method creates `n` samples. Each sample is formed by removing a single observation from the original sequence.

Parameters:

* `vs` (sequence): The input data sequence.

Returns a sequence of sequences. The i-th inner sequence is `vs` with the i-th element removed.

These samples are commonly used for estimating the bias and standard error of a statistic (e.g., via [[bootstrap-stats]]).
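The leave-one-out construction can be sketched in a few lines of plain Clojure. `jackknife-sketch` is a hypothetical name used for illustration; it is not the library's implementation and may differ from it in the concrete sequence types returned.

```clojure
(defn jackknife-sketch
  "Returns n leave-one-out samples of vs; the i-th sample omits the i-th element."
  [vs]
  (let [v (vec vs)]
    (map (fn [i]
           ;; everything before index i, followed by everything after it
           (concat (subvec v 0 i) (subvec v (inc i))))
         (range (count v)))))

(jackknife-sketch [1 2 3])
;; => ((2 3) (1 3) (1 2))
```

Each of the `n` samples has size `n - 1`, and the samples are deterministic: no random number generator is involved.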
(jackknife+ vs)
Generates a set of samples from a sequence using the 'jackknife positive' method.

For an input sequence `vs` of size `n`, this method creates `n` samples. Each sample is the original sequence with one extra copy of a single observation added, so each sample has size `n+1`.

Parameters:

* `vs` (sequence): The input data sequence.

Returns a sequence of sequences. The i-th inner sequence is `vs` with an additional copy of the i-th element of `vs`.

This method is used in specific resampling techniques for estimating bias and variance of a statistic.
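A plain-Clojure sketch of the duplication step follows. `jackknife+-sketch` is a hypothetical name, and appending the extra copy at the end of the sample is an assumption: the description does not say where the duplicate is placed.

```clojure
(defn jackknife+-sketch
  "Returns n samples of size n+1; the i-th sample is vs plus one more
  copy of its i-th element (appended at the end in this sketch)."
  [vs]
  (let [v (vec vs)]
    (map (fn [x] (conj v x)) v)))

(jackknife+-sketch [1 2 3])
;; => ([1 2 3 1] [1 2 3 2] [1 2 3 3])
```

For statistics that do not depend on element order, such as the mean or median, the position of the duplicate makes no difference.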