fastmath.stats

Statistics functions.

* Descriptive statistics for a sequence.
* Correlation / covariance of two sequences.
* Outliers

All functions are backed by Apache Commons Math or SMILE libraries. All work with Clojure sequences.

### Descriptive statistics

The all-in-one function [[stats-map]] returns a map containing:

* `:Size` - size of the samples, `(count ...)`
* `:Min` - [[minimum]] value
* `:Max` - [[maximum]] value
* `:Range` - range of values
* `:Mean` - [[mean]]/average
* `:Median` - [[median]], see also: [[median-3]]
* `:Mode` - [[mode]], see also: [[modes]]
* `:Q1` - first quartile, use: [[percentile]], [[quantile]]
* `:Q3` - third quartile, use: [[percentile]], [[quantile]]
* `:Total` - [[sum]] of all samples
* `:SD` - sample standard deviation
* `:Variance` - variance
* `:MAD` - [[median-absolute-deviation]]
* `:SEM` - standard error of mean
* `:LAV` - lower adjacent value, use: [[adjacent-values]]
* `:UAV` - upper adjacent value, use: [[adjacent-values]]
* `:IQR` - interquartile range, `(- q3 q1)`
* `:LOF` - lower outer fence, `(- q1 (* 3.0 iqr))`
* `:UOF` - upper outer fence, `(+ q3 (* 3.0 iqr))`
* `:LIF` - lower inner fence, `(- q1 (* 1.5 iqr))`
* `:UIF` - upper inner fence, `(+ q3 (* 1.5 iqr))`
* `:Outliers` - list of [[outliers]], samples which are outside outer fences
* `:Kurtosis` - [[kurtosis]]
* `:Skewness` - [[skewness]]
* `:SecMoment` - second central moment, use: [[second-moment]]
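
As a rough illustration of how the fence entries relate to the quartiles, the sketch below computes them from given `q1` and `q3` values. The `fences` helper is hypothetical and not part of fastmath.stats; it uses only clojure.core and mirrors the formulas listed above.

```clojure
;; Hypothetical helper (not part of fastmath.stats):
;; inner and outer fences derived from the first and third quartiles.
(defn fences
  [q1 q3]
  (let [iqr (- q3 q1)]
    {:IQR iqr
     :LIF (- q1 (* 1.5 iqr))
     :UIF (+ q3 (* 1.5 iqr))
     :LOF (- q1 (* 3.0 iqr))
     :UOF (+ q3 (* 3.0 iqr))}))

(fences 2.0 6.0)
;; => {:IQR 4.0, :LIF -4.0, :UIF 12.0, :LOF -10.0, :UOF 18.0}
```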

Note: [[percentile]] and [[quantile]] support 10 different interpolation strategies. See the [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

### Correlation / Covariance / Divergence

* [[covariance]]
* [[correlation]]
* [[pearson-correlation]]
* [[spearman-correlation]]
* [[kendall-correlation]]
* [[kullback-leibler-divergence]]
* [[jensen-shannon-divergence]]

### Other

Normalize samples to have mean = 0 and standard deviation = 1 with [[standardize]].

Use [[histogram]] to count samples in evenly spaced ranges.

adjacent-values clj

(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3)

Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LAV is the smallest value greater than or equal to the LIF = `(- Q1 (* 1.5 IQR))`.
* UAV is the largest value less than or equal to the UIF = `(+ Q3 (* 1.5 IQR))`.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].
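
The definition above can be sketched with clojure.core alone. The function name `adjacent-values-sketch` and the vector return shape are assumptions made for illustration; this is not the library implementation, and the quartiles are passed in rather than estimated.

```clojure
;; Hypothetical sketch (not the library implementation):
;; adjacent values from a collection and already-known quartiles q1 and q3.
(defn adjacent-values-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))
        uif (+ q3 (* 1.5 iqr))]
    ;; LAV: smallest value >= LIF, UAV: largest value <= UIF
    [(apply min (filter #(>= % lif) vs))
     (apply max (filter #(<= % uif) vs))]))

(adjacent-values-sketch [1 2 3 4 5 6 7 30] 2.5 6.5)
;; => [1 7]
```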

ameasure clj

(ameasure group1 group2)

Vargha-Delaney A measure for two populations, `group1` and `group2`.
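
For orientation, the textbook form of this measure is the probability that a value from the first group exceeds a value from the second, plus half the probability of a tie. The sketch below counts over all pairs; `a-measure-sketch` is a hypothetical name, and the library may compute the value differently (for example via ranks).

```clojure
;; Hypothetical sketch (not the library implementation) of the textbook
;; Vargha-Delaney A: P(x > y) + 0.5 * P(x = y) over all pairs.
(defn a-measure-sketch
  [group1 group2]
  (let [cmps (for [x group1 y group2] (compare x y))
        greater (count (filter pos? cmps))
        ties (count (filter zero? cmps))]
    (/ (+ greater (* 0.5 ties))
       (* (count group1) (count group2)))))

(a-measure-sketch [2 3] [1 2])
;; => 0.875
```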

binary-measures clj

(binary-measures truth prediction)
(binary-measures truth prediction true-value false-value)

binary-measures-all clj

(binary-measures-all truth prediction)
(binary-measures-all truth prediction true-value false-value)
https://en.wikipedia.org/wiki/Precision_and_recall

bootstrap clj

(bootstrap vs)
(bootstrap vs samples)
(bootstrap vs samples size)

Generate a set of samples of a given size from the provided data.

`samples` defaults to 50 and `size` defaults to 1000.
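
A minimal sketch of the idea, assuming the usual bootstrap convention of drawing with replacement. `bootstrap-sketch` is a hypothetical name and uses only clojure.core; it is not the library implementation.

```clojure
;; Hypothetical sketch (not the library implementation):
;; draw `samples` resamples, each holding `size` values picked
;; uniformly at random (with replacement) from the data.
(defn bootstrap-sketch
  [vs samples size]
  (let [v (vec vs)]
    (repeatedly samples
                (fn [] (vec (repeatedly size #(rand-nth v)))))))

(def resampled (doall (bootstrap-sketch [1 2 3 4 5] 3 10)))

(count resampled)      ;; => 3
(map count resampled)  ;; => (10 10 10)
```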

bootstrap-ci clj

(bootstrap-ci vs)
(bootstrap-ci vs alpha)
(bootstrap-ci vs alpha samples)
(bootstrap-ci vs alpha samples stat-fn)

Bootstrap method to calculate a confidence interval.

`alpha` defaults to 0.98 and `samples` to 1000.
The last parameter, `stat-fn`, is the statistical function used as the measure; it defaults to [[mean]].

ci clj

(ci vs)
(ci vs alpha)

Confidence interval for the given data, based on Student's t-distribution. `alpha` defaults to 0.98.

cliffs-delta clj

(cliffs-delta group1 group2)

Cliff's delta effect size
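
As a reminder of what the statistic measures, the textbook definition counts, over all pairs, how often a value from the first group is greater than one from the second, minus how often it is smaller, divided by the number of pairs. The sketch below follows that definition; `cliffs-delta-sketch` is a hypothetical name and this is not the library implementation.

```clojure
;; Hypothetical sketch (not the library implementation) of the textbook
;; Cliff's delta: (#(x > y) - #(x < y)) / (n1 * n2).
(defn cliffs-delta-sketch
  [group1 group2]
  (let [cmps (for [x group1 y group2] (compare x y))
        greater (count (filter pos? cmps))
        less (count (filter neg? cmps))]
    (/ (double (- greater less))
       (* (count group1) (count group2)))))

(cliffs-delta-sketch [2 3] [1 2])
;; => 0.75
```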

cohens-d clj

(cohens-d group1 group2)

Cohen's d effect size for two groups

cohens-d-orig clj

(cohens-d-orig group1 group2)

Original version of Cohen's d effect size for two groups

correlation clj

(correlation vs1 vs2)

Correlation of two sequences.

covariance clj

(covariance vs1 vs2)

Covariance of two sequences.
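
For reference, a plain clojure.core sketch of sample covariance follows. The `n - 1` denominator is an assumption of this sketch, as is the name `covariance-sketch`; the docstring does not say which normalization the library uses.

```clojure
;; Hypothetical sketch (not the library implementation):
;; sample covariance with an (n - 1) denominator, which is an assumption here.
(defn covariance-sketch
  [vs1 vs2]
  (let [n (count vs1)
        mean (fn [xs] (/ (reduce + xs) (double (count xs))))
        m1 (mean vs1)
        m2 (mean vs2)]
    (/ (reduce + (map (fn [a b] (* (- a m1) (- b m2))) vs1 vs2))
       (dec n))))

(covariance-sketch [1 2 3] [2 4 6])
;; => 2.0
```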

covariance-matrix clj

(covariance-matrix vss)

Generate a covariance matrix from a seq of seqs. Row order.

estimate-bins clj

(estimate-bins vs)
(estimate-bins vs bins-or-estimate-method)

Estimate the number of bins for a histogram.

Possible methods are: `:sqrt` `:sturges` `:rice` `:doane` `:scott` `:freedman-diaconis` (default).
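
To give a feel for three of the simpler rules, here are their textbook formulas as functions of the sample count `n`. The function names are hypothetical, and the library's own rounding or edge-case handling may differ.

```clojure
;; Textbook bin-count rules as functions of the sample count n.
;; Hypothetical helpers; fastmath's own rounding may differ.
(defn sqrt-bins [n]
  (long (Math/ceil (Math/sqrt n))))

(defn sturges-bins [n]
  ;; ceil(log2 n) + 1
  (long (inc (Math/ceil (/ (Math/log n) (Math/log 2.0))))))

(defn rice-bins [n]
  ;; ceil(2 * cube-root(n))
  (long (Math/ceil (* 2.0 (Math/cbrt n)))))

(sqrt-bins 100)     ;; => 10
(sturges-bins 100)  ;; => 8
(rice-bins 100)     ;; => 10
```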

estimation-strategies-list clj


extent clj

(extent vs)

Return extent (min, max) values from sequence

glass-delta clj

(glass-delta group1 group2)

Glass's delta effect size for two groups

hedges-g clj

(hedges-g group1 group2)

Hedges's g effect size for two groups

hedges-g* clj

(hedges-g* group1 group2)

Less biased Hedges's g effect size for two groups

histogram clj

(histogram vs)
(histogram vs bins-or-estimate-method)
(histogram vs bins [mn mx])

Calculate histogram.

Returns a map with keys:

* `:size` - number of bins
* `:step` - distance between bins
* `:bins` - list of pairs: lower bound of the range and number of hits
* `:min` - min value
* `:max` - max value
* `:samples` - number of used samples

For estimation methods, see [[estimate-bins]].
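
The shape of the returned map can be illustrated with a small clojure.core sketch that bins values into evenly spaced ranges. `histogram-sketch` is a hypothetical name, it assumes the data is not constant (so the step is non-zero), and placing the maximum value in the last bin is a choice made for this sketch rather than documented library behavior.

```clojure
;; Hypothetical sketch (not the library implementation) of an evenly
;; spaced histogram. Assumes (apply max vs) > (apply min vs).
(defn histogram-sketch
  [vs bins]
  (let [mn (double (apply min vs))
        mx (double (apply max vs))
        step (/ (- mx mn) bins)
        ;; bin index for value v; the maximum is clamped into the last bin
        bin-of (fn [v] (min (dec bins) (long (/ (- v mn) step))))
        counts (frequencies (map bin-of vs))]
    {:size bins
     :step step
     :bins (mapv (fn [i] [(+ mn (* i step)) (get counts i 0)]) (range bins))
     :min mn
     :max mx
     :samples (count vs)}))

(:bins (histogram-sketch [1 2 2 3 5 9] 4))
;; => [[1.0 3] [3.0 1] [5.0 1] [7.0 1]]
```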

iqr clj

(iqr vs)
(iqr vs estimation-strategy)

Interquartile range.

jensen-shannon-divergence clj

(jensen-shannon-divergence vs1 vs2)

Jensen-Shannon divergence of two sequences.

kendall-correlation clj

(kendall-correlation vs1 vs2)

Kendall's correlation of two sequences.

kullback-leibler-divergence clj

(kullback-leibler-divergence vs1 vs2)

Kullback-Leibler divergence of two sequences.

kurtosis clj

(kurtosis vs)

Calculate kurtosis from sequence.

mad-extent clj

(mad-extent vs)

median +/- median-absolute-deviation

maximum clj

(maximum vs)

Maximum value from sequence.

mean clj

(mean vs)

Calculate mean of `vs`

median clj

(median vs)

Calculate median of `vs`. See [[median-3]].

median-3 clj

(median-3 a b c)

Median of three values. See [[median]].

median-absolute-deviation clj

(median-absolute-deviation vs)

Calculate the median absolute deviation (MAD).
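
In its textbook form, MAD is the median of the absolute deviations from the median. The sketch below shows that form with clojure.core only; the helper names are hypothetical, and whether the library applies any scaling constant is not stated in the docstring.

```clojure
;; Hypothetical helpers (not the library implementation).
(defn median-sketch
  [vs]
  (let [s (vec (sort vs))
        n (count s)
        h (quot n 2)]
    (if (odd? n)
      (s h)
      (/ (+ (s (dec h)) (s h)) 2.0))))

;; Unscaled MAD: median of |x - median(xs)|.
(defn mad-sketch
  [vs]
  (let [m (median-sketch vs)]
    (median-sketch (map #(Math/abs (double (- % m))) vs))))

(mad-sketch [1 2 3 4 100])
;; => 1.0
```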

minimum clj

(minimum vs)

Minimum value from sequence.

mode clj

(mode vs)

Find the value that appears most often in a dataset `vs`.

See also [[modes]].

modes clj

(modes vs)

Find the values that appear most often in a dataset `vs`.

Returns a sequence of all the most frequent values, in increasing order.

See also [[mode]].

outliers clj

(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)

Find outliers, defined as values below the LIF or above the UIF.

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LIF (Lower Inner Fence) equals `(- Q1 (* 1.5 IQR))`.
* UIF (Upper Inner Fence) equals `(+ Q3 (* 1.5 IQR))`.

Returns a sequence.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].
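
The filtering step can be sketched as follows, using the 1.5 * IQR thresholds from the two bullets above and quartiles supplied by the caller. `outliers-sketch` is a hypothetical name, and sorting the result is an assumption of this sketch, not documented behavior.

```clojure
;; Hypothetical sketch (not the library implementation):
;; keep values that fall outside [q1 - 1.5*iqr, q3 + 1.5*iqr].
(defn outliers-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lower (- q1 (* 1.5 iqr))
        upper (+ q3 (* 1.5 iqr))]
    (sort (filter #(or (< % lower) (> % upper)) vs))))

(outliers-sketch [1 2 3 4 5 6 7 30 -20] 2.5 6.5)
;; => (-20 30)
```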

pearson-correlation clj

(pearson-correlation vs1 vs2)

Pearson's correlation of two sequences.

percentile clj

(percentile vs p)
(percentile vs p estimation-strategy)

Calculate a percentile of `vs`.

Percentile `p` is in the range 0-100.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

Optionally, you can provide `estimation-strategy` to change the interpolation method used for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html).

See also [[quantile]].

percentile-extent clj

(percentile-extent vs)
(percentile-extent vs p)
(percentile-extent vs p1 p2)
(percentile-extent vs p1 p2 estimation-strategy)

Return percentile range.

percentiles clj

(percentiles vs ps)
(percentiles vs ps estimation-strategy)

Calculate percentiles of `vs`.

`ps` is a sequence of values in the range 0-100.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

Optionally, you can provide `estimation-strategy` to change the interpolation method used for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html).

See also [[quantile]].

population-stddev clj

(population-stddev vs)
(population-stddev vs u)

Calculate population standard deviation of `vs`.

See [[stddev]].

population-variance clj

(population-variance vs)
(population-variance vs u)

Calculate population variance of `vs`.

See [[variance]].

quantile clj

(quantile vs q)
(quantile vs q estimation-strategy)

Calculate a quantile of `vs`.

Quantile `q` is in the range 0.0-1.0.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html) for interpolation strategy.

Optionally, you can provide `estimation-strategy` to change the interpolation method used for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html).

See also [[percentile]].

quantiles clj

(quantiles vs qs)
(quantiles vs qs estimation-strategy)

Calculate quantiles of `vs`.

`qs` is a sequence of values in the range 0.0-1.0.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html) for interpolation strategy.

Optionally, you can provide `estimation-strategy` to change the interpolation method used for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html).

See also [[percentiles]].

second-moment clj

(second-moment vs)

Calculate second moment from sequence.

It is the sum of squared deviations from the sample mean.
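
The description above translates directly into a few lines of clojure.core. `second-moment-sketch` is a hypothetical name used for illustration; it is not the library implementation.

```clojure
;; Hypothetical sketch (not the library implementation):
;; sum of squared deviations from the sample mean.
(defn second-moment-sketch
  [vs]
  (let [m (/ (reduce + vs) (double (count vs)))]
    (reduce + (map #(let [d (- % m)] (* d d)) vs))))

(second-moment-sketch [1 2 3 4])
;; => 5.0
```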

sem clj

(sem vs)

Standard error of mean
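
For reference, the standard error of the mean is commonly computed as the sample standard deviation divided by the square root of the sample size. The sketch below follows that textbook formula; `sem-sketch` is a hypothetical name, and the `n - 1` variance denominator is an assumption of the sketch.

```clojure
;; Hypothetical sketch (not the library implementation):
;; sample standard deviation divided by sqrt(n).
(defn sem-sketch
  [vs]
  (let [n (count vs)
        m (/ (reduce + vs) (double n))
        ;; sample variance with an (n - 1) denominator (assumption)
        variance (/ (reduce + (map #(let [d (- % m)] (* d d)) vs))
                    (dec n))]
    (/ (Math/sqrt variance) (Math/sqrt n))))

(sem-sketch [0 4 4 4])
;; => 1.0
```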

sem-extent clj

(sem-extent vs)

mean +/- sem

skewness clj

(skewness vs)

Calculate skewness from sequence.

spearman-correlation clj

(spearman-correlation vs1 vs2)

Spearman's correlation of two sequences.

standardize clj

(standardize vs)

Normalize samples to have mean = 0 and stddev = 1.
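
The transformation amounts to subtracting the mean and dividing by the standard deviation. A clojure.core sketch follows; `standardize-sketch` is a hypothetical name, and the use of the sample standard deviation (`n - 1` denominator) is an assumption, since the docstring does not say which one the library uses.

```clojure
;; Hypothetical sketch (not the library implementation):
;; z-scores using the sample standard deviation (assumption).
(defn standardize-sketch
  [vs]
  (let [n (count vs)
        m (/ (reduce + vs) (double n))
        variance (/ (reduce + (map #(let [d (- % m)] (* d d)) vs))
                    (dec n))
        sd (Math/sqrt variance)]
    (map #(/ (- % m) sd) vs)))

(standardize-sketch [1 2 3])
;; => (-1.0 0.0 1.0)
```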

stats-map clj

(stats-map vs)
(stats-map vs estimation-strategy)

Calculate several statistics of `vs` and return them as a map.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].

stddev clj

(stddev vs)
(stddev vs u)

Calculate standard deviation of `vs`.

See [[population-stddev]].

stddev-extent clj

(stddev-extent vs)

mean +/- stddev

sum clj

(sum vs)

Sum of all `vs` values.

ttest-one-sample clj

(ttest-one-sample xs)
(ttest-one-sample xs
                  {:keys [alpha sides mu]
                   :or {alpha 0.05 sides :two-sided mu 0.0}})

ttest-two-samples clj

(ttest-two-samples
  xs
  ys
  {:keys [alpha sides mu paired? equal-variances?]
   :or {alpha 0.05 sides :two-sided mu 0.0 paired? false equal-variances? false}
   :as params})

variance clj

(variance vs)
(variance vs u)

Calculate variance of `vs`.

See [[population-variance]].
