Statistics functions.

* Descriptive statistics for a sequence.
* Correlation / covariance of two sequences.
* Outliers

All functions are backed by the Apache Commons Math or SMILE libraries. All work with Clojure sequences.

### Descriptive statistics

The all-in-one function [[stats-map]] returns a map containing:

* `:Size` - size of the samples, `(count ...)`
* `:Min` - [[minimum]] value
* `:Max` - [[maximum]] value
* `:Mean` - [[mean]]/average
* `:Median` - [[median]], see also: [[median-3]]
* `:Mode` - [[mode]], see also: [[modes]]
* `:Q1` - first quartile, use: [[percentile]], [[quantile]]
* `:Q3` - third quartile, use: [[percentile]], [[quantile]]
* `:Total` - [[sum]] of all samples
* `:SD` - standard deviation of population, corrected sample standard deviation, use: [[population-stddev]]
* `:MAD` - [[median-absolute-deviation]]
* `:SEM` - standard error of mean
* `:LAV` - lower adjacent value, use: [[adjacent-values]]
* `:UAV` - upper adjacent value, use: [[adjacent-values]]
* `:IQR` - interquartile range, `(- q3 q1)`
* `:LOF` - lower outer fence, `(- q1 (* 3.0 iqr))`
* `:UOF` - upper outer fence, `(+ q3 (* 3.0 iqr))`
* `:LIF` - lower inner fence, `(- q1 (* 1.5 iqr))`
* `:UIF` - upper inner fence, `(+ q3 (* 1.5 iqr))`
* `:Outliers` - number of [[outliers]], samples which are outside outer fences
* `:Kurtosis` - [[kurtosis]]
* `:Skewness` - [[skewness]]
* `:SecMoment` - second central moment, use: [[second-moment]]

Note: [[percentile]] and [[quantile]] can use 10 different interpolation strategies. See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.html)

### Correlation / Covariance / Divergence

* [[covariance]]
* [[correlation]]
* [[pearson-correlation]]
* [[spearman-correlation]]
* [[kendall-correlation]]
* [[kullback-leibler-divergence]]
* [[jensen-shannon-divergence]]

### Other

Normalize samples to have mean = 0 and standard deviation = 1 with [[standardize]].

Use [[histogram]] to count samples in evenly spaced ranges.
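The fence formulas listed for `:IQR`, `:LIF`, `:UIF`, `:LOF` and `:UOF` can be written out in plain Clojure. The following is only a sketch of the arithmetic: `fences` is a hypothetical helper, not part of the library, and it takes `q1` and `q3` as given instead of estimating them from the data.

```clojure
;; Hypothetical helper: computes IQR and the four fences from
;; quartiles supplied by the caller (no quantile estimation here).
(defn fences
  [q1 q3]
  (let [iqr (- q3 q1)]
    {:IQR iqr
     :LIF (- q1 (* 1.5 iqr))   ; lower inner fence
     :UIF (+ q3 (* 1.5 iqr))   ; upper inner fence
     :LOF (- q1 (* 3.0 iqr))   ; lower outer fence
     :UOF (+ q3 (* 3.0 iqr))})) ; upper outer fence

(fences 2.0 6.0)
;; => {:IQR 4.0, :LIF -4.0, :UIF 12.0, :LOF -10.0, :UOF 18.0}
```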
(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3)
Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LAV is the smallest value greater than or equal to the LIF = `(- Q1 (* 1.5 IQR))`.
* UAV is the largest value less than or equal to the UIF = `(+ Q3 (* 1.5 IQR))`.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
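The LAV/UAV definitions can be sketched in plain Clojure for the three-argument form, where `q1` and `q3` are passed in. `adjacent-values-sketch` is a hypothetical name, and returning a two-element vector is an assumption made for this illustration, not a statement about the library's return type.

```clojure
;; Hypothetical sketch of the definitions above, with q1 and q3 given.
;; Throws if no value lies on the required side of a fence.
(defn adjacent-values-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))
        uif (+ q3 (* 1.5 iqr))
        lav (apply min (filter #(>= % lif) vs))  ; smallest value >= LIF
        uav (apply max (filter #(<= % uif) vs))] ; largest value <= UIF
    [lav uav]))

;; With q1 = 3 and q3 = 8: IQR = 5, LIF = -4.5, UIF = 15.5
(adjacent-values-sketch [1 2 3 4 5 6 7 8 9 100] 3 8)
;; => [1 9]
```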
(correlation vs1 vs2)
Correlation of two sequences.
(covariance vs1 vs2)
Covariance of two sequences.
(extent vs)
Return the extent (min and max values) of a sequence.
(histogram vs bins)
(histogram vs bins [mn mx])
Calculate histogram.

Returns map with keys:

* `:size` - number of bins
* `:step` - distance between bins
* `:bins` - list of pairs of range lower value and number of hits
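To make the shape of the returned map concrete, here is a minimal plain-Clojure sketch of evenly spaced binning. `histogram-sketch` is a hypothetical function; the choice to clamp the maximum value into the last bin and the requirement that the data span a non-zero range are assumptions of this sketch, not documented library behaviour.

```clojure
;; Hypothetical sketch: evenly spaced bins between the min and max of vs.
;; Assumes (apply max vs) is strictly greater than (apply min vs).
(defn histogram-sketch
  [vs bins]
  (let [mn (double (apply min vs))
        mx (double (apply max vs))
        step (/ (- mx mn) bins)
        ;; bin index of a value; the maximum is clamped into the last bin
        idx (fn [v] (min (dec bins) (int (/ (- v mn) step))))
        counts (frequencies (map idx vs))]
    {:size bins
     :step step
     ;; pairs of [range lower value, number of hits]
     :bins (map (fn [i] [(+ mn (* i step)) (get counts i 0)])
                (range bins))}))

(histogram-sketch [0 1 2 3 4 5 6 7 8 9 10] 5)
;; => {:size 5, :step 2.0,
;;     :bins ([0.0 2] [2.0 2] [4.0 2] [6.0 2] [8.0 3])}
```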
(jensen-shannon-divergence vs1 vs2)
Jensen-Shannon divergence of two sequences.
(kendall-correlation vs1 vs2)
Kendall's correlation of two sequences.
(kullback-leibler-divergence vs1 vs2)
Kullback-Leibler divergence of two sequences.
(kurtosis vs)
Calculate kurtosis from sequence.
(median vs)
Calculate median of a list. See [[median-3]].
(median-3 a b c)
Median of three values. See [[median]].
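A median of three values can be found without sorting, using only `min` and `max`. The sketch below shows one common formulation; `median-3-sketch` is a hypothetical name and this is not claimed to be how the library implements it.

```clojure
;; Hypothetical sketch: median of three values via min/max only.
;; The median is the larger of (min a b) and (min (max a b) c).
(defn median-3-sketch
  [a b c]
  (max (min a b)
       (min (max a b) c)))

(median-3-sketch 3 1 2) ;; => 2
(median-3-sketch 5 5 1) ;; => 5
```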
(median-absolute-deviation vs)
Calculate median absolute deviation (MAD).
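For readers unfamiliar with the term, the median absolute deviation is the median of the absolute differences between each sample and the sample median. The plain-Clojure sketch below uses hypothetical helpers (`median-sketch`, `mad-sketch`) and applies no scaling constant; whether the library applies one is not stated here, so treat that as an open assumption.

```clojure
;; Hypothetical helper: textbook median (mean of the two middle
;; values for an even count). Throws on an empty sequence.
(defn median-sketch
  [vs]
  (let [s (vec (sort vs))
        n (count s)
        mid (quot n 2)]
    (if (odd? n)
      (double (s mid))
      (/ (+ (s (dec mid)) (s mid)) 2.0))))

;; Hypothetical sketch: unscaled median absolute deviation.
(defn mad-sketch
  [vs]
  (let [m (median-sketch vs)]
    (median-sketch (map #(Math/abs (double (- % m))) vs))))

;; Median is 2.0; absolute deviations are 1 1 0 0 2 4 7, median 1.0
(mad-sketch [1 1 2 2 4 6 9])
;; => 1.0
```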
(mode vs)
Find the value that appears most often in a dataset `vs`.

See also [[modes]].
(modes vs)
Find the values that appear most often in a dataset `vs`.

Returns a sequence of all the most frequent values in increasing order.

See also [[mode]].
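The difference between a single mode and all modes shows up when several values tie for the highest count. The sketch below uses `frequencies` from clojure.core; `modes-sketch` and `mode-sketch` are hypothetical names, and picking the smallest tied value as the single mode is an assumption of this sketch only.

```clojure
;; Hypothetical sketch: all values with the highest frequency,
;; returned in increasing order.
(defn modes-sketch
  [vs]
  (let [freqs (frequencies vs)
        top (apply max (vals freqs))]
    (sort (for [[v n] freqs :when (= n top)] v))))

;; Assumption for illustration: on ties, take the smallest mode.
(defn mode-sketch
  [vs]
  (first (modes-sketch vs)))

(modes-sketch [1 2 2 3 3 4]) ;; => (2 3)
(mode-sketch [1 2 2 3 3 4])  ;; => 2
```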
(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)
Find outliers defined as values outside outer fences.

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LOF (Lower Outer Fence) equals `(- Q1 (* 3.0 IQR))`.
* UOF (Upper Outer Fence) equals `(+ Q3 (* 3.0 IQR))`.

Returns sequence.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
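The outer-fence rule can be sketched in plain Clojure for the form that receives `q1` and `q3` directly. `outliers-sketch` is a hypothetical name; the sketch keeps values strictly outside the fences, which is one reading of "outside" and is an assumption here.

```clojure
;; Hypothetical sketch: values strictly below LOF or strictly above UOF,
;; with q1 and q3 supplied by the caller.
(defn outliers-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lof (- q1 (* 3.0 iqr))
        uof (+ q3 (* 3.0 iqr))]
    (filter #(or (< % lof) (> % uof)) vs)))

;; With q1 = 3 and q3 = 8: IQR = 5, LOF = -12.0, UOF = 23.0
(outliers-sketch [1 2 3 4 5 6 7 8 9 100] 3 8)
;; => (100)
```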
(pearson-correlation vs1 vs2)
Pearson's correlation of two sequences.
(percentile vs p)
(percentile vs p estimation-strategy)
Calculate percentile of `vs`.

Percentile `p` is from range 0-100.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

Optionally you can provide `estimation-strategy` to change the interpolation method for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html).

See also [[quantile]].
(population-stddev vs)
(population-stddev vs u)
Calculate population standard deviation of a list.

See [[stddev]].
(population-variance vs)
(population-variance vs u)
Calculate population variance of a list.

See [[variance]].
(quantile vs p)
(quantile vs p estimation-strategy)
Calculate quantile of `vs`.

Quantile `p` is from range 0.0-1.0.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html) for interpolation strategy.

Optionally you can provide `estimation-strategy` to change the interpolation method for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html).

See also [[percentile]].
(second-moment vs)
Calculate second moment from sequence.

It is the sum of squared deviations from the sample mean.
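The description above translates directly into a few lines of plain Clojure. `second-moment-sketch` is a hypothetical name used only to illustrate the formula; it is not the library function.

```clojure
;; Hypothetical sketch: sum of squared deviations from the sample mean.
(defn second-moment-sketch
  [vs]
  (let [n (count vs)
        mean (/ (reduce + vs) (double n))]
    (reduce + (map (fn [v] (let [d (- v mean)] (* d d))) vs))))

;; Mean is 5.0; squared deviations are 9 1 1 1 0 0 4 16
(second-moment-sketch [2 4 4 4 5 5 7 9])
;; => 32.0
```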
(skewness vs)
Calculate skewness from sequence.
(spearman-correlation vs1 vs2)
Spearman's correlation of two sequences.
(standardize vs)
Normalize samples to have mean = 0 and stddev = 1.
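Standardizing means subtracting the mean from every sample and dividing by the standard deviation. The sketch below is plain Clojure with a hypothetical name (`standardize-sketch`); it assumes the bias-corrected sample standard deviation (division by n - 1), which may or may not match the library, and it requires at least two non-identical samples.

```clojure
;; Hypothetical sketch: z-scores using the sample standard deviation
;; (n - 1 in the denominator; an assumption of this sketch).
(defn standardize-sketch
  [vs]
  (let [n (count vs)
        mean (/ (reduce + vs) (double n))
        ssd (reduce + (map (fn [v] (let [d (- v mean)] (* d d))) vs))
        sd (Math/sqrt (/ ssd (dec n)))]
    (map (fn [v] (/ (- v mean) sd)) vs)))

;; The result has mean 0 and sample standard deviation 1
;; (up to floating-point rounding).
(standardize-sketch [1 2 3 4 5])
```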
(stats-map vs)
(stats-map vs estimation-strategy)
Calculate several statistics from the list and return them as a map.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
(stddev vs)
(stddev vs u)
Calculate standard deviation of a list.

See [[population-stddev]].
(variance vs)
(variance vs u)
Calculate variance of a list.

See [[population-variance]].
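The usual distinction between the two variance flavours is the denominator: the population variance divides the sum of squared deviations by n, while the bias-corrected sample variance divides by n - 1. The sketch below illustrates that convention in plain Clojure with hypothetical names; that [[variance]] follows the n - 1 convention is an assumption here, not something these docstrings state.

```clojure
;; Hypothetical helper: sum of squared deviations from the mean.
(defn sum-sq-dev
  [vs]
  (let [mean (/ (reduce + vs) (double (count vs)))]
    (reduce + (map (fn [v] (let [d (- v mean)] (* d d))) vs))))

;; Population variance: divide by n.
(defn population-variance-sketch
  [vs]
  (/ (sum-sq-dev vs) (count vs)))

;; Sample variance (assumed convention): divide by n - 1.
(defn variance-sketch
  [vs]
  (/ (sum-sq-dev vs) (dec (count vs))))

;; For [1 2 3 4] the mean is 2.5 and the sum of squared deviations is 5.0
(population-variance-sketch [1 2 3 4]) ;; => 1.25
(variance-sketch [1 2 3 4])            ;; 5.0 divided by 3
```

The corresponding standard deviations are the square roots of these values.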