Statistics functions.

* Descriptive statistics for sequence.
* Correlation / covariance of two sequences.
* Outliers

All functions are backed by Apache Commons Math or SMILE libraries. All work with Clojure sequences.

### Descriptive statistics

All in one function [[stats-map]] contains:

* `:Size` - size of the samples, `(count ...)`
* `:Min` - [[minimum]] value
* `:Max` - [[maximum]] value
* `:Mean` - [[mean]]/average
* `:Median` - [[median]], see also: [[median-3]]
* `:Mode` - [[mode]], see also: [[modes]]
* `:Q1` - first quartile, use: [[percentile]], [[quartile]]
* `:Q3` - third quartile, use: [[percentile]], [[quartile]]
* `:Total` - [[sum]] of all samples
* `:SD` - standard deviation of population, corrected sample standard deviation, use: [[population-stddev]]
* `:MAD` - [[median-absolute-deviation]]
* `:SEM` - standard error of mean
* `:LAV` - lower adjacent value, use: [[adjacent-values]]
* `:UAV` - upper adjacent value, use: [[adjacent-values]]
* `:IQR` - interquartile range, `(- q3 q1)`
* `:LOF` - lower outer fence, `(- q1 (* 3.0 iqr))`
* `:UOF` - upper outer fence, `(+ q3 (* 3.0 iqr))`
* `:LIF` - lower inner fence, `(- q1 (* 1.5 iqr))`
* `:UIF` - upper inner fence, `(+ q3 (* 1.5 iqr))`
* `:Outliers` - number of [[outliers]], samples which are outside outer fences
* `:Kurtosis` - [[kurtosis]]
* `:Skewness` - [[skewness]]
* `:SecMoment` - second central moment, use: [[second-moment]]

Note: [[percentile]] and [[quartile]] can have 10 different interpolation strategies. See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.html)

### Correlation / Covariance / Divergence

* [[covariance]]
* [[correlation]]
* [[pearson-correlation]]
* [[spearman-correlation]]
* [[kendall-correlation]]
* [[kullback-leibler-divergence]]
* [[jensen-shannon-divergence]]

### Other

Normalize samples to have mean=0 and standard deviation = 1 with [[standardize]].

[[histogram]] to count samples in evenly spaced ranges.
(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3)
Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LAV is the smallest value which is greater than or equal to the LIF = `(- Q1 (* 1.5 IQR))`.
* UAV is the largest value which is lower than or equal to the UIF = `(+ Q3 (* 1.5 IQR))`.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
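The definition above can be sketched in plain Clojure. This is only an illustration of the LIF/UIF rule with the quartiles supplied by the caller, not the library's implementation; `adjacent-values-sketch` is a hypothetical name, and the sketch assumes at least one sample lies inside the inner fences.

```clojure
(defn adjacent-values-sketch
  "Returns [lav uav] for samples `vs`, given quartiles `q1` and `q3`."
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))                    ; lower inner fence
        uif (+ q3 (* 1.5 iqr))                    ; upper inner fence
        lav (apply min (filter #(>= % lif) vs))   ; smallest sample >= LIF
        uav (apply max (filter #(<= % uif) vs))]  ; largest sample <= UIF
    [lav uav]))

;; Quartiles 3.0 and 8.0 are passed in by hand here.
(adjacent-values-sketch [1 2 3 4 5 6 7 8 9 100] 3.0 8.0)
;; => [1 9]
```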
(correlation vs1 vs2)
Correlation of two sequences.
(covariance vs1 vs2)
Covariance of two sequences.
(estimate-bins vs)
(estimate-bins vs method)
Estimate number of bins for histogram.

Possible methods are: `:sqrt`, `:sturges`, `:rice`, `:doane`, `:scott`, `:freedman-diaconis` (default).
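For orientation, three of the listed methods have simple textbook formulas that depend only on the sample count. The sketch below uses those textbook forms; the library's exact rounding may differ, and `bin-count-sketch` is a hypothetical name.

```clojure
(defn bin-count-sketch
  "Textbook bin-count rules for `n` samples. Covers only the three
  methods that need nothing but the sample count."
  [method n]
  (let [n (double n)]
    (long
     (case method
       :sqrt    (Math/ceil (Math/sqrt n))                           ; ceil(sqrt n)
       :sturges (inc (Math/ceil (/ (Math/log n) (Math/log 2.0))))   ; ceil(log2 n) + 1
       :rice    (Math/ceil (* 2.0 (Math/cbrt n)))))))               ; ceil(2 * n^(1/3))

(map #(bin-count-sketch % 100) [:sqrt :sturges :rice])
;; => (10 8 10)
```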
(extent vs)
Return extent (min, max) values from sequence
(histogram vs)
(histogram vs bins-or-estimate-method)
(histogram vs bins [mn mx])
Calculate histogram.

Returns map with keys:

* `:size` - number of bins
* `:step` - distance between bins
* `:bins` - list of triples of range lower value, number of hits and ratio of used samples
* `:min` - min value
* `:max` - max value
* `:samples` - number of used samples

For estimation methods check [[estimate-bins]].
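A minimal sketch of how a map of this shape could be built for evenly spaced bins follows. It is not the library's code: `histogram-sketch` is a hypothetical name, and counting the maximum sample in the last bin is an assumption about edge handling.

```clojure
(defn histogram-sketch
  "Counts samples `vs` in `bins` evenly spaced ranges between min and max.
  Assumes the samples are not all equal (step would be zero)."
  [vs bins]
  (let [mn   (double (apply min vs))
        mx   (double (apply max vs))
        step (/ (- mx mn) bins)
        n    (count vs)
        ;; bin index of a value; the maximum is clamped into the last bin
        idx  (fn [v] (min (dec bins) (long (/ (- v mn) step))))
        hits (frequencies (map idx vs))]
    {:size    bins
     :step    step
     :min     mn
     :max     mx
     :samples n
     ;; triples: [range lower value, number of hits, ratio of used samples]
     :bins    (mapv (fn [i]
                      (let [c (get hits i 0)]
                        [(+ mn (* i step)) c (/ (double c) n)]))
                    (range bins))}))

(histogram-sketch [0 1 2 3 4 5 6 7 8 10] 5)
```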
(iqr vs)
(iqr vs estimation-strategy)
Interquartile range.
(jensen-shannon-divergence vs1 vs2)
Jensen-Shannon divergence of two sequences.
(kendall-correlation vs1 vs2)
Kendall's correlation of two sequences.
(kernel-density vs)
(kernel-density vs h)
Creates kernel density function for given series `vs` and optional bandwidth `h`.
(kullback-leibler-divergence vs1 vs2)
Kullback-Leibler divergence of two sequences.
(kurtosis vs)
Calculate kurtosis from sequence.
(median vs)
Calculate median of `vs`. See [[median-3]].
(median-3 a b c)
Median of three values. See [[median]].
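A median of three values can be computed without sorting, using only `min` and `max`. The sketch below shows one common way to do it; `median-3-sketch` is a hypothetical name and this is not necessarily how the library does it.

```clojure
(defn median-3-sketch
  "Middle value of `a`, `b` and `c` using only min/max comparisons."
  [a b c]
  (max (min a b)
       (min (max a b) c)))

(median-3-sketch 3 1 2)
;; => 2
```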
(median-absolute-deviation vs)
Calculate MAD (median absolute deviation).
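In its textbook form, MAD is the median of the absolute deviations from the median. The sketch below follows that unscaled definition in plain Clojure; whether the library applies any scaling factor is not stated here, and both function names are hypothetical.

```clojure
(defn median-sketch
  "Median of `vs`: middle value, or the mean of the two middle values."
  [vs]
  (let [s (vec (sort vs))
        n (count s)
        h (quot n 2)]
    (if (odd? n)
      (double (s h))
      (/ (+ (s (dec h)) (s h)) 2.0))))

(defn mad-sketch
  "Median of absolute deviations from the median (unscaled)."
  [vs]
  (let [m (median-sketch vs)]
    (median-sketch (map #(Math/abs (double (- % m))) vs))))

(mad-sketch [1 1 2 2 4 6 9])
;; => 1.0
```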
(mode vs)
Find the value that appears most often in a dataset `vs`. See also [[modes]].
(modes vs)
Find the values that appear most often in a dataset `vs`. Returns a sequence with all the most frequent values in increasing order. See also [[mode]].
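The behaviour described above can be sketched with `frequencies` from `clojure.core`. This is an illustration only, with the hypothetical name `modes-sketch`; it assumes `vs` is non-empty.

```clojure
(defn modes-sketch
  "All values of `vs` that share the highest count, in increasing order."
  [vs]
  (let [freqs (frequencies vs)
        top   (apply max (vals freqs))]   ; highest occurrence count
    (sort (for [[v c] freqs :when (= c top)] v))))

(modes-sketch [1 2 2 3 3 4])
;; => (2 3)
```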
(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)
Find outliers defined as values outside outer fences.

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LOF (Lower Outer Fence) equals `(- Q1 (* 3.0 IQR))`.
* UOF (Upper Outer Fence) equals `(+ Q3 (* 3.0 IQR))`.

Returns sequence.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
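The outer-fence rule can be sketched as a plain filter, with the quartiles supplied by the caller. `outliers-sketch` is a hypothetical name and the code only illustrates the definition; it is not the library's implementation.

```clojure
(defn outliers-sketch
  "Samples of `vs` lying outside the outer fences, given `q1` and `q3`."
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lof (- q1 (* 3.0 iqr))   ; lower outer fence
        uof (+ q3 (* 3.0 iqr))]  ; upper outer fence
    (filter #(or (< % lof) (> % uof)) vs)))

;; Quartiles 3.0 and 8.0 are passed in by hand here.
(outliers-sketch [1 2 3 4 5 6 7 8 9 100] 3.0 8.0)
;; => (100)
```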
(pearson-correlation vs1 vs2)
Pearson's correlation of two sequences.
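For reference, Pearson's coefficient is the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The sketch below is plain Clojure for that textbook formula, not the library's implementation; `pearson-sketch` is a hypothetical name, and both sequences are assumed to have the same length and non-zero variance.

```clojure
(defn pearson-sketch
  "Textbook Pearson correlation of two equally long sequences."
  [xs ys]
  (let [n   (double (count xs))
        mx  (/ (reduce + xs) n)
        my  (/ (reduce + ys) n)
        dx  (map #(- % mx) xs)             ; deviations from the mean of xs
        dy  (map #(- % my) ys)             ; deviations from the mean of ys
        sxy (reduce + (map * dx dy))
        sxx (reduce + (map * dx dx))
        syy (reduce + (map * dy dy))]
    (/ sxy (Math/sqrt (double (* sxx syy))))))

(pearson-sketch [1 2 3 4] [2 4 6 8])
;; => 1.0
```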
(percentile vs p)
(percentile vs p estimation-strategy)
Calculate percentile of `vs`.

Percentile `p` is from range 0-100.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

Optionally you can provide `estimation-strategy` to change the interpolation method for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html).

See also [[quantile]].
(population-stddev vs)
(population-stddev vs u)
Calculate population standard deviation of `vs`. See [[stddev]].
(population-variance vs)
(population-variance vs u)
Calculate population variance of `vs`. See [[variance]].
(quantile vs p)
(quantile vs p estimation-strategy)
Calculate quantile of `vs`.

Quantile `p` is from range 0.0-1.0.

See [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.4/org/apache/commons/math3/stat/descriptive/rank/Percentile.html) for interpolation strategy.

Optionally you can provide `estimation-strategy` to change the interpolation method for selecting values. Default is `:legacy`. See more [here](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html).

See also [[percentile]].
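To illustrate how the 0.0-1.0 quantile scale relates to the 0-100 percentile scale, the sketch below implements one common linear-interpolation rule (often labelled R-7). This is an assumption chosen for simplicity and is not the `:legacy` default described above; both function names are hypothetical.

```clojure
(defn percentile-r7-sketch
  "Percentile `p` (0-100) of `vs` using linear interpolation between
  the two nearest order statistics (R-7 style rule)."
  [vs p]
  (let [s    (vec (sort vs))
        n    (count s)
        h    (* (/ p 100.0) (dec n))        ; fractional index into sorted data
        lo   (long (Math/floor h))
        hi   (min (dec n) (inc lo))
        frac (- h lo)]
    (+ (s lo) (* frac (- (s hi) (s lo))))))

(defn quantile-r7-sketch
  "Quantile `q` (0.0-1.0): the same calculation on a rescaled argument."
  [vs q]
  (percentile-r7-sketch vs (* 100.0 q)))

(percentile-r7-sketch [5 1 4 2 3] 50)   ;; => 3.0
(quantile-r7-sketch [1 2 3 4] 0.5)      ;; => 2.5
```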
(second-moment vs)
Calculate second moment from sequence. It's a sum of squared deviations from the sample mean.
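The description above translates directly into a short reduction. The following is a plain-Clojure sketch of that sum, with the hypothetical name `second-moment-sketch`; note that the sum is not divided by the sample count.

```clojure
(defn second-moment-sketch
  "Sum of squared deviations of `vs` from their mean."
  [vs]
  (let [mean (/ (reduce + vs) (double (count vs)))]
    (reduce + (map #(let [d (- % mean)] (* d d)) vs))))

(second-moment-sketch [1 2 3 4])
;; => 5.0
```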
(skewness vs)
Calculate skewness from sequence.
(spearman-correlation vs1 vs2)
Spearman's correlation of two sequences.
(standardize vs)
Normalize samples to have mean = 0 and stddev = 1.
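Standardizing means subtracting the mean and dividing by the standard deviation. The sketch below assumes the sample standard deviation (divisor n-1); which divisor the library uses is not stated here, so treat that choice as an assumption. `standardize-sketch` is a hypothetical name.

```clojure
(defn standardize-sketch
  "Rescales `vs` to mean 0 and (sample) standard deviation 1.
  Assumes at least two samples that are not all equal."
  [vs]
  (let [n        (count vs)
        mean     (/ (reduce + vs) (double n))
        ssd      (reduce + (map #(let [d (- % mean)] (* d d)) vs))
        variance (/ ssd (dec n))            ; assumption: divisor n-1
        sd       (Math/sqrt (double variance))]
    (map #(/ (- % mean) sd) vs)))

(standardize-sketch [2 4 6])
;; => (-1.0 0.0 1.0)
```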
(stats-map vs)
(stats-map vs estimation-strategy)
Calculate several statistics of `vs` and return them as a map.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies]].
(stddev vs)
(stddev vs u)
Calculate standard deviation of `vs`. See [[population-stddev]].
(variance vs)
(variance vs u)
Calculate variance of `vs`. See [[population-variance]].
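The usual textbook distinction between the two variants is the divisor: n-1 for the sample variance and n for the population variance. The sketch below illustrates that distinction in plain Clojure; it assumes these textbook definitions apply here, and all three function names are hypothetical.

```clojure
(defn sum-of-squared-deviations
  "Sum of squared deviations of `vs` from their mean."
  [vs]
  (let [mean (/ (reduce + vs) (double (count vs)))]
    (reduce + (map #(let [d (- % mean)] (* d d)) vs))))

(defn variance-sketch
  "Sample variance: divisor n-1. Needs at least two samples."
  [vs]
  (/ (sum-of-squared-deviations vs) (dec (count vs))))

(defn population-variance-sketch
  "Population variance: divisor n."
  [vs]
  (/ (sum-of-squared-deviations vs) (count vs)))

(population-variance-sketch [2 4 4 4 5 5 7 9])
;; => 4.0
```

The corresponding standard deviations are the square roots of these values.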