fastmath.stats

Clojure only.

Statistics functions.

* Descriptive statistics for a sequence.
* Correlation / covariance of two sequences.
* Outlier detection.

All functions are backed by Apache Commons Math or SMILE libraries. All work with Clojure sequences.

### Descriptive statistics

The all-in-one function [[stats-map]] returns a map with the following keys:

* `:Size` - size of the samples, `(count ...)`
* `:Min` - [[minimum]] value
* `:Max` - [[maximum]] value
* `:Mean` - [[mean]]/average
* `:Median` - [[median]], see also: [[median-3]]
* `:Mode` - [[mode]], see also: [[modes]]
* `:Q1` - first quartile, use: [[percentile]], [[quantile]]
* `:Q3` - third quartile, use: [[percentile]], [[quantile]]
* `:Total` - [[sum]] of all samples
* `:SD` - standard deviation of population (corrected sample standard deviation), use: [[population-stddev]]
* `:MAD` - [[median-absolute-deviation]]
* `:SEM` - standard error of mean
* `:LAV` - lower adjacent value, use: [[adjacent-values]]
* `:UAV` - upper adjacent value, use: [[adjacent-values]]
* `:IQR` - interquartile range, `(- q3 q1)`
* `:LOF` - lower outer fence, `(- q1 (* 3.0 iqr))`
* `:UOF` - upper outer fence, `(+ q3 (* 3.0 iqr))`
* `:LIF` - lower inner fence, `(- q1 (* 1.5 iqr))`
* `:UIF` - upper inner fence, `(+ q3 (* 1.5 iqr))`
* `:Outliers` - number of [[outliers]], samples which are outside outer fences
* `:Kurtosis` - [[kurtosis]]
* `:Skewness` - [[skewness]]
* `:SecMoment` - second central moment, use: [[second-moment]]
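
The fence-related keys follow directly from the two quartiles. Below is a minimal sketch in plain Clojure, assuming `q1` and `q3` are already computed. `fences-sketch` is a hypothetical helper used only for illustration; it is not a function of this namespace.

```clojure
;; Hypothetical helper, not part of fastmath.stats.
;; Derives the IQR and the four fences from given quartiles q1 and q3,
;; using the formulas listed for :IQR, :LOF, :UOF, :LIF and :UIF.
(defn fences-sketch
  [q1 q3]
  (let [iqr (- q3 q1)]
    {:IQR iqr
     :LOF (- q1 (* 3.0 iqr))    ; lower outer fence
     :UOF (+ q3 (* 3.0 iqr))    ; upper outer fence
     :LIF (- q1 (* 1.5 iqr))    ; lower inner fence
     :UIF (+ q3 (* 1.5 iqr))})) ; upper inner fence

(fences-sketch 2.0 6.0)
;; => {:IQR 4.0, :LOF -10.0, :UOF 18.0, :LIF -4.0, :UIF 12.0}
```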

Note: [[percentile]] and [[quantile]] support 10 different interpolation strategies. See the [docs](http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/stat/descriptive/rank/Percentile.html).

### Correlation / Covariance / Divergence

* [[covariance]]
* [[correlation]]
* [[pearson-correlation]]
* [[spearman-correlation]]
* [[kendall-correlation]]
* [[kullback-leibler-divergence]]
* [[jensen-shannon-divergence]]

### Other

Normalize samples to have mean=0 and standard deviation = 1 with [[standardize]].

Use [[histogram]] to count samples in evenly spaced ranges.

adjacent-values^clj

(adjacent-values vs)

(adjacent-values vs estimation-strategy)

(adjacent-values vs q1 q3)

Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LAV is the smallest sample value that is greater than or equal to the LIF = `(- Q1 (* 1.5 IQR))`.
* UAV is the largest sample value that is less than or equal to the UIF = `(+ Q3 (* 1.5 IQR))`.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].
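
The definition above can be sketched in plain Clojure for the arity that takes precomputed quartiles. `adjacent-values-sketch` is a hypothetical name, and returning the pair as a two-element vector is an assumption of this sketch, not something the docstring states.

```clojure
;; Hypothetical helper, not the library implementation.
;; Assumes at least one sample lies between the inner fences.
(defn adjacent-values-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))            ; lower inner fence
        uif (+ q3 (* 1.5 iqr))            ; upper inner fence
        inside (filter #(<= lif % uif) vs)] ; samples within the inner fences
    [(apply min inside) (apply max inside)]))

(adjacent-values-sketch [1 2 3 4 5 6 7 8 50] 3.0 7.0)
;; => [1 8]
```

Here the inner fences are -3.0 and 13.0, so 50 is excluded and the adjacent values are the extremes of the remaining samples.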

correlation^clj

(correlation vs1 vs2)

Correlation of two sequences.

covariance^clj

(covariance vs1 vs2)

Covariance of two sequences.

estimation-strategies-list^clj

extent^clj

(extent vs)

Return the extent (minimum and maximum values) of a sequence.

histogram^clj

(histogram vs bins)

(histogram vs bins [mn mx])

Calculate a histogram.

Returns a map with keys:

* `:size` - number of bins
* `:step` - bin width (distance between the lower bounds of consecutive bins)
* `:bins` - list of pairs, each holding a range's lower value and its number of hits
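
To illustrate the shape of the returned map, here is a minimal sketch of evenly spaced binning in plain Clojure. `histogram-sketch` is a hypothetical name. The sketch assumes the range is taken from the data's minimum and maximum, that the maximum is counted in the last bin, and that the data is not constant; the library may treat these cases differently.

```clojure
;; Hypothetical helper, not the library implementation.
(defn histogram-sketch
  [vs bins]
  (let [mn (double (apply min vs))
        mx (double (apply max vs))
        step (/ (- mx mn) bins)
        ;; index of the bin a value falls into; the maximum is clamped
        ;; into the last bin
        bin-of (fn [v] (min (dec bins) (long (/ (- v mn) step))))
        hits (frequencies (map bin-of vs))]
    {:size bins
     :step step
     ;; pairs of [lower bound of the range, number of hits]
     :bins (map (fn [i] [(+ mn (* i step)) (get hits i 0)])
                (range bins))}))

(histogram-sketch [0 1 2 3 4 5 6 7 8 9 10] 5)
;; => {:size 5, :step 2.0, :bins ([0.0 2] [2.0 2] [4.0 2] [6.0 2] [8.0 3])}
```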

jensen-shannon-divergence^clj

(jensen-shannon-divergence vs1 vs2)

Jensen-Shannon divergence of two sequences.

k-means^clj

(k-means k vs)

k-means clustering of `vs` into `k` clusters.

kendall-correlation^clj

(kendall-correlation vs1 vs2)

Kendall's correlation of two sequences.

kullback-leibler-divergence^clj

(kullback-leibler-divergence vs1 vs2)

Kullback-Leibler divergence of two sequences.
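
For reference, the textbook definitions of this divergence and of the related Jensen-Shannon divergence can be sketched in plain Clojure. Both function names are hypothetical. The sketch assumes that the two sequences are discrete probability distributions of equal length and that the natural logarithm is used; the docstring does not confirm either point.

```clojure
;; Hypothetical helpers, not the library implementation.
;; KL(P || Q) = sum of p * ln(p / q) over matching entries.
(defn kl-divergence-sketch
  [ps qs]
  (reduce + (map (fn [p q]
                   ;; entries with p = 0 contribute nothing to the sum
                   (if (zero? p)
                     0.0
                     (* p (Math/log (double (/ p q))))))
                 ps qs)))

;; JS(P, Q) = (KL(P || M) + KL(Q || M)) / 2, where M is the midpoint of P and Q.
(defn js-divergence-sketch
  [ps qs]
  (let [ms (map #(/ (+ %1 %2) 2.0) ps qs)]
    (+ (* 0.5 (kl-divergence-sketch ps ms))
       (* 0.5 (kl-divergence-sketch qs ms)))))

(kl-divergence-sketch [0.5 0.5] [0.5 0.5])
;; => 0.0
```

Identical distributions give a divergence of zero; the Kullback-Leibler form is not symmetric in its arguments, while the Jensen-Shannon form is.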

kurtosis^clj

(kurtosis vs)

Calculate kurtosis from sequence.

maximum^clj

(maximum vs)

Maximum value from sequence.

mean^clj

(mean vs)

Calculate the mean of a sequence.

median^clj

(median vs)

Calculate the median of a sequence. See [[median-3]].

median-3^clj

(median-3 a b c)

Median of three values. See [[median]].
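
The median of three values can be found with comparisons alone, without sorting. The following is one possible sketch in plain Clojure; `median-3-sketch` is a hypothetical name and this is not necessarily how the library computes it.

```clojure
;; Hypothetical helper, not the library implementation.
;; The median is the larger of (min a b) and (min (max a b) c).
(defn median-3-sketch
  [a b c]
  (max (min a b)
       (min (max a b) c)))

(median-3-sketch 3 1 2)
;; => 2
```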

median-absolute-deviation^clj

(median-absolute-deviation vs)

Calculate the median absolute deviation (MAD).
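
As an illustration, the usual definition of MAD (the median of the absolute deviations from the median) can be sketched in plain Clojure. Both helper names are hypothetical, and the sketch assumes the unscaled form; whether the library applies a scaling constant is not stated here.

```clojure
;; Hypothetical helpers, not the library implementation.
(defn median-sketch
  [vs]
  (let [s (vec (sort vs))
        n (count s)
        mid (quot n 2)]
    (if (odd? n)
      (s mid)                               ; middle element
      (/ (+ (s (dec mid)) (s mid)) 2.0))))  ; mean of the two middle elements

(defn mad-sketch
  [vs]
  (let [m (median-sketch vs)]
    ;; median of the absolute deviations from the median
    (median-sketch (map #(Math/abs (double (- % m))) vs))))

(mad-sketch [1 2 3 4 100])
;; => 1.0
```

The single large value 100 barely affects the result, which is why MAD is often preferred over the standard deviation for data with outliers.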

minimum^clj

(minimum vs)

Minimum value from sequence.

mode^clj

(mode vs)

Find the value that appears most often in the dataset `vs`.

modes^clj

(modes vs)

Find the values that appear most often in the dataset `vs`.

Returns a sequence of all the most frequent values in increasing order.
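
The behaviour described above can be sketched in plain Clojure with `frequencies`. `modes-sketch` is a hypothetical name and the sketch is not the library implementation.

```clojure
;; Hypothetical helper, not the library implementation.
(defn modes-sketch
  [vs]
  (let [freqs (frequencies vs)          ; value -> number of occurrences
        top (apply max (vals freqs))]   ; highest occurrence count
    ;; keep every value that reaches the highest count, in increasing order
    (sort (for [[v n] freqs :when (= n top)] v))))

(modes-sketch [3 2 2 3 1 4])
;; => (2 3)
```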

outliers^clj

(outliers vs)

(outliers vs estimation-strategy)

(outliers vs q1 q3)

Find outliers, defined as values outside the outer fences.

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is `(- Q3 Q1)`.

* LOF (Lower Outer Fence) equals `(- Q1 (* 3.0 IQR))`.
* UOF (Upper Outer Fence) equals `(+ Q3 (* 3.0 IQR))`.

Returns a sequence.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].
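
For the arity that takes precomputed quartiles, the outer-fence rule can be sketched in plain Clojure as follows. `outliers-sketch` is a hypothetical name and the sketch is not the library implementation.

```clojure
;; Hypothetical helper, not the library implementation.
(defn outliers-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lof (- q1 (* 3.0 iqr))   ; lower outer fence
        uof (+ q3 (* 3.0 iqr))]  ; upper outer fence
    ;; keep only samples strictly outside the outer fences
    (filter #(or (< % lof) (> % uof)) vs)))

(outliers-sketch [1 2 3 4 5 6 7 8 50] 3.0 7.0)
;; => (50)
```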

pearson-correlation^clj

(pearson-correlation vs1 vs2)

Pearson's correlation of two sequences.
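
For reference, the textbook formula for Pearson's correlation coefficient can be sketched in plain Clojure. `pearson-sketch` is a hypothetical name. The sketch assumes two sequences of equal length, neither of them constant, and is not the library implementation.

```clojure
;; Hypothetical helper, not the library implementation.
;; r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2)),
;; where dx and dy are deviations from the respective means.
(defn pearson-sketch
  [xs ys]
  (let [n (count xs)
        mx (/ (reduce + xs) (double n))
        my (/ (reduce + ys) (double n))
        dx (map #(- % mx) xs)
        dy (map #(- % my) ys)
        sxy (reduce + (map * dx dy))
        sxx (reduce + (map * dx dx))
        syy (reduce + (map * dy dy))]
    (/ sxy (Math/sqrt (* sxx syy)))))

(pearson-sketch [1 2 3 4] [2 4 6 8])
;; => 1.0
```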

percentile^clj

(percentile vs p)

(percentile vs p estimation-strategy)

Calculate the percentile of `vs`.

Percentile `p` is in the range 0-100.

See the Apache Commons Math `Percentile` documentation for the interpolation details.

Optionally, provide `estimation-strategy` to change the interpolation method used for selecting values. The default is `:legacy`.

population-stddev^clj

(population-stddev vs)

(population-stddev vs u)

Calculate population standard deviation of a list.

See [[stddev]].

population-variance^clj

(population-variance vs)

(population-variance vs u)

Calculate population variance of a list.

See [[variance]].

quantile^clj

(quantile vs p)

(quantile vs p estimation-strategy)

Calculate the quantile of `vs`.

Quantile `p` is in the range 0.0-1.0.

See the Apache Commons Math `Percentile` documentation for the interpolation strategy.

Optionally, provide `estimation-strategy` to change the interpolation method used for selecting values. The default is `:legacy`.
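
To show how the 0.0-1.0 scale relates to the 0-100 scale of `percentile`, here is a minimal sketch in plain Clojure. Both function names are hypothetical. The sketch uses plain linear interpolation between order statistics, which is only one of several common rules and is not claimed to match the library's `:legacy` default.

```clojure
;; Hypothetical helpers, not the library implementation.
;; Percentile p (0-100) by linear interpolation between sorted samples.
(defn percentile-sketch
  [vs p]
  (let [sorted (vec (sort vs))
        n (count sorted)
        pos (* (/ p 100.0) (dec n))  ; fractional index into the sorted data
        lo (long (Math/floor pos))
        hi (min (dec n) (inc lo))
        frac (- pos lo)]
    (+ (sorted lo) (* frac (- (sorted hi) (sorted lo))))))

;; A quantile q (0.0-1.0) is the percentile at 100 * q.
(defn quantile-sketch
  [vs q]
  (percentile-sketch vs (* 100.0 q)))

(percentile-sketch [1 2 3 4 5] 25) ;; => 2.0
(quantile-sketch [1 2 3 4 5] 0.25) ;; => 2.0
```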

second-moment^clj

(second-moment vs)

Calculate the second moment of a sequence.

It is the sum of squared deviations from the sample mean.
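
The definition above translates directly into plain Clojure. `second-moment-sketch` is a hypothetical name used for illustration; it is not the library implementation.

```clojure
;; Hypothetical helper, not the library implementation.
;; Sum of squared deviations from the sample mean (no division by n).
(defn second-moment-sketch
  [vs]
  (let [mean (/ (reduce + vs) (double (count vs)))]
    (reduce + (map #(let [d (- % mean)] (* d d)) vs))))

(second-moment-sketch [1 2 3 4 5])
;; => 10.0
```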

skewness^clj

(skewness vs)

Calculate skewness from sequence.

spearman-correlation^clj

(spearman-correlation vs1 vs2)

Spearman's correlation of two sequences.

standardize^clj

(standardize vs)

Normalize samples to have mean = 0 and stddev = 1.
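
As an illustration, standardization subtracts the mean from each sample and divides by the standard deviation. The sketch below is plain Clojure with a hypothetical name. It assumes the sample standard deviation (denominator n - 1) and at least two non-identical samples; the denominator used by the library is not stated here.

```clojure
;; Hypothetical helper, not the library implementation.
(defn standardize-sketch
  [vs]
  (let [n (count vs)
        mean (/ (reduce + vs) (double n))
        ;; sample variance with the n - 1 denominator (an assumption)
        variance (/ (reduce + (map #(let [d (- % mean)] (* d d)) vs))
                    (dec n))
        sd (Math/sqrt variance)]
    (map #(/ (- % mean) sd) vs)))

(standardize-sketch [1 2 3])
;; => (-1.0 0.0 1.0)
```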

stats-map^clj

(stats-map vs)

(stats-map vs estimation-strategy)

Calculate several statistics from the sequence and return them as a map.

The optional `estimation-strategy` argument changes the estimation type used for quantile calculations. See [[estimation-strategies-list]].

stddev^clj

(stddev vs)

(stddev vs u)

Calculate population standard deviation of a list.

See [[population-stddev]].

sum^clj

(sum vs)

Sum of all `vs` values.

variance^clj

(variance vs)

(variance vs u)

Calculate variance of a list.

See [[population-variance]].
