Knuth's Bayesian histogram binning algorithm.
Implements optimal bin-count selection by maximizing the log-posterior from Knuth (2019), DOI: 10.1016/j.dsp.2019.102581.
The algorithm finds the optimal number of equal-width bins M by maximizing:
  F(M|x,I) = n·log(M) + logΓ(M/2) - M·logΓ(1/2) - logΓ((2n+M)/2) + Σₖ₌₁ᴹ logΓ(nₖ + 1/2)
where n is the sample count and nₖ is the count in bin k.
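As a sanity check on the formula, two small cases can be worked by hand. This is only an illustrative derivation using the identities Γ(1) = 1, Γ(3) = 2 and Γ(3/2) = Γ(1/2)/2; the sample values are made up for the example.

```latex
\begin{align*}
% One bin (M = 1, n_1 = n): every term cancels, for any n.
F(1 \mid x, I) &= n\log 1 + \log\Gamma(\tfrac12) - \log\Gamma(\tfrac12)
                 - \log\Gamma(n + \tfrac12) + \log\Gamma(n + \tfrac12) = 0 \\[4pt]
% Two bins, two samples, one sample per bin (M = 2, n = 2, n_1 = n_2 = 1).
F(2 \mid x, I) &= 2\log 2 + \log\Gamma(1) - 2\log\Gamma(\tfrac12)
                 - \log\Gamma(3) + 2\log\Gamma(\tfrac32) \\
               &= 2\log 2 - 2\log\Gamma(\tfrac12) - \log 2
                 + 2\log\Gamma(\tfrac12) - 2\log 2 \\
               &= -\log 2 \approx -0.693
\end{align*}
```

Since F(1) = 0 is greater than F(2) = -log 2, two evenly split samples do not justify a second bin under this criterion.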
All functions require typed arrays (DoubleArray, LongArray).
(log-posterior n bin-counts)
Compute Knuth's log-posterior for M bins given sample count and bin counts.
  F(M|x,I) = n·log(M) + logΓ(M/2) - M·logΓ(1/2) - logΓ((2n+M)/2) + Σₖ₌₁ᴹ logΓ(nₖ + 1/2)
Parameters:
  n - total sample count
  bin-counts - LongArray of counts per bin
Returns the log-posterior value (higher is better).
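The formula above can be sketched in plain Clojure as follows. This is a minimal illustration, not the library's implementation: `log-gamma` and `sketch-log-posterior` are hypothetical names, the Lanczos approximation is one possible way to obtain logΓ, and M is assumed to be the length of `bin-counts`.

```clojure
;; Lanczos coefficients (g = 7, 9 terms), a standard published set.
(def lanczos-coefs
  [0.99999999999980993 676.5203681218851 -1259.1392167224028
   771.32342877765313 -176.61502916214059 12.507343278686905
   -0.13857109526572012 9.9843695780195716e-6 1.5056327351493116e-7])

(defn log-gamma
  "Hypothetical helper: log Γ(x) via the Lanczos approximation.
  No reflection formula is applied, so use it only for x >= 0.5,
  which covers every argument the log-posterior needs."
  [x]
  (let [z (- (double x) 1.0)
        t (+ z 7.5)
        a (reduce + (first lanczos-coefs)
                  (map-indexed (fn [i c] (/ c (+ z (inc i))))
                               (rest lanczos-coefs)))]
    (+ (* 0.5 (Math/log (* 2.0 Math/PI)))
       (* (+ z 0.5) (Math/log t))
       (- t)
       (Math/log a))))

(defn sketch-log-posterior
  "Sketch of F(M|x,I); M is assumed to be the number of entries in bin-counts."
  [n bin-counts]
  (let [m (count bin-counts)]
    (+ (* n (Math/log (double m)))
       (log-gamma (/ m 2.0))
       (- (* m (log-gamma 0.5)))
       (- (log-gamma (+ n (/ m 2.0))))      ; logΓ((2n+M)/2) = logΓ(n + M/2)
       (reduce + (map (fn [c] (log-gamma (+ c 0.5))) bin-counts)))))

;; Example call: two samples, one in each of two bins.
(sketch-log-posterior 2 (long-array [1 1]))
```

With a single bin holding all samples the terms cancel and the result is 0 up to rounding, which makes a convenient baseline when comparing values for larger M.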
(optimal-bins data)
(optimal-bins data {:keys [max-bins min max] :or {max-bins 50}})
Find optimal number of bins using Knuth's Bayesian method.
Searches M ∈ [1, max-bins] for the value that maximizes the log-posterior.
Requires a typed array (DoubleArray, LongArray).
Parameters:
  data - typed array (DoubleArray, LongArray) of numeric values
  opts - optional map with:
    :max-bins - maximum M to search (default: 50)
    :min - pre-computed minimum value (avoids redundant scan)
    :max - pre-computed maximum value (avoids redundant scan)
Returns map with:
  :optimal-bins - the optimal number of bins M
  :log-posterior - the log-posterior value at optimal M
Throws:
  ex-info {:error :knuth/no-samples} for empty input
  ex-info {:error :knuth/same-values} when all values are identical
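The search described above can be sketched as binning the data for each candidate M and keeping the best score. This is an assumption-laden illustration rather than the library's code: `equal-width-counts` and `sketch-optimal-bins` are hypothetical names, the scoring function is passed in explicitly as `score-fn` (any function with the `(log-posterior n bin-counts)` shape documented in this namespace), ties are assumed to resolve to the smallest M, and the exception messages are invented.

```clojure
(defn equal-width-counts
  "Hypothetical helper: count values of data in m equal-width bins over [lo, hi].
  Values outside the range are clamped into the first or last bin."
  [data m lo hi]
  (let [width  (/ (- hi lo) (double m))
        counts (long-array m)]
    (doseq [x data]
      (let [k (-> (long (/ (- x lo) width))
                  (max 0)
                  (min (dec m)))]            ; x = hi lands in the last bin
        (aset counts k (inc (aget counts k)))))
    counts))

(defn sketch-optimal-bins
  "Sketch of the search over M in [1, max-bins].
  score-fn takes (n bin-counts) and returns a number, higher being better."
  [data score-fn {:keys [max-bins] lo-opt :min hi-opt :max :or {max-bins 50}}]
  (let [n (count data)]
    (when (zero? n)
      (throw (ex-info "No samples" {:error :knuth/no-samples})))
    ;; Use the pre-computed bounds when supplied, otherwise scan the data.
    (let [lo (double (or lo-opt (reduce min data)))
          hi (double (or hi-opt (reduce max data)))]
      (when (== lo hi)
        (throw (ex-info "All values are identical" {:error :knuth/same-values})))
      (reduce
        (fn [best m]
          (let [score (score-fn n (equal-width-counts data m lo hi))]
            ;; Strict > keeps the smallest M among equal scores (an assumption).
            (if (> score (:log-posterior best))
              {:optimal-bins m :log-posterior score}
              best)))
        {:optimal-bins nil :log-posterior Double/NEGATIVE_INFINITY}
        (range 1 (inc max-bins))))))

;; Example with a toy scoring function that peaks at three bins.
(def toy-score (fn [_n counts] (let [d (- (count counts) 3)] (- (* d d)))))
(sketch-optimal-bins (double-array [0.0 1.0 2.0 3.0]) toy-score {:max-bins 10})
```

Passing `:min` and `:max` skips the two scans over the data, which mirrors the purpose of those options in the documented signature.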