(add-default-chain-length conf)
Monte Carlo methods need a chain length to use when sampling. Callers can pass in a chain length, but doing so carries some risk: if the chain length is too low, the result won't necessarily approximate k-means++. Meanwhile, if the chain length is too high, there is no point in sampling at all, since we could just run k-means++ rather than approximating it.
This function checks whether a chain length is set. If one is set, it does nothing; if none is set, it uses the formulas provided in the k-means++ approximation papers to determine a reasonable chain length.
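The "only fill in a default when nothing is set" behavior described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the `:chain-length` key, the `with-default-chain-length` name, and the `default-fn` argument are all hypothetical, and the paper-derived formula is deliberately left to the caller-supplied function rather than guessed at here.

```clojure
;; Hypothetical sketch: the :chain-length key and default-fn are assumptions
;; for illustration, not the library's actual config shape or formula.
(defn with-default-chain-length
  "Returns conf unchanged when a chain length is already set; otherwise
  associates the value computed by (default-fn conf)."
  [conf default-fn]
  (if (some? (:chain-length conf))
    conf
    (assoc conf :chain-length (default-fn conf))))

;; A caller-supplied chain length is left untouched.
(with-default-chain-length {:k 5 :chain-length 200} (constantly 100))
;; => {:k 5, :chain-length 200}

;; A missing chain length is filled in from the default function.
(with-default-chain-length {:k 5} (constantly 100))
;; => {:k 5, :chain-length 100}
```

Passing the default as a function keeps the sketch independent of any particular formula, which would presumably depend on values such as the number of clusters and the size of the data.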
(centroids->dataset s results)
Converts a vector of points to a dataset.
(chain-length-warnings config)
Analyzes the chain length and emits warnings if necessary.
(shortest-distance-* configuration)
Denotes the shortest distance from a data point to a center. The distance function to use is determined by the k-means configuration.
(shortest-distance-squared-* configuration centroids)
Denotes the squared shortest distance from a data point to a center. Useful for computing a D^2 sampling distribution.
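To make the D^2 sampling distribution concrete: each point is weighted by its squared shortest distance to the current centers, divided by the sum of those squared distances over all points. The sketch below is an assumption-laden illustration rather than the library's code. It hard-codes squared Euclidean distance (the library chooses the distance from its configuration), works on plain vectors instead of datasets, and every function name in it is hypothetical.

```clojure
;; Hypothetical helpers for illustration only; not part of the library's API.
(defn squared-euclidean
  "Squared Euclidean distance between two equal-length numeric vectors."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

(defn shortest-distance-squared
  "Squared distance from point to its nearest centroid."
  [centroids point]
  (apply min (map #(squared-euclidean point %) centroids)))

(defn d2-distribution
  "Probability of sampling each point under D^2 weighting.
  Assumes at least one point does not coincide with a centroid,
  otherwise the total is zero and the division fails."
  [centroids points]
  (let [d2    (mapv #(shortest-distance-squared centroids %) points)
        total (reduce + d2)]
    (mapv #(/ % total) d2)))

;; Squared distances are 0, 1 and 4, so the weights are 0, 1/5 and 4/5.
(d2-distribution [[0 0]] [[0 0] [1 0] [0 2]])
;; => [0 1/5 4/5]
```

Points far from every existing center receive most of the probability mass, which is what makes this weighting useful for choosing well-spread initial centers.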
(vector->dataset data col-names)
Converts a vector of points to a dataset, using col-names as the column names.