active-analytics.clustering.k-means


choose-initial-centroids clj

(choose-initial-centroids data k)

Takes `k` random (distinct) elements from `data` to use as initial centroids.
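The selection described above can be sketched with core functions only. `sample-initial-centroids` is a hypothetical stand-in, not the library's source, and it assumes that "distinct" means distinct by value.

```clojure
;; Hypothetical sketch, not the library's implementation.
(defn sample-initial-centroids
  "Returns k distinct elements of data, chosen at random."
  [data k]
  ;; Drop duplicate values first, then shuffle and keep the first k.
  (take k (shuffle (vec (distinct data)))))

(def sampled-centroids
  (sample-initial-centroids [[0 0] [1 1] [5 5] [1 1] [9 9]] 2))
```

If `data` holds fewer than `k` distinct values, this sketch simply returns fewer than `k` centroids.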


cluster clj

(cluster data centroids distance-fn)

Builds clusters by assigning each point to its nearest centroid.
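A minimal sketch of this assignment step, using plain Clojure vectors as points. The name `cluster-sketch`, the `euclidean` helper, and the map-shaped return value are assumptions made for illustration; the docstring does not say how the library represents clusters.

```clojure
;; Hypothetical helper: Euclidean distance between two plain vectors.
(defn euclidean
  [a b]
  (Math/sqrt (double (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))))

;; Hypothetical sketch, not the library's implementation.
(defn cluster-sketch
  "Groups every point under the centroid nearest to it.
  Returns a map from centroid to the vector of its points."
  [data centroids distance-fn]
  (group-by (fn [p] (apply min-key #(distance-fn p %) centroids)) data))

(cluster-sketch [[0.0 0.0] [1.0 0.0] [9.0 9.0] [10.0 10.0]]
                [[0.0 0.0] [10.0 10.0]]
                euclidean)
;; => {[0.0 0.0] [[0.0 0.0] [1.0 0.0]], [10.0 10.0] [[9.0 9.0] [10.0 10.0]]}
```

In this sketch a centroid that attracts no points has no entry in the result.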


default-centroid-fn clj

(default-centroid-fn points)

Calculates the centroid of `points`. For now, `points` are assumed to be Neanderthal vectors.
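For data that is not stored in Neanderthal vectors, a centroid function for plain Clojure vectors might look like the sketch below. It assumes the centroid is the component-wise arithmetic mean, which is the usual choice for k-means but is not spelled out in the docstring. `mean-centroid` is a hypothetical name.

```clojure
;; Hypothetical sketch for plain Clojure vectors, not the library's default.
(defn mean-centroid
  "Returns the component-wise mean of a non-empty seq of equal-length vectors."
  [points]
  (let [n (count points)]
    ;; Sum the vectors component by component, then divide each sum by n.
    (mapv #(/ % n) (apply mapv + points))))

(mean-centroid [[0.0 0.0] [2.0 4.0] [4.0 2.0]])
;; => [2.0 2.0]
```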


k-means clj

(k-means data
         distance-fn
         max-number-of-iterations
         &
         {:keys [k threshold initial-centroids centroid-fn]})

Performs k-means clustering on `data`, given the desired number of clusters `k`, a function `distance-fn` that computes the distance between two data points, and `max-number-of-iterations`, the number of iterations after which the algorithm stops.

Takes keyword arguments `:k`, `:threshold`, `:initial-centroids`, and `:centroid-fn`.

  • `:k` is the target number of clusters. If no `:k` is passed, `k-means` assumes `:initial-centroids` is present and uses its count as `:k`.
  • `:threshold` is the convergence tolerance. After each step, the distance between each new centroid and its predecessor is calculated, and the algorithm stops once that distance is less than `:threshold` for every centroid. If no `:threshold` is given, all `max-number-of-iterations` iterations are run.
  • `:initial-centroids` is the starting point for the iterations. If none are passed, the algorithm uses `:k` random data points.
  • `:centroid-fn` takes a seq of points (the cluster) and returns a new point, the corresponding centroid. If no `:centroid-fn` is given, `data` is assumed to be a seq of Neanderthal vectors (for now).
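The interplay of the iteration limit, the threshold, and the keyword arguments can be sketched as a self-contained loop over plain Clojure vectors. Everything here is an assumption for illustration: the names `euclidean`, `mean-centroid`, `step-sketch`, and `k-means-sketch` are hypothetical, the sketch returns the final centroids (the docstring does not state the return value), and a centroid with an empty cluster is left where it is.

```clojure
;; Hypothetical helpers for plain vectors; the library itself defaults to
;; Neanderthal vectors.
(defn euclidean
  [a b]
  (Math/sqrt (double (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))))

(defn mean-centroid
  [points]
  (let [n (count points)]
    (mapv #(/ % n) (apply mapv + points))))

(defn step-sketch
  "Assigns points to their nearest centroid, then recomputes each centroid."
  [data centroids distance-fn centroid-fn]
  (let [clusters (group-by (fn [p] (apply min-key #(distance-fn p %) centroids))
                           data)]
    ;; Assumption: a centroid whose cluster is empty stays unchanged.
    (mapv (fn [c] (if-let [pts (get clusters c)] (centroid-fn pts) c))
          centroids)))

(defn k-means-sketch
  "Iterates step-sketch until every centroid moves less than threshold,
  or until max-iterations steps have been taken. Returns the centroids."
  [data distance-fn max-iterations
   & {:keys [k threshold initial-centroids centroid-fn]
      :or {centroid-fn mean-centroid}}]
  (let [k     (or k (count initial-centroids))
        start (or initial-centroids
                  (take k (shuffle (vec (distinct data)))))]
    (loop [centroids (vec start)
           i         0]
      (if (>= i max-iterations)
        centroids
        (let [next-centroids (step-sketch data centroids distance-fn centroid-fn)
              ;; Without a threshold this is always falsy, so the loop runs
              ;; for the full number of iterations.
              converged?     (and threshold
                                  (every? #(< % threshold)
                                          (map distance-fn centroids next-centroids)))]
          (if converged?
            next-centroids
            (recur next-centroids (inc i))))))))

(k-means-sketch [[0.0 0.0] [1.0 0.0] [9.0 9.0] [10.0 10.0]]
                euclidean
                10
                :initial-centroids [[0.0 0.0] [10.0 10.0]]
                :threshold 1e-9)
;; => [[0.5 0.0] [9.5 9.5]]
```

Here `:k` is omitted, so it is derived from the two initial centroids, and the loop stops on the second step because neither centroid moves.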

nearest-centroid clj

(nearest-centroid p centroids distance-fn)

Finds the centroid closest to a point `p`.
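Because the distance function is passed in, any function of two points that returns a number can decide what "closest" means. A minimal sketch with a squared-distance function follows; both names are hypothetical and this is not the library's source.

```clojure
;; Hypothetical distance function: squared Euclidean distance.
(defn squared-distance
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

;; Hypothetical sketch, not the library's implementation.
(defn nearest-centroid-sketch
  "Returns the centroid with the smallest distance to p."
  [p centroids distance-fn]
  (apply min-key #(distance-fn p %) centroids))

(nearest-centroid-sketch [2 3] [[0 0] [3 3] [10 10]] squared-distance)
;; => [3 3]
```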


step clj

(step data centroids distance-fn centroid-fn)

Calculates the next generation of centroids from the current one.

