josh.meanings.kmeans

K-Means clustering partitions a dataset into a specified number of disjoint, non-hierarchical clusters. It is well suited to finding globular clusters. The K-Means method is numerical, unsupervised, non-deterministic, and iterative. Once the method has converged, every member of a cluster is closer to its own cluster center than to the center of any other cluster.

The choice of initial partition can greatly affect the final clusters, in terms of inter-cluster and intra-cluster distances and cohesion. As a result, K-Means is best run multiple times to avoid being trapped in a local minimum.
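The sensitivity to the starting partition can be seen in a small, self-contained sketch. It uses plain Clojure vectors and squared Euclidean distance; every function name here (`sq-dist`, `nearest`, `lloyd-step`, `lloyd`, `objective`) is hypothetical and written for illustration, not taken from this namespace, which may represent points and centroids differently.

```clojure
(defn sq-dist
  "Squared Euclidean distance between two numeric vectors."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

(defn nearest
  "Index of the centroid closest to point."
  [centroids point]
  (first (apply min-key second
                (map-indexed (fn [i c] [i (sq-dist c point)]) centroids))))

(defn mean-point
  "Component-wise mean of a non-empty collection of points."
  [points]
  (let [n (count points)]
    (mapv #(/ % n) (apply map + points))))

(defn lloyd-step
  "One refinement: assign each point to its nearest centroid, then move
  each centroid to the mean of its points. A centroid with no assigned
  points is left where it is."
  [points centroids]
  (let [groups (group-by #(nearest centroids %) points)]
    (vec (map-indexed (fn [i c] (if-let [ps (groups i)] (mean-point ps) c))
                      centroids))))

(defn lloyd
  "Repeats lloyd-step until the centroids stop changing."
  [points init]
  (loop [cs init]
    (let [next-cs (lloyd-step points cs)]
      (if (= next-cs cs) cs (recur next-cs)))))

(defn objective
  "Sum of squared distances from each point to its nearest centroid."
  [points centroids]
  (reduce + (map #(sq-dist (nth centroids (nearest centroids %)) %) points)))

;; Four corners of a wide rectangle, clustered from two different starts.
(def points [[0 0] [0 2] [6 0] [6 2]])
(def good (lloyd points [[0 0] [6 2]]))   ; => [[0 1] [6 1]]
(def poor (lloyd points [[3 -1] [3 3]]))  ; => [[3 0] [3 2]]

(objective points good) ; => 4
(objective points poor) ; => 36
```

Both runs are stable, yet the second is nine times worse. Running from several starting points and keeping the result with the lowest objective, for example with `(apply min-key #(objective points %) [good poor])`, is one way to sidestep the poorer local minimum.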

assignments clj

(assignments centroids distance-fn points)

Returns the assignments of the points to the centroids.

calculate-objective clj

(calculate-objective s)

classify clj

(classify centroids distance-fn point)

Returns the index of the centroid that is closest to the point.
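As a rough illustration of nearest-centroid classification, the sketch below mirrors the argument order of `classify`, `distances`, and `assignments` but works on plain vectors. The `-sketch` functions and `euclidean` are hypothetical stand-ins written for this example; the library's own functions may expect different data representations.

```clojure
(defn euclidean
  "Euclidean distance between two numeric vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn distances-sketch
  "Vector of distances from each centroid to point."
  [centroids distance-fn point]
  (mapv #(distance-fn % point) centroids))

(defn classify-sketch
  "Index of the centroid with the smallest distance to point."
  [centroids distance-fn point]
  (let [ds (distances-sketch centroids distance-fn point)]
    (first (apply min-key second (map-indexed vector ds)))))

(defn assignments-sketch
  "Vector of nearest-centroid indices, one per point."
  [centroids distance-fn points]
  (mapv #(classify-sketch centroids distance-fn %) points))

(def centroids [[0 0] [10 10]])

(distances-sketch [[0 0] [6 8]] euclidean [3 4])               ; => [5.0 5.0]
(classify-sketch centroids euclidean [9 8])                    ; => 1
(assignments-sketch centroids euclidean [[1 1] [9 8] [2 0]])   ; => [0 1 0]
```

Passing the distance function as an argument keeps the classification logic independent of the metric, so swapping in another metric requires no other changes.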

column-names clj

(column-names filepath)

cost clj

(cost centroids distance-fn assignment point)

Returns the distance of an assigned point from its centroid.

costs clj

(costs centroids distance-fn assignments points)

Returns the distances of the points from their centroids.
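The relationship between `cost` and `costs` can be sketched on plain vectors as follows. The `-sketch` functions and `manhattan` are hypothetical illustrations that follow the documented argument order; they are not the library's implementation. Summing the per-point costs is one common clustering objective, though this page does not say how the library computes its own.

```clojure
(defn manhattan
  "Manhattan (L1) distance between two numeric vectors."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (if (neg? d) (- d) d))) a b)))

(defn cost-sketch
  "Distance from point to the centroid it is assigned to."
  [centroids distance-fn assignment point]
  (distance-fn (nth centroids assignment) point))

(defn costs-sketch
  "Per-point distances, pairing each assignment with its point."
  [centroids distance-fn assignments points]
  (mapv #(cost-sketch centroids distance-fn %1 %2) assignments points))

(def centroids [[0 0] [10 10]])
(def points [[1 1] [9 8] [2 0]])
(def assigned [0 1 0])

(costs-sketch centroids manhattan assigned points)              ; => [2 3 2]
(reduce + (costs-sketch centroids manhattan assigned points))   ; => 7
```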

dataset-assignments clj

(dataset-assignments centroids distance-fn cols points)

Updates the assignments dataset with the new assignments.

dataset-assignments-seq clj

(dataset-assignments-seq centroids distance-fn cols points-seq)

Updates a sequence of assignment datasets with the new assignments.

dataset-costs clj

(dataset-costs centroids distance-fn cols points)

Returns the distances of the points from their centroids.

dataset-costs-seq clj

(dataset-costs-seq centroids distance-fn cols points-seq)

Returns the distances of the points from their centroids.

default-chain-length clj

default-distance-key clj

default-format clj

default-init clj

default-options clj

default-run-count clj

distances clj

(distances centroids distance-fn point)

Returns a vector of the distances from each centroid to the point.

estimate-size clj

(estimate-size filepath)

Estimates the number of rows in the dataset at filepath.

initialize-centroids! clj

(initialize-centroids! s)

Calls initialize-centroids and writes the returned dataset to the centroids file.

initialize-k-means-state clj

(initialize-k-means-state points-file k options)

Sets the initial configuration options for the k-means calculation.

k-means clj multimethod

k-means-seq clj

(k-means-seq dataset k & options)

Returns a lazy sequence of m ClusterResult.

k-means-via-file clj

(k-means-via-file points-filepath k & options)

max-index clj

(max-index coll)

Returns the index of the maximum value in a collection.

min-index clj

(min-index coll)

Returns the index of the minimum value in a collection.
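A minimal sketch of index-of-extreme helpers, assuming plain Clojure collections of numbers. `index-of-best`, `min-index-sketch`, and `max-index-sketch` are hypothetical names for this example. The sketch returns the earliest index on ties and `nil` for an empty collection; the library's behavior in those cases is not documented here and may differ.

```clojure
(defn index-of-best
  "Index of the first element that no other element beats under better?,
  or nil when coll is empty."
  [better? coll]
  (when (seq coll)
    (first
     (reduce (fn [[_ best-v :as best] [i v]]
               ;; Strict comparison keeps the earliest index on ties.
               (if (better? v best-v) [i v] best))
             (map-indexed vector coll)))))

(defn min-index-sketch [coll] (index-of-best < coll))
(defn max-index-sketch [coll] (index-of-best > coll))

(min-index-sketch [3 1 2 1]) ; => 1
(max-index-sketch [3 1 5 5]) ; => 2
(min-index-sketch [])        ; => nil
```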

recalculate-means clj

(recalculate-means s)

regenerate-assignments! clj

(regenerate-assignments! s)

Writes the new assignments to the assignments file.

stabilized? clj

(stabilized? centroids-1 centroids-2)

K-means is said to be stabilized when performing an iterative refinement (often called a Lloyd iteration) does not result in any shifting of points between clusters. A stabilized k-means calculation can be stopped, because further refinement won't produce any changes.
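One way to picture the stopping rule is the sketch below, which assumes centroids are plain vectors that can be compared with `=`. If a refinement leaves the centroids unchanged, the nearest-centroid assignments cannot change either. `stabilized-sketch?` and `first-stable` are hypothetical names, and `step` stands for any function that performs one refinement; the library's own check may compare a different representation.

```clojure
(defn stabilized-sketch?
  "True when two consecutive sets of centroids are identical."
  [centroids-1 centroids-2]
  (= centroids-1 centroids-2))

(defn first-stable
  "Lazily iterates step from init and returns the first centroids that
  step leaves unchanged. Does not terminate if step never stabilizes."
  [step init]
  (->> (iterate step init)
       (partition 2 1)
       (filter (fn [[before after]] (stabilized-sketch? before after)))
       ffirst))

;; Toy refinement for demonstration only: halve every coordinate,
;; which stops changing once all coordinates reach zero.
(defn toy-step [centroids]
  (mapv (fn [c] (mapv #(quot % 2) c)) centroids))

(first-stable toy-step [[8 4]]) ; => [[0 0]]
```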

sum clj

(sum coll)

Returns the sum of the numbers in the sequence.

update-centroids clj

(update-centroids s)

validate-options clj

(validate-options options)

Validates the given options map, ensuring that all required options are present and valid.
