Class MacQueenKMeans<V extends NumberVector>

  • Type Parameters:
    V - vector type to use
    All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<KMeansModel>>, KMeans<V,KMeansModel>

    @Title("k-Means (MacQueen Algorithm)")
    @Reference(authors="J. MacQueen",
               title="Some Methods for Classification and Analysis of Multivariate Observations",
               booktitle="5th Berkeley Symp. Math. Statist. Prob.",
               url="http://projecteuclid.org/euclid.bsmsp/1200512992",
               bibkey="conf/bsmsp/MacQueen67")
    public class MacQueenKMeans<V extends NumberVector>
    extends AbstractKMeans<V,KMeansModel>
    The original k-means algorithm, using MacQueen-style incremental updates: each mean is updated as soon as a point is assigned to it, which makes this effectively an "online" (streaming) algorithm.

    By default, this implementation iterates over the data set until convergence. MacQueen likely intended only a single pass over the data, but result quality improves with multiple passes.
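
    The incremental update can be illustrated with a minimal, self-contained sketch. It is not this class's actual code: the class name MacQueenSketch and its run method are hypothetical, the first k points stand in for a KMeansInitialization, squared Euclidean distance stands in for the distance parameter, and treating maxiter <= 0 as "no limit" is an assumption of the sketch.

```java
import java.util.Arrays;

/** Hypothetical sketch of MacQueen-style k-means; not the ELKI implementation. */
final class MacQueenSketch {
  /** Returns the k means after at most maxiter passes (maxiter <= 0: until convergence). */
  static double[][] run(double[][] data, int k, int maxiter) {
    int dim = data[0].length;
    // Assumption: seed the means with the first k points.
    double[][] means = new double[k][];
    for (int c = 0; c < k; c++) {
      means[c] = data[c].clone();
    }
    int[] size = new int[k];
    int[] assignment = new int[data.length];
    Arrays.fill(assignment, -1); // -1: not yet assigned

    for (int iter = 0; maxiter <= 0 || iter < maxiter; iter++) {
      boolean changed = false;
      for (int i = 0; i < data.length; i++) {
        int best = nearest(means, data[i]);
        int old = assignment[i];
        if (best == old) {
          continue;
        }
        if (old >= 0) {
          // Remove the point from its old mean immediately.
          size[old]--;
          if (size[old] > 0) {
            for (int d = 0; d < dim; d++) {
              means[old][d] -= (data[i][d] - means[old][d]) / size[old];
            }
          }
        }
        // Add the point to its new mean immediately.
        size[best]++;
        for (int d = 0; d < dim; d++) {
          means[best][d] += (data[i][d] - means[best][d]) / size[best];
        }
        assignment[i] = best;
        changed = true;
      }
      if (!changed) {
        break; // converged: a full pass without reassignments
      }
    }
    return means;
  }

  /** Index of the mean with the smallest squared Euclidean distance to x. */
  static int nearest(double[][] means, double[] x) {
    int best = 0;
    double bestDist = Double.POSITIVE_INFINITY;
    for (int c = 0; c < means.length; c++) {
      double dist = 0;
      for (int d = 0; d < x.length; d++) {
        double diff = x[d] - means[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[][] data = { { 0 }, { 1 }, { 10 }, { 11 } };
    System.out.println("single pass: " + Arrays.deepToString(run(data, 2, 1)));
    System.out.println("converged:   " + Arrays.deepToString(run(data, 2, 0)));
  }
}
```

    On the small one-dimensional data set in main, a single pass leaves the point 1 in the cluster of 10 and 11, while iterating until convergence moves it to the cluster of 0, which illustrates why multiple passes can improve the result.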

    Reference:

    J. MacQueen
    Some Methods for Classification and Analysis of Multivariate Observations
    5th Berkeley Symp. Math. Statist. Prob.

    Since:
    0.1
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
    • Constructor Detail

      • MacQueenKMeans

        public MacQueenKMeans(NumberVectorDistance<? super V> distance,
                              int k,
                              int maxiter,
                              KMeansInitialization initializer)
        Constructor.
        Parameters:
        distance - distance function
        k - number of clusters
        maxiter - maximum number of iterations
        initializer - initialization method for the initial means