Class SampleKMeans<V extends NumberVector>

  • Type Parameters:
    V - Vector type
    All Implemented Interfaces:
    KMeansInitialization

    @Reference(authors="P. S. Bradley, U. M. Fayyad",
               title="Refining Initial Points for K-Means Clustering",
               booktitle="Proc. 15th Int. Conf. on Machine Learning (ICML 1998)",
               bibkey="DBLP:conf/icml/BradleyF98")
    public class SampleKMeans<V extends NumberVector>
    extends AbstractKMeansInitialization
    Initialize k-means by running k-means on a sample of the data set only.

    Reference:

    The idea of finding centers on a sample can be found in:

    P. S. Bradley, U. M. Fayyad
    Refining Initial Points for K-Means Clustering
    Proc. 15th Int. Conf. on Machine Learning (ICML 1998)

    Bradley and Fayyad additionally suggest repeating this procedure multiple times; this implementation makes only a single attempt.
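
    The idea described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the ELKI implementation: the class SampleInitSketch and its helpers lloyd and nearest are hypothetical names, rate is assumed to be a fraction in (0, 1], squared Euclidean distance stands in for the distance argument, and the inner run is simply seeded with the first k sampled points.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of sample-based k-means initialization (not ELKI code). */
final class SampleInitSketch {
  /** Index of the mean closest to p (squared Euclidean distance, an assumption). */
  static int nearest(double[][] means, double[] p) {
    int best = 0;
    double bestDist = Double.POSITIVE_INFINITY;
    for (int c = 0; c < means.length; c++) {
      double dist = 0;
      for (int d = 0; d < p.length; d++) {
        double diff = p[d] - means[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  /** Plain Lloyd iterations, seeded with the first k points of pts. */
  static double[][] lloyd(List<double[]> pts, int k, int maxIter) {
    int dim = pts.get(0).length;
    double[][] means = new double[k][];
    for (int i = 0; i < k; i++) {
      means[i] = pts.get(i).clone();
    }
    for (int it = 0; it < maxIter; it++) {
      double[][] sums = new double[k][dim];
      int[] counts = new int[k];
      for (double[] p : pts) {
        int c = nearest(means, p);
        counts[c]++;
        for (int d = 0; d < dim; d++) {
          sums[c][d] += p[d];
        }
      }
      boolean changed = false;
      for (int c = 0; c < k; c++) {
        if (counts[c] == 0) {
          continue; // empty cluster: keep its previous mean
        }
        for (int d = 0; d < dim; d++) {
          double m = sums[c][d] / counts[c];
          changed |= m != means[c][d];
          means[c][d] = m;
        }
      }
      if (!changed) {
        break; // converged
      }
    }
    return means;
  }

  /** Draw a random sample, run k-means on it once, return the resulting means. */
  static double[][] chooseInitialMeans(List<double[]> data, int k, double rate, Random rnd) {
    if (k < 1 || k > data.size()) {
      throw new IllegalArgumentException("need 1 <= k <= data.size()");
    }
    List<double[]> shuffled = new ArrayList<>(data);
    Collections.shuffle(shuffled, rnd);
    // Assumption: rate is a fraction of the data set; use at least k points.
    int n = Math.min(data.size(), Math.max(k, (int) Math.ceil(rate * data.size())));
    return lloyd(shuffled.subList(0, n), k, 100);
  }

  public static void main(String[] args) {
    List<double[]> data = new ArrayList<>();
    for (int i = 0; i < 20; i++) {
      data.add(new double[] { i * 0.05, 0.0 });
      data.add(new double[] { 10 + i * 0.05, 10.0 });
    }
    for (double[] m : chooseInitialMeans(data, 2, 0.5, new Random(42))) {
      System.out.println(m[0] + ", " + m[1]);
    }
  }
}
```

    The returned means would then serve as the starting centers for a full k-means run on the complete data set. Because only a single sample is drawn, the result depends on the random seed.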

    Since:
    0.6.0
    Author:
    Erich Schubert
    • Field Detail

      • innerkMeans

        private KMeans<V extends NumberVector,?> innerkMeans
        Variant of kMeans to use for initialization.
      • rate

        private double rate
        Sampling rate that determines the sample size.
    • Constructor Detail

      • SampleKMeans

        public SampleKMeans(RandomFactory rnd,
                            KMeans<V,?> innerkMeans,
                            double rate)
        Constructor.
        Parameters:
        rnd - Random generator.
        innerkMeans - Inner k-means algorithm.
        rate - Sampling rate.
    • Method Detail

      • chooseInitialMeans

        public double[][] chooseInitialMeans(Relation<? extends NumberVector> relation,
                                             int k,
                                             NumberVectorDistance<?> distance)
        Description copied from interface: KMeansInitialization
        Choose the initial means.
        Parameters:
        relation - Relation containing the data vectors
        k - Number of means to choose
        distance - Distance function
        Returns:
        Chosen initial means for k-means, one double[] per mean