Class SphericalSimplifiedElkanKMeans<V extends NumberVector>

  • Type Parameters:
    V - vector datatype
    All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<KMeansModel>>, KMeans<V,KMeansModel>
    Direct Known Subclasses:

    @Reference(authors="Erich Schubert, Andreas Lang, Gloria Feher",title="Accelerating Spherical k-Means",booktitle="Int. Conf. on Similarity Search and Applications, SISAP 2021",url="",bibkey="DBLP:conf/sisap/SchubertLF21")
    @Reference(authors="Erich Schubert",title="A Triangle Inequality for Cosine Similarity",booktitle="Int. Conf. on Similarity Search and Applications, SISAP 2021",url="",bibkey="DBLP:conf/sisap/Schubert21")
    public class SphericalSimplifiedElkanKMeans<V extends NumberVector>
    extends SphericalKMeans<V>
    A spherical k-means algorithm based on the simplified variant of Elkan's k-means, which exploits the triangle inequality to prune similarity computations.

    FIXME: currently requires the vectors to be L2 normalized beforehand
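
    The normalization requirement above can be met by scaling each vector to unit Euclidean length before clustering. The following is a minimal sketch in plain Java, not part of this class or its library: the class name L2Normalize and its methods are hypothetical, and vectors are assumed to be plain double[] arrays rather than NumberVector instances.

```java
/** Hypothetical helper, for illustration only. */
public final class L2Normalize {
  private L2Normalize() {
  }

  /** Returns a copy of v scaled to unit Euclidean (L2) length. */
  public static double[] normalize(double[] v) {
    double sumSq = 0.0;
    for (double x : v) {
      sumSq += x * x;
    }
    if (sumSq == 0.0) {
      // The direction of a zero vector is undefined.
      throw new IllegalArgumentException("zero vector cannot be normalized");
    }
    double norm = Math.sqrt(sumSq);
    double[] out = new double[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = v[i] / norm;
    }
    return out;
  }

  /** Dot product; for two unit-length vectors this is their cosine similarity. */
  public static double dot(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimensionality mismatch");
    }
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    double[] u = normalize(new double[] { 3.0, 4.0 });
    System.out.println("squared length after normalization: " + dot(u, u));
  }
}
```

    Once every input vector has unit length, cosine similarity reduces to a plain dot product, which is presumably why the requirement is stated.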


    Reference:

    Erich Schubert, Andreas Lang, Gloria Feher
    Accelerating Spherical k-Means
    Int. Conf. on Similarity Search and Applications, SISAP 2021

    The underlying triangle inequality used for pruning is introduced in:

    Erich Schubert
    A Triangle Inequality for Cosine Similarity
    Int. Conf. on Similarity Search and Applications, SISAP 2021
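
    As an illustration of such a triangle inequality: because the angles between vectors satisfy the ordinary triangle inequality, the similarity sim(x,z) can be bounded from sim(x,y) and sim(y,z) alone using cos(a ± b) = cos a cos b ∓ sin a sin b. The sketch below is not taken from this class; the name CosineTriangleBounds and its methods are hypothetical, and the cited paper should be consulted for the exact form used here.

```java
/** Hypothetical helper, for illustration only. */
public final class CosineTriangleBounds {
  private CosineTriangleBounds() {
  }

  /** sqrt((1 - a^2)(1 - b^2)), clamped at 0 to guard against rounding. */
  private static double sinProduct(double a, double b) {
    return Math.sqrt(Math.max(0.0, (1.0 - a * a) * (1.0 - b * b)));
  }

  /** Lower bound for sim(x,z), given sim(x,y) and sim(y,z). */
  public static double lowerBound(double simXY, double simYZ) {
    return simXY * simYZ - sinProduct(simXY, simYZ);
  }

  /** Upper bound for sim(x,z), given sim(x,y) and sim(y,z). */
  public static double upperBound(double simXY, double simYZ) {
    return simXY * simYZ + sinProduct(simXY, simYZ);
  }

  public static void main(String[] args) {
    // Three unit vectors in the plane at angles 0, 0.3 and 1.0 radians.
    double simXY = Math.cos(0.3); // angle between x and y
    double simYZ = Math.cos(0.7); // angle between y and z
    double simXZ = Math.cos(1.0); // angle between x and z
    System.out.println(lowerBound(simXY, simYZ) + " <= " + simXZ + " <= " + upperBound(simXY, simYZ));
  }
}
```

    In a k-means setting, y would typically be a cluster center: if the upper bound on the similarity of a point to some other center does not exceed a known lower bound on the similarity to its current center, the exact similarity to that other center need not be computed.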

    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • varstat

        protected boolean varstat
        Flag whether to compute the final variance statistic.
    • Constructor Detail

      • SphericalSimplifiedElkanKMeans

        public SphericalSimplifiedElkanKMeans(int k,
                                              int maxiter,
                                              KMeansInitialization initializer,
                                              boolean varstat)
        Parameters:
        k - number of clusters
        maxiter - maximum number of iterations
        initializer - initialization method
        varstat - whether to compute the final variance statistic