Class GMeans<V extends NumberVector, M extends MeanModel>

  • Type Parameters:
    V - Vector type
    M - Model type
    All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<M>>, KMeans<V, M>

    @Title("G-means")
    @Reference(authors="G. Hamerly and C. Elkan",
               title="Learning the k in k-means",
               booktitle="Neural Information Processing Systems",
               url="https://www.researchgate.net/publication/2869155_Learning_the_K_in_K-Means",
               bibkey="DBLP:conf/nips/HamerlyE03")
    public class GMeans<V extends NumberVector, M extends MeanModel>
    extends XMeans<V, M>
    G-means extends k-means and estimates the number of centers using the Anderson-Darling test.
    Implemented as a specialization of XMeans.
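
The normality check mentioned above can be sketched as follows. This is a minimal illustration based on the formulation in the cited paper, not this class's actual code: AndersonDarlingSketch and its methods are hypothetical names, the normal CDF uses a standard erf approximation, and probabilities are clamped to keep the logarithms finite.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of the documented API.
final class AndersonDarlingSketch {
  /** Standard normal CDF using the Abramowitz-Stegun erf approximation (7.1.26). */
  public static double normalCdf(double z) {
    double x = Math.abs(z) / Math.sqrt(2.0);
    double t = 1.0 / (1.0 + 0.3275911 * x);
    double poly = t * (0.254829592 + t * (-0.284496736
        + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    double erf = 1.0 - poly * Math.exp(-x * x);
    return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
  }

  /** Clamp a probability away from 0 and 1 (a simplification for this sketch). */
  private static double clamp(double p) {
    return Math.min(Math.max(p, 1e-12), 1.0 - 1e-12);
  }

  /** Anderson-Darling statistic with the correction for estimated mean and variance. */
  public static double adjustedStatistic(double[] sample) {
    int n = sample.length;
    if (n < 2) {
      throw new IllegalArgumentException("need at least two values");
    }
    double mean = 0;
    for (double v : sample) {
      mean += v;
    }
    mean /= n;
    double var = 0;
    for (double v : sample) {
      var += (v - mean) * (v - mean);
    }
    var /= (n - 1);
    if (!(var > 0)) {
      throw new IllegalArgumentException("sample has zero variance");
    }
    // Standardize to zero mean and unit variance, then sort ascending.
    double sd = Math.sqrt(var);
    double[] z = new double[n];
    for (int i = 0; i < n; i++) {
      z[i] = (sample[i] - mean) / sd;
    }
    Arrays.sort(z);
    // A^2 = -n - (1/n) * sum_{i=1..n} (2i-1) [ln F(z_i) + ln(1 - F(z_{n+1-i}))]
    double sum = 0;
    for (int i = 0; i < n; i++) {
      double lower = clamp(normalCdf(z[i]));
      double upper = clamp(normalCdf(z[n - 1 - i]));
      sum += (2 * i + 1) * (Math.log(lower) + Math.log(1.0 - upper));
    }
    double a2 = -n - sum / n;
    // Small-sample correction: A*^2 = A^2 (1 + 4/n - 25/n^2)
    return a2 * (1.0 + 4.0 / n - 25.0 / ((double) n * n));
  }

  public static void main(String[] args) {
    Random rnd = new Random(0L);
    double[] sample = new double[200];
    for (int i = 0; i < sample.length; i++) {
      sample[i] = rnd.nextGaussian();
    }
    System.out.println("adjusted A^2 = " + adjustedStatistic(sample));
  }
}
```

Under this reading, a cluster would be kept while the statistic stays below the configured critical value and split once it exceeds it.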

    Reference:

    G. Hamerly and C. Elkan
    Learning the k in k-means
    Neural Information Processing Systems

    Since:
    0.8.0
    Author:
    Robert Gehde
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • critical

        protected double critical
        Critical value for the Anderson-Darling test.
    • Constructor Detail

      • GMeans

        public GMeans(NumberVectorDistance<? super V> distance,
                      double critical,
                      int k_min,
                      int k_max,
                      int maxiter,
                      KMeans<V, M> innerKMeans,
                      KMeansInitialization initializer,
                      RandomFactory random)
        Constructor.
        Parameters:
        distance - Distance function
        critical - Critical value for the Anderson-Darling test
        k_min - Minimum number of clusters
        k_max - Maximum number of clusters
        maxiter - Maximum number of iterations
        innerKMeans - Nested k-means algorithm
        initializer - Initialization method
        random - Random generator
    • Method Detail

      • splitCluster

        protected java.util.List<Cluster<M>> splitCluster(Cluster<M> parentCluster,
                                                          Relation<V> relation)
        Description copied from class: XMeans
        Conditionally splits the clusters based on the information criterion.
        Overrides:
        splitCluster in class XMeans<V extends NumberVector, M extends MeanModel>
        Parameters:
        parentCluster - Cluster to split
        relation - Data relation
        Returns:
        The parent cluster when the split decreases clustering quality, or the child clusters when the split improves the clustering.
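
For G-means, the cited paper decides on a split by projecting the cluster's points onto the vector between the two candidate child centers and testing the projected values for normality. The sketch below shows only that projection and the final comparison. GMeansProjectionSketch, project and shouldSplit are hypothetical names, and the override in this class may differ in detail.

```java
// Hypothetical helper, not part of the documented API.
final class GMeansProjectionSketch {
  /** Project each point onto v = c1 - c2, scaled by the squared length of v. */
  public static double[] project(double[][] points, double[] c1, double[] c2) {
    int dim = c1.length;
    double[] v = new double[dim];
    double norm2 = 0;
    for (int j = 0; j < dim; j++) {
      v[j] = c1[j] - c2[j];
      norm2 += v[j] * v[j];
    }
    if (!(norm2 > 0)) {
      throw new IllegalArgumentException("child centers coincide");
    }
    double[] projected = new double[points.length];
    for (int i = 0; i < points.length; i++) {
      double dot = 0;
      for (int j = 0; j < dim; j++) {
        dot += points[i][j] * v[j];
      }
      projected[i] = dot / norm2;
    }
    return projected;
  }

  /** Accept the split only when the normality statistic exceeds the critical value. */
  public static boolean shouldSplit(double statistic, double critical) {
    return statistic > critical;
  }

  public static void main(String[] args) {
    double[][] points = { { 0, 0 }, { 2, 0 }, { 4, 0 } };
    double[] projected = project(points, new double[] { 4, 0 }, new double[] { 0, 0 });
    System.out.println(java.util.Arrays.toString(projected));
  }
}
```

The projected values would then be standardized and passed to a normality test, and the resulting statistic compared against the critical value.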
      • splitCentroid

        protected double[][] splitCentroid(Cluster<? extends MeanModel> parentCluster,
                                           Relation<V> relation)
        Description copied from class: XMeans
        Split an existing centroid into two initial centers.
        Overrides:
        splitCentroid in class XMeans<V extends NumberVector, M extends MeanModel>
        Parameters:
        parentCluster - Existing cluster
        relation - Data relation
        Returns:
        Array of the new centroids
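
One simple way to turn a centroid into two initial centers is to offset it in opposite directions, c + m and c - m. The sketch below assumes the offset vector m is supplied by the caller (the cited paper suggests a small random vector or one along the principal component). CentroidSplitSketch is a hypothetical name, and this is not necessarily how splitCentroid is implemented here.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the documented API.
final class CentroidSplitSketch {
  /** Return the two centers centroid + offset and centroid - offset. */
  public static double[][] split(double[] centroid, double[] offset) {
    if (centroid.length != offset.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    int dim = centroid.length;
    double[][] centers = new double[2][dim];
    for (int j = 0; j < dim; j++) {
      centers[0][j] = centroid[j] + offset[j];
      centers[1][j] = centroid[j] - offset[j];
    }
    return centers;
  }

  public static void main(String[] args) {
    double[][] centers = split(new double[] { 1.0, 2.0 }, new double[] { 0.5, -1.0 });
    System.out.println(Arrays.deepToString(centers));
  }
}
```

The result has the same double[][] shape as the documented return value, with one row per new center.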