Class CFWeightedRandomlyChosen


  • @Reference(authors="Andreas Lang and Erich Schubert",
               title="BETULA: Fast Clustering of Large Data with Improved BIRCH CF-Trees",
               booktitle="Information Systems",
               url="https://doi.org/10.1016/j.is.2021.101918",
               bibkey="DBLP:journals/is/LangS22")
    public class CFWeightedRandomlyChosen
    extends AbstractCFKMeansInitialization
    Initializes k-means on clustering features by randomly choosing k existing clustering features as the initial cluster centers. Each clustering feature is weighted by the number of points it summarizes, so features that represent more points are more likely to be chosen. For regular k-means on individual points, use RandomlyChosen instead.
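    The weighting described above can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class WeightedChoiceSketch, the nested Feature type, and the method chooseWeighted are hypothetical names introduced for illustration only. The sketch also assumes that features are drawn without replacement and that all weights are positive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class WeightedChoiceSketch {
  /** Hypothetical stand-in for a clustering feature: a point count and a centroid. */
  public static final class Feature {
    final int weight;    // number of points summarized by this feature
    final double[] mean; // centroid of those points

    public Feature(int weight, double[] mean) {
      this.weight = weight;
      this.mean = mean;
    }
  }

  /**
   * Draw k distinct features, each with probability proportional to its
   * weight among the features not yet chosen, and return their centroids.
   * Assumption: sampling is without replacement and weights are positive.
   */
  public static double[][] chooseWeighted(List<Feature> cfs, int k, Random rnd) {
    if (k > cfs.size()) {
      throw new IllegalArgumentException("Need at least k clustering features.");
    }
    List<Feature> pool = new ArrayList<>(cfs); // copy, so the input list is left unchanged
    long total = 0;
    for (Feature f : pool) {
      total += f.weight;
    }
    double[][] means = new double[k][];
    for (int i = 0; i < k; i++) {
      double r = rnd.nextDouble() * total; // uniform in [0, total)
      int j = 0;
      // Walk the cumulative weights; the bound on j absorbs rounding errors.
      while (j < pool.size() - 1 && r >= pool.get(j).weight) {
        r -= pool.get(j).weight;
        j++;
      }
      Feature chosen = pool.remove(j);
      total -= chosen.weight;
      means[i] = chosen.mean.clone();
    }
    return means;
  }

  public static void main(String[] args) {
    List<Feature> cfs = new ArrayList<>();
    cfs.add(new Feature(1, new double[] { 0.0, 0.0 }));
    cfs.add(new Feature(50, new double[] { 5.0, 5.0 }));
    cfs.add(new Feature(949, new double[] { 9.0, 1.0 }));
    // A fixed seed makes the draw reproducible.
    double[][] means = chooseWeighted(cfs, 2, new Random(0L));
    for (double[] m : means) {
      System.out.println(java.util.Arrays.toString(m));
    }
  }
}
```

    With these weights, the feature summarizing 949 points is far more likely to become an initial center than the one summarizing a single point, which is the effect that distinguishes this initialization from an unweighted random choice.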

    References:

    Andreas Lang and Erich Schubert
    BETULA: Fast Clustering of Large Data with Improved BIRCH CF-Trees
    Information Systems

    Since:
    0.8.0
    Author:
    Andreas Lang
    • Constructor Detail

      • CFWeightedRandomlyChosen

        public CFWeightedRandomlyChosen(RandomFactory rf)
        Constructor.
        Parameters:
        rf - Random generator factory
    • Method Detail

      • chooseInitialMeans

        public double[][] chooseInitialMeans(CFTree<?> tree,
                                             java.util.List<? extends ClusterFeature> cfs,
                                             int k)
        Description copied from class: AbstractCFKMeansInitialization
        Build the initial models.
        Specified by:
        chooseInitialMeans in class AbstractCFKMeansInitialization
        Parameters:
        tree - CF tree
        cfs - Cluster features of the tree (may be ignored for tree-based initializations, should be an array list for efficiency)
        k - Number of clusters.
        Returns:
        initial cluster means