Class PAMMEDSIL<O>

  • Type Parameters:
    O - Input data type
    All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<MedoidModel>>, KMedoidsClustering<O>
    Direct Known Subclasses:
    FastMSC

    @Reference(authors="M. Van der Laan, K. Pollard, J. Bryan",title="A new partitioning around medoids algorithm",booktitle="Journal of Statistical Computation and Simulation 73(8)",url="https://doi.org/10.1080/0094965031000136012",bibkey="doi:10.1080/0094965031000136012") @Reference(authors="Lars Lenssen and Erich Schubert",title="Clustering by Direct Optimization of the Medoid Silhouette",booktitle="Int. Conf. on Similarity Search and Applications, SISAP 2022",url="https://doi.org/10.1007/978-3-031-17849-8_15",bibkey="DBLP:conf/sisap/LenssenS22")
    public class PAMMEDSIL<O>
    extends PAMSIL<O>
    Clustering to optimize the Medoid Silhouette coefficient with a PAM-based swap heuristic. This is less expensive than using the full Silhouette, as the medoid Silhouette is only in O(kn) as opposed to O(n²) for the full Silhouette. Each iteration considers (n-k)k swaps, and hence the overall complexity per iteration hence is O((n-k)k(n-k)k), i.e., roughly O(n²k²).

    Reference:

    M. Van der Laan, K. Pollard, J. Bryan
    A new partitioning around medoids algorithm
    Journal of Statistical Computation and Simulation 73(8)

    This already incorporates some optimizations from:

    Lars Lenssen and Erich Schubert
    Clustering by Direct Optimization of the Medoid Silhouette
    Int. Conf. on Similarity Search and Applications, SISAP 2022

    Since:
    0.8.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
    • Constructor Detail

      • PAMMEDSIL

        public PAMMEDSIL​(Distance<? super O> distance,
                         int k,
                         int maxiter,
                         KMedoidsInitialization<O> initializer)
        Constructor.
        Parameters:
        distance - distance function
        k - k parameter
        maxiter - Maxiter parameter
        initializer - Function to generate the initial means
    • Method Detail

      • run

        public Clustering<MedoidModel> run​(Relation<O> relation,
                                           int k,
                                           DistanceQuery<? super O> distQ)
        Description copied from interface: KMedoidsClustering
        Run k-medoids clustering with a given distance query.
        Not a very elegant API, but needed for some types of nested k-medoids.
        Specified by:
        run in interface KMedoidsClustering<O>
        Overrides:
        run in class PAMSIL<O>
        Parameters:
        relation - relation to use
        k - Number of clusters
        distQ - Distance query to use
        Returns:
        result
      • getLogger

        protected Logging getLogger()
        Description copied from class: PAM
        Get the static class logger.
        Overrides:
        getLogger in class PAMSIL<O>