Class HDBSCANHierarchyExtraction

  • All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<DendrogramModel>>

    @Reference(authors="R. J. G. B. Campello, D. Moulavi, J. Sander",
               title="Density-Based Clustering Based on Hierarchical Density Estimates",
               booktitle="Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)",
               url="https://doi.org/10.1007/978-3-642-37456-2_14",
               bibkey="DBLP:conf/pakdd/CampelloMS13")
    @Reference(authors="R. J. G. B. Campello, D. Moulavi, A. Zimek, J. Sander",
               title="Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection",
               booktitle="ACM Trans. Knowl. Discov. Data 10(1)",
               url="https://doi.org/10.1145/2733381",
               bibkey="DBLP:journals/tkdd/CampelloMZS15")
    public class HDBSCANHierarchyExtraction
    extends java.lang.Object
    implements ClusteringAlgorithm<Clustering<DendrogramModel>>

    Extraction of simplified cluster hierarchies, as proposed in HDBSCAN. This class additionally computes the GLOSH outlier scores.

    In contrast to the authors' top-down approach, we use a bottom-up approach based on the more efficient pointer representation introduced in SLINK.

    In particular, it can also be used to extract a hierarchy from a hierarchical agglomerative clustering.
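
    The bottom-up idea can be illustrated with a minimal sketch. It is not the code of this class: the class name PointerHierarchySketch, the method countSplits and the parameter minClSize are hypothetical names chosen for illustration. The sketch assumes a SLINK-style pointer representation in which pi[i] is the object that i is merged into and lambda[i] is the merge distance, with the last object pointing to itself. Merges are replayed in order of increasing distance, and a merge is counted as a cluster split of the simplified hierarchy only when both sides contain at least minClSize objects.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only; all names here are hypothetical, not part of ELKI.
class PointerHierarchySketch {
  /** Union-find lookup with path halving. */
  static int find(int[] root, int i) {
    while (root[i] != i) {
      root[i] = root[root[i]];
      i = root[i];
    }
    return i;
  }

  /**
   * Replay the merges of a pointer representation bottom-up and count the
   * merges where both sides have at least minClSize members. Read top-down,
   * each such merge is a split into two clusters in the simplified hierarchy.
   */
  static int countSplits(int[] pi, double[] lambda, int minClSize) {
    final int n = pi.length;
    // Process objects by increasing merge distance (the sort is stable).
    Integer[] order = new Integer[n];
    for (int i = 0; i < n; i++) {
      order[i] = i;
    }
    Arrays.sort(order, Comparator.comparingDouble(i -> lambda[i]));

    int[] root = new int[n];
    int[] size = new int[n];
    for (int i = 0; i < n; i++) {
      root[i] = i; // every object starts as its own component
      size[i] = 1;
    }

    int splits = 0;
    for (int i : order) {
      if (pi[i] == i) {
        continue; // the last object points to itself and is never merged
      }
      int a = find(root, i), b = find(root, pi[i]);
      if (a == b) {
        continue; // defensive: already in the same component
      }
      if (size[a] >= minClSize && size[b] >= minClSize) {
        splits++; // both sides are large enough to be clusters
      }
      // Otherwise the smaller side is simply absorbed into the other one.
      root[a] = b;
      size[b] += size[a];
    }
    return splits;
  }
}
```

    For six objects forming two groups of three, for example pi = {2, 2, 5, 5, 5, 5} with lambda = {1, 1, 8, 1, 1, infinity}, only the final merge at distance 8 joins two components of size three. Whether it counts as a split therefore depends on the chosen minimum cluster size.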

    Reference:

    R. J. G. B. Campello, D. Moulavi, J. Sander
    Density-Based Clustering Based on Hierarchical Density Estimates
    Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)

    R. J. G. B. Campello, D. Moulavi, A. Zimek, J. Sander
    Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection
    ACM Trans. Knowl. Discov. Data 10(1)

    Note: some of the code is rather complex because we delay the creation of one-element clusters to reduce garbage collection overhead.

    Since:
    0.7.0
    Author:
    Erich Schubert