Class HDBSCANHierarchyExtraction
- java.lang.Object
  - elki.clustering.hierarchical.extraction.HDBSCANHierarchyExtraction
- All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<DendrogramModel>>
@Reference(authors="R. J. G. B. Campello, D. Moulavi, J. Sander", title="Density-Based Clustering Based on Hierarchical Density Estimates", booktitle="Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)", url="https://doi.org/10.1007/978-3-642-37456-2_14", bibkey="DBLP:conf/pakdd/CampelloMS13")
@Reference(authors="R. J. G. B. Campello, D. Moulavi, A. Zimek, J. Sander", title="Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection", booktitle="ACM Trans. Knowl. Discov. Data 10(1)", url="https://doi.org/10.1145/2733381", bibkey="DBLP:journals/tkdd/CampelloMZS15")
public class HDBSCANHierarchyExtraction
extends java.lang.Object
implements ClusteringAlgorithm<Clustering<DendrogramModel>>
Extraction of simplified cluster hierarchies, as proposed in HDBSCAN. This class additionally computes the GLOSH outlier scores.
In contrast to the authors' top-down approach, we use a bottom-up approach based on the more efficient pointer representation introduced in SLINK.
Because it only needs a merge hierarchy as input, it can also be used to extract a simplified hierarchy from any hierarchical agglomerative clustering.
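The GLOSH score mentioned above can be illustrated with a small, self-contained sketch. It assumes the definition from Campello et al. (2015), GLOSH(x) = 1 - eps_max(x) / eps(x), where eps(x) is the distance level at which x drops out of its cluster and eps_max(x) is the lowest distance level at which that cluster (or any of its sub-clusters) still exists. The class and parameter names below are hypothetical and are not part of ELKI; this is not the code used by this class.

```java
final class GloshSketch {
  /**
   * GLOSH-style outlier score, assuming the definition 1 - epsMax / eps.
   *
   * @param epsPoint distance level at which the point leaves its cluster (hypothetical input)
   * @param epsMaxCluster lowest distance level at which the cluster still exists (hypothetical input)
   * @return score in [0, 1) when epsMaxCluster <= epsPoint; larger means more outlying
   */
  static double glosh(double epsPoint, double epsMaxCluster) {
    // Assumption: treat a zero level (e.g. duplicate points) as a perfect inlier.
    if(epsPoint <= 0.) {
      return 0.;
    }
    return 1. - epsMaxCluster / epsPoint;
  }

  public static void main(String[] args) {
    // A point that leaves its cluster at level 2.0, while the densest part survives down to 0.5.
    System.out.println(glosh(2.0, 0.5));
    // A point in the densest part of its cluster.
    System.out.println(glosh(0.5, 0.5));
  }
}
```

Points that leave their cluster at a much larger distance level than the cluster's densest region receive scores close to 1; points in the densest region receive scores close to 0.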
Reference:
R. J. G. B. Campello, D. Moulavi, J. Sander
Density-Based Clustering Based on Hierarchical Density Estimates
Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)
https://doi.org/10.1007/978-3-642-37456-2_14

R. J. G. B. Campello, D. Moulavi, A. Zimek, J. Sander
Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection
ACM Trans. Knowl. Discov. Data 10(1)
https://doi.org/10.1145/2733381

Note: some of the code is rather complex because we delay the creation of one-element clusters to reduce garbage collection overhead.
- Since:
- 0.7.0
- Author:
- Erich Schubert
-
-
Nested Class Summary
| Modifier and Type | Class | Description |
|---|---|---|
| protected class | HDBSCANHierarchyExtraction.Instance | Instance for a single data set. |
| static class | HDBSCANHierarchyExtraction.Par | Parameterization class. |
| private static class | HDBSCANHierarchyExtraction.TempCluster | Temporary cluster. |

Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
| Modifier and Type | Field | Description |
|---|---|---|
| private HierarchicalClusteringAlgorithm | algorithm | Clustering algorithm to run to obtain the hierarchy. |
| private boolean | hierarchical | Return a hierarchical result. |
| private static Logging | LOG | Class logger. |
| private int | minClSize | Minimum cluster size. |
-
Constructor Summary
| Constructor | Description |
|---|---|
| HDBSCANHierarchyExtraction(HierarchicalClusteringAlgorithm algorithm, int minClSize, boolean hierarchical) | Constructor. |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| Clustering<DendrogramModel> | autorun(Database database) | Try to auto-run the algorithm on a database by calling a method called run, with an optional Database first, and with data relations as specified by Algorithm.getInputTypeRestriction(). |
| TypeInformation[] | getInputTypeRestriction() | Get the input type restriction used for negotiating the data query. |
| Clustering<DendrogramModel> | run(ClusterMergeHistory merges) | Process an existing result. |
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
minClSize
private int minClSize
Minimum cluster size.
-
algorithm
private HierarchicalClusteringAlgorithm algorithm
Clustering algorithm to run to obtain the hierarchy.
-
hierarchical
private boolean hierarchical
Return a hierarchical result.
-
-
Constructor Detail
-
HDBSCANHierarchyExtraction
public HDBSCANHierarchyExtraction(HierarchicalClusteringAlgorithm algorithm, int minClSize, boolean hierarchical)
Constructor.
- Parameters:
- algorithm - Algorithm to run
- minClSize - Minimum cluster size
- hierarchical - Produce a hierarchical result
-
-
Method Detail
-
autorun
public Clustering<DendrogramModel> autorun(Database database)
Description copied from interface: Algorithm
Try to auto-run the algorithm on a database by calling a method called run, with an optional Database first, and with data relations as specified by Algorithm.getInputTypeRestriction().
- Specified by:
- autorun in interface Algorithm
- Specified by:
- autorun in interface ClusteringAlgorithm<Clustering<DendrogramModel>>
- Parameters:
- database - the database to run the algorithm on
- Returns:
- the Result computed by this algorithm
-
run
public Clustering<DendrogramModel> run(ClusterMergeHistory merges)
Process an existing result.
- Parameters:
- merges - Existing result in pointer representation.
- Returns:
- Clustering
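To illustrate what bottom-up processing of an existing merge history with a minimum cluster size can look like, here is a minimal, self-contained sketch. It is not the ELKI implementation and does not use ClusterMergeHistory; the merge encoding (ids 0..n-1 are single points, merge i creates cluster id n+i, merges sorted by increasing height) and all names are assumptions made for this example. A merge is reported as a "true split" of the simplified hierarchy only when both merged sides have at least minClSize members; otherwise the smaller side is merely absorbed into the larger cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CondensedTreeSketch {
  /** Hypothetical example: points 0-2 and 3-5 form two groups, point 6 is an outlier. */
  static final int[][] EXAMPLE = {
      { 0, 1 }, // merge 0 creates id 7 (size 2)
      { 3, 4 }, // merge 1 creates id 8 (size 2)
      { 7, 2 }, // merge 2 creates id 9 (size 3)
      { 8, 5 }, // merge 3 creates id 10 (size 3)
      { 9, 10 }, // merge 4 creates id 11 (size 6)
      { 11, 6 }, // merge 5 creates id 12 (size 7)
  };

  /**
   * Scan the merges bottom-up and collect those where both sides are large enough.
   *
   * @param n number of data points
   * @param merges pairs of cluster ids, in merge order (heights omitted in this sketch)
   * @param minClSize minimum cluster size
   * @return indexes of merges that join two clusters of at least minClSize points
   */
  static List<Integer> trueSplits(int n, int[][] merges, int minClSize) {
    int[] size = new int[n + merges.length];
    Arrays.fill(size, 0, n, 1); // every data point starts as a one-element cluster
    List<Integer> splits = new ArrayList<>();
    for(int i = 0; i < merges.length; i++) {
      int a = merges[i][0], b = merges[i][1];
      size[n + i] = size[a] + size[b];
      if(size[a] >= minClSize && size[b] >= minClSize) {
        splits.add(i);
      }
    }
    return splits;
  }

  public static void main(String[] args) {
    // With minClSize = 3, only the merge of the two three-point groups counts.
    System.out.println(trueSplits(7, EXAMPLE, 3));
  }
}
```

In this example, only merge 4 joins two sufficiently large clusters, so the simplified hierarchy has a single split into two clusters, and point 6 never forms a cluster of its own.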
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface: Algorithm
Get the input type restriction used for negotiating the data query.
- Specified by:
- getInputTypeRestriction in interface Algorithm
- Returns:
- Type restriction
-
-