Class SLINKHDBSCANLinearMemory<O>
- java.lang.Object
-
- elki.clustering.hierarchical.AbstractHDBSCAN<O>
-
- elki.clustering.hierarchical.SLINKHDBSCANLinearMemory<O>
-
- All Implemented Interfaces:
Algorithm
,HierarchicalClusteringAlgorithm
@Reference(authors="R. J. G. B. Campello, D. Moulavi, J. Sander", title="Density-Based Clustering Based on Hierarchical Density Estimates", booktitle="Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)", url="https://doi.org/10.1007/978-3-642-37456-2_14", bibkey="DBLP:conf/pakdd/CampelloMS13") public class SLINKHDBSCANLinearMemory<O> extends AbstractHDBSCAN<O> implements HierarchicalClusteringAlgorithm
Linear memory implementation of HDBSCAN clustering based on SLINK.By not building a distance matrix, we can reduce memory usage to linear memory only; but at the cost of roughly double the runtime (unless using indexes) as we first need to compute all kNN distances (for core sizes), then recompute distances when building the spanning tree.
This version uses the SLINK algorithm to directly produce the pointer representation expected by the extraction methods. The SLINK algorithm is closely related to Prim's minimum spanning tree, but produces the more compact pointer representation instead of an edges list.
This implementation does not include the cluster extraction discussed as Step 4. This functionality should however already be provided by
HDBSCANHierarchyExtraction
. For this reason, we also do not include self-edges.Reference:
R. J. G. B. Campello, D. Moulavi, J. Sander
Density-Based Clustering Based on Hierarchical Density Estimates
Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)- Since:
- 0.7.0
- Author:
- Erich Schubert
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class elki.clustering.hierarchical.AbstractHDBSCAN
AbstractHDBSCAN.HDBSCANAdapter, AbstractHDBSCAN.HeapMSTCollector
-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields Modifier and Type Field Description private static Logging
LOG
Class logger.-
Fields inherited from class elki.clustering.hierarchical.AbstractHDBSCAN
distance, minPts
-
-
Constructor Summary
Constructors Constructor Description SLINKHDBSCANLinearMemory(Distance<? super O> distance, int minPts)
Constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description TypeInformation[]
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.protected Logging
getLogger()
Get the (STATIC) logger for this class.ClusterMergeHistory
run(Relation<O> relation)
Run the algorithmprivate void
step2(DBIDRef id, DBIDs processedIDs, DistanceQuery<? super O> distQuery, DoubleDataStore coredists, WritableDoubleDataStore m)
Second step: Determine the pairwise distances from all objects in the pointer representation to the new object with the specified id.private void
step3(DBIDRef id, WritableDBIDDataStore pi, WritableDoubleDataStore lambda, DBIDs processedIDs, WritableDoubleDataStore m)
Third step: Determine the values for P and Lprivate void
step4(DBIDRef id, WritableDBIDDataStore pi, WritableDoubleDataStore lambda, DBIDs processedIDs)
Fourth step: Actualize the clusters if necessary-
Methods inherited from class elki.clustering.hierarchical.AbstractHDBSCAN
computeCoreDists, convertToMergeList
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.hierarchical.HierarchicalClusteringAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface:Algorithm
Get the input type restriction used for negotiating the data query.- Specified by:
getInputTypeRestriction
in interfaceAlgorithm
- Overrides:
getInputTypeRestriction
in classAbstractHDBSCAN<O>
- Returns:
- Type restriction
-
run
public ClusterMergeHistory run(Relation<O> relation)
Run the algorithm- Parameters:
relation
- Relation- Returns:
- Clustering hierarchy
-
step2
private void step2(DBIDRef id, DBIDs processedIDs, DistanceQuery<? super O> distQuery, DoubleDataStore coredists, WritableDoubleDataStore m)
Second step: Determine the pairwise distances from all objects in the pointer representation to the new object with the specified id.- Parameters:
id
- the id of the object to be inserted into the pointer representationprocessedIDs
- the already processed idsdistQuery
- Distance querym
- Data store
-
step3
private void step3(DBIDRef id, WritableDBIDDataStore pi, WritableDoubleDataStore lambda, DBIDs processedIDs, WritableDoubleDataStore m)
Third step: Determine the values for P and L- Parameters:
id
- the id of the object to be inserted into the pointer representationpi
- Pi data storelambda
- Lambda data storeprocessedIDs
- the already processed idsm
- Data store
-
step4
private void step4(DBIDRef id, WritableDBIDDataStore pi, WritableDoubleDataStore lambda, DBIDs processedIDs)
Fourth step: Actualize the clusters if necessary- Parameters:
id
- the id of the current objectpi
- Pi data storelambda
- Lambda data storeprocessedIDs
- the already processed ids
-
getLogger
protected Logging getLogger()
Description copied from class:AbstractHDBSCAN
Get the (STATIC) logger for this class.- Specified by:
getLogger
in classAbstractHDBSCAN<O>
- Returns:
- the static logger
-
-