Class ParallelGeneralizedDBSCAN
- java.lang.Object
-
- elki.clustering.dbscan.parallel.ParallelGeneralizedDBSCAN
-
- All Implemented Interfaces:
Algorithm
,ClusteringAlgorithm<Clustering<Model>>
@Reference(prefix="closely related", authors="M. Patwary, D. Palsetia, A. Agrawal, W. K. Liao, F. Manne, A. Choudhary", title="A new scalable parallel DBSCAN algorithm using the disjoint-set data structure", booktitle="IEEE Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC)", url="https://doi.org/10.1109/SC.2012.9", bibkey="DBLP:conf/sc/PatwaryPALMC12") public class ParallelGeneralizedDBSCAN extends java.lang.Object implements ClusteringAlgorithm<Clustering<Model>>
Parallel version of DBSCAN clustering.This is the archetype of a non-linear shared-memory DBSCAN that does not sequentially expand a cluster, but processes points in arbitrary order and merges clusters when neighboring core points occur.
Because of synchronization when labeling points, the speedup will only be sublinear in the number of cores. But in particular without an index and on large data, the majority of the work is finding the neighbors; not in labeling the points.
Reference:
Please cite the latest ELKI version.
Related is the following publication, whose "disjoint set data structure" appears to be a similar union-find approach to ours, and whose DSDBSCAN appears rather similar. The main benefit of our approach is that we avoid using the union-find data structure for every object, but only use it for merging clusters.
M. Patwary, D. Palsetia, A. Agrawal, W. K. Liao, F. Manne, A. Choudhary
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure
In IEEE Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC)- Since:
- 0.7.5
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
ParallelGeneralizedDBSCAN.Instance<T>
Instance for a particular data set.static class
ParallelGeneralizedDBSCAN.Par
Parameterization class-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields Modifier and Type Field Description protected boolean
coremodel
Track which objects are "core" objects.protected CorePredicate<?>
corepred
The core predicate factory.private static Logging
LOG
Get a logger for this algorithmprotected NeighborPredicate<?>
npred
The neighborhood predicate factory.
-
Constructor Summary
Constructors Constructor Description ParallelGeneralizedDBSCAN(NeighborPredicate<?> npred, CorePredicate<?> corepred, boolean coremodel)
Constructor for parameterized algorithm.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Clustering<Model>
autorun(Database database)
Try to auto-run the algorithm on a database by calling a method calledrun
, with an optionalDatabase
first, and with data relations as specified byAlgorithm.getInputTypeRestriction()
.TypeInformation[]
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Get a logger for this algorithm
-
npred
protected NeighborPredicate<?> npred
The neighborhood predicate factory.
-
corepred
protected CorePredicate<?> corepred
The core predicate factory.
-
coremodel
protected boolean coremodel
Track which objects are "core" objects.
-
-
Constructor Detail
-
ParallelGeneralizedDBSCAN
public ParallelGeneralizedDBSCAN(NeighborPredicate<?> npred, CorePredicate<?> corepred, boolean coremodel)
Constructor for parameterized algorithm.- Parameters:
npred
- Neighbor predicate.corepred
- Core point predicate.coremodel
- Keep track of core points.
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface:Algorithm
Get the input type restriction used for negotiating the data query.- Specified by:
getInputTypeRestriction
in interfaceAlgorithm
- Returns:
- Type restriction
-
autorun
public Clustering<Model> autorun(Database database)
Description copied from interface:Algorithm
Try to auto-run the algorithm on a database by calling a method calledrun
, with an optionalDatabase
first, and with data relations as specified byAlgorithm.getInputTypeRestriction()
.- Specified by:
autorun
in interfaceAlgorithm
- Specified by:
autorun
in interfaceClusteringAlgorithm<Clustering<Model>>
- Parameters:
database
- the database to run the algorithm on- Returns:
- the Result computed by this algorithm
-
-