Package elki.clustering.dbscan
Class GriDBSCAN.Instance<V extends NumberVector>
- java.lang.Object
-
- elki.clustering.dbscan.GriDBSCAN.Instance<V>
-
- Type Parameters:
V
- Vector type
- Enclosing class:
- GriDBSCAN<V extends NumberVector>
protected static class GriDBSCAN.Instance<V extends NumberVector> extends java.lang.Object
Instance, for a single run.- Author:
- Erich Schubert
-
-
Field Summary
Fields Modifier and Type Field Description private Border[]
borders
Border identifier objects (shared to conserve memory).protected int[]
cells
Number of cells per dimension.private WritableDataStore<Assignment>
clusterids
Cluster assignments.private Core[]
cores
Core identifier objects (shared to conserve memory).protected int
dim
Dimensionality.protected Distance<? super V>
distance
Distance function used.protected double[][]
domain
Value domain.protected double
epsilon
Holds the epsilon radius threshold.(package private) it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<ModifiableDBIDs>
grid
Data grid partitioning.protected double
gridwidth
Width of the grid cells.protected int
minpts
Holds the minimum cluster size.protected static int
NOISE
Noise IDs.protected double[]
offset
Grid offset.private boolean
overflown
Indicates that the number of grid cells has overflown.private WritableIntegerDataStore
temporary
Temporary assignments of a single run.protected static int
UNPROCESSED
Unprocessed IDs.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected void
buildGrid(Relation<V> relation, int numcells, double[] offset)
Build the data grid.protected Clustering<Model>
buildResult(DBIDs ids, int clusterid)
Assemble the clustering result.protected int
checkGridCellSizes(int size, long numcell)
Perform some sanity checks on the grid cells.private long
computeGridBaseOffsets(int size)
Compute the grid base offset.protected int
expandCluster(DBIDRef seed, int clusterid, WritableIntegerDataStore clusterids, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, RangeSearcher<DBIDRef> rq, FiniteProgress pprog)
Set-based expand cluster implementation.private void
insertIntoGrid(DBIDRef id, V obj, int d, int v)
Insert a single object into the grid; potentially into multiple cells (at most 2^d) via recursion.protected void
mergeClusterInformation(ModifiableDBIDs cellids, WritableIntegerDataStore temporary, WritableDataStore<Assignment> clusterids)
Merge cluster information.protected int
processCorePoint(DBIDRef seed, DoubleDBIDList newneighbors, int clusterid, WritableIntegerDataStore clusterids, ArrayModifiableDBIDs activeSet)
Process a single core point.Clustering<Model>
run(Relation<V> relation)
Performs the DBSCAN algorithm on the given database.private int
runDBSCANOnCell(DBIDs cellids, Relation<V> relation, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, int clusterid)
private void
updateCoreBorderObjects(int clusterid)
Update the shared arrays for core points (to conserve memory)
-
-
-
Field Detail
-
UNPROCESSED
protected static final int UNPROCESSED
Unprocessed IDs.- See Also:
- Constant Field Values
-
NOISE
protected static final int NOISE
Noise IDs.- See Also:
- Constant Field Values
-
distance
protected Distance<? super V extends NumberVector> distance
Distance function used.
-
epsilon
protected double epsilon
Holds the epsilon radius threshold.
-
minpts
protected int minpts
Holds the minimum cluster size.
-
gridwidth
protected double gridwidth
Width of the grid cells. Must be at least 2 epsilon!
-
domain
protected double[][] domain
Value domain.
-
dim
protected int dim
Dimensionality.
-
offset
protected double[] offset
Grid offset.
-
cells
protected int[] cells
Number of cells per dimension.
-
grid
it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<ModifiableDBIDs> grid
Data grid partitioning.
-
cores
private Core[] cores
Core identifier objects (shared to conserve memory).
-
borders
private Border[] borders
Border identifier objects (shared to conserve memory).
-
clusterids
private WritableDataStore<Assignment> clusterids
Cluster assignments.
-
temporary
private WritableIntegerDataStore temporary
Temporary assignments of a single run.
-
overflown
private boolean overflown
Indicates that the number of grid cells has overflown.
-
-
Method Detail
-
run
public Clustering<Model> run(Relation<V> relation)
Performs the DBSCAN algorithm on the given database.- Parameters:
relation
- Relation to process
-
runDBSCANOnCell
private int runDBSCANOnCell(DBIDs cellids, Relation<V> relation, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, int clusterid)
-
updateCoreBorderObjects
private void updateCoreBorderObjects(int clusterid)
Update the shared arrays for core points (to conserve memory)- Parameters:
clusterid
- Number of clusters
-
computeGridBaseOffsets
private long computeGridBaseOffsets(int size)
Compute the grid base offset.- Parameters:
size
- Data set size- Returns:
- Total number of grid cells
-
buildGrid
protected void buildGrid(Relation<V> relation, int numcells, double[] offset)
Build the data grid.- Parameters:
relation
- Data relationnumcells
- Total number of cellsoffset
- Offset
-
insertIntoGrid
private void insertIntoGrid(DBIDRef id, V obj, int d, int v)
Insert a single object into the grid; potentially into multiple cells (at most 2^d) via recursion.- Parameters:
id
- Object IDobj
- Objectd
- Current dimensionv
- Current cell value
-
checkGridCellSizes
protected int checkGridCellSizes(int size, long numcell)
Perform some sanity checks on the grid cells.- Parameters:
numcell
- Number of cellssize
- Relation size- Returns:
- Number of cells with minPts points
-
expandCluster
protected int expandCluster(DBIDRef seed, int clusterid, WritableIntegerDataStore clusterids, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, RangeSearcher<DBIDRef> rq, FiniteProgress pprog)
Set-based expand cluster implementation.- Parameters:
clusterid
- ID of the current cluster.clusterids
- Current object to cluster mapping.neighbors
- Neighbors acquired by initial getNeighbors call.activeSet
- Set to manage active candidates.rq
- Range querypprog
- Object progress- Returns:
- cluster size
-
processCorePoint
protected int processCorePoint(DBIDRef seed, DoubleDBIDList newneighbors, int clusterid, WritableIntegerDataStore clusterids, ArrayModifiableDBIDs activeSet)
Process a single core point.- Parameters:
seed
- Point to processnewneighbors
- New neighborsclusterid
- Cluster to add toclusterids
- Cluster assignment storage.activeSet
- Active set of cluster seeds- Returns:
- Number of new points added to cluster
-
mergeClusterInformation
protected void mergeClusterInformation(ModifiableDBIDs cellids, WritableIntegerDataStore temporary, WritableDataStore<Assignment> clusterids)
Merge cluster information.- Parameters:
cellids
- IDs in current celltemporary
- Temporary assignmentsclusterids
- Merged cluster assignment
-
buildResult
protected Clustering<Model> buildResult(DBIDs ids, int clusterid)
Assemble the clustering result.- Parameters:
ids
- Object IDsclusterid
- Largest valid cluster number- Returns:
- Clustering
-
-