Package elki.clustering.dbscan
Class GriDBSCAN.Instance<V extends NumberVector>
- java.lang.Object
-
- elki.clustering.dbscan.GriDBSCAN.Instance<V>
-
- Type Parameters:
V- Vector type
- Enclosing class:
- GriDBSCAN<V extends NumberVector>
protected static class GriDBSCAN.Instance<V extends NumberVector> extends java.lang.ObjectInstance, for a single run.- Author:
- Erich Schubert
-
-
Field Summary
Fields Modifier and Type Field Description private Border[]bordersBorder identifier objects (shared to conserve memory).protected int[]cellsNumber of cells per dimension.private WritableDataStore<Assignment>clusteridsCluster assignments.private Core[]coresCore identifier objects (shared to conserve memory).protected intdimDimensionality.protected Distance<? super V>distanceDistance function used.protected double[][]domainValue domain.protected doubleepsilonHolds the epsilon radius threshold.(package private) it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<ModifiableDBIDs>gridData grid partitioning.protected doublegridwidthWidth of the grid cells.protected intminptsHolds the minimum cluster size.protected static intNOISENoise IDs.protected double[]offsetGrid offset.private booleanoverflownIndicates that the number of grid cells has overflown.private WritableIntegerDataStoretemporaryTemporary assignments of a single run.protected static intUNPROCESSEDUnprocessed IDs.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected voidbuildGrid(Relation<V> relation, int numcells, double[] offset)Build the data grid.protected Clustering<Model>buildResult(DBIDs ids, int clusterid)Assemble the clustering result.protected intcheckGridCellSizes(int size, long numcell)Perform some sanity checks on the grid cells.private longcomputeGridBaseOffsets(int size)Compute the grid base offset.protected intexpandCluster(DBIDRef seed, int clusterid, WritableIntegerDataStore clusterids, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, RangeSearcher<DBIDRef> rq, FiniteProgress pprog)Set-based expand cluster implementation.private voidinsertIntoGrid(DBIDRef id, V obj, int d, int v)Insert a single object into the grid; potentially into multiple cells (at most 2^d) via recursion.protected voidmergeClusterInformation(ModifiableDBIDs cellids, WritableIntegerDataStore temporary, WritableDataStore<Assignment> clusterids)Merge cluster information.protected intprocessCorePoint(DBIDRef seed, DoubleDBIDList newneighbors, int clusterid, WritableIntegerDataStore clusterids, ArrayModifiableDBIDs activeSet)Process a single core point.Clustering<Model>run(Relation<V> relation)Performs the DBSCAN algorithm on the given database.private intrunDBSCANOnCell(DBIDs cellids, Relation<V> relation, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, int clusterid)private voidupdateCoreBorderObjects(int clusterid)Update the shared arrays for core points (to conserve memory)
-
-
-
Field Detail
-
UNPROCESSED
protected static final int UNPROCESSED
Unprocessed IDs.- See Also:
- Constant Field Values
-
NOISE
protected static final int NOISE
Noise IDs.- See Also:
- Constant Field Values
-
distance
protected Distance<? super V extends NumberVector> distance
Distance function used.
-
epsilon
protected double epsilon
Holds the epsilon radius threshold.
-
minpts
protected int minpts
Holds the minimum cluster size.
-
gridwidth
protected double gridwidth
Width of the grid cells. Must be at least 2 epsilon!
-
domain
protected double[][] domain
Value domain.
-
dim
protected int dim
Dimensionality.
-
offset
protected double[] offset
Grid offset.
-
cells
protected int[] cells
Number of cells per dimension.
-
grid
it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap<ModifiableDBIDs> grid
Data grid partitioning.
-
cores
private Core[] cores
Core identifier objects (shared to conserve memory).
-
borders
private Border[] borders
Border identifier objects (shared to conserve memory).
-
clusterids
private WritableDataStore<Assignment> clusterids
Cluster assignments.
-
temporary
private WritableIntegerDataStore temporary
Temporary assignments of a single run.
-
overflown
private boolean overflown
Indicates that the number of grid cells has overflown.
-
-
Method Detail
-
run
public Clustering<Model> run(Relation<V> relation)
Performs the DBSCAN algorithm on the given database.- Parameters:
relation- Relation to process
-
runDBSCANOnCell
private int runDBSCANOnCell(DBIDs cellids, Relation<V> relation, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, int clusterid)
-
updateCoreBorderObjects
private void updateCoreBorderObjects(int clusterid)
Update the shared arrays for core points (to conserve memory)- Parameters:
clusterid- Number of clusters
-
computeGridBaseOffsets
private long computeGridBaseOffsets(int size)
Compute the grid base offset.- Parameters:
size- Data set size- Returns:
- Total number of grid cells
-
buildGrid
protected void buildGrid(Relation<V> relation, int numcells, double[] offset)
Build the data grid.- Parameters:
relation- Data relationnumcells- Total number of cellsoffset- Offset
-
insertIntoGrid
private void insertIntoGrid(DBIDRef id, V obj, int d, int v)
Insert a single object into the grid; potentially into multiple cells (at most 2^d) via recursion.- Parameters:
id- Object IDobj- Objectd- Current dimensionv- Current cell value
-
checkGridCellSizes
protected int checkGridCellSizes(int size, long numcell)Perform some sanity checks on the grid cells.- Parameters:
numcell- Number of cellssize- Relation size- Returns:
- Number of cells with minPts points
-
expandCluster
protected int expandCluster(DBIDRef seed, int clusterid, WritableIntegerDataStore clusterids, ModifiableDoubleDBIDList neighbors, ArrayModifiableDBIDs activeSet, RangeSearcher<DBIDRef> rq, FiniteProgress pprog)
Set-based expand cluster implementation.- Parameters:
clusterid- ID of the current cluster.clusterids- Current object to cluster mapping.neighbors- Neighbors acquired by initial getNeighbors call.activeSet- Set to manage active candidates.rq- Range querypprog- Object progress- Returns:
- cluster size
-
processCorePoint
protected int processCorePoint(DBIDRef seed, DoubleDBIDList newneighbors, int clusterid, WritableIntegerDataStore clusterids, ArrayModifiableDBIDs activeSet)
Process a single core point.- Parameters:
seed- Point to processnewneighbors- New neighborsclusterid- Cluster to add toclusterids- Cluster assignment storage.activeSet- Active set of cluster seeds- Returns:
- Number of new points added to cluster
-
mergeClusterInformation
protected void mergeClusterInformation(ModifiableDBIDs cellids, WritableIntegerDataStore temporary, WritableDataStore<Assignment> clusterids)
Merge cluster information.- Parameters:
cellids- IDs in current celltemporary- Temporary assignmentsclusterids- Merged cluster assignment
-
buildResult
protected Clustering<Model> buildResult(DBIDs ids, int clusterid)
Assemble the clustering result.- Parameters:
ids- Object IDsclusterid- Largest valid cluster number- Returns:
- Clustering
-
-