de.lmu.ifi.dbs.elki.index.preprocessed.fastoptics

## Class RandomProjectedNeighborsAndDensities<V extends NumberVector>

• java.lang.Object
• de.lmu.ifi.dbs.elki.index.preprocessed.fastoptics.RandomProjectedNeighborsAndDensities<V>

• @Reference(authors="J. Schneider, M. Vlachos",
title="Fast parameterless density-based clustering via random projections",
booktitle="Proc. 22nd ACM Int. Conf. on Information & Knowledge Management (CIKM 2013)",
url="https://doi.org/10.1145/2505515.2505590",
bibkey="DBLP:conf/cikm/SchneiderV13")
public class RandomProjectedNeighborsAndDensities<V extends NumberVector>
extends java.lang.Object
Random Projections used for computing neighbors and density estimates.

This index is specialized for the algorithm FastOPTICS

Reference:

J. Schneider, M. Vlachos
Fast parameterless density-based clustering via random projections
Proc. 22nd ACM Int. Conf. on Information and Knowledge Management (CIKM 2013)

This is based on the original code provided by Johannes Schneider, with ELKIfications and optimizations by Erich Schubert.

TODO: implement one of the Index APIs?

Since:
0.7.0
Author:
Johannes Schneider, Erich Schubert
• ### Nested Class Summary

Nested Classes
Modifier and Type Class and Description
static class  RandomProjectedNeighborsAndDensities.Parameterizer
Parameterization class.
• ### Field Summary

Fields
Modifier and Type Field and Description
(package private) long distanceComputations
Count the number of distance computations.
private static Logging LOG
Class logger.
private static int logOProjectionConst
Default constant used to compute number of projections as well as number of splits of point set, ie. constant *log N*d
(package private) int minSplitSize
minimum size for which a point set is further partitioned (roughly corresponds to minPts in OPTICS)
(package private) Relation<V> points
entire point set
private static java.lang.String PREFIX
Statistics logging prefix.
(package private) DoubleDataStore[] projectedPoints
all projected points
(package private) RandomFactory rnd
Random factory.
private static float sizeTolerance
Sets used for neighborhood computation should be about minSplitSize Sets are still used if they deviate by less (1+/- sizeTolerance)
(package private) java.util.ArrayList<ArrayDBIDs> splitsets
sets that resulted from recursive split of entire point set
• ### Constructor Summary

Constructors
Constructor and Description
RandomProjectedNeighborsAndDensities(RandomFactory rnd)
Constructor.
• ### Method Summary

All Methods
Modifier and Type Method and Description
DoubleDataStore computeAverageDistInSet()
Compute for each point a density estimate as inverse of average distance to a point in a projected set
void computeSetsBounds(Relation<V> points, int minSplitSize, DBIDs ptList)
Create random projections, project points and put points into sets of size about minSplitSize/2
DataStore<? extends DBIDs> getNeighs()
Compute list of neighbors for each point from sets resulting from projection
void logStatistics()
Log some statistics.
int splitByDistance(ArrayModifiableDBIDs ind, int begin, int end, DoubleDataStore tpro, java.util.Random rand)
Split the data set by distances.
int splitRandomly(ArrayModifiableDBIDs ind, int begin, int end, DoubleDataStore tpro, java.util.Random rand)
Split the data set randomly.
void splitupNoSort(ArrayModifiableDBIDs ind, int begin, int end, int dim, java.util.Random rand)
Recursively splits entire point set until the set is below a threshold
• ### Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• ### Field Detail

• #### LOG

private static final Logging LOG
Class logger.
• #### PREFIX

private static final java.lang.String PREFIX
Statistics logging prefix.
• #### logOProjectionConst

private static final int logOProjectionConst
Default constant used to compute number of projections as well as number of splits of point set, ie. constant *log N*d
Constant Field Values
• #### sizeTolerance

private static final float sizeTolerance
Sets used for neighborhood computation should be about minSplitSize Sets are still used if they deviate by less (1+/- sizeTolerance)
Constant Field Values
• #### minSplitSize

int minSplitSize
minimum size for which a point set is further partitioned (roughly corresponds to minPts in OPTICS)
• #### points

Relation<V extends NumberVector> points
entire point set
• #### splitsets

java.util.ArrayList<ArrayDBIDs> splitsets
sets that resulted from recursive split of entire point set
• #### projectedPoints

DoubleDataStore[] projectedPoints
all projected points
• #### rnd

RandomFactory rnd
Random factory.
• #### distanceComputations

long distanceComputations
Count the number of distance computations.
• ### Constructor Detail

• #### RandomProjectedNeighborsAndDensities

public RandomProjectedNeighborsAndDensities(RandomFactory rnd)
Constructor.
Parameters:
rnd - Random factory.
• ### Method Detail

• #### computeSetsBounds

public void computeSetsBounds(Relation<V> points,
int minSplitSize,
DBIDs ptList)
Create random projections, project points and put points into sets of size about minSplitSize/2
Parameters:
points - points to process
minSplitSize - minimum size for which a point set is further partitioned (roughly corresponds to minPts in OPTICS)
ptList - Points that are to be projected
• #### splitupNoSort

public void splitupNoSort(ArrayModifiableDBIDs ind,
int begin,
int end,
int dim,
java.util.Random rand)
Recursively splits entire point set until the set is below a threshold
Parameters:
ind - points that are in the current set
begin - Interval begin in the ind array
end - Interval end in the ind array
dim - depth of projection (how many times point set has been split already)
rand - Random generator
• #### splitRandomly

public int splitRandomly(ArrayModifiableDBIDs ind,
int begin,
int end,
DoubleDataStore tpro,
java.util.Random rand)
Split the data set randomly.
Parameters:
ind - Object index
begin - Interval begin
end - Interval end
tpro - Projection
rand - Random generator
Returns:
Splitting point
• #### splitByDistance

public int splitByDistance(ArrayModifiableDBIDs ind,
int begin,
int end,
DoubleDataStore tpro,
java.util.Random rand)
Split the data set by distances.
Parameters:
ind - Object index
begin - Interval begin
end - Interval end
tpro - Projection
rand - Random generator
Returns:
Splitting point
• #### getNeighs

public DataStore<? extends DBIDs> getNeighs()
Compute list of neighbors for each point from sets resulting from projection
Returns:
list of neighbors for each point
• #### computeAverageDistInSet

public DoubleDataStore computeAverageDistInSet()
Compute for each point a density estimate as inverse of average distance to a point in a projected set
Returns:
for each point average distance to point in a set
• #### logStatistics

public void logStatistics()
Log some statistics.