Class RandomProjectedNeighborsAndDensities


  • @Reference(authors="J. Schneider, M. Vlachos",
               title="Fast parameterless density-based clustering via random projections",
               booktitle="Proc. 22nd ACM Int. Conf. on Information & Knowledge Management (CIKM 2013)",
               url="https://doi.org/10.1145/2505515.2505590",
               bibkey="DBLP:conf/cikm/SchneiderV13")
    public class RandomProjectedNeighborsAndDensities
    extends java.lang.Object
    Random Projections used for computing neighbors and density estimates.

    This index is specialized for the algorithm FastOPTICS

    Reference:

    J. Schneider, M. Vlachos
    Fast parameterless density-based clustering via random projections
    Proc. 22nd ACM Int. Conf. on Information and Knowledge Management (CIKM 2013)

    This is based on the original code provided by Johannes Schneider, with ELKIfications and optimizations by Erich Schubert.

    Since:
    0.7.0
    Author:
    Johannes Schneider, Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • PREFIX

        private static final java.lang.String PREFIX
        Statistics logging prefix.
      • logOProjectionConst

        private static final int logOProjectionConst
        Default constant used to compute number of projections as well as number of splits of point set, ie. constant *log N*d
        See Also:
        Constant Field Values
      • sizeTolerance

        private static final float sizeTolerance
        Sets used for neighborhood computation should be about minSplitSize Sets are still used if they deviate by less (1+/- sizeTolerance)
        See Also:
        Constant Field Values
      • minSplitSize

        int minSplitSize
        minimum size for which a point set is further partitioned (roughly corresponds to minPts in OPTICS)
      • splitsets

        java.util.ArrayList<ArrayDBIDs> splitsets
        sets that resulted from recursive split of entire point set
      • distanceComputations

        long distanceComputations
        Count the number of distance computations.
    • Constructor Detail

      • RandomProjectedNeighborsAndDensities

        public RandomProjectedNeighborsAndDensities​(RandomFactory rnd)
        Constructor.
        Parameters:
        rnd - Random factory.
    • Method Detail

      • computeSetsBounds

        public void computeSetsBounds​(Relation<? extends NumberVector> points,
                                      int minSplitSize,
                                      DBIDs ptList)
        Create random projections, project points and put points into sets of size about minSplitSize/2
        Parameters:
        points - points to process
        minSplitSize - minimum size for which a point set is further partitioned (roughly corresponds to minPts in OPTICS)
        ptList - Points that are to be projected
      • splitupNoSort

        public void splitupNoSort​(ArrayModifiableDBIDs ind,
                                  int begin,
                                  int end,
                                  int dim,
                                  java.util.Random rand)
        Recursively splits entire point set until the set is below a threshold
        Parameters:
        ind - points that are in the current set
        begin - Interval begin in the ind array
        end - Interval end in the ind array
        dim - depth of projection (how many times point set has been split already)
        rand - Random generator
      • splitRandomly

        public int splitRandomly​(ArrayModifiableDBIDs ind,
                                 int begin,
                                 int end,
                                 DoubleDataStore tpro,
                                 java.util.Random rand)
        Split the data set randomly.
        Parameters:
        ind - Object index
        begin - Interval begin
        end - Interval end
        tpro - Projection
        rand - Random generator
        Returns:
        Splitting point
      • splitByDistance

        public int splitByDistance​(ArrayModifiableDBIDs ind,
                                   int begin,
                                   int end,
                                   DoubleDataStore tpro,
                                   java.util.Random rand)
        Split the data set by distances.
        Parameters:
        ind - Object index
        begin - Interval begin
        end - Interval end
        tpro - Projection
        rand - Random generator
        Returns:
        Splitting point
      • getNeighs

        public DataStore<DBIDs> getNeighs()
        Compute list of neighbors for each point from sets resulting from projection
        Returns:
        list of neighbors for each point
      • computeAverageDistInSet

        public DoubleDataStore computeAverageDistInSet()
        Compute for each point a density estimate as inverse of average distance to a point in a projected set
        Returns:
        for each point average distance to point in a set
      • logStatistics

        public void logStatistics()
        Log some statistics.