Class VPTree<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    DistancePriorityIndex<O>, Index, KNNIndex<O>, RangeIndex<O>

    @Reference(authors="P. N. Yianilos",
               title="Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces",
               booktitle="Proc. ACM/SIGACT-SIAM Symposium on Discrete Algorithms",
               url="http://dl.acm.org/citation.cfm?id=313559.313789",
               bibkey="DBLP:conf/soda/Yianilos93")
    public class VPTree<O>
    extends java.lang.Object
    implements DistancePriorityIndex<O>
    Vantage Point Tree with no additional information

    In a Vantage Point Tree the data is split at the median distance to a vantage point at every node.

    The distance function in the original paper was bounded to have a limited tau size. In this class the function is not bounded, as we can just limit tau to Double.MAX_VALUE

    Reference:

    P. N. Yianilos
    Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces
    Proc. ACM/SIGACT-SIAM Symposium on Discrete Algorithms

    Since:
    0.8.0
    Author:
    Robert Gehde, Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • relation

        protected final Relation<O> relation
        The representation we are bound to.
      • distFunc

        Distance<? super O> distFunc
        Distance Function to use
      • distQuery

        private DistanceQuery<O> distQuery
        Actual distance query on the Data
      • random

        RandomFactory random
        Random factory for selecting vantage points
      • sampleSize

        int sampleSize
        Sample size for selecting vantage points
      • truncate

        int truncate
        Truncation parameter
      • distComputations

        long distComputations
        Counter for distance computations.
    • Constructor Detail

      • VPTree

        public VPTree​(Relation<O> relation,
                      Distance<? super O> distance,
                      int leafsize)
        Constructor with default values, used by EmpiricalQueryOptimizer
        Parameters:
        relation - data for tree construction
        distance - distance function for tree construction
        leafsize - Leaf size and sample size (simpler parameterization)
      • VPTree

        public VPTree​(Relation<O> relation,
                      Distance<? super O> distance,
                      RandomFactory random,
                      int sampleSize,
                      int truncate)
        Constructor.
        Parameters:
        relation - data for tree construction
        distance - distance function for tree construction
        random - Random generator for sampling
        sampleSize - Sample size for finding the vantage point
        truncate - Leaf size threshold
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • distance

        private double distance​(DBIDRef a,
                                DBIDRef b)
        Compute a distance, and count.
        Parameters:
        a - First object
        b - Second object
        Returns:
        Distance
      • distance

        private double distance​(O a,
                                DBIDRef b)
        Compute a distance, and count.
        Parameters:
        a - First object
        b - Second object
        Returns:
        Distance
      • kNNByObject

        public KNNSearcher<O> kNNByObject​(DistanceQuery<O> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByObject in interface DistancePriorityIndex<O>
        Specified by:
        kNNByObject in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • kNNByDBID

        public KNNSearcher<DBIDRef> kNNByDBID​(DistanceQuery<O> distanceQuery,
                                              int maxk,
                                              int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByDBID in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByObject

        public RangeSearcher<O> rangeByObject​(DistanceQuery<O> distanceQuery,
                                              double maxrange,
                                              int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByObject in interface DistancePriorityIndex<O>
        Specified by:
        rangeByObject in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxrange - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByDBID

        public RangeSearcher<DBIDRef> rangeByDBID​(DistanceQuery<O> distanceQuery,
                                                  double maxrange,
                                                  int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByDBID in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxrange - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • logStatistics

        public void logStatistics()
        Description copied from interface: Index
        Send statistics to the logger, if enabled.

        Note: you must have set the logging level appropriately before initializing the index! Otherwise, the index might not have collected the desired statistics.

        Specified by:
        logStatistics in interface Index