Class SmallMemoryKDTree<O extends NumberVector>

  • Type Parameters:
    O - Vector type
    All Implemented Interfaces:
    DistancePriorityIndex<O>, Index, KNNIndex<O>, RangeIndex<O>

    @Reference(authors="J. L. Bentley",
               title="Multidimensional binary search trees used for associative searching",
               booktitle="Communications of the ACM 18(9)",
               url="https://doi.org/10.1145/361002.361007",
               bibkey="DBLP:journals/cacm/Bentley75")
    @Priority(-101)
    public class SmallMemoryKDTree<O extends NumberVector>
    extends java.lang.Object
    implements DistancePriorityIndex<O>
    Simple implementation of a static in-memory K-D-tree. Does not support dynamic updates or anything, but also is very simple and memory efficient: all it uses is one ModifiableDoubleDBIDList to sort the data in a serialized tree and store the current attribute value.

    It needs about 3 times as much memory as MinimalisticMemoryKDTree but it is also considerably faster because it does not need to lookup this value from the vectors. MemoryKDTree needs even more memory, but uses much better splits and hence is usually the best choice.

    Reference:

    J. L. Bentley
    Multidimensional binary search trees used for associative searching
    Communications of the ACM 18(9)

    The search uses an improved search strategy published by:

    S. Arya and D. M. Mount
    Algorithms for fast vector quantization
    Proc. DCC '93: Data Compression Conference

    TODO: add support for weighted Minkowski distances.

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger
      • relation

        protected final Relation<O extends NumberVector> relation
        The representation we are bound to.
      • dims

        protected int dims
        The number of dimensions.
      • leafsize

        protected int leafsize
        Maximum size of leaf nodes.
      • objaccess

        protected final Counter objaccess
        Counter for comparisons.
      • distcalc

        protected final Counter distcalc
        Counter for distance computations.
    • Constructor Detail

      • SmallMemoryKDTree

        public SmallMemoryKDTree​(Relation<O> relation,
                                 int leafsize)
        Constructor.
        Parameters:
        relation - Relation to index
        leafsize - Maximum size of leaf nodes
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • buildTree

        private void buildTree​(int left,
                               int right,
                               int axis,
                               DoubleDBIDListMIter iter)
        Recursively build the tree by partial sorting. O(n log n) complexity. Apparently there exists a variant in only O(n log log n)? Please contribute!
        Parameters:
        left - Interval minimum
        right - Interval maximum
        axis - Current splitting axis
        iter - Iterator
      • next

        private int next​(int axis)
        Next axis.
        Parameters:
        axis - Current axis
        Returns:
        Next axis
      • logStatistics

        public void logStatistics()
        Description copied from interface: Index
        Send statistics to the logger, if enabled.

        Note: you must have set the logging level appropriately before initializing the index! Otherwise, the index might not have collected the desired statistics.

        Specified by:
        logStatistics in interface Index
      • countObjectAccess

        protected void countObjectAccess()
        Count a single object access.
      • countDistanceComputation

        protected void countDistanceComputation()
        Count a distance computation.
      • kNNByObject

        public KNNSearcher<O> kNNByObject​(DistanceQuery<O> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByObject in interface DistancePriorityIndex<O extends NumberVector>
        Specified by:
        kNNByObject in interface KNNIndex<O extends NumberVector>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null