Class MinimalisticMemoryKDTree<O extends NumberVector>

  • Type Parameters:
    O - Vector type
    All Implemented Interfaces:
    DistancePriorityIndex<O>, Index, KNNIndex<O>, RangeIndex<O>

    @Reference(authors="J. L. Bentley",
               title="Multidimensional binary search trees used for associative searching",
               booktitle="Communications of the ACM 18(9)",
               url="https://doi.org/10.1145/361002.361007",
               bibkey="DBLP:journals/cacm/Bentley75")
    @Priority(-101)
    public class MinimalisticMemoryKDTree<O extends NumberVector>
    extends java.lang.Object
    implements DistancePriorityIndex<O>
    Simple implementation of a static in-memory K-D-tree. Does not support dynamic updates or anything, but also is very simple and memory efficient: all it uses is one ArrayModifiableDBIDs to sort the data in a serialized tree.

    Reference:

    J. L. Bentley
    Multidimensional binary search trees used for associative searching
    Communications of the ACM 18(9)

    The version SmallMemoryKDTree uses 3x more memory, but is considerably faster because it keeps a local copy of the attribute values, thus reducing the number of accesses to the relation substantially. In particular, this reduces construction time. MemoryKDTree needs even more memory, but uses much better splits and hence is usually the best choice.

    The search uses an improved search strategy published by:

    S. Arya and D. M. Mount
    Algorithms for fast vector quantization
    Proc. DCC '93: Data Compression Conference

    TODO: add support for weighted Minkowski distances.

    Since:
    0.6.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger
      • relation

        protected final Relation<O extends NumberVector> relation
        The representation we are bound to.
      • dims

        protected int dims
        The number of dimensions.
      • leafsize

        protected int leafsize
        Maximum size of leaf nodes.
      • objaccess

        protected final Counter objaccess
        Counter for comparisons.
      • distcalc

        protected final Counter distcalc
        Counter for distance computations.
    • Constructor Detail

      • MinimalisticMemoryKDTree

        public MinimalisticMemoryKDTree​(Relation<O> relation,
                                        int leafsize)
        Constructor.
        Parameters:
        relation - Relation to index
        leafsize - Maximum size of leaf nodes
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • buildTree

        private void buildTree​(int left,
                               int right,
                               int axis,
                               VectorUtil.SortDBIDsBySingleDimension comp)
        Recursively build the tree by partial sorting. O(n log n) complexity. Apparently there exists a variant in only O(n log log n)? Please contribute!
        Parameters:
        left - Interval minimum
        right - Interval maximum
        axis - Current splitting axis
        comp - Comparator
      • next

        private int next​(int axis)
        Next axis.
        Parameters:
        axis - Current axis
        Returns:
        Next axis
      • logStatistics

        public void logStatistics()
        Description copied from interface: Index
        Send statistics to the logger, if enabled.

        Note: you must have set the logging level appropriately before initializing the index! Otherwise, the index might not have collected the desired statistics.

        Specified by:
        logStatistics in interface Index
      • countObjectAccess

        protected void countObjectAccess()
        Count a single object access.
      • countDistanceComputation

        protected void countDistanceComputation()
        Count a distance computation.
      • kNNByObject

        public KNNSearcher<O> kNNByObject​(DistanceQuery<O> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByObject in interface DistancePriorityIndex<O extends NumberVector>
        Specified by:
        kNNByObject in interface KNNIndex<O extends NumberVector>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByObject

        public RangeSearcher<O> rangeByObject​(DistanceQuery<O> distanceQuery,
                                              double maxradius,
                                              int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByObject in interface DistancePriorityIndex<O extends NumberVector>
        Specified by:
        rangeByObject in interface RangeIndex<O extends NumberVector>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null