Class CoverTree<O>

  • All Implemented Interfaces:
    DistancePriorityIndex<O>, Index, KNNIndex<O>, RangeIndex<O>

    @Reference(authors="A. Beygelzimer, S. Kakade, J. Langford",
               title="Cover trees for nearest neighbor",
               booktitle="In Proc. 23rd Int. Conf. Machine Learning (ICML 2006)",
               url="https://doi.org/10.1145/1143844.1143857",
               bibkey="DBLP:conf/icml/BeygelzimerKL06")
    @Priority(200)
    public class CoverTree<O>
    extends AbstractCoverTree<O>
    implements DistancePriorityIndex<O>
    Cover tree data structure (in-memory). This is a metrical data structure that is similar to the M-tree, but not as balanced and disk-oriented. However, by not having these requirements it does not require the expensive splitting procedures of M-tree.

    This implementation contains some optimizations for Java: for example nodes with no children are stored efficiently in the parent node. Sometimes this also comes at some cost, as we currently do not push these onto a heap, but rather process them along with the parent node. This may cause additional distance computations (in particular for k nearest neighbor search), but also saves some overhead in managing these candidates.

    Reference:

    A. Beygelzimer, S. Kakade, J. Langford
    Cover trees for nearest neighbor
    In Proc. 23rd Int. Conf. Machine Learning (ICML 2006)

    This implementation uses metrical pruning, and keeps the distances to the parent nodes. It thus needs more than twice the memory of SimplifiedCoverTree, but computes fewer distances.

    TODO: allow insertions and removals, as in the original publication.

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Constructor Detail

      • CoverTree

        public CoverTree​(Relation<O> relation,
                         Distance<? super O> distance,
                         double expansion,
                         int truncate)
        Constructor.
        Parameters:
        relation - data relation
        distance - distance function
        expansion - Expansion rate
        truncate - Truncate branches with less than this number of instances
      • CoverTree

        public CoverTree​(Relation<O> relation,
                         Distance<? super O> distance,
                         int truncate)
        Constructor.
        Parameters:
        relation - data relation
        distance - distance function
        truncate - Truncation parameter
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • bulkLoad

        public void bulkLoad​(DBIDs ids)
        Bulk-load the index.
        Parameters:
        ids - IDs to load
      • bulkConstruct

        protected CoverTree.Node bulkConstruct​(DBIDRef cur,
                                               int maxScale,
                                               double parentDist,
                                               ModifiableDoubleDBIDList elems)
        Bulk-load the cover tree.

        This bulk-load is slightly simpler than the one used in the original cover-tree source: We do not look back into the "far" set of candidates.

        Parameters:
        cur - Current routing object
        maxScale - Maximum scale
        parentDist - Distance to parent element
        elems - Candidates
        Returns:
        Root node of subtree
      • checkCoverTree

        private void checkCoverTree​(CoverTree.Node cur,
                                    int[] counts,
                                    int depth)
        Collect some statistics on the tree.
        Parameters:
        cur - Current node
        counts - Counter set
        depth - Current depth
      • rangeByObject

        public RangeSearcher<O> rangeByObject​(DistanceQuery<O> distanceQuery,
                                              double maxradius,
                                              int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByObject in interface DistancePriorityIndex<O>
        Specified by:
        rangeByObject in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByDBID

        public RangeSearcher<DBIDRef> rangeByDBID​(DistanceQuery<O> distanceQuery,
                                                  double maxradius,
                                                  int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByDBID in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • kNNByObject

        public KNNSearcher<O> kNNByObject​(DistanceQuery<O> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByObject in interface DistancePriorityIndex<O>
        Specified by:
        kNNByObject in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • kNNByDBID

        public KNNSearcher<DBIDRef> kNNByDBID​(DistanceQuery<O> distanceQuery,
                                              int maxk,
                                              int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByDBID in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • priorityByObject

        public PrioritySearcher<O> priorityByObject​(DistanceQuery<O> distanceQuery,
                                                    double maxradius,
                                                    int flags)
        Description copied from interface: DistancePriorityIndex
        Get a priority search object.
        Specified by:
        priorityByObject in interface DistancePriorityIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum search range (may be Double.POSITIVE_INFINITY
        flags - Optimizer hints
        Returns:
        Priority searcher