Class SimplifiedCoverTree<O>

  • All Implemented Interfaces:
    DistancePriorityIndex<O>, Index, KNNIndex<O>, RangeIndex<O>

    public class SimplifiedCoverTree<O>
    extends AbstractCoverTree<O>
    implements DistancePriorityIndex<O>
    Simplified cover tree data structure (in-memory). This is a metrical data structure that is similar to the M-tree, but not as balanced and disk-oriented. However, by not having these requirements it does not require the expensive splitting procedures of M-tree.

    This version does not store the distance to the parent, so it needs only about 40% of the memory of CoverTree but does more distance computations for search.

    Reference:

    A. Beygelzimer, S. Kakade, J. Langford
    Cover trees for nearest neighbor
    In Proc. 23rd Int. Conf. Machine Learning (ICML 2006)

    TODO: allow insertions and removals, as in the original publication.

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Constructor Detail

      • SimplifiedCoverTree

        public SimplifiedCoverTree​(Relation<O> relation,
                                   Distance<? super O> distance,
                                   double expansion,
                                   int truncate)
        Constructor.
        Parameters:
        relation - data relation
        distance - distance function
        expansion - Expansion rate
        truncate - Truncate branches with less than this number of instances
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • bulkLoad

        public void bulkLoad​(DBIDs ids)
        Bulk-load the index.
        Parameters:
        ids - IDs to load
      • bulkConstruct

        protected SimplifiedCoverTree.Node bulkConstruct​(DBIDRef cur,
                                                         int maxScale,
                                                         ModifiableDoubleDBIDList elems)
        Bulk-load the cover tree.

        This bulk-load is slightly simpler than the one used in the original cover-tree source: We do not look back into the "far" set of candidates.

        Parameters:
        cur - Current routing object
        maxScale - Maximum scale
        elems - Candidates
        Returns:
        Root node of subtree
      • checkCoverTree

        private void checkCoverTree​(SimplifiedCoverTree.Node cur,
                                    int[] counts,
                                    int depth)
        Collect some statistics on the tree.
        Parameters:
        cur - Current node
        counts - Counter set
        depth - Current depth
      • rangeByObject

        public RangeSearcher<O> rangeByObject​(DistanceQuery<O> distanceQuery,
                                              double maxradius,
                                              int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByObject in interface DistancePriorityIndex<O>
        Specified by:
        rangeByObject in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByDBID

        public RangeSearcher<DBIDRef> rangeByDBID​(DistanceQuery<O> distanceQuery,
                                                  double maxradius,
                                                  int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        rangeByDBID in interface RangeIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • kNNByObject

        public KNNSearcher<O> kNNByObject​(DistanceQuery<O> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByObject in interface DistancePriorityIndex<O>
        Specified by:
        kNNByObject in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • kNNByDBID

        public KNNSearcher<DBIDRef> kNNByDBID​(DistanceQuery<O> distanceQuery,
                                              int maxk,
                                              int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null, when the given distance is not supported!

        Specified by:
        kNNByDBID in interface KNNIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • priorityByObject

        public PrioritySearcher<O> priorityByObject​(DistanceQuery<O> distanceQuery,
                                                    double maxradius,
                                                    int flags)
        Description copied from interface: DistancePriorityIndex
        Get a priority search object.
        Specified by:
        priorityByObject in interface DistancePriorityIndex<O>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum search range (may be Double.POSITIVE_INFINITY
        flags - Optimizer hints
        Returns:
        Priority searcher