Class VAFile<V extends NumberVector>

  • Type Parameters:
    V - Vector type
    All Implemented Interfaces:
    Index, KNNIndex<V>, RangeIndex<V>

    @Title("An approximation based data structure for similarity search")
    @Reference(authors="R. Weber, S. Blott",
               title="An approximation based data structure for similarity search",
               booktitle="Report TR1997b, ETH Zentrum, Zurich, Switzerland",
               url="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.480&rep=rep1&type=pdf",
               bibkey="tr/ethz/WeberS97")
    public class VAFile<V extends NumberVector>
    extends AbstractRefiningIndex<V>
    implements KNNIndex<V>, RangeIndex<V>
    Vector-approximation file (VAFile)

    Reference:

    R. Weber, S. Blott
    An approximation based data structure for similarity search
    Report TR1997b, ETH Zentrum, Zurich, Switzerland

    TODO: this implementation needs to be optimized and made more low-level.

    Since:
    0.5.0
    Author:
    Thomas Bernecker, Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Logging class.
      • partitions

        private int partitions
        Number of partitions.
      • splitPositions

        private double[][] splitPositions
        Quantile grid we use.
      • pageSize

        int pageSize
        Page size, for estimating the VA file size.
      • scans

        int scans
        Number of scans we performed.
    • Constructor Detail

      • VAFile

        public VAFile(int pageSize,
                      Relation<V> relation,
                      int partitions)
        Constructor.
        Parameters:
        pageSize - Page size of simulated index
        relation - Relation to index
        partitions - Number of partitions for each dimension.
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
      • setPartitions

        public void setPartitions(Relation<V> relation)
                           throws java.lang.IllegalArgumentException
        Initialize the data set grid by computing quantiles.
        Parameters:
        relation - Data relation
        Throws:
        java.lang.IllegalArgumentException
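
        As an illustration of what a quantile grid looks like, the following is a minimal sketch in plain Java, not the ELKI implementation. The class QuantileGridSketch, the method quantileGrid, the double[][] input and the condition that raises IllegalArgumentException are all assumptions made for this example. Each dimension is sorted independently and split so that every cell holds roughly the same number of objects.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only (not part of ELKI).
class QuantileGridSketch {
  /**
   * Compute an equi-depth grid: partitions + 1 borders per dimension.
   *
   * @param data Row-wise data set, data[i][d]
   * @param partitions Number of partitions for each dimension
   * @return Borders, indexed as splits[d][b]
   */
  static double[][] quantileGrid(double[][] data, int partitions) {
    // Assumed validity check; the documented method does not state its condition.
    if (data.length == 0 || partitions < 1) {
      throw new IllegalArgumentException("Need data and at least one partition.");
    }
    final int size = data.length, dim = data[0].length;
    double[][] splits = new double[dim][partitions + 1];
    for (int d = 0; d < dim; d++) {
      // Collect and sort the values of dimension d.
      double[] column = new double[size];
      for (int i = 0; i < size; i++) {
        column[i] = data[i][d];
      }
      Arrays.sort(column);
      for (int b = 0; b < partitions; b++) {
        // Lower border of cell b: the value at rank b * size / partitions.
        splits[d][b] = column[(int) ((long) b * size / partitions)];
      }
      // Upper border of the last cell: the maximum value.
      splits[d][partitions] = column[size - 1];
    }
    return splits;
  }

  public static void main(String[] args) {
    double[][] data = { { 1, 10 }, { 2, 20 }, { 3, 30 }, { 4, 40 } };
    // prints [[1.0, 3.0, 4.0], [10.0, 30.0, 40.0]]
    System.out.println(Arrays.deepToString(quantileGrid(data, 2)));
  }
}
```

        With equi-depth borders, skewed dimensions still distribute objects evenly across cells, which an equi-width grid would not guarantee.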
      • calculateApproximation

        public VectorApproximation calculateApproximation(DBIDRef id,
                                                          V dv)
        Calculate the VA file position given the existing borders.
        Parameters:
        id - Object ID
        dv - Data vector
        Returns:
        Vector approximation
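
        The idea of mapping a vector to grid cells can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the ELKI code: the class ApproximationSketch, the method approximate, the int[] result (standing in for VectorApproximation) and the clamping of out-of-range values to the first or last cell are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only (not part of ELKI).
class ApproximationSketch {
  /**
   * Find the grid cell of each coordinate.
   *
   * @param vector Data vector
   * @param splits Borders per dimension, splits[d] has (cells + 1) entries
   * @return Cell index per dimension
   */
  static int[] approximate(double[] vector, double[][] splits) {
    int[] cells = new int[vector.length];
    for (int d = 0; d < vector.length; d++) {
      double[] borders = splits[d];
      int last = borders.length - 2; // index of the last cell
      int cell = 0;
      // Advance while the value lies at or beyond the next cell's lower border.
      // Values outside the grid end up in the first or last cell (assumption).
      while (cell < last && vector[d] >= borders[cell + 1]) {
        cell++;
      }
      cells[d] = cell;
    }
    return cells;
  }

  public static void main(String[] args) {
    double[][] splits = { { 1, 3, 4 }, { 10, 30, 40 } };
    // prints [0, 1]
    System.out.println(Arrays.toString(approximate(new double[] { 2.5, 35 }, splits)));
  }
}
```

        A linear scan over the borders keeps the sketch short; with many partitions, a binary search over each border array would be the natural replacement.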
      • getScannedPages

        public long getScannedPages()
        Get the number of scanned pages.
        Returns:
        Number of scanned pages.
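
        The method name together with the pageSize and scans fields suggests a page-based cost estimate. One way such an estimate could be computed is sketched below in plain Java. The class ScannedPagesSketch, the method scannedPages and in particular the storage formula (ceil(log2(partitions)) bits per dimension, rounded up to whole bytes per approximation) are assumptions for this example; the actual on-disk size used by ELKI may differ.

```java
// Hypothetical helper class, for illustration only (not part of ELKI).
class ScannedPagesSketch {
  /**
   * Estimate the pages read by a number of full scans of the approximations.
   *
   * @param pageSize Page size in bytes
   * @param size Number of objects
   * @param dim Dimensionality
   * @param partitions Number of partitions for each dimension
   * @param scans Number of scans performed
   * @return Estimated number of scanned pages
   */
  static long scannedPages(int pageSize, int size, int dim, int partitions, int scans) {
    // Bits needed to store one cell index: ceil(log2(partitions)), at least 1.
    int bitsPerDim = 32 - Integer.numberOfLeadingZeros(Math.max(1, partitions - 1));
    // Assumed storage cost of one approximation, rounded up to whole bytes.
    int bytesPerApproximation = (dim * bitsPerDim + 7) / 8;
    int perPage = pageSize / bytesPerApproximation;
    if (perPage < 1) {
      throw new IllegalArgumentException("Page too small for one approximation.");
    }
    // Pages touched by a single sequential scan, rounded up.
    long pagesPerScan = ((long) size + perPage - 1) / perPage;
    return pagesPerScan * scans;
  }

  public static void main(String[] args) {
    // 16 dimensions at 4 bits each = 8 bytes, 128 per page, 79 pages per scan.
    // prints 237
    System.out.println(scannedPages(1024, 10000, 16, 16, 3));
  }
}
```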
      • logStatistics

        public void logStatistics()
        Description copied from interface: Index
        Send statistics to the logger, if enabled.

        Note: you must have set the logging level appropriately before initializing the index! Otherwise, the index might not have collected the desired statistics.

        Specified by:
        logStatistics in interface Index
        Overrides:
        logStatistics in class AbstractRefiningIndex<V extends NumberVector>
      • kNNByObject

        public KNNSearcher<V> kNNByObject(DistanceQuery<V> distanceQuery,
                                          int maxk,
                                          int flags)
        Description copied from interface: KNNIndex
        Get a KNN query object for the given distance query and k.

        This function MAY return null when the given distance is not supported!

        Specified by:
        kNNByObject in interface KNNIndex<V extends NumberVector>
        Parameters:
        distanceQuery - Distance query
        maxk - Maximum value of k
        flags - Hints for the optimizer
        Returns:
        KNN Query object or null
      • rangeByObject

        public RangeSearcher<V> rangeByObject(DistanceQuery<V> distanceQuery,
                                              double maxradius,
                                              int flags)
        Description copied from interface: RangeIndex
        Get a range query object for the given distance query and maximum radius.

        This function MAY return null when the given distance is not supported!

        Specified by:
        rangeByObject in interface RangeIndex<V extends NumberVector>
        Parameters:
        distanceQuery - Distance query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        Range query object or null