Class PrecomputedSimilarityMatrix<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    Index, SimilarityIndex<O>, SimilarityRangeIndex<O>

    public class PrecomputedSimilarityMatrix<O>
    extends java.lang.Object
    implements SimilarityIndex<O>, SimilarityRangeIndex<O>
    Precomputed similarity matrix, for a small data set.

    This class uses a linear memory layout (not a ragged array), and assumes symmetry as well as strictness. This way, it only stores the upper triangle matrix with double precision. It has to store n * (n-1) / 2 similarity values in memory, requiring 4 * n * (n-1) bytes. Since Java arrays are indexed by a signed 32-bit integer and therefore hold fewer than \(2^{31}\) elements, a single array can cover at most \(2^{16}\) objects (precisely, 65536 objects), which needs about 16 GB of RAM.
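
    As a rough check of these figures, the following sketch computes the number of stored values and the memory they need for a given number of objects. It assumes that one 8-byte double is stored per unordered pair of distinct objects, which gives n * (n-1) / 2 values. TriangleMemoryEstimate and its methods are hypothetical names and not part of ELKI.

```java
/** Hypothetical helper, not part of ELKI: storage estimate for a strict upper triangle. */
final class TriangleMemoryEstimate {
  /** Number of unordered pairs of distinct objects; computed in long to avoid overflow. */
  static long storedValues(long n) {
    return n * (n - 1) / 2;
  }

  /** Bytes needed when each value is an 8-byte double (array header ignored). */
  static long bytesNeeded(long n) {
    return storedValues(n) * Double.BYTES;
  }

  /** True if the values fit into one Java array, whose length is an int. */
  static boolean fitsInSingleArray(long n) {
    return storedValues(n) <= Integer.MAX_VALUE;
  }

  public static void main(String[] args) {
    long n = 65536;
    System.out.println(storedValues(n) + " values, " + bytesNeeded(n) + " bytes");
    System.out.println("fits in one array: " + fitsInSingleArray(n));
  }
}
```

    Under these assumptions, 65536 objects still fit into a single array, while 65537 objects would need more elements than an int index can address.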

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • relation

        protected final Relation<O> relation
        The representation we are bound to.
      • similarityFunction

        protected Similarity<? super O> similarityFunction
        Nested similarity function.
      • similarityQuery

        protected SimilarityQuery<O> similarityQuery
        Nested similarity query.
      • matrix

        private double[] matrix
        Similarity matrix.
      • size

        private int size
        Size of DBID range.
    • Constructor Detail

      • PrecomputedSimilarityMatrix

        public PrecomputedSimilarityMatrix(Relation<O> relation,
                                           Similarity<? super O> similarityFunction)
        Constructor.
        Parameters:
        relation - Data relation
        similarityFunction - Similarity function
    • Method Detail

      • initialize

        public void initialize()
        Description copied from interface: Index
        Initialize the index. For static indexes, this is the moment the index is bulk loaded.
        Specified by:
        initialize in interface Index
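
        For a precomputed matrix, bulk loading can be pictured as filling one linear array with every pairwise similarity. The sketch below is a simplified stand-in, not the ELKI implementation: it assumes the objects arrive as a plain list and the similarity as a ToDoubleBiFunction, and it assumes a small data set so that the array length fits into an int. BulkLoadSketch and precompute are hypothetical names.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical stand-in for the bulk-loading step; not ELKI code. */
final class BulkLoadSketch {
  /**
   * Fill a linear array with sim(objects[x], objects[y]) for all pairs y < x.
   * Row x starts at position x * (x - 1) / 2 and holds the x values for y = 0..x-1.
   */
  static <T> double[] precompute(List<T> objects,
      ToDoubleBiFunction<? super T, ? super T> sim) {
    final int n = objects.size();
    // Assumption: n is small enough that n * (n - 1) / 2 does not overflow an int.
    final double[] matrix = new double[n * (n - 1) / 2];
    int pos = 0;
    for (int x = 1; x < n; x++) {
      for (int y = 0; y < x; y++) {
        // Symmetry is assumed, so each unordered pair is evaluated only once.
        matrix[pos++] = sim.applyAsDouble(objects.get(x), objects.get(y));
      }
    }
    return matrix;
  }
}
```

        With three objects, the array holds the pairs (1,0), (2,0) and (2,1), in that order.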
      • triangleSize

        protected static int triangleSize(int x)
        Compute the size of a complete x by x triangle (minus diagonal).
        Parameters:
        x - Offset
        Returns:
        Size of complete triangle
      • getOffset

        private int getOffset(int x,
                              int y)
        Array offset computation.
        Parameters:
        x - X parameter
        y - Y parameter
        Returns:
        Array offset
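
        One common way to map an unordered pair of indices onto a linear triangular array is sketched below. This is an illustration under stated assumptions, not a copy of the ELKI source: TriangleLayout is a hypothetical class, the diagonal is assumed not to be stored, and the row of the larger index is assumed to start at the triangle size of that index.

```java
/** Hypothetical illustration of a triangular index layout; not copied from ELKI. */
final class TriangleLayout {
  /**
   * Number of cells strictly below the diagonal of an x by x matrix.
   * The unsigned shift keeps the result correct up to x = 65536, even though
   * the intermediate product x * (x - 1) no longer fits a signed int there.
   */
  static int triangleSize(int x) {
    return (x * (x - 1)) >>> 1;
  }

  /** Offset of the unordered pair {x, y} with x != y in the linear array. */
  static int getOffset(int x, int y) {
    if (x == y) {
      throw new IllegalArgumentException("diagonal is not stored");
    }
    // Use the larger index as the row and the smaller index as the column.
    return (y < x) ? triangleSize(x) + y : triangleSize(y) + x;
  }
}
```

        Because the larger index always selects the row, getOffset(x, y) and getOffset(y, x) resolve to the same cell, which is what the symmetry assumption requires.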
      • logStatistics

        public void logStatistics()
        Description copied from interface: Index
        Send statistics to the logger, if enabled.

        Note: you must have set the logging level appropriately before initializing the index! Otherwise, the index might not have collected the desired statistics.

        Specified by:
        logStatistics in interface Index
      • getSimilarityQuery

        public SimilarityQuery<O> getSimilarityQuery(Similarity<? super O> similarityFunction)
        Description copied from interface: SimilarityIndex
        Get a similarity query object for the given similarity function.
        Specified by:
        getSimilarityQuery in interface SimilarityIndex<O>
        Parameters:
        similarityFunction - Similarity function to use.
        Returns:
        similarity query object or null
      • similarityRangeByDBID

        public RangeSearcher<DBIDRef> similarityRangeByDBID(SimilarityQuery<O> simQuery,
                                                            double maxradius,
                                                            int flags)
        Description copied from interface: SimilarityRangeIndex
        Get a range query object for the given similarity query and maximum range.

        This function MAY return null when the given similarity is not supported!

        Specified by:
        similarityRangeByDBID in interface SimilarityRangeIndex<O>
        Parameters:
        simQuery - Similarity query
        maxradius - Maximum range
        flags - Hints for the optimizer
        Returns:
        Range query object or null
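
        A range search over a precomputed triangular matrix can be answered by scanning the stored values of one object. The sketch below is a hypothetical illustration, not the ELKI searcher: it assumes that a result contains every other object whose stored similarity to the query is at least a given threshold, that the query object itself is skipped because the diagonal is not stored, and that pairs are laid out row by row with the larger index as the row. MatrixRangeSketch, offset and range are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical range scan over a linear triangular similarity array; not ELKI code. */
final class MatrixRangeSketch {
  /** Offset of the unordered pair {x, y}, x != y, using the larger index as the row. */
  static int offset(int x, int y) {
    final int row = Math.max(x, y), col = Math.min(x, y);
    return ((row * (row - 1)) >>> 1) + col;
  }

  /**
   * Indices of all objects other than q whose stored similarity to q
   * is at least the threshold (assumed semantics of a similarity range query).
   */
  static List<Integer> range(double[] matrix, int n, int q, double threshold) {
    final List<Integer> result = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      if (i != q && matrix[offset(q, i)] >= threshold) {
        result.add(i);
      }
    }
    return result;
  }
}
```

        Each query costs one pass over n - 1 stored values and evaluates no similarity function, which is the point of precomputing the matrix.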
      • similarityRangeByObject

        public RangeSearcher<O> similarityRangeByObject(SimilarityQuery<O> simQuery,
                                                        double maxrange,
                                                        int flags)
        Description copied from interface: SimilarityRangeIndex
        Get a range query object for the given similarity query and maximum range.

        This function MAY return null when the given similarity is not supported!

        Specified by:
        similarityRangeByObject in interface SimilarityRangeIndex<O>
        Parameters:
        simQuery - Similarity query
        maxrange - Maximum range
        flags - Hints for the optimizer
        Returns:
        Range query object or null