There are three basic types of distance functions:
Primitive distance functions, which can be computed for any two objects.
DBID distance functions, which are only defined for object IDs, e.g., an external distance matrix.
Index-based distance functions, which require an indexing/preprocessing step and are then valid for existing database objects.
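The three types can be illustrated with a minimal, library-independent sketch. All names here (DistanceTypesSketch, PrimitiveDist, IdDist, fromMatrix, sharedNeighborJaccard) are hypothetical and are not part of this package's API; object IDs are assumed to be plain list positions, and the index-based example assumes a simple shared-nearest-neighbor Jaccard distance over precomputed k-nearest-neighbor sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistanceTypesSketch {
  /** Primitive: computable for any two objects, no database required. */
  interface PrimitiveDist<O> {
    double distance(O a, O b);
  }

  /** ID-based: only defined for object IDs (assumed here: list positions). */
  interface IdDist {
    double distance(int idA, int idB);
  }

  /** A primitive distance: Euclidean distance on equal-length arrays. */
  static final PrimitiveDist<double[]> EUCLIDEAN = (a, b) -> {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  };

  /** An ID-only distance: look the value up in an external matrix. */
  static IdDist fromMatrix(double[][] matrix) {
    return (i, j) -> matrix[i][j];
  }

  /** An index-based distance: preprocess kNN sets once, then compare them. */
  static IdDist sharedNeighborJaccard(List<double[]> data, int k) {
    // Preprocessing step: the k nearest neighbors of every object
    // (each object counts as its own neighbor at distance 0).
    List<Set<Integer>> neighbors = new ArrayList<>();
    for (double[] query : data) {
      List<Integer> ids = new ArrayList<>();
      for (int j = 0; j < data.size(); j++) {
        ids.add(j);
      }
      ids.sort((x, y) -> Double.compare(
          EUCLIDEAN.distance(query, data.get(x)),
          EUCLIDEAN.distance(query, data.get(y))));
      neighbors.add(new HashSet<>(ids.subList(0, Math.min(k, ids.size()))));
    }
    // The returned function is only valid for IDs that were preprocessed.
    return (a, b) -> {
      Set<Integer> inter = new HashSet<>(neighbors.get(a));
      inter.retainAll(neighbors.get(b));
      Set<Integer> union = new HashSet<>(neighbors.get(a));
      union.addAll(neighbors.get(b));
      return union.isEmpty() ? 0.0 : 1.0 - (double) inter.size() / union.size();
    };
  }

  public static void main(String[] args) {
    List<double[]> data = Arrays.asList(new double[] {0, 0}, new double[] {1, 0},
        new double[] {10, 0}, new double[] {11, 0});
    System.out.println(EUCLIDEAN.distance(data.get(0), data.get(1)));
    IdDist snn = sharedNeighborJaccard(data, 2);
    System.out.println(snn.distance(0, 1) + " " + snn.distance(0, 2));
  }
}
```

Only the primitive distance accepts arbitrary objects. The other two take IDs, and the index-based one additionally pays a one-time preprocessing cost before the first distance can be computed.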
Using distance functions
As a 'consumer' of distances, you usually do not care about the type of distance function you
want to use. To facilitate this, a distance function can be bound to a database by calling
the 'instantiate' method to obtain a DistanceQuery.
A distance query is a best-effort adapter for the given distance function. Usually, you pass it
two DBIDs and get the distance value back. When required, the adapter will get the appropriate
records from the database needed to compute the distance.
Note: instantiating a preprocessor-based distance invokes the preprocessing step. It is recommended to do this as early as possible, to instantiate the query only once, and then to pass the query object through the various methods.
DistanceQuery<V> distanceQuery = database.getDistanceQuery(EuclideanDistance.STATIC);
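The adapter idea can be sketched without the library: a query object is bound to a set of records once, accepts two IDs, and fetches the records it needs before computing the distance. The names below (DistanceQuerySketch, BoundQuery, sumOfDistances) and the use of a HashMap as the "database" are assumptions made for illustration only, not this package's actual classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

public class DistanceQuerySketch {
  /** Hypothetical stand-in for a distance query: IDs in, distance out. */
  static final class BoundQuery<O> {
    private final Map<Integer, O> records;
    private final ToDoubleBiFunction<O, O> function;

    BoundQuery(Map<Integer, O> records, ToDoubleBiFunction<O, O> function) {
      this.records = records;
      this.function = function;
    }

    /** Fetch both records by ID, then delegate to the bound function. */
    double distance(int idA, int idB) {
      return function.applyAsDouble(fetch(idA), fetch(idB));
    }

    private O fetch(int id) {
      O record = records.get(id);
      if (record == null) {
        throw new IllegalArgumentException("No record with ID " + id);
      }
      return record;
    }
  }

  /** Euclidean distance on equal-length arrays. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** A consumer that receives the query object instead of creating its own. */
  static double sumOfDistances(BoundQuery<?> query, int id, int[] others) {
    double sum = 0.0;
    for (int other : others) {
      sum += query.distance(id, other);
    }
    return sum;
  }

  public static void main(String[] args) {
    Map<Integer, double[]> database = new HashMap<>();
    database.put(1, new double[] {0, 0});
    database.put(2, new double[] {3, 4});
    database.put(3, new double[] {6, 8});
    // Bind once, then hand the same query object to every consumer.
    BoundQuery<double[]> query =
        new BoundQuery<double[]>(database, DistanceQuerySketch::euclidean);
    System.out.println(query.distance(1, 2));
    System.out.println(sumOfDistances(query, 1, new int[] {2, 3}));
  }
}
```

Passing the bound query into sumOfDistances, rather than rebuilding it there, mirrors the recommendation to instantiate the query only once.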
Interface Summary
DBIDDistance: Distance functions valid in a database context only (i.e. for DBIDs)
DBIDRangeDistance: Distance functions valid in a static database context only (i.e. for DBIDRanges). For any "distance" that cannot be computed for arbitrary objects, only those that exist in the database and referenced by their ID.
Distance<O>: Base interface for any kind of distances.
IndexBasedDistance<O>: Distance function relying on an index (such as preprocessed neighborhoods).
IndexBasedDistance.Instance<T,I extends Index>: Instance interface for Index based distance functions.
Norm<O>: Abstract interface for a mathematical norm.
NumberVectorDistance<O>: Base interface for the common case of distance functions defined on numerical vectors.
PrimitiveDistance<O>: Primitive distance function that is defined on some kind of object.
SpatialPrimitiveDistance<V extends SpatialComparable>: API for a spatial primitive distance function.
WeightedNumberVectorDistance<V>: Distance functions where each dimension is assigned a weight.
Class Summary
AbstractDatabaseDistance<O>: Abstract super class for distance functions needing a database context.
AbstractDatabaseDistance.Instance<O>: The actual instance bound to a particular database.
AbstractDBIDRangeDistance: Abstract base class for distance functions that rely on integer offsets within a consecutive range.
AbstractIndexBasedDistance<O,F extends IndexFactory<O>>: Abstract super class for distance functions needing a database index.
AbstractIndexBasedDistance.Instance<O,I extends Index,F extends Distance<? super O>>: The actual instance bound to a particular database.
AbstractNumberVectorDistance: Abstract base class for the most common family of distance functions: defined on number vectors and returning double values.
ArcCosineDistance: Arcus cosine distance function for feature vectors.
ArcCosineDistance.Par: Parameterization class.
ArcCosineUnitlengthDistance: Arcus cosine distance function for feature vectors.
ArcCosineUnitlengthDistance.Par: Parameterization class.
BrayCurtisDistance: Bray-Curtis distance function / Sørensen–Dice coefficient for continuous vector spaces (not only binary data).
BrayCurtisDistance.Par: Parameterization class.
CanberraDistance: Canberra distance function, a variation of Manhattan distance.
CanberraDistance.Par: Parameterization class.
ClarkDistance: Clark distance function for vector spaces.
ClarkDistance.Par: Parameterization class.
CosineDistance: Cosine distance function for feature vectors.
CosineDistance.Par: Parameterization class.
CosineUnitlengthDistance: Cosine distance function for unit length feature vectors.
CosineUnitlengthDistance.Par: Parameterization class.
MahalanobisDistance: Mahalanobis quadratic form distance for feature vectors.
MatrixWeightedQuadraticDistance: Matrix weighted quadratic distance, the squared form of
RandomStableDistance: This is a dummy distance providing random values (obviously not metrical), useful mostly for unit tests and baseline evaluations: obviously this distance provides no benefit whatsoever.
RandomStableDistance.Par: Parameterization class.
SharedNearestNeighborJaccardDistance<O>: SharedNearestNeighborJaccardDistance computes the Jaccard coefficient, which is a proper distance metric.
SharedNearestNeighborJaccardDistance.Instance<T>: Actual instance for a dataset.
SharedNearestNeighborJaccardDistance.Par<O>: Parameterization class.
SqrtCosineDistance: Cosine distance function for feature vectors using the square root.
SqrtCosineDistance.Par: Parameterization class.
SqrtCosineUnitlengthDistance: Cosine distance function for unit length feature vectors using the square root.
SqrtCosineUnitlengthDistance.Par: Parameterization class.
WeightedCanberraDistance: Weighted Canberra distance function, a variation of Manhattan distance.
WeightedCanberraDistance.Par: Parameterization class.
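Several of the vector distances in the class summary have short textbook definitions. The sketch below shows Canberra, cosine, and arc cosine distance on plain double arrays. It follows the common textbook formulas and is not taken from this package's implementation. The class and method names are hypothetical, and the handling of zero denominators and zero vectors is an assumption of this sketch.

```java
public class VectorDistanceSketch {
  /** Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|). */
  static double canberra(double[] x, double[] y) {
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) {
      double denom = Math.abs(x[i]) + Math.abs(y[i]);
      // Assumption: a 0/0 term contributes nothing to the sum.
      if (denom > 0.0) {
        sum += Math.abs(x[i] - y[i]) / denom;
      }
    }
    return sum;
  }

  /** Cosine similarity, clamped to [-1, 1] to absorb rounding errors. */
  static double cosineSimilarity(double[] x, double[] y) {
    double dot = 0.0, normX = 0.0, normY = 0.0;
    for (int i = 0; i < x.length; i++) {
      dot += x[i] * y[i];
      normX += x[i] * x[i];
      normY += y[i] * y[i];
    }
    // Assumption: a zero vector has similarity 0 to everything.
    if (normX == 0.0 || normY == 0.0) {
      return 0.0;
    }
    double sim = dot / Math.sqrt(normX * normY);
    return Math.max(-1.0, Math.min(1.0, sim));
  }

  /** Cosine distance: 1 minus the cosine similarity. */
  static double cosine(double[] x, double[] y) {
    return 1.0 - cosineSimilarity(x, y);
  }

  /** Arc cosine distance: the angle between the two vectors in radians. */
  static double arcCosine(double[] x, double[] y) {
    return Math.acos(cosineSimilarity(x, y));
  }

  public static void main(String[] args) {
    double[] a = {1, 2};
    double[] b = {3, 2};
    System.out.println("canberra   = " + canberra(a, b));
    System.out.println("cosine     = " + cosine(a, b));
    System.out.println("arc cosine = " + arcCosine(a, b));
  }
}
```

Clamping the similarity before calling Math.acos avoids NaN results when rounding pushes the value slightly outside [-1, 1].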