Package elki.distance
Distance functions
There are three basic types of distance functions:
Primitive Distance Functions that can be computed for any two objects.
DBID Distance Functions that are only defined for object IDs, e.g., an external distance matrix.
Index-Based Distance Functions that require an indexing/preprocessing step, and are then valid for existing database objects.
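The difference between the first two types can be illustrated with a minimal, self-contained sketch. It does not use the ELKI API: the class DistanceKindsSketch, the method manhattan, and the class MatrixDistanceSketch are hypothetical names introduced only for this illustration, and plain int values stand in for object IDs.

```java
public class DistanceKindsSketch {
  /** Primitive-style distance: computable for any two vectors, stored or not. */
  static double manhattan(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += Math.abs(a[i] - b[i]);
    }
    return sum;
  }

  /** ID-only distance (hypothetical): backed by an external matrix, so it only knows IDs. */
  static final class MatrixDistanceSketch {
    private final double[][] matrix;

    MatrixDistanceSketch(double[][] matrix) {
      this.matrix = matrix;
    }

    /** Look up the precomputed distance; there are no objects to compute it from. */
    double distance(int id1, int id2) {
      return matrix[id1][id2];
    }
  }

  public static void main(String[] args) {
    // Works on arbitrary objects: |1-4| + |2-6|
    System.out.println(manhattan(new double[] { 1, 2 }, new double[] { 4, 6 })); // prints 7.0
    // Works only for the IDs covered by the matrix
    MatrixDistanceSketch m = new MatrixDistanceSketch(new double[][] { { 0, 2.5 }, { 2.5, 0 } });
    System.out.println(m.distance(0, 1)); // prints 2.5
  }
}
```

The matrix-backed variant has no way to handle an object that was never assigned an ID, which is why such distances are only valid in a database context.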
Using distance functions
As a 'consumer' of distances, you usually do not need to care which type of distance function
you are using. To facilitate this, a distance function can be bound to a database by calling
the 'instantiate' method, which returns a DistanceQuery object.
A distance query is a best-effort adapter for the given distance function. Usually, you pass it
two DBIDs and get the distance value back. When required, the adapter will get the appropriate
records from the database needed to compute the distance.
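The adapter idea can be sketched without ELKI as follows. This is an illustration under stated assumptions, not the library's implementation: PrimitiveDistanceSketch and BoundQuery are hypothetical stand-ins, a Map plays the role of the database, and Integer keys play the role of DBIDs.

```java
import java.util.HashMap;
import java.util.Map;

public class DistanceQuerySketch {
  /** Hypothetical stand-in for a primitive distance on objects of type O. */
  interface PrimitiveDistanceSketch<O> {
    double distance(O a, O b);
  }

  /** Hypothetical adapter: accepts IDs, fetches the records, delegates to the distance. */
  static final class BoundQuery<O> {
    private final Map<Integer, O> records;
    private final PrimitiveDistanceSketch<? super O> dist;

    BoundQuery(Map<Integer, O> records, PrimitiveDistanceSketch<? super O> dist) {
      this.records = records;
      this.dist = dist;
    }

    double distance(int id1, int id2) {
      // The caller only passes IDs; the lookup happens here.
      return dist.distance(records.get(id1), records.get(id2));
    }
  }

  /** Plain Euclidean distance on double arrays of equal length. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Builds a two-record "database" and queries it by ID. */
  static double demo() {
    Map<Integer, double[]> db = new HashMap<>();
    db.put(1, new double[] { 0, 0 });
    db.put(2, new double[] { 3, 4 });
    BoundQuery<double[]> query = new BoundQuery<>(db, DistanceQuerySketch::euclidean);
    return query.distance(1, 2);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 5.0
  }
}
```

Code written against BoundQuery never touches the records directly, so the same calling code would keep working if the distance came from a matrix or an index instead.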
Note: instantiating a preprocessor-based distance will invoke the preprocessing step. It is recommended to do this as early as possible and to instantiate the query only once, then pass the query object on to the methods that need it.
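Why a single instantiation matters can be shown with a small sketch. Everything here is hypothetical and independent of ELKI: RankQuery is an invented distance whose preprocessing ranks the stored values, instantiate is a plain static factory, and the counter preprocessingRuns exists only to make the cost visible.

```java
import java.util.Arrays;

public class InstantiateOnceSketch {
  /** Counts how often the (potentially expensive) preprocessing ran. */
  static int preprocessingRuns = 0;

  /** Hypothetical preprocessed distance: difference in rank between two stored values. */
  static final class RankQuery {
    private final int[] rank;

    RankQuery(double[] data) {
      preprocessingRuns++;
      // Preprocessing step: sort the indices by value, then record each index's rank.
      Integer[] order = new Integer[data.length];
      for (int i = 0; i < data.length; i++) {
        order[i] = i;
      }
      Arrays.sort(order, (a, b) -> Double.compare(data[a], data[b]));
      rank = new int[data.length];
      for (int r = 0; r < order.length; r++) {
        rank[order[r]] = r;
      }
    }

    double distance(int id1, int id2) {
      return Math.abs(rank[id1] - rank[id2]);
    }
  }

  static RankQuery instantiate(double[] data) {
    return new RankQuery(data);
  }

  /** Two consumers that receive the query object instead of re-instantiating it. */
  static double sumToFirst(RankQuery q, int n) {
    double sum = 0;
    for (int i = 0; i < n; i++) {
      sum += q.distance(0, i);
    }
    return sum;
  }

  static double maxToFirst(RankQuery q, int n) {
    double max = 0;
    for (int i = 0; i < n; i++) {
      max = Math.max(max, q.distance(0, i));
    }
    return max;
  }

  public static void main(String[] args) {
    double[] data = { 10.0, 30.0, 20.0 };
    RankQuery query = instantiate(data); // preprocessing happens here, once
    System.out.println(sumToFirst(query, data.length)); // prints 3.0
    System.out.println(maxToFirst(query, data.length)); // prints 2.0
    System.out.println(preprocessingRuns); // prints 1
  }
}
```

Had each consumer called instantiate itself, the ranking would have been recomputed for every call site.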
Code example
DistanceQuery<V> distanceQuery = database.getDistanceQuery(EuclideanDistance.STATIC);
Interface Summary
Interface: Description
DBIDDistance: Distance functions valid in a database context only (i.e. for DBIDs).
DBIDRangeDistance: Distance functions valid in a static database context only (i.e. for DBIDRanges). For any "distance" that cannot be computed for arbitrary objects, only those that exist in the database and referenced by their ID.
Distance<O>: Base interface for any kind of distances.
IndexBasedDistance<O>: Distance function relying on an index (such as preprocessed neighborhoods).
IndexBasedDistance.Instance<T,I extends Index>: Instance interface for Index based distance functions.
Norm<O>: Abstract interface for a mathematical norm.
NumberVectorDistance<O>: Base interface for the common case of distance functions defined on numerical vectors.
PrimitiveDistance<O>: Primitive distance function that is defined on some kind of object.
SpatialPrimitiveDistance<V extends SpatialComparable>: API for a spatial primitive distance function.
WeightedNumberVectorDistance<V>: Distance functions where each dimension is assigned a weight.
Class Summary
Class: Description
AbstractDatabaseDistance<O>: Abstract super class for distance functions needing a database context.
AbstractDatabaseDistance.Instance<O>: The actual instance bound to a particular database.
AbstractDBIDRangeDistance: Abstract base class for distance functions that rely on integer offsets within a consecutive range.
AbstractIndexBasedDistance<O,F extends IndexFactory<O>>: Abstract super class for distance functions needing a database index.
AbstractIndexBasedDistance.Instance<O,I extends Index,F extends Distance<? super O>>: The actual instance bound to a particular database.
AbstractNumberVectorDistance: Abstract base class for the most common family of distance functions: defined on number vectors and returning double values.
ArcCosineDistance: Arcus cosine distance function for feature vectors.
ArcCosineDistance.Par: Parameterization class.
ArcCosineUnitlengthDistance: Arcus cosine distance function for feature vectors.
ArcCosineUnitlengthDistance.Par: Parameterization class.
BrayCurtisDistance: Bray-Curtis distance function / Sørensen–Dice coefficient for continuous vector spaces (not only binary data).
BrayCurtisDistance.Par: Parameterization class.
CanberraDistance: Canberra distance function, a variation of Manhattan distance.
CanberraDistance.Par: Parameterization class.
ClarkDistance: Clark distance function for vector spaces.
ClarkDistance.Par: Parameterization class.
CosineDistance: Cosine distance function for feature vectors.
CosineDistance.Par: Parameterization class.
CosineUnitlengthDistance: Cosine distance function for unit length feature vectors.
CosineUnitlengthDistance.Par: Parameterization class.
MahalanobisDistance: Mahalanobis quadratic form distance for feature vectors.
MatrixWeightedQuadraticDistance: Matrix weighted quadratic distance, the squared form of MahalanobisDistance.
RandomStableDistance: This is a dummy distance providing random values (obviously not metrical), useful mostly for unit tests and baseline evaluations: obviously this distance provides no benefit whatsoever.
RandomStableDistance.Par: Parameterization class.
SharedNearestNeighborJaccardDistance<O>: SharedNearestNeighborJaccardDistance computes the Jaccard coefficient, which is a proper distance metric.
SharedNearestNeighborJaccardDistance.Instance<T>: Actual instance for a dataset.
SharedNearestNeighborJaccardDistance.Par<O>: Parameterization class.
SqrtCosineDistance: Cosine distance function for feature vectors using the square root.
SqrtCosineDistance.Par: Parameterization class.
SqrtCosineUnitlengthDistance: Cosine distance function for unit length feature vectors using the square root.
SqrtCosineUnitlengthDistance.Par: Parameterization class.
WeightedCanberraDistance: Weighted Canberra distance function, a variation of Manhattan distance.
WeightedCanberraDistance.Par: Parameterization class.