Distance Functions
ELKI release 0.7.5 includes the following distance functions:
- Minkowski family:
- Sparse optimized versions of Minkowski distances:
- Weighted versions of Minkowski distances:
- Angular distances:
- ArcCosineDistanceFunction
- CosineDistanceFunction
- ArcCosineUnitlengthDistanceFunction (for data with ||x|| = 1)
- CosineUnitlengthDistanceFunction (for data with ||x|| = 1)
- BrayCurtisDistanceFunction
- CanberraDistanceFunction
- WeightedCanberraDistanceFunction
- ClarkDistanceFunction
- RandomStableDistanceFunction (pseudo-random distance)
- Adapters for similarity functions:
- ArccosSimilarityAdapter
- LnSimilarityAdapter
- LinearAdapterLinear (to be renamed)
- Distances for probability distributions:
- ChiDistanceFunction
- ChiSquaredDistanceFunction
- FisherRaoDistanceFunction
- HellingerDistanceFunction
- JeffreyDivergenceDistanceFunction
- JensenShannonDivergenceDistanceFunction
- KullbackLeiblerDivergenceAsymmetricDistanceFunction
- KullbackLeiblerDivergenceReverseAsymmetricDistanceFunction
- SqrtJensenShannonDivergenceDistanceFunction
- TriangularDiscriminationDistanceFunction
- TriangularDistanceFunction
- Distance functions for 1-dimensional histograms:
- Color histogram distance functions:
- Correlation distance functions:
- PearsonCorrelationDistanceFunction
- SquaredPearsonCorrelationDistanceFunction
- AbsolutePearsonCorrelationDistanceFunction
- UncenteredCorrelationDistanceFunction
- SquaredUncenteredCorrelationDistanceFunction
- AbsoluteUncenteredCorrelationDistanceFunction
- WeightedPearsonCorrelationDistanceFunction
- WeightedSquaredPearsonCorrelationDistanceFunction
- Set-based distance functions (for binary data):
- String distance functions:
- Spatial distance functions (for geo data mining):
- External distance adapters (to access precomputed and externally computed distances):
- Subspace distance functions:
- Time series distance functions:
- Neighbor based distances:
- Distance functions for comparing clusters:
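To illustrate how the angular distances listed above relate to each other, here is a minimal standalone sketch. It assumes the usual definitions (cosine distance as 1 minus the cosine similarity, arc cosine distance as the arccos of the cosine similarity) and does not use the ELKI classes themselves; the class name AngularSketch and its methods are hypothetical. The unit-length variant shows why data with ||x|| = 1 allows skipping the normalization step.

```java
/** Hypothetical illustration only; not part of ELKI. */
final class AngularSketch {
  private AngularSketch() {}

  /** Dot product of two vectors of equal length. */
  static double dot(double[] x, double[] y) {
    if (x.length != y.length) {
      throw new IllegalArgumentException("Vectors must have the same dimensionality.");
    }
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) {
      sum += x[i] * y[i];
    }
    return sum;
  }

  /** Cosine similarity, clamped to [-1, 1] to absorb rounding errors.
   *  Zero vectors are not handled here and yield NaN. */
  static double cosineSimilarity(double[] x, double[] y) {
    double sim = dot(x, y) / Math.sqrt(dot(x, x) * dot(y, y));
    return Math.max(-1.0, Math.min(1.0, sim));
  }

  /** Cosine distance: 1 - cosine similarity. */
  static double cosineDistance(double[] x, double[] y) {
    return 1.0 - cosineSimilarity(x, y);
  }

  /** Arc cosine distance: the angle between the vectors, in radians. */
  static double arcCosineDistance(double[] x, double[] y) {
    return Math.acos(cosineSimilarity(x, y));
  }

  /** Cosine distance for vectors that already have unit length:
   *  the norms are 1, so the dot product is the cosine similarity. */
  static double cosineUnitlengthDistance(double[] x, double[] y) {
    return 1.0 - Math.max(-1.0, Math.min(1.0, dot(x, y)));
  }
}
```

For orthogonal vectors such as (1, 0) and (0, 1), the cosine distance in this sketch is 1 and the arc cosine distance is pi/2.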
Similarity Functions as Distances
Similarity functions usable through the adapter classes above include:
- FractionalSharedNearestNeighborSimilarityFunction
- SharedNearestNeighborSimilarityFunction
- Kulczynski1SimilarityFunction
- Kulczynski2SimilarityFunction
- Kernel functions:
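The idea behind the similarity adapters can be sketched as follows. This is a standalone illustration, assuming the transforms suggested by the adapter names (arccos of the similarity, negative logarithm of the similarity, and 1 minus the similarity) applied to a similarity normalized to [0, 1]; the class SimilarityToDistance is hypothetical, and the exact behaviour of the ELKI adapters should be checked in their documentation.

```java
/** Hypothetical illustration only; not part of ELKI. */
final class SimilarityToDistance {
  private SimilarityToDistance() {}

  /** Linear transform: distance = 1 - similarity. */
  static double linear(double sim) {
    return 1.0 - sim;
  }

  /** Arccos transform: distance = arccos(similarity).
   *  The input is clamped to [-1, 1] to absorb rounding errors. */
  static double arccos(double sim) {
    return Math.acos(Math.max(-1.0, Math.min(1.0, sim)));
  }

  /** Logarithmic transform: distance = -ln(similarity).
   *  A similarity of 0 maps to positive infinity. */
  static double negativeLog(double sim) {
    return -Math.log(sim);
  }
}
```

All three transforms map a similarity of 1 to a distance of 0 and grow as the similarity decreases; they differ in how strongly low similarities are penalized.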
Implementing custom distance functions
When implementing custom distance functions, ask yourself the following questions first:
- Is it defined on the data itself (like Euclidean distance) or on the instances (precomputed, external, or second-order distances)?
- What requirements does it have on the input data?
- What is the output data type?
For distances defined on coordinate vectors, you will most likely be implementing a NumberVectorDistanceFunction, and you can save yourself some work by deriving from AbstractNumberVectorDistanceFunction.
The Tutorial on writing a custom distance function takes you through all the steps needed for implementing a custom distance function.
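As a rough illustration of how the three questions above translate into code, the following standalone sketch implements a distance on coordinate vectors. It deliberately does not depend on ELKI: the interface VectorDistance is a hypothetical stand-in for a vector distance function, plain double arrays stand in for ELKI's vector type, and the class name EuclideanSketch is made up for this example. In ELKI itself you would derive from AbstractNumberVectorDistanceFunction instead, as described in the tutorial.

```java
/** Hypothetical stand-in for a distance defined on coordinate vectors; not an ELKI type. */
interface VectorDistance {
  /** Output data type: a double-valued distance. */
  double distance(double[] a, double[] b);
}

/** Euclidean distance, defined on the data itself (no precomputed values). */
final class EuclideanSketch implements VectorDistance {
  @Override
  public double distance(double[] a, double[] b) {
    // Input requirement: both vectors must have the same dimensionality.
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "Dimensionality mismatch: " + a.length + " vs. " + b.length);
    }
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double delta = a[i] - b[i];
      sum += delta * delta;
    }
    return Math.sqrt(sum);
  }
}
```

Calling `new EuclideanSketch().distance(new double[] {0, 0}, new double[] {3, 4})` returns 5.0, while vectors of different lengths are rejected with an exception.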