java.lang.Object
- elki.distance.AbstractNumberVectorDistance
- - elki.distance.probabilistic.HellingerDistance

All Implemented Interfaces:

Distance<NumberVector>, NumberVectorDistance<NumberVector>, PrimitiveDistance<NumberVector>, SpatialPrimitiveDistance<NumberVector>, NormalizedPrimitiveSimilarity<NumberVector>, NormalizedSimilarity<NumberVector>, PrimitiveSimilarity<NumberVector>, Similarity<NumberVector>
```
@Reference(authors="E. Hellinger",title="Neue Begr\u00fcndung der Theorie quadratischer Formen von unendlichvielen Ver\u00e4nderlichen",booktitle="Journal f\u00fcr die reine und angewandte Mathematik",url="http://resolver.sub.uni-goettingen.de/purl?GDZPPN002166941",bibkey="journals/mathematik/Hellinger1909") @Reference(authors="M.-M. Deza, E. Deza",title="Dictionary of distances",booktitle="Dictionary of distances",url="https://doi.org/10.1007/978-3-642-00234-2",bibkey="doi:10.1007/978-3-642-00234-2")
@Alias({"hellinger","bhattacharyya"})
public class HellingerDistance
extends AbstractNumberVectorDistance
implements SpatialPrimitiveDistance<NumberVector>, NormalizedPrimitiveSimilarity<NumberVector>
```
Hellinger metric / affinity / kernel, Bhattacharyya coefficient, fidelity similarity, Matusita distance, Hellinger-Kakutani metric on a probability distribution.
We assume vectors represent normalized probability distributions. Then \[\text{Hellinger}(\vec{x},\vec{y}):= \sqrt{\tfrac12\sum\nolimits_i \left(\sqrt{x_i}-\sqrt{y_i}\right)^2 } \]
The corresponding kernel / similarity is \[ K_{\text{Hellinger}}(\vec{x},\vec{y}) := \sum\nolimits_i \sqrt{x_i y_i} \]
If we have normalized probability distributions, we have the nice property that $ K_{\text{Hellinger}}(\vec{x},\vec{x}) = \sum\nolimits_i x_i = 1$. and therefore $ K_{\text{Hellinger}}(\vec{x},\vec{y}) \in [0:1] $.
Furthermore, we have the following relationship between this variant of the distance and this kernel: \[ \text{Hellinger}^2(\vec{x},\vec{y}) = \tfrac12\sum\nolimits_i \left(\sqrt{x_i}-\sqrt{y_i}\right)^2 = \tfrac12\sum\nolimits_i x_i + y_i - 2 \sqrt{x_i y_i} \] \[ \text{Hellinger}^2(\vec{x},\vec{y}) = \tfrac12K_{\text{Hellinger}}(\vec{x},\vec{x}) + \tfrac12K_{\text{Hellinger}}(\vec{y},\vec{y}) - K_{\text{Hellinger}}(\vec{x},\vec{y}) = 1 - K_{\text{Hellinger}}(\vec{x},\vec{y}) \] which implies $\text{Hellinger}(\vec{x},\vec{y}) \in [0;1]$, and is very similar to the Euclidean distance and the linear kernel.
From this, it follows trivially that Hellinger distance corresponds to the kernel transformation $\phi:\vec{x}\mapsto(\tfrac12\sqrt{x_1},\ldots,\tfrac12\sqrt{x_d})$.
Deza and Deza unfortunately also give a second definition, as: \[\text{Hellinger-Deza}(\vec{x},\vec{y}):=\sqrt{2\sum\nolimits_i \left(\sqrt{\tfrac{x_i}{\bar{x}}}-\sqrt{\tfrac{y_i}{\bar{y}}}\right)^2}\] which has a built-in normalization, and a different scaling that is no longer bound to $[0;1]$. The 2 in this definition likely should be a $\frac12$.
This distance is well suited for histograms, but it is then more efficient to once normalize the histograms, apply the square roots, and then use Euclidean distance (i.e., use the "kernel trick" in reverse, materializing the transformation $\phi$ given above).
Reference:
E. Hellinger
Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen
Journal für die reine und angewandte Mathematik
M.-M. Deza, E. Deza
Dictionary of distances
TODO: support acceleration for sparse vectors.
TODO: add a second variant, with built-in normalization.

Since:

0.7.0

Author:

Erich Schubert

Nested Class Summary

Nested Classes
Modifier and Type Class Description

static class HellingerDistance.Par
Parameterization class.

Field Summary

Fields
Modifier and Type Field Description

private static java.lang.String NON_NEGATIVE
Assertion error message.

static HellingerDistance STATIC
Static instance.

Constructor Summary

Constructors
Constructor Description

HellingerDistance()
Deprecated.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method	Description
`double`	`distance(NumberVector fv1, NumberVector fv2)`	Computes the distance between two given DatabaseObjects according to this distance function.
`boolean`	`equals(java.lang.Object obj)`
`SimpleTypeInformation<? super NumberVector>`	`getInputTypeRestriction()`	Get the input data type of the function.
`int`	`hashCode()`
`<T extends NumberVector> SpatialPrimitiveDistanceSimilarityQuery<T>`	`instantiate(Relation<T> database)`	Instantiate with a database to get the actual distance query.
`boolean`	`isMetric()`	Is this distance function metric (satisfy the triangle inequality)
`boolean`	`isSymmetric()`	Is this function symmetric?
`double`	`minDist(SpatialComparable mbr1, SpatialComparable mbr2)`	Computes the distance between the two given MBRs according to this distance function.
`double`	`similarity(NumberVector o1, NumberVector o2)`	Computes the similarity between two given DatabaseObjects according to this similarity function.

Methods inherited from class elki.distance.AbstractNumberVectorDistance
dimensionality, dimensionality, dimensionality, dimensionality, dimensionality, dimensionality, dimensionality, dimensionality

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface elki.distance.Distance
isSquared

- Field Detail
  - STATIC
```
public static final HellingerDistance STATIC
```
    Static instance.
  - NON_NEGATIVE
```
private static final java.lang.String NON_NEGATIVE
```
    Assertion error message.
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - HellingerDistance
```
@Deprecated
public HellingerDistance()
```
    Deprecated.
    
    Hellinger kernel. Use static instance STATIC!
- Method Detail
  - distance
```
public double distance(NumberVector fv1,
                       NumberVector fv2)
```
    Description copied from interface: PrimitiveDistance
    
    Computes the distance between two given DatabaseObjects according to this distance function.
    
    Specified by:
    
    distance in interface NumberVectorDistance<NumberVector>
    
    Specified by:
    
    distance in interface PrimitiveDistance<NumberVector>
    
    Parameters:
    
    fv1 - first DatabaseObject
    
    fv2 - second DatabaseObject
    
    Returns:
    
    the distance between two given DatabaseObjects according to this distance function
  - minDist
```
public double minDist(SpatialComparable mbr1,
                      SpatialComparable mbr2)
```
    Description copied from interface: SpatialPrimitiveDistance
    
    Computes the distance between the two given MBRs according to this distance function.
    
    Specified by:
    
    minDist in interface SpatialPrimitiveDistance<NumberVector>
    
    Parameters:
    
    mbr1 - the first MBR object
    
    mbr2 - the second MBR object
    
    Returns:
    
    the distance between the two given MBRs according to this distance function
  - similarity
```
public double similarity(NumberVector o1,
                         NumberVector o2)
```
    Description copied from interface: PrimitiveSimilarity
    
    Computes the similarity between two given DatabaseObjects according to this similarity function.
    
    Specified by:
    
    similarity in interface PrimitiveSimilarity<NumberVector>
    
    Parameters:
    
    o1 - first DatabaseObject
    
    o2 - second DatabaseObject
    
    Returns:
    
    the similarity between two given DatabaseObjects according to this similarity function
  - isSymmetric
```
public boolean isSymmetric()
```
    Description copied from interface: Distance
    
    Is this function symmetric?
    
    Specified by:
    
    isSymmetric in interface Distance<NumberVector>
    
    Specified by:
    
    isSymmetric in interface Similarity<NumberVector>
    
    Returns:
    
    true when symmetric
  - isMetric
```
public boolean isMetric()
```
    Description copied from interface: Distance
    
    Is this distance function metric (satisfy the triangle inequality)
    
    Specified by:
    
    isMetric in interface Distance<NumberVector>
    
    Returns:
    
    true when metric.
  - instantiate
```
public <T extends NumberVector> SpatialPrimitiveDistanceSimilarityQuery<T> instantiate(Relation<T> database)
```
    Description copied from interface: Distance
    
    Instantiate with a database to get the actual distance query.
    
    Specified by:
    
    instantiate in interface Distance<NumberVector>
    
    Specified by:
    
    instantiate in interface PrimitiveDistance<NumberVector>
    
    Specified by:
    
    instantiate in interface PrimitiveSimilarity<NumberVector>
    
    Specified by:
    
    instantiate in interface Similarity<NumberVector>
    
    Specified by:
    
    instantiate in interface SpatialPrimitiveDistance<NumberVector>
    
    Parameters:
    
    database - The representation to use
    
    Returns:
    
    Actual distance query.
  - getInputTypeRestriction
```
public SimpleTypeInformation<? super NumberVector> getInputTypeRestriction()
```
    Description copied from interface: Distance
    
    Get the input data type of the function.
    
    Specified by:
    
    getInputTypeRestriction in interface Distance<NumberVector>
    
    Specified by:
    
    getInputTypeRestriction in interface PrimitiveDistance<NumberVector>
    
    Specified by:
    
    getInputTypeRestriction in interface Similarity<NumberVector>
    
    Overrides:
    
    getInputTypeRestriction in class AbstractNumberVectorDistance
    
    Returns:
    
    Type restriction
  - equals
```
public boolean equals(java.lang.Object obj)
```
    Overrides:
    
    equals in class java.lang.Object
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class java.lang.Object

Modifier and Type	Field	Description
`private static java.lang.String`	`NON_NEGATIVE`	Assertion error message.
`static HellingerDistance`	`STATIC`	Static instance.

Class HellingerDistance

Nested Class Summary

Field Summary

Constructor Summary

Method Summary

Methods inherited from class elki.distance.AbstractNumberVectorDistance

Methods inherited from class java.lang.Object

Methods inherited from interface elki.distance.Distance

Field Detail

STATIC

NON_NEGATIVE

Constructor Detail

HellingerDistance

Method Detail

distance

minDist

similarity

isSymmetric

isMetric

instantiate

getInputTypeRestriction

equals

hashCode