Package elki.distance.set
Class JaccardSimilarityDistance
- java.lang.Object
-
- elki.distance.set.AbstractSetDistance<FeatureVector<?>>
-
- elki.distance.set.JaccardSimilarityDistance
-
- All Implemented Interfaces:
Distance<FeatureVector<?>>, NumberVectorDistance<FeatureVector<?>>, PrimitiveDistance<FeatureVector<?>>, NormalizedPrimitiveSimilarity<FeatureVector<?>>, NormalizedSimilarity<FeatureVector<?>>, PrimitiveSimilarity<FeatureVector<?>>, Similarity<FeatureVector<?>>
@Reference(authors="P. Jaccard", title="Distribution de la florine alpine dans la Bassin de Dranses et dans quelques regiones voisines", booktitle="Bulletin del la Soci\u00e9t\u00e9 Vaudoise des Sciences Naturelles", url="http://data.rero.ch/01-R241574160", bibkey="journals/misc/Jaccard1902")
public class JaccardSimilarityDistance extends AbstractSetDistance<FeatureVector<?>> implements NormalizedPrimitiveSimilarity<FeatureVector<?>>, NumberVectorDistance<FeatureVector<?>>, PrimitiveDistance<FeatureVector<?>>
A flexible extension of Jaccard similarity to non-binary vectors.
The Jaccard coefficient is commonly defined as \(\frac{|A\cap B|}{|A\cup B|}\).
We can extend this definition to non-binary vectors as follows: \(\tfrac{|\{i\mid a_i = b_i \neq 0\}|}{|\{i\mid a_i \neq 0 \vee b_i \neq 0\}|}\), that is, the number of positions where both vectors hold the same non-zero value, divided by the number of positions where at least one vector is non-zero.
For binary vectors, both definitions yield the same quantity. However, the extended version is more useful for categorical data.
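The following is a minimal sketch of one reading of this extension, not the ELKI implementation: the numerator counts positions where both vectors hold the same non-zero value, and the denominator counts positions where at least one value is non-zero. This is the reading under which binary vectors give the ordinary set-based coefficient. The class name `ExtendedJaccardSketch`, the use of plain `double[]` arrays, the equal-length requirement, the value returned for two all-zero vectors, and the definition of the distance as one minus the similarity are all assumptions made for illustration.

```java
// Hypothetical helper class, not part of ELKI.
final class ExtendedJaccardSketch {
  /**
   * Extended Jaccard similarity: matching non-zero positions divided by
   * positions where at least one of the two values is non-zero.
   */
  static double similarity(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("Vectors must have the same length");
    }
    int intersection = 0, union = 0;
    for (int i = 0; i < a.length; i++) {
      // Positions where both values are zero contribute to neither count.
      if (a[i] != 0. || b[i] != 0.) {
        union++;
        if (a[i] == b[i]) {
          intersection++;
        }
      }
    }
    // Assumption of this sketch: two all-zero vectors count as identical.
    return union == 0 ? 1.0 : intersection / (double) union;
  }

  /** Assumption of this sketch: the distance is the complement of the similarity. */
  static double distance(double[] a, double[] b) {
    return 1.0 - similarity(a, b);
  }

  public static void main(String[] args) {
    // Non-binary example: 2 matching non-zero positions, 4 non-zero positions overall.
    double[] a = { 1, 0, 2, 3, 0 };
    double[] b = { 1, 0, 2, 4, 5 };
    System.out.println(similarity(a, b)); // prints 0.5

    // Binary example: sets {0, 1} and {0, 2}, so |A ∩ B| = 1 and |A ∪ B| = 3.
    double[] x = { 1, 1, 0, 0 };
    double[] y = { 1, 0, 1, 0 };
    System.out.println(similarity(x, y));
  }
}
```

In the binary example, the sketch returns 1/3, the same value as the set-based definition applied to the index sets of the non-zero entries.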
Reference:
P. Jaccard
Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines
Bulletin de la Société Vaudoise des Sciences Naturelles
- Since:
- 0.6.0
- Author:
- Erich Schubert
-
-
Field Summary
-
Fields inherited from class elki.distance.set.AbstractSetDistance
DOUBLE_NULL, INTEGER_NULL, STRING_NULL
-
-
Constructor Summary
Constructor and Description
JaccardSimilarityDistance()
Constructor.
-
Method Summary
Modifier and Type, Method, Description
double distance(FeatureVector<?> o1, FeatureVector<?> o2)
Computes the distance between two given DatabaseObjects according to this distance function.
double distance(NumberVector o1, NumberVector o2)
Computes the distance between two given vectors according to this distance function.
boolean equals(java.lang.Object obj)
SimpleTypeInformation<? super FeatureVector<?>> getInputTypeRestriction()
Get the input data type of the function.
int hashCode()
<T extends FeatureVector<?>> DistanceSimilarityQuery<T> instantiate(Relation<T> relation)
Instantiate with a representation to get the actual similarity query.
boolean isMetric()
Is this distance function metric (i.e., does it satisfy the triangle inequality)?
boolean isSymmetric()
Is this function symmetric?
double similarity(FeatureVector<?> o1, FeatureVector<?> o2)
Computes the similarity between two given DatabaseObjects according to this similarity function.
static double similarityNumberVector(NumberVector o1, NumberVector o2)
Compute Jaccard similarity for two number vectors.
-
Methods inherited from class elki.distance.set.AbstractSetDistance
isNull
-
-
-
-
Method Detail
-
similarity
public double similarity(FeatureVector<?> o1, FeatureVector<?> o2)
Description copied from interface: PrimitiveSimilarity
Computes the similarity between two given DatabaseObjects according to this similarity function.
- Specified by:
similarity in interface PrimitiveSimilarity<FeatureVector<?>>
- Parameters:
o1 - first DatabaseObject
o2 - second DatabaseObject
- Returns:
- the similarity between two given DatabaseObjects according to this similarity function
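For categorical attributes, the same idea can be sketched with string-valued features. This is an illustration only, not the ELKI code: the class name `CategoricalJaccardSketch`, the use of `String[]` in place of `FeatureVector<?>`, and the treatment of Java `null` as an absent value are assumptions. The inherited `isNull` method and the `DOUBLE_NULL`, `INTEGER_NULL`, and `STRING_NULL` fields suggest that the class has its own notion of a null value, whose details are not shown on this page.

```java
import java.util.Objects;

// Hypothetical helper class, not part of ELKI.
final class CategoricalJaccardSketch {
  /**
   * Jaccard-style similarity for categorical features: attributes present in
   * both vectors with equal values, divided by attributes present in at
   * least one vector. A null entry is treated as "absent" (sketch assumption).
   */
  static double similarity(String[] a, String[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("Vectors must have the same length");
    }
    int intersection = 0, union = 0;
    for (int i = 0; i < a.length; i++) {
      if (a[i] == null && b[i] == null) {
        continue; // absent in both: ignored, like a double zero in the binary case
      }
      union++;
      // At least one side is non-null here, so equality implies both are present.
      if (Objects.equals(a[i], b[i])) {
        intersection++;
      }
    }
    // Assumption of this sketch: two entirely absent vectors count as identical.
    return union == 0 ? 1.0 : intersection / (double) union;
  }

  public static void main(String[] args) {
    String[] a = { "red", null, "round", "small" };
    String[] b = { "red", null, "square", null };
    // One matching attribute ("red") out of three attributes present in either vector.
    System.out.println(similarity(a, b));
  }
}
```

A binary encoding would only record whether an attribute is present, whereas this variant also requires the values to agree, which is why the extension suits categorical data.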
-
similarityNumberVector
public static double similarityNumberVector(NumberVector o1, NumberVector o2)
Compute Jaccard similarity for two number vectors.
- Parameters:
o1 - First vector
o2 - Second vector
- Returns:
- Jaccard similarity
-
distance
public double distance(FeatureVector<?> o1, FeatureVector<?> o2)
Description copied from interface: PrimitiveDistance
Computes the distance between two given DatabaseObjects according to this distance function.
- Specified by:
distance in interface PrimitiveDistance<FeatureVector<?>>
- Parameters:
o1 - first DatabaseObject
o2 - second DatabaseObject
- Returns:
- the distance between two given DatabaseObjects according to this distance function
-
distance
public double distance(NumberVector o1, NumberVector o2)
Description copied from interface: NumberVectorDistance
Computes the distance between two given vectors according to this distance function.
- Specified by:
distance in interface NumberVectorDistance<FeatureVector<?>>
- Parameters:
o1 - first vector
o2 - second vector
- Returns:
- the distance between two given vectors according to this distance function
-
isSymmetric
public boolean isSymmetric()
Description copied from interface: Similarity
Is this function symmetric?
- Specified by:
isSymmetric in interface Distance<FeatureVector<?>>
- Specified by:
isSymmetric in interface Similarity<FeatureVector<?>>
- Returns:
true when symmetric
-
isMetric
public boolean isMetric()
Description copied from interface: Distance
Is this distance function metric (i.e., does it satisfy the triangle inequality)?
- Specified by:
isMetric in interface Distance<FeatureVector<?>>
- Returns:
true when metric.
-
getInputTypeRestriction
public SimpleTypeInformation<? super FeatureVector<?>> getInputTypeRestriction()
Description copied from interface: Similarity
Get the input data type of the function.
- Specified by:
getInputTypeRestriction in interface Distance<FeatureVector<?>>
- Specified by:
getInputTypeRestriction in interface PrimitiveDistance<FeatureVector<?>>
- Specified by:
getInputTypeRestriction in interface Similarity<FeatureVector<?>>
- Returns:
- Type restriction
-
instantiate
public <T extends FeatureVector<?>> DistanceSimilarityQuery<T> instantiate(Relation<T> relation)
Description copied from interface: Similarity
Instantiate with a representation to get the actual similarity query.
- Specified by:
instantiate in interface Distance<FeatureVector<?>>
- Specified by:
instantiate in interface PrimitiveDistance<FeatureVector<?>>
- Specified by:
instantiate in interface PrimitiveSimilarity<FeatureVector<?>>
- Specified by:
instantiate in interface Similarity<FeatureVector<?>>
- Parameters:
relation - Representation to use
- Returns:
- Actual distance query.
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
-