Package elki.distance.set
Class JaccardSimilarityDistance
- java.lang.Object
  - elki.distance.set.AbstractSetDistance<FeatureVector<?>>
    - elki.distance.set.JaccardSimilarityDistance
- All Implemented Interfaces:
Distance<FeatureVector<?>>, NumberVectorDistance<FeatureVector<?>>, PrimitiveDistance<FeatureVector<?>>, NormalizedPrimitiveSimilarity<FeatureVector<?>>, NormalizedSimilarity<FeatureVector<?>>, PrimitiveSimilarity<FeatureVector<?>>, Similarity<FeatureVector<?>>
@Reference(authors="P. Jaccard", title="Distribution de la florine alpine dans la Bassin de Dranses et dans quelques regiones voisines", booktitle="Bulletin del la Soci\u00e9t\u00e9 Vaudoise des Sciences Naturelles", url="http://data.rero.ch/01-R241574160", bibkey="journals/misc/Jaccard1902")
public class JaccardSimilarityDistance extends AbstractSetDistance<FeatureVector<?>> implements NormalizedPrimitiveSimilarity<FeatureVector<?>>, NumberVectorDistance<FeatureVector<?>>, PrimitiveDistance<FeatureVector<?>>
A flexible extension of Jaccard similarity to non-binary vectors.
The Jaccard coefficient is commonly defined as \(\frac{|A\cap B|}{|A\cup B|}\).
We can extend this definition to non-binary vectors as follows: \(\tfrac{|\{i\mid a_i = b_i \neq 0\}|}{|\{i\mid a_i \neq 0 \vee b_i \neq 0\}|}\)
For binary vectors, this is the same quantity: positions where both entries are 1 form the intersection, and positions where at least one entry is 1 form the union. However, this version is more useful for categorical data, because it also counts agreement on any shared non-zero value.
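The extended definition can be sketched in plain Java on double arrays. This is a minimal illustration, not the ELKI implementation: it reads the formula as "positions where both entries agree and are non-zero, divided by positions where at least one entry is non-zero", which is the reading under which binary vectors reproduce the classical coefficient. The class name `JaccardSketch`, the equal-length requirement, and the value returned for two all-zero vectors are assumptions of this sketch.

```java
final class JaccardSketch {
  /**
   * Generalized Jaccard similarity on two equal-length vectors.
   * Zero is treated as "attribute absent" (assumption of this sketch).
   */
  static double similarity(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("Vectors must have the same length.");
    }
    int intersection = 0, union = 0;
    for (int i = 0; i < a.length; i++) {
      if (a[i] != 0.0 || b[i] != 0.0) {
        union++; // at least one vector has this attribute
        if (a[i] == b[i]) {
          intersection++; // both have it, with the same value
        }
      }
    }
    // Convention of this sketch only: two all-zero vectors count as identical.
    return union == 0 ? 1.0 : intersection / (double) union;
  }

  public static void main(String[] args) {
    double[] a = { 1, 0, 2, 3, 0 };
    double[] b = { 1, 2, 2, 0, 0 };
    // Two agreeing non-zero positions out of four occupied positions.
    System.out.println(similarity(a, b)); // prints 0.5
  }
}
```

With the binary vectors `{1, 1, 0, 0}` and `{1, 0, 1, 0}`, the same method yields one shared position out of three occupied positions, which is the classical Jaccard coefficient of the sets {0, 1} and {0, 2}.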
Reference:
P. Jaccard
Distribution de la flore alpine dans le Bassin des Dranses et dans quelques régions voisines
Bulletin de la Société Vaudoise des Sciences Naturelles
- Since:
- 0.6.0
- Author:
- Erich Schubert
-
-
Field Summary
-
Fields inherited from class elki.distance.set.AbstractSetDistance
DOUBLE_NULL, INTEGER_NULL, STRING_NULL
-
-
Constructor Summary
Constructor | Description
JaccardSimilarityDistance() | Constructor.
-
Method Summary
Modifier and Type | Method | Description
double | distance(FeatureVector<?> o1, FeatureVector<?> o2) | Computes the distance between two given DatabaseObjects according to this distance function.
double | distance(NumberVector o1, NumberVector o2) | Computes the distance between two given vectors according to this distance function.
boolean | equals(java.lang.Object obj) |
SimpleTypeInformation<? super FeatureVector<?>> | getInputTypeRestriction() | Get the input data type of the function.
int | hashCode() |
<T extends FeatureVector<?>> DistanceSimilarityQuery<T> | instantiate(Relation<T> relation) | Instantiate with a representation to get the actual similarity query.
boolean | isMetric() | Is this distance function metric (does it satisfy the triangle inequality)?
boolean | isSymmetric() | Is this function symmetric?
double | similarity(FeatureVector<?> o1, FeatureVector<?> o2) | Computes the similarity between two given DatabaseObjects according to this similarity function.
static double | similarityNumberVector(NumberVector o1, NumberVector o2) | Compute Jaccard similarity for two number vectors.
-
Methods inherited from class elki.distance.set.AbstractSetDistance
isNull
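To illustrate how a null check of this kind supports categorical data, here is a hedged sketch on `Object` arrays. The names `DOUBLE_NULL`, `INTEGER_NULL`, `STRING_NULL` and `isNull` come from the summaries above, but their values and behavior are not shown on this page. The sketch assumes that numeric zero, the empty string, and `null` all mean "attribute absent". `CategoricalJaccardSketch` and `isAbsent` are hypothetical names, not ELKI API.

```java
final class CategoricalJaccardSketch {
  /** Assumed null markers: null, empty string, Double 0.0, Integer 0. */
  static boolean isAbsent(Object v) {
    return v == null || "".equals(v)
        || Double.valueOf(0.0).equals(v) || Integer.valueOf(0).equals(v);
  }

  /** Share of occupied positions on which both records carry the same value. */
  static double similarity(Object[] a, Object[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("Records must have the same length.");
    }
    int intersection = 0, union = 0;
    for (int i = 0; i < a.length; i++) {
      boolean absentA = isAbsent(a[i]), absentB = isAbsent(b[i]);
      if (absentA && absentB) {
        continue; // attribute missing in both records: ignored entirely
      }
      union++;
      if (!absentA && !absentB && a[i].equals(b[i])) {
        intersection++;
      }
    }
    // Convention of this sketch only: two empty records count as identical.
    return union == 0 ? 1.0 : intersection / (double) union;
  }

  public static void main(String[] args) {
    Object[] a = { "red", "", "small", "round" };
    Object[] b = { "red", "metal", "large", "" };
    // Only "red" matches; all four positions are occupied in at least one record.
    System.out.println(similarity(a, b)); // prints 0.25
  }
}
```

Treating "absent in both" as irrelevant rather than as agreement is what distinguishes a Jaccard-style measure from simple matching: two records do not become more similar just because both lack many attributes.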
-
-
-
-
Method Detail
-
similarity
public double similarity(FeatureVector<?> o1, FeatureVector<?> o2)
Description copied from interface: PrimitiveSimilarity
Computes the similarity between two given DatabaseObjects according to this similarity function.
- Specified by:
- similarity in interface PrimitiveSimilarity<FeatureVector<?>>
- Parameters:
- o1 - first DatabaseObject
- o2 - second DatabaseObject
- Returns:
- the similarity between two given DatabaseObjects according to this similarity function
-
similarityNumberVector
public static double similarityNumberVector(NumberVector o1, NumberVector o2)
Compute Jaccard similarity for two number vectors.
- Parameters:
- o1 - First vector
- o2 - Second vector
- Returns:
- Jaccard similarity
-
distance
public double distance(FeatureVector<?> o1, FeatureVector<?> o2)
Description copied from interface: PrimitiveDistance
Computes the distance between two given DatabaseObjects according to this distance function.
- Specified by:
- distance in interface PrimitiveDistance<FeatureVector<?>>
- Parameters:
- o1 - first DatabaseObject
- o2 - second DatabaseObject
- Returns:
- the distance between two given DatabaseObjects according to this distance function
-
distance
public double distance(NumberVector o1, NumberVector o2)
Description copied from interface: NumberVectorDistance
Computes the distance between two given vectors according to this distance function.
- Specified by:
- distance in interface NumberVectorDistance<FeatureVector<?>>
- Parameters:
- o1 - first vector
- o2 - second vector
- Returns:
- the distance between two given vectors according to this distance function
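For binary number vectors, the distance can be pictured as the classical Jaccard distance over the sets of non-zero indices. The sketch below assumes the distance is the complement of the similarity, that is, 1 minus the Jaccard coefficient. This page does not state that relationship explicitly, so treat it as an assumption. `JaccardDistanceSketch` and `support` are hypothetical names, and the result for two all-zero vectors is a convention of this sketch.

```java
import java.util.HashSet;
import java.util.Set;

final class JaccardDistanceSketch {
  /** Indices at which the vector is non-zero. */
  static Set<Integer> support(double[] v) {
    Set<Integer> s = new HashSet<>();
    for (int i = 0; i < v.length; i++) {
      if (v[i] != 0.0) {
        s.add(i);
      }
    }
    return s;
  }

  /** Jaccard distance 1 - |A n B| / |A u B| on the supports of two binary vectors. */
  static double distance(double[] a, double[] b) {
    Set<Integer> sa = support(a), sb = support(b);
    Set<Integer> union = new HashSet<>(sa);
    union.addAll(sb);
    if (union.isEmpty()) {
      return 0.0; // convention of this sketch: two empty sets are identical
    }
    Set<Integer> intersection = new HashSet<>(sa);
    intersection.retainAll(sb);
    return 1.0 - intersection.size() / (double) union.size();
  }

  public static void main(String[] args) {
    double[] a = { 1, 1, 0, 0 };
    double[] b = { 1, 0, 1, 0 };
    System.out.println(distance(a, b)); // one shared index out of three
    System.out.println(distance(a, a)); // identical vectors
  }
}
```

This set-based form only covers the binary case. For non-binary or categorical vectors, two positions that are both occupied but hold different values would also have to count against the similarity.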
-
isSymmetric
public boolean isSymmetric()
Description copied from interface: Similarity
Is this function symmetric?
- Specified by:
- isSymmetric in interface Distance<FeatureVector<?>>
- Specified by:
- isSymmetric in interface Similarity<FeatureVector<?>>
- Returns:
- true when symmetric
-
isMetric
public boolean isMetric()
Description copied from interface: Distance
Is this distance function metric (does it satisfy the triangle inequality)?
- Specified by:
- isMetric in interface Distance<FeatureVector<?>>
- Returns:
- true when metric.
-
getInputTypeRestriction
public SimpleTypeInformation<? super FeatureVector<?>> getInputTypeRestriction()
Description copied from interface: Similarity
Get the input data type of the function.
- Specified by:
- getInputTypeRestriction in interface Distance<FeatureVector<?>>
- Specified by:
- getInputTypeRestriction in interface PrimitiveDistance<FeatureVector<?>>
- Specified by:
- getInputTypeRestriction in interface Similarity<FeatureVector<?>>
- Returns:
- Type restriction
-
instantiate
public <T extends FeatureVector<?>> DistanceSimilarityQuery<T> instantiate(Relation<T> relation)
Description copied from interface: Similarity
Instantiate with a representation to get the actual similarity query.
- Specified by:
- instantiate in interface Distance<FeatureVector<?>>
- Specified by:
- instantiate in interface PrimitiveDistance<FeatureVector<?>>
- Specified by:
- instantiate in interface PrimitiveSimilarity<FeatureVector<?>>
- Specified by:
- instantiate in interface Similarity<FeatureVector<?>>
- Parameters:
- relation - Representation to use
- Returns:
- Actual distance query.
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
- equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
- hashCode in class java.lang.Object
-
-