Package elki.distance.strings
Class LevenshteinDistance
- java.lang.Object
-
- elki.distance.strings.LevenshteinDistance
-
- All Implemented Interfaces:
Distance<java.lang.String>,PrimitiveDistance<java.lang.String>
@Description("Levenshtein distance.") @Reference(authors="V. I. Levenshtein", title="Binary codes capable of correcting deletions, insertions and reversals", booktitle="Soviet physics doklady 10", bibkey="journals/misc/Levenshtein66") public class LevenshteinDistance extends java.lang.Object implements PrimitiveDistance<java.lang.String>
Classic Levenshtein distance on strings.Reference:
V. I. Levenshtein
Binary codes capable of correcting deletions, insertions and reversals
Soviet physics doklady 10TODO: add case insensitive flag.
TODO: add an API that can stop early at a maximum distance
- Since:
- 0.6.0
- Author:
- Felix Stahlberg, Erich Schubert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static classLevenshteinDistance.ParParameterization class.
-
Field Summary
Fields Modifier and Type Field Description static LevenshteinDistanceSTATIC_SENSITIVEStatic instance, case sensitive.protected static SimpleTypeInformation<? super java.lang.String>TYPEInput data type.
-
Constructor Summary
Constructors Constructor Description LevenshteinDistance()Deprecated.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description doubledistance(java.lang.String o1, java.lang.String o2)Computes the distance between two given DatabaseObjects according to this distance function.booleanequals(java.lang.Object obj)SimpleTypeInformation<? super java.lang.String>getInputTypeRestriction()Get the input data type of the function.inthashCode()booleanisMetric()Is this distance function metric (satisfy the triangle inequality)static intlevenshteinDistance(java.lang.String o1, java.lang.String o2)Levenshtein distance for two strings.static intlevenshteinDistance(java.lang.String o1, java.lang.String o2, int prefix, int postfix)Compute the Levenshtein distance, except for prefix and postfix.private static intmin(int a, int b, int c)Three-way integer minimum.private static intpostfixLen(java.lang.String o1, java.lang.String o2, int prefix)Compute the postfix length.private static intprefixLen(java.lang.String o1, java.lang.String o2)Compute the length of the prefix.-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.distance.Distance
isSquared, isSymmetric
-
Methods inherited from interface elki.distance.PrimitiveDistance
instantiate
-
-
-
-
Field Detail
-
STATIC_SENSITIVE
public static final LevenshteinDistance STATIC_SENSITIVE
Static instance, case sensitive.
-
TYPE
protected static final SimpleTypeInformation<? super java.lang.String> TYPE
Input data type.
-
-
Method Detail
-
distance
public double distance(java.lang.String o1, java.lang.String o2)Description copied from interface:PrimitiveDistanceComputes the distance between two given DatabaseObjects according to this distance function.- Specified by:
distancein interfacePrimitiveDistance<java.lang.String>- Parameters:
o1- first DatabaseObjecto2- second DatabaseObject- Returns:
- the distance between two given DatabaseObjects according to this distance function
-
levenshteinDistance
public static int levenshteinDistance(java.lang.String o1, java.lang.String o2)Levenshtein distance for two strings.- Parameters:
o1- First stringo2- Second string- Returns:
- Levenshtein distance
-
prefixLen
private static int prefixLen(java.lang.String o1, java.lang.String o2)Compute the length of the prefix.- Parameters:
o1- First stringo2- Second string- Returns:
- Prefix length
-
postfixLen
private static int postfixLen(java.lang.String o1, java.lang.String o2, int prefix)Compute the postfix length.- Parameters:
o1- First objecto2- Second objectprefix- Known prefix length- Returns:
- Postfix length
-
levenshteinDistance
public static int levenshteinDistance(java.lang.String o1, java.lang.String o2, int prefix, int postfix)Compute the Levenshtein distance, except for prefix and postfix.- Parameters:
o1- First objecto2- Second objectprefix- Prefix lengthpostfix- Postfix length- Returns:
- Levenshtein distance
-
min
private static int min(int a, int b, int c)Three-way integer minimum.- Parameters:
a- First valueb- Second valuec- Third value.- Returns:
- Minimum
-
getInputTypeRestriction
public SimpleTypeInformation<? super java.lang.String> getInputTypeRestriction()
Description copied from interface:DistanceGet the input data type of the function.- Specified by:
getInputTypeRestrictionin interfaceDistance<java.lang.String>- Specified by:
getInputTypeRestrictionin interfacePrimitiveDistance<java.lang.String>- Returns:
- Type restriction
-
isMetric
public boolean isMetric()
Description copied from interface:DistanceIs this distance function metric (satisfy the triangle inequality)
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
equalsin classjava.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCodein classjava.lang.Object
-
-