Package elki.distance.strings
Class LevenshteinDistance
- java.lang.Object
-
- elki.distance.strings.LevenshteinDistance
-
- All Implemented Interfaces:
Distance<java.lang.String>
,PrimitiveDistance<java.lang.String>
@Description("Levenshtein distance.") @Reference(authors="V. I. Levenshtein", title="Binary codes capable of correcting deletions, insertions and reversals", booktitle="Soviet physics doklady 10", bibkey="journals/misc/Levenshtein66") public class LevenshteinDistance extends java.lang.Object implements PrimitiveDistance<java.lang.String>
Classic Levenshtein distance on strings.Reference:
V. I. Levenshtein
Binary codes capable of correcting deletions, insertions and reversals
Soviet physics doklady 10TODO: add case insensitive flag.
TODO: add an API that can stop early at a maximum distance
- Since:
- 0.6.0
- Author:
- Felix Stahlberg, Erich Schubert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
LevenshteinDistance.Par
Parameterization class.
-
Field Summary
Fields Modifier and Type Field Description static LevenshteinDistance
STATIC_SENSITIVE
Static instance, case sensitive.protected static SimpleTypeInformation<? super java.lang.String>
TYPE
Input data type.
-
Constructor Summary
Constructors Constructor Description LevenshteinDistance()
Deprecated.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description double
distance(java.lang.String o1, java.lang.String o2)
Computes the distance between two given DatabaseObjects according to this distance function.boolean
equals(java.lang.Object obj)
SimpleTypeInformation<? super java.lang.String>
getInputTypeRestriction()
Get the input data type of the function.int
hashCode()
boolean
isMetric()
Is this distance function metric (satisfy the triangle inequality)static int
levenshteinDistance(java.lang.String o1, java.lang.String o2)
Levenshtein distance for two strings.static int
levenshteinDistance(java.lang.String o1, java.lang.String o2, int prefix, int postfix)
Compute the Levenshtein distance, except for prefix and postfix.private static int
min(int a, int b, int c)
Three-way integer minimum.private static int
postfixLen(java.lang.String o1, java.lang.String o2, int prefix)
Compute the postfix length.private static int
prefixLen(java.lang.String o1, java.lang.String o2)
Compute the length of the prefix.-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.distance.Distance
isSquared, isSymmetric
-
Methods inherited from interface elki.distance.PrimitiveDistance
instantiate
-
-
-
-
Field Detail
-
STATIC_SENSITIVE
public static final LevenshteinDistance STATIC_SENSITIVE
Static instance, case sensitive.
-
TYPE
protected static final SimpleTypeInformation<? super java.lang.String> TYPE
Input data type.
-
-
Method Detail
-
distance
public double distance(java.lang.String o1, java.lang.String o2)
Description copied from interface:PrimitiveDistance
Computes the distance between two given DatabaseObjects according to this distance function.- Specified by:
distance
in interfacePrimitiveDistance<java.lang.String>
- Parameters:
o1
- first DatabaseObjecto2
- second DatabaseObject- Returns:
- the distance between two given DatabaseObjects according to this distance function
-
levenshteinDistance
public static int levenshteinDistance(java.lang.String o1, java.lang.String o2)
Levenshtein distance for two strings.- Parameters:
o1
- First stringo2
- Second string- Returns:
- Levenshtein distance
-
prefixLen
private static int prefixLen(java.lang.String o1, java.lang.String o2)
Compute the length of the prefix.- Parameters:
o1
- First stringo2
- Second string- Returns:
- Prefix length
-
postfixLen
private static int postfixLen(java.lang.String o1, java.lang.String o2, int prefix)
Compute the postfix length.- Parameters:
o1
- First objecto2
- Second objectprefix
- Known prefix length- Returns:
- Postfix length
-
levenshteinDistance
public static int levenshteinDistance(java.lang.String o1, java.lang.String o2, int prefix, int postfix)
Compute the Levenshtein distance, except for prefix and postfix.- Parameters:
o1
- First objecto2
- Second objectprefix
- Prefix lengthpostfix
- Postfix length- Returns:
- Levenshtein distance
-
min
private static int min(int a, int b, int c)
Three-way integer minimum.- Parameters:
a
- First valueb
- Second valuec
- Third value.- Returns:
- Minimum
-
getInputTypeRestriction
public SimpleTypeInformation<? super java.lang.String> getInputTypeRestriction()
Description copied from interface:Distance
Get the input data type of the function.- Specified by:
getInputTypeRestriction
in interfaceDistance<java.lang.String>
- Specified by:
getInputTypeRestriction
in interfacePrimitiveDistance<java.lang.String>
- Returns:
- Type restriction
-
isMetric
public boolean isMetric()
Description copied from interface:Distance
Is this distance function metric (satisfy the triangle inequality)
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
equals
in classjava.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode
in classjava.lang.Object
-
-