Package elki.index.lsh.hashfamilies
Class CosineHashFunctionFamily
- java.lang.Object
  - elki.index.lsh.hashfamilies.CosineHashFunctionFamily
- All Implemented Interfaces:
LocalitySensitiveHashFunctionFamily<NumberVector>
@Reference(authors="M. S. Charikar",
    title="Similarity estimation techniques from rounding algorithms",
    booktitle="Proc. 34th ACM Symposium on Theory of Computing, STOC\'02",
    url="https://doi.org/10.1145/509907.509965",
    bibkey="DBLP:conf/stoc/Charikar02")
@Reference(authors="M. Henzinger",
    title="Finding near-duplicate web pages: a large-scale evaluation of algorithms",
    booktitle="Proc. 29th ACM Conf. Research and Development in Information Retrieval (SIGIR 2006)",
    url="https://doi.org/10.1145/1148170.1148222",
    bibkey="DBLP:conf/sigir/Henzinger06")
public class CosineHashFunctionFamily
extends java.lang.Object
implements LocalitySensitiveHashFunctionFamily<NumberVector>
Hash function family to use with cosine distance, using simplified hash functions whose projection entries are drawn only from ±1 instead of from a Gaussian distribution.
References:
M. S. Charikar
Similarity estimation techniques from rounding algorithms
Proc. 34th ACM Symposium on Theory of Computing, STOC'02
M. Henzinger
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Proc. 29th ACM Conf. Research and Development in Information Retrieval (SIGIR 2006)
- Since:
- 0.7.0
- Author:
- Evgeniy Faerman
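The idea described above can be sketched in plain Java without ELKI. This is only an illustration under stated assumptions, not the ELKI implementation: the class SignProjectionSketch and its methods are hypothetical names, and the sketch assumes that each of the k hash bits is the sign of a dot product with a random ±1 vector, as in the sign random projections of the cited references.

```java
import java.util.Random;

// Hypothetical illustration; not part of ELKI.
public class SignProjectionSketch {
  /** Draw a k x dim matrix with entries from {+1, -1}. */
  public static int[][] randomSigns(int k, int dim, Random rnd) {
    int[][] proj = new int[k][dim];
    for (int i = 0; i < k; i++) {
      for (int j = 0; j < dim; j++) {
        proj[i][j] = rnd.nextBoolean() ? 1 : -1;
      }
    }
    return proj;
  }

  /** Pack the signs of the k projections into one int (requires k <= 32). */
  public static int hash(int[][] proj, double[] x) {
    int h = 0;
    for (int i = 0; i < proj.length; i++) {
      double dot = 0;
      for (int j = 0; j < x.length; j++) {
        dot += proj[i][j] * x[j];
      }
      h = (h << 1) | (dot > 0 ? 1 : 0);
    }
    return h;
  }

  public static void main(String[] args) {
    int[][] proj = randomSigns(16, 4, new Random(42L));
    double[] a = { 1.0, 2.0, 0.5, -1.0 };
    double[] b = { 2.0, 4.0, 1.0, -2.0 };     // same direction as a
    double[] neg = { -1.0, -2.0, -0.5, 1.0 }; // opposite direction
    // Same direction: every projection has the same sign.
    System.out.println(hash(proj, a) == hash(proj, b)); // true
    // Opposite direction: every one of the 16 bits flips.
    System.out.println(Integer.bitCount(hash(proj, a) ^ hash(proj, neg))); // 16
  }
}
```

Drawing from ±1 avoids sampling Gaussian values and lets each projection be computed with additions and subtractions only. Vectors at a small angle agree on most bits, vectors at a large angle disagree on most.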
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | CosineHashFunctionFamily.Par | Parameterization class.
-
Field Summary
Fields
Modifier and Type | Field | Description
private int | k | The number of projections to use for each hash function.
private RandomProjectionFamily | proj | Projection family to use.
-
Constructor Summary
Constructors
Constructor | Description
CosineHashFunctionFamily(int k, RandomFactory random) | Constructor.
-
Method Summary
All Methods
Modifier and Type | Method | Description
java.util.ArrayList<? extends LocalitySensitiveHashFunction<? super NumberVector>> | generateHashFunctions(Relation<? extends NumberVector> relation, int l) | Generate hash functions for the given relation.
TypeInformation | getInputTypeRestriction() | Get the input type information.
boolean | isCompatible(Distance<?> df) | Check whether the given distance function can be accelerated using this hash family.
-
-
-
Field Detail
-
proj
private RandomProjectionFamily proj
Projection family to use.
-
k
private int k
The number of projections to use for each hash function.
-
-
Constructor Detail
-
CosineHashFunctionFamily
public CosineHashFunctionFamily(int k, RandomFactory random)
Constructor.
- Parameters:
k - Number of projections to use.
random - Random factory.
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation getInputTypeRestriction()
Description copied from interface: LocalitySensitiveHashFunctionFamily
Get the input type information.
- Specified by:
getInputTypeRestriction in interface LocalitySensitiveHashFunctionFamily<NumberVector>
- Returns:
- Input type information.
-
generateHashFunctions
public java.util.ArrayList<? extends LocalitySensitiveHashFunction<? super NumberVector>> generateHashFunctions(Relation<? extends NumberVector> relation, int l)
Description copied from interface: LocalitySensitiveHashFunctionFamily
Generate hash functions for the given relation.
- Specified by:
generateHashFunctions in interface LocalitySensitiveHashFunctionFamily<NumberVector>
- Parameters:
relation - Relation to index
l - Number of hash tables to use
- Returns:
- Family of hash functions
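How l hash functions with k projections each might be generated and used can be sketched without ELKI as follows. All names here (LshTablesSketch, SignHash, generate, candidates) are hypothetical, and the sketch assumes one independently drawn hash function per hash table; it does not reproduce the actual ELKI code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeSet;

// Hypothetical illustration; not part of ELKI.
public class LshTablesSketch {
  /** One hash function: k random +-1 projections, signs packed into an int. */
  public static final class SignHash {
    final int[][] proj;

    SignHash(int k, int dim, Random rnd) {
      proj = new int[k][dim];
      for (int[] row : proj) {
        for (int j = 0; j < dim; j++) {
          row[j] = rnd.nextBoolean() ? 1 : -1;
        }
      }
    }

    int hash(double[] x) {
      int h = 0;
      for (int[] row : proj) {
        double dot = 0;
        for (int j = 0; j < x.length; j++) {
          dot += row[j] * x[j];
        }
        h = (h << 1) | (dot > 0 ? 1 : 0);
      }
      return h;
    }
  }

  /** Generate l independent hash functions, one per hash table (k <= 32). */
  public static List<SignHash> generate(int l, int k, int dim, Random rnd) {
    List<SignHash> fs = new ArrayList<>(l);
    for (int i = 0; i < l; i++) {
      fs.add(new SignHash(k, dim, rnd));
    }
    return fs;
  }

  /** Indices of data vectors that share a bucket with the query in any table. */
  public static TreeSet<Integer> candidates(List<SignHash> fs, double[][] data, double[] query) {
    TreeSet<Integer> result = new TreeSet<>();
    for (SignHash f : fs) {
      // A real index would build each table once; it is rebuilt here for brevity.
      Map<Integer, List<Integer>> table = new HashMap<>();
      for (int i = 0; i < data.length; i++) {
        table.computeIfAbsent(f.hash(data[i]), key -> new ArrayList<>()).add(i);
      }
      result.addAll(table.getOrDefault(f.hash(query), new ArrayList<>()));
    }
    return result;
  }

  public static void main(String[] args) {
    double[][] data = { { 1, 2, 0.5 }, { -1, -2, -0.5 }, { 3, 6, 1.5 } };
    List<SignHash> fs = generate(4, 8, 3, new Random(1L));
    // Vectors 0 and 2 point in the query's direction; vector 1 is opposite.
    System.out.println(candidates(fs, data, new double[] { 2, 4, 1 })); // [0, 2]
  }
}
```

In this sketch a larger k makes each bucket more selective, while a larger l gives a true neighbor more chances to share a bucket with the query in at least one table.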
-
isCompatible
public boolean isCompatible(Distance<?> df)
Description copied from interface: LocalitySensitiveHashFunctionFamily
Check whether the given distance function can be accelerated using this hash family.
- Specified by:
isCompatible in interface LocalitySensitiveHashFunctionFamily<NumberVector>
- Parameters:
df - Distance function.
- Returns:
- true when the distance function is compatible with this hash family.
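Why a sign-projection hash fits cosine distance rather than, say, Euclidean distance can be illustrated with a small self-contained sketch. The class ScaleInvarianceSketch and its methods are hypothetical and assume the sign-of-projection hashing described for this class; the sketch is not the ELKI compatibility check itself.

```java
import java.util.Random;

// Hypothetical illustration; not part of ELKI.
public class ScaleInvarianceSketch {
  /** Signs of k random +-1 projections of x, packed into an int (k <= 32). */
  public static int signHash(double[] x, int k, long seed) {
    Random rnd = new Random(seed); // same seed gives the same projections
    int h = 0;
    for (int i = 0; i < k; i++) {
      double dot = 0;
      for (int j = 0; j < x.length; j++) {
        dot += (rnd.nextBoolean() ? 1 : -1) * x[j];
      }
      h = (h << 1) | (dot > 0 ? 1 : 0);
    }
    return h;
  }

  public static double cosineDistance(double[] a, double[] b) {
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return 1 - dot / Math.sqrt(na * nb);
  }

  public static double euclideanDistance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  public static void main(String[] args) {
    double[] a = { 1.0, -2.0, 0.5 };
    double[] b = { 8.0, -16.0, 4.0 }; // b = 8 * a
    // Positive scaling does not change the sign of any projection.
    System.out.println(signHash(a, 16, 7L) == signHash(b, 16, 7L)); // true
    System.out.println(cosineDistance(a, b));    // (near) zero
    System.out.println(euclideanDistance(a, b)); // large
  }
}
```

The hash depends only on the direction of a vector, which is also the only thing cosine distance measures. Two vectors that are far apart in Euclidean distance can therefore still land in the same bucket, so such a family is not a suitable accelerator for Euclidean queries.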