Package elki.index.preprocessed.knn
Class NNDescent<O>
- java.lang.Object
-
- elki.index.preprocessed.knn.AbstractMaterializeKNNPreprocessor<O>
-
- elki.index.preprocessed.knn.NNDescent<O>
-
- Type Parameters:
O
- Object type
@Reference(authors="W. Dong, C. Moses, K. Li", title="Efficient k-nearest neighbor graph construction for generic similarity measures", booktitle="Proc. 20th Int. Conf. on World Wide Web (WWW\'11)", url="https://doi.org/10.1145/1963405.1963487", bibkey="DBLP:conf/www/DongCL11") public class NNDescent<O> extends AbstractMaterializeKNNPreprocessor<O>
NN-descent (also known as KNNGraph) is an approximate nearest neighbor search algorithm beginning with a random sample, then iteratively refining this sample until.Reference:
W. Dong and C. Moses and K. Li
Efficient k-nearest neighbor graph construction for generic similarity measures
Proc. 20th Int. Conf. on World Wide Web (WWW'11)TODO: collect and log some query statistics.
- Since:
- 0.7.5
- Author:
- Evelyn Kirner
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
NNDescent.Factory<O>
Index factory.
-
Field Summary
Fields Modifier and Type Field Description private double
delta
early termination parameterprivate int
iterations
maximum number of iterationsprivate static Logging
LOG
Loggerprivate boolean
noInitialNeighbors
Do not use initial neighborsprivate java.lang.String
prefix
Log prefix.private double
rho
sample rateprivate RandomFactory
rnd
Random generatorprivate WritableDataStore<KNNHeap>
store
store for neighbors-
Fields inherited from class elki.index.preprocessed.knn.AbstractMaterializeKNNPreprocessor
distance, distanceQuery, k, relation, storage
-
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description private boolean
add(DBIDRef cur, DBIDRef cand, double distance)
Add cand to cur's heap neighbors with distanceprivate void
addpair(WritableDataStore<HashSetModifiableDBIDs> newNeighbors, DBIDRef o1, DBIDRef o2)
private void
boundSize(HashSetModifiableDBIDs set, int items)
Bound the size of a set by random sampling.private void
clearAll(DBIDs ids, WritableDataStore<HashSetModifiableDBIDs> sets)
Clear (but reuse) all sets in the given storage.protected Logging
getLogger()
Get the classes static logger.KNNSearcher<O>
kNNByObject(DistanceQuery<O> distanceQuery, int maxk, int flags)
Get a KNN query object for the given distance query and k.protected void
preprocess()
Perform the preprocessing step.private int
processNewNeighbors(WritableDataStore<HashSetModifiableDBIDs> flag, HashSetModifiableDBIDs newFwd, HashSetModifiableDBIDs oldFwd, HashSetModifiableDBIDs newRev, HashSetModifiableDBIDs oldRev)
Process new neighbors.private void
reverse(WritableDataStore<HashSetModifiableDBIDs> sampleNewHash, WritableDataStore<HashSetModifiableDBIDs> newReverseNeighbors, WritableDataStore<HashSetModifiableDBIDs> oldReverseNeighbors)
calculates new and old neighbors for databaseprivate int
sampleNew(DBIDs ids, WritableDataStore<HashSetModifiableDBIDs> sampleNewNeighbors, WritableDataStore<HashSetModifiableDBIDs> newNeighborHash, int items)
samples newNeighbors for every object-
Methods inherited from class elki.index.preprocessed.knn.AbstractMaterializeKNNPreprocessor
createStorage, get, getDistanceQuery, getK, initialize, kNNByDBID
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.index.Index
logStatistics
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Logger
-
prefix
private java.lang.String prefix
Log prefix.
-
rnd
private final RandomFactory rnd
Random generator
-
delta
private double delta
early termination parameter
-
rho
private double rho
sample rate
-
iterations
private int iterations
maximum number of iterations
-
noInitialNeighbors
private boolean noInitialNeighbors
Do not use initial neighbors
-
store
private WritableDataStore<KNNHeap> store
store for neighbors
-
-
Constructor Detail
-
NNDescent
public NNDescent(Relation<O> relation, Distance<? super O> distance, int k, RandomFactory rnd, double delta, double rho, boolean noInitialNeighbors, int iterations)
Constructor.- Parameters:
relation
- Relation to indexdistance
- distance functionk
- krnd
- Random generatordelta
- Delta thresholdrho
- Rho thresholdnoInitialNeighbors
- Do not use initial neighborsiterations
- Maximum number of iterations
-
-
Method Detail
-
preprocess
protected void preprocess()
Description copied from class:AbstractMaterializeKNNPreprocessor
Perform the preprocessing step.- Specified by:
preprocess
in classAbstractMaterializeKNNPreprocessor<O>
-
clearAll
private void clearAll(DBIDs ids, WritableDataStore<HashSetModifiableDBIDs> sets)
Clear (but reuse) all sets in the given storage.- Parameters:
ids
- Ids to processsets
- Sets to clear
-
boundSize
private void boundSize(HashSetModifiableDBIDs set, int items)
Bound the size of a set by random sampling.- Parameters:
set
- Set to processitems
- Maximum size
-
processNewNeighbors
private int processNewNeighbors(WritableDataStore<HashSetModifiableDBIDs> flag, HashSetModifiableDBIDs newFwd, HashSetModifiableDBIDs oldFwd, HashSetModifiableDBIDs newRev, HashSetModifiableDBIDs oldRev)
Process new neighbors. This is a complex join, because we do not need to join old neighbors with old neighbors, and we have forward- and reverse neighbors each.- Parameters:
flag
- Flags to mark new neighbors.newFwd
- New forward neighborsoldFwd
- Old forward neighborsnewRev
- New reverse neighborsoldRev
- Old reverse neighbors- Returns:
- Number of new neighbors
-
add
private boolean add(DBIDRef cur, DBIDRef cand, double distance)
Add cand to cur's heap neighbors with distance- Parameters:
cur
- Current objectcand
- Neighbor candidatedistance
- Distance- Returns:
true
if it was a new neighbor.
-
addpair
private void addpair(WritableDataStore<HashSetModifiableDBIDs> newNeighbors, DBIDRef o1, DBIDRef o2)
-
sampleNew
private int sampleNew(DBIDs ids, WritableDataStore<HashSetModifiableDBIDs> sampleNewNeighbors, WritableDataStore<HashSetModifiableDBIDs> newNeighborHash, int items)
samples newNeighbors for every object- Parameters:
ids
- All idssampleNewNeighbors
- Output of sampled new neighborsnewNeighborHash
- - new neighbors for every objectitems
- Number of items to collect- Returns:
- Number of new neighbors
-
reverse
private void reverse(WritableDataStore<HashSetModifiableDBIDs> sampleNewHash, WritableDataStore<HashSetModifiableDBIDs> newReverseNeighbors, WritableDataStore<HashSetModifiableDBIDs> oldReverseNeighbors)
calculates new and old neighbors for database- Parameters:
sampleNewHash
- new neighbors for every objectnewReverseNeighbors
- new reverse neighborsoldReverseNeighbors
- old reverse neighbors
-
getLogger
protected Logging getLogger()
Description copied from class:AbstractMaterializeKNNPreprocessor
Get the classes static logger.- Specified by:
getLogger
in classAbstractMaterializeKNNPreprocessor<O>
- Returns:
- Logger
-
kNNByObject
public KNNSearcher<O> kNNByObject(DistanceQuery<O> distanceQuery, int maxk, int flags)
Description copied from interface:KNNIndex
Get a KNN query object for the given distance query and k.This function MAY return null, when the given distance is not supported!
- Specified by:
kNNByObject
in interfaceKNNIndex<O>
- Overrides:
kNNByObject
in classAbstractMaterializeKNNPreprocessor<O>
- Parameters:
distanceQuery
- Distance querymaxk
- Maximum value of kflags
- Hints for the optimizer- Returns:
- KNN Query object or
null
-
-