de.lmu.ifi.dbs.elki.algorithm.outlier.lof

Class INFLO<O>

• Type Parameters:
O - the type of DatabaseObject the algorithm is applied on
All Implemented Interfaces:
Algorithm, DistanceBasedAlgorithm<O>, OutlierAlgorithm

@Title(value="INFLO: Influenced Outlierness Factor")
@Description(value="Ranking Outliers Using Symmetric Neigborhood Relationship")
@Reference(authors="W. Jin, A. Tung, J. Han, W. Wang",
title="Ranking outliers using symmetric neighborhood relationship",
booktitle="Proc. 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining",
url="https://doi.org/10.1007/11731139_68",
bibkey="DBLP:conf/pakdd/JinTHW06")
@Alias(value="de.lmu.ifi.dbs.elki.algorithm.outlier.INFLO")
public class INFLO<O>
extends AbstractDistanceBasedAlgorithm<O,OutlierResult>
implements OutlierAlgorithm
Influence Outliers using Symmetric Relationship (INFLO) using two-way search, is an outlier detection method based on LOF; but also using the reverse kNN.

Reference:

W. Jin, A. Tung, J. Han, W. Wang
Ranking outliers using symmetric neighborhood relationship
Proc. 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining, 2006.

There is an error in the two-way search algorithm proposed in the article above. It does not correctly compute the RNN, but it will find only those RNN that are in the kNN, making the entire RNN computation redundant, as it uses the kNN + RNN later anyway.

Given the errors in the INFLO paper, as of ELKI 0.8, we do:

• Assume $$IS_k(p) := kNN(p) \cup RkNN(p)$$
• Implement the pruning of two-way search, i.e., use INFLO=1 if $$|kNN(p) \cap RkNN(p)|\geq m\cdot |kNN(p)|$$
Since:
0.3
Author:
Ahmed Hettab, Erich Schubert
• Field Detail

• LOG

private static final Logging LOG
The logger for this class.
• m

private double m
Pruning threshold m.
• kplus1

private int kplus1
Number of neighbors to use.
• Constructor Detail

• INFLO

public INFLO(DistanceFunction<? super O> distanceFunction,
double m,
int k)
Constructor with parameters.
Parameters:
distanceFunction - Distance function in use
m - m Parameter
k - k Parameter
• Method Detail

• run

public OutlierResult run(Database database,
Relation<O> relation)
Run the algorithm
Parameters:
database - Database to process
relation - Relation to process
Returns:
Outlier result
• computeNeighborhoods

private void computeNeighborhoods(Relation<O> relation,
DataStore<SetDBIDs> knns,
ModifiableDBIDs pruned,
WritableDataStore<ModifiableDBIDs> rNNminuskNNs)
Compute the reverse kNN minus the kNN.

This is based on algorithm 2 (two-way search) from the INFLO paper, but unfortunately this algorithm does not compute the RkNN correctly, but rather $$RkNN \cap kNN$$, which is quite useless given that we will use the union of that with kNN later on. Therefore, we decided to rather follow what appears to be the idea of the method, not the literal pseudocode included.

Parameters:
relation - Data relation
knns - Stored nearest neighbors
pruned - Pruned objects: with too many neighbors
rNNminuskNNs - reverse kNN storage