Class INFLO<O>

  • Type Parameters:
    O - the type of DatabaseObject the algorithm is applied on
    All Implemented Interfaces:
    Algorithm, OutlierAlgorithm

    @Title("INFLO: Influenced Outlierness Factor")
    @Description("Ranking Outliers Using Symmetric Neigborhood Relationship")
    @Reference(authors="W. Jin, A. Tung, J. Han, W. Wang",
               title="Ranking outliers using symmetric neighborhood relationship",
               booktitle="Proc. 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining",
               url="https://doi.org/10.1007/11731139_68",
               bibkey="DBLP:conf/pakdd/JinTHW06")
    public class INFLO<O>
    extends java.lang.Object
    implements OutlierAlgorithm
    Influence Outliers using Symmetric Relationship (INFLO) using two-way search, is an outlier detection method based on LOF; but also using the reverse kNN.

    Reference:

    W. Jin, A. Tung, J. Han, W. Wang
    Ranking outliers using symmetric neighborhood relationship
    Proc. 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining, 2006.

    There is an error in the two-way search algorithm proposed in the article above. It does not correctly compute the RNN, but it will find only those RNN that are in the kNN, making the entire RNN computation redundant, as it uses the kNN + RNN later anyway.

    Given the errors in the INFLO paper, as of ELKI 0.8, we do:

    • Assume \( IS_k(p) := kNN(p) \cup RkNN(p) \)
    • Implement the pruning of two-way search, i.e., use INFLO=1 if \( |kNN(p) \cap RkNN(p)|\geq m\cdot |kNN(p)| \)
    Since:
    0.3
    Author:
    Ahmed Hettab, Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • distance

        private Distance<? super O> distance
        Distance function used.
      • m

        private double m
        Pruning threshold m.
      • kplus

        private int kplus
        Number of neighbors to use.
    • Constructor Detail

      • INFLO

        public INFLO​(Distance<? super O> distance,
                     double m,
                     int k)
        Constructor with parameters.
        Parameters:
        distance - Distance function in use
        m - m Parameter
        k - k Parameter
    • Method Detail

      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction
      • run

        public OutlierResult run​(Relation<O> relation)
        Run the algorithm
        Parameters:
        relation - Relation to process
        Returns:
        Outlier result
      • computeNeighborhoods

        private void computeNeighborhoods​(Relation<O> relation,
                                          DataStore<SetDBIDs> knns,
                                          ModifiableDBIDs pruned,
                                          WritableDataStore<ModifiableDBIDs> rNNminuskNNs)
        Compute the reverse kNN minus the kNN.

        This is based on algorithm 2 (two-way search) from the INFLO paper, but unfortunately this algorithm does not compute the RkNN correctly, but rather \( RkNN \cap kNN \), which is quite useless given that we will use the union of that with kNN later on. Therefore, we decided to rather follow what appears to be the idea of the method, not the literal pseudocode included.

        Parameters:
        relation - Data relation
        knns - Stored nearest neighbors
        pruned - Pruned objects: with too many neighbors
        rNNminuskNNs - reverse kNN storage