Class KNNSOS<O>

  • Type Parameters:
    O - Object type.
    All Implemented Interfaces:
    Algorithm, OutlierAlgorithm

    @Title("KNNSOS: k-Nearest-Neighbor Stochastic Outlier Selection")
    @Reference(authors="Erich Schubert, Michael Gertz",title="Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection: A Remedy Against the Curse of Dimensionality?",booktitle="Proc. Int. Conf. Similarity Search and Applications, SISAP 2017",url="https://doi.org/10.1007/978-3-319-68474-1_13",bibkey="DBLP:conf/sisap/SchubertG17")
    @Reference(authors="J. Janssens, F. Huszár, E. Postma, J. van den Herik",title="Stochastic Outlier Selection",booktitle="TiCC TR 2012–001",url="https://www.tilburguniversity.edu/upload/b7bac5b2-9b00-402a-9261-7849aa019fbb_sostr.pdf",bibkey="tr/tilburg/JanssensHPv12")
    public class KNNSOS<O>
    extends java.lang.Object
    implements OutlierAlgorithm
    kNN-based adaptation of Stochastic Outlier Selection.

    This is a simple variation of Stochastic Outlier Selection that can benefit from kNN indexes; it is not discussed in the original publication. Instead of setting the perplexity directly, we choose the number of neighbors k and set the perplexity to k/3. Objects outside the k nearest neighbors of a point are no longer considered.
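
The idea described above can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class name KnnSosSketch and all method names are hypothetical, the kNN search is brute force rather than index-based, and the sketch assumes squared Euclidean distances with Gaussian affinities as in the original SOS report. It also ignores the expected outlier rate phi. For each point, a binary search picks the affinity scale beta so that the perplexity of its binding probabilities over the k nearest neighbors is k/3; the outlier probability of a point is then the product of (1 - b) over all points that have it among their neighbors.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only; names and simplifications are assumptions, not ELKI code.
class KnnSosSketch {
  /** Fills b with binding probabilities for the given beta; returns their perplexity exp(H). */
  static double perplexity(double[] sqDists, double beta, double[] b) {
    double sum = 0;
    for (int j = 0; j < sqDists.length; j++) {
      // sqDists is sorted ascending; subtracting the minimum avoids underflow
      // and cancels out when normalizing.
      b[j] = Math.exp(-beta * (sqDists[j] - sqDists[0]));
      sum += b[j];
    }
    double entropy = 0;
    for (int j = 0; j < b.length; j++) {
      b[j] /= sum;
      if (b[j] > 0) {
        entropy -= b[j] * Math.log(b[j]);
      }
    }
    return Math.exp(entropy);
  }

  /** Binary search for beta so that the perplexity matches the target (here k/3). */
  static double[] bindingProbabilities(double[] sqDists, double target) {
    double[] b = new double[sqDists.length];
    double lo = 0, hi = Double.POSITIVE_INFINITY, beta = 1;
    for (int it = 0; it < 100; it++) {
      double p = perplexity(sqDists, beta, b);
      if (Math.abs(p - target) < 1e-9) {
        break;
      }
      if (p > target) { // distribution too flat: increase beta
        lo = beta;
        beta = Double.isInfinite(hi) ? beta * 2 : (beta + hi) / 2;
      } else { // distribution too peaked: decrease beta
        hi = beta;
        beta = (lo + beta) / 2;
      }
    }
    return b;
  }

  /** Outlier probability per point, using only each point's k nearest neighbors. */
  static double[] outlierProbabilities(double[][] data, int k) {
    int n = data.length;
    double[] logNotBound = new double[n]; // sum of log(1 - b_ij) over all i
    for (int i = 0; i < n; i++) {
      double[] sq = new double[n];
      Integer[] order = new Integer[n - 1];
      for (int j = 0, t = 0; j < n; j++) {
        for (int d = 0; d < data[i].length; d++) {
          double diff = data[i][d] - data[j][d];
          sq[j] += diff * diff;
        }
        if (j != i) {
          order[t++] = j;
        }
      }
      Arrays.sort(order, Comparator.comparingDouble(j -> sq[j]));
      double[] knn = new double[k];
      for (int t = 0; t < k; t++) {
        knn[t] = sq[order[t]];
      }
      double[] b = bindingProbabilities(knn, k / 3.0);
      for (int t = 0; t < k; t++) {
        logNotBound[order[t]] += Math.log1p(-b[t]);
      }
    }
    double[] score = new double[n];
    for (int j = 0; j < n; j++) {
      score[j] = Math.exp(logNotBound[j]);
    }
    return score;
  }

  public static void main(String[] args) {
    double[][] data = { { 0, 0 }, { 1.1, 0.1 }, { 0.2, 0.9 }, { 1, 1.2 },
        { 0.5, 0.4 }, { 2.1, 0.3 }, { 1.9, 1.1 }, { 10, 10 } };
    System.out.println(Arrays.toString(outlierProbabilities(data, 6)));
  }
}
```

In this toy data set the last point lies far from the other seven, so with k = 6 it is not among the neighbors of any other point and nothing binds to it, which yields the maximum score. Because only k neighbors per point are touched, the cost per point depends on k rather than on the data set size once the neighbors are known, which is what makes a kNN index useful here.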

    Reference of the kNN variant:

    Erich Schubert, Michael Gertz
    Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection: A Remedy Against the Curse of Dimensionality?
    Proc. Int. Conf. Similarity Search and Applications, SISAP 2017

    Original reference:

    J. Janssens, F. Huszár, E. Postma, J. van den Herik
    Stochastic Outlier Selection
    TiCC TR 2012–001

    Since:
    0.7.5
    Author:
    Erich Schubert
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected Distance<? super O> distance
      Distance function used.
      protected int k
      Number of neighbors (not including query point).
      private static Logging LOG
      Class logger.
      protected double phi
      Expected outlier rate.
    • Constructor Summary

      Constructors 
      Constructor Description
      KNNSOS(Distance<? super O> distance, int k)
      Constructor.
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • distance

        protected Distance<? super O> distance
        Distance function used.
      • k

        protected int k
        Number of neighbors (not including query point).
      • phi

        protected double phi
        Expected outlier rate.
    • Constructor Detail

      • KNNSOS

        public KNNSOS(Distance<? super O> distance,
                      int k)
        Constructor.
        Parameters:
        distance - Distance function
        k - Number of neighbors to consider
    • Method Detail

      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction
      • run

        public OutlierResult run(Relation<O> relation)
        Run the algorithm.
        Parameters:
        relation - data relation
        Returns:
        outlier detection result