Class HilOut<O extends NumberVector>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    Algorithm, OutlierAlgorithm

    @Title("Fast Outlier Detection in High Dimensional Spaces")
    @Description("Algorithm to compute outliers using Hilbert space filling curves")
    @Reference(authors="F. Angiulli, C. Pizzuti",
               title="Fast Outlier Detection in High Dimensional Spaces",
               booktitle="Proc. European Conf. Principles of Knowledge Discovery and Data Mining (PKDD'02)",
               url="https://doi.org/10.1007/3-540-45681-3_2",
               bibkey="DBLP:conf/pkdd/AngiulliP02")
    public class HilOut<O extends NumberVector>
    extends java.lang.Object
    implements OutlierAlgorithm
    Fast Outlier Detection in High Dimensional Spaces

    Outlier Detection using Hilbert space filling curves

    Reference:

    F. Angiulli, C. Pizzuti
    Fast Outlier Detection in High Dimensional Spaces
    Proc. European Conf. Principles of Knowledge Discovery and Data Mining (PKDD'02)
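
The referenced paper scores each point by its weight, the sum of the distances to its k nearest neighbors, and reports the n points with the largest weights as outliers. The following is a minimal brute-force sketch of that score, not the HilOut implementation, which avoids the quadratic cost by using Hilbert curves. The class name `KnnWeightSketch` and its methods are hypothetical and not part of ELKI. The parameter `p` plays the role of the Lp norm parameter.

```java
import java.util.Arrays;

/** Brute-force sketch of a kNN-weight outlier score (hypothetical helper, not ELKI code). */
final class KnnWeightSketch {
  /** Lp (Minkowski) distance between two points of equal length. */
  static double lpDistance(double[] a, double[] b, double p) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += Math.pow(Math.abs(a[i] - b[i]), p);
    }
    return Math.pow(sum, 1.0 / p);
  }

  /** Sum of the distances from data[idx] to its k nearest neighbors (excluding itself). */
  static double weight(double[][] data, int idx, int k, double p) {
    double[] dists = new double[data.length - 1];
    int pos = 0;
    for (int j = 0; j < data.length; j++) {
      if (j != idx) {
        dists[pos++] = lpDistance(data[idx], data[j], p);
      }
    }
    Arrays.sort(dists);
    double w = 0.0;
    for (int j = 0; j < k && j < dists.length; j++) {
      w += dists[j];
    }
    return w;
  }

  public static void main(String[] args) {
    // Four points on a unit square plus one point far away.
    double[][] data = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 }, { 10, 10 } };
    for (int i = 0; i < data.length; i++) {
      System.out.println(i + ": " + weight(data, i, 2, 2.0));
    }
  }
}
```

In this toy data set the isolated point receives the largest weight, so it would be ranked first for n = 1.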

    Since:
    0.5.0
    Author:
    Jonathan von Brünken, Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      (package private) class  HilOut.HilbertFeatures
      Class organizing the data points along a Hilbert curve.
      (package private) static class  HilOut.HilFeature
      Hilbert representation of a single object.
      static class  HilOut.ScoreType
      Type of output: all scores (upper bounds) or top n only
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private int capital_n
      Set sizes, total and current iteration
      private int capital_n_star
      Set sizes, total and current iteration
      private int d
      Dimensionality of the data
      private Distance<? super O> distance
      Distance function used.
      private DistanceQuery<O> distq
      Distance query
      private int h
      Hilbert precision
      private int k
      Number of nearest neighbors
      private static Logging LOG
      The logger for this class.
      private int n
      Number of outliers to compute exactly
      private int n_star
      Set sizes, total and current iteration
      private double omega_star
      Outlier threshold
      private double t
      p parameter of the Lp norm
      private java.lang.Enum<HilOut.ScoreType> tn
      Reporting mode: exact (top n) only, or all
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • k

        private int k
        Number of nearest neighbors
      • n

        private int n
        Number of outliers to compute exactly
      • h

        private int h
        Hilbert precision
      • t

        private double t
        p parameter of the Lp norm
      • tn

        private java.lang.Enum<HilOut.ScoreType> tn
        Reporting mode: exact (top n) only, or all
      • capital_n

        private int capital_n
        Set sizes, total and current iteration
      • n_star

        private int n_star
        Set sizes, total and current iteration
      • capital_n_star

        private int capital_n_star
        Set sizes, total and current iteration
      • d

        private int d
        Dimensionality of the data
      • omega_star

        private double omega_star
        Outlier threshold
    • Constructor Detail

      • HilOut

        public HilOut​(LPNormDistance distance,
                      int k,
                      int n,
                      int h,
                      java.lang.Enum<HilOut.ScoreType> tn)
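        Constructor.
        Parameters:
        distance - Distance function to use (an Lp norm)
        k - Number of nearest neighbors
        n - Number of outliers to compute exactly
        h - Number of bits of precision to use (maximum 32)
        tn - Reporting mode: top n outliers only, or scores for all objects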
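
To illustrate what the precision parameter h controls, the sketch below quantizes coordinates to h bits and maps a two-dimensional grid cell to its position on a Hilbert curve, using the well-known iterative rotate-and-flip construction. It is an illustration only: HilOut works in d dimensions, and the class `HilbertIndexSketch` and its methods are hypothetical names that do not exist in ELKI. The sketch assumes 1 <= h <= 30 and extent > 0.

```java
/** 2D Hilbert curve index sketch (illustration only, not the ELKI implementation). */
final class HilbertIndexSketch {
  /** Map a coordinate in [min, min + extent] to an integer grid cell using h bits. */
  static int quantize(double value, double min, double extent, int h) {
    long cells = 1L << h;
    long cell = (long) ((value - min) / extent * cells);
    // Clamp so that value == min + extent falls into the last cell.
    return (int) Math.max(0, Math.min(cells - 1, cell));
  }

  /** Position of grid cell (x, y) along a Hilbert curve over a 2^h x 2^h grid. */
  static long index(int x, int y, int h) {
    int n = 1 << h;
    long d = 0;
    for (int s = n >> 1; s > 0; s >>= 1) {
      int rx = (x & s) != 0 ? 1 : 0;
      int ry = (y & s) != 0 ? 1 : 0;
      d += (long) s * s * ((3 * rx) ^ ry);
      // Rotate/flip the quadrant so the sub-curve has the standard orientation.
      if (ry == 0) {
        if (rx == 1) {
          x = n - 1 - x;
          y = n - 1 - y;
        }
        int tmp = x;
        x = y;
        y = tmp;
      }
    }
    return d;
  }

  public static void main(String[] args) {
    int h = 2; // 4 x 4 grid
    for (int y = (1 << h) - 1; y >= 0; y--) {
      StringBuilder row = new StringBuilder();
      for (int x = 0; x < (1 << h); x++) {
        row.append(String.format("%3d", index(x, y, h)));
      }
      System.out.println(row);
    }
  }
}
```

A larger h gives a finer grid, so points that are close in space are more likely to be distinguished along the curve, at the cost of longer keys.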
    • Method Detail

      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction
      • run

        public OutlierResult run​(Relation<O> relation)
        Run the HilOut algorithm.
        Parameters:
        relation - Data relation
        Returns:
        Outlier result
      • scan

        private void scan​(HilOut.HilbertFeatures hf,
                          int k0)
        Performs a sequential scan over the data.
        Parameters:
        hf - the Hilbert features
        k0 -
      • innerScan

        private void innerScan​(HilOut.HilbertFeatures hf,
                               int i,
                               int maxcount)
        Calculates new upper and lower bounds, and inserts the neighborhood points on which the bounds are based into the NN set.
        Parameters:
        hf - the Hilbert features
        i - position in pf of the feature for which the bounds should be calculated
        maxcount - maximal size of the neighborhood
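
The idea behind an upper bound from a bounded neighborhood can be sketched as follows: the sum of the distances to any k candidate points is never smaller than the sum over the true k nearest neighbors, so inspecting only a window of candidates around position i in the curve order yields a valid upper bound on the weight. This is a simplified sketch under that assumption, not the actual innerScan logic (which also maintains lower bounds). The class `WindowBoundSketch` and its methods are hypothetical, and Euclidean distance is assumed.

```java
import java.util.Arrays;

/** Sketch: upper bound on the kNN weight from a window of candidates (hypothetical, not ELKI code). */
final class WindowBoundSketch {
  /** Euclidean distance between two points of equal length. */
  static double dist(double[] a, double[] b) {
    double sum = 0.0;
    for (int j = 0; j < a.length; j++) {
      double diff = a[j] - b[j];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /**
   * Upper bound on the weight of ordered[i], using at most maxcount candidates taken
   * alternately from the left and right of position i in the given order.
   */
  static double upperBound(double[][] ordered, int i, int k, int maxcount) {
    double[] cand = new double[Math.max(maxcount, 0)];
    int c = 0, lo = i - 1, hi = i + 1;
    while (c < maxcount && (lo >= 0 || hi < ordered.length)) {
      if (lo >= 0) {
        cand[c++] = dist(ordered[i], ordered[lo--]);
      }
      if (c < maxcount && hi < ordered.length) {
        cand[c++] = dist(ordered[i], ordered[hi++]);
      }
    }
    if (c < k) {
      return Double.POSITIVE_INFINITY; // too few candidates to bound the weight
    }
    Arrays.sort(cand, 0, c);
    double sum = 0.0;
    for (int j = 0; j < k; j++) {
      sum += cand[j];
    }
    return sum;
  }

  public static void main(String[] args) {
    double[][] pts = { { 0, 0 }, { 5, 5 }, { 0, 1 }, { 6, 5 }, { 1, 0 }, { 9, 9 } };
    // With maxcount = pts.length - 1 the window covers all other points, so the bound is exact.
    System.out.println("bound (window 2): " + upperBound(pts, 0, 2, 2));
    System.out.println("exact weight:     " + upperBound(pts, 0, 2, pts.length - 1));
  }
}
```

The better the ordering preserves spatial locality, the closer the bound from a small window comes to the exact weight.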
      • trueOutliers

        private void trueOutliers​(HilOut.HilbertFeatures h)
        Updates n_star.
        Parameters:
        h - the Hilbert features