Class IsolationForest

  • All Implemented Interfaces:
    Algorithm, OutlierAlgorithm

    @Reference(authors="F. T. Liu, K. M. Ting, Z.-H. Zhou",
               title="Isolation-Based Anomaly Detection",
               booktitle="Transactions on Knowledge Discovery from Data (TKDD)",
               url="https://doi.org/10.1145/2133360.2133363",
               bibkey="DBLP:journals/tkdd/LiuTZ12")
    public class IsolationForest
    extends java.lang.Object
    implements OutlierAlgorithm
    Isolation-Based Anomaly Detection.

    This method uses an ensemble of randomized trees that serve as a simple density estimator instead of using distances to estimate density.

    Reference:

    F. T. Liu, K. M. Ting, Z.-H. Zhou
    Isolation-Based Anomaly Detection
    Transactions on Knowledge Discovery from Data (TKDD)

    Since:
    0.8.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger
      • numTrees

        protected int numTrees
        The number of trees
      • subsampleSize

        protected int subsampleSize
        The subsample size
    • Constructor Detail

      • IsolationForest

        public IsolationForest​(int numTrees,
                               int subsampleSize,
                               RandomFactory rnd)
        Constructor.
        Parameters:
        numTrees - Number of trees
        subsampleSize - Subsample size
        rnd - Random generator factory
    • Method Detail

      • run

        public OutlierResult run​(Relation<? extends NumberVector> relation)
        Run the isolation forest algorithm.
        Parameters:
        relation - Data relation to process
        Returns:
        Outlier detection result
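        The tree construction step behind this method can be sketched as follows. This is a minimal, self-contained illustration based on the cited paper, not the source of this class: the names IsolationTreeSketch, Node, build and leafSizeSum are hypothetical, plain double[] arrays stand in for NumberVector, and the depth limit of ceil(log2(subsample size)) is the heuristic from the paper rather than a documented setting of this class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch of isolation tree construction; hypothetical names, not the library source. */
final class IsolationTreeSketch {
  /** Tree node: leaves keep only the number of points that reached them. */
  static final class Node {
    int size;          // number of points below this node
    int dim = -1;      // split dimension (inner nodes only)
    double split;      // split value (inner nodes only)
    Node left, right;  // children; both null for leaves
  }

  /** Recursively split the points at a random value in a random dimension. */
  static Node build(List<double[]> points, int depth, int maxDepth, Random rnd) {
    Node node = new Node();
    node.size = points.size();
    if (points.size() <= 1 || depth >= maxDepth) {
      return node; // isolated point, or depth limit reached
    }
    int dim = rnd.nextInt(points.get(0).length);
    double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
    for (double[] p : points) {
      min = Math.min(min, p[dim]);
      max = Math.max(max, p[dim]);
    }
    if (!(min < max)) {
      return node; // simplification: stop if the chosen dimension is constant
    }
    double split = min + rnd.nextDouble() * (max - min);
    List<double[]> l = new ArrayList<>(), r = new ArrayList<>();
    for (double[] p : points) {
      (p[dim] <= split ? l : r).add(p);
    }
    if (l.isEmpty() || r.isEmpty()) {
      return node; // guard against a degenerate split caused by rounding
    }
    node.dim = dim;
    node.split = split;
    node.left = build(l, depth + 1, maxDepth, rnd);
    node.right = build(r, depth + 1, maxDepth, rnd);
    return node;
  }

  /** Sum of leaf sizes; equals the number of points used to build the tree. */
  static int leafSizeSum(Node n) {
    return n.left == null ? n.size : leafSizeSum(n.left) + leafSizeSum(n.right);
  }

  public static void main(String[] args) {
    Random rnd = new Random(0L);
    List<double[]> sample = new ArrayList<>();
    for (int i = 0; i < 256; i++) {
      sample.add(new double[] { rnd.nextGaussian(), rnd.nextGaussian() });
    }
    int maxDepth = (int) Math.ceil(Math.log(sample.size()) / Math.log(2));
    Node root = build(sample, 0, maxDepth, rnd);
    System.out.println("points in leaves: " + leafSizeSum(root));
  }
}
```

        A forest would repeat this numTrees times, each time on a fresh random subsample of subsampleSize points.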
      • c

        protected static double c​(double n)
        Returns the average path length of an unsuccessful search in a binary search tree built from n points. Returns 0 if n is less than or equal to 1.
        Parameters:
        n - Number of points (sample or node size)
        Returns:
        Expected average path length
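        As an illustration, the normalization term from the cited paper, c(n) = 2 H(n-1) - 2 (n-1) / n, can be sketched as below. The class name AveragePathLength is hypothetical, and approximating the harmonic number H(i) by ln(i) plus the Euler-Mascheroni constant is an assumption taken from the paper; the library's own implementation may differ in detail.

```java
/** Sketch of the path-length normalization term; not the library source. */
final class AveragePathLength {
  /** Euler-Mascheroni constant, used to approximate harmonic numbers. */
  static final double EULER_MASCHERONI = 0.5772156649015329;

  /**
   * Average path length of an unsuccessful search in a binary search tree
   * built from n points, with H(i) approximated by ln(i) + Euler-Mascheroni.
   * Intended for integer-valued n; returns 0 for n <= 1.
   */
  static double c(double n) {
    if (n <= 1.0) {
      return 0.0;
    }
    return 2.0 * (Math.log(n - 1.0) + EULER_MASCHERONI) - 2.0 * (n - 1.0) / n;
  }

  public static void main(String[] args) {
    // Typical subsample sizes are small, e.g. 256 in the paper.
    System.out.println("c(1)   = " + c(1));
    System.out.println("c(256) = " + c(256));
  }
}
```

        The logarithmic approximation is coarse for very small n, which matters little when the term is applied to a subsample of a few hundred points.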
      • isolationScore

        protected double isolationScore​(IsolationForest.Node n,
                                        NumberVector v)
        Search for a vector in the tree and return its isolation score, which is based on the depth (path length) reached and the size of the final node.
        Parameters:
        n - Node to start from
        v - Vector to search
        Returns:
        Isolation score based on depth and node size
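        A minimal sketch of such a search, following the cited paper: descend to a leaf, count the edges, and add c(size) to account for the subtree that was not built below a leaf holding several points. The types and names here (PathLengthSketch, Node, pathLength, anomalyScore) are hypothetical stand-ins, with double[] in place of NumberVector; how this class combines the per-tree values into the final outlier score is assumed to follow the paper's 2^(-E[h] / c(subsample size)).

```java
/** Sketch of path-length scoring; hypothetical Node type, not the library source. */
final class PathLengthSketch {
  /** Immutable tree node; a leaf has no children. */
  static final class Node {
    final int size;        // number of training points below this node
    final int dim;         // split dimension (inner nodes only)
    final double split;    // split value (inner nodes only)
    final Node left, right;

    Node(int size) {
      this(size, -1, 0.0, null, null);
    }

    Node(int size, int dim, double split, Node left, Node right) {
      this.size = size;
      this.dim = dim;
      this.split = split;
      this.left = left;
      this.right = right;
    }
  }

  /** Average path length of an unsuccessful BST search (paper's approximation). */
  static double c(double n) {
    return n <= 1.0 ? 0.0
        : 2.0 * (Math.log(n - 1.0) + 0.5772156649015329) - 2.0 * (n - 1.0) / n;
  }

  /** Descend from the given node to a leaf; add c(size) for the unbuilt subtree. */
  static double pathLength(Node n, double[] v) {
    int depth = 0;
    while (n.left != null) {
      n = v[n.dim] <= n.split ? n.left : n.right;
      depth++;
    }
    return depth + c(n.size);
  }

  /** Score from the mean path length over all trees; short paths give values near 1. */
  static double anomalyScore(double meanPathLength, int subsampleSize) {
    return Math.pow(2.0, -meanPathLength / c(subsampleSize));
  }

  public static void main(String[] args) {
    // Hand-built tree: x <= 5 is isolated at depth 1, the rest is split again at 8.
    Node right = new Node(3, 0, 8.0, new Node(2), new Node(1));
    Node root = new Node(4, 0, 5.0, new Node(1), right);
    System.out.println(pathLength(root, new double[] { 1.0 }));
    System.out.println(pathLength(root, new double[] { 6.0 }));
  }
}
```

        Points that are isolated after few splits get short path lengths and therefore scores closer to 1, while a mean path length equal to c(subsample size) maps to 0.5.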
      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction