Class EvaluateRankingQuality<V extends NumberVector>

  • Type Parameters:
    V - Vector type
    All Implemented Interfaces:
    Algorithm

    @Title("Evaluate Ranking Quality")
    @Description("Evaluates the effectiveness of a distance function via the obtained rankings.")
    public class EvaluateRankingQuality<V extends NumberVector>
    extends java.lang.Object
    implements Algorithm
    Evaluates a distance function with respect to kNN queries. For each point, the neighbors are sorted by distance, and the AUROC (area under the ROC curve) of the resulting ranking is computed. A score of 1 means that the distance function yields a perfect ordering: all relevant neighbors first, then the irrelevant ones. A value of 0.5 is what a random ordering is expected to yield. A value of 0 means the ordering is inverted, i.e. the function behaves like a similarity rather than a distance.
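    The AUROC of a single ranking can be illustrated with a minimal, self-contained sketch. It is not the code of this class: the name RankingAurocSketch and its method are hypothetical, "relevant" is assumed to be a given boolean flag per neighbor, and ties in distance are not treated specially.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration only; not part of the documented API.
final class RankingAurocSketch {
  /**
   * AUROC of the ranking obtained by sorting neighbors by ascending distance.
   * Computed as the fraction of (relevant, irrelevant) pairs in which the
   * relevant neighbor is ranked first. Ties are ordered arbitrarily (assumption).
   */
  static double auroc(double[] distances, boolean[] relevant) {
    Integer[] order = new Integer[distances.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    // Nearest neighbor first.
    Arrays.sort(order, Comparator.comparingDouble(i -> distances[i]));

    long relevantSeen = 0; // relevant neighbors ranked so far
    long irrelevant = 0;   // irrelevant neighbors ranked so far
    long correctPairs = 0; // pairs with the relevant neighbor ranked first
    for (int idx : order) {
      if (relevant[idx]) {
        relevantSeen++;
      } else {
        irrelevant++;
        correctPairs += relevantSeen;
      }
    }
    if (relevantSeen == 0 || irrelevant == 0) {
      return Double.NaN; // undefined when only one class is present
    }
    return correctPairs / (double) (relevantSeen * irrelevant);
  }

  public static void main(String[] args) {
    double[] d = {0.1, 0.2, 0.3, 0.4};
    System.out.println(auroc(d, new boolean[] {true, true, false, false})); // 1.0
    System.out.println(auroc(d, new boolean[] {false, false, true, true})); // 0.0
    System.out.println(auroc(d, new boolean[] {true, false, true, false})); // 0.75
  }
}
```

    The three calls in main correspond to a perfect ordering, an inverted ordering, and a partially mixed ordering.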

    In contrast to RankingQualityHistogram, this method uses a binning based on the centrality of objects. This allows analyzing whether or not a particular distance degrades for the outer parts of a cluster.
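    The idea of binning by centrality can be sketched as follows. This is an illustration under stated assumptions, not the implementation of this class: the name CentralityBinningSketch is hypothetical, the centrality value is assumed to be some distance to the cluster center (smaller means more central), and bins are assumed to be equal-sized ranges of the centrality rank.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration only; not part of the documented API.
final class CentralityBinningSketch {
  /**
   * Averages a per-object quality score within bins of centrality rank.
   * Bin 0 holds the most central objects, the last bin the outermost ones.
   * Empty bins are reported as NaN.
   */
  static double[] meanPerBin(double[] centrality, double[] score, int numbins) {
    int n = centrality.length;
    Integer[] order = new Integer[n];
    for (int i = 0; i < n; i++) {
      order[i] = i;
    }
    // Most central object first (assumption: smaller value = more central).
    Arrays.sort(order, Comparator.comparingDouble(i -> centrality[i]));

    double[] sum = new double[numbins];
    int[] count = new int[numbins];
    for (int rank = 0; rank < n; rank++) {
      // Map rank 0..n-1 to bin 0..numbins-1 in equal-sized rank ranges.
      int bin = (int) ((long) rank * numbins / n);
      sum[bin] += score[order[rank]];
      count[bin]++;
    }
    double[] mean = new double[numbins];
    for (int b = 0; b < numbins; b++) {
      mean[b] = count[b] > 0 ? sum[b] / count[b] : Double.NaN;
    }
    return mean;
  }

  public static void main(String[] args) {
    double[] centrality = {0.0, 3.0, 1.0, 2.0};
    double[] score = {1.0, 0.5, 0.75, 0.25};
    // Two bins: the inner half and the outer half of the cluster.
    System.out.println(Arrays.toString(meanPerBin(centrality, score, 2))); // [0.875, 0.375]
  }
}
```

    In this toy input the mean score of the outer bin is lower than that of the inner bin, which is the kind of degradation toward the outer parts of a cluster that such a histogram is meant to expose.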

    TODO: Allow fixed binning range, configurable

    TODO: Add sampling

    Since:
    0.2
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • numbins

        protected int numbins
        Number of bins to use.
    • Constructor Detail

      • EvaluateRankingQuality

        public EvaluateRankingQuality(Distance<? super V> distance,
                                      int numbins)
        Constructor.
        Parameters:
        distance - Distance function
        numbins - Number of bins
    • Method Detail

      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction
      • run

        public HistogramResult run(Database database,
                                   Relation<V> relation)
        Run the algorithm.
        Parameters:
        database - Database
        relation - Relation
        Returns:
        Histogram of the ranking quality, binned by object centrality