Class DistanceStatisticsWithClasses<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    Algorithm

    @Title("Distance Histogram")
    @Description("Computes a histogram over the distances occurring in the data set.")
    public class DistanceStatisticsWithClasses<O>
    extends java.lang.Object
    implements Algorithm
    Algorithm to gather statistics over the distance distribution in the data set.
    Since:
    0.2
    Author:
    Erich Schubert
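    A minimal sketch of what a distance histogram computes, assuming plain double[] vectors and Euclidean distance. The class DistanceHistogramSketch and its methods are hypothetical and not part of ELKI. The sketch ignores class labels and takes the distance range [min, max] as given.

```java
import java.util.Arrays;

/** Hypothetical sketch, not ELKI code: equal-width histogram over pairwise distances. */
final class DistanceHistogramSketch {
  /** Euclidean distance between two points of equal dimensionality. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Count the distances of all unordered pairs into numbins equal-width bins over [min, max]. */
  static long[] histogram(double[][] data, int numbins, double min, double max) {
    long[] bins = new long[numbins];
    double width = (max - min) / numbins;
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = euclidean(data[i], data[j]);
        // Clamp so that d == max, or a value outside an estimated range, lands in an edge bin.
        int bin = width > 0 ? (int) Math.floor((d - min) / width) : 0;
        bins[Math.max(0, Math.min(numbins - 1, bin))]++;
      }
    }
    return bins;
  }

  public static void main(String[] args) {
    double[][] data = { { 0 }, { 1 }, { 3 }, { 7 } };
    // Pairwise distances are 1, 2, 3, 4, 6, 7; with three bins of width 2 over [1, 7]:
    System.out.println(Arrays.toString(histogram(data, 3, 1.0, 7.0))); // prints [2, 2, 2]
  }
}
```

    Clamping out-of-range distances into the first or last bin is a design choice of this sketch; it keeps the bin count fixed even when the range was only estimated.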
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • distance

        protected Distance<? super O> distance
        Distance function used.
      • numbin

        protected int numbin
        Number of histogram bins to use.
      • sampling

        protected boolean sampling
        Sampling flag: estimate the minimum and maximum distance via sampling.
      • exact

        protected boolean exact
        Exactness flag: compute the exact minimum and maximum distance (slower).
    • Constructor Detail

      • DistanceStatisticsWithClasses

        public DistanceStatisticsWithClasses(Distance<? super O> distance,
                                             int numbins,
                                             boolean exact,
                                             boolean sampling)
        Constructor.
        Parameters:
        distance - Distance function to use
        numbins - Number of bins
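        exact - Exactness flag: compute the exact minimum and maximum distance (slower)
        sampling - Sampling flag: estimate the minimum and maximum distance via sampling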
    • Method Detail

      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction
      • sampleMinMax

        private DoubleMinMax sampleMinMax(Relation<O> relation,
                                          DistanceQuery<O> distance)
        Estimate the minimum and maximum distance via sampling.
        Parameters:
        relation - Relation to process
        distance - Distance query to use
        Returns:
        Minimum and maximum
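        A rough illustration of estimating the distance range by sampling, assuming 1-D values with the absolute difference standing in for the distance query. SampledMinMaxSketch, its sampleSize parameter, and sampling with replacement are assumptions of this sketch, not the actual implementation.

```java
import java.util.Random;

/** Hypothetical sketch, not ELKI code: estimate the distance range from a random sample. */
final class SampledMinMaxSketch {
  /** Returns {min, max} over the distances from each sampled object to all other objects. */
  static double[] sampleMinMax(double[] data, int sampleSize, Random rnd) {
    double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
    for (int s = 0; s < sampleSize; s++) {
      int i = rnd.nextInt(data.length); // sample with replacement
      for (int j = 0; j < data.length; j++) {
        if (i == j) {
          continue; // skip the trivial self-distance of 0
        }
        double d = Math.abs(data[i] - data[j]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    return new double[] { min, max };
  }

  public static void main(String[] args) {
    double[] data = { 0, 1, 3, 7 };
    double[] est = sampleMinMax(data, 2, new Random(42));
    System.out.println(est[0] + " .. " + est[1]);
  }
}
```

        Because only some pairs are visited, the estimated minimum can be too large and the estimated maximum too small. The cost is proportional to sampleSize times the number of objects rather than to the number of all pairs.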
      • exactMinMax

        private DoubleMinMax exactMinMax(Relation<O> relation,
                                         DistanceQuery<O> distance)
        Compute the exact minimum and maximum distance.
        Parameters:
        relation - Relation to process
        distance - Distance query to use
        Returns:
        Exact maximum and minimum
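        For comparison, a minimal sketch of an exhaustive scan, again assuming 1-D values with the absolute difference as a stand-in distance. ExactMinMaxSketch is a hypothetical name; the real method works on a Relation and a DistanceQuery.

```java
/** Hypothetical sketch, not ELKI code: exact distance range via a scan over all pairs. */
final class ExactMinMaxSketch {
  /** Returns {min, max} over the distances of all unordered pairs of distinct objects. */
  static double[] exactMinMax(double[] data) {
    double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = Math.abs(data[i] - data[j]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    return new double[] { min, max };
  }

  public static void main(String[] args) {
    double[] r = exactMinMax(new double[] { 0, 1, 3, 7 });
    System.out.println(r[0] + " .. " + r[1]); // prints 1.0 .. 7.0
  }
}
```

        Visiting every pair is quadratic in the number of objects, which is the price of knowing the range before any distance is binned.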
      • shrinkHeap

        private static void shrinkHeap(java.util.TreeSet<DoubleDBIDPair> hotset,
                                       int k)
        Shrink the heap of "hot" (extreme) items.
        Parameters:
        hotset - Set of hot items
        k - target size
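        One way to trim a sorted set to a target size can be sketched as follows. ScoredId is a hypothetical stand-in for DoubleDBIDPair, and removing entries from the tail of the ordering is an assumption; the documentation above does not say which entries the real method discards.

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical sketch, not ELKI code: trim a sorted set of (value, id) pairs to k entries. */
final class ShrinkHotSetSketch {
  /** Stand-in for a (double, id) pair; ELKI's DoubleDBIDPair is not used here. */
  static final class ScoredId {
    final double value;
    final int id;

    ScoredId(double value, int id) {
      this.value = value;
      this.id = id;
    }
  }

  /** Orders by value, then by id, so distinct ids with equal values are both kept. */
  static final Comparator<ScoredId> ORDER =
      Comparator.<ScoredId>comparingDouble(p -> p.value).thenComparingInt(p -> p.id);

  /** Drop entries from the tail until at most k remain (assumed policy). */
  static void shrink(TreeSet<ScoredId> hotset, int k) {
    while (hotset.size() > k) {
      hotset.pollLast();
    }
  }

  public static void main(String[] args) {
    TreeSet<ScoredId> hot = new TreeSet<>(ORDER);
    for (int i = 0; i < 5; i++) {
      hot.add(new ScoredId(i * 1.5, i));
    }
    shrink(hot, 3);
    System.out.println(hot.size()); // prints 3
  }
}
```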