Class SquaredErrors

  • All Implemented Interfaces:
    Evaluator, ResultProcessor

    public class SquaredErrors
    extends java.lang.Object
    implements Evaluator
    Evaluate a clustering by reporting the sum of squared errors (SSE, also called SSQ), as used by k-means. This should be used with SquaredEuclideanDistance only. When used with other distances, it will manually square the values, but beware that the result is less meaningful with other distance functions.

    For clusterings that provide a cluster prototype object (e.g., k-means), the prototype will be used. For other algorithms, the centroid will be recomputed.
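
    As a rough illustration of this rule, the following self-contained sketch computes the SSQ of a cluster, using a supplied prototype when one is available and recomputing the centroid otherwise. It is not ELKI code: the class SquaredErrorsSketch and its methods are hypothetical names, plain double[] arrays stand in for NumberVector objects, and squared Euclidean distance is assumed throughout.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; hypothetical names, not part of ELKI. */
final class SquaredErrorsSketch {
  /** Arithmetic mean of the points, coordinate by coordinate. */
  static double[] centroid(List<double[]> points) {
    int dim = points.get(0).length;
    double[] mean = new double[dim];
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        mean[d] += p[d];
      }
    }
    for (int d = 0; d < dim; d++) {
      mean[d] /= points.size();
    }
    return mean;
  }

  /** Squared Euclidean distance between two vectors of equal length. */
  static double squaredEuclidean(double[] a, double[] b) {
    double sum = 0.;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  /**
   * SSQ of one cluster. If a prototype is given it is used as the center;
   * if it is null, the centroid is recomputed from the points.
   */
  static double clusterSsq(List<double[]> points, double[] prototype) {
    if (points.isEmpty()) {
      return 0.;
    }
    double[] center = prototype != null ? prototype : centroid(points);
    double ssq = 0.;
    for (double[] p : points) {
      ssq += squaredEuclidean(p, center);
    }
    return ssq;
  }

  /** Total SSQ over all clusters, recomputing every centroid. */
  static double totalSsq(List<List<double[]>> clusters) {
    double total = 0.;
    for (List<double[]> cluster : clusters) {
      total += clusterSsq(cluster, null);
    }
    return total;
  }
}
```

    For example, clusterSsq(Arrays.asList(new double[] {0, 0}, new double[] {2, 0}), null) recomputes the centroid (1, 0) and sums the two squared distances to it. Passing an explicit prototype instead of null skips the centroid computation.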

    TODO: support non-vector based clusterings, too, if the algorithm provided a prototype object (e.g., PAM).

    TODO: when combined with k-means, detect if the distance functions agree (both should be using squared Euclidean), and reuse the SSQ values provided by k-means.

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Logger for debug output.
      • noiseOption

        private NoiseHandling noiseOption
        Handling of noise clusters.
      • key

        private java.lang.String key
        Key for logging statistics.
    • Constructor Detail

      • SquaredErrors

        public SquaredErrors(NumberVectorDistance<?> distance,
                             NoiseHandling noiseOption)
        Constructor.
        Parameters:
        distance - Distance function to use.
        noiseOption - Control noise handling.
    • Method Detail

      • evaluateClustering

        public double evaluateClustering(Relation<? extends NumberVector> rel,
                                         Clustering<?> c)
        Evaluate a single clustering.
        Parameters:
        rel - Data relation
        c - Clustering
        Returns:
        Sum of squared errors (SSQ) of the clustering.
      • processNewResult

        public void processNewResult(java.lang.Object result)
        Description copied from interface: ResultProcessor
        Process a result.
        Specified by:
        processNewResult in interface ResultProcessor
        Parameters:
        result - Newly added result subtree.