Class Segments

  • All Implemented Interfaces:
    java.lang.Iterable<Segment>

    @Reference(authors="Elke Achtert, Sascha Goldhofer, Hans-Peter Kriegel, Erich Schubert, Arthur Zimek",
               title="Evaluation of Clusterings - Metrics and Visual Support",
               booktitle="Proc. 28th International Conference on Data Engineering (ICDE 2012)",
    public class Segments
    extends java.lang.Object
    implements java.lang.Iterable<Segment>
    Creates segments of two or more clusterings.

    Segments are the equally paired database objects of all given (two or more) clusterings. Given a contingency table, an object Segment represents one cell of the table, that is, one intersection of classes and labels. Pair Segments are created by converting an object Segment into its pair representation. If the contingency table shows any fragmentation (clusters that do not match perfectly, visible as multiple cells in one row or column), converting all object Segments into pair Segments results in a larger number of pair Segments. Thus, for every object Segment there exists a corresponding pair Segment. In addition, pair Segments represent pairs that occur in only one clustering, which happens whenever a cluster of one clustering is split by another clustering. Here, these pair Segments are referred to as fragmented Segments. Within the visualization, they describe (at least two) pair Segments that have a corresponding object Segment.
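    As a rough illustration of the object Segments described above, the following sketch builds the non-empty cells of a contingency table for two label assignments and counts the pairs inside each cell. The class `SegmentSketch`, its methods, the plain `int[]` label arrays, the "a-b" key format, and the choice to count unordered pairs are all assumptions made for this example; the actual class operates on `Clustering` and `DBIDs` objects and may differ.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the documented class.
final class SegmentSketch {
  /**
   * Count objects per contingency-table cell. The key "a-b" means:
   * cluster a in the first clustering, cluster b in the second
   * (cluster numbers start at 1 in this sketch).
   */
  static Map<String, Integer> objectSegments(int[] first, int[] second) {
    if (first.length != second.length) {
      throw new IllegalArgumentException("Both clusterings must label the same objects");
    }
    Map<String, Integer> cells = new TreeMap<>();
    for (int i = 0; i < first.length; i++) {
      cells.merge(first[i] + "-" + second[i], 1, Integer::sum);
    }
    return cells;
  }

  /** Number of unordered pairs among n objects (assumed pair definition). */
  static long pairs(long n) {
    return n * (n - 1) / 2;
  }

  public static void main(String[] args) {
    int[] first = { 1, 1, 1, 1, 2, 2 };
    int[] second = { 1, 1, 2, 2, 2, 2 };
    // Cluster 1 of the first clustering is split by the second clustering,
    // so row 1 of the table has two non-empty cells (fragmentation).
    objectSegments(first, second).forEach((id, size) ->
        System.out.println(id + ": " + size + " objects, " + pairs(size) + " pairs"));
  }
}
```

    With the labels used in `main`, cluster 1 of the first clustering holds four objects and therefore six unordered pairs, but only two of those pairs fall inside the cells 1-1 and 1-2. The remaining four pairs exist in the first clustering only, which corresponds to what the text calls a fragmented Segment.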


    Elke Achtert, Sascha Goldhofer, Hans-Peter Kriegel, Erich Schubert, Arthur Zimek
    Evaluation of Clusterings – Metrics and Visual Support
    Proc. 28th International Conference on Data Engineering (ICDE 2012)

    Sascha Goldhofer, Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger
      • clusterings

        private java.util.List<Clustering<?>> clusterings
      • clusters

        private java.util.List<java.util.List<? extends Cluster<?>>> clusters
      • clusteringsCount

        private int clusteringsCount
        Number of clusterings in comparison
      • numclusters

        private int[] numclusters
        Number of Clusters for each clustering
      • totalObjects

        private int totalObjects
        Total number of objects
      • actualPairs

        private long actualPairs
        Pairs actually present in the data set
      • segments

        private java.util.TreeMap<Segment,Segment> segments
        The actual segments
    • Constructor Detail

      • Segments

        public Segments(java.util.List<Clustering<?>> clusterings)
        Initialize segments. Add DB objects via addObject method.
        clusterings - List of clusterings in comparison
    • Method Detail

      • recursivelyFill

        private void recursivelyFill(java.util.List<java.util.List<? extends Cluster<?>>> cs)
      • recursivelyFill

        private void recursivelyFill(java.util.List<java.util.List<? extends Cluster<?>>> cs,
                                     int depth,
                                     SetDBIDs first,
                                     SetDBIDs second,
                                     int[] path,
                                     boolean objectsegment)
      • makeOrUpdateSegment

        private void makeOrUpdateSegment(int[] path,
                                         DBIDs ids,
                                         int pairsize)
      • getClusteringDescription

        public java.lang.String getClusteringDescription(int clusteringID)
        Get the description of the nth clustering.
        clusteringID - Clustering number
        long name of clustering
      • getPairedSegments

        public java.util.List<Segment> getPairedSegments(Segment unpairedSegment)
        For a given segment with unpaired objects, returns the corresponding segments that give rise to it. That is, one cluster of a clustering is split by another clustering into multiple segments, which results in a segment with unpaired objects that describes the missing pairs between the parts of the split cluster.

        Only two clusterings are compared at a time. If those clusterings do not have the whole cluster in common, there are at least three segments (two clusters), one of which is the unpaired segment. A segment ID 3-0 describes cluster 3 in clustering 1 (index 0) together with all clusters 3-x in clustering 2, so the search covers all segments 3-x (0 being a wildcard).

        unpairedSegment - the segment with unpaired objects
        Segments describing the set of objects that result in an unpaired segment
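        The wildcard lookup described above can be sketched as follows. This is a minimal illustration, assuming segment IDs are stored as `int[]` paths in which 0 stands for "any cluster" and that only fully specified IDs are returned; the class `WildcardSketch` and its methods are hypothetical and are not taken from the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the documented class.
final class WildcardSketch {
  /** True if id agrees with pattern at every position where pattern is non-zero. */
  static boolean matches(int[] pattern, int[] id) {
    if (pattern.length != id.length) {
      return false;
    }
    for (int i = 0; i < pattern.length; i++) {
      if (pattern[i] != 0 && pattern[i] != id[i]) {
        return false;
      }
    }
    return true;
  }

  /** True if the ID contains at least one wildcard (0) entry. */
  static boolean hasWildcard(int[] id) {
    for (int v : id) {
      if (v == 0) {
        return true;
      }
    }
    return false;
  }

  /**
   * Collect all fully specified IDs covered by the unpaired pattern,
   * e.g. 3-1 and 3-2 for the pattern 3-0. Skipping IDs that themselves
   * contain a wildcard is an assumption of this sketch.
   */
  static List<int[]> pairedFor(int[] unpaired, List<int[]> all) {
    List<int[]> result = new ArrayList<>();
    for (int[] id : all) {
      if (!hasWildcard(id) && matches(unpaired, id)) {
        result.add(id);
      }
    }
    return result;
  }
}
```

        Calling `pairedFor(new int[] { 3, 0 }, all)` on a list holding 3-1, 3-2, 3-0 and 1-1 would select 3-1 and 3-2 under these assumptions.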
      • unifySegment

        public Segment unifySegment(Segment temp)
        temp - Temporary segment to be unified
        the segmentID given by its string representation
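        Since the `segments` field is a `TreeMap<Segment,Segment>`, one plausible reading of `unifySegment` is a canonicalizing lookup that maps a temporary object to the stored instance comparing as equal. The sketch below shows that pattern only. The `Key` class stands in for `Segment`, the `register` and `unify` methods are hypothetical, and returning the temporary object when nothing is stored is an assumption, not documented behavior.

```java
import java.util.Arrays;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the documented class.
final class UnifySketch {
  /** Stand-in for Segment: ordered by its cluster ID path only. */
  static final class Key implements Comparable<Key> {
    final int[] path;

    Key(int... path) {
      this.path = path.clone();
    }

    @Override
    public int compareTo(Key other) {
      // Lexicographic array comparison; Arrays.compare requires Java 9 or later.
      return Arrays.compare(path, other.path);
    }
  }

  /** Each key maps to itself, so a lookup yields the canonical instance. */
  private final TreeMap<Key, Key> known = new TreeMap<>();

  void register(Key key) {
    known.putIfAbsent(key, key);
  }

  /** Return the stored instance equal to temp, or temp itself if none exists (assumed fallback). */
  Key unify(Key temp) {
    Key found = known.get(temp);
    return found != null ? found : temp;
  }
}
```

        Mapping each key to itself is what makes the lookup useful here: a plain `TreeSet` could confirm that an equal element exists, but would not hand back the stored instance as directly.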
      • size

        public int size()
        Get the number of segments
        Number of segments
      • getPairCount

        public long getPairCount(boolean withUnclusteredPairs)
        Get the total number of pairs, with or without the unclustered pairs.
        withUnclusteredPairs - if false, the segment with unclustered pairs is excluded
        pair count, with or without unclustered (non-existent) pairs
      • getClusterings

        public int getClusterings()
        Get the number of clusterings
        number of clusterings compared
      • getTotalClusterCount

        public int getTotalClusterCount()
        Return the total number of clusters, summed over all clusterings
        sum of all cluster counts
      • getHighestClusterCount

        public int getHighestClusterCount()
        Returns the highest number of Clusters in the clusterings
        highest cluster count
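        The two cluster-count accessors above can be pictured as simple reductions over a per-clustering count array such as the `numclusters` field. The following is a minimal sketch under that assumption; the class `ClusterCountSketch` and its method names are hypothetical.

```java
// Hypothetical illustration only; not part of the documented class.
final class ClusterCountSketch {
  /** Sum of the cluster counts of all clusterings. */
  static int total(int[] numclusters) {
    int sum = 0;
    for (int n : numclusters) {
      sum += n;
    }
    return sum;
  }

  /** Largest cluster count among the clusterings (0 for an empty array). */
  static int highest(int[] numclusters) {
    int max = 0;
    for (int n : numclusters) {
      max = Math.max(max, n);
    }
    return max;
  }
}
```

        For three clusterings with 3, 5 and 2 clusters, `total` would give 10 and `highest` would give 5.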
      • iterator

        public java.util.Iterator<Segment> iterator()
        Specified by:
        iterator in interface java.lang.Iterable<Segment>
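        Because the class implements `java.lang.Iterable<Segment>`, it can be used directly in an enhanced for loop. The sketch below shows the general pattern with a stand-in class that keeps its elements in a `TreeMap` and iterates over the keys. The class `IterableSketch`, the use of `String` IDs in place of `Segment`, and the assumption that iteration follows the map's sort order are all illustrative.

```java
import java.util.Iterator;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the documented class.
final class IterableSketch implements Iterable<String> {
  /** Stand-in for the segments map; each ID maps to itself. */
  private final TreeMap<String, String> segments = new TreeMap<>();

  void add(String id) {
    segments.put(id, id);
  }

  /** Iterate over the stored IDs in the sort order of the map keys. */
  @Override
  public Iterator<String> iterator() {
    return segments.keySet().iterator();
  }
}
```

        A caller can then write `for (String id : sketch) { ... }` and, in this sketch, visits the IDs in ascending key order regardless of insertion order.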