Class CompleteLinkage

  • All Implemented Interfaces:
    Linkage

    @Alias({"complete","clink","complete-link","farthest-neighbor"})
    @Priority(100)
    @Reference(authors="T. S\u00f8rensen",title="A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons",booktitle="Kongelige Danske Videnskabernes Selskab, Biologiske Skrifter 5(4)",bibkey="journals/misc/Sorensen48") @Reference(authors="P. Macnaughton-Smith",title="Some statistical and other numerical techniques for classifying individuals",booktitle="Home Office Res. Rpt. No. 6, H.M.S.O., London",bibkey="journals/misc/MacnaughtonSmith65") @Reference(authors="G. N. Lance, W. T. Williams",title="A general theory of classificatory sorting strategies 1. Hierarchical systems",booktitle="The Computer Journal 9.4",url="https://doi.org/10.1093/comjnl/9.4.373",bibkey="doi:10.1093/comjnl/9.4.373") @Reference(authors="S. C. Johnson",title="Hierarchical clustering schemes",booktitle="Psychometrika 32",url="https://doi.org/10.1007/BF02289588",bibkey="doi:10.1007/BF02289588")
    public class CompleteLinkage
    extends java.lang.Object
    implements Linkage
    Complete-linkage ("maximum linkage") clustering method.

    The distance of two clusters is simply the maximum of all pairwise distances between the two clusters.

    The distance of two clusters is defined as: \[d_{\max}(A,B):=\max_{a\in A}\max_{b\in B} d(a,b)\]

    This can be computed recursively using: \[d_{\max}(A\cup B,C) = \max(d(A,C), d(B,C))\]

    Note that with similarity functions, one would need to use the minimum instead to get the same effect.

    The algorithm CLINK is a faster algorithm to find such clusterings, but it is very much order dependent and tends to find worse solutions.

    References:

    This is attributed to different sources that are not easily verifiable. Lance and Williams (1967) attribute the idea to Macnaughton-Smith, albeit he may have suggested a divisive rather than agglomerative procedure, but Sørensen may have used this already in 1948 (and is credited, e.g., by Johnson 1967).

    T. Sørensen
    A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons
    Kongelige Danske Videnskabernes Selskab, Biologiske Skrifter 5(4)

    P. Macnaughton-Smith
    Some statistical and other numerical techniques for classifying individuals
    Home Office Res. Rpt. No. 6, H.M.S.O., London

    G. N. Lance, W. T. Williams
    A general theory of classificatory sorting strategies
    1. Hierarchical systems
    The Computer Journal 9.4

    S. C. Johnson
    Hierarchical clustering schemes
    Psychometrika 32

    Since:
    0.6.0
    Author:
    Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  CompleteLinkage.Par
      Class parameterizer.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static CompleteLinkage STATIC
      Static instance of class.
    • Constructor Summary

      Constructors 
      Constructor Description
      CompleteLinkage()
      Deprecated.
      use the static instance STATIC instead.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      double combine​(int sizex, double dx, int sizey, double dy, int sizej, double dxy)
      Compute combined linkage for two clusters.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • STATIC

        public static final CompleteLinkage STATIC
        Static instance of class.
    • Constructor Detail

      • CompleteLinkage

        @Deprecated
        public CompleteLinkage()
        Deprecated.
        use the static instance STATIC instead.
        Constructor.
    • Method Detail

      • combine

        public double combine​(int sizex,
                              double dx,
                              int sizey,
                              double dy,
                              int sizej,
                              double dxy)
        Description copied from interface: Linkage
        Compute combined linkage for two clusters.
        Specified by:
        combine in interface Linkage
        Parameters:
        sizex - Size of first cluster x before merging
        dx - Distance of cluster x to j before merging
        sizey - Size of second cluster y before merging
        dy - Distance of cluster y to j before merging
        sizej - Size of candidate cluster j
        dxy - Distance between clusters x and y before merging
        Returns:
        Combined distance