Class AGNES<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    Algorithm, HierarchicalClusteringAlgorithm
    Direct Known Subclasses:
    Anderberg, NNChain

    @Reference(authors="L. Kaufman, P. J. Rousseeuw",title="Agglomerative Nesting (Program AGNES)",booktitle="Finding Groups in Data: An Introduction to Cluster Analysis",url="https://doi.org/10.1002/9780470316801.ch5",bibkey="doi:10.1002/9780470316801.ch5")
    @Reference(authors="P. H. Sneath",title="The application of computers to taxonomy",booktitle="Journal of general microbiology, 17(1)",url="https://doi.org/10.1099/00221287-17-1-201",bibkey="doi:10.1099/00221287-17-1-201")
    @Reference(authors="R. M. Cormack",title="A Review of Classification",booktitle="Journal of the Royal Statistical Society. Series A, Vol. 134, No. 3",url="https://doi.org/10.2307/2344237",bibkey="doi:10.2307/2344237")
    @Alias({"HAC","SAHN"})
    public class AGNES<O>
    extends java.lang.Object
    implements HierarchicalClusteringAlgorithm
    Hierarchical Agglomerative Clustering (HAC) or Agglomerative Nesting (AGNES) is a classic hierarchical clustering algorithm. Initially, each element is its own cluster; the closest clusters are merged at every step, until all the data has become a single cluster.

    This is the naive O(n³) algorithm. For single-linkage only, SLINK is a much faster O(n²) alternative.
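
    The merge loop described above can be sketched in plain Java. This is an illustration only, not the ELKI implementation: the class NaiveAgglomerative, its Merge type, and the use of a full square matrix are assumptions made for the example, and single linkage (the minimum) is hard-coded as the update rule for brevity. Each of the n-1 steps scans all remaining pairs, which is where the cubic run time comes from.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of naive agglomerative clustering (hypothetical, not ELKI code). */
public class NaiveAgglomerative {
  /** One merge step: the surviving cluster, the absorbed cluster, and the distance. */
  public static final class Merge {
    public final int a, b;
    public final double distance;

    Merge(int a, int b, double distance) {
      this.a = a;
      this.b = b;
      this.distance = distance;
    }
  }

  /**
   * Repeatedly merge the two closest clusters until only one remains.
   * Assumes finite distances; uses single linkage to update distances.
   *
   * @param dist symmetric n x n distance matrix (modified in place)
   * @return the n-1 merges in the order they were performed
   */
  public static List<Merge> cluster(double[][] dist) {
    final int n = dist.length;
    boolean[] absorbed = new boolean[n]; // true once merged into another cluster
    List<Merge> history = new ArrayList<>();
    for (int step = 1; step < n; step++) {
      // O(n^2) scan for the closest pair of active clusters.
      int bestA = -1, bestB = -1;
      double best = Double.POSITIVE_INFINITY;
      for (int a = 0; a < n; a++) {
        if (absorbed[a]) {
          continue;
        }
        for (int b = 0; b < a; b++) {
          if (!absorbed[b] && dist[a][b] < best) {
            best = dist[a][b];
            bestA = a;
            bestB = b;
          }
        }
      }
      history.add(new Merge(bestA, bestB, best));
      // Cluster bestA absorbs bestB; update its distance to every other cluster.
      absorbed[bestB] = true;
      for (int k = 0; k < n; k++) {
        if (absorbed[k] || k == bestA) {
          continue;
        }
        double d = Math.min(dist[bestA][k], dist[bestB][k]);
        dist[bestA][k] = d;
        dist[k][bestA] = d;
      }
    }
    return history;
  }
}
```

    For four points on a line at 0, 1, 5 and 6, the sketch first merges the two close pairs and then joins the resulting clusters in a third, final step.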

    This implementation uses the same pointer-based representation as SLINK, so the existing extraction algorithms work with the output of either.
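
    To illustrate what a pointer-based representation allows, the following sketch cuts such a hierarchy at a distance threshold. It assumes the SLINK-style encoding in which every object stores one parent pointer and the distance at which it joins that parent; the array names parent and height and the class PointerHierarchyCut are hypothetical and do not correspond to ELKI classes.

```java
/** Sketch: cutting a pointer-based hierarchy at a threshold (hypothetical, not ELKI code). */
public class PointerHierarchyCut {
  /**
   * @param parent parent[i] is the object that i is merged into (parent[i] == i for the root)
   * @param height height[i] is the distance at which i joins parent[i]
   * @param threshold merges above this distance are ignored
   * @return one cluster label per object (the smallest member index of its cluster)
   */
  public static int[] cut(int[] parent, double[] height, double threshold) {
    final int n = parent.length;
    int[] root = new int[n];
    for (int i = 0; i < n; i++) {
      root[i] = i; // every object starts as its own cluster
    }
    for (int i = 0; i < n; i++) {
      if (parent[i] != i && height[i] <= threshold) {
        union(root, i, parent[i]);
      }
    }
    int[] labels = new int[n];
    for (int i = 0; i < n; i++) {
      labels[i] = find(root, i);
    }
    return labels;
  }

  /** Find the representative of i, shortening the path along the way. */
  private static int find(int[] root, int i) {
    while (root[i] != i) {
      root[i] = root[root[i]]; // path halving
      i = root[i];
    }
    return i;
  }

  /** Join two sets; the smaller index always becomes the representative. */
  private static void union(int[] root, int a, int b) {
    int ra = find(root, a), rb = find(root, b);
    if (ra < rb) {
      root[rb] = ra;
    } else {
      root[ra] = rb;
    }
  }
}
```

    Because only the two arrays are needed, the same extraction step works regardless of which algorithm produced them.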

    The algorithm is believed to have been first published (for single-linkage) by:

    P. H. Sneath
    The application of computers to taxonomy
    Journal of General Microbiology, 17(1).

    This algorithm is also known as AGNES (Agglomerative Nesting), where the use of alternative linkage criteria is discussed:

    L. Kaufman, P. J. Rousseeuw
    Agglomerative Nesting (Program AGNES),
    in Finding Groups in Data: An Introduction to Cluster Analysis

    Reference for the unified concept:

    G. N. Lance, W. T. Williams
    A general theory of classificatory sorting strategies 1. Hierarchical systems
    The Computer Journal 9.4 (1967): 373-380.
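
    The unified concept of Lance and Williams expresses the distance from a freshly merged cluster to any other cluster as a weighted combination of the three distances already known. A minimal sketch follows; the class LanceWilliams and its method names are hypothetical and are not the ELKI Linkage interface. The coefficients shown for single, complete and group-average linkage are the standard textbook values.

```java
/** Sketch of the Lance-Williams update formula (hypothetical, not the ELKI Linkage API). */
public class LanceWilliams {
  /**
   * Distance from the cluster obtained by merging i and j to another cluster k:
   * alphaI * d(i,k) + alphaJ * d(j,k) + beta * d(i,j) + gamma * |d(i,k) - d(j,k)|
   */
  public static double update(double alphaI, double alphaJ, double beta, double gamma,
      double dik, double djk, double dij) {
    return alphaI * dik + alphaJ * djk + beta * dij + gamma * Math.abs(dik - djk);
  }

  /** Single linkage: the coefficients reduce the formula to min(d(i,k), d(j,k)). */
  public static double single(double dik, double djk, double dij) {
    return update(0.5, 0.5, 0.0, -0.5, dik, djk, dij);
  }

  /** Complete linkage: the coefficients reduce the formula to max(d(i,k), d(j,k)). */
  public static double complete(double dik, double djk, double dij) {
    return update(0.5, 0.5, 0.0, 0.5, dik, djk, dij);
  }

  /** Group average (UPGMA): weights proportional to the cluster sizes. */
  public static double groupAverage(int sizeI, int sizeJ, double dik, double djk, double dij) {
    double total = sizeI + sizeJ;
    return update(sizeI / total, sizeJ / total, 0.0, 0.0, dik, djk, dij);
  }
}
```

    Expressing a linkage through these four coefficients is what lets one merge loop serve many linkage criteria.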

    See also:

    R. M. Cormack
    A Review of Classification
    Journal of the Royal Statistical Society. Series A, Vol. 134, No. 3

    Since:
    0.6.0
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger
      • distance

        protected Distance<? super O> distance
        Distance function used.
      • linkage

        protected Linkage linkage
        Current linkage method in use.
    • Constructor Detail

      • AGNES

        public AGNES​(Distance<? super O> distance,
                     Linkage linkage)
        Constructor.
        Parameters:
        distance - Distance function to use
        linkage - Linkage method
    • Method Detail

      • run

        public ClusterMergeHistory run​(Relation<O> relation)
        Run the algorithm.
        Parameters:
        relation - Data relation to cluster
        Returns:
        Cluster merge history describing the clustering hierarchy
      • initializeDistanceMatrix

        protected static ClusterDistanceMatrix initializeDistanceMatrix​(ArrayDBIDs ids,
                                                                        DistanceQuery<?> dq,
                                                                        Linkage linkage)
        Initialize a distance matrix.
        Parameters:
        ids - Object ids
        dq - Distance query
        linkage - Linkage method
        Returns:
        cluster distance matrix
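
        As a rough illustration of what initializing such a matrix involves, the sketch below stores all pairwise distances in a packed lower-triangular array. The layout, the class TriangularDistances, and the hard-coded Euclidean distance are assumptions made for the example; they are not taken from ClusterDistanceMatrix, and the role of the linkage argument is not covered here.

```java
/** Sketch of a packed lower-triangular distance matrix (layout is an assumption). */
public class TriangularDistances {
  /** Number of stored entries for n objects: all pairs (x, y) with x > y. */
  public static int triangleSize(int n) {
    return n * (n - 1) / 2;
  }

  /** Position of pair (x, y) in the packed array; requires x > y >= 0. */
  public static int index(int x, int y) {
    return x * (x - 1) / 2 + y;
  }

  /** Fill the packed matrix with pairwise Euclidean distances. */
  public static double[] initialize(double[][] points) {
    final int n = points.length;
    double[] matrix = new double[triangleSize(n)];
    int pos = 0;
    for (int x = 1; x < n; x++) {
      for (int y = 0; y < x; y++) {
        // pos equals index(x, y) because rows are written in order.
        matrix[pos++] = euclidean(points[x], points[y]);
      }
    }
    return matrix;
  }

  /** Euclidean distance between two vectors of equal length. */
  private static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }
}
```

        Storing only the pairs with x > y halves the memory of a square matrix, which matters because the matrix grows quadratically with the number of objects.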
      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction