Class FuzzyCMeans<V extends NumberVector>

  • Type Parameters:
    V - Vector Type of the data, must be subclass of NumberVector
    All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<MeanModel>>

    @Reference(authors="J. C. Dunn",
               title="A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters",
               booktitle="Journal of Cybernetics 3(3)",
               url="https://doi.org/10.1080/01969727308546046",
               bibkey="doi:10.1080/01969727308546046")
    @Reference(authors="J. Bezdek",
               title="Pattern Recognition With Fuzzy Objective Function Algorithms",
               booktitle="Pattern Recognition With Fuzzy Objective Function Algorithms",
               url="https://doi.org/10.1007/978-1-4757-0450-1",
               bibkey="DBLP:books/sp/Bezdek81")
    public class FuzzyCMeans<V extends NumberVector>
    extends java.lang.Object
    implements ClusteringAlgorithm<Clustering<MeanModel>>
    Fuzzy clustering, developed by Dunn and revisited by Bezdek.

    It minimizes the sum, over all point-cluster pairs, of the squared distance between the point and the cluster mean, multiplied by the assignment weight raised to the power m.

    Reference:

    J. C. Dunn
    A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters
    Journal of Cybernetics 3(3)

    J. Bezdek
    Pattern Recognition With Fuzzy Objective Function Algorithms
    Springer, 1981

    Since:
    0.8.0
    Author:
    Robert Gehde
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  FuzzyCMeans.Par
      Parameterization class.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private double delta
      Delta parameter (stopping threshold)
      (package private) KMeansInitialization initializer
      Produces the initial cluster centers.
      private int k
      Number of clusters
      private static java.lang.String KEY
      Key for statistics logging.
      private static Logging LOG
      The logger for this class.
      private double m
      Weight exponent
      private int maxiter
      Maximum number of iterations to allow
      private int miniter
      Minimum number of iterations to do
      private boolean soft
      Retain soft assignments.
      static SimpleTypeInformation<double[]> SOFT_TYPE
      Soft assignment result type.
    • Constructor Summary

      Constructors 
      Constructor Description
      FuzzyCMeans​(int k, int miniter, int maxiter, double delta, double m, boolean soft, KMeansInitialization initialization)
      Constructor.
    • Field Detail

      • LOG

        private static final Logging LOG
        The logger for this class.
      • KEY

        private static final java.lang.String KEY
        Key for statistics logging.
      • k

        private int k
        Number of clusters
      • m

        private double m
        Weight exponent
      • delta

        private double delta
        Delta parameter (stopping threshold)
      • miniter

        private int miniter
        Minimum number of iterations to do
      • maxiter

        private int maxiter
        Maximum number of iterations to allow
      • soft

        private boolean soft
        Retain soft assignments.
    • Constructor Detail

      • FuzzyCMeans

        public FuzzyCMeans​(int k,
                           int miniter,
                           int maxiter,
                           double delta,
                           double m,
                           boolean soft,
                           KMeansInitialization initialization)
        Constructor.
        Parameters:
        k - number of clusters
        miniter - minimum iterations
        maxiter - maximum iterations
        delta - stopping threshold
        m - weight exponent
        soft - whether to retain the soft assignments
        initialization - initial cluster centers
    • Method Detail

      • run

        public Clustering<MeanModel> run​(Relation<V> relation)
        Runs Fuzzy C-Means clustering on the given Relation.
        Parameters:
        relation - data to cluster
        Returns:
        the resulting Clustering
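
How the constructor parameters miniter, maxiter, delta and soft might steer a run can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class FuzzyLoopSketch, its methods, and the exact stopping rule (stop once the reported change falls below delta, but not before miniter iterations and never after maxiter) are assumptions made for illustration. The hard assignment by largest weight is likewise only one plausible way to derive crisp clusters from soft weights.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of iteration control; not the ELKI code. */
final class FuzzyLoopSketch {
  /**
   * Calls step (one update round that reports its change) until the change
   * drops below delta after at least miniter rounds, or maxiter is reached.
   * Returns the number of rounds performed. The stopping rule is an assumption.
   */
  static int iterate(DoubleSupplier step, int miniter, int maxiter, double delta) {
    int iteration = 0;
    while (iteration < maxiter) {
      double change = step.getAsDouble();
      iteration++;
      if (iteration >= miniter && change < delta) {
        break;
      }
    }
    return iteration;
  }

  /** Picks, for every point, the cluster with the largest weight (ties: lowest index). */
  static int[] hardAssignment(double[][] weights) {
    int[] labels = new int[weights.length];
    for (int i = 0; i < weights.length; i++) {
      int best = 0;
      for (int c = 1; c < weights[i].length; c++) {
        if (weights[i][c] > weights[i][best]) {
          best = c;
        }
      }
      labels[i] = best;
    }
    return labels;
  }
}
```

In this sketch, a larger miniter forces additional rounds even when the change is already small, and maxiter bounds the run when the change never falls below delta.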
      • updateMeans

        private double updateMeans​(Relation<V> relation,
                                   WritableDataStore<double[]> probClusterIGivenX,
                                   double[][] means,
                                   int d)
        Updates each mean to the weighted mean of all data points and returns the objective function value \[ \sum_k \sum_i (u_{ik}^m \cdot d_{ik}^2) \]
        Parameters:
        relation - data points
        probClusterIGivenX - weights of clusters
        means - destination array for means
        d - dimensionality of the data
        Returns:
        objective function value
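
The weighted-mean update and the objective stated above can be sketched on plain arrays. This is an illustration under assumptions, not the ELKI code: the class FuzzyMeansSketch and its methods are hypothetical, arrays stand in for Relation and WritableDataStore, the distance is assumed to be squared Euclidean, and the objective is assumed to be evaluated against the freshly updated means. In the sketch, u[i][c] is the weight of point i for cluster c.

```java
/** Hypothetical sketch of a weighted-mean update; not the ELKI code. */
final class FuzzyMeansSketch {
  /** Squared Euclidean distance (an assumed choice of distance). */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int j = 0; j < a.length; j++) {
      double diff = a[j] - b[j];
      sum += diff * diff;
    }
    return sum;
  }

  /**
   * Overwrites means[c] with the u^m-weighted mean of all points, then returns
   * the sum over all points and clusters of u^m times the squared distance to
   * the new means. Assumes every cluster has a positive total weight.
   */
  static double updateMeans(double[][] points, double[][] u, double[][] means, double m) {
    final int d = points[0].length;
    for (int c = 0; c < means.length; c++) {
      double[] sum = new double[d];
      double weightSum = 0;
      for (int i = 0; i < points.length; i++) {
        double w = Math.pow(u[i][c], m);
        weightSum += w;
        for (int j = 0; j < d; j++) {
          sum[j] += w * points[i][j];
        }
      }
      for (int j = 0; j < d; j++) {
        means[c][j] = sum[j] / weightSum;
      }
    }
    double objective = 0;
    for (int i = 0; i < points.length; i++) {
      for (int c = 0; c < means.length; c++) {
        objective += Math.pow(u[i][c], m) * squaredDistance(points[i], means[c]);
      }
    }
    return objective;
  }
}
```

With crisp weights (each point fully assigned to one cluster), this reduces to the ordinary k-means mean update and its sum of squared errors.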
      • assignProbabilitiesToInstances

        public double assignProbabilitiesToInstances​(Relation<V> relation,
                                                     double[][] centers,
                                                     WritableDataStore<double[]> probClusterIGivenX)
        Calculates the weights of all points and clusters. Because the weights add up to one for each point, they can be interpreted as cluster probabilities \(P(c_i|x_j)\). Returns the squared Frobenius norm of the difference between the new weight matrix and the previously calculated one, normalized by the number of data points \(N\) and the number of clusters \(k\): \[ \frac{1}{Nk} \sum_i \sum_j (w_{ij} - w^\prime_{ij})^2 \]
        Parameters:
        relation - data points
        centers - current cluster centers
        probClusterIGivenX - destination datastore for probabilities/weights
        Returns:
        normalized squared Frobenius norm of the difference between the previous and the current weight matrix
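
A weight computation of this kind can be sketched with the textbook fuzzy c-means membership rule, where each weight is proportional to the squared distance raised to the power -1/(m-1). This is a sketch under assumptions, not the ELKI code: the class FuzzyWeightsSketch is hypothetical, arrays replace Relation and WritableDataStore, the distance is assumed to be squared Euclidean, m is assumed to be greater than 1, and the handling of points that coincide with a center is one possible convention.

```java
/** Hypothetical sketch of a membership update; not the ELKI code. */
final class FuzzyWeightsSketch {
  /** Squared Euclidean distance (an assumed choice of distance). */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int j = 0; j < a.length; j++) {
      double diff = a[j] - b[j];
      sum += diff * diff;
    }
    return sum;
  }

  /**
   * Recomputes u[i][c] so that each row sums to one, and returns the squared
   * difference to the previous weights, divided by (points * clusters).
   * Requires m > 1.
   */
  static double assignWeights(double[][] points, double[][] centers, double[][] u, double m) {
    final int k = centers.length;
    final double exponent = -1.0 / (m - 1.0);
    double change = 0;
    for (int i = 0; i < points.length; i++) {
      double[] raw = new double[k];
      int zeros = 0;
      for (int c = 0; c < k; c++) {
        raw[c] = squaredDistance(points[i], centers[c]);
        if (raw[c] == 0) {
          zeros++;
        }
      }
      double total = 0;
      for (int c = 0; c < k; c++) {
        // A point lying exactly on one or more centers is shared among them only.
        raw[c] = zeros > 0 ? (raw[c] == 0 ? 1.0 : 0.0) : Math.pow(raw[c], exponent);
        total += raw[c];
      }
      for (int c = 0; c < k; c++) {
        double updated = raw[c] / total;
        double diff = updated - u[i][c];
        change += diff * diff;
        u[i][c] = updated;
      }
    }
    return change / (points.length * (double) k);
  }
}
```

Calling the method twice with unchanged centers yields a change of zero on the second call, which is what makes the returned value usable as a convergence measure.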
      • distance

        private double distance​(V v1,
                                double[] v2)
        Distance computation.
        Parameters:
        v1 - Data vector
        v2 - cluster mean
        Returns:
        Distance
      • distance

        private double distance​(double[] v1,
                                double[] v2)
        Distance computation.
        Parameters:
        v1 - Data vector
        v2 - cluster mean
        Returns:
        Distance
      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction