Class SupportVectorClustering

  • All Implemented Interfaces:
    Algorithm, ClusteringAlgorithm<Clustering<? extends Model>>

    @Reference(authors="A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik",title="A Support Vector Method for Clustering",booktitle="Neural Information Processing Systems",url="https://proceedings.neurips.cc/paper/2000/hash/14cfdb59b5bda1fc245aadae15b1984a-Abstract.html",bibkey="DBLP:conf/nips/Ben-HurHSV00")
    @Reference(authors="A. Ben-Hur, H. T. Siegelmann, D. Horn, V. Vapnik",title="A Support Vector Clustering Method",booktitle="International Conference on Pattern Recognition (ICPR)",url="https://doi.org/10.1109/ICPR.2000.906177",bibkey="DBLP:conf/icpr/Ben-HurSHV00")
    @Reference(authors="A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik",title="Support Vector Clustering",booktitle="Journal of Machine Learning Research",url="http://jmlr.org/papers/v2/horn01a.html",bibkey="DBLP:journals/jmlr/Ben-HurHSV01")
    public class SupportVectorClustering
    extends java.lang.Object
    implements ClusteringAlgorithm<Clustering<? extends Model>>
    Support Vector Clustering builds on SVDD (Support Vector Data Description), which tries to find the smallest sphere enclosing all objects in kernel space. SupportVectorClustering then checks whether the line between two data points stays inside that sphere in kernel space. Clusters are the sets of points connected by such enclosed lines.
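
The last step of the description above (grouping points that are connected by enclosed lines) can be sketched as a connected-components pass. This is a minimal illustration, not the ELKI implementation: the class ConnectedLabelingSketch, the PairTest interface and the label method are hypothetical names, and the pairwise test is assumed to be supplied by the caller (for example, a sampled line check against the kernel-space sphere).

```java
/** Hypothetical helper, not part of ELKI: labels points by connected components. */
final class ConnectedLabelingSketch {
  /** Assumed pairwise test: true if points i and j are directly connected. */
  interface PairTest {
    boolean test(int i, int j);
  }

  /**
   * Returns one label per point; points in the same component share the
   * smallest index of that component as their label.
   */
  static int[] label(int n, PairTest connected) {
    int[] parent = new int[n];
    for (int i = 0; i < n; i++) {
      parent[i] = i; // every point starts as its own component
    }
    for (int i = 0; i < n; i++) {
      for (int j = i + 1; j < n; j++) {
        int ri = find(parent, i), rj = find(parent, j);
        // Only run the (potentially expensive) test if the two points
        // are not already known to be in the same component.
        if (ri != rj && connected.test(i, j)) {
          parent[Math.max(ri, rj)] = Math.min(ri, rj); // smaller index stays root
        }
      }
    }
    int[] labels = new int[n];
    for (int i = 0; i < n; i++) {
      labels[i] = find(parent, i);
    }
    return labels;
  }

  /** Union-find lookup with path halving. */
  private static int find(int[] parent, int i) {
    while (parent[i] != i) {
      parent[i] = parent[parent[i]];
      i = parent[i];
    }
    return i;
  }
}
```

Connectivity is transitive here: two points end up in the same cluster if a chain of directly connected points links them, even when the direct line between them leaves the sphere.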

    References:

    A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik
    A Support Vector Method for Clustering
    Neural Information Processing Systems

    A. Ben-Hur, H. T. Siegelmann, D. Horn, V. Vapnik
    A Support Vector Clustering Method
    International Conference on Pattern Recognition (ICPR)

    A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik
    Support Vector Clustering
    Journal of Machine Learning Research 2 (2001)

    Since:
    0.8.0
    Author:
    Robert Gehde
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • C

        double C
        C parameter of the underlying SVDD model.
      • lsz

        int lsz
        Number of sample points for the line check (lsz - 1 samples lie between the two points, the last sample lies on the point itself).
    • Constructor Detail

      • SupportVectorClustering

        public SupportVectorClustering(PrimitiveSimilarity<? super NumberVector> kernel,
                                       double C)
        Constructor.
        Parameters:
        kernel - kernel to use
        C - C parameter of the underlying SVDD model
    • Method Detail

      • checkConnectivity

        private boolean checkConnectivity(Relation<NumberVector> relation,
                                          double[] start,
                                          DBIDRef destRef,
                                          RegressionModel model,
                                          double fixed,
                                          ArrayDBIDs ids,
                                          SimilarityQuery<NumberVector> sim,
                                          double r_squared)
        Checks if the connecting line between start and dest lies inside the kernel space sphere.
        Parameters:
        relation - database
        start - start vector as array
        destRef - dest vector as DBIDRef
        model - model containing sphere
        fixed - fixed part of evaluation
        ids - ArrayDBIDs used to train model
        sim - Similarity Query used to train model
        r_squared - squared radius of trained model
        Returns:
        true if connected, false if not
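
A sampled line check of this kind can be sketched as follows. This is an illustration under stated assumptions, not the ELKI code: the class LineCheckSketch and the method lineInside are hypothetical, the samples are assumed to be evenly spaced with the last one on the destination point, and the inside test is passed in as a plain predicate instead of a trained model.

```java
import java.util.function.Predicate;

/** Hypothetical helper, not part of ELKI: sampled line check between two points. */
final class LineCheckSketch {
  /**
   * Tests `samples` evenly spaced points on the segment from start to dest.
   * The first samples - 1 points lie strictly between the end points and the
   * last one is dest itself (an assumption about the sampling scheme).
   */
  static boolean lineInside(double[] start, double[] dest, int samples, Predicate<double[]> inside) {
    double[] cur = new double[start.length]; // reused buffer, overwritten per sample
    for (int k = 1; k <= samples; k++) {
      double t = (double) k / samples; // t runs over (0, 1]
      for (int d = 0; d < start.length; d++) {
        cur[d] = start[d] + t * (dest[d] - start[d]);
      }
      if (!inside.test(cur)) {
        return false; // one sample outside the region is enough to reject
      }
    }
    return true;
  }
}
```

Because only a finite number of points is tested, a small sample count can miss a short stretch of the line that leaves the sphere; a larger count costs more kernel evaluations per pair.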
      • accept

        private boolean accept(NumberVector cur,
                               RegressionModel model,
                               double fixed,
                               ArrayDBIDs ids,
                               SimilarityQuery<NumberVector> sim,
                               double r_square)
        Evaluates whether a point cur is inside the sphere in kernel space.
        Parameters:
        cur - point to evaluate
        model - Model to check the point in
        fixed - fixed part of calculation
        ids - IDs used for access
        sim - kernel similarity query
        r_square - squared radius
        Returns:
        true iff point is inside sphere
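
For orientation, the sphere test from the cited Support Vector Clustering papers compares the squared kernel-space distance to the sphere center, K(x,x) - 2 * sum_j beta_j K(x_j,x) + sum_i sum_j beta_i beta_j K(x_i,x_j), against the squared radius. The sketch below follows that formula. It is not the ELKI code: the class SphereTestSketch, the Kernel interface, and the plain arrays for training points and coefficients are hypothetical stand-ins for the model, ids and sim parameters, and treating the boundary as inside (<=) is an assumption.

```java
/** Hypothetical helper, not part of ELKI: kernel-space sphere membership test. */
final class SphereTestSketch {
  /** Assumed kernel abstraction. */
  interface Kernel {
    double k(double[] a, double[] b);
  }

  /**
   * @param x        point to evaluate
   * @param points   training points with non-zero coefficients (stand-in for the model)
   * @param beta     coefficient per training point
   * @param fixed    point-independent term sum_i sum_j beta_i beta_j K(x_i, x_j)
   * @param rSquared squared radius of the sphere
   */
  static boolean inside(double[] x, double[][] points, double[] beta, Kernel kernel, double fixed, double rSquared) {
    double cross = 0.0;
    for (int j = 0; j < points.length; j++) {
      cross += beta[j] * kernel.k(points[j], x); // sum_j beta_j K(x_j, x)
    }
    double distSquared = kernel.k(x, x) - 2.0 * cross + fixed;
    return distSquared <= rSquared; // boundary counted as inside (assumption)
  }
}
```

Passing the point-independent term in as a parameter means it is computed once and reused for every evaluated point.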
      • calcfixedpart

        private double calcfixedpart(RegressionModel model,
                                     ArrayDBIDs ids,
                                     SimilarityQuery<NumberVector> sim)
        Calculates the fixed part of the model evaluation.
        Parameters:
        model - model to calculate the fixed part for
        ids - IDs for access
        sim - kernel similarity query
        Returns:
        fixed part of evaluation
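
The documentation does not spell out what the fixed part contains. A plausible reading, taken from the distance formula in the cited papers and labeled here as an assumption, is the term that does not depend on the evaluated point: sum_i sum_j beta_i beta_j K(x_i,x_j). The sketch below computes that term. The class FixedPartSketch, the Kernel interface and the array parameters are hypothetical and do not mirror the ELKI signature.

```java
/** Hypothetical helper, not part of ELKI: point-independent term of the sphere distance. */
final class FixedPartSketch {
  /** Assumed kernel abstraction. */
  interface Kernel {
    double k(double[] a, double[] b);
  }

  /** Computes sum_i sum_j beta_i * beta_j * K(x_i, x_j) over all training points. */
  static double fixedPart(double[][] points, double[] beta, Kernel kernel) {
    double sum = 0.0;
    for (int i = 0; i < points.length; i++) {
      for (int j = 0; j < points.length; j++) {
        sum += beta[i] * beta[j] * kernel.k(points[i], points[j]);
      }
    }
    return sum;
  }
}
```

This double loop is quadratic in the number of points with non-zero coefficients, which is why evaluating it once up front pays off when many points are tested afterwards.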
      • getInputTypeRestriction

        public TypeInformation[] getInputTypeRestriction()
        Description copied from interface: Algorithm
        Get the input type restriction used for negotiating the data query.
        Specified by:
        getInputTypeRestriction in interface Algorithm
        Returns:
        Type restriction