Class AutotuningPCA


  • @Reference(authors="Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek",
               title="A General Framework for Increasing the Robustness of PCA-based Correlation Clustering Algorithms",
               booktitle="Proc. 20th Intl. Conf. on Scientific and Statistical Database Management (SSDBM)",
               url="https://doi.org/10.1007/978-3-540-69497-7_27",
               bibkey="DBLP:conf/ssdbm/KriegelKSZ08")
    public class AutotuningPCA
    extends PCARunner
    Performs a self-tuning local PCA based on the covariance matrices of given objects. At most the closest 'k' points are used in the calculation and a weight function is applied.

    The number of points actually used is chosen automatically: it is the neighborhood size at which the strong eigenvectors exhibit the clearest correlation.

    Reference:

    A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms
    Hans-Peter Kriegel and Peer Kröger and Erich Schubert and Arthur Zimek
    Proc. 20th Int. Conf. on Scientific and Statistical Database Management (SSDBM)

    Since:
    0.5.0
    Author:
    Erich Schubert
    • Constructor Detail

      • AutotuningPCA

        public AutotuningPCA(CovarianceMatrixBuilder covarianceMatrixBuilder,
                             EigenPairFilter filter)
        Constructor.
        Parameters:
        covarianceMatrixBuilder - Covariance matrix builder
        filter - Filter to select eigenvectors
    • Method Detail

      • processIds

        public PCAResult processIds(DBIDs ids,
                                    Relation<? extends NumberVector> database)
        Description copied from class: PCARunner
        Run PCA on a collection of database IDs.
        Overrides:
        processIds in class PCARunner
        Parameters:
        ids - a collection of ids
        database - the database used
        Returns:
        PCA result
      • reversed

        private static double[] reversed(double[] a)
        Sort an array of doubles in descending order.
        Parameters:
        a - Values
        Returns:
        Values in descending order
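        A descending sort of a primitive double[] is commonly done by sorting ascending and then reversing in place. The following is a minimal sketch of that idea, not the library's actual implementation; the class name DescendingSortSketch is hypothetical, and sorting the input array in place is an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the documented API.
final class DescendingSortSketch {
  /** Sorts the given array in place into descending order and returns it. */
  static double[] reversed(double[] a) {
    Arrays.sort(a); // ascending order first
    // Swap elements from both ends toward the middle.
    for (int i = 0, j = a.length - 1; i < j; i++, j--) {
      double tmp = a[i];
      a[i] = a[j];
      a[j] = tmp;
    }
    return a;
  }

  public static void main(String[] args) {
    double[] values = { 0.5, 3.0, 1.25 };
    System.out.println(Arrays.toString(reversed(values))); // [3.0, 1.25, 0.5]
  }
}
```

        Sorting eigenvalues this way puts the largest (strongest) ones first, which is the order a prefix-based filter would expect.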
      • computeExplainedVariance

        private double computeExplainedVariance(double[] eigenValues,
                                                int filteredEigenPairs)
        Compute the variance explained by the filtered eigenpairs.
        Parameters:
        eigenValues - Eigenvalues
        filteredEigenPairs - Number of filtered (strong) eigenpairs
        Returns:
        Variance explained by the strong eigenvectors.
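        As a rough illustration, explained variance is usually the share of the total eigenvalue sum contributed by the strong eigenvalues. The sketch below assumes the eigenvalues are sorted in descending order and that the second argument counts the leading (strong) ones; the class and method names are hypothetical and the library's own computation may differ.

```java
// Hypothetical illustration class; not part of the documented API.
final class ExplainedVarianceSketch {
  /**
   * Fraction of the total variance covered by the first {@code strong}
   * eigenvalues. Assumes eigenValues is sorted in descending order.
   */
  static double explainedVariance(double[] eigenValues, int strong) {
    double strongSum = 0., total = 0.;
    for (int i = 0; i < eigenValues.length; i++) {
      total += eigenValues[i];
      if (i < strong) {
        strongSum += eigenValues[i];
      }
    }
    return strongSum / total;
  }

  public static void main(String[] args) {
    double[] eigenValues = { 4., 3., 2., 1. };
    // The two strongest eigenvalues cover (4 + 3) / 10 of the variance.
    System.out.println(explainedVariance(eigenValues, 2)); // 0.7
  }
}
```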
      • assertSortedByDistance

        private void assertSortedByDistance(DoubleDBIDList results)
        Ensure that the results are sorted by distance.
        Parameters:
        results - Results to process
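        A sortedness check of this kind can be sketched as a single pass that compares each distance with its predecessor. The example below uses a plain double[] of distances as a stand-in for DoubleDBIDList, and the class and method names are hypothetical; what the documented method does when the input is not sorted is not stated here, so the sketch only reports the result.

```java
// Hypothetical illustration class; a double[] stands in for DoubleDBIDList.
final class SortedByDistanceSketch {
  /** Returns true if the distances are in non-decreasing order. */
  static boolean isSortedByDistance(double[] distances) {
    for (int i = 1; i < distances.length; i++) {
      if (distances[i] < distances[i - 1]) {
        return false; // a closer point appears after a farther one
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isSortedByDistance(new double[] { 0.0, 0.5, 0.5, 2.0 })); // true
    System.out.println(isSortedByDistance(new double[] { 0.0, 2.0, 0.5 })); // false
  }
}
```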