Class IntrinsicNearestNeighborAffinityMatrixBuilder<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    AffinityMatrixBuilder<O>

    @Title("Intrinsic t-Stochastic Neighbor Embedding")
    @Reference(authors="Erich Schubert, Michael Gertz",
               title="Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection: A Remedy Against the Curse of Dimensionality?",
               booktitle="Proc. Int. Conf. Similarity Search and Applications, SISAP 2017",
               url="https://doi.org/10.1007/978-3-319-68474-1_13",
               bibkey="DBLP:conf/sisap/SchubertG17")
    public class IntrinsicNearestNeighborAffinityMatrixBuilder<O>
    extends NearestNeighborAffinityMatrixBuilder<O>
    Builds a sparse affinity matrix using only the nearest neighbors, adjusting for intrinsic dimensionality. On data sets with high intrinsic dimensionality, this can give better results than the unadjusted NearestNeighborAffinityMatrixBuilder.

    Furthermore, this approach uses a different rule to combine affinities: rather than taking the arithmetic average of \(p_{ij}\) and \(p_{ji}\), we use \(\sqrt{p_{ij} \cdot p_{ji}}\), which prevents outliers from attaching closely to nearby clusters.
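    The effect of the two combination rules can be illustrated with a minimal sketch. The class and method names below are hypothetical and not part of ELKI; the sketch only compares the arithmetic and geometric mean of a pair of directed affinities, and ignores any further normalization the real implementation may apply.

```java
/** Hypothetical helper, not part of ELKI: compares two symmetrization rules. */
final class SymmetrizationSketch {
  /** Arithmetic average of the two directed affinities. */
  static double arithmetic(double pij, double pji) {
    return 0.5 * (pij + pji);
  }

  /** Geometric mean of the two directed affinities, sqrt(p_ij * p_ji). */
  static double geometric(double pij, double pji) {
    return Math.sqrt(pij * pji);
  }

  public static void main(String[] args) {
    // An outlier i treats cluster member j as a close neighbor (large p_ij),
    // but j has many closer neighbors and barely notices i (tiny p_ji).
    double pij = 0.5, pji = 1e-6;
    System.out.println("arithmetic: " + arithmetic(pij, pji));
    System.out.println("geometric:  " + geometric(pij, pji));
  }
}
```

    With the arithmetic average, a one-sided affinity still contributes about half of its value; with the geometric mean, the combined affinity stays small unless both directions agree, which is why an outlier is pulled less strongly towards a nearby cluster.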

    Reference:

    Erich Schubert, Michael Gertz
    Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection: A Remedy Against the Curse of Dimensionality?
    Proc. Int. Conf. Similarity Search and Applications, SISAP 2017

    Since:
    0.7.5
    Author:
    Erich Schubert
    • Constructor Detail

      • IntrinsicNearestNeighborAffinityMatrixBuilder

        public IntrinsicNearestNeighborAffinityMatrixBuilder(Distance<? super O> distance,
                                                             double perplexity,
                                                             DistanceBasedIntrinsicDimensionalityEstimator estimator)
        Constructor.
        Parameters:
        distance - Distance function
        perplexity - Perplexity
        estimator - Estimator of intrinsic dimensionality
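        To give an idea of what a distance-based estimator of intrinsic dimensionality computes, here is a minimal sketch of a Hill-type maximum-likelihood estimate from the sorted distances to the nearest neighbors. The class below is hypothetical and does not implement the DistanceBasedIntrinsicDimensionalityEstimator interface; the estimators shipped with ELKI may use different normalizations and edge-case handling.

```java
/** Hypothetical sketch, not an ELKI class: Hill-type estimate of intrinsic dimensionality. */
final class HillEstimatorSketch {
  /**
   * Estimate the intrinsic dimensionality from neighbor distances.
   *
   * @param sortedDistances distances to the nearest neighbors, ascending
   * @return estimate -n / sum(ln(d_i / d_max)) over the n positive distances
   */
  static double estimate(double[] sortedDistances) {
    int k = sortedDistances.length;
    if (k < 2) {
      throw new IllegalArgumentException("Need at least two neighbor distances.");
    }
    double w = sortedDistances[k - 1]; // largest distance, used as reference scale
    double sum = 0.;
    int valid = 0;
    for (double d : sortedDistances) {
      if (d <= 0.) {
        continue; // skip zero distances (duplicates of the query point)
      }
      sum += Math.log(d / w); // non-positive, because d <= w
      valid++;
    }
    if (sum >= 0.) {
      throw new IllegalArgumentException("Distances must not all be equal.");
    }
    return -valid / sum;
  }

  public static void main(String[] args) {
    // Neighbors spaced evenly along a line.
    double[] dists = { 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 };
    System.out.println("estimated ID: " + estimate(dists));
  }
}
```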
    • Method Detail

      • computePij

        protected void computePij(DBIDRange ids,
                                  KNNSearcher<DBIDRef> knnq,
                                  boolean square,
                                  int numberOfNeighbours,
                                  double[][] pij,
                                  int[][] indices,
                                  double initialScale)
        Compute the sparse pij using the nearest neighbors only.
        Overrides:
        computePij in class NearestNeighborAffinityMatrixBuilder<O>
        Parameters:
        ids - ID range
        knnq - kNN query
        square - Use squared distances
        numberOfNeighbours - Number of neighbors to get
        pij - Output of distances
        indices - Output of indexes
        initialScale - Initial scaling factor
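        The two output arrays suggest a sparse row layout with parallel arrays. The sketch below assumes (this is not stated in the documentation above) that pij[i][k] holds the value for the k-th stored neighbor of object i, and that indices[i][k] holds that neighbor's offset. The helper class is hypothetical and only shows how such a layout could be read.

```java
/** Hypothetical helper, not part of ELKI: reads a sparse row layout with parallel arrays. */
final class SparseAffinitySketch {
  /**
   * Look up the stored value for the pair (i, j).
   *
   * @return the stored value, or 0 if j is not among the stored neighbors of i
   */
  static double lookup(double[][] pij, int[][] indices, int i, int j) {
    int[] row = indices[i];
    for (int k = 0; k < row.length; k++) {
      if (row[k] == j) {
        return pij[i][k];
      }
    }
    return 0.; // not stored: treated as zero affinity
  }

  public static void main(String[] args) {
    // Three objects, each storing its two nearest neighbors.
    int[][] indices = { { 1, 2 }, { 0, 2 }, { 1, 0 } };
    double[][] pij = { { 0.7, 0.3 }, { 0.6, 0.4 }, { 0.9, 0.1 } };
    System.out.println("p(0,2) = " + lookup(pij, indices, 0, 2));
    System.out.println("p(0,0) = " + lookup(pij, indices, 0, 0));
  }
}
```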
      • convertNeighbors

        protected void convertNeighbors(DBIDRange ids,
                                        DBIDRef ix,
                                        boolean square,
                                        KNNList neighbours,
                                        DoubleArray dist,
                                        IntegerArray ind,
                                        Mean m)
        Load a neighbor query result into a double array and an integer array, removing the query point. This is necessary because the distances have to be modified.
        TODO: sort by index, not distance
        Parameters:
        ids - Indexes
        ix - Current Object
        square - Use squared distances
        neighbours - Neighbor list
        dist - Output distance array
        ind - Output index array
        m - Mean intrinsic dimensionality (ID), for statistics.
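        The copying step described for convertNeighbors can be sketched with plain Java collections. This is a simplified illustration under assumptions: plain int offsets and lists stand in for ELKI's DBIDRef, KNNList, DoubleArray and IntegerArray types, the square flag is assumed to mean that each distance is squared while copying, and the adjustment for intrinsic dimensionality and the statistics tracking are omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not ELKI code: copy a neighbor list into parallel lists. */
final class ConvertNeighborsSketch {
  /**
   * Copy neighbor offsets and distances into the output lists, skipping the query.
   *
   * @param query offset of the query object
   * @param neighborIds offsets of the neighbors, including the query itself
   * @param neighborDists distances to the neighbors, same order as neighborIds
   * @param square whether to store squared distances (assumed meaning of the flag)
   * @param dist output distance list
   * @param ind output index list
   */
  static void convert(int query, int[] neighborIds, double[] neighborDists,
      boolean square, List<Double> dist, List<Integer> ind) {
    dist.clear();
    ind.clear();
    for (int k = 0; k < neighborIds.length; k++) {
      if (neighborIds[k] == query) {
        continue; // the query point is not its own neighbor
      }
      double d = neighborDists[k];
      dist.add(square ? d * d : d);
      ind.add(neighborIds[k]);
    }
  }

  public static void main(String[] args) {
    List<Double> dist = new ArrayList<>();
    List<Integer> ind = new ArrayList<>();
    // Object 4 is the query; kNN results usually contain it at distance 0.
    convert(4, new int[] { 4, 7, 2 }, new double[] { 0., 2., 3. }, true, dist, ind);
    System.out.println("indices:   " + ind);
    System.out.println("distances: " + dist);
  }
}
```

        Because the copies are independent of the neighbor list, they can be modified afterwards without affecting the query result.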