Class PerplexityAffinityMatrixBuilder<O>

  • Type Parameters:
    O - Vector type
    All Implemented Interfaces:
    AffinityMatrixBuilder<O>
    Direct Known Subclasses:
    NearestNeighborAffinityMatrixBuilder

    @Reference(authors="G. Hinton, S. Roweis",
               title="Stochastic Neighbor Embedding",
               booktitle="Advances in Neural Information Processing Systems 15",
               url="http://papers.nips.cc/paper/2276-stochastic-neighbor-embedding",
               bibkey="DBLP:conf/nips/HintonR02")
    public class PerplexityAffinityMatrixBuilder<O>
    extends GaussianAffinityMatrixBuilder<O>
    Compute the affinity matrix for SNE and tSNE.

    Reference:

    G. Hinton, S. Roweis
    Stochastic Neighbor Embedding
    Advances in Neural Information Processing Systems 15

    Since:
    0.7.5
    Author:
    Erich Schubert
    • Field Detail

      • LOG

        private static final Logging LOG
        Class logger.
      • PERPLEXITY_ERROR

        protected static final double PERPLEXITY_ERROR
        Threshold for optimizing perplexity.
        See Also:
        Constant Field Values
      • PERPLEXITY_MAXITER

        protected static final int PERPLEXITY_MAXITER
        Maximum number of iterations when optimizing perplexity.
        See Also:
        Constant Field Values
      • MIN_PIJ

        protected static final double MIN_PIJ
        Minimum value for pij entries, enforced even for duplicate points.
        See Also:
        Constant Field Values
      • distance

        protected Distance<? super O> distance
        Input distance function.
      • perplexity

        protected double perplexity
        Perplexity.
    • Constructor Detail

      • PerplexityAffinityMatrixBuilder

        public PerplexityAffinityMatrixBuilder(Distance<? super O> distance,
                                               double perplexity)
        Constructor.
        Parameters:
        distance - Distance function
        perplexity - Perplexity
    • Method Detail

      • computePij

        protected static double[][] computePij(double[][] dist,
                                               double perplexity,
                                               double initialScale)
        Compute the pij from the distance matrix.
        Parameters:
        dist - Distance matrix.
        perplexity - Desired perplexity
        initialScale - Initial scale
        Returns:
        Affinity matrix pij
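
        In SNE and tSNE, the per-row conditional affinities are usually combined into one symmetric matrix that sums to 1. The following is a minimal sketch of that step, not this class's actual code: the class name SymmetrizeSketch, the floor value 1e-12, and the treatment of initialScale as a plain multiplier are all assumptions made for illustration.

        ```java
        final class SymmetrizeSketch {
          /** Hypothetical floor value, chosen for illustration only. */
          static final double MIN_PIJ = 1e-12;

          /**
           * Symmetrize row-normalized conditionals p(j|i) in place:
           * p_ij = max((p(j|i) + p(i|j)) * initialScale / (2 n), MIN_PIJ).
           * The diagonal is set to zero.
           */
          static double[][] symmetrize(double[][] cond, double initialScale) {
            final int n = cond.length;
            final double scale = initialScale / (2. * n);
            for(int i = 0; i < n; i++) {
              cond[i][i] = 0.; // a point has no affinity to itself
              for(int j = i + 1; j < n; j++) {
                final double v = Math.max((cond[i][j] + cond[j][i]) * scale, MIN_PIJ);
                cond[i][j] = v;
                cond[j][i] = v;
              }
            }
            return cond;
          }
        }
        ```

        With initialScale = 1 and rows that each sum to 1, the off-diagonal entries of the result sum to 1, up to the effect of the floor.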
      • computePi

        protected static double computePi(int i,
                                          double[] dist_i,
                                          double[] pij_i,
                                          double perplexity,
                                          double logPerp)
        Compute row pij[i], using binary search on the kernel bandwidth sigma to obtain the desired perplexity.
        Parameters:
        i - Current point
        dist_i - Distance matrix row dist[i]
        pij_i - Output row
        perplexity - Desired perplexity
        logPerp - Log of desired perplexity
        Returns:
        Beta, the inverse-bandwidth parameter found by the search
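
        The search described above can be sketched as follows, assuming a kernel of the form exp(-beta * d) over the values in dist_i, where beta plays the role of 1/(2*sigma^2) in the usual SNE formulation. The class name, the method names entropy and search, the tolerance, and the iteration cap are hypothetical and are not taken from this class.

        ```java
        final class PerplexitySearchSketch {
          static final double TOLERANCE = 1e-5; // hypothetical stopping threshold
          static final int MAX_ITER = 50; // hypothetical iteration cap

          /**
           * Fill p with p(j|i) proportional to exp(-beta * d[j]) for j != i,
           * normalize the row, and return its Shannon entropy in nats.
           */
          static double entropy(int i, double[] d, double[] p, double beta) {
            double sumP = 0.;
            for(int j = 0; j < d.length; j++) {
              p[j] = (j == i) ? 0. : Math.exp(-beta * d[j]);
              sumP += p[j];
            }
            if(!(sumP > 0)) {
              return Double.NEGATIVE_INFINITY; // every weight underflowed
            }
            double sumDP = 0.;
            for(int j = 0; j < d.length; j++) {
              p[j] /= sumP;
              sumDP += d[j] * p[j];
            }
            // -sum p log p simplifies to log(sumP) + beta * sum(d * p).
            return Math.log(sumP) + beta * sumDP;
          }

          /** Bisect on beta until exp(entropy) matches the desired perplexity. */
          static double search(int i, double[] d, double[] p, double perplexity, double beta0) {
            final double logPerp = Math.log(perplexity);
            double beta = beta0, betaMin = 0., betaMax = Double.POSITIVE_INFINITY;
            double diff = entropy(i, d, p, beta) - logPerp;
            for(int it = 0; it < MAX_ITER && Math.abs(diff) > TOLERANCE; it++) {
              if(diff > 0) { // entropy too high: kernel too wide, raise beta
                betaMin = beta;
                beta = Double.isInfinite(betaMax) ? beta * 2 : (beta + betaMax) * .5;
              }
              else { // entropy too low: kernel too narrow, lower beta
                betaMax = beta;
                beta = (beta + betaMin) * .5;
              }
              diff = entropy(i, d, p, beta) - logPerp;
            }
            return beta;
          }
        }
        ```

        Comparing entropies against the logarithm of the perplexity, rather than comparing perplexities directly, avoids one exponentiation per iteration, which is presumably why logPerp is passed in as a separate parameter.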
      • estimateInitialBeta

        protected static double estimateInitialBeta(double[] dist_i,
                                                    double perplexity)
        Estimate beta from the distances in a row.

        This is a handcrafted heuristic to avoid numerical problems; it has no formal mathematical justification. The average distance is usually too large, so it is scaled by 2*N/perplexity. Beta is then estimated as 1/x, where x is the scaled average distance.

        Parameters:
        dist_i - Distances
        perplexity - Desired perplexity
        Returns:
        Estimated beta.
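
        As a rough illustration, the heuristic can be written out literally. This sketch rests on several assumptions that the description does not confirm: "scale by" is read as multiplication, N is taken to be the length of the row, the average is taken over all entries of the row, and the fallback value 1.0 for degenerate rows is invented here. The actual implementation may differ in its constants.

        ```java
        final class InitialBetaSketch {
          /**
           * Literal reading of the heuristic: x = mean(dist_i) * 2 * N / perplexity,
           * beta = 1 / x. Falls back to 1.0 (an assumption) when x is not a
           * positive finite number.
           */
          static double estimateInitialBeta(double[] dist_i, double perplexity) {
            final int n = dist_i.length;
            double sum = 0.;
            for(double d : dist_i) {
              sum += d;
            }
            final double mean = sum / n;
            final double x = mean * 2. * n / perplexity;
            return (x > 0 && x < Double.POSITIVE_INFINITY) ? 1. / x : 1.;
          }
        }
        ```

        The result only needs to be a numerically safe starting point, since the subsequent search over beta corrects it.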