## Class PearsonCorrelation

• public class PearsonCorrelation
extends java.lang.Object
Class to compute the Pearson correlation coefficient (PCC), also known as the Pearson product-moment correlation coefficient (PPMCC).

This class computes Var(X), Var(Y) and Cov(X, Y), all of which can be obtained from it. If you need more than two variables, use CovarianceMatrix, which uses slightly more memory (because it stores arrays) but performs essentially the same computation.
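
For orientation, the three quantities combine into the coefficient through the standard definition below. The symbol $r$ is notation chosen for this note and does not appear in the class itself.

$$r = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$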

This class uses a numerically more stable approach than the popular $$E[XY]-E[X]E[Y]$$-based version.
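
The stability concern can be illustrated with a small sketch. The class `CovarianceStabilityDemo` and both of its methods are hypothetical helpers written for this note; they are not the code this class uses. The sketch contrasts the $$E[XY]-E[X]E[Y]$$ formula with a version that subtracts the means before multiplying.

```java
/** Hypothetical helper for illustration; not part of the documented class. */
final class CovarianceStabilityDemo {
  /** Textbook formula E[XY] - E[X]E[Y]; subtracts two large, nearly equal numbers. */
  static double naiveCovariance(double[] x, double[] y) {
    final int n = x.length;
    double sx = 0., sy = 0., sxy = 0.;
    for (int i = 0; i < n; i++) {
      sx += x[i];
      sy += y[i];
      sxy += x[i] * y[i];
    }
    return sxy / n - (sx / n) * (sy / n);
  }

  /** Two-pass variant: center the data first, then average the products. */
  static double centeredCovariance(double[] x, double[] y) {
    final int n = x.length;
    double mx = 0., my = 0.;
    for (int i = 0; i < n; i++) {
      mx += x[i];
      my += y[i];
    }
    mx /= n;
    my /= n;
    double sxy = 0.;
    for (int i = 0; i < n; i++) {
      sxy += (x[i] - mx) * (y[i] - my);
    }
    return sxy / n; // population (non-sample) normalization
  }
}
```

With values such as `1e9 + 1`, `1e9 + 2`, `1e9 + 3` in both arrays, the products are near `1e18`, where adjacent doubles are more than 100 apart, so the naive result cannot resolve the true covariance of 2/3. The centered variant works on deviations of magnitude 1 and does not suffer from this cancellation.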

Since:
0.5.0
Author:
Erich Schubert
• ### Field Summary

Fields
Modifier and Type Field Description
private double sumWe
Weight sum.
private double sumX
Current mean for X and Y.
private double sumXX
Aggregation for squared residuals - we are not using sum-of-squares!
private double sumXY
Aggregation for squared residuals - we are not using sum-of-squares!
private double sumY
Current mean for X and Y.
private double sumYY
Aggregation for squared residuals - we are not using sum-of-squares!
• ### Constructor Summary

Constructors
Constructor Description
PearsonCorrelation()
Constructor.
• ### Method Summary

All Methods
Modifier and Type Method Description
static double coefficient​(double[] x, double[] y)
Compute the Pearson product-moment correlation coefficient.
static double coefficient​(NumberVector x, NumberVector y)
Compute the Pearson product-moment correlation coefficient for two NumberVectors.
double getCorrelation()
Get the Pearson correlation value.
double getCount()
Get the number of points the average is based on.
double getMeanX()
Return mean of X
double getMeanY()
Return mean of Y
double getNaiveCovariance()
Get the covariance of X and Y (not taking sampling into account)
double getPopulationStddevX()
Return standard deviation using the non-sample variance
double getPopulationStddevY()
Return standard deviation using the non-sample variance
double getPopulationVarianceX()
Return the naive variance (not taking sampling into account)
double getPopulationVarianceY()
Return the naive variance (not taking sampling into account)
double getSampleCovariance()
Get the covariance of X and Y (with sampling correction)
double getSampleStddevX()
Return the sample standard deviation.
double getSampleStddevY()
Return the sample standard deviation.
double getSampleVarianceX()
Return sample variance.
double getSampleVarianceY()
Return sample variance.
void put​(double x, double y)
Put a single value into the correlation statistic.
void put​(double x, double y, double w)
Put a single weighted value into the correlation statistic.
void reset()
Reset the value.
static double weightedCoefficient​(double[] x, double[] y, double[] weights)
Compute the weighted Pearson product-moment correlation coefficient.
static double weightedCoefficient​(NumberVector x, NumberVector y, double[] weights)
Compute the weighted Pearson product-moment correlation coefficient for two NumberVectors.
static double weightedCoefficient​(NumberVector x, NumberVector y, NumberVector weights)
Compute the weighted Pearson product-moment correlation coefficient for two NumberVectors, with the weights given as a NumberVector.
• ### Field Detail

• #### sumXX

private double sumXX
Aggregation for squared residuals - we are not using sum-of-squares!
• #### sumYY

private double sumYY
Aggregation for squared residuals - we are not using sum-of-squares!
• #### sumXY

private double sumXY
Aggregation for squared residuals - we are not using sum-of-squares!
• #### sumX

private double sumX
Current mean for X and Y.
• #### sumY

private double sumY
Current mean for X and Y.
• #### sumWe

private double sumWe
Weight sum.
• ### Constructor Detail

• #### PearsonCorrelation

public PearsonCorrelation()
Constructor.
• ### Method Detail

• #### put

public void put​(double x,
double y,
double w)
Put a single weighted value into the correlation statistic.
Parameters:
x - Value in X
y - Value in Y
w - Weight
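
A weighted streaming update of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the class `WeightedPearsonSketch` and its field names are hypothetical, and it uses a textbook Welford/West-style update on running means, which may differ in detail from what this class does internally.

```java
/**
 * Hypothetical streaming accumulator for illustration only; names and update
 * rule are assumptions, not the documented class's internals.
 */
final class WeightedPearsonSketch {
  private double weight, meanX, meanY; // running weight sum and weighted means
  private double sxx, syy, sxy;        // weighted sums of centered products

  /** Add one observation (x, y) with weight w. */
  void put(double x, double y, double w) {
    if (w == 0.) {
      return; // a zero weight contributes nothing
    }
    weight += w;
    final double dx = x - meanX, dy = y - meanY; // deltas to the old means
    meanX += dx * w / weight;
    meanY += dy * w / weight;
    // Pair the old-mean delta with the new-mean delta to update the aggregates.
    sxx += w * dx * (x - meanX);
    syy += w * dy * (y - meanY);
    sxy += w * dx * (y - meanY);
  }

  /** Add one observation with unit weight. */
  void put(double x, double y) {
    put(x, y, 1.);
  }

  /** Pearson correlation of everything added so far (NaN if X or Y is constant). */
  double correlation() {
    return sxy / Math.sqrt(sxx * syy);
  }
}
```

In this sketch, adding a point with weight 2 has the same effect as adding it twice with unit weight, which is one way to sanity-check a weighted update.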
• #### put

public void put​(double x,
double y)
Put a single value into the correlation statistic.
Parameters:
x - Value in X
y - Value in Y
• #### getCorrelation

public double getCorrelation()
Get the Pearson correlation value.
Returns:
Correlation value
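
Assuming the three aggregates hold weighted sums of centered products, as the field descriptions suggest, the coefficient can be written as below. The symbols $S_{XX}$, $S_{YY}$, $S_{XY}$, $w_i$, $\bar{x}$ and $\bar{y}$ are notation introduced for this note.

$$S_{XY} = \sum_i w_i (x_i - \bar{x})(y_i - \bar{y}), \qquad r = \frac{S_{XY}}{\sqrt{S_{XX}\, S_{YY}}}$$

Here $S_{XX}$ and $S_{YY}$ are defined analogously with squared deviations. Any normalizing factor (total weight, or total weight minus one) would appear in both numerator and denominator and cancel, so the population and sample conventions yield the same $r$.
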
• #### getCount

public double getCount()
Get the number of points the average is based on.
Returns:
number of data points
• #### getMeanX

public double getMeanX()
Return mean of X
Returns:
mean
• #### getMeanY

public double getMeanY()
Return mean of Y
Returns:
mean
• #### getNaiveCovariance

public double getNaiveCovariance()
Get the covariance of X and Y (not taking sampling into account)
Returns:
Covariance
• #### getSampleCovariance

public double getSampleCovariance()
Get the covariance of X and Y (with sampling correction)
Returns:
Covariance
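
The difference between the two covariance variants is only the normalizer. The sketch below shows this for unit-weight data; `CovarianceNormalizationSketch` and its methods are hypothetical names used for illustration, and how the documented class normalizes weighted data is not stated on this page.

```java
/** Hypothetical helper for illustration; assumes unit weights. */
final class CovarianceNormalizationSketch {
  /** Sum of centered products, the numerator shared by both variants. */
  static double centeredProductSum(double[] x, double[] y) {
    final int n = x.length;
    double mx = 0., my = 0.;
    for (int i = 0; i < n; i++) {
      mx += x[i];
      my += y[i];
    }
    mx /= n;
    my /= n;
    double sxy = 0.;
    for (int i = 0; i < n; i++) {
      sxy += (x[i] - mx) * (y[i] - my);
    }
    return sxy;
  }

  /** Divide by n: no sampling correction. */
  static double naiveCovariance(double[] x, double[] y) {
    return centeredProductSum(x, y) / x.length;
  }

  /** Divide by n - 1: the usual sampling (Bessel) correction. */
  static double sampleCovariance(double[] x, double[] y) {
    return centeredProductSum(x, y) / (x.length - 1);
  }
}
```

For small samples the two values differ noticeably; as the number of observations grows, the ratio `n / (n - 1)` approaches 1 and the distinction fades.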
• #### getPopulationVarianceX

public double getPopulationVarianceX()
Return the naive variance (not taking sampling into account)

Note: often you should be using getSampleVarianceX() instead!

Returns:
variance
• #### getSampleVarianceX

public double getSampleVarianceX()
Return sample variance.
Returns:
sample variance
• #### getPopulationStddevX

public double getPopulationStddevX()
Return standard deviation using the non-sample variance

Note: often you should be using getSampleStddevX() instead!

Returns:
standard deviation
• #### getSampleStddevX

public double getSampleStddevX()
Return the sample standard deviation.
Returns:
standard deviation
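
For $n$ unit-weight observations, the population and sample variants relate as shown below; each standard deviation is the square root of the corresponding variance. The symbols $\sigma_X^2$ (population variance) and $s_X^2$ (sample variance) are notation introduced for this note, and the normalization used for non-unit weights is not stated on this page.

$$\sigma_X^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{n}{n-1}\,\sigma_X^2$$
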
• #### getPopulationVarianceY

public double getPopulationVarianceY()
Return the naive variance (not taking sampling into account)

Note: often you should be using getSampleVarianceY() instead!

Returns:
variance
• #### getSampleVarianceY

public double getSampleVarianceY()
Return sample variance.
Returns:
sample variance
• #### getPopulationStddevY

public double getPopulationStddevY()
Return standard deviation using the non-sample variance

Note: often you should be using getSampleStddevY() instead!

Returns:
standard deviation
• #### getSampleStddevY

public double getSampleStddevY()
Return the sample standard deviation.
Returns:
standard deviation
• #### reset

public void reset()
Reset the value.
• #### coefficient

public static double coefficient​(double[] x,
double[] y)
Compute the Pearson product-moment correlation coefficient.
Parameters:
x - first data array
y - second data array
Returns:
the Pearson product-moment correlation coefficient for x and y
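
What such a static computation involves can be sketched with a plain two-pass loop. `PearsonArraySketch` is a hypothetical class written for this note; the length check and the NaN behavior for constant input are choices of the sketch, not documented behavior of this method.

```java
/** Hypothetical helper for illustration; not the documented implementation. */
final class PearsonArraySketch {
  /** Pearson correlation of two equal-length arrays, computed in two passes. */
  static double coefficient(double[] x, double[] y) {
    if (x.length != y.length) {
      throw new IllegalArgumentException("Arrays must have the same length.");
    }
    final int n = x.length;
    // Pass 1: means.
    double mx = 0., my = 0.;
    for (int i = 0; i < n; i++) {
      mx += x[i];
      my += y[i];
    }
    mx /= n;
    my /= n;
    // Pass 2: sums of centered squares and products.
    double sxx = 0., syy = 0., sxy = 0.;
    for (int i = 0; i < n; i++) {
      final double dx = x[i] - mx, dy = y[i] - my;
      sxx += dx * dx;
      syy += dy * dy;
      sxy += dx * dy;
    }
    // 0/0 yields NaN when either input is constant.
    return sxy / Math.sqrt(sxx * syy);
  }
}
```

A perfectly increasing linear relation such as `{1, 2, 3}` against `{2, 4, 6}` gives a coefficient of 1, and reversing one array flips the sign.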
• #### coefficient

public static double coefficient​(NumberVector x,
NumberVector y)
Compute the Pearson product-moment correlation coefficient for two NumberVectors.
Parameters:
x - first NumberVector
y - second NumberVector
Returns:
the Pearson product-moment correlation coefficient for x and y
• #### weightedCoefficient

public static double weightedCoefficient​(double[] x,
double[] y,
double[] weights)
Compute the weighted Pearson product-moment correlation coefficient.
Parameters:
x - first data array
y - second data array
weights - Weights
Returns:
the Pearson product-moment correlation coefficient for x and y
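
The weighted variant generalizes the unweighted computation by scaling each observation's contribution. The sketch below assumes non-negative weights with a positive sum; `WeightedPearsonArraySketch` is a hypothetical name, and the argument checks are choices of the sketch rather than documented behavior.

```java
/** Hypothetical helper for illustration; not the documented implementation. */
final class WeightedPearsonArraySketch {
  /** Weighted Pearson correlation; weights[i] scales the i-th observation. */
  static double weightedCoefficient(double[] x, double[] y, double[] weights) {
    if (x.length != y.length || x.length != weights.length) {
      throw new IllegalArgumentException("Arrays must have the same length.");
    }
    final int n = x.length;
    // Pass 1: weighted means.
    double sw = 0., mx = 0., my = 0.;
    for (int i = 0; i < n; i++) {
      sw += weights[i];
      mx += weights[i] * x[i];
      my += weights[i] * y[i];
    }
    mx /= sw;
    my /= sw;
    // Pass 2: weighted sums of centered squares and products.
    double sxx = 0., syy = 0., sxy = 0.;
    for (int i = 0; i < n; i++) {
      final double dx = x[i] - mx, dy = y[i] - my;
      sxx += weights[i] * dx * dx;
      syy += weights[i] * dy * dy;
      sxy += weights[i] * dx * dy;
    }
    return sxy / Math.sqrt(sxx * syy);
  }
}
```

With all weights equal, the result matches the unweighted coefficient; a weight of zero removes an observation from the computation entirely.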
• #### weightedCoefficient

public static double weightedCoefficient​(NumberVector x,
NumberVector y,
double[] weights)
Compute the weighted Pearson product-moment correlation coefficient for two NumberVectors.
Parameters:
x - first NumberVector
y - second NumberVector
weights - Weights
Returns:
the Pearson product-moment correlation coefficient for x and y
• #### weightedCoefficient

public static double weightedCoefficient​(NumberVector x,
NumberVector y,
NumberVector weights)
Compute the weighted Pearson product-moment correlation coefficient for two NumberVectors, with the weights given as a NumberVector.
Parameters:
x - first NumberVector
y - second NumberVector
weights - Weights
Returns:
the Pearson product-moment correlation coefficient for x and y