 de.lmu.ifi.dbs.elki.math

Class MeanVariance

• Direct Known Subclasses:
MeanVarianceMinMax

@Reference(authors="Erich Schubert, Michael Gertz",title="Numerically Stable Parallel Computation of (Co-)Variance",booktitle="Proc. 30th Int. Conf. Scientific and Statistical Database Management (SSDBM 2018)",url="https://doi.org/10.1145/3221269.3223036",bibkey="DBLP:conf/ssdbm/SchubertG18")
@Reference(authors="E. A. Youngs, E. M. Cramer",title="Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms",booktitle="Technometrics 13(3)",url="https://doi.org/10.1080/00401706.1971.10488826",bibkey="doi:10.1080/00401706.1971.10488826")
@Reference(authors="B. P. Welford",title="Note on a method for calculating corrected sums of squares and products",booktitle="Technometrics 4(3)",url="https://doi.org/10.2307/1266577",bibkey="doi:10.2307/1266577")
@Reference(authors="D. H. D. West",title="Updating Mean and Variance Estimates: An Improved Method",booktitle="Communications of the ACM 22(9)",url="https://doi.org/10.1145/359146.359153",bibkey="DBLP:journals/cacm/West79")
public class MeanVariance
extends Mean
Do some simple statistics (mean, variance) using a numerically stable online algorithm.

This class can be fed data repeatedly using the put() methods; the resulting mean and variance can be queried at any time using Mean.getMean() and getSampleVariance().

Make sure you understand which variance you need before using getNaiveVariance(): since this class is fed with samples and estimates the mean from those samples, getSampleVariance() is often the more appropriate choice.
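The difference between the two variance flavors can be illustrated with a small standalone sketch. It assumes the usual definitions (naive variance divides the sum of squared deviations by n, sample variance by n - 1); the class VarianceKindsSketch and its method names are hypothetical and not part of ELKI.

```java
/** Hypothetical illustration, not ELKI code: naive vs. sample variance. */
final class VarianceKindsSketch {
  /** Sum of squared deviations from the mean, computed in two passes. */
  static double sumOfSquaredDeviations(double[] data) {
    double sum = 0.;
    for (double v : data) {
      sum += v;
    }
    double mean = sum / data.length;
    double m2 = 0.;
    for (double v : data) {
      double d = v - mean;
      m2 += d * d;
    }
    return m2;
  }

  /** Assumed definition: divide by n (treats the data as the full population). */
  static double naiveVariance(double[] data) {
    return sumOfSquaredDeviations(data) / data.length;
  }

  /** Assumed definition: divide by n - 1 (mean was estimated from the sample). */
  static double sampleVariance(double[] data) {
    return sumOfSquaredDeviations(data) / (data.length - 1);
  }

  public static void main(String[] args) {
    double[] data = { 2, 4, 4, 4, 5, 5, 7, 9 };
    System.out.println("naive variance  = " + naiveVariance(data));
    System.out.println("sample variance = " + sampleVariance(data));
  }
}
```

For small n the two values differ noticeably; as n grows, the factor n / (n - 1) approaches 1 and the distinction matters less.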

As experimentally studied in

Erich Schubert, Michael Gertz
Numerically Stable Parallel Computation of (Co-)Variance
Proc. 30th Int. Conf. Scientific and Statistical Database Management (SSDBM 2018)

the current approach is based on:

E. A. Youngs and E. M. Cramer
Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms
Technometrics 13(3), 1971

We originally experimented with:

B. P. Welford
Note on a method for calculating corrected sums of squares and products
Technometrics 4(3), 1962

D. H. D. West
Updating Mean and Variance Estimates: An Improved Method
Communications of the ACM 22(9), 1979
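As a rough illustration of what a numerically stable online update can look like, here is a minimal sketch in the spirit of the Youngs and Cramer formulation: it keeps the count n, the running sum, and the sum of squared deviations m2, and never subtracts two large squared sums. The class name OnlineMeanVarianceSketch is hypothetical, and the code is an assumption about the general technique rather than a copy of this class's source.

```java
/** Hypothetical stand-in, not the ELKI class: online mean and variance. */
final class OnlineMeanVarianceSketch {
  private double n = 0., sum = 0., m2 = 0.;

  /** Add one value with weight 1, updating the sum of squared deviations. */
  void put(double val) {
    if (n <= 0) { // first value: there is no deviation yet
      n = 1.;
      sum = val;
      m2 = 0.;
      return;
    }
    double tmp = n * val - sum; // n times (val - old mean)
    double oldn = n;
    n += 1.;
    sum += val;
    m2 += tmp * tmp / (n * oldn);
  }

  double getMean() {
    return sum / n;
  }

  double getSumOfSquares() {
    return m2;
  }

  double getSampleVariance() {
    return m2 / (n - 1.);
  }

  public static void main(String[] args) {
    OnlineMeanVarianceSketch mv = new OnlineMeanVarianceSketch();
    for (double v : new double[] { 2, 4, 4, 4, 5, 5, 7, 9 }) {
      mv.put(v);
    }
    System.out.println("mean = " + mv.getMean());
    System.out.println("sample variance = " + mv.getSampleVariance());
  }
}
```

The update is algebraically the same as Welford's recurrence, since (n * val - sum) / n is the deviation of the new value from the previous mean.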

Since:
0.2
Author:
Erich Schubert
• Field Summary

Fields
Modifier and Type Field and Description
protected double m2
n times Variance

• Fields inherited from class Mean

n, sum
• Method Summary

All Methods
Modifier and Type Method and Description
double getNaiveStddev()
Return standard deviation using the non-sample variance. Note: usually, you should be using getSampleStddev() instead!
double getNaiveVariance()
Return the naive variance (not taking sampling into account). Note: usually, you should be using getSampleVariance() instead!
double getSampleStddev()
Return the sample standard deviation.
double getSampleVariance()
Return sample variance.
double getSumOfSquares()
Get the sum of squares.
static MeanVariance[] newArray(int dimensionality)
Create and initialize a new array of MeanVariance
void put(double val)
Add a single value with weight 1.0
MeanVariance put(double[] vals)
Add values with weight 1.0
MeanVariance put(double[] vals, double[] weights)
Add values with the given weights.
void put(double val, double weight)
Add data with a given weight.
void put(Mean other)
Join the data of another MeanVariance instance.
void reset()
Reset the value.
java.lang.String toString()
• Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
• Field Detail

• m2

protected double m2
n times Variance
• Constructor Detail

• MeanVariance

public MeanVariance()
Empty constructor
• MeanVariance

public MeanVariance(MeanVariance other)
Constructor from other instance
Parameters:
other - other instance to copy data from.
• Method Detail

• put

public void put(double val)
Add a single value with weight 1.0
Overrides:
put in class Mean
Parameters:
val - Value
• put

public void put(double val,
double weight)
Add data with a given weight.
Overrides:
put in class Mean
Parameters:
val - data
weight - weight
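A weighted update can be sketched along the same lines as the unweighted one. The following is an assumption about how such an update may be written, with n holding the total weight and sum the weighted sum; the class WeightedMeanVarianceSketch is hypothetical, and positive weights are assumed.

```java
/** Hypothetical stand-in, not the ELKI class: weighted online mean and variance. */
final class WeightedMeanVarianceSketch {
  // n is the total weight, sum the weighted sum, m2 the weighted sum of squared deviations.
  private double n = 0., sum = 0., m2 = 0.;

  /** Add one value with the given (assumed positive) weight. */
  void put(double val, double weight) {
    if (weight == 0.) {
      return; // a zero-weight value contributes nothing
    }
    if (n <= 0) { // first value: there is no deviation yet
      n = weight;
      sum = val * weight;
      m2 = 0.;
      return;
    }
    double tmp = n * val - sum; // old total weight times (val - old mean)
    double oldn = n;
    n += weight;
    sum += val * weight;
    m2 += weight * tmp * tmp / (n * oldn);
  }

  double getTotalWeight() {
    return n;
  }

  double getMean() {
    return sum / n;
  }

  double getSumOfSquares() {
    return m2;
  }

  public static void main(String[] args) {
    WeightedMeanVarianceSketch mv = new WeightedMeanVarianceSketch();
    mv.put(1., 1.);
    mv.put(3., 2.); // same effect as adding the value 3 twice with weight 1
    System.out.println("mean = " + mv.getMean());
    System.out.println("sum of squares = " + mv.getSumOfSquares());
  }
}
```

With integer weights, the result matches adding the value that many times with weight 1, which is a convenient sanity check.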
• put

public void put(Mean other)
Join the data of another MeanVariance instance.
Overrides:
put in class Mean
Parameters:
other - Data to join with
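Joining two aggregates is what makes this kind of statistic usable in parallel or partitioned computations. The sketch below shows one common way to combine two partial results, each represented as a hypothetical {n, sum, m2} triple; the class MergeVarianceSketch and its methods are illustrative assumptions, not ELKI API.

```java
/** Hypothetical illustration, not ELKI code: merging two partial aggregates. */
final class MergeVarianceSketch {
  /** Build {n, sum, m2} for a chunk of data with a plain two-pass computation. */
  static double[] summarize(double[] data) {
    double sum = 0.;
    for (double v : data) {
      sum += v;
    }
    double mean = data.length > 0 ? sum / data.length : 0.;
    double m2 = 0.;
    for (double v : data) {
      double d = v - mean;
      m2 += d * d;
    }
    return new double[] { data.length, sum, m2 };
  }

  /** Combine two {n, sum, m2} triples into the triple of the joined data. */
  static double[] merge(double[] a, double[] b) {
    if (b[0] <= 0) {
      return a.clone(); // nothing to add
    }
    if (a[0] <= 0) {
      return b.clone(); // nothing accumulated so far
    }
    // tmp / (nA * nB) is the difference of the two means.
    double tmp = a[0] * b[1] - a[1] * b[0];
    double n = a[0] + b[0];
    double m2 = a[2] + b[2] + tmp * tmp / (b[0] * n * a[0]);
    return new double[] { n, a[1] + b[1], m2 };
  }

  public static void main(String[] args) {
    double[] left = summarize(new double[] { 2, 4, 4, 4 });
    double[] right = summarize(new double[] { 5, 5, 7, 9 });
    double[] all = merge(left, right);
    System.out.println("n = " + all[0] + ", mean = " + all[1] / all[0]
        + ", sample variance = " + all[2] / (all[0] - 1.));
  }
}
```

The correction term accounts for the two chunks having different means; without it, simply adding the two m2 values would underestimate the combined spread.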
• put

public MeanVariance put(double[] vals)
Add values with weight 1.0
Overrides:
put in class Mean
Parameters:
vals - Values
Returns:
this
• put

public MeanVariance put(double[] vals,
double[] weights)
Description copied from class: Mean
Add values with the given weights.
Overrides:
put in class Mean
Parameters:
vals - Values
weights - Weights
Returns:
this
• getNaiveVariance

public double getNaiveVariance()
Return the naive variance (not taking sampling into account). Note: usually, you should be using getSampleVariance() instead!
Returns:
variance
• getSampleVariance

public double getSampleVariance()
Return sample variance.
Returns:
sample variance
• getSumOfSquares

public double getSumOfSquares()
Get the sum of squares.
Returns:
sum of squared deviations
• getNaiveStddev

public double getNaiveStddev()
Return standard deviation using the non-sample variance. Note: usually, you should be using getSampleStddev() instead!
Returns:
stddev
• getSampleStddev

public double getSampleStddev()
Return the sample standard deviation.
Returns:
stddev
• newArray

public static MeanVariance[] newArray(int dimensionality)
Create and initialize a new array of MeanVariance
Parameters:
dimensionality - Dimensionality
Returns:
New and initialized Array
• toString

public java.lang.String toString()
Overrides:
toString in class Mean
• reset

public void reset()
Description copied from class: Mean
Reset the value.
Overrides:
reset in class Mean