@Reference(authors="T. W. Anderson, D. A. Darling", title="Asymptotic theory of certain \'goodness of fit\' criteria based on stochastic processes", booktitle="Annals of mathematical statistics 23(2)", url="https://doi.org/10.1214/aoms/1177729437", bibkey="doi:10.1214/aoms/1177729437") public class AndersonDarlingTest extends java.lang.Object
This is a goodness-of-fit test against normality, i.e. you can use it to reject the hypothesis that the data is normally distributed. Such tests are sensitive to data set size: on small samples, even large deviations can occur by chance and thus do not allow rejection. On large data sets, in contrast, even a slight deviation would be unlikely if the data were indeed normally distributed. Thus, this test tends to fail to reject small data sets even when they intuitively do not appear to be normally distributed, while it will reject large data sets that originate from a distribution only slightly different from the normal distribution.
Before using, make sure you have understood statistical tests, and the difference between failure-to-reject and acceptance!
The data size should be at least 8 before the results start getting somewhat reliable. For large data sets, the chance of rejecting the normal distribution hypothesis increases a lot: no real data looks exactly like a normal distribution.
T. W. Anderson, D. A. Darling: Asymptotic theory of certain 'goodness of fit' criteria based on stochastic processes. Annals of mathematical statistics 23(2).
M. A. Stephens: EDF Statistics for Goodness of Fit and Some Comparisons. Journal of the American Statistical Association 69(347).
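|Modifier and Type|Method and Description|
|static double|A2Noncentral(double[] sorted): Test a sorted but not standardized data set.|
|static double|A2StandardNormal(double[] sorted): Test a sorted data set against the standard normal distribution.|
|static double|removeBiasNormalDistribution(double A2, int n): Remove bias from the Anderson-Darling statistic if the mean and standard deviation were estimated from the data, and a normal distribution was assumed.|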
public static double A2StandardNormal(double[] sorted)
Note: the data will be compared to the standard normal distribution, i.e. with mean 0 and variance 1.
The data size should be at least 8 before the results start getting somewhat reliable. For large data sets, the chance of rejecting increases a lot: no real data looks exactly like a normal distribution.
sorted - Sorted input data.
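This page does not spell out the statistic itself. As a minimal sketch, assuming the textbook definition A² = -n - (1/n) Σ (2i-1) [ln Φ(x_i) + ln(1 - Φ(x_(n+1-i)))] over the ascending-sorted sample, the computation against the standard normal distribution might look as follows. The class and method names are hypothetical and this is not the ELKI implementation; the `erfc` approximation (a Chebyshev fit with fractional error below about 1.2e-7) is a stand-in for whatever normal CDF a real implementation would use.

```java
/** Hypothetical sketch of the A^2 statistic against N(0, 1); not the ELKI code. */
final class A2StandardNormalSketch {
  /** Complementary error function, Chebyshev approximation (assumed stand-in). */
  static double erfc(double x) {
    double z = Math.abs(x);
    double t = 1.0 / (1.0 + 0.5 * z);
    double ans = t * Math.exp(-z * z - 1.26551223 + t * (1.00002368 + t * (0.37409196
        + t * (0.09678418 + t * (-0.18628806 + t * (0.27886807 + t * (-1.13520398
        + t * (1.48851587 + t * (-0.82215223 + t * 0.17087277)))))))));
    return x >= 0 ? ans : 2.0 - ans;
  }

  /** Standard normal cumulative distribution function. */
  static double phi(double x) {
    return 0.5 * erfc(-x / Math.sqrt(2.0));
  }

  /** A^2 of ascending-sorted data, compared to mean 0 and variance 1. */
  static double a2StandardNormal(double[] sorted) {
    final int n = sorted.length;
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
      double lower = Math.log(phi(sorted[i]));
      // log(1 - Phi(x)) equals log(Phi(-x)) by symmetry; this is more stable in the tail.
      double upper = Math.log(phi(-sorted[n - 1 - i]));
      // The weight (2i - 1) for a 1-based index becomes (2i + 1) with a 0-based index.
      sum += (2 * i + 1) * (lower + upper);
    }
    return -n - sum / n;
  }
}
```

Because the comparison is against mean 0 and variance 1, shifting an otherwise well-fitting sample away from 0 increases the statistic sharply. The input must already be sorted; the sketch does not check this.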
public static double A2Noncentral(double[] sorted)
The data size should be at least 8!
sorted - Sorted input data.
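For data that is sorted but not standardized, one plausible approach is to estimate the mean and standard deviation from the sample, standardize each value, and then evaluate the same statistic. The sketch below illustrates this under stated assumptions: the names are hypothetical, the standard deviation uses the n - 1 (sample) denominator, and the `erfc` approximation is a stand-in. Whether ELKI uses the same estimator is not stated on this page.

```java
/** Hypothetical sketch: A^2 for sorted, non-standardized data; not the ELKI code. */
final class A2NoncentralSketch {
  /** Complementary error function, Chebyshev approximation (assumed stand-in). */
  static double erfc(double x) {
    double z = Math.abs(x);
    double t = 1.0 / (1.0 + 0.5 * z);
    double ans = t * Math.exp(-z * z - 1.26551223 + t * (1.00002368 + t * (0.37409196
        + t * (0.09678418 + t * (-0.18628806 + t * (0.27886807 + t * (-1.13520398
        + t * (1.48851587 + t * (-0.82215223 + t * 0.17087277)))))))));
    return x >= 0 ? ans : 2.0 - ans;
  }

  /** Standard normal cumulative distribution function. */
  static double phi(double x) {
    return 0.5 * erfc(-x / Math.sqrt(2.0));
  }

  /** A^2 of ascending-sorted data after standardizing with estimated parameters. */
  static double a2Noncentral(double[] sorted) {
    final int n = sorted.length;
    double mean = 0.0;
    for (double v : sorted) {
      mean += v;
    }
    mean /= n;
    double sq = 0.0;
    for (double v : sorted) {
      sq += (v - mean) * (v - mean);
    }
    // Assumption: sample standard deviation with the (n - 1) denominator.
    double std = Math.sqrt(sq / (n - 1));
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
      double zLow = (sorted[i] - mean) / std;
      double zHigh = (sorted[n - 1 - i] - mean) / std;
      sum += (2 * i + 1) * (Math.log(phi(zLow)) + Math.log(phi(-zHigh)));
    }
    return -n - sum / n;
  }
}
```

Since the parameters are estimated from the data, the result is unchanged when the sample is shifted or rescaled by a positive factor. Constant data (standard deviation 0) yields NaN in this sketch and would need separate handling.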
@Reference(authors="M. A. Stephens", title="EDF Statistics for Goodness of Fit and Some Comparisons", booktitle="Journal of the American Statistical Association, Volume 69, Issue 347", url="https://doi.org/10.1080/01621459.1974.10480196", bibkey="doi:10.1080/01621459.1974.10480196") public static double removeBiasNormalDistribution(double A2, int n)
A2 - A2 statistic
n - Sample size
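The correction factor is not given on this page. A minimal sketch, assuming the small-sample modification A² · (1 + 4/n - 25/n²) commonly attributed to Stephens (1974) for the case where both mean and variance are estimated, could look like this. The class and method names are hypothetical, and the exact factor used by `removeBiasNormalDistribution` should be checked against the source.

```java
/** Hypothetical sketch of a small-sample correction; the factor is an assumption. */
final class BiasCorrectionSketch {
  /**
   * Adjust A^2 for a normal distribution with estimated mean and standard deviation.
   *
   * @param a2 uncorrected A^2 statistic
   * @param n sample size
   * @return adjusted statistic, assuming the factor (1 + 4/n - 25/n^2)
   */
  static double removeBiasNormal(double a2, int n) {
    // Cast before multiplying so that n * n cannot overflow the int range.
    return a2 * (1.0 + 4.0 / n - 25.0 / ((double) n * n));
  }
}
```

Under this assumed factor the adjustment shrinks quickly with n: for n = 10 the statistic is scaled by 1.15, for n = 100 by 1.0375.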
Copyright © 2019 ELKI Development Team. License information.