Class PerturbationFilter<V extends NumberVector>

  • All Implemented Interfaces:
    ObjectFilter

    @Title("Data Perturbation for Outlier Detection Ensembles")
    @Description("A filter to perturb a dataset on read by an additive noise component, implemented for use in an outlier ensemble (this reference).")
    @Reference(authors="A. Zimek, R. J. G. B. Campello, J. Sander",
               title="Data Perturbation for Outlier Detection Ensembles",
               booktitle="Proc. 26th International Conference on Scientific and Statistical Database Management (SSDBM), Aalborg, Denmark, 2014",
               url="https://doi.org/10.1145/2618243.2618257",
               bibkey="DBLP:conf/ssdbm/ZimekCS14")
    public class PerturbationFilter<V extends NumberVector>
    extends AbstractVectorConversionFilter<V,V>
    A filter to perturb the values by adding micro-noise.

    The added noise is generated attribute-wise, either by a Gaussian with mean 0 and a specified standard deviation, or by a uniform distribution with a specified range. The standard deviation or the range can be scaled, attribute-wise, to a given percentage of the original standard deviation of the data (assuming the data are Gaussian distributed), or to a percentage of the extent of each attribute (maximumValue - minimumValue).
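
    The two scaling modes described above can be illustrated with a minimal stand-alone sketch. It is not the ELKI implementation: the class PerturbationSketch, its method names, the pct parameter (a fraction such as 0.1 for 10%), the use of the population standard deviation, and the choice to center the uniform noise around zero are all assumptions made for illustration.

```java
import java.util.Random;

/** Hypothetical sketch of attribute-wise perturbation; not ELKI code. */
final class PerturbationSketch {
  /** Per-attribute (population) standard deviation of the data. */
  public static double[] stddev(double[][] data) {
    int d = data[0].length;
    double[] mean = new double[d], sd = new double[d];
    for (double[] row : data)
      for (int j = 0; j < d; j++) mean[j] += row[j] / data.length;
    for (double[] row : data)
      for (int j = 0; j < d; j++) {
        double diff = row[j] - mean[j];
        sd[j] += diff * diff / data.length;
      }
    for (int j = 0; j < d; j++) sd[j] = Math.sqrt(sd[j]);
    return sd;
  }

  /** Per-attribute extent: maximumValue - minimumValue. */
  public static double[] extent(double[][] data) {
    int d = data[0].length;
    double[] min = data[0].clone(), max = data[0].clone();
    for (double[] row : data)
      for (int j = 0; j < d; j++) {
        min[j] = Math.min(min[j], row[j]);
        max[j] = Math.max(max[j], row[j]);
      }
    double[] ext = new double[d];
    for (int j = 0; j < d; j++) ext[j] = max[j] - min[j];
    return ext;
  }

  /** Gaussian noise, mean 0, stddev = pct * (stddev of the attribute). */
  public static double[][] perturbGaussian(double[][] data, double pct, Random rnd) {
    double[] sd = stddev(data);
    double[][] out = new double[data.length][];
    for (int i = 0; i < data.length; i++) {
      out[i] = data[i].clone();
      for (int j = 0; j < out[i].length; j++)
        out[i][j] += rnd.nextGaussian() * pct * sd[j];
    }
    return out;
  }

  /** Uniform noise of total width pct * extent, centered at 0 (assumption). */
  public static double[][] perturbUniform(double[][] data, double pct, Random rnd) {
    double[] ext = extent(data);
    double[][] out = new double[data.length][];
    for (int i = 0; i < data.length; i++) {
      out[i] = data[i].clone();
      for (int j = 0; j < out[i].length; j++)
        out[i][j] += (rnd.nextDouble() - 0.5) * pct * ext[j];
    }
    return out;
  }
}
```

    In this sketch, an attribute that is constant over the whole data set has extent 0 and therefore receives no uniform noise, and passing a fixed seed to Random makes a perturbed copy reproducible across runs.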

    This filter is potentially of wide use, but was implemented for the following publication:

    Reference:

    A. Zimek, R. J. G. B. Campello, J. Sander
    Data Perturbation for Outlier Detection Ensembles
    Proc. 26th Int. Conf. on Scientific and Statistical Database Management (SSDBM 2014)

    Since:
    0.7.0
    Author:
    Arthur Zimek