Class MCDEDependence

  • All Implemented Interfaces:
    Dependence

    @Reference(authors="E. Fouché, K. Böhm",
               title="Monte Carlo Dependency Estimation",
               booktitle="Proc. Scientific and Statistical Database Management (SSDBM 2019)",
               url="https://doi.org/10.1145/3335783.3335795",
               bibkey="DBLP:conf/ssdbm/FoucheB19")
    public class MCDEDependence
    extends java.lang.Object
    implements Dependence
    Implementation of bivariate Monte Carlo Dependency Estimation (MCDE) as described in the reference below.

    To use MCDE, pass an MCDETest implementation to the constructor: a statistical test that returns a p-value, together with an index structure that makes computing the test efficient.

    The instantiation of MCDE based on the Mann-Whitney U test is called MWPTest (as described in the paper).
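
    For orientation, the statistic behind the Mann-Whitney U test can be sketched as below. This is a plain textbook version, not ELKI's MWPTest: the class and method names are hypothetical, the pair-counting loop is quadratic and uses no index structure, and the normal approximation omits any tie correction.

```java
/** Illustrative only: a plain Mann-Whitney U statistic, not ELKI's MWPTest. */
final class MannWhitneySketch {
  /** U statistic of sample a versus sample b; a tie counts as one half. */
  static double uStatistic(double[] a, double[] b) {
    double u = 0.;
    for(double x : a) {
      for(double y : b) {
        u += x > y ? 1. : x == y ? .5 : 0.;
      }
    }
    return u;
  }

  /** Standardized U under the normal approximation (no tie correction). */
  static double zScore(double[] a, double[] b) {
    final double n1 = a.length, n2 = b.length;
    final double mean = n1 * n2 / 2.;
    final double sd = Math.sqrt(n1 * n2 * (n1 + n2 + 1.) / 12.);
    return (uStatistic(a, b) - mean) / sd;
  }
}
```

    A z-score far from zero suggests that the two samples differ in location; turning it into a p-value requires the standard normal distribution function, which is left out here.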

    Reference:

    E. Fouché and K. Böhm
    Monte Carlo Dependency Estimation
    Proc. Scientific and Statistical Database Management (SSDBM 2019)

    Since:
    0.8.0
    Author:
    Alan Mazankiewicz, Edouard Fouché
    • Nested Class Summary

      Nested Classes 
      Modifier and Type  Class               Description
      static class       MCDEDependence.Par  Parameterizer
    • Field Summary

      Fields 
      Modifier and Type                        Field     Description
      protected double                         alpha     Expected share of instances in slice (independent dimensions).
      protected double                         beta      Parameter that specifies the size of the marginal restriction.
      protected int                            m         Monte-Carlo iterations.
      protected MCDETest<MCDETest.RankStruct>  mcdeTest  Statistical Test returning p-value tailored to MCDE Framework.
      protected RandomFactory                  rnd       Random generator.
    • Field Detail

      • m

        protected int m
        Monte-Carlo iterations.
      • alpha

        protected double alpha
        Expected share of instances in slice (independent dimensions).
      • beta

        protected double beta
        Parameter that specifies the size of the marginal restriction. Note that in the original paper alpha = beta and as such there is no explicit distinction between the parameters.
    • Constructor Detail

      • MCDEDependence

        public MCDEDependence​(int m,
                              double alpha,
                              double beta,
                              RandomFactory rnd,
                              MCDETest<?> mcdeTest)
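        Constructor.
        Parameters:
        m - Monte-Carlo iterations
        alpha - Expected share of instances in slice (independent dimensions)
        beta - Parameter that specifies the size of the marginal restriction
        rnd - Random generator
        mcdeTest - Statistical test returning a p-value, tailored to the MCDE framework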
    • Method Detail

      • randomSlice

        protected boolean[] randomSlice​(java.util.Random random,
                                        MCDETest.RankStruct nonRefIndex)
        Bivariate data slicing
        Parameters:
        random - Random generator
        nonRefIndex - Index (see correctedRank()) computed for the dimension that is not the reference dimension
        Returns:
        Array of booleans that states which instances are part of the slice
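
        The idea of bivariate slicing can be sketched as follows, under stated assumptions. The sketch replaces MCDETest.RankStruct with a plain int[] rank order and takes alpha as an explicit argument; the class name and this signature are hypothetical and are not the ELKI implementation. It marks a randomly placed window of adjacent ranks in the non-reference dimension that covers a share alpha of the instances.

```java
import java.util.Random;

/** Illustrative bivariate slicing on a rank order; all names are hypothetical. */
final class BivariateSliceSketch {
  /**
   * @param random Random generator
   * @param order order[r] is the index of the instance with rank r in the
   *        non-reference dimension
   * @param alpha share of instances to keep in the slice, in (0, 1]
   * @return flags that are true for the instances inside the slice
   */
  static boolean[] randomSlice(Random random, int[] order, double alpha) {
    final int n = order.length;
    final int width = Math.min(n, (int) Math.ceil(n * alpha));
    // Random start such that the window [start, start + width) fits the ranks.
    final int start = random.nextInt(n - width + 1);
    boolean[] slice = new boolean[n];
    for(int r = start; r < start + width; r++) {
      slice[order[r]] = true;
    }
    return slice;
  }
}
```

        The returned flags are indexed by instance, not by rank, which matches the documented return value: one boolean per instance.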
      • randomSlice

        protected boolean[] randomSlice​(java.util.Random random,
                                        MCDETest.RankStruct[] nonRefIndex,
                                        int refDim,
                                        int nDim)
        Multivariate data slicing
        Parameters:
        random - Random generator
        nonRefIndex - Array of indices computed for each dimension
        refDim - Index of the reference dimension
        nDim - Number of dimensions
        Returns:
        Array of booleans that states which instances are part of the slice
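
        A multivariate slice can be sketched as the intersection of one random rank window per non-reference dimension. The sketch below is hypothetical and not the ELKI code: it uses plain int[] rank orders instead of MCDETest.RankStruct, takes alpha as an argument, and assumes a per-dimension share of alpha^(1/(nDim-1)), so that for independent dimensions the intersection is expected to keep roughly a share alpha of the instances.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative multivariate slicing; all names are hypothetical. */
final class MultivariateSliceSketch {
  /**
   * @param random Random generator
   * @param orders orders[d][r] is the index of the instance with rank r in
   *        dimension d; at least two dimensions are expected
   * @param refDim index of the reference dimension, which is not restricted
   * @param alpha expected share of instances in the slice, in (0, 1]
   * @return flags that are true for the instances inside the slice
   */
  static boolean[] randomSlice(Random random, int[][] orders, int refDim, double alpha) {
    final int nDim = orders.length, n = orders[refDim].length;
    // Assumption: every non-reference dimension keeps the same share.
    final double share = Math.pow(alpha, 1. / (nDim - 1));
    final int width = Math.min(n, (int) Math.ceil(n * share));
    boolean[] slice = new boolean[n];
    Arrays.fill(slice, true);
    for(int d = 0; d < nDim; d++) {
      if(d == refDim) {
        continue; // The reference dimension stays unrestricted.
      }
      final int start = random.nextInt(n - width + 1);
      boolean[] inWindow = new boolean[n];
      for(int r = start; r < start + width; r++) {
        inWindow[orders[d][r]] = true;
      }
      // Keep only instances that fall into the window of every dimension.
      for(int i = 0; i < n; i++) {
        slice[i] &= inWindow[i];
      }
    }
    return slice;
  }
}
```

        With two dimensions the exponent is 1, and the sketch reduces to a single window covering a share alpha of the ranks of the non-reference dimension.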
      • dependence

        public <A,​B> double dependence​(NumberArrayAdapter<?,​A> adapter1,
                                             A data1,
                                             NumberArrayAdapter<?,​B> adapter2,
                                             B data2)
        Description copied from interface: Dependence
        Measure the dependence of two variables.

        This is the more flexible API, which allows using different internal data representations.

        Specified by:
        dependence in interface Dependence
        Type Parameters:
        A - First array type
        B - Second array type
        Parameters:
        adapter1 - First data adapter
        data1 - First data set
        adapter2 - Second data adapter
        data2 - Second data set
        Returns:
        Dependence measure
      • dependence

        public <A> double[] dependence​(NumberArrayAdapter<?,​A> adapter,
                                       java.util.List<? extends A> data)
        Description copied from interface: Dependence
        Measure the dependence of each pair of variables in the list.

        This is the more flexible API, which allows using different internal data representations.

        The resulting data is a serialized lower triangular matrix:

          X  S  S  S  S  S
          0  X  S  S  S  S
          1  2  X  S  S  S
          3  4  5  X  S  S
          6  7  8  9  X  S
         10 11 12 13 14  X
         
        Specified by:
        dependence in interface Dependence
        Type Parameters:
        A - Array type
        Parameters:
        adapter - Data adapter
        data - Data sets. Must have fast random access!
        Returns:
        Lower triangular serialized matrix
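
        The layout shown above implies a simple index formula, sketched here with hypothetical names that are not part of ELKI: for a pair of distinct variables, the larger index is the row, the smaller one the column, and the entry is stored at row * (row - 1) / 2 + col.

```java
/** Illustrative index arithmetic for the serialized lower triangular matrix. */
final class LowerTriangleSketch {
  /** Position of the pair (x, y), x != y, in the serialized array. */
  static int index(int x, int y) {
    if(x == y) {
      throw new IllegalArgumentException("The diagonal is not stored.");
    }
    final int row = Math.max(x, y), col = Math.min(x, y);
    return row * (row - 1) / 2 + col;
  }

  /** Number of stored entries for dim variables. */
  static int size(int dim) {
    return dim * (dim - 1) / 2;
  }
}
```

        For the six variables in the layout above, the array holds 15 entries, and the pair (3, 1) is found at position 4.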
      • higherOrderDependence

        public <A> double higherOrderDependence​(NumberArrayAdapter<?,​A> adapter,
                                                java.util.List<? extends A> data)
        Runs the MCDE algorithm on data with possibly more than two dimensions.
        Type Parameters:
        A - Array type
        Parameters:
        adapter - Array type adapter
        data - Data sets. Must have fast random access!
        Returns:
        Dependence Measure