Class KDTreeEM.KDTree

  • Enclosing class:
    KDTreeEM

    static class KDTreeEM.KDTree
    extends java.lang.Object
    KDTree class with the statistics needed for EM clustering.
    Author:
    Robert Gehde
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) double[] halfwidth
      Half width of the rectangle.
      (package private) int left
      Left bound of this node's interval in the sorted ID list
      (package private) KDTreeEM.KDTree leftChild
      Left child node
      (package private) double[] midpoint
      Middle point of bounding box
      (package private) int right
      Right bound of this node's interval in the sorted ID list
      (package private) KDTreeEM.KDTree rightChild
      Right child node
      (package private) double[] sum
      Sum of contained vectors
      (package private) double[][] sumSq
      Sum of the outer products (x * x^T) of the contained vectors, needed for covariance calculation
    • Constructor Summary

      Constructors 
      Constructor Description
      KDTree(Relation<? extends NumberVector> relation, ArrayModifiableDBIDs sorted, int left, int right, double[] dimWidth, double mbw)
      Constructor for a KDTree with statistics needed for KDTreeEM calculation.
    • Field Detail

      • left

        int left
        Left bound of this node's interval in the sorted ID list
      • right

        int right
        Right bound of this node's interval in the sorted ID list
      • sum

        double[] sum
        Sum of contained vectors
      • sumSq

        double[][] sumSq
        Sum of the outer products (x * x^T) of the contained vectors, needed for covariance calculation
      • midpoint

        double[] midpoint
        Middle point of bounding box
      • halfwidth

        double[] halfwidth
        Half width of the rectangle.
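        The aggregated fields sum and sumSq are enough to recover a node's mean and covariance without revisiting its points. The following is only a sketch of that relationship: the class NodeStatsSketch, its methods, and the explicit point count n are hypothetical and not part of KDTreeEM, and it computes the population covariance (division by n), which may differ from what the actual implementation uses.

```java
/** Hypothetical helper (not part of ELKI): derives mean and covariance from aggregated sums. */
final class NodeStatsSketch {
  /** Mean vector: sum / n. */
  static double[] mean(double[] sum, int n) {
    double[] m = new double[sum.length];
    for (int i = 0; i < sum.length; i++) {
      m[i] = sum[i] / n;
    }
    return m;
  }

  /** Population covariance: sumSq / n - mean * mean^T. */
  static double[][] covariance(double[] sum, double[][] sumSq, int n) {
    double[] m = mean(sum, n);
    int d = sum.length;
    double[][] cov = new double[d][d];
    for (int i = 0; i < d; i++) {
      for (int j = 0; j < d; j++) {
        cov[i][j] = sumSq[i][j] / n - m[i] * m[j];
      }
    }
    return cov;
  }
}
```

        For the two points (1, 2) and (3, 6), sum is (4, 8) and sumSq is [[10, 20], [20, 40]], which gives the mean (2, 4) and the covariance [[1, 2], [2, 4]].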
    • Constructor Detail

      • KDTree

        public KDTree(Relation<? extends NumberVector> relation,
                      ArrayModifiableDBIDs sorted,
                      int left,
                      int right,
                      double[] dimWidth,
                      double mbw)
        Constructor for a KDTree with statistics needed for KDTreeEM calculation. Uses the points between the indices left and right of the sorted ID array for the calculation.
        Parameters:
        relation - data points used for the construction
        sorted - sorted ID array
        left - index of the leftmost data point used for construction
        right - index of the rightmost data point used for construction
        dimWidth - array containing the width of each dimension over the complete dataset
        mbw - factor that determines when to stop construction: stop if splitdimwidth < mbw * dimWidth[splitdim], where splitdimwidth is the node's width in the split dimension splitdim
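        The stopping rule for mbw can be illustrated with a small sketch. The class SplitStopSketch and its methods are hypothetical. The sketch assumes that the split dimension is the widest dimension of the node's bounding box and that the node's width is twice its halfwidth; this page confirms neither assumption.

```java
/** Hypothetical illustration of the mbw stopping rule (not part of ELKI). */
final class SplitStopSketch {
  /** Assumption: the split dimension is the one with the largest half width. */
  static int widestDimension(double[] halfwidth) {
    int best = 0;
    for (int k = 1; k < halfwidth.length; k++) {
      if (halfwidth[k] > halfwidth[best]) {
        best = k;
      }
    }
    return best;
  }

  /** Stop if splitdimwidth < mbw * dimWidth[splitdim]. */
  static boolean shouldStop(double[] halfwidth, double[] dimWidth, double mbw) {
    int splitDim = widestDimension(halfwidth);
    double splitDimWidth = 2 * halfwidth[splitDim]; // assumed: full width = 2 * half width
    return splitDimWidth < mbw * dimWidth[splitDim];
  }
}
```

        With halfwidth = {0.5, 0.1} and dimWidth = {10, 10}, the node is 1.0 wide in dimension 0, so construction would stop for mbw = 0.2 (threshold 2.0) but continue for mbw = 0.05 (threshold 0.5).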
    • Method Detail

      • computeBoundingBox

        private void computeBoundingBox(Relation<? extends NumberVector> relation,
                                        DBIDArrayIter iter)
        Compute the bounding box.
        Parameters:
        relation - Data relation
        iter - iterator over the DBID array
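        As a rough illustration of what computing the bounding box involves, the sketch below derives midpoint and halfwidth from per-dimension minima and maxima. It works on a plain double[][] rather than on a Relation and DBIDArrayIter, treats the interval as half-open [left, right), and uses the hypothetical class BoundingBoxSketch; these are assumptions made to keep the example self-contained, not a description of the actual method body.

```java
/** Hypothetical illustration (not part of ELKI): axis-parallel bounding box of a point range. */
final class BoundingBoxSketch {
  /** Returns {midpoint, halfwidth} for points[left] .. points[right - 1]. */
  static double[][] boundingBox(double[][] points, int left, int right) {
    int d = points[left].length;
    double[] min = points[left].clone();
    double[] max = points[left].clone();
    // Track the minimum and maximum in every dimension.
    for (int i = left + 1; i < right; i++) {
      for (int k = 0; k < d; k++) {
        double v = points[i][k];
        if (v < min[k]) {
          min[k] = v;
        }
        if (v > max[k]) {
          max[k] = v;
        }
      }
    }
    double[] midpoint = new double[d];
    double[] halfwidth = new double[d];
    for (int k = 0; k < d; k++) {
      halfwidth[k] = (max[k] - min[k]) * 0.5;
      midpoint[k] = min[k] + halfwidth[k];
    }
    return new double[][] { midpoint, halfwidth };
  }
}
```

        For the points (0, 0), (2, 4) and (1, 10), the box spans [0, 2] x [0, 10], so both midpoint and halfwidth are (1, 5).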
      • aggregateStats

        private void aggregateStats(Relation<? extends NumberVector> relation,
                                    DBIDArrayIter iter,
                                    int dim)
        Aggregate the statistics for a leaf node.
        Parameters:
        relation - Data relation
        iter - iterator over the DBID array
        dim - Dimensionality
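        A minimal sketch of leaf aggregation, assuming the statistics are the vector sum and the sum of outer products described for the sum and sumSq fields. The class LeafStatsSketch is hypothetical, and it reads from a plain double[][] over a half-open range [left, right) instead of a Relation and DBIDArrayIter.

```java
/** Hypothetical illustration (not part of ELKI): accumulates sum and sumSq for a leaf. */
final class LeafStatsSketch {
  final double[] sum;
  final double[][] sumSq;

  LeafStatsSketch(int dim) {
    sum = new double[dim];
    sumSq = new double[dim][dim];
  }

  /** Adds one vector x: sum += x, sumSq += x * x^T. */
  void add(double[] x) {
    for (int i = 0; i < sum.length; i++) {
      sum[i] += x[i];
      for (int j = 0; j < sum.length; j++) {
        sumSq[i][j] += x[i] * x[j];
      }
    }
  }

  /** Aggregates points[left] .. points[right - 1]. */
  static LeafStatsSketch aggregate(double[][] points, int left, int right, int dim) {
    LeafStatsSketch stats = new LeafStatsSketch(dim);
    for (int i = left; i < right; i++) {
      stats.add(points[i]);
    }
    return stats;
  }
}
```

        Because both statistics are plain sums, an inner node could obtain its own values by adding those of its two children; whether KDTreeEM does exactly this is not stated on this page.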