Class MSTSplit<E extends MTreeEntry,​N extends AbstractMTreeNode<?,​N,​E>>

  • Type Parameters:
    E - Entry type
    N - Node type
    All Implemented Interfaces:
    MTreeSplit<E,​N>

    @Priority(200)
    @Reference(authors="C. Traina Jr., A. J. M. Traina, B. Seeger, C. Faloutsos",
               title="Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes",
               booktitle="Int. Conf. Extending Database Technology (EDBT\'2000)",
               url="https://doi.org/10.1007/3-540-46439-5_4",
               bibkey="DBLP:conf/edbt/TrainaTSF00")
    public class MSTSplit<E extends MTreeEntry,​N extends AbstractMTreeNode<?,​N,​E>>
    extends java.lang.Object
    implements MTreeSplit<E,​N>
    Splitting algorithm using the minimum spanning tree (MST), as proposed by the Slim-Tree variant.

    Unfortunately, the slim-tree paper does not detail how to choose the "most appropriate edge from the longest ones" (to find a more balanced split), so we try the longest 50%, and keep the choice which yields the most balanced split. This seems to work quite well.

    Reference:

    C. Traina Jr., A. J. M. Traina, B. Seeger, C. Faloutsos
    Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes
    Int. Conf. Extending Database Technology (EDBT'2000)

    Since:
    0.7.5
    Author:
    Erich Schubert
    • Constructor Summary

      Constructors 
      Constructor Description
      MSTSplit()  
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private static double coverRadius​(double[][] matrix, int[] idx, int i)
      Find the cover radius of a partition.
      private static double edgelength​(double[][] matrix, int[] edges, int i)
      Length of edge i.
      private static int follow​(int i, int[] partitions)
      Union-find with simple path compression.
      private static int[] mstPartition​(double[][] matrix)
      Partition the data using the minimum spanning tree.
      private static void omitEdge​(int[] edges, int[] idx, int[] sizes, int omit)
      Partition the data by omitting one edge.
      Assignments<E> split​(AbstractMTree<?,​N,​E,​?> tree, N node)
      Returns the assignments of this split.
      private static double thresholdLength​(double[][] matrix, int[] edges)
      Choose the threshold length of edges to consider omittig.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • MSTSplit

        public MSTSplit()
    • Method Detail

      • coverRadius

        private static double coverRadius​(double[][] matrix,
                                          int[] idx,
                                          int i)
        Find the cover radius of a partition.
        Parameters:
        matrix - Distance matrix
        idx - Partition keys
        i - Candidate index
        Returns:
        max(d(i,j)) for all j with the same index
      • mstPartition

        private static int[] mstPartition​(double[][] matrix)
        Partition the data using the minimum spanning tree.
        Parameters:
        matrix -
        Returns:
        partition assignments
      • thresholdLength

        private static double thresholdLength​(double[][] matrix,
                                              int[] edges)
        Choose the threshold length of edges to consider omittig.
        Parameters:
        matrix - Distance matrix
        edges - Edges
        Returns:
        Distance threshold
      • edgelength

        private static double edgelength​(double[][] matrix,
                                         int[] edges,
                                         int i)
        Length of edge i.
        Parameters:
        matrix - Distance matrix
        edges - Edge list
        i - Edge number
        Returns:
        Edge length
      • omitEdge

        private static void omitEdge​(int[] edges,
                                     int[] idx,
                                     int[] sizes,
                                     int omit)
        Partition the data by omitting one edge.
        Parameters:
        edges - Edges list
        idx - Partition index storage
        sizes - Partition sizes
        omit - Edge number to omit
      • follow

        private static int follow​(int i,
                                  int[] partitions)
        Union-find with simple path compression.
        Parameters:
        i - Start
        partitions - Partitions array
        Returns:
        Partition