Package elki.clustering.kmeans
Class KDTreePruningKMeans.Instance
- java.lang.Object
-
- elki.clustering.kmeans.AbstractKMeans.Instance
-
- elki.clustering.kmeans.KDTreePruningKMeans.Instance
-
- Direct Known Subclasses:
KDTreeFilteringKMeans.Instance
- Enclosing class:
- KDTreePruningKMeans<V extends NumberVector>
protected class KDTreePruningKMeans.Instance extends AbstractKMeans.Instance
Inner instance, storing state for a single data set.- Author:
- Cedrik Lüdicke, Erich Schubert
-
-
Field Summary
Fields Modifier and Type Field Description protected int[]clusterSizesNumber of elements in each cluster.protected double[][]clusterSumsTo aggregate the sum of a cluster.protected int[]indicesArray of candidate indexes.protected DBIDArrayMIteriterIterator into the k-d-tree entries.protected KDTreePruningKMeans.KDNoderootThe root node of the treeprotected ArrayModifiableDBIDssortedThe tree stored as ArrayModifiableDBIDs-
Fields inherited from class elki.clustering.kmeans.AbstractKMeans.Instance
assignment, clusters, diststat, isSquared, k, key, means, relation, varsum
-
-
Constructor Summary
Constructors Constructor Description Instance(Relation<? extends NumberVector> relation, NumberVectorDistance<?> df, double[][] means)Constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected KDTreePruningKMeans.KDNodebuildTreeBoundedMidpoint(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)Build the k-d-tree using bounded midpoint splitting.protected KDTreePruningKMeans.KDNodebuildTreeMedian(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)Build the k-d-tree using median splitting.protected KDTreePruningKMeans.KDNodebuildTreeMidpoint(Relation<? extends NumberVector> relation, int left, int right)Build the k-d-tree using midpoint splitting.protected KDTreePruningKMeans.KDNodebuildTreeSSQ(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)Build the k-d-tree using a variance-based splitting strategy.protected LogginggetLogger()Get the class logger.protected doublegetMinMaxDist(double[] mid, double[] halfwidth, int alive)Get the smallest maximum distance for pruning.protected intiterate(int iteration)Main loop function.protected intlabelSubtree(double[] sum, int start, int end, int index)Label an entire subtree.protected doublemindist(double[] mean, double[] mid, double[] halfwidth)Get the smallest maximum distance for pruning.protected intpruning(KDTreePruningKMeans.KDNode u, int alive)The pruning algorithm.voidrun(int maxiter)Run the clustering.protected inttraversal(KDTreePruningKMeans.KDNode u, int alive)The tree traversal algorithm.protected inttraverseLeaf(int start, int end, int alive)Traversal of a leaf (assuming alive > 1)-
Methods inherited from class elki.clustering.kmeans.AbstractKMeans.Instance
assignToNearestCluster, buildResult, buildResult, computeSquaredSeparation, copyMeans, distance, distance, distance, initialSeperation, meansFromSums, movedDistance, recomputeSeperation, recomputeVariance, sqrtdistance, sqrtdistance, sqrtdistance
-
-
-
-
Field Detail
-
root
protected KDTreePruningKMeans.KDNode root
The root node of the tree
-
sorted
protected ArrayModifiableDBIDs sorted
The tree stored as ArrayModifiableDBIDs
-
iter
protected DBIDArrayMIter iter
Iterator into the k-d-tree entries.
-
indices
protected int[] indices
Array of candidate indexes.
-
clusterSums
protected double[][] clusterSums
To aggregate the sum of a cluster.
-
clusterSizes
protected int[] clusterSizes
Number of elements in each cluster.
-
-
Constructor Detail
-
Instance
public Instance(Relation<? extends NumberVector> relation, NumberVectorDistance<?> df, double[][] means)
Constructor.- Parameters:
relation- Relation of data pointsdf- Distance functionmeans- Initial means
-
-
Method Detail
-
run
public void run(int maxiter)
Description copied from class:AbstractKMeans.InstanceRun the clustering.- Overrides:
runin classAbstractKMeans.Instance- Parameters:
maxiter- Maximum number of iterations
-
buildTreeMidpoint
protected KDTreePruningKMeans.KDNode buildTreeMidpoint(Relation<? extends NumberVector> relation, int left, int right)
Build the k-d-tree using midpoint splitting.- Parameters:
relation- Data relationleft- Left subintervalright- Right subinterval- Returns:
- Root node
-
buildTreeBoundedMidpoint
protected KDTreePruningKMeans.KDNode buildTreeBoundedMidpoint(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using bounded midpoint splitting.- Parameters:
relation- Data relationleft- Left subintervalright- Right subintervalcomp- Comparator- Returns:
- Root node
-
buildTreeMedian
protected KDTreePruningKMeans.KDNode buildTreeMedian(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using median splitting.- Parameters:
relation- Data relationleft- Left subintervalright- Right subintervalcomp- Comparator- Returns:
- Root node
-
buildTreeSSQ
protected KDTreePruningKMeans.KDNode buildTreeSSQ(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using a variance-based splitting strategy.- Parameters:
relation- Data relationleft- Left subintervalright- Right subintervalcomp- Comparator- Returns:
- Root node
-
iterate
protected int iterate(int iteration)
Description copied from class:AbstractKMeans.InstanceMain loop function.- Specified by:
iteratein classAbstractKMeans.Instance- Parameters:
iteration- Iteration number (beginning at 1)- Returns:
- Number of reassigned points
-
traversal
protected int traversal(KDTreePruningKMeans.KDNode u, int alive)
The tree traversal algorithm.- Parameters:
u- Current nodealive- Number of alive centers- Returns:
- Number of relabeled point
-
labelSubtree
protected int labelSubtree(double[] sum, int start, int end, int index)Label an entire subtree.- Parameters:
sum- Node sumstart- Start offsetend- End offsetindex- New mean index- Returns:
- Number of reassigned points (for convergence)
-
pruning
protected int pruning(KDTreePruningKMeans.KDNode u, int alive)
The pruning algorithm.- Parameters:
u- Current nodealive- Range of alive means- Returns:
- Updated range.
-
getMinMaxDist
protected double getMinMaxDist(double[] mid, double[] halfwidth, int alive)Get the smallest maximum distance for pruning.- Parameters:
mid- Midpointhalfwidth- Half widthalive- Number of alive centers- Returns:
- Smallest maximum distance
-
mindist
protected double mindist(double[] mean, double[] mid, double[] halfwidth)Get the smallest maximum distance for pruning.- Parameters:
mean- Cluster meanmid- Midpointhalfwidth- Half width- Returns:
- Minimum distance
-
traverseLeaf
protected int traverseLeaf(int start, int end, int alive)Traversal of a leaf (assuming alive > 1)- Parameters:
start- Start indexend- End indexalive- Number of alive means- Returns:
- Number of relabeled points (for convergence)
-
getLogger
protected Logging getLogger()
Description copied from class:AbstractKMeans.InstanceGet the class logger.- Specified by:
getLoggerin classAbstractKMeans.Instance- Returns:
- Logger
-
-