Package elki.clustering.kmeans
Class KDTreePruningKMeans.Instance
- java.lang.Object
-
- elki.clustering.kmeans.AbstractKMeans.Instance
-
- elki.clustering.kmeans.KDTreePruningKMeans.Instance
-
- Direct Known Subclasses:
KDTreeFilteringKMeans.Instance
- Enclosing class:
- KDTreePruningKMeans<V extends NumberVector>
protected class KDTreePruningKMeans.Instance extends AbstractKMeans.Instance
Inner instance, storing state for a single data set.- Author:
- Cedrik Lüdicke, Erich Schubert
-
-
Field Summary
Fields Modifier and Type Field Description protected int[]
clusterSizes
Number of elements in each cluster.protected double[][]
clusterSums
To aggregate the sum of a cluster.protected int[]
indices
Array of candidate indexes.protected DBIDArrayMIter
iter
Iterator into the k-d-tree entries.protected KDTreePruningKMeans.KDNode
root
The root node of the treeprotected ArrayModifiableDBIDs
sorted
The tree stored as ArrayModifiableDBIDs-
Fields inherited from class elki.clustering.kmeans.AbstractKMeans.Instance
assignment, clusters, diststat, isSquared, k, key, means, relation, varsum
-
-
Constructor Summary
Constructors Constructor Description Instance(Relation<? extends NumberVector> relation, NumberVectorDistance<?> df, double[][] means)
Constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected KDTreePruningKMeans.KDNode
buildTreeBoundedMidpoint(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using bounded midpoint splitting.protected KDTreePruningKMeans.KDNode
buildTreeMedian(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using median splitting.protected KDTreePruningKMeans.KDNode
buildTreeMidpoint(Relation<? extends NumberVector> relation, int left, int right)
Build the k-d-tree using midpoint splitting.protected KDTreePruningKMeans.KDNode
buildTreeSSQ(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using a variance-based splitting strategy.protected Logging
getLogger()
Get the class logger.protected double
getMinMaxDist(double[] mid, double[] halfwidth, int alive)
Get the smallest maximum distance for pruning.protected int
iterate(int iteration)
Main loop function.protected int
labelSubtree(double[] sum, int start, int end, int index)
Label an entire subtree.protected double
mindist(double[] mean, double[] mid, double[] halfwidth)
Get the smallest maximum distance for pruning.protected int
pruning(KDTreePruningKMeans.KDNode u, int alive)
The pruning algorithm.void
run(int maxiter)
Run the clustering.protected int
traversal(KDTreePruningKMeans.KDNode u, int alive)
The tree traversal algorithm.protected int
traverseLeaf(int start, int end, int alive)
Traversal of a leaf (assuming alive > 1)-
Methods inherited from class elki.clustering.kmeans.AbstractKMeans.Instance
assignToNearestCluster, buildResult, buildResult, computeSquaredSeparation, copyMeans, distance, distance, distance, initialSeperation, meansFromSums, movedDistance, recomputeSeperation, recomputeVariance, sqrtdistance, sqrtdistance, sqrtdistance
-
-
-
-
Field Detail
-
root
protected KDTreePruningKMeans.KDNode root
The root node of the tree
-
sorted
protected ArrayModifiableDBIDs sorted
The tree stored as ArrayModifiableDBIDs
-
iter
protected DBIDArrayMIter iter
Iterator into the k-d-tree entries.
-
indices
protected int[] indices
Array of candidate indexes.
-
clusterSums
protected double[][] clusterSums
To aggregate the sum of a cluster.
-
clusterSizes
protected int[] clusterSizes
Number of elements in each cluster.
-
-
Constructor Detail
-
Instance
public Instance(Relation<? extends NumberVector> relation, NumberVectorDistance<?> df, double[][] means)
Constructor.- Parameters:
relation
- Relation of data pointsdf
- Distance functionmeans
- Initial means
-
-
Method Detail
-
run
public void run(int maxiter)
Description copied from class:AbstractKMeans.Instance
Run the clustering.- Overrides:
run
in classAbstractKMeans.Instance
- Parameters:
maxiter
- Maximum number of iterations
-
buildTreeMidpoint
protected KDTreePruningKMeans.KDNode buildTreeMidpoint(Relation<? extends NumberVector> relation, int left, int right)
Build the k-d-tree using midpoint splitting.- Parameters:
relation
- Data relationleft
- Left subintervalright
- Right subinterval- Returns:
- Root node
-
buildTreeBoundedMidpoint
protected KDTreePruningKMeans.KDNode buildTreeBoundedMidpoint(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using bounded midpoint splitting.- Parameters:
relation
- Data relationleft
- Left subintervalright
- Right subintervalcomp
- Comparator- Returns:
- Root node
-
buildTreeMedian
protected KDTreePruningKMeans.KDNode buildTreeMedian(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using median splitting.- Parameters:
relation
- Data relationleft
- Left subintervalright
- Right subintervalcomp
- Comparator- Returns:
- Root node
-
buildTreeSSQ
protected KDTreePruningKMeans.KDNode buildTreeSSQ(Relation<? extends NumberVector> relation, int left, int right, VectorUtil.SortDBIDsBySingleDimension comp)
Build the k-d-tree using a variance-based splitting strategy.- Parameters:
relation
- Data relationleft
- Left subintervalright
- Right subintervalcomp
- Comparator- Returns:
- Root node
-
iterate
protected int iterate(int iteration)
Description copied from class:AbstractKMeans.Instance
Main loop function.- Specified by:
iterate
in classAbstractKMeans.Instance
- Parameters:
iteration
- Iteration number (beginning at 1)- Returns:
- Number of reassigned points
-
traversal
protected int traversal(KDTreePruningKMeans.KDNode u, int alive)
The tree traversal algorithm.- Parameters:
u
- Current nodealive
- Number of alive centers- Returns:
- Number of relabeled point
-
labelSubtree
protected int labelSubtree(double[] sum, int start, int end, int index)
Label an entire subtree.- Parameters:
sum
- Node sumstart
- Start offsetend
- End offsetindex
- New mean index- Returns:
- Number of reassigned points (for convergence)
-
pruning
protected int pruning(KDTreePruningKMeans.KDNode u, int alive)
The pruning algorithm.- Parameters:
u
- Current nodealive
- Range of alive means- Returns:
- Updated range.
-
getMinMaxDist
protected double getMinMaxDist(double[] mid, double[] halfwidth, int alive)
Get the smallest maximum distance for pruning.- Parameters:
mid
- Midpointhalfwidth
- Half widthalive
- Number of alive centers- Returns:
- Smallest maximum distance
-
mindist
protected double mindist(double[] mean, double[] mid, double[] halfwidth)
Get the smallest maximum distance for pruning.- Parameters:
mean
- Cluster meanmid
- Midpointhalfwidth
- Half width- Returns:
- Minimum distance
-
traverseLeaf
protected int traverseLeaf(int start, int end, int alive)
Traversal of a leaf (assuming alive > 1)- Parameters:
start
- Start indexend
- End indexalive
- Number of alive means- Returns:
- Number of relabeled points (for convergence)
-
getLogger
protected Logging getLogger()
Description copied from class:AbstractKMeans.Instance
Get the class logger.- Specified by:
getLogger
in classAbstractKMeans.Instance
- Returns:
- Logger
-
-