Package tutorial.clustering
Class SameSizeKMeans<V extends NumberVector>
java.lang.Object
  elki.clustering.kmeans.AbstractKMeans<V,MeanModel>
    tutorial.clustering.SameSizeKMeans<V>
Type Parameters:
V - Vector type
All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<MeanModel>>, KMeans<V,MeanModel>
public class SameSizeKMeans<V extends NumberVector> extends AbstractKMeans<V,MeanModel>
K-means variation that produces equally sized clusters. Note that this is a rather obvious variation, and one should not expect very good results from it. K-means is already quite primitive, and adding the size constraint is unlikely to improve the results much; in particular, it makes the algorithm even less able to make sense of outliers.
There is no reference for this algorithm. If you want to cite it, please cite the latest ELKI release as given on the ELKI web page: https://elki-project.github.io/publications
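To illustrate the general idea of a size-constrained assignment, here is a minimal sketch in plain Java. It is not the ELKI implementation: the class name `EqualSizeAssignmentSketch`, the use of squared Euclidean distance, and the greedy ordering by "how much a point loses if denied its nearest mean" are all assumptions made for illustration. Each cluster is capped at ceil(n / k) points, so the clusters come out exactly equal when n is divisible by k.

```java
import java.util.Arrays;
import java.util.Comparator;

final class EqualSizeAssignmentSketch {
  /** Squared Euclidean distance between two vectors of equal length. */
  public static double sqDist(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return s;
  }

  /** Greedy assignment: returns the cluster index of each point, with a size cap per cluster. */
  public static int[] assign(double[][] points, double[][] means) {
    final int n = points.length, k = means.length;
    final int maxSize = (n + k - 1) / k; // ceil(n / k), illustrative cap
    final double[][] dist = new double[n][k];
    final double[] gain = new double[n];
    Integer[] order = new Integer[n];
    for (int i = 0; i < n; i++) {
      double best = Double.POSITIVE_INFINITY, worst = 0;
      for (int c = 0; c < k; c++) {
        dist[i][c] = sqDist(points[i], means[c]);
        best = Math.min(best, dist[i][c]);
        worst = Math.max(worst, dist[i][c]);
      }
      gain[i] = worst - best; // benefit of getting the nearest mean
      order[i] = i;
    }
    // Points with the largest benefit are assigned first.
    Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -gain[i]));
    int[] sizes = new int[k];
    int[] assignment = new int[n];
    for (int i : order) {
      int bestC = -1;
      // Pick the nearest mean whose cluster still has room.
      for (int c = 0; c < k; c++) {
        if (sizes[c] < maxSize && (bestC < 0 || dist[i][c] < dist[i][bestC])) {
          bestC = c;
        }
      }
      assignment[i] = bestC;
      sizes[bestC]++;
    }
    return assignment;
  }
}
```

Because k * ceil(n / k) is at least n, some cluster always has room, so every point receives an assignment.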
Since: 0.5.5
Author: Erich Schubert
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
private static class SameSizeKMeans.Meta - Object metadata.
static class SameSizeKMeans.Par<V extends NumberVector> - Parameterization class.
static class SameSizeKMeans.PreferenceComparator - Sort a list of integers (= cluster numbers) by the distances.
Nested classes/interfaces inherited from class elki.clustering.kmeans.AbstractKMeans
AbstractKMeans.Instance
-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
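The idea behind sorting cluster numbers by distance, as described for PreferenceComparator above, can be sketched in plain Java as follows. This is an illustrative stand-in, not the ELKI class: the name `PreferenceOrderSketch` and the `double[]` distance array are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PreferenceOrderSketch {
  /** Returns the cluster numbers 0..k-1 ordered from nearest to farthest mean. */
  public static List<Integer> preference(final double[] dists) {
    List<Integer> order = new ArrayList<>();
    for (int c = 0; c < dists.length; c++) {
      order.add(c);
    }
    // Compare cluster numbers by the distance stored at their index.
    order.sort(Comparator.comparingDouble((Integer c) -> dists[c]));
    return order;
  }
}
```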
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
private static Logging LOG - Class logger
Fields inherited from class elki.clustering.kmeans.AbstractKMeans
distance, initializer, k, maxiter
-
Fields inherited from interface elki.clustering.kmeans.KMeans
DISTANCE_FUNCTION_ID, INIT_ID, K_ID, MAXITER_ID, SEED_ID, VARSTAT_ID
-
-
Constructor Summary
Constructors (Constructor, Description)
SameSizeKMeans(NumberVectorDistance<? super V> distance, int k, int maxiter, KMeansInitialization initializer) - Constructor.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description)
protected Logging getLogger() - Get the (STATIC) logger for this class.
protected ArrayModifiableDBIDs initialAssignment(java.util.List<ModifiableDBIDs> clusters, WritableDataStore<SameSizeKMeans.Meta> metas, DBIDs ids)
protected WritableDataStore<SameSizeKMeans.Meta> initializeMeta(Relation<V> relation, double[][] means) - Initialize the metadata storage.
protected double[][] refineResult(Relation<V> relation, double[][] means, java.util.List<ModifiableDBIDs> clusters, WritableDataStore<SameSizeKMeans.Meta> metas, ArrayModifiableDBIDs tids) - Perform k-means style iterations to improve the clustering result.
Clustering<MeanModel> run(Relation<V> relation) - Run k-means with cluster size constraints.
protected void transfer(WritableDataStore<SameSizeKMeans.Meta> metas, SameSizeKMeans.Meta meta, ModifiableDBIDs src, ModifiableDBIDs dst, DBIDRef id, int dstnum) - Transfer a single element from one cluster to another.
protected void updateDistances(Relation<V> relation, double[][] means, WritableDataStore<SameSizeKMeans.Meta> metas, NumberVectorDistance<? super V> df) - Compute the distances of each object to all means.
Methods inherited from class elki.clustering.kmeans.AbstractKMeans
getDistance, getInputTypeRestriction, incrementalUpdateMean, initialMeans, means, minusEquals, nearestMeans, plusEquals, plusMinusEquals, setDistance, setInitializer, setK
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.ClusteringAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger
-
-
Constructor Detail
-
SameSizeKMeans
public SameSizeKMeans(NumberVectorDistance<? super V> distance, int k, int maxiter, KMeansInitialization initializer)
Constructor.
Parameters:
distance - Distance function
k - Number of clusters
maxiter - Maximum number of iterations
initializer - Initialization method
-
-
Method Detail
-
run
public Clustering<MeanModel> run(Relation<V> relation)
Run k-means with cluster size constraints.
Parameters:
relation - relation to use
Returns:
result
-
initializeMeta
protected WritableDataStore<SameSizeKMeans.Meta> initializeMeta(Relation<V> relation, double[][] means)
Initialize the metadata storage.
Parameters:
relation - Relation to process
means - Mean vectors
Returns:
Initialized storage
-
initialAssignment
protected ArrayModifiableDBIDs initialAssignment(java.util.List<ModifiableDBIDs> clusters, WritableDataStore<SameSizeKMeans.Meta> metas, DBIDs ids)
-
updateDistances
protected void updateDistances(Relation<V> relation, double[][] means, WritableDataStore<SameSizeKMeans.Meta> metas, NumberVectorDistance<? super V> df)
Compute the distances of each object to all means. Update SameSizeKMeans.Meta.secondary to point to the best cluster number other than the current cluster assignment.
Parameters:
relation - Data relation
means - Means
metas - Metadata storage
df - Distance function
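The update described above can be sketched in plain Java as below. This is a simplified illustration under stated assumptions, not the ELKI code: the `Meta` record here is hypothetical (only the `secondary` field name comes from this page; `dists` and `primary` are assumed), and Euclidean distance on `double[]` stands in for the NumberVectorDistance.

```java
final class UpdateDistancesSketch {
  /** Hypothetical per-object metadata; field layout is an assumption. */
  static final class Meta {
    final double[] dists; // distance to each mean
    int primary;          // current cluster assignment
    int secondary = -1;   // best cluster other than primary

    Meta(int k, int primary) {
      this.dists = new double[k];
      this.primary = primary;
    }
  }

  /** Euclidean distance, used here in place of a configurable distance function. */
  public static double distance(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return Math.sqrt(s);
  }

  /** Recompute all distances and the best alternative cluster for each object. */
  public static void updateDistances(double[][] points, double[][] means, Meta[] metas) {
    for (int i = 0; i < points.length; i++) {
      Meta m = metas[i];
      m.secondary = -1;
      for (int c = 0; c < means.length; c++) {
        m.dists[c] = distance(points[i], means[c]);
        // Track the nearest mean that is not the current assignment.
        if (c != m.primary && (m.secondary < 0 || m.dists[c] < m.dists[m.secondary])) {
          m.secondary = c;
        }
      }
    }
  }
}
```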
-
refineResult
protected double[][] refineResult(Relation<V> relation, double[][] means, java.util.List<ModifiableDBIDs> clusters, WritableDataStore<SameSizeKMeans.Meta> metas, ArrayModifiableDBIDs tids)
Perform k-means style iterations to improve the clustering result.
Parameters:
relation - Data relation
means - Means list
clusters - Cluster list
metas - Metadata storage
tids - DBIDs array
Returns:
final means
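One simple way to iterate while keeping cluster sizes fixed is to swap pairs of points between clusters whenever the swap lowers the total squared distance, then recompute the means. The sketch below shows that swapped-in technique in plain Java; it is an assumption made for illustration and does not claim to reproduce how refineResult works internally. All names (`SwapRefinementSketch`, `swapPass`, `refine`) are hypothetical.

```java
final class SwapRefinementSketch {
  public static double sqDist(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return s;
  }

  /** Recompute each mean as the average of the points assigned to it. */
  public static double[][] means(double[][] points, int[] assignment, int k) {
    int dim = points[0].length;
    double[][] sums = new double[k][dim];
    int[] counts = new int[k];
    for (int i = 0; i < points.length; i++) {
      counts[assignment[i]]++;
      for (int d = 0; d < dim; d++) {
        sums[assignment[i]][d] += points[i][d];
      }
    }
    for (int c = 0; c < k; c++) {
      for (int d = 0; d < dim; d++) {
        if (counts[c] > 0) {
          sums[c][d] /= counts[c];
        }
      }
    }
    return sums;
  }

  /** One pass of pairwise swaps; cluster sizes are unchanged. Returns the swap count. */
  public static int swapPass(double[][] points, double[][] means, int[] assignment) {
    int swaps = 0;
    for (int i = 0; i < points.length; i++) {
      for (int j = i + 1; j < points.length; j++) {
        int ci = assignment[i], cj = assignment[j];
        if (ci == cj) {
          continue;
        }
        double before = sqDist(points[i], means[ci]) + sqDist(points[j], means[cj]);
        double after = sqDist(points[i], means[cj]) + sqDist(points[j], means[ci]);
        if (after < before) { // swapping reduces the total cost
          assignment[i] = cj;
          assignment[j] = ci;
          swaps++;
        }
      }
    }
    return swaps;
  }

  /** Alternate swap passes and mean updates until no swap helps or maxiter is reached. */
  public static double[][] refine(double[][] points, int[] assignment, int k, int maxiter) {
    double[][] m = means(points, assignment, k);
    for (int iter = 0; iter < maxiter; iter++) {
      if (swapPass(points, m, assignment) == 0) {
        break;
      }
      m = means(points, assignment, k);
    }
    return m;
  }
}
```

The pairwise pass is quadratic in the number of points, so this form is only suitable for small data sets.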
-
transfer
protected void transfer(WritableDataStore<SameSizeKMeans.Meta> metas, SameSizeKMeans.Meta meta, ModifiableDBIDs src, ModifiableDBIDs dst, DBIDRef id, int dstnum)
Transfer a single element from one cluster to another.
Parameters:
metas - Meta storage
meta - Meta of current object
src - Source cluster
dst - Destination cluster
id - Object ID
dstnum - Destination cluster number
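A transfer of this kind amounts to three bookkeeping steps: remove the object from the source cluster, add it to the destination, and record the new assignment in its metadata. The following plain-Java sketch mirrors the parameter list above with standard collections; the `Meta` class, the `Map`-based storage, and the integer IDs are stand-ins assumed for illustration, not the ELKI types.

```java
import java.util.Map;
import java.util.Set;

final class TransferSketch {
  /** Hypothetical metadata holding the current cluster number of an object. */
  static final class Meta {
    int primary;

    Meta(int primary) {
      this.primary = primary;
    }
  }

  /** Move object id from src to dst and record dstnum as its new cluster. */
  public static void transfer(Map<Integer, Meta> metas, Meta meta, Set<Integer> src,
      Set<Integer> dst, int id, int dstnum) {
    src.remove(id);      // leave the source cluster
    dst.add(id);         // join the destination cluster
    meta.primary = dstnum;
    metas.put(id, meta); // write the updated metadata back to storage
  }
}
```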
-
getLogger
protected Logging getLogger()
Description copied from class: AbstractKMeans
Get the (STATIC) logger for this class.
Specified by:
getLogger in class AbstractKMeans<V extends NumberVector,MeanModel>
Returns:
the static logger
-
-