Class AlternatingKMedoids<O>
- java.lang.Object
-
- elki.clustering.kmedoids.AlternatingKMedoids<O>
-
- Type Parameters:
O
- vector datatype
- All Implemented Interfaces:
Algorithm
,ClusteringAlgorithm<Clustering<MedoidModel>>
,KMedoidsClustering<O>
@Priority(-100) @Reference(authors="F. E. Maranzana",title="On the location of supply points to minimize transport costs",booktitle="Journal of the Operational Research Society 15.3",url="https://doi.org/10.1057/jors.1964.47",bibkey="doi:10.1057/jors.1964.47") @Reference(authors="A. P. Reynolds, G. Richards, B. de la Iglesia, V. J. Rayward-Smith",title="Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms",booktitle="J. Math. Model. Algorithms 5(4)",url="https://doi.org/10.1007/s10852-005-9022-1",bibkey="DBLP:journals/jmma/ReynoldsRIR06") @Reference(authors="H.-S. Park, C.-H. Jun",title="A simple and fast algorithm for K-medoids clustering",booktitle="Expert Systems with Applications 36(2)",url="https://doi.org/10.1016/j.eswa.2008.01.039",bibkey="DBLP:journals/eswa/ParkJ09") public class AlternatingKMedoids<O> extends java.lang.Object implements KMedoidsClustering<O>
A k-medoids clustering algorithm, implemented as EM-style batch algorithm; known in literature as the "alternate" method.In contrast to PAM, which will in each iteration update one medoid with one (arbitrary) non-medoid, this implementation follows the EM pattern. In the expectation step, the best medoid from the cluster members is chosen; in the M-step, the objects are reassigned to their nearest medoid.
This implementation evolved naturally from EM and k-means algorithms. We then came across a similar approach was published by Park and Jun, and made this code match their suggestions (in particular, the default initialization is as proposed by them, despite its shortcomings). But similar ideas were already discussed by Reynolds et al. as a side note, and can be further traced back to Maranzana.
In our experiments, it tends to be much faster than PAM, but also find less good solutions, as the medoids are only chosen from the cluster members. This aligns with findings of Reynolds et al. and can be explained with the requirement of the new medoid to cover the entire cluster. Similar observations were also made by Teitz and Bart, 1967, who described its performance as "erratic".
Reference:
F. E. Maranzana
On the location of supply points to minimize transport costs
Journal of the Operational Research Society 15.3A. P. Reynolds, G. Richards, B. de la Iglesia, V. J. Rayward-Smith
Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms
J. Math. Model. Algorithms 5(4)H.-S. Park, C.-H. Jun
A simple and fast algorithm for K-medoids clustering
Expert Systems with Applications 36(2)- Since:
- 0.5.0
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
AlternatingKMedoids.Par<V>
Parameterization class.-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields Modifier and Type Field Description protected Distance<? super O>
distance
Distance function used.protected KMedoidsInitialization<O>
initializer
Method to choose initial means.protected int
k
Number of clusters to find.private static java.lang.String
KEY
Key for statistics logging.private static Logging
LOG
The logger for this class.protected int
maxiter
Maximum number of iterations.
-
Constructor Summary
Constructors Constructor Description AlternatingKMedoids(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer)
Constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description TypeInformation[]
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.Clustering<MedoidModel>
run(Relation<O> relation)
Run k-medoids clustering.Clustering<MedoidModel>
run(Relation<O> relation, int k, DistanceQuery<? super O> distQ)
Run k-medoids clustering with a given distance query.
Not a very elegant API, but needed for some types of nested k-medoids.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.ClusteringAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
The logger for this class.
-
KEY
private static final java.lang.String KEY
Key for statistics logging.
-
k
protected int k
Number of clusters to find.
-
maxiter
protected int maxiter
Maximum number of iterations.
-
initializer
protected KMedoidsInitialization<O> initializer
Method to choose initial means.
-
-
Constructor Detail
-
AlternatingKMedoids
public AlternatingKMedoids(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer)
Constructor.- Parameters:
distance
- distance functionk
- k parametermaxiter
- Maxiter parameterinitializer
- Function to generate the initial means
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface:Algorithm
Get the input type restriction used for negotiating the data query.- Specified by:
getInputTypeRestriction
in interfaceAlgorithm
- Returns:
- Type restriction
-
run
public Clustering<MedoidModel> run(Relation<O> relation)
Description copied from interface:KMedoidsClustering
Run k-medoids clustering.- Specified by:
run
in interfaceKMedoidsClustering<O>
- Parameters:
relation
- relation to use- Returns:
- result
-
run
public Clustering<MedoidModel> run(Relation<O> relation, int k, DistanceQuery<? super O> distQ)
Description copied from interface:KMedoidsClustering
Run k-medoids clustering with a given distance query.
Not a very elegant API, but needed for some types of nested k-medoids.- Specified by:
run
in interfaceKMedoidsClustering<O>
- Parameters:
relation
- relation to usek
- Number of clustersdistQ
- Distance query to use- Returns:
- result
-
-