Class FastPAM<O>
- java.lang.Object
-
- elki.clustering.kmedoids.PAM<O>
-
- elki.clustering.kmedoids.FastPAM1<O>
-
- elki.clustering.kmedoids.FastPAM<O>
-
- Type Parameters:
O
- vector datatype
- All Implemented Interfaces:
Algorithm
,ClusteringAlgorithm<Clustering<MedoidModel>>
,KMedoidsClustering<O>
@Priority(102) @Reference(authors="Erich Schubert, Peter J. Rousseeuw", title="Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms", booktitle="Proc. 12th Int. Conf. Similarity Search and Applications (SISAP\'2019)", url="https://doi.org/10.1007/978-3-030-32047-8_16", bibkey="DBLP:conf/sisap/SchubertR19") public class FastPAM<O> extends FastPAM1<O>
FastPAM: An improved version of PAM, that is usually O(k) times faster. This class incorporates the benefits ofFastPAM1
, but in addition it tries to perform multiple swaps in each iteration (FastPAM2), which can reduce the total number of iterations needed substantially for large k, if some areas of the data are largely independent.There is a tolerance parameter, which controls how many additional swaps are performed. When set to 0, it will only execute an additional swap if it appears to be independent (i.e., the improvements resulting from the swap have not decreased when the first swap was executed). We suggest to rather leave it at the default of 1, which means to perform any additional swap that gives an improvement. We could not observe a tendency to find worse results when doing these additional swaps, but a reduced runtime.
Because of the speed benefits, we also suggest to use a linear-time initialization, such as the k-means++ initialization or the proposed LAB (linear approximative BUILD, the third component of FastPAM) initialization, and try multiple times if the runtime permits.
Reference:
Erich Schubert, Peter J. Rousseeuw
Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
Proc. 12th Int. Conf. Similarity Search and Applications (SISAP'2019)- Since:
- 0.7.5
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description protected static class
FastPAM.Instance
Instance for a single dataset.static class
FastPAM.Par<V>
Parameterization class.-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields Modifier and Type Field Description protected double
fasttol
Tolerance for fast swapping behavior (may perform worse swaps).private static java.lang.String
KEY
Key for statistics logging.private static Logging
LOG
The logger for this class.-
Fields inherited from class elki.clustering.kmedoids.PAM
distance, initializer, k, maxiter
-
-
Constructor Summary
Constructors Constructor Description FastPAM(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer)
Constructor.FastPAM(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer, double fasttol)
Constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected Logging
getLogger()
Get the static class logger.Clustering<MedoidModel>
run(Relation<O> relation, int k, DistanceQuery<? super O> distQ)
Run k-medoids clustering with a given distance query.
Not a very elegant API, but needed for some types of nested k-medoids.-
Methods inherited from class elki.clustering.kmedoids.PAM
getInputTypeRestriction, initialMedoids, run, wrapResult
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.ClusteringAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
The logger for this class.
-
KEY
private static final java.lang.String KEY
Key for statistics logging.
-
fasttol
protected double fasttol
Tolerance for fast swapping behavior (may perform worse swaps).
-
-
Constructor Detail
-
FastPAM
public FastPAM(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer)
Constructor.- Parameters:
distance
- distance functionk
- k parametermaxiter
- Maxiter parameterinitializer
- Function to generate the initial means
-
FastPAM
public FastPAM(Distance<? super O> distance, int k, int maxiter, KMedoidsInitialization<O> initializer, double fasttol)
Constructor.- Parameters:
distance
- distance functionk
- k parametermaxiter
- Maxiter parameterinitializer
- Function to generate the initial meansfasttol
- Tolerance for fast swapping
-
-
Method Detail
-
run
public Clustering<MedoidModel> run(Relation<O> relation, int k, DistanceQuery<? super O> distQ)
Description copied from interface:KMedoidsClustering
Run k-medoids clustering with a given distance query.
Not a very elegant API, but needed for some types of nested k-medoids.
-
-