V - Vector type@Reference(authors="R. T. Ng, J. Han", title="CLARANS: a method for clustering objects for spatial data mining", booktitle="IEEE Transactions on Knowledge and Data Engineering 14(5)", url="https://doi.org/10.1109/TKDE.2002.1033770", bibkey="DBLP:journals/tkde/NgH02") public class CLARANS<V> extends AbstractDistanceBasedAlgorithm<V,Clustering<MedoidModel>> implements ClusteringAlgorithm<Clustering<MedoidModel>>
KMedoidsPAM)
and CLARA and also based on sampling.
This implementation tries to balance memory and computation time. By caching the distances to the two nearest medoids, we usually only need O(n) instead of O(nk) distance computations for one iteration, at the cost of needing O(2n) memory to store them.
The implementation is fairly ugly, because we have three solutions (the best found so far, the current solution, and a neighbor candidate); and for each point in each solution we need the best and second best assignments. But with Java 11, we may be able to switch to value types that would clean this code significantly, without the overhead of O(n) objects.
Reference:
R. T. Ng, J. Han
CLARANS: a method for clustering objects for spatial data mining
IEEE Transactions on Knowledge and Data Engineering 14(5)
| Modifier and Type | Class and Description |
|---|---|
protected static class |
CLARANS.Assignment
Assignment state.
|
static class |
CLARANS.Parameterizer<V>
Parameterization class.
|
| Modifier and Type | Field and Description |
|---|---|
(package private) int |
k
Number of clusters to find.
|
private static Logging |
LOG
Class logger.
|
(package private) double |
maxneighbor
Sampling rate.
|
(package private) int |
numlocal
Number of samples to draw (i.e. restarts).
|
(package private) RandomFactory |
random
Random factory for initialization.
|
ALGORITHM_IDDISTANCE_FUNCTION_ID| Constructor and Description |
|---|
CLARANS(DistanceFunction<? super V> distanceFunction,
int k,
int numlocal,
double maxneighbor,
RandomFactory random)
Constructor.
|
| Modifier and Type | Method and Description |
|---|---|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
Clustering<MedoidModel> |
run(Database database,
Relation<V> relation) |
getDistanceFunctionrunclone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitrunprivate static final Logging LOG
int k
int numlocal
double maxneighbor
RandomFactory random
public CLARANS(DistanceFunction<? super V> distanceFunction, int k, int numlocal, double maxneighbor, RandomFactory random)
distanceFunction - Distance function to usek - Number of clusters to producenumlocal - Number of samples (restarts)maxneighbor - Neighbor sampling rate (absolute or relative)random - Random generatorpublic Clustering<MedoidModel> run(Database database, Relation<V> relation)
public TypeInformation[] getInputTypeRestriction()
AbstractAlgorithmgetInputTypeRestriction in interface AlgorithmgetInputTypeRestriction in class AbstractAlgorithm<Clustering<MedoidModel>>protected Logging getLogger()
AbstractAlgorithmgetLogger in class AbstractAlgorithm<Clustering<MedoidModel>>Copyright © 2019 ELKI Development Team. License information.