Type Parameters:
V - Data type

@Reference(authors="L. Kaufman, P. J. Rousseeuw", title="Clustering Large Data Sets", booktitle="Pattern Recognition in Practice", url="https://doi.org/10.1016/B978-0-444-87877-9.50039-X", bibkey="doi:10.1016/B978-0-444-87877-9.50039-X")
@Reference(authors="L. Kaufman, P. J. Rousseeuw", title="Clustering Large Applications (Program CLARA)", booktitle="Finding Groups in Data: An Introduction to Cluster Analysis", url="https://doi.org/10.1002/9780470316801.ch3", bibkey="doi:10.1002/9780470316801.ch3")
public class CLARA<V> extends KMedoidsPAM<V>
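Clustering Large Applications (CLARA) is a clustering method for large data sets that applies PAM, partitioning around medoids (KMedoidsPAM), to samples of the data.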
 TODO: use a triangular distance matrix, rather than a hash-map based cache, for a bit better performance and less memory.
Reference:
 L. Kaufman, P. J. Rousseeuw
 Clustering Large Data Sets
 Pattern Recognition in Practice
 
 L. Kaufman, P. J. Rousseeuw
 Clustering Large Applications (Program CLARA)
 Finding Groups in Data: An Introduction to Cluster Analysis
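The TODO in the class description suggests replacing the hash-map based distance cache with a triangular distance matrix. The following is a minimal sketch of that idea only. The class name `TriangularDistanceMatrix` and its methods are hypothetical and not part of ELKI, and it assumes the sampled objects are addressed by dense integer offsets 0..n-1.

```java
import java.util.Arrays;

/** Hypothetical sketch: lower-triangular storage for a symmetric distance matrix. */
final class TriangularDistanceMatrix {
  private final double[] data;

  TriangularDistanceMatrix(int size) {
    // n*(n-1)/2 entries; the diagonal is implicitly zero and not stored.
    // Note: int arithmetic overflows for very large sizes; fine for small samples.
    this.data = new double[size * (size - 1) / 2];
    Arrays.fill(data, Double.NaN); // NaN marks "not yet computed"
  }

  /** Position of the pair (i, j), i != j, in the flat array. */
  static int index(int i, int j) {
    int hi = Math.max(i, j), lo = Math.min(i, j);
    return hi * (hi - 1) / 2 + lo;
  }

  /** Cached distance, or NaN if the pair has not been stored yet. */
  double get(int i, int j) {
    return i == j ? 0. : data[index(i, j)];
  }

  /** Store a distance; symmetric pairs share one slot. */
  void set(int i, int j, double d) {
    if (i != j) {
      data[index(i, j)] = d;
    }
  }
}
```

Compared with a hash map keyed by ID pairs, this layout avoids boxing and hashing, at the cost of allocating all n(n-1)/2 slots up front.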
| Modifier and Type | Class and Description |
|---|---|
| (package private) static class | CLARA.CachedDistanceQuery<V>: Cached distance query. |
| static class | CLARA.Parameterizer<V>: Parameterization class. |
Nested classes inherited from KMedoidsPAM: KMedoidsPAM.Instance

| Modifier and Type | Field and Description |
|---|---|
| (package private) boolean | keepmed: Keep the previous medoids in the sample (see page 145). |
| private static Logging | LOG: Class logger. |
| (package private) int | numsamples: Number of samples to draw (i.e. iterations). |
| (package private) RandomFactory | random: Random factory for initialization. |
| (package private) double | sampling: Sampling rate. |
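The `sampling` and `keepmed` fields control how each sample is drawn. The sketch below is a simplified illustration, not ELKI's implementation: it uses plain integer IDs instead of DBIDs, and the class `SamplingSketch` is hypothetical. The rule that rates of at most 1 are fractions of the data set, while larger values are absolute counts, is an assumption about how the "absolute or relative" sampling rate could be resolved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Hypothetical sketch of drawing a CLARA-style sample with plain integer IDs. */
final class SamplingSketch {
  /** Assumption: sampling <= 1 is a relative fraction, larger values are absolute counts. */
  static int sampleSize(double sampling, int total) {
    int size = sampling <= 1. ? (int) Math.ceil(sampling * total) : (int) sampling;
    return Math.max(1, Math.min(size, total)); // clamp to [1, total]
  }

  /** Draw a random sample that always contains the previous medoids. */
  static Set<Integer> randomSample(List<Integer> ids, int samplesize, Random rnd, Set<Integer> previous) {
    // Start from the previous medoids (empty set if they should not be kept).
    Set<Integer> sample = new LinkedHashSet<>(previous);
    List<Integer> shuffled = new ArrayList<>(ids);
    Collections.shuffle(shuffled, rnd);
    for (int id : shuffled) {
      if (sample.size() >= samplesize) {
        break;
      }
      sample.add(id); // duplicates of previous medoids are ignored by the set
    }
    return sample;
  }
}
```

Passing an empty `previous` set corresponds to `keepmed` being false in this sketch.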
Inherited fields: initializer, k, maxiter; ALGORITHM_ID; DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, boolean keepmed, RandomFactory random): Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| (package private) static double | assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<?> distQ): Assigns the objects that were not part of the sample to their nearest cluster. |
| (package private) static DBIDs | randomSample(DBIDs ids, int samplesize, java.util.Random rnd, DBIDs previous): Draw a random sample of the desired size. |
| Clustering<MedoidModel> | run(Database database, Relation<V> relation): Run k-medoids. |
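The assignment step listed in the method summary can be illustrated with a small self-contained sketch. This is not ELKI code: the class `AssignRemainingSketch` and the `Distance` interface are hypothetical stand-ins for ELKI's data stores and `DistanceQuery`, objects are plain array offsets, and returning the summed distance of the newly assigned objects is an assumption suggested by the `double` return type.

```java
import java.util.Set;

/** Hypothetical sketch: assign non-sampled objects to their nearest medoid. */
final class AssignRemainingSketch {
  /** Stand-in for a distance query between two object offsets. */
  interface Distance {
    double between(int a, int b);
  }

  /**
   * @param medoids offsets of the current medoids
   * @param ids all object offsets
   * @param sampled offsets that were already assigned while clustering the sample
   * @param assignment output array: cluster index per object offset
   * @return sum of distances of the newly assigned objects (assumed cost measure)
   */
  static double assignRemaining(int[] medoids, int[] ids, Set<Integer> sampled, int[] assignment, Distance dist) {
    double cost = 0.;
    for (int id : ids) {
      if (sampled.contains(id)) {
        continue; // already assigned with the sample
      }
      int best = -1;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int m = 0; m < medoids.length; m++) {
        double d = dist.between(id, medoids[m]);
        if (d < bestDist) {
          bestDist = d;
          best = m;
        }
      }
      assignment[id] = best;
      cost += bestDist;
    }
    return cost;
  }
}
```

In a CLARA-style loop, a cost of this kind would be added to the cost of the sampled objects, and the medoids of the cheapest of the `numsamples` rounds would be kept.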
Inherited methods: getInputTypeRestriction, getLogger, initialMedoids, run; getDistanceFunction; run; clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait; run

private static final Logging LOG

double sampling

int numsamples

boolean keepmed

RandomFactory random

public CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, boolean keepmed, RandomFactory random)
Parameters:
distanceFunction - Distance function to use
k - Number of clusters to produce
maxiter - Maximum number of iterations
initializer - Initialization function
numsamples - Number of samples (sampling iterations)
sampling - Sampling rate (absolute or relative)
keepmed - Keep the previous medoids in the next sample
random - Random generator

public Clustering<MedoidModel> run(Database database, Relation<V> relation)
Overrides: run in class KMedoidsPAM<V>
Parameters:
database - Database
relation - relation to use

static DBIDs randomSample(DBIDs ids, int samplesize, java.util.Random rnd, DBIDs previous)
Parameters:
ids - IDs to sample from
samplesize - Sample size
rnd - Random generator
previous - Previous medoids to always include in the sample.

static double assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<?> distQ)
Parameters:
means - Object centroids
ids - Object ids
rids - Sample that was already assigned
assignment - cluster assignment
distQ - distance query

Copyright © 2019 ELKI Development Team. License information.