Package elki.clustering.uncertain
Class RepresentativeUncertainClustering
- java.lang.Object
  - elki.clustering.uncertain.RepresentativeUncertainClustering
- All Implemented Interfaces:
- Algorithm, ClusteringAlgorithm<Clustering<?>>
@Reference(authors="Andreas Z\u00fcfle, Tobias Emrich, Klaus Arthur Schmid, Nikos Mamoulis, Arthur Zimek, Mathias Renz",
    title="Representative clustering of uncertain data",
    booktitle="Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
    url="https://doi.org/10.1145/2623330.2623725",
    bibkey="DBLP:conf/kdd/ZufleESMZR14")
public class RepresentativeUncertainClustering
extends java.lang.Object
implements ClusteringAlgorithm<Clustering<?>>
Representative clustering of uncertain data.
This algorithm clusters uncertain data by repeatedly sampling a possible world and running a traditional clustering algorithm on each sample.
The resulting "possible" clusterings are then clustered themselves, using a clustering similarity measure. This yields a number of representatives for the set of all possible worlds.
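The procedure described above can be illustrated with a minimal, self-contained sketch. It does not use the ELKI API and is not the implementation of this class. Every name in it (PossibleWorldsSketch, sampleWorld, clusterWorld, clusteringDistance, medoid) is hypothetical, and several simplifications are assumed: uncertain objects are one-dimensional with a finite set of equally likely alternatives, the wrapped clustering algorithm is replaced by a threshold split, the clustering distance is a simple pair-counting measure, and the meta-clustering step is reduced to picking a single medoid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; none of these names are part of ELKI. */
final class PossibleWorldsSketch {
  /** Draw one possible world: pick one concrete value per uncertain object. */
  static double[] sampleWorld(double[][] objects, Random rnd) {
    double[] world = new double[objects.length];
    for (int i = 0; i < objects.length; i++) {
      world[i] = objects[i][rnd.nextInt(objects[i].length)];
    }
    return world;
  }

  /** Stand-in for the wrapped clustering algorithm: threshold split into two clusters. */
  static int[] clusterWorld(double[] world, double threshold) {
    int[] labels = new int[world.length];
    for (int i = 0; i < world.length; i++) {
      labels[i] = world[i] < threshold ? 0 : 1;
    }
    return labels;
  }

  /** Pair-counting distance: fraction of object pairs on which two clusterings disagree. */
  static double clusteringDistance(int[] a, int[] b) {
    int disagree = 0, pairs = 0;
    for (int i = 0; i < a.length; i++) {
      for (int j = i + 1; j < a.length; j++) {
        if ((a[i] == a[j]) != (b[i] == b[j])) {
          disagree++;
        }
        pairs++;
      }
    }
    return pairs == 0 ? 0.0 : disagree / (double) pairs;
  }

  /** Stand-in for meta-clustering with a single cluster: index of the medoid clustering. */
  static int medoid(List<int[]> clusterings) {
    int best = -1;
    double bestSum = Double.POSITIVE_INFINITY;
    for (int i = 0; i < clusterings.size(); i++) {
      double sum = 0.0;
      for (int[] other : clusterings) {
        sum += clusteringDistance(clusterings.get(i), other);
      }
      if (sum < bestSum) {
        bestSum = sum;
        best = i;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Each row lists the equally likely alternatives of one uncertain object.
    double[][] objects = { { 0.1, 0.2 }, { 0.3, 0.9 }, { 0.8, 1.0 }, { 0.4, 0.6 } };
    Random rnd = new Random(42L);
    List<int[]> clusterings = new ArrayList<>();
    for (int s = 0; s < 20; s++) {
      clusterings.add(clusterWorld(sampleWorld(objects, rnd), 0.5));
    }
    int rep = medoid(clusterings);
    System.out.println("Representative sample " + rep + ": " + Arrays.toString(clusterings.get(rep)));
  }
}
```

The pair-counting distance is invariant to relabeling of clusters, which is why it is a convenient choice for comparing clusterings from different samples. The actual class takes the distance and both algorithms as constructor arguments instead of fixing them.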
Reference:
Andreas Züfle, Tobias Emrich, Klaus Arthur Schmid, Nikos Mamoulis, Arthur Zimek, Mathias Renz
Representative clustering of uncertain data
In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- Since:
- 0.7.0
- Author:
- Alexander Koos, Erich Schubert
Nested Class Summary
Nested Classes (columns: Modifier and Type, Class, Description):
- static class RepresentativeUncertainClustering.Par - Parameterization class.
- static class RepresentativeUncertainClustering.RepresentativenessEvaluation - Representativeness evaluation result.
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields (columns: Modifier and Type, Field, Description):
- protected double alpha - Alpha parameter for confidence.
- protected ClusteringDistanceSimilarity distance - Distance function for clusterings.
- protected boolean keep - Keep all samples (not only the representative results).
- private static Logging LOG - Initialize a Logger.
- protected ClusteringAlgorithm<?> metaAlgorithm - The algorithm for meta-clustering.
- protected int numsamples - How many clusterings shall be made for aggregation.
- protected RandomFactory random - Random factory for sampling.
- protected ClusteringAlgorithm<?> samplesAlgorithm - The algorithm to be wrapped and run.
-
Constructor Summary
Constructors (columns: Constructor, Description):
- RepresentativeUncertainClustering(ClusteringDistanceSimilarity distance, ClusteringAlgorithm<?> metaAlgorithm, ClusteringAlgorithm<?> samplesAlgorithm, int numsamples, RandomFactory random, double alpha, boolean keep) - Constructor, quite trivial.
-
Method Summary
Methods (columns: Modifier and Type, Method, Description):
- private double computeConfidence(int support, int samples) - Estimate the confidence probability of a clustering.
- TypeInformation[] getInputTypeRestriction() - Get the input type restriction used for negotiating the data query.
- Clustering<?> run(Database database, Relation<? extends UncertainObject> relation) - This run method will do the wrapping.
- protected Clustering<?> runClusteringAlgorithm(java.lang.Object parent, DBIDs ids, DataStore<DoubleVector> store, int dim, java.lang.String title) - Run a clustering algorithm on a single instance.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.ClusteringAlgorithm
autorun
-
Field Detail
-
LOG
private static final Logging LOG
Initialize a Logger.
-
distance
protected ClusteringDistanceSimilarity distance
Distance function for clusterings.
-
metaAlgorithm
protected ClusteringAlgorithm<?> metaAlgorithm
The algorithm for meta-clustering.
-
samplesAlgorithm
protected ClusteringAlgorithm<?> samplesAlgorithm
The algorithm to be wrapped and run.
-
numsamples
protected int numsamples
How many clusterings shall be made for aggregation.
-
random
protected RandomFactory random
Random factory for sampling.
-
alpha
protected double alpha
Alpha parameter for confidence.
-
keep
protected boolean keep
Keep all samples (not only the representative results).
-
-
Constructor Detail
-
RepresentativeUncertainClustering
public RepresentativeUncertainClustering(ClusteringDistanceSimilarity distance, ClusteringAlgorithm<?> metaAlgorithm, ClusteringAlgorithm<?> samplesAlgorithm, int numsamples, RandomFactory random, double alpha, boolean keep)
Constructor, quite trivial.
- Parameters:
distance - Distance function for meta clustering
metaAlgorithm - Meta clustering algorithm
samplesAlgorithm - Primary clustering algorithm
numsamples - Number of samples
alpha - Alpha confidence
keep - Keep all samples (not only the representative results).
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface: Algorithm
Get the input type restriction used for negotiating the data query.
- Specified by:
- getInputTypeRestriction in interface Algorithm
- Returns:
- Type restriction
-
run
public Clustering<?> run(Database database, Relation<? extends UncertainObject> relation)
This run method will do the wrapping.
- Parameters:
database - Database
relation - Data relation of uncertain objects
- Returns:
- Clustering result
-
computeConfidence
private double computeConfidence(int support, int samples)
Estimate the confidence probability of a clustering.
- Parameters:
support - Number of supporting samples
samples - Total samples
- Returns:
- Probability
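The formula used by computeConfidence is not stated on this page. As a hedged illustration of how a confidence estimate could be derived from a support count and a sample count, the sketch below computes the lower bound of a normal-approximation interval around support / samples. This is an assumption, not the documented behavior. The class name ConfidenceSketch, the method lowerConfidenceBound, and the parameter z (a standard normal quantile chosen by the caller) are all hypothetical, and how the alpha field would map to such a quantile is not specified here.

```java
/** Illustrative sketch only; not taken from the ELKI source. */
final class ConfidenceSketch {
  /**
   * Lower bound of a normal-approximation interval around support / samples.
   *
   * @param support number of samples supporting the clustering
   * @param samples total number of samples, must be positive
   * @param z hypothetical parameter: standard normal quantile for the desired confidence level
   * @return conservative probability estimate, clamped to be non-negative
   */
  static double lowerConfidenceBound(int support, int samples, double z) {
    if (samples <= 0) {
      throw new IllegalArgumentException("samples must be positive");
    }
    double p = support / (double) samples; // observed fraction of supporting samples
    double halfWidth = z * Math.sqrt(p * (1.0 - p) / samples);
    return Math.max(0.0, p - halfWidth);
  }
}
```

With this estimator, a clustering supported by few samples receives a noticeably lower confidence than its raw support fraction, and the gap shrinks as the number of samples grows.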
-
runClusteringAlgorithm
protected Clustering<?> runClusteringAlgorithm(java.lang.Object parent, DBIDs ids, DataStore<DoubleVector> store, int dim, java.lang.String title)
Run a clustering algorithm on a single instance.
- Parameters:
parent - Parent result to attach to
ids - Object IDs to process
store - Input data
dim - Dimensionality
title - Title of relation
- Returns:
- Clustering result