Class ClustersWithNoiseExtraction
- java.lang.Object
-
- elki.clustering.hierarchical.extraction.ClustersWithNoiseExtraction
-
- All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<Model>>
@Reference(authors="Erich Schubert, Michael Gertz",
    title="Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding",
    booktitle="ArXiV preprint, 1708.03569",
    url="http://arxiv.org/abs/1708.03569",
    bibkey="DBLP:journals/corr/abs-1708-03569")
@Priority(206)
public class ClustersWithNoiseExtraction
extends java.lang.Object
implements ClusteringAlgorithm<Clustering<Model>>
Extraction of a given number of clusters with a minimum size, and noise.
This executes the highest cut at which k clusters are retained, each with at least the minimum size, plus noise (single points that would only merge afterwards). If no such cut can be found, it returns a result with a relaxed k.
You need to specify: A) the minimum size of a cluster (using 1 makes little sense, because the extraction then simply executes all but the last k - 1 merges) and B) the desired number of clusters with at least minSize elements each.
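The cut described above can be illustrated with a small self-contained sketch. This is not the ELKI implementation: the class NoiseCutSketch, its methods, and the merge encoding (points are 0..n-1, and merge i creates cluster id n + i) are assumptions made for this example. It also simplifies noise to "any point whose cluster at the cut is smaller than minSize", and it returns -1 instead of relaxing k when no cut matches.

```java
import java.util.Arrays;

/** Toy illustration only; this is not ELKI code. */
final class NoiseCutSketch {
  /**
   * Find the highest cut that leaves exactly k clusters of at least minSize points.
   * Points have ids 0..n-1; merges[i] = {a, b} joins clusters a and b into a
   * new cluster with id n + i (a convention assumed for this sketch).
   *
   * @return number of merges to execute, or -1 if no cut yields exactly k
   */
  public static int findCut(int n, int[][] merges, int k, int minSize) {
    int[] size = new int[n + merges.length];
    Arrays.fill(size, 0, n, 1); // every point starts as a singleton
    int good = minSize <= 1 ? n : 0; // clusters that currently reach minSize
    int best = good == k ? 0 : -1;
    for (int i = 0; i < merges.length; i++) {
      int a = merges[i][0], b = merges[i][1];
      size[n + i] = size[a] + size[b];
      // The two children disappear and the merged cluster appears.
      if (size[a] >= minSize) good--;
      if (size[b] >= minSize) good--;
      if (size[n + i] >= minSize) good++;
      if (good == k) best = i + 1; // later merges are higher cuts, so overwrite
    }
    return best;
  }

  /** Label each point with its cluster id after cut merges; -1 marks noise. */
  public static int[] label(int n, int[][] merges, int cut, int minSize) {
    if (cut < 0) throw new IllegalArgumentException("no valid cut");
    int[] parent = new int[n + cut];
    int[] size = new int[n + cut];
    for (int i = 0; i < parent.length; i++) parent[i] = i;
    Arrays.fill(size, 0, n, 1);
    for (int i = 0; i < cut; i++) {
      int a = merges[i][0], b = merges[i][1];
      parent[a] = parent[b] = n + i;
      size[n + i] = size[a] + size[b];
    }
    int[] labels = new int[n];
    for (int p = 0; p < n; p++) {
      int root = p;
      while (parent[root] != root) root = parent[root];
      labels[p] = size[root] >= minSize ? root : -1;
    }
    return labels;
  }

  public static void main(String[] args) {
    // Seven points: {0,1,4} and {2,3,5} form two groups, point 6 joins last.
    int[][] merges = { {0, 1}, {2, 3}, {7, 4}, {8, 5}, {9, 10}, {11, 6} };
    int cut = findCut(7, merges, 2, 3);
    System.out.println(cut + " " + Arrays.toString(label(7, merges, cut, 3)));
    // prints: 4 [9, 9, 10, 10, 9, 10, -1]
  }
}
```

With k = 2 and a minimum size of 3, the sketch stops after four merges: the two groups of three points become clusters, and point 6, which would only merge at the very end, is reported as noise. With a minimum size of 1 every point counts as a cluster, so the same data yields a cut after n - k = 5 merges.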
Reference:
Erich Schubert, Michael Gertz
Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding
ArXiV preprint, 1708.03569
TODO: Also provide representatives and last merge height for clusters.
- Since:
- 0.7.5
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
protected class ClustersWithNoiseExtraction.Instance
Instance for a single data set.
static class ClustersWithNoiseExtraction.Par
Parameterization class.
-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Constructor Summary
Constructors (Constructor, Description)
ClustersWithNoiseExtraction(HierarchicalClusteringAlgorithm algorithm, int numCl, int minClSize)
Constructor.
-
Method Summary
Methods (Modifier and Type, Method, Description)
Clustering<Model> autorun(Database database)
Try to auto-run the algorithm on a database by calling a method called run, with an optional Database as the first argument, and with data relations as specified by Algorithm.getInputTypeRestriction().
TypeInformation[] getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
Clustering<Model> run(ClusterMergeHistory merges)
Process an existing result.
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
numCl
private int numCl
Minimum number of clusters.
-
minClSize
private int minClSize
Minimum cluster size.
-
algorithm
private HierarchicalClusteringAlgorithm algorithm
Clustering algorithm to run to obtain the hierarchy.
-
-
Constructor Detail
-
ClustersWithNoiseExtraction
public ClustersWithNoiseExtraction(HierarchicalClusteringAlgorithm algorithm, int numCl, int minClSize)
Constructor.
- Parameters:
algorithm - Algorithm to run
numCl - Number of clusters
minClSize - Minimum cluster size
-
-
Method Detail
-
autorun
public Clustering<Model> autorun(Database database)
Description copied from interface: Algorithm
Try to auto-run the algorithm on a database by calling a method called run, with an optional Database as the first argument, and with data relations as specified by Algorithm.getInputTypeRestriction().
- Specified by:
autorun in interface Algorithm
- Specified by:
autorun in interface ClusteringAlgorithm<Clustering<Model>>
- Parameters:
database - the database to run the algorithm on
- Returns:
- the Result computed by this algorithm
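The description of autorun mentions "calling a method called run". How ELKI performs that lookup is not shown on this page; the following is only a generic sketch of reflective dispatch to a run method. The names AutorunSketch and ToyAlgorithm are hypothetical, and a plain double[] stands in for a data relation.

```java
import java.lang.reflect.Method;

/** Toy illustration of reflective dispatch to a run method; not ELKI code. */
final class AutorunSketch {
  /** Hypothetical stand-in for an algorithm with a typed run method. */
  public static final class ToyAlgorithm {
    public String run(double[] relation) {
      return "ran on " + relation.length + " values";
    }
  }

  /** Invoke the first public method named "run" whose parameters accept the inputs. */
  public static Object autorun(Object algorithm, Object... inputs)
      throws ReflectiveOperationException {
    for (Method m : algorithm.getClass().getMethods()) {
      if (!m.getName().equals("run") || m.getParameterCount() != inputs.length) {
        continue;
      }
      Class<?>[] types = m.getParameterTypes();
      boolean matches = true;
      for (int i = 0; i < types.length; i++) {
        // isInstance is false for primitive parameter types; fine for this sketch.
        matches &= types[i].isInstance(inputs[i]);
      }
      if (matches) {
        return m.invoke(algorithm, inputs);
      }
    }
    throw new NoSuchMethodException("no run method accepts the given inputs");
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    System.out.println(autorun(new ToyAlgorithm(), new double[] {1.0, 2.0, 3.0}));
    // prints: ran on 3 values
  }
}
```

In this class, the typed entry point listed on this page is run(ClusterMergeHistory merges); callers that already hold a merge history can call it directly instead of going through autorun.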
-
run
public Clustering<Model> run(ClusterMergeHistory merges)
Process an existing result.
- Parameters:
merges - Existing result in pointer representation.
- Returns:
- Clustering
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface: Algorithm
Get the input type restriction used for negotiating the data query.
- Specified by:
getInputTypeRestriction in interface Algorithm
- Returns:
- Type restriction
-
-