Package elki.clustering.subspace
Class CLIQUE
- java.lang.Object
-
- elki.clustering.subspace.CLIQUE
-
- All Implemented Interfaces:
Algorithm
,ClusteringAlgorithm<Clustering<SubspaceModel>>
,SubspaceClusteringAlgorithm<SubspaceModel>
@Title("CLIQUE: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications") @Description("Grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.") @Reference(authors="R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan", title="Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications", booktitle="Proc. SIGMOD Conference, Seattle, WA, 1998", url="https://doi.org/10.1145/276304.276314", bibkey="DBLP:conf/sigmod/AgrawalGGR98") public class CLIQUE extends java.lang.Object implements SubspaceClusteringAlgorithm<SubspaceModel>
Implementation of the CLIQUE algorithm, a grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.The implementation consists of two steps:
1. Identification of subspaces that contain clusters
2. Identification of clustersThe third step of the original algorithm (Generation of minimal description for the clusters) is not (yet) implemented.
Note: this is fairly old code, and not well optimized. Do not use this for runtime benchmarking!
Reference:
R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan
Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications
In Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998.- Since:
- 0.1
- Author:
- Elke Achtert
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
CLIQUE.Par
Parameterization class.-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Constructor Summary
Constructors Constructor Description CLIQUE(int xsi, double tau, boolean prune)
Constructor.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description private double[][]
computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces, int[] mi, int[] mp)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P.private int[][]
computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P.private java.util.List<Pair<Subspace,ModifiableDBIDs>>
determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.private java.util.List<CLIQUESubspace>
findDenseSubspaceCandidates(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines thek
-dimensional dense subspace candidates from the specified(k-1)
-dimensional dense subspaces.private java.util.List<CLIQUESubspace>
findDenseSubspaces(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines thek
-dimensional dense subspaces and performs a pruning if this option is chosen.private java.util.List<CLIQUESubspace>
findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass over the database.private java.util.List<CLIQUESubspace>
findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if this option is chosen.TypeInformation[]
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.private java.util.Collection<CLIQUEUnit>
initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.private static double
log2OrZero(double x)
Robust log 2, that ignores zero values.private java.util.List<CLIQUESubspace>
pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described in the CLIQUE algorithm.Clustering<SubspaceModel>
run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.private void
updateMinMax(NumberVector featureVector, double[] minima, double[] maxima)
Updates the minima and maxima array according to the specified feature vector.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.clustering.ClusteringAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
The logger for this class.
-
xsi
protected int xsi
Number of intervals in each dimension.
-
tau
protected double tau
Density threshold / selectivity.
-
prune
protected boolean prune
Pruning flag.
-
-
Method Detail
-
run
public Clustering<SubspaceModel> run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.- Parameters:
relation
- Data relation to process- Returns:
- Clustering result
-
determineClusters
private java.util.List<Pair<Subspace,ModifiableDBIDs>> determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.- Parameters:
denseSubspaces
- the dense subspaces in reverse order by their coverage- Returns:
- the clusters in the specified dense subspaces and the corresponding cluster models
-
findOneDimensionalDenseSubspaces
private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if this option is chosen.- Parameters:
database
- the database to run the algorithm on- Returns:
- the one dimensional dense subspaces reverse ordered by their coverage
-
findDenseSubspaces
private java.util.List<CLIQUESubspace> findDenseSubspaces(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines thek
-dimensional dense subspaces and performs a pruning if this option is chosen.- Parameters:
database
- the database to run the algorithm ondenseSubspaces
- the(k-1)
-dimensional dense subspaces- Returns:
- a list of the
k
-dimensional dense subspaces sorted in reverse order by their coverage
-
initOneDimensionalUnits
private java.util.Collection<CLIQUEUnit> initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.- Parameters:
database
- the database to run the algorithm on- Returns:
- the created one dimensional units
-
updateMinMax
private void updateMinMax(NumberVector featureVector, double[] minima, double[] maxima)
Updates the minima and maxima array according to the specified feature vector.- Parameters:
featureVector
- the feature vectorminima
- the array of minimamaxima
- the array of maxima
-
findOneDimensionalDenseSubspaceCandidates
private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass over the database.- Parameters:
database
- the database to run the algorithm on- Returns:
- the one-dimensional dense subspace candidates reverse ordered by their coverage
-
findDenseSubspaceCandidates
private java.util.List<CLIQUESubspace> findDenseSubspaceCandidates(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines thek
-dimensional dense subspace candidates from the specified(k-1)
-dimensional dense subspaces.- Parameters:
database
- the database to run the algorithm ondenseSubspaces
- the(k-1)
-dimensional dense subspaces- Returns:
- a list of the
k
-dimensional dense subspace candidates reverse ordered by their coverage
-
pruneDenseSubspaces
private java.util.List<CLIQUESubspace> pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described in the CLIQUE algorithm.- Parameters:
denseSubspaces
- the subspaces to be pruned sorted in reverse order by their coverage- Returns:
- the subspaces which are not pruned reverse ordered by their coverage
-
computeMeans
private int[][] computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P. For each set the mean of the cover fractions is computed.- Parameters:
denseSubspaces
- the dense subspaces in reverse order by their coverage- Returns:
- the mean of the cover fractions, the first value is the mean of the selected set I, the second value is the mean of the pruned set P.
-
computeDiffs
private double[][] computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces, int[] mi, int[] mp)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P. For each set the difference from the specified mean values is computed.- Parameters:
denseSubspaces
- denseSubspaces the dense subspaces in reverse order by their coveragemi
- the mean of the selected sets Imp
- the mean of the pruned sets P- Returns:
- the difference from the specified mean values, the first value is the difference from the mean of the selected set I, the second value is the difference from the mean of the pruned set P.
-
log2OrZero
private static double log2OrZero(double x)
Robust log 2, that ignores zero values.- Parameters:
x
- Input value- Returns:
- Log2(x), or zero.
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface:Algorithm
Get the input type restriction used for negotiating the data query.- Specified by:
getInputTypeRestriction
in interfaceAlgorithm
- Returns:
- Type restriction
-
-