de.lmu.ifi.dbs.elki.algorithm.clustering.subspace

Class CLIQUE

• All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<SubspaceModel>>, SubspaceClusteringAlgorithm<SubspaceModel>

@Title(value="CLIQUE: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications")
@Description(value="Grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.")
@Reference(authors="R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan",
title="Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications",
booktitle="Proc. SIGMOD Conference, Seattle, WA, 1998",
url="https://doi.org/10.1145/276304.276314",
bibkey="DBLP:conf/sigmod/AgrawalGGR98")
public class CLIQUE
extends AbstractAlgorithm<Clustering<SubspaceModel>>
implements SubspaceClusteringAlgorithm<SubspaceModel>
Implementation of the CLIQUE algorithm, a grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.

The implementation consists of two steps:
1. Identification of subspaces that contain clusters
2. Identification of clusters

The third step of the original algorithm (Generation of minimal description for the clusters) is not (yet) implemented.

Note: this is fairly old code, and not well optimized. Do not use this for runtime benchmarking!

Reference:

R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan
Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications
In Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998.

Since:
0.1
Author:
Elke Achtert
• Nested Class Summary

Nested Classes
Modifier and Type Class and Description
static class  CLIQUE.Parameterizer
Parameterization class.
• Field Summary

Fields
Modifier and Type Field and Description
private static Logging LOG
The logger for this class.
protected boolean prune
Pruning flag.
protected double tau
Density threshold / selectivity.
protected int xsi
Number of intervals in each dimension.
• Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm

ALGORITHM_ID
• Constructor Summary

Constructors
Constructor and Description
CLIQUE(int xsi, double tau, boolean prune)
Constructor.
• Method Summary

All Methods
Modifier and Type Method and Description
private double[][] computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces, int[] mi, int[] mp)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P.
private int[][] computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P.
private java.util.List<Pair<Subspace,ModifiableDBIDs>> determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.
private java.util.List<CLIQUESubspace> findDenseSubspaceCandidates(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines the k-dimensional dense subspace candidates from the specified (k-1)-dimensional dense subspaces.
private java.util.List<CLIQUESubspace> findDenseSubspaces(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
Determines the k-dimensional dense subspaces and performs a pruning if this option is chosen.
private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass over the database.
private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if this option is chosen.
TypeInformation[] getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
protected Logging getLogger()
Get the (STATIC) logger for this class.
private java.util.Collection<CLIQUEUnit> initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.
private static double log2OrZero(double x)
Robust log 2, that ignores zero values.
private java.util.List<CLIQUESubspace> pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described in the CLIQUE algorithm.
Clustering<SubspaceModel> run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.
private void updateMinMax(NumberVector featureVector, double[] minima, double[] maxima)
Updates the minima and maxima array according to the specified feature vector.
• Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm

run
• Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm

run
• Field Detail

• LOG

private static final Logging LOG
The logger for this class.
• xsi

protected int xsi
Number of intervals in each dimension.
• tau

protected double tau
Density threshold / selectivity.
• prune

protected boolean prune
Pruning flag.
• Constructor Detail

• CLIQUE

public CLIQUE(int xsi,
double tau,
boolean prune)
Constructor.
Parameters:
xsi - Xsi interval number
tau - Tau density threshold
prune - Prune flag
• Method Detail

• run

public Clustering<SubspaceModel> run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.
Parameters:
relation - Data relation to process
Returns:
Clustering result
• determineClusters

private java.util.List<Pair<Subspace,ModifiableDBIDs>> determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.
Parameters:
denseSubspaces - the dense subspaces in reverse order by their coverage
Returns:
the clusters in the specified dense subspaces and the corresponding cluster models
• findOneDimensionalDenseSubspaces

private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if this option is chosen.
Parameters:
database - the database to run the algorithm on
Returns:
the one dimensional dense subspaces reverse ordered by their coverage
• findDenseSubspaces

private java.util.List<CLIQUESubspace> findDenseSubspaces(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the k-dimensional dense subspaces and performs a pruning if this option is chosen.
Parameters:
database - the database to run the algorithm on
denseSubspaces - the (k-1)-dimensional dense subspaces
Returns:
a list of the k-dimensional dense subspaces sorted in reverse order by their coverage
• initOneDimensionalUnits

private java.util.Collection<CLIQUEUnit> initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.
Parameters:
database - the database to run the algorithm on
Returns:
the created one dimensional units
• updateMinMax

private void updateMinMax(NumberVector featureVector,
double[] minima,
double[] maxima)
Updates the minima and maxima array according to the specified feature vector.
Parameters:
featureVector - the feature vector
minima - the array of minima
maxima - the array of maxima
• findOneDimensionalDenseSubspaceCandidates

private java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass over the database.
Parameters:
database - the database to run the algorithm on
Returns:
the one-dimensional dense subspace candidates reverse ordered by their coverage
• findDenseSubspaceCandidates

private java.util.List<CLIQUESubspace> findDenseSubspaceCandidates(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the k-dimensional dense subspace candidates from the specified (k-1)-dimensional dense subspaces.
Parameters:
database - the database to run the algorithm on
denseSubspaces - the (k-1)-dimensional dense subspaces
Returns:
a list of the k-dimensional dense subspace candidates reverse ordered by their coverage
• pruneDenseSubspaces

private java.util.List<CLIQUESubspace> pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described in the CLIQUE algorithm.
Parameters:
denseSubspaces - the subspaces to be pruned sorted in reverse order by their coverage
Returns:
the subspaces which are not pruned reverse ordered by their coverage
• computeMeans

private int[][] computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P. For each set the mean of the cover fractions is computed.
Parameters:
denseSubspaces - the dense subspaces in reverse order by their coverage
Returns:
the mean of the cover fractions, the first value is the mean of the selected set I, the second value is the mean of the pruned set P.
• computeDiffs

private double[][] computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces,
int[] mi,
int[] mp)
The specified sorted list of dense subspaces is divided into the selected set I and the pruned set P. For each set the difference from the specified mean values is computed.
Parameters:
denseSubspaces - denseSubspaces the dense subspaces in reverse order by their coverage
mi - the mean of the selected sets I
mp - the mean of the pruned sets P
Returns:
the difference from the specified mean values, the first value is the difference from the mean of the selected set I, the second value is the difference from the mean of the pruned set P.
• log2OrZero

private static double log2OrZero(double x)
Robust log 2, that ignores zero values.
Parameters:
x - Input value
Returns:
Log2(x), or zero.
• getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<Clustering<SubspaceModel>>
Returns:
Type restriction
• getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by:
getLogger in class AbstractAlgorithm<Clustering<SubspaceModel>>
Returns:
the static logger