Package elki.outlier.clustering
Class CBLOF<O extends NumberVector>
- java.lang.Object
-
- elki.outlier.clustering.CBLOF<O>
-
- Type Parameters:
O - the type of data objects handled by this algorithm
- All Implemented Interfaces:
Algorithm, OutlierAlgorithm
@Title("Discovering cluster-based local outliers")
@Reference(authors="Z. He, X. Xu, S. Deng", title="Discovering cluster-based local outliers", booktitle="Pattern Recognition Letters 24(9-10)", url="https://doi.org/10.1016/S0167-8655(03)00003-5", bibkey="DBLP:journals/prl/HeXD03")
public class CBLOF<O extends NumberVector> extends java.lang.Object implements OutlierAlgorithm
Cluster-based local outlier factor (CBLOF).
Reference:
Z. He, X. Xu, S. Deng
Discovering cluster-based local outliers
Pattern Recognition Letters 24(9-10)
Implementation note: this algorithm is hard to implement generically, that is, with support for arbitrary clustering algorithms and distances, because it is not trivial to ensure that the clustering algorithm and the outlier method use compatible data types and distances.
- Since:
- 0.7.5
- Author:
- Patrick Kostjens
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected double | alpha | The ratio of the size that separates the large clusters from the small clusters.
protected double | beta | The minimal ratio between two consecutive clusters (when ordered descending by size) at which the boundary between the large and small clusters is set.
protected ClusteringAlgorithm<Clustering<MeanModel>> | clusteringAlgorithm | The clustering algorithm to use.
protected NumberVectorDistance<? super O> | distance | Distance function used.
private static Logging | LOG | The logger for this class.
-
Constructor Summary
Constructors
Constructor | Description
CBLOF(NumberVectorDistance<? super O> distance, ClusteringAlgorithm<Clustering<MeanModel>> clusteringAlgorithm, double alpha, double beta) | Constructor.
-
Method Summary
Modifier and Type | Method | Description
private void | computeCBLOFs(Relation<O> relation, WritableDoubleDataStore cblofs, DoubleMinMax cblofMinMax, java.util.List<? extends Cluster<MeanModel>> largeClusters, java.util.List<? extends Cluster<MeanModel>> smallClusters) | Compute the CBLOF scores for all the data.
private double | computeLargeClusterCBLOF(O obj, NumberVectorDistance<? super O> distance, NumberVector clusterMean, Cluster<MeanModel> cluster) |
private double | computeSmallClusterCBLOF(O obj, NumberVectorDistance<? super O> distance, java.util.List<NumberVector> largeClusterMeans, Cluster<MeanModel> cluster) |
private int | getClusterBoundary(Relation<O> relation, java.util.List<? extends Cluster<MeanModel>> clusters) | Compute the boundary index separating the large clusters from the small clusters.
TypeInformation[] | getInputTypeRestriction() | Get the input type restriction used for negotiating the data query.
OutlierResult | run(Database database, Relation<O> relation) | Run CBLOF.
private void | storeCBLOFScore(WritableDoubleDataStore cblofs, DoubleMinMax cblofMinMax, double cblof, DBIDIter iter) |
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface elki.outlier.OutlierAlgorithm
autorun
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
The logger for this class.
-
distance
protected NumberVectorDistance<? super O> distance
Distance function used.
-
clusteringAlgorithm
protected ClusteringAlgorithm<Clustering<MeanModel>> clusteringAlgorithm
The clustering algorithm to use.
-
alpha
protected double alpha
The ratio of the size that separates the large clusters from the small clusters. The clusters are ordered descending by size and are taken until the specified ratio of the data is included. For example: a ratio of 0.9 indicates that the large clusters should cover at least 90% of the data points.
-
beta
protected double beta
The minimal ratio between two consecutive clusters (when ordered descending by size) at which the boundary between the large and small clusters is set. For example: a ratio of 3 means that the clusters are separated between cluster i and (i+1) (where (i+1) is the first cluster smaller than i) when cluster i is at least 3 times bigger than (i+1).
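How the two ratios interact can be sketched with plain cluster sizes. The following is a minimal illustration, not the code of this class: the class name `ClusterBoundarySketch`, the method `boundary`, and the choice to return the index of the last large cluster are assumptions made for the example. It walks the clusters in descending size order and stops as soon as either the alpha criterion (cumulative coverage) or the beta criterion (size drop between neighbours) is met.

```java
import java.util.Arrays;

/** Hypothetical sketch (not ELKI code): split clusters into large and small using alpha and beta. */
class ClusterBoundarySketch {
  /**
   * @param sizes cluster sizes, sorted in descending order
   * @param alpha fraction of the data that the large clusters should cover
   * @param beta minimal size ratio between two consecutive clusters at the boundary
   * @return index of the last cluster that is still considered large
   */
  static int boundary(int[] sizes, double alpha, double beta) {
    int total = Arrays.stream(sizes).sum();
    int cumulative = 0;
    for (int i = 0; i < sizes.length - 1; i++) {
      cumulative += sizes[i];
      // alpha criterion: clusters 0..i already cover the requested share of the data
      if (cumulative >= total * alpha) {
        return i;
      }
      // beta criterion: cluster i is at least beta times as large as cluster i+1
      if (sizes[i] / (double) sizes[i + 1] >= beta) {
        return i;
      }
    }
    // Neither criterion fired: every cluster counts as large
    return sizes.length - 1;
  }

  public static void main(String[] args) {
    // alpha fires at index 1: 50 + 45 = 95 of 100 points
    System.out.println(boundary(new int[] {50, 45, 3, 2}, 0.9, 5.0));
    // beta fires at index 0: 60 / 10 = 6 >= 3
    System.out.println(boundary(new int[] {60, 10, 5}, 0.9, 3.0));
  }
}
```

In this sketch the clusters at indices 0 up to and including the returned index are treated as large, and the remaining ones as small.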
-
-
Constructor Detail
-
CBLOF
public CBLOF(NumberVectorDistance<? super O> distance, ClusteringAlgorithm<Clustering<MeanModel>> clusteringAlgorithm, double alpha, double beta)
Constructor.
- Parameters:
distance - the neighborhood distance function
clusteringAlgorithm - the clustering algorithm
alpha - the ratio of the data that should be included in the large clusters
beta - the ratio of the sizes of the clusters at the boundary between the large and the small clusters
-
-
Method Detail
-
run
public OutlierResult run(Database database, Relation<O> relation)
Run CBLOF.
- Parameters:
database - Database to run on
relation - Relation to use for CBLOF computation
- Returns:
- Outlier result
-
getClusterBoundary
private int getClusterBoundary(Relation<O> relation, java.util.List<? extends Cluster<MeanModel>> clusters)
Compute the boundary index separating the large clusters from the small clusters.
- Parameters:
relation - Data to process
clusters - All clusters that were found
- Returns:
- Index of the boundary between the large and small clusters.
-
computeCBLOFs
private void computeCBLOFs(Relation<O> relation, WritableDoubleDataStore cblofs, DoubleMinMax cblofMinMax, java.util.List<? extends Cluster<MeanModel>> largeClusters, java.util.List<? extends Cluster<MeanModel>> smallClusters)
Compute the CBLOF scores for all the data.
- Parameters:
relation - Data to process
cblofs - CBLOF scores
cblofMinMax - Minimum/maximum score tracker
largeClusters - Large clusters output
smallClusters - Small clusters output
-
storeCBLOFScore
private void storeCBLOFScore(WritableDoubleDataStore cblofs, DoubleMinMax cblofMinMax, double cblof, DBIDIter iter)
-
computeSmallClusterCBLOF
private double computeSmallClusterCBLOF(O obj, NumberVectorDistance<? super O> distance, java.util.List<NumberVector> largeClusterMeans, Cluster<MeanModel> cluster)
-
computeLargeClusterCBLOF
private double computeLargeClusterCBLOF(O obj, NumberVectorDistance<? super O> distance, NumberVector clusterMean, Cluster<MeanModel> cluster)
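This page does not document the two scoring methods. As a rough guide, the sketch below follows the CBLOF definition from the referenced paper by He, Xu and Deng: an object in a large cluster is scored by its cluster size times the distance to its own cluster, and an object in a small cluster by its cluster size times the distance to the nearest large cluster. The class and method names are hypothetical, clusters are represented by their means, and Euclidean distance on `double[]` vectors is assumed in place of a `NumberVectorDistance`; the actual private methods may differ in detail.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch (not ELKI code) of CBLOF scoring as defined in the referenced paper. */
class CblofScoreSketch {
  /** Euclidean distance; an assumption standing in for the configured distance function. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Object in a large cluster: cluster size times distance to its own cluster mean. */
  static double largeClusterScore(double[] obj, double[] clusterMean, int clusterSize) {
    return clusterSize * euclidean(obj, clusterMean);
  }

  /** Object in a small cluster: cluster size times distance to the nearest large cluster mean. */
  static double smallClusterScore(double[] obj, List<double[]> largeClusterMeans, int clusterSize) {
    double nearest = Double.POSITIVE_INFINITY;
    for (double[] mean : largeClusterMeans) {
      nearest = Math.min(nearest, euclidean(obj, mean));
    }
    return clusterSize * nearest;
  }

  public static void main(String[] args) {
    double[] obj = {3, 4};
    List<double[]> largeMeans = Arrays.asList(new double[] {0, 0}, new double[] {3, 0});
    // Member of a large cluster of 10 objects centred at the origin
    System.out.println(largeClusterScore(obj, largeMeans.get(0), 10));
    // Member of a small cluster of 2 objects; the nearest large mean is (3, 0)
    System.out.println(smallClusterScore(obj, largeMeans, 2));
  }
}
```

Because the cluster size is a factor in both cases, a far-away object in a large cluster can outrank an object in a tiny cluster that lies close to a large one.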
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface: Algorithm
Get the input type restriction used for negotiating the data query.
- Specified by:
getInputTypeRestriction in interface Algorithm
- Returns:
- Type restriction
-
-