Package  Description 

de.lmu.ifi.dbs.elki.algorithm 
Algorithms suitable as a task for the KDDTask main routine.

de.lmu.ifi.dbs.elki.algorithm.benchmark 
Benchmarking pseudo algorithms.

de.lmu.ifi.dbs.elki.algorithm.classification 
Classification algorithms.

de.lmu.ifi.dbs.elki.algorithm.clustering 
Clustering algorithms.
Clustering algorithms are supposed to implement the Algorithm interface.

de.lmu.ifi.dbs.elki.algorithm.clustering.gdbscan 
Generalized DBSCAN.
Generalized DBSCAN is an abstraction of the original DBSCAN idea
that allows the use of arbitrary "neighborhood" and "core point" predicates.

de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical 
Hierarchical agglomerative clustering (HAC).

de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans 
K-means clustering and variations.

de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.parallel 
Parallelized implementations of k-means.

de.lmu.ifi.dbs.elki.algorithm.clustering.optics 
OPTICS family of clustering algorithms.

de.lmu.ifi.dbs.elki.algorithm.outlier 
Outlier detection algorithms

de.lmu.ifi.dbs.elki.algorithm.outlier.clustering 
Clustering-based outlier detection.

de.lmu.ifi.dbs.elki.algorithm.outlier.distance 
Distance-based outlier detection algorithms, such as DBOutlier and kNN.

de.lmu.ifi.dbs.elki.algorithm.outlier.distance.parallel 
Parallel implementations of distance-based outlier detectors.

de.lmu.ifi.dbs.elki.algorithm.outlier.intrinsic 
Outlier detection algorithms based on intrinsic dimensionality.

de.lmu.ifi.dbs.elki.algorithm.outlier.lof 
LOF family of outlier detection algorithms

de.lmu.ifi.dbs.elki.algorithm.outlier.lof.parallel 
Parallelized variants of LOF.

de.lmu.ifi.dbs.elki.algorithm.outlier.spatial 
Spatial outlier detection algorithms

de.lmu.ifi.dbs.elki.algorithm.statistics 
Statistical analysis algorithms.

tutorial.clustering 
Classes from the tutorial on implementing a custom k-means variation.

tutorial.outlier 
Tutorials on implementing outlier detection methods in ELKI.

Modifier and Type  Class and Description 

class 
AbstractDistanceBasedAlgorithm<O,R extends Result>
Abstract base class for distance-based algorithms.

class 
AbstractNumberVectorDistanceBasedAlgorithm<O,R extends Result>
Abstract base class for distance-based algorithms that need to work with
synthetic numerical vectors such as mean vectors.

class 
DependencyDerivator<V extends NumberVector>
Dependency derivator computes quantitatively linear dependencies among
attributes of a given dataset based on a linear correlation PCA.

class 
KNNDistancesSampler<O>
Provides an order of the kNN-distances for all objects within the database.

class 
KNNJoin<V extends NumberVector,N extends SpatialNode<N,E>,E extends SpatialEntry>
Joins each object in a given spatial database with its k-nearest neighbors.

Modifier and Type  Class and Description 

class 
KNNBenchmarkAlgorithm<O>
Benchmarking algorithm that computes the k nearest neighbors for each query
point.

class 
RangeQueryBenchmarkAlgorithm<O extends NumberVector>
Benchmarking algorithm that computes a range query for each point.

class 
ValidateApproximativeKNNIndex<O>
Algorithm to validate the quality of an approximative kNN index, by
performing a number of queries and comparing them to the results obtained by
exact indexing (e.g. linear scanning).

Modifier and Type  Class and Description 

class 
KNNClassifier<O>
KNNClassifier classifies instances based on the class distribution among the
k nearest neighbors in a database.
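
The voting idea behind KNNClassifier can be illustrated with a small, self-contained sketch. It is plain Java and does not use the ELKI API: the class name KnnVoteSketch, the method classify, and the choice of Euclidean distance are assumptions made only for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ELKI code: majority vote among the k nearest neighbors.
final class KnnVoteSketch {
  // Euclidean distance between two vectors of equal length (assumed metric).
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  // Returns the most frequent label among the k training points closest to the query.
  // On a tie, the label that reached the winning count first (closer neighbors) wins.
  static String classify(double[][] train, String[] labels, double[] query, int k) {
    Integer[] order = new Integer[train.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    // Sort training indices by increasing distance to the query.
    Arrays.sort(order, Comparator.comparingDouble(i -> distance(train[i], query)));
    Map<String, Integer> votes = new HashMap<>();
    String best = null;
    int bestCount = 0;
    for (int r = 0; r < Math.min(k, order.length); r++) {
      String label = labels[order[r]];
      int count = votes.merge(label, 1, Integer::sum);
      if (count > bestCount) {
        bestCount = count;
        best = label;
      }
    }
    return best;
  }
}
```

A linear scan with a full sort is used here for brevity; an index-backed kNN query would avoid sorting all training points.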

Modifier and Type  Class and Description 

class 
CanopyPreClustering<O>
Canopy pre-clustering is a simple preprocessing step for clustering.

class 
DBSCAN<O>
Density-Based Clustering of Applications with Noise (DBSCAN), an algorithm to
find density-connected sets in a database.

class 
GriDBSCAN<V extends NumberVector>
Using Grid for Accelerating Density-Based Clustering.

class 
Leader<O>
Leader clustering algorithm.

class 
NaiveMeanShiftClustering<V extends NumberVector>
Mean-shift based clustering algorithm.

Modifier and Type  Class and Description 

class 
LSDBC<O extends NumberVector>
Locally Scaled Density Based Clustering.

Modifier and Type  Class and Description 

class 
AbstractHDBSCAN<O,R extends Result>
Abstract base class for HDBSCAN variations.

class 
AGNES<O>
Hierarchical Agglomerative Clustering (HAC) or Agglomerative Nesting (AGNES)
is a classic hierarchical clustering algorithm.

class 
AnderbergHierarchicalClustering<O>
This is a modification of the classic AGNES algorithm for hierarchical
clustering using a nearest-neighbor heuristic for acceleration.

class 
CLINK<O>
CLINK algorithm for complete linkage.

class 
HDBSCANLinearMemory<O>
Linear memory implementation of HDBSCAN clustering.

class 
MiniMax<O>
Minimax Linkage clustering.

class 
MiniMaxAnderberg<O>
This is a modification of the classic MiniMax algorithm for hierarchical
clustering using a nearest-neighbor heuristic for acceleration.

class 
MiniMaxNNChain<O>
MiniMax hierarchical clustering using the NN-chain algorithm.

class 
NNChain<O>
NN-chain clustering algorithm.

class 
SLINK<O>
Implementation of the efficient Single-Link Algorithm SLINK of R.

class 
SLINKHDBSCANLinearMemory<O>
Linear memory implementation of HDBSCAN clustering based on SLINK.

Modifier and Type  Interface and Description 

interface 
KMeans<V extends NumberVector,M extends Model>
Some constants and options shared among k-means family algorithms.

Modifier and Type  Class and Description 

class 
AbstractKMeans<V extends NumberVector,M extends Model>
Abstract base class for k-means implementations.

class 
BestOfMultipleKMeans<V extends NumberVector,M extends MeanModel>
Run KMeans multiple times, and keep the best run.

class 
CLARA<V>
Clustering Large Applications (CLARA) is a sampling-based clustering method
for large data sets that builds on PAM, partitioning around medoids
(KMedoidsPAM).

class 
CLARANS<V>
CLARANS: a method for clustering objects for spatial data mining
is inspired by PAM (partitioning around medoids, KMedoidsPAM)
and CLARA and also based on sampling.

class 
FastCLARA<V>
Clustering Large Applications (CLARA) with the KMedoidsFastPAM
improvements, to increase scalability in the number of clusters.

class 
FastCLARANS<V>
A faster variation of CLARANS, that can explore O(k) as many swaps at a
similar cost by considering all medoids for each candidate non-medoid.

class 
KMeansAnnulus<V extends NumberVector>
Annulus k-means algorithm.

class 
KMeansBisecting<V extends NumberVector,M extends MeanModel>
The bisecting k-means algorithm works by starting with an initial
partitioning into two clusters, then repeatedly splitting the largest
cluster to get additional clusters.

class 
KMeansCompare<V extends NumberVector>
CompareMeans: Accelerated k-means by exploiting the triangle inequality and
pairwise distances of means to prune candidate means.

class 
KMeansElkan<V extends NumberVector>
Elkan's fast k-means by exploiting the triangle inequality.

class 
KMeansExponion<V extends NumberVector>
Newling's Exponion k-means algorithm, exploiting the triangle inequality.

class 
KMeansHamerly<V extends NumberVector>
Hamerly's fast k-means by exploiting the triangle inequality.

class 
KMeansLloyd<V extends NumberVector>
The standard k-means algorithm, using bulk iterations and commonly attributed
to Lloyd and Forgy (independently).

class 
KMeansMacQueen<V extends NumberVector>
The original k-means algorithm, using MacQueen-style incremental updates;
making this effectively an "online" (streaming) algorithm.

class 
KMeansMinusMinus<V extends NumberVector>
k-means--: A Unified Approach to Clustering and Outlier Detection.

class 
KMeansSimplifiedElkan<V extends NumberVector>
Simplified version of Elkan's k-means by exploiting the triangle inequality.

class 
KMeansSort<V extends NumberVector>
SortMeans: Accelerated k-means by exploiting the triangle inequality and
pairwise distances of means to prune candidate means (with sorting).

class 
KMediansLloyd<V extends NumberVector>
k-medians clustering algorithm, but using Lloyd-style bulk iterations instead
of the more complicated approach suggested by Kaufman and Rousseeuw (see
KMedoidsPAM instead).

class 
KMedoidsFastPAM<V>
FastPAM: An improved version of PAM, that is usually O(k) times faster.

class 
KMedoidsFastPAM1<V>
FastPAM1: A version of PAM that is O(k) times faster, i.e., now in O((n-k)²).

class 
KMedoidsPAM<V>
The original Partitioning Around Medoids (PAM) algorithm or k-medoids
clustering, as proposed by Kaufman and Rousseeuw in "Clustering by means of
Medoids".

class 
KMedoidsPAMReynolds<V>
The Partitioning Around Medoids (PAM) algorithm with some additional
optimizations proposed by Reynolds et al.

class 
KMedoidsPark<V>
A k-medoids clustering algorithm, implemented as EM-style bulk algorithm.

class 
SingleAssignmentKMeans<V extends NumberVector>
Pseudo-k-means variation that assigns each object to the nearest center.

class 
XMeans<V extends NumberVector,M extends MeanModel>
X-means: Extending K-means with Efficient Estimation of the Number of
Clusters.
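
To make the "bulk iterations" mentioned for KMeansLloyd concrete, here is a minimal sketch of Lloyd-style k-means in plain Java. It is not the ELKI implementation: the class name LloydSketch, the explicit initial centers, and the squared Euclidean distance are assumptions chosen to keep the example deterministic.

```java
import java.util.Arrays;

// Hypothetical sketch, not ELKI code: Lloyd-style k-means with bulk updates.
final class LloydSketch {
  // Squared Euclidean distance between two vectors of equal length.
  static double sqDist(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  // Index of the center closest to p; ties go to the lowest index.
  static int nearest(double[][] centers, double[] p) {
    int best = 0;
    for (int c = 1; c < centers.length; c++) {
      if (sqDist(centers[c], p) < sqDist(centers[best], p)) {
        best = c;
      }
    }
    return best;
  }

  // Alternates a full assignment pass with a full mean update until no
  // assignment changes or maxIter is reached. Returns the final centers.
  static double[][] run(double[][] data, double[][] init, int maxIter) {
    int k = init.length;
    int dim = data[0].length;
    double[][] centers = new double[k][];
    for (int c = 0; c < k; c++) {
      centers[c] = init[c].clone();
    }
    int[] assign = new int[data.length];
    Arrays.fill(assign, -1);
    for (int iter = 0; iter < maxIter; iter++) {
      boolean changed = false;
      for (int i = 0; i < data.length; i++) {
        int c = nearest(centers, data[i]);
        if (c != assign[i]) {
          assign[i] = c;
          changed = true;
        }
      }
      if (!changed) {
        break;
      }
      double[][] sums = new double[k][dim];
      int[] counts = new int[k];
      for (int i = 0; i < data.length; i++) {
        counts[assign[i]]++;
        for (int d = 0; d < dim; d++) {
          sums[assign[i]][d] += data[i][d];
        }
      }
      for (int c = 0; c < k; c++) {
        if (counts[c] > 0) { // an empty cluster keeps its previous center
          for (int d = 0; d < dim; d++) {
            centers[c][d] = sums[c][d] / counts[c];
          }
        }
      }
    }
    return centers;
  }
}
```

For the one-dimensional data 0, 1, 2, 10, 11, 12 with initial centers 0 and 1, this sketch converges to the centers 1 and 11. A MacQueen-style variant would instead update the affected means immediately after each single reassignment.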

Modifier and Type  Class and Description 

class 
ParallelLloydKMeans<V extends NumberVector>
Parallel implementation of k-means clustering.

Modifier and Type  Class and Description 

class 
AbstractOPTICS<O>
The OPTICS algorithm for density-based hierarchical clustering.

class 
DeLiClu<V extends NumberVector>
DeLiClu: Density-Based Hierarchical Clustering.
A hierarchical algorithm to find density-connected sets in a database,
closely related to OPTICS but exploiting the structure of an R-tree for
acceleration.

class 
OPTICSHeap<O>
The OPTICS algorithm for density-based hierarchical clustering.

class 
OPTICSList<O>
The OPTICS algorithm for density-based hierarchical clustering.

Modifier and Type  Class and Description 

class 
COP<V extends NumberVector>
Correlation outlier probability: Outlier Detection in Arbitrarily Oriented
Subspaces.
Reference: Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek:
Outlier Detection in Arbitrarily Oriented Subspaces. Proc.

class 
DWOF<O>
Algorithm to compute dynamic-window outlier factors in a database based on a
specified parameter k, which specifies the number of the neighbors to be
considered during the calculation of the DWOF score.

class 
OPTICSOF<O>
OPTICS-OF outlier detection algorithm, an algorithm to find Local Outliers in
a database based on ideas from OPTICSTypeAlgorithm clustering.

class 
SimpleCOP<V extends NumberVector>
Algorithm to compute local correlation outlier probability.

Modifier and Type  Class and Description 

class 
CBLOF<O extends NumberVector>
Cluster-based local outlier factor (CBLOF).

class 
SilhouetteOutlierDetection<O>
Outlier detection by using the Silhouette Coefficients.

Modifier and Type  Class and Description 

class 
AbstractDBOutlier<O>
Simple distance-based outlier detection algorithms.

class 
DBOutlierDetection<O>
Simple distance-based outlier detection algorithm.

class 
DBOutlierScore<O>
Compute percentage of neighbors in the given neighborhood with size d.

class 
HilOut<O extends NumberVector>
Fast Outlier Detection in High Dimensional Spaces: outlier detection using
Hilbert space-filling curves.
Reference: F.

class 
KNNDD<O>
Nearest Neighbor Data Description.

class 
KNNOutlier<O>
Outlier Detection based on the distance of an object to its k-th nearest
neighbor.

class 
KNNSOS<O>
kNN-based adaptation of Stochastic Outlier Selection.

class 
KNNWeightOutlier<O>
Outlier Detection based on the accumulated distances of a point to its k
nearest neighbors.

class 
LocalIsolationCoefficient<O>
The Local Isolation Coefficient is the sum of the kNN distance and the
average distance to its k nearest neighbors.

class 
ODIN<O>
Outlier detection based on the in-degree of the kNN graph.

class 
ReferenceBasedOutlierDetection
Reference-Based Outlier Detection algorithm, an algorithm that computes kNN
distances approximately, using reference points.

class 
SOS<O>
Stochastic Outlier Selection.
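
Several of the scores listed above (KNNOutlier, KNNWeightOutlier, LocalIsolationCoefficient) are simple functions of the distances to the k nearest neighbors. The following plain-Java sketch computes all three for one-dimensional data. It is not ELKI code: the class name KnnOutlierSketch, the brute-force scan, and the convention that the query point does not count as its own neighbor are assumptions of this example.

```java
import java.util.Arrays;

// Hypothetical sketch, not ELKI code: three scores derived from kNN distances.
final class KnnOutlierSketch {
  // Sorted distances from point i to its k nearest neighbors, excluding i itself.
  static double[] knnDistances(double[] data, int i, int k) {
    if (k < 1 || k > data.length - 1) {
      throw new IllegalArgumentException("k must be between 1 and n - 1");
    }
    double[] d = new double[data.length - 1];
    int n = 0;
    for (int j = 0; j < data.length; j++) {
      if (j != i) {
        d[n++] = Math.abs(data[i] - data[j]);
      }
    }
    Arrays.sort(d);
    return Arrays.copyOf(d, k);
  }

  // kNN outlier score: distance to the k-th nearest neighbor.
  static double knnDistance(double[] data, int i, int k) {
    return knnDistances(data, i, k)[k - 1];
  }

  // kNN weight score: accumulated distance to the k nearest neighbors.
  static double knnWeight(double[] data, int i, int k) {
    double sum = 0;
    for (double d : knnDistances(data, i, k)) {
      sum += d;
    }
    return sum;
  }

  // Local isolation coefficient: kNN distance plus average kNN distance.
  static double localIsolation(double[] data, int i, int k) {
    return knnDistance(data, i, k) + knnWeight(data, i, k) / k;
  }
}
```

With the data 0, 1, 2, 3, 10 and k = 2, the isolated point 10 gets a kNN distance of 8, a kNN weight of 15, and a local isolation coefficient of 15.5, while the point 1 gets 1, 2, and 2 respectively.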

Modifier and Type  Class and Description 

class 
ParallelKNNOutlier<O>
Parallel implementation of KNN Outlier detection.

class 
ParallelKNNWeightOutlier<O>
Parallel implementation of KNN Weight Outlier detection.

Modifier and Type  Class and Description 

class 
IDOS<O>
Intrinsic Dimensional Outlier Detection in High-Dimensional Data.

class 
IntrinsicDimensionalityOutlier<O>
Use intrinsic dimensionality for outlier detection.

class 
ISOS<O>
Intrinsic Stochastic Outlier Selection.

Modifier and Type  Class and Description 

class 
COF<O>
Connectivity-based Outlier Factor (COF).

class 
INFLO<O>
Influence Outliers using Symmetric Relationship (INFLO), using two-way search,
is an outlier detection method based on LOF that also uses the reverse kNN.

class 
KDEOS<O>
Generalized Outlier Detection with Flexible Kernel Density Estimates.

class 
LDF<O extends NumberVector>
Outlier Detection with Kernel Density Functions.

class 
LDOF<O>
Computes the LDOF (Local Distance-Based Outlier Factor) for all objects of a
Database.

class 
LOCI<O>
Fast Outlier Detection Using the "Local Correlation Integral".

class 
LOF<O>
Algorithm to compute density-based local outlier factors in a database based
on a specified parameter lof.k.

class 
SimpleKernelDensityLOF<O extends NumberVector>
A simple variant of the LOF algorithm, which uses a simple kernel density
estimation instead of the local reachability density.

class 
SimplifiedLOF<O>
A simplified version of the original LOF algorithm, which does not use the
reachability distance, yielding less stable results on inliers.

class 
VarianceOfVolume<O extends SpatialComparable>
Variance of Volume for outlier detection.
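
As an illustration of the reachability distance and local reachability density that the LOF entries above refer to, the following sketch computes textbook LOF scores for one-dimensional data in plain Java. It is not the ELKI implementation: the class name LofSketch, the brute-force neighbor search, the exclusion of the query point from its own neighborhood, and the truncation of tied neighbors to exactly k are assumptions of this example.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch, not ELKI code: textbook LOF on 1-D data by brute force.
final class LofSketch {
  // Indices of the k nearest neighbors of point i (excluding i), nearest first.
  static int[] neighbors(double[] data, int i, int k) {
    Integer[] idx = new Integer[data.length - 1];
    int n = 0;
    for (int j = 0; j < data.length; j++) {
      if (j != i) {
        idx[n++] = j;
      }
    }
    Arrays.sort(idx, Comparator.comparingDouble(o -> Math.abs(data[i] - data[o])));
    int[] result = new int[k];
    for (int r = 0; r < k; r++) {
      result[r] = idx[r];
    }
    return result;
  }

  // Returns the LOF score of every point. Duplicate points can yield an
  // infinite density; this sketch does not handle that case specially.
  static double[] lof(double[] data, int k) {
    int n = data.length;
    if (k < 1 || k > n - 1) {
      throw new IllegalArgumentException("k must be between 1 and n - 1");
    }
    int[][] nn = new int[n][];
    double[] kdist = new double[n];
    for (int i = 0; i < n; i++) {
      nn[i] = neighbors(data, i, k);
      kdist[i] = Math.abs(data[i] - data[nn[i][k - 1]]);
    }
    // Local reachability density: inverse of the mean reachability distance,
    // where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    double[] lrd = new double[n];
    for (int i = 0; i < n; i++) {
      double sum = 0;
      for (int o : nn[i]) {
        sum += Math.max(kdist[o], Math.abs(data[i] - data[o]));
      }
      lrd[i] = k / sum;
    }
    // LOF: mean density of the neighbors relative to the point's own density.
    double[] scores = new double[n];
    for (int i = 0; i < n; i++) {
      double sum = 0;
      for (int o : nn[i]) {
        sum += lrd[o];
      }
      scores[i] = sum / (k * lrd[i]);
    }
    return scores;
  }
}
```

For the data 0, 1, 2, 3, 10 with k = 2, the four clustered points score 1.0 and the isolated point 10 scores 5.0. Replacing the reachability distance by the plain k-distance of the point itself gives a variant in the spirit of SimplifiedLOF.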

Modifier and Type  Class and Description 

class 
ParallelLOF<O>
Parallel implementation of Local Outlier Factor using processors.

class 
ParallelSimplifiedLOF<O>
Parallel implementation of SimplifiedLOF Outlier detection using processors.

Modifier and Type  Class and Description 

class 
CTLuGLSBackwardSearchAlgorithm<V extends NumberVector>
GLS-Backward Search is a statistical approach to detecting spatial outliers.

class 
CTLuRandomWalkEC<P>
Spatial outlier detection based on random walks.

Modifier and Type  Class and Description 

class 
AveragePrecisionAtK<O>
Evaluate a distance function's performance by computing the average precision
at k, when ranking the objects by distance.

class 
DistanceQuantileSampler<O>
Compute a quantile of a distance sample, useful for choosing parameters for
algorithms.

class 
DistanceStatisticsWithClasses<O>
Algorithm to gather statistics over the distance distribution in the data
set.

class 
EstimateIntrinsicDimensionality<O>
Estimate global average intrinsic dimensionality of a data set.

class 
EvaluateRankingQuality<V extends NumberVector>
Evaluate a distance function with respect to kNN queries.

class 
EvaluateRetrievalPerformance<O>
Evaluate a distance function's performance by computing the mean average
precision, ROC, and NN classification performance when ranking the objects by
distance.

class 
HopkinsStatisticClusteringTendency
The Hopkins Statistic of Clustering Tendency measures the probability that a
data set is generated by a uniform data distribution.

class 
RangeQuerySelectivity<V extends NumberVector>
Evaluate the range query selectivity.

class 
RankingQualityHistogram<O>
Evaluate a distance function with respect to kNN queries.

Modifier and Type  Class and Description 

class 
NaiveAgglomerativeHierarchicalClustering1<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.

class 
NaiveAgglomerativeHierarchicalClustering2<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.

class 
NaiveAgglomerativeHierarchicalClustering3<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.

class 
NaiveAgglomerativeHierarchicalClustering4<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.

class 
SameSizeKMeansAlgorithm<V extends NumberVector>
K-means variation that produces equally sized clusters.

Modifier and Type  Class and Description 

class 
DistanceStddevOutlier<O>
A simple outlier detection algorithm that computes the standard deviation of
the kNN distances.
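
The score described for DistanceStddevOutlier can be sketched in a few lines of plain Java. This is not the tutorial code: the class name StddevOutlierSketch, the one-dimensional data, the exclusion of the query point from its own neighbors, and the use of the population (rather than sample) standard deviation are assumptions of this example.

```java
import java.util.Arrays;

// Hypothetical sketch, not ELKI code: standard deviation of the kNN distances.
final class StddevOutlierSketch {
  // Population standard deviation of the distances from point i to its
  // k nearest neighbors (excluding i itself).
  static double score(double[] data, int i, int k) {
    if (k < 1 || k > data.length - 1) {
      throw new IllegalArgumentException("k must be between 1 and n - 1");
    }
    double[] d = new double[data.length - 1];
    int n = 0;
    for (int j = 0; j < data.length; j++) {
      if (j != i) {
        d[n++] = Math.abs(data[i] - data[j]);
      }
    }
    Arrays.sort(d);
    double mean = 0;
    for (int r = 0; r < k; r++) {
      mean += d[r];
    }
    mean /= k;
    double var = 0;
    for (int r = 0; r < k; r++) {
      var += (d[r] - mean) * (d[r] - mean);
    }
    return Math.sqrt(var / k);
  }
}
```

With the data 0, 1, 2, 3, 10 and k = 2, the point 1 has two neighbors at distance 1 and scores 0, while the point 10 has neighbors at distances 7 and 8 and scores 0.5.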

Copyright © 2019 ELKI Development Team. License information.