Package elki.algorithm
Class DependencyDerivator<V extends NumberVector>
- java.lang.Object
-
- elki.algorithm.DependencyDerivator<V>
-
- Type Parameters:
V - the type of FeatureVector handled by this Algorithm
- All Implemented Interfaces:
Algorithm
@Title("Dependency Derivator: Deriving numerical inter-dependencies on data")
@Description("Derives an equality-system describing dependencies between attributes in a correlation-cluster")
@Reference(authors="Elke Achtert, Christian B\u00f6hm, Hans-Peter Kriegel, Peer Kr\u00f6ger, Arthur Zimek", title="Deriving Quantitative Dependencies for Correlation Clusters", booktitle="Proc. 12th Int. Conf. on Knowledge Discovery and Data Mining (KDD \'06)", url="https://doi.org/10.1145/1150402.1150408", bibkey="DBLP:conf/kdd/AchtertBKKZ06")
@Priority(-5)
public class DependencyDerivator<V extends NumberVector>
extends java.lang.Object
implements Algorithm
Dependency derivator computes quantitative linear dependencies among the attributes of a given dataset, based on a linear correlation PCA.
Reference:
Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, Arthur Zimek
Deriving Quantitative Dependencies for Correlation Clusters
Proc. 12th Int. Conf. on Knowledge Discovery and Data Mining (KDD '06)
- Since:
- 0.1
- Author:
- Arthur Zimek
-
-
Nested Class Summary
Modifier and Type | Class | Description
static class | DependencyDerivator.Par<V extends NumberVector> | Parameterization class.
-
Nested classes/interfaces inherited from interface elki.Algorithm
Algorithm.Utils
-
-
Field Summary
Modifier and Type | Field | Description
private NumberVectorDistance<? super V> | distance | Distance function used.
private EigenPairFilter | filter | Filter to select eigenvectors.
private static Logging | LOG | The logger for this class.
private java.text.NumberFormat | nf | Number format for output of solution.
private PCARunner | pca | Holds the object performing the PCA.
private boolean | randomsample | Flag for random sampling vs. kNN.
private int | sampleSize | The number of samples to draw.
-
Constructor Summary
Constructor | Description
DependencyDerivator(NumberVectorDistance<? super V> distance, java.text.NumberFormat nf, PCARunner pca, EigenPairFilter filter, int sampleSize, boolean randomsample) | Constructor.
-
Method Summary
Modifier and Type | Method | Description
CorrelationAnalysisSolution | generateModel(Relation<V> db, DBIDs ids) | Runs the PCA on the given set of IDs.
CorrelationAnalysisSolution | generateModel(Relation<V> relation, DBIDs ids, double[] centroid) | Runs the PCA on the given set of IDs and for the given centroid.
TypeInformation[] | getInputTypeRestriction() | Get the input type restriction used for negotiating the data query.
CorrelationAnalysisSolution | run(Relation<V> relation) | Computes quantitative linear dependencies among the attributes of the given database, based on a linear correlation PCA.
-
-
-
Field Detail
-
LOG
private static final Logging LOG
The logger for this class.
-
distance
private NumberVectorDistance<? super V> distance
Distance function used.
-
sampleSize
private final int sampleSize
The number of samples to draw.
-
pca
private final PCARunner pca
Holds the object performing the PCA.
-
filter
private final EigenPairFilter filter
Filter to select eigenvectors.
-
nf
private final java.text.NumberFormat nf
Number format for output of solution.
-
randomsample
private final boolean randomsample
Flag for random sampling vs. kNN
-
-
Constructor Detail
-
DependencyDerivator
public DependencyDerivator(NumberVectorDistance<? super V> distance, java.text.NumberFormat nf, PCARunner pca, EigenPairFilter filter, int sampleSize, boolean randomsample)
Constructor.
- Parameters:
distance - distance function
nf - Number format
pca - PCA runner
filter - Eigenvector filter
sampleSize - sample size
randomsample - flag for random sampling
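The sampleSize and randomsample parameters suggest two ways of picking the objects that enter the PCA. The following self-contained sketch contrasts them in plain Java. It is an illustration under stated assumptions, not ELKI code: the class SamplingSketch and its methods are hypothetical names, the kNN variant is assumed to take the nearest neighbours of the data centroid, and squared Euclidean distance stands in for the configurable distance function.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; not part of ELKI. */
final class SamplingSketch {
  /** Random sampling: shuffle a copy and keep the first k objects. */
  static List<double[]> randomSample(List<double[]> data, int k, long seed) {
    List<double[]> copy = new ArrayList<>(data);
    Collections.shuffle(copy, new Random(seed));
    return new ArrayList<>(copy.subList(0, Math.min(k, copy.size())));
  }

  /** kNN sampling (assumed semantics): keep the k objects closest to the centroid. */
  static List<double[]> nearestToCentroid(List<double[]> data, int k) {
    final int dim = data.get(0).length;
    final double[] centroid = new double[dim];
    for (double[] p : data) {
      for (int i = 0; i < dim; i++) {
        centroid[i] += p[i] / data.size();
      }
    }
    List<double[]> copy = new ArrayList<>(data);
    copy.sort(Comparator.comparingDouble(p -> squaredDistance(p, centroid)));
    return new ArrayList<>(copy.subList(0, Math.min(k, copy.size())));
  }

  /** Squared Euclidean distance; a stand-in for a configurable distance. */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }
}
```

In this sketch the kNN variant favours a compact neighbourhood around the centre of the data, while the random variant preserves the overall spread of the data set.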
-
-
Method Detail
-
getInputTypeRestriction
public TypeInformation[] getInputTypeRestriction()
Description copied from interface: Algorithm
Get the input type restriction used for negotiating the data query.
- Specified by:
getInputTypeRestriction in interface Algorithm
- Returns:
- Type restriction
-
run
public CorrelationAnalysisSolution run(Relation<V> relation)
Computes quantitative linear dependencies among the attributes of the given database, based on a linear correlation PCA.
- Parameters:
relation - the relation to process
- Returns:
- the CorrelationAnalysisSolution computed by this DependencyDerivator
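The idea of reading linear dependencies off a PCA can be illustrated without ELKI. The following is a minimal, self-contained sketch for the two-dimensional case only; it is not ELKI's implementation. The class DependencySketch and its method deriveEquation are hypothetical names introduced here. The sketch assumes the population covariance and treats the eigenvector with the smallest eigenvalue as the single weak direction: it is normal to the fitted line, so it yields one equation n0*x + n1*y = c.

```java
import java.util.Locale;

/** Hypothetical illustration only; not part of ELKI. */
final class DependencySketch {
  /**
   * Fits a line to 2D points via PCA.
   *
   * @return {n0, n1, c} such that n0*x + n1*y = c, with (n0, n1) of unit length
   */
  static double[] deriveEquation(double[][] pts) {
    final int n = pts.length;
    if (n < 2) {
      throw new IllegalArgumentException("need at least two points");
    }
    // Centroid of the sample.
    double mx = 0, my = 0;
    for (double[] p : pts) {
      mx += p[0];
      my += p[1];
    }
    mx /= n;
    my /= n;
    // Covariance matrix [[a, b], [b, d]].
    double a = 0, b = 0, d = 0;
    for (double[] p : pts) {
      double dx = p[0] - mx, dy = p[1] - my;
      a += dx * dx;
      b += dx * dy;
      d += dy * dy;
    }
    a /= n;
    b /= n;
    d /= n;
    // Smallest eigenvalue of a symmetric 2x2 matrix in closed form.
    double lmin = 0.5 * (a + d) - Math.sqrt(0.25 * (a - d) * (a - d) + b * b);
    // Two algebraically equivalent eigenvector candidates; use the longer one
    // to avoid normalizing a (near-)zero vector.
    double[] v1 = { b, lmin - a };
    double[] v2 = { lmin - d, b };
    double[] v = Math.hypot(v1[0], v1[1]) >= Math.hypot(v2[0], v2[1]) ? v1 : v2;
    double len = Math.hypot(v[0], v[1]);
    if (len == 0.0) {
      throw new IllegalArgumentException("no unique weak direction (isotropic data)");
    }
    double n0 = v[0] / len, n1 = v[1] / len;
    // The line passes through the centroid: n . (x - centroid) = 0.
    return new double[] { n0, n1, n0 * mx + n1 * my };
  }

  public static void main(String[] args) {
    // Four points on the line y = 2x + 1.
    double[][] pts = { { 0, 1 }, { 1, 3 }, { 2, 5 }, { 3, 7 } };
    double[] eq = deriveEquation(pts);
    System.out.println(String.format(Locale.ROOT, "%.4f * x + %.4f * y = %.4f", eq[0], eq[1], eq[2]));
  }
}
```

For the four points in main, the printed coefficients are proportional to -2x + y = 1, which is the line y = 2x + 1 in normal form. In higher dimensions each weak eigenvector would contribute one such equation, giving an equation system rather than a single line.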
-
generateModel
public CorrelationAnalysisSolution generateModel(Relation<V> db, DBIDs ids)
Runs the PCA on the given set of IDs. The centroid is computed from the given ids.
- Parameters:
db - the database
ids - the set of ids
- Returns:
- a matrix of equations describing the dependencies
-
generateModel
public CorrelationAnalysisSolution generateModel(Relation<V> relation, DBIDs ids, double[] centroid)
Runs the PCA on the given set of IDs and for the given centroid.
- Parameters:
relation - the database
ids - the set of ids
centroid - the centroid
- Returns:
- a matrix of equations describing the dependencies
-
-