Class InverseDocumentFrequencyNormalization<V extends SparseNumberVector>
- java.lang.Object
  - elki.datasource.filter.AbstractConversionFilter<I,O>
    - elki.datasource.filter.AbstractVectorConversionFilter<V,V>
      - elki.datasource.filter.normalization.columnwise.InverseDocumentFrequencyNormalization<V>
- Type Parameters:
V - Vector type
- All Implemented Interfaces:
Normalization<V>, ObjectFilter
public class InverseDocumentFrequencyNormalization<V extends SparseNumberVector> extends AbstractVectorConversionFilter<V,V> implements Normalization<V>
Normalization for term frequency (TF) vectors, using the inverse document frequency (IDF). See also: TF-IDF for text analysis.
- Since:
0.4.0
- Author:
Erich Schubert
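For orientation, one common textbook formulation of TF-IDF is given below. This page does not state which variant the class uses, so the formula is an assumption rather than a description of the implementation. Under that assumption, N would correspond to the objcnt field (the number of objects in the dataset), df(t) to the number of vectors in which dimension t is present, and idf(t) to the per-dimension values held in the idf map.

```latex
\[
\mathrm{idf}(t) = \log\frac{N}{\mathrm{df}(t)},
\qquad
\operatorname{tf\text{-}idf}(t, d) = \mathrm{tf}(t, d)\cdot\mathrm{idf}(t)
\]
```

With this definition, a term that occurs in every document has idf(t) = log(1) = 0, so its weighted value is 0 in every vector.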
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- (package private) it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap idf: The IDF storage.
- private static Logging LOG: Class logger.
- (package private) int objcnt: The number of objects in the dataset.
-
Fields inherited from class elki.datasource.filter.AbstractVectorConversionFilter
factory
-
-
Constructor Summary
Constructors (Constructor: Description)
- InverseDocumentFrequencyNormalization(): Constructor.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- protected SimpleTypeInformation<? super V> convertedType(SimpleTypeInformation<V> in): Get the output type from the input type after conversion.
- protected V filterSingleObject(V featureVector): Normalize a single instance.
- protected SimpleTypeInformation<? super V> getInputTypeRestriction(): Get the input type restriction used for negotiating the data query.
- protected Logging getLogger(): Class logger.
- protected void prepareComplete(): Complete the initialization phase.
- protected void prepareProcessInstance(V featureVector): Process a single object during initialization.
- protected boolean prepareStart(SimpleTypeInformation<V> in): Return "true" when the normalization needs initialization (two-pass filtering!).
- V restore(V featureVector): Transforms a feature vector to the original attribute ranges.
-
Methods inherited from class elki.datasource.filter.AbstractVectorConversionFilter
initializeOutputType
-
Methods inherited from class elki.datasource.filter.AbstractConversionFilter
filter, toString
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface elki.datasource.filter.normalization.Normalization
transform
-
Methods inherited from interface elki.datasource.filter.ObjectFilter
filter
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
idf
it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap idf
The IDF storage.
-
objcnt
int objcnt
The number of objects in the dataset.
-
-
Method Detail
-
prepareStart
protected boolean prepareStart(SimpleTypeInformation<V> in)
Description copied from class: AbstractConversionFilter
Return "true" when the normalization needs initialization (two-pass filtering!).
- Overrides:
prepareStart in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
in - Input type information
- Returns:
true or false
-
prepareProcessInstance
protected void prepareProcessInstance(V featureVector)
Description copied from class: AbstractConversionFilter
Process a single object during initialization.
- Overrides:
prepareProcessInstance in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
featureVector - Object to process
-
prepareComplete
protected void prepareComplete()
Description copied from class: AbstractConversionFilter
Complete the initialization phase.
- Overrides:
prepareComplete in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
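The three preparation hooks (prepareStart, prepareProcessInstance, prepareComplete) describe a first pass over the data before any vector is normalized. The following is a minimal sketch of what such a pass could look like for IDF, not the ELKI source: the class name IdfPrepareSketch is hypothetical, a plain java.util.Map stands in for both the sparse vector and the Int2DoubleOpenHashMap, and the weight log(N / df) is an assumed (common) IDF definition.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the two-pass preparation; not the ELKI implementation. */
class IdfPrepareSketch {
  /** Per-dimension storage: document counts during the pass, IDF weights afterwards. */
  final Map<Integer, Double> idf = new HashMap<>();

  /** Number of vectors seen so far. */
  int objcnt = 0;

  /** Called once per object: count every dimension stored in the sparse vector. */
  void prepareProcessInstance(Map<Integer, Double> featureVector) {
    for (int dim : featureVector.keySet()) {
      idf.merge(dim, 1.0, Double::sum);
    }
    objcnt++;
  }

  /** Called after the last object: replace each count df by log(N / df) (assumed formula). */
  void prepareComplete() {
    final double n = objcnt;
    idf.replaceAll((dim, df) -> Math.log(n / df));
  }
}
```

In this sketch, a dimension present in both of two vectors ends up with weight log(2/2) = 0, while a dimension present in only one of them gets log(2).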
-
filterSingleObject
protected V filterSingleObject(V featureVector)
Description copied from class: AbstractConversionFilter
Normalize a single instance. You can implement this by throwing an UnsupportedOperationException if you override both public "normalize" functions.
- Specified by:
filterSingleObject in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
featureVector - Database object to normalize
- Returns:
Normalized database object
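Once the weights exist, normalizing a single vector amounts to scaling each stored value by the weight of its dimension. The sketch below illustrates that step under stated assumptions: IdfFilterSketch is a hypothetical name, java.util.Map replaces the sparse vector type, the weights are passed in explicitly, and dimensions without a stored weight are treated as weight 0 (a choice of this sketch, not documented behavior).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the per-vector weighting step; not the ELKI implementation. */
class IdfFilterSketch {
  /** Returns a new sparse vector with every stored value multiplied by its IDF weight. */
  static Map<Integer, Double> filterSingleObject(Map<Integer, Double> featureVector,
                                                 Map<Integer, Double> idf) {
    Map<Integer, Double> out = new HashMap<>();
    for (Map.Entry<Integer, Double> e : featureVector.entrySet()) {
      // Assumption: a dimension with no stored weight is scaled by 0.
      double weight = idf.getOrDefault(e.getKey(), 0.0);
      out.put(e.getKey(), e.getValue() * weight);
    }
    return out;
  }
}
```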
-
restore
public V restore(V featureVector)
Description copied from interface: Normalization
Transforms a feature vector to the original attribute ranges.
- Specified by:
restore in interface Normalization<V extends SparseNumberVector>
- Parameters:
featureVector - a feature vector to be transformed into original space
- Returns:
a feature vector transformed into original space corresponding to the given feature vector
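If normalization multiplies each value by a per-dimension weight, restoring the original ranges means dividing by the same weight. The following sketch shows that inverse under the same assumptions as a plain multiplicative weighting; IdfRestoreSketch is a hypothetical name and java.util.Map stands in for the sparse vector. A weight of 0 makes the original value unrecoverable, and throwing in that case is a choice of this sketch, not documented behavior of restore.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of undoing a multiplicative IDF weighting; not the ELKI implementation. */
class IdfRestoreSketch {
  /** Divides every stored value by the IDF weight of its dimension. */
  static Map<Integer, Double> restore(Map<Integer, Double> normalized,
                                      Map<Integer, Double> idf) {
    Map<Integer, Double> out = new HashMap<>();
    for (Map.Entry<Integer, Double> e : normalized.entrySet()) {
      double weight = idf.getOrDefault(e.getKey(), 0.0);
      if (weight == 0.0) {
        // The value was multiplied by 0 during normalization, so the original is lost.
        throw new IllegalArgumentException(
            "cannot restore dimension " + e.getKey() + ": IDF weight is 0");
      }
      out.put(e.getKey(), e.getValue() / weight);
    }
    return out;
  }
}
```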
-
convertedType
protected SimpleTypeInformation<? super V> convertedType(SimpleTypeInformation<V> in)
Description copied from class: AbstractConversionFilter
Get the output type from the input type after conversion.
- Specified by:
convertedType in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
in - input type restriction
- Returns:
output type restriction
-
getInputTypeRestriction
protected SimpleTypeInformation<? super V> getInputTypeRestriction()
Description copied from class: AbstractConversionFilter
Get the input type restriction used for negotiating the data query.
- Specified by:
getInputTypeRestriction in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Returns:
Type restriction
-
getLogger
protected Logging getLogger()
Description copied from class: AbstractConversionFilter
Class logger.
- Specified by:
getLogger in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Returns:
Logger
-
-