Class InverseDocumentFrequencyNormalization<V extends SparseNumberVector>
- java.lang.Object
-
- elki.datasource.filter.AbstractConversionFilter<I,O>
-
- elki.datasource.filter.AbstractVectorConversionFilter<V,V>
-
- elki.datasource.filter.normalization.columnwise.InverseDocumentFrequencyNormalization<V>
-
- Type Parameters:
V - Vector type
- All Implemented Interfaces:
Normalization<V>, ObjectFilter
public class InverseDocumentFrequencyNormalization<V extends SparseNumberVector> extends AbstractVectorConversionFilter<V,V> implements Normalization<V>
Normalization for term frequency (TF) vectors, using the inverse document frequency (IDF). See also: TF-IDF for text analysis.
- Since:
- 0.4.0
- Author:
- Erich Schubert
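The two-pass idea behind this filter can be illustrated with a small stand-alone sketch. It is not the ELKI implementation: the class name IdfSketch, the use of java.util.Map in place of SparseNumberVector, and the weighting log(N / df) are assumptions chosen for illustration. Only the field names idf and objcnt are borrowed from the field summary of this class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch of two-pass IDF weighting; not ELKI code. */
public class IdfSketch {
  /** Per-dimension document counts; replaced by IDF weights in complete(). */
  private final Map<Integer, Double> idf = new HashMap<>();

  /** Number of objects seen during the first pass. */
  private int objcnt = 0;

  /** First pass: count in how many objects each dimension is nonzero. */
  public void process(Map<Integer, Double> tf) {
    objcnt++;
    for (Map.Entry<Integer, Double> e : tf.entrySet()) {
      if (e.getValue() != 0.0) {
        idf.merge(e.getKey(), 1.0, Double::sum);
      }
    }
  }

  /** End of first pass: turn counts into weights log(N / df) (assumed formula). */
  public void complete() {
    final double n = objcnt;
    idf.replaceAll((dim, df) -> Math.log(n / df));
  }

  /** Second pass: scale each term frequency by the weight of its dimension. */
  public Map<Integer, Double> normalize(Map<Integer, Double> tf) {
    Map<Integer, Double> out = new HashMap<>();
    for (Map.Entry<Integer, Double> e : tf.entrySet()) {
      // Dimensions never seen in the first pass get weight 0 in this sketch.
      out.put(e.getKey(), e.getValue() * idf.getOrDefault(e.getKey(), 0.0));
    }
    return out;
  }
}
```

In this sketch, process, complete and normalize play roles comparable to prepareProcessInstance, prepareComplete and filterSingleObject. A dimension that is nonzero in every object receives weight log(1) = 0 under the assumed formula.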
-
-
Field Summary
Fields
Modifier and Type                                                   Field   Description
(package private) it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap  idf     The IDF storage.
private static Logging                                              LOG     Class logger.
(package private) int                                               objcnt  The number of objects in the dataset.
-
Fields inherited from class elki.datasource.filter.AbstractVectorConversionFilter
factory
-
-
Constructor Summary
Constructors
Constructor                              Description
InverseDocumentFrequencyNormalization()  Constructor.
-
Method Summary
All Methods  Instance Methods  Concrete Methods
Modifier and Type                           Method                                      Description
protected SimpleTypeInformation<? super V>  convertedType(SimpleTypeInformation<V> in)  Get the output type from the input type after conversion.
protected V                                 filterSingleObject(V featureVector)         Normalize a single instance.
protected SimpleTypeInformation<? super V>  getInputTypeRestriction()                   Get the input type restriction used for negotiating the data query.
protected Logging                           getLogger()                                 Class logger.
protected void                              prepareComplete()                           Complete the initialization phase.
protected void                              prepareProcessInstance(V featureVector)     Process a single object during initialization.
protected boolean                           prepareStart(SimpleTypeInformation<V> in)   Return "true" when the normalization needs initialization (two-pass filtering!).
V                                           restore(V featureVector)                    Transforms a feature vector to the original attribute ranges.
-
Methods inherited from class elki.datasource.filter.AbstractVectorConversionFilter
initializeOutputType
-
Methods inherited from class elki.datasource.filter.AbstractConversionFilter
filter, toString
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface elki.datasource.filter.normalization.Normalization
transform
-
Methods inherited from interface elki.datasource.filter.ObjectFilter
filter
-
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
idf
it.unimi.dsi.fastutil.ints.Int2DoubleOpenHashMap idf
The IDF storage.
-
objcnt
int objcnt
The number of objects in the dataset.
-
-
Method Detail
-
prepareStart
protected boolean prepareStart(SimpleTypeInformation<V> in)
Description copied from class: AbstractConversionFilter
Return "true" when the normalization needs initialization (two-pass filtering!).
- Overrides:
prepareStart in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
in - Input type information
- Returns:
- true or false
-
prepareProcessInstance
protected void prepareProcessInstance(V featureVector)
Description copied from class: AbstractConversionFilter
Process a single object during initialization.
- Overrides:
prepareProcessInstance in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
featureVector - Object to process
-
prepareComplete
protected void prepareComplete()
Description copied from class: AbstractConversionFilter
Complete the initialization phase.
- Overrides:
prepareComplete in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
-
filterSingleObject
protected V filterSingleObject(V featureVector)
Description copied from class: AbstractConversionFilter
Normalize a single instance. You can implement this by throwing an UnsupportedOperationException if you override both public "normalize" functions.
- Specified by:
filterSingleObject in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
featureVector - Database object to normalize
- Returns:
- Normalized database object
-
restore
public V restore(V featureVector)
Description copied from interface: Normalization
Transforms a feature vector to the original attribute ranges.
- Specified by:
restore in interface Normalization<V extends SparseNumberVector>
- Parameters:
featureVector - a feature vector to be transformed into original space
- Returns:
- a feature vector transformed into original space corresponding to the given feature vector
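As a rough illustration of what restoring the original attribute ranges can mean for a multiplicative weighting, the sketch below divides each value by a stored per-dimension weight. It is an assumption-laden example, not the ELKI code: the class IdfRestoreSketch, the Map-based vectors, and the decision to skip dimensions with a missing or zero weight are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch of undoing a per-dimension weighting; not ELKI code. */
public class IdfRestoreSketch {
  /**
   * Divide each weighted value by the weight of its dimension.
   * Dimensions whose weight is missing or zero cannot be recovered,
   * so this sketch leaves them out of the result (an illustrative choice).
   */
  public static Map<Integer, Double> restore(Map<Integer, Double> weighted, Map<Integer, Double> idf) {
    Map<Integer, Double> out = new HashMap<>();
    for (Map.Entry<Integer, Double> e : weighted.entrySet()) {
      Double w = idf.get(e.getKey());
      if (w == null || w == 0.0) {
        continue; // original value was multiplied by 0 and is lost
      }
      out.put(e.getKey(), e.getValue() / w);
    }
    return out;
  }
}
```

The zero-weight case matters whenever the weight is a logarithm of a ratio that can equal 1: the forward transform then maps every value in that dimension to 0, and no inverse exists for it.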
-
convertedType
protected SimpleTypeInformation<? super V> convertedType(SimpleTypeInformation<V> in)
Description copied from class: AbstractConversionFilter
Get the output type from the input type after conversion.
- Specified by:
convertedType in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Parameters:
in - input type restriction
- Returns:
- output type restriction
-
getInputTypeRestriction
protected SimpleTypeInformation<? super V> getInputTypeRestriction()
Description copied from class: AbstractConversionFilter
Get the input type restriction used for negotiating the data query.
- Specified by:
getInputTypeRestriction in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Returns:
- Type restriction
-
getLogger
protected Logging getLogger()
Description copied from class: AbstractConversionFilter
Class logger.
- Specified by:
getLogger in class AbstractConversionFilter<V extends SparseNumberVector,V extends SparseNumberVector>
- Returns:
- Logger
-
-