
Type Parameters: V - the type of NumberVector used

@Description(value="This parser expects data in roughly the same format as the NumberVectorLabelParser,\nexcept that it will enumerate all unique strings to always produce numerical values.\nThis way, it can for example handle files that contain lines like \'y,n,y,y,n,y,n\'.")
public class CategorialDataAsNumberVectorParser<V extends NumberVector> extends NumberVectorLabelParser<V>

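The enumeration of unique strings described above can be illustrated with a small, self-contained sketch. It is not the ELKI implementation: the class name `CategoricalEncodingSketch`, the `encode` method, the comma separator, and the choice to number categories from 0 in order of first appearance are all assumptions made for illustration. This page does not state how the real parser assigns its numbers.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in (hypothetical class), not the ELKI parser. */
class CategoricalEncodingSketch {
  // Maps each distinct non-numeric token to the number it received on first sight.
  private final Map<String, Integer> unique = new LinkedHashMap<>();

  /** Converts one comma-separated line into numeric values. */
  double[] encode(String line) {
    String[] tokens = line.split(",");
    double[] values = new double[tokens.length];
    for (int i = 0; i < tokens.length; i++) {
      String token = tokens[i].trim();
      try {
        // Tokens that already are numbers keep their value.
        values[i] = Double.parseDouble(token);
      } catch (NumberFormatException e) {
        // Assumption: categories are numbered 0, 1, 2, ... in order of first appearance.
        Integer id = unique.get(token);
        if (id == null) {
          id = unique.size();
          unique.put(token, id);
        }
        values[i] = id;
      }
    }
    return values;
  }

  public static void main(String[] args) {
    CategoricalEncodingSketch enc = new CategoricalEncodingSketch();
    // prints [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
    System.out.println(Arrays.toString(enc.encode("y,n,y,y,n,y,n")));
    // The mapping persists across lines: prints [1.5, 1.0, 2.0]
    System.out.println(Arrays.toString(enc.encode("1.5,n,maybe")));
  }
}
```

Because the map is kept between calls, the same string receives the same number on every line, which is what makes the resulting vectors comparable.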
| Modifier and Type | Class and Description |
|---|---|
| static class | CategorialDataAsNumberVectorParser.Parameterizer<V extends NumberVector>: Parameterization class. |

Nested classes/interfaces inherited from interface BundleStreamSource: BundleStreamSource.Event

| Modifier and Type | Field and Description |
|---|---|
| private static Logging | LOG: Logging class. |
| (package private) Matcher | nanpattern: Pattern for NaN values. |
| (package private) TObjectIntHashMap<String> | unique: For String unification. |
| (package private) int | ustart: Base for enumerating unique values. |

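The `nanpattern` field is typed as a `Matcher` rather than a `Pattern`, which suggests a single matcher that is reused for every token. A minimal sketch of that idiom follows. The class `NaNTokenSketch`, its `parse` method, and above all the regular expression are hypothetical: this page does not list which tokens the real parser treats as NaN.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in (hypothetical class), not the ELKI parser. */
class NaNTokenSketch {
  // Assumed pattern for illustration only; the real NaN tokens are not documented here.
  private static final Pattern NAN =
      Pattern.compile("\\?|nan|na|null", Pattern.CASE_INSENSITIVE);

  // One Matcher instance, re-targeted with reset(...) for each token
  // instead of allocating a new Matcher per token.
  private final Matcher nanpattern = NAN.matcher("");

  /** Returns NaN for tokens matching the pattern, otherwise the parsed number. */
  double parse(String token) {
    if (nanpattern.reset(token).matches()) {
      return Double.NaN;
    }
    return Double.parseDouble(token);
  }
}
```

A `Matcher` is not thread-safe, so this reuse only works when each parser instance is confined to one thread.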
Inherited fields: attributes, columnnames, curlbl, curvec, factory, haslabels, labels, maxdim, meta, mindim, nexteventreader, tokenizer

| Constructor and Description |
|---|
| CategorialDataAsNumberVectorParser(CSVReaderFormat format, long[] labelIndices, NumberVector.Factory<V> factory): Constructor. |
| CategorialDataAsNumberVectorParser(NumberVector.Factory<V> factory): Constructor with defaults. |

| Modifier and Type | Method and Description |
|---|---|
| protected Logging | getLogger(): Get the logger for this class. |
| BundleStreamSource.Event | nextEvent(): Get the next event. |
| protected boolean | parseLineInternal(): Internal method for parsing a single line. |

Inherited methods: buildMeta, cleanup, createVector, data, getMeta, getTypeInformation, initStream, isLabelColumn, asMultipleObjectsBundle, assignDBID, hasDBIDs, parse

private static final Logging LOG

TObjectIntHashMap<String> unique

int ustart

Matcher nanpattern

public CategorialDataAsNumberVectorParser(NumberVector.Factory<V> factory)
Parameters: factory - Vector factory

public CategorialDataAsNumberVectorParser(CSVReaderFormat format, long[] labelIndices, NumberVector.Factory<V> factory)
Parameters: format - Input format; labelIndices - Column indexes that are numeric.; factory - Vector factory

public BundleStreamSource.Event nextEvent()
Description copied from interface: BundleStreamSource
Specified by: nextEvent in interface BundleStreamSource
Overrides: nextEvent in class NumberVectorLabelParser<V extends NumberVector>

protected boolean parseLineInternal()
Description copied from class: NumberVectorLabelParser
Overrides: parseLineInternal in class NumberVectorLabelParser<V extends NumberVector>
Returns: true when a valid line was read, false on a label row.

protected Logging getLogger()
Description copied from class: AbstractStreamingParser
Overrides: getLogger in class NumberVectorLabelParser<V extends NumberVector>

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.