Package elki.datasource.parser
Class SimpleTransactionParser
java.lang.Object
  elki.datasource.parser.AbstractStreamingParser
    elki.datasource.parser.SimpleTransactionParser

All Implemented Interfaces:
BundleStreamSource, Parser, StreamingParser
public class SimpleTransactionParser extends AbstractStreamingParser
Simple parser for transactional data, such as market baskets. To keep the input format simple and readable, all tokens are assumed to be text and separated by whitespace, and each transaction is on a separate line.
An example file containing two transactions looks like this:
bread butter milk
paste tomato basil
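To illustrate the input format described above, the following minimal sketch shows one way whitespace-separated transaction lines could be turned into bit sets over a growing term dictionary. It uses only the Java standard library (`java.util.BitSet` and a `LinkedHashMap`) instead of ELKI's `BitVector` and the fastutil map, and the class `TransactionSketch` with its methods is hypothetical; it is not the parser's actual implementation. The two input lines are an assumed split of the example tokens into two transactions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of ELKI. */
public class TransactionSketch {
  /** Maps each distinct token to a dense index, in order of first appearance. */
  private final Map<String, Integer> keymap = new LinkedHashMap<>();

  /** Encode one transaction line as a bit set with one bit per term. */
  public BitSet encode(String line) {
    BitSet bits = new BitSet();
    String trimmed = line.trim();
    if (trimmed.isEmpty()) {
      return bits; // blank line: empty transaction
    }
    for (String token : trimmed.split("\\s+")) {
      Integer idx = keymap.get(token);
      if (idx == null) {
        idx = keymap.size(); // next free index for a term not seen before
        keymap.put(token, idx);
      }
      bits.set(idx);
    }
    return bits;
  }

  /** Number of different terms observed so far. */
  public int numTerms() {
    return keymap.size();
  }

  public static void main(String[] args) {
    TransactionSketch sketch = new TransactionSketch();
    List<BitSet> rows = new ArrayList<>();
    // Assumed split of the example tokens into two transactions.
    for (String line : new String[] { "bread butter milk", "paste tomato basil" }) {
      rows.add(sketch.encode(line));
    }
    System.out.println(sketch.numTerms() + " terms: " + rows);
  }
}
```

Because the dictionary grows while the file is read, the dimensionality of the encoded transactions is only known after all lines have been seen; a streaming consumer has to be prepared for the number of terms to increase between objects.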
TODO: add a parameter to, e.g., use the first or last entry as labels instead of tokens.
Since: 0.7.0
Author: Erich Schubert
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | SimpleTransactionParser.Par | Parameterization class.
Nested classes/interfaces inherited from interface elki.datasource.bundle.BundleStreamSource
BundleStreamSource.Event
-
-
Field Summary
Fields
Modifier and Type | Field | Description
(package private) it.unimi.dsi.fastutil.longs.LongArrayList | buf | Buffer, will be reused.
(package private) BitVector | curvec | Current vector.
(package private) it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap<java.lang.String> | keymap | Map.
private static Logging | LOG | Class logger.
protected BundleMeta | meta | Metadata.
(package private) BundleStreamSource.Event | nextevent | Event to report next.
(package private) int | numterms | Number of different terms observed.
Fields inherited from class elki.datasource.parser.AbstractStreamingParser
reader, tokenizer
-
-
Constructor Summary
Constructors
Constructor | Description
SimpleTransactionParser(CSVReaderFormat format) | Constructor.
-
Method Summary
All Methods
Modifier and Type | Method | Description
void | cleanup() | Perform cleanup operations after parsing.
java.lang.Object | data(int rnum) | Access a particular object and representation.
protected Logging | getLogger() | Get the logger for this class.
BundleMeta | getMeta() | Get the current meta data.
void | initStream(java.io.InputStream in) | Init the streaming parser for the given input stream.
BundleStreamSource.Event | nextEvent() | Get the next event.
Methods inherited from class elki.datasource.parser.AbstractStreamingParser
asMultipleObjectsBundle, assignDBID, hasDBIDs, parse
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
numterms
int numterms
Number of different terms observed.
-
keymap
it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap<java.lang.String> keymap
Map.
-
meta
protected BundleMeta meta
Metadata.
-
nextevent
BundleStreamSource.Event nextevent
Event to report next.
-
curvec
BitVector curvec
Current vector.
-
buf
it.unimi.dsi.fastutil.longs.LongArrayList buf
Buffer, will be reused.
-
-
Constructor Detail
-
SimpleTransactionParser
public SimpleTransactionParser(CSVReaderFormat format)
Constructor.
Parameters:
format - Input format
-
-
Method Detail
-
initStream
public void initStream(java.io.InputStream in)
Description copied from interface: StreamingParser
Init the streaming parser for the given input stream.
Specified by:
initStream in interface StreamingParser
Overrides:
initStream in class AbstractStreamingParser
Parameters:
in - the stream to parse objects from
-
nextEvent
public BundleStreamSource.Event nextEvent()
Description copied from interface: BundleStreamSource
Get the next event.
Returns:
Event type
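The pull-style protocol behind nextEvent() and data(int) can be sketched with a small stand-in. The following self-contained example does not use ELKI: the enum `Event` only mirrors the constants of BundleStreamSource.Event, and `ToySource` and `drain` are hypothetical names introduced for illustration. The toy source reports one metadata change before the first object, which is a simplifying assumption about when such events occur.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of an event-driven stream consumer; not part of ELKI. */
public class EventLoopSketch {
  /** Stand-in mirroring the constants of BundleStreamSource.Event. */
  public enum Event { END_OF_STREAM, META_CHANGED, NEXT_OBJECT }

  /** Toy source: one META_CHANGED event, then one NEXT_OBJECT per line. */
  public static class ToySource {
    private final Iterator<String> lines;
    private boolean metaReported = false;
    private String current;

    public ToySource(List<String> lines) {
      this.lines = lines.iterator();
    }

    /** Advance the stream and report what happened. */
    public Event nextEvent() {
      if (!metaReported) {
        metaReported = true;
        return Event.META_CHANGED;
      }
      if (lines.hasNext()) {
        current = lines.next();
        return Event.NEXT_OBJECT;
      }
      return Event.END_OF_STREAM;
    }

    /** Access the current object; this toy source has a single representation, number 0. */
    public Object data(int rnum) {
      if (rnum != 0) {
        throw new IndexOutOfBoundsException("only representation 0 exists");
      }
      return current;
    }
  }

  /** Pull events until the stream ends, collecting every reported object. */
  public static List<Object> drain(ToySource src) {
    List<Object> out = new ArrayList<>();
    for (Event ev = src.nextEvent(); ev != Event.END_OF_STREAM; ev = src.nextEvent()) {
      if (ev == Event.META_CHANGED) {
        continue; // a real consumer would re-read the metadata here
      }
      out.add(src.data(0)); // NEXT_OBJECT: the current object is now available
    }
    return out;
  }

  public static void main(String[] args) {
    ToySource src = new ToySource(Arrays.asList("bread butter milk", "paste tomato basil"));
    System.out.println(drain(src));
  }
}
```

With the actual parser, the analogous sequence would be to call initStream on an input stream, loop over nextEvent() in this manner while consulting getMeta() when the metadata changes, and call cleanup() once the stream has ended.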
-
cleanup
public void cleanup()
Description copied from interface: Parser
Perform cleanup operations after parsing.
Specified by:
cleanup in interface Parser
Overrides:
cleanup in class AbstractStreamingParser
-
data
public java.lang.Object data(int rnum)
Description copied from interface: BundleStreamSource
Access a particular object and representation.
Parameters:
rnum - Representation number
Returns:
Contained data
-
getMeta
public BundleMeta getMeta()
Description copied from interface: BundleStreamSource
Get the current meta data.
Returns:
Metadata
-
getLogger
protected Logging getLogger()
Description copied from class: AbstractStreamingParser
Get the logger for this class.
Specified by:
getLogger in class AbstractStreamingParser
Returns:
Logger.