Package elki.utilities.io
Class Tokenizer
- java.lang.Object
  - elki.utilities.io.Tokenizer
Field Summary
Fields (Modifier and Type, Field: Description)
- private int end: Current positions of result and iterator.
- private int index: Current positions of result and iterator.
- private java.lang.CharSequence input: Data currently processed.
- private static Logging LOG: Class logger.
- private java.util.regex.Matcher matcher: Regular expression match helper.
- static java.lang.String QUOTE_CHAR: Quote characters.
- private char[] quoteChars: Stores the quotation character.
- private boolean quoted: Whether the current token is a quoted string.
- private int send: Substring to process.
- private int start: Current positions of result and iterator.
-
Constructor Summary
Constructors (Constructor: Description)
- Tokenizer(java.util.regex.Pattern colSep, java.lang.String quoteChars): Constructor.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- Tokenizer advance(): Moves the iterator forward to the next entry.
- void cleanup(): Perform cleanup.
- char getChar(int off): Get a single character.
- double getDouble(): Get current value as double.
- int getEnd(): Get end of token.
- int getIntBase10(): Get current value as int.
- int getLength(): Get length of token.
- long getLongBase10(): Get current value as long.
- int getStart(): Get start of token.
- java.lang.String getStrippedSubstring(): Get the current part as substring.
- java.lang.String getSubstring(): Get the current part as substring.
- void initialize(java.lang.CharSequence input, int begin, int end): Initialize parser with a new string.
- boolean isEmpty(): Test for empty tokens; usually at end of line.
- private char isQuote(int index): Detect quote characters.
- boolean isQuoted(): Test if the current string was quoted.
- boolean valid(): Returns true if the iterator currently points to a valid object.
-
-
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
QUOTE_CHAR
public static final java.lang.String QUOTE_CHAR
Quote characters.
- See Also: Constant Field Values
-
quoteChars
private char[] quoteChars
Stores the quotation character.
-
matcher
private java.util.regex.Matcher matcher
Regular expression match helper.
-
input
private java.lang.CharSequence input
Data currently processed.
-
send
private int send
Substring to process.
-
start
private int start
Current positions of result and iterator.
-
end
private int end
Current positions of result and iterator.
-
index
private int index
Current positions of result and iterator.
-
quoted
private boolean quoted
Whether the current token is a quoted string.
-
-
Method Detail
-
initialize
public void initialize(java.lang.CharSequence input, int begin, int end)
Initialize parser with a new string.
- Parameters:
  - input - New string to parse.
  - begin - Begin of the substring to process.
  - end - End of the substring to process.
-
valid
public boolean valid()
Description copied from interface: Iter
Returns true if the iterator currently points to a valid object.
-
advance
public Tokenizer advance()
Description copied from interface: Iter
Moves the iterator forward to the next entry.
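
The initialize, valid and advance methods together suggest an iterator-style loop over the tokens of one line. The following is a minimal sketch of that pattern, assuming a plain separator regex and no quote handling. TokenIterSketch and its tokens helper are hypothetical names written for this illustration; the field roles mirror the summary above, but the body is not taken from the ELKI sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration only; this is not the ELKI class. */
class TokenIterSketch {
  private final Matcher matcher; // finds column separators
  private CharSequence input;    // data currently processed
  private int send;              // end of the region to process
  private int start, end;        // bounds of the current token
  private int index;             // where the next token begins

  TokenIterSketch(Pattern colSep) {
    this.matcher = colSep.matcher("");
  }

  void initialize(CharSequence input, int begin, int end) {
    this.input = input;
    this.send = end;
    this.matcher.reset(input).region(begin, end);
    this.index = begin;
    advance(); // position on the first token
  }

  boolean valid() {
    return start < send;
  }

  TokenIterSketch advance() {
    if (index <= send && matcher.find()) {
      // Token runs from the previous position up to the separator.
      start = index;
      end = matcher.start();
      index = matcher.end();
    } else {
      // Tail after the last separator; start moves past send once exhausted.
      start = index;
      end = send;
      index = send + 1;
    }
    return this;
  }

  String getSubstring() {
    return input.subSequence(start, end).toString();
  }

  /** Collects all tokens of a line using the iterator-style loop. */
  static List<String> tokens(String line, Pattern colSep) {
    TokenIterSketch tok = new TokenIterSketch(colSep);
    List<String> out = new ArrayList<>();
    for (tok.initialize(line, 0, line.length()); tok.valid(); tok.advance()) {
      out.add(tok.getSubstring());
    }
    return out;
  }
}
```

In this sketch an empty token between two adjacent separators is still reported, while an empty tail after a trailing separator ends the iteration.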
-
getSubstring
public java.lang.String getSubstring()
Get the current part as substring.
- Returns: Current value as substring.
-
getStrippedSubstring
public java.lang.String getStrippedSubstring()
Get the current part as a stripped substring.
- Returns: Current value as stripped substring.
-
getDouble
public double getDouble()
Get current value as double.
- Returns: double value
- Throws: java.lang.NumberFormatException - when current value cannot be parsed as double.
-
getIntBase10
public int getIntBase10()
Get current value as int.
- Returns: int value
- Throws: java.lang.NumberFormatException - when current value cannot be parsed as int.
-
getLongBase10
public long getLongBase10()
Get current value as long.
- Returns: long value
- Throws: java.lang.NumberFormatException - when current value cannot be parsed as long.
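
getIntBase10 and getLongBase10 read the current token as a decimal number and throw NumberFormatException on failure. One plausible way to do this without first copying the token into a String is to walk the characters between the token's start and end directly. The sketch below illustrates that idea for int; Base10Sketch and parseIntBase10 are hypothetical names, and whether ELKI parses this way is not stated on this page.

```java
/** Hypothetical helper for illustration only; not taken from ELKI. */
final class Base10Sketch {
  /** Parses the characters in [start, end) of s as a base-10 int. */
  static int parseIntBase10(CharSequence s, int start, int end) {
    if (start >= end) {
      throw new NumberFormatException("Empty token.");
    }
    int pos = start;
    boolean negative = false;
    char first = s.charAt(pos);
    if (first == '-' || first == '+') {
      negative = (first == '-');
      if (++pos == end) {
        throw new NumberFormatException("Sign without digits.");
      }
    }
    // Accumulate in a long so the int range check stays simple.
    long value = 0;
    for (; pos < end; pos++) {
      int digit = s.charAt(pos) - '0';
      if (digit < 0 || digit > 9) {
        throw new NumberFormatException("Not a base-10 digit at offset " + pos);
      }
      value = value * 10 + digit;
      if (value > Integer.MAX_VALUE + 1L) {
        throw new NumberFormatException("Out of int range.");
      }
    }
    if (negative) {
      value = -value;
    }
    if (value > Integer.MAX_VALUE) {
      throw new NumberFormatException("Out of int range.");
    }
    return (int) value;
  }
}
```

Taking explicit start and end offsets matches the getStart and getEnd accessors, so a caller could parse a token in place inside a longer line.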
-
isEmpty
public boolean isEmpty()
Test for empty tokens; usually at end of line.
- Returns: true when the current token is empty.
-
isQuote
private char isQuote(int index)
Detect quote characters.
TODO: support more than one quote character, make sure opening and closing quotes match then.
- Parameters:
  - index - Position
- Returns: 1 when a quote character, 0 otherwise.
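
The documented contract of isQuote (1 for a quote character, 0 otherwise) can be sketched as a lookup of the character at the given position in the quoteChars array. QuoteSketch is a hypothetical name, and passing the input and quote characters as parameters is an assumption made to keep the example self-contained; the documented method takes only the index and reads the instance fields.

```java
/** Hypothetical helper for illustration only; not taken from ELKI. */
final class QuoteSketch {
  /** Returns 1 if input has a quote character at index, 0 otherwise. */
  static char isQuote(CharSequence input, int index, char[] quoteChars) {
    if (index < 0 || index >= input.length()) {
      return 0; // nothing to inspect at this position
    }
    char c = input.charAt(index);
    for (char q : quoteChars) {
      if (c == q) {
        return 1;
      }
    }
    return 0;
  }
}
```

Returning the matched character instead of 1 would be one way to let a caller check that the closing quote equals the opening one, which is what the TODO above asks for.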
-
isQuoted
public boolean isQuoted()
Test if the current string was quoted.
- Returns: true when quoted.
-
getStart
public int getStart()
Get start of token.
- Returns: Start
-
getEnd
public int getEnd()
Get end of token.
- Returns: End
-
getLength
public int getLength()
Get length of token.
- Returns: Token length
-
getChar
public char getChar(int off)
Get a single character.
- Parameters:
  - off - Offset
- Returns: Character
-
cleanup
public void cleanup()
Perform cleanup.
-
-