Package elki.utilities.io
Class Tokenizer
- java.lang.Object
  - elki.utilities.io.Tokenizer
-
Field Summary
Fields (Modifier and Type, Field: Description)
- private int end: Current positions of result and iterator.
- private int index: Current positions of result and iterator.
- private java.lang.CharSequence input: Data currently processed.
- private static Logging LOG: Class logger.
- private java.util.regex.Matcher matcher: Regular expression match helper.
- static java.lang.String QUOTE_CHAR: Quote characters.
- private char[] quoteChars: Stores the quotation character.
- private boolean quoted: Whether the current token is a quoted string.
- private int send: Substring to process.
- private int start: Current positions of result and iterator.
-
Constructor Summary
Constructors (Constructor: Description)
- Tokenizer(java.util.regex.Pattern colSep, java.lang.String quoteChars): Constructor.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- Tokenizer advance(): Moves the iterator forward to the next entry.
- void cleanup(): Perform cleanup.
- char getChar(int off): Get a single character.
- double getDouble(): Get current value as double.
- int getEnd(): Get end of token.
- int getIntBase10(): Get current value as int.
- int getLength(): Get length of token.
- long getLongBase10(): Get current value as long.
- int getStart(): Get start of token.
- java.lang.String getStrippedSubstring(): Get the current part as substring.
- java.lang.String getSubstring(): Get the current part as substring.
- void initialize(java.lang.CharSequence input, int begin, int end): Initialize parser with a new string.
- boolean isEmpty(): Test for empty tokens; usually at end of line.
- private char isQuote(int index): Detect quote characters.
- boolean isQuoted(): Test if the current string was quoted.
- boolean valid(): Returns true if the iterator currently points to a valid object.
-
Field Detail
-
LOG
private static final Logging LOG
Class logger.
-
QUOTE_CHAR
public static final java.lang.String QUOTE_CHAR
Quote characters.
- See Also: Constant Field Values
-
quoteChars
private char[] quoteChars
Stores the quotation character
-
matcher
private java.util.regex.Matcher matcher
Regular expression match helper.
-
input
private java.lang.CharSequence input
Data currently processed.
-
send
private int send
Substring to process.
-
start
private int start
Current positions of result and iterator.
-
end
private int end
Current positions of result and iterator.
-
index
private int index
Current positions of result and iterator.
-
quoted
private boolean quoted
Whether the current token is a quoted string.
-
Method Detail
-
initialize
public void initialize(java.lang.CharSequence input, int begin, int end)
Initialize parser with a new string.
- Parameters:
input - New string to parse.
begin - Begin offset
end - End offset
-
valid
public boolean valid()
Description copied from interface: Iter
Returns true if the iterator currently points to a valid object.
-
advance
public Tokenizer advance()
Description copied from interface: Iter
Moves the iterator forward to the next entry.
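
The initialize / valid / advance contract described above can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: `SketchTokenizer` and its `tokenize` helper are hypothetical names, quote handling is omitted, and the bookkeeping of `start`, `end`, `index` and `send` is one plausible reading of the field descriptions, not the ELKI implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified stand-in for the iteration contract; not the ELKI class. */
class SketchTokenizer {
  private final Matcher matcher; // finds column separators
  private CharSequence input;    // data currently processed
  private int send;              // end of the region to process
  private int start, end, index; // token bounds and next scan position

  SketchTokenizer(Pattern colSep) {
    this.matcher = colSep.matcher("");
  }

  /** Point the tokenizer at input[begin, end) and move to the first token. */
  void initialize(CharSequence input, int begin, int end) {
    this.input = input;
    this.send = end;
    this.matcher.reset(input).region(begin, end);
    this.index = begin;
    advance();
  }

  /** True while the current token starts inside the region (assumed rule). */
  boolean valid() {
    return start < send;
  }

  /** Move to the next token; returns this so calls can be chained. */
  SketchTokenizer advance() {
    if (matcher.find()) {
      start = index;
      end = matcher.start();
      index = matcher.end();
    } else {
      // Tail after the last separator; a further call moves past the region.
      start = index;
      end = send;
      index = send + 1;
    }
    return this;
  }

  String getSubstring() { return input.subSequence(start, end).toString(); }
  int getStart() { return start; }
  int getEnd() { return end; }
  int getLength() { return end - start; }
  boolean isEmpty() { return end <= start; }

  /** Collect all tokens of a line using the initialize/valid/advance loop. */
  static List<String> tokenize(String line, Pattern colSep) {
    SketchTokenizer t = new SketchTokenizer(colSep);
    List<String> out = new ArrayList<>();
    for (t.initialize(line, 0, line.length()); t.valid(); t.advance()) {
      out.add(t.getSubstring());
    }
    return out;
  }

  public static void main(String[] args) {
    // prints [1, 2.5, abc]
    System.out.println(tokenize("1,2.5,abc", Pattern.compile(",")));
  }
}
```

Reusing one instance and calling initialize for each new line avoids allocating a new matcher per line, which is presumably why the class exposes initialize separately from its constructor.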
-
getSubstring
public java.lang.String getSubstring()
Get the current part as substring.
- Returns:
- Current value as substring.
-
getStrippedSubstring
public java.lang.String getStrippedSubstring()
Get the current part as substring.
- Returns:
- Current value as substring.
-
getDouble
public double getDouble()
Get current value as double.
- Returns:
- double value
- Throws:
java.lang.NumberFormatException
- when current value cannot be parsed as double
-
getIntBase10
public int getIntBase10()
Get current value as int.
- Returns:
- int value
- Throws:
java.lang.NumberFormatException
- when current value cannot be parsed as int.
-
getLongBase10
public long getLongBase10()
Get current value as long.
- Returns:
- long value
- Throws:
java.lang.NumberFormatException
- when current value cannot be parsed as long.
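
As a rough illustration of the three numeric getters above, the following sketch parses a token given by its start and end offsets using only standard JDK calls (Java 9 or later for the range-based overloads). `TokenNumbers` and its method names are hypothetical; how ELKI actually parses the values is not stated on this page, so treat this as one possible approach that throws the same exception type.

```java
/** Hypothetical helpers: parse the token input[start, end) as a number. */
class TokenNumbers {
  /** Base-10 int, parsed in place without creating a substring. */
  static int intBase10(CharSequence input, int start, int end) {
    return Integer.parseInt(input, start, end, 10);
  }

  /** Base-10 long, parsed in place without creating a substring. */
  static long longBase10(CharSequence input, int start, int end) {
    return Long.parseLong(input, start, end, 10);
  }

  /** Double; the JDK has no range-based overload, so a substring is copied. */
  static double asDouble(CharSequence input, int start, int end) {
    return Double.parseDouble(input.subSequence(start, end).toString());
  }

  public static void main(String[] args) {
    String line = "42,3.5,x";
    System.out.println(intBase10(line, 0, 2)); // prints 42
    System.out.println(asDouble(line, 3, 6));  // prints 3.5
    try {
      intBase10(line, 7, 8); // "x" is not a number
    } catch (NumberFormatException e) {
      System.out.println("not an int: " + line.substring(7, 8));
    }
  }
}
```

Callers that read mixed columns typically wrap these calls in a try/catch for NumberFormatException and fall back to treating the token as a label.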
-
isEmpty
public boolean isEmpty()
Test for empty tokens; usually at end of line.
- Returns:
- Empty
-
isQuote
private char isQuote(int index)
Detect quote characters.
TODO: support more than one quote character, and then make sure that opening and closing quotes match.
- Parameters:
index - Position
- Returns:
- 1 when a quote character, 0 otherwise.
-
isQuoted
public boolean isQuoted()
Test if the current string was quoted.
- Returns:
- true when quoted.
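
The quote detection described for isQuote and isQuoted could look roughly like the sketch below. It is an assumption-laden illustration: `QuoteSketch`, `quoteAt` and `unquote` are hypothetical names, the quote set is chosen for the example, and `quoteAt` returns the matching character itself rather than 1, which is one way to address the TODO about matching opening and closing quotes.

```java
/** Hypothetical sketch of quote detection on a token span. */
class QuoteSketch {
  /** Returns the quote character found at index, or 0 if there is none. */
  static char quoteAt(CharSequence input, int index, char[] quoteChars) {
    if (index < 0 || index >= input.length()) {
      return 0;
    }
    char c = input.charAt(index);
    for (char q : quoteChars) {
      if (c == q) {
        return c;
      }
    }
    return 0;
  }

  /** Strips one pair of matching quotes from input[start, end), if present. */
  static String unquote(CharSequence input, int start, int end, char[] quoteChars) {
    char open = quoteAt(input, start, quoteChars);
    if (open != 0 && end - start >= 2 && quoteAt(input, end - 1, quoteChars) == open) {
      return input.subSequence(start + 1, end - 1).toString();
    }
    return input.subSequence(start, end).toString();
  }

  public static void main(String[] args) {
    char[] quotes = { '"', '\'' }; // assumed quote set for this example
    String line = "\"hello\",'x\",plain";
    System.out.println(unquote(line, 0, 7, quotes));   // prints hello
    System.out.println(unquote(line, 8, 11, quotes));  // prints 'x" (quotes do not match)
    System.out.println(unquote(line, 12, 17, quotes)); // prints plain
  }
}
```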
-
getStart
public int getStart()
Get start of token.
- Returns:
- Start
-
getEnd
public int getEnd()
Get end of token.
- Returns:
- End
-
getLength
public int getLength()
Get length of token.
- Returns:
- Token length
-
getChar
public char getChar(int off)
Get a single character.
- Parameters:
off - Offset
- Returns:
- Character
-
cleanup
public void cleanup()
Perform cleanup.
-