Class CSVReaderFormat


  • public class CSVReaderFormat
    extends java.lang.Object
    Basic format factory for parsing CSV-like formats. To read CSV files into ELKI, see NumberVectorLabelParser. This class encapsulates CSV format settings that are needed in multiple places in ELKI, not only for input vector files.
    Since:
    0.1
    Author:
    Arthur Zimek, Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  CSVReaderFormat.Par
      Parameterization class.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String ATTRIBUTE_CONCATENATION
      A character sequence used to separate attributes.
      protected java.util.regex.Pattern colSep
      Stores the column separator pattern.
      protected java.util.regex.Pattern comment
      Comment pattern.
      static java.lang.String COMMENT_PATTERN
      Default pattern for comments.
      static CSVReaderFormat DEFAULT_FORMAT
      Default CSV input format.
      static java.lang.String DEFAULT_SEPARATOR
      Default column separator: a pattern defining whitespace.
      static java.lang.String NUMBER_PATTERN
      A pattern matching most numbers that can be parsed using Double.parseDouble. Examples: 1 1. 1.2 .2 -.2e-03
      static java.lang.String QUOTE_CHARS
      Default quote characters.
      protected java.lang.String quoteChars
      Stores the quotation characters.
    • Constructor Summary

      Constructors 
      Constructor Description
      CSVReaderFormat(java.util.regex.Pattern colSep, java.lang.String quoteChars, java.util.regex.Pattern comment)
      Constructor.
    • Method Summary

      Modifier and Type Method Description
      TokenizedReader makeReader()
      Make a reader for the configured format.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • DEFAULT_SEPARATOR

        public static final java.lang.String DEFAULT_SEPARATOR
        Default column separator: a pattern defining whitespace.
        See Also:
        Constant Field Values
      • QUOTE_CHARS

        public static final java.lang.String QUOTE_CHARS
        Default quote characters.
        See Also:
        Constant Field Values
      • NUMBER_PATTERN

        public static final java.lang.String NUMBER_PATTERN
        A pattern matching most numbers that can be parsed using Double.parseDouble. Examples: 1 1. 1.2 .2 -.2e-03
        See Also:
        Constant Field Values
      • COMMENT_PATTERN

        public static final java.lang.String COMMENT_PATTERN
        Default pattern for comments.
        See Also:
        Constant Field Values
      • ATTRIBUTE_CONCATENATION

        public static final java.lang.String ATTRIBUTE_CONCATENATION
        A character sequence used to separate attributes.
        See Also:
        Constant Field Values
      • DEFAULT_FORMAT

        public static final CSVReaderFormat DEFAULT_FORMAT
        Default CSV input format.
      • colSep

        protected java.util.regex.Pattern colSep
        Stores the column separator pattern.
      • quoteChars

        protected java.lang.String quoteChars
        Stores the quotation characters.
      • comment

        protected java.util.regex.Pattern comment
        Comment pattern.
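
The NUMBER_PATTERN description above lists 1, 1., 1.2, .2 and -.2e-03 as strings that Double.parseDouble accepts. The sketch below checks those examples using only the JDK. The regular expression in it is a hypothetical stand-in written for this illustration; it is not the value of CSVReaderFormat.NUMBER_PATTERN, which is listed under Constant Field Values. The class name NumberPatternSketch and the method looksNumeric are likewise assumptions.

```java
import java.util.regex.Pattern;

/** Illustrative only: checks the example strings from the NUMBER_PATTERN description. */
final class NumberPatternSketch {
  // Hypothetical stand-in regex, NOT the actual CSVReaderFormat.NUMBER_PATTERN value.
  static final Pattern NUMBER =
      Pattern.compile("[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?");

  /** Returns true if the whole token matches the stand-in number regex. */
  static boolean looksNumeric(String token) {
    return NUMBER.matcher(token).matches();
  }

  public static void main(String[] args) {
    String[] tokens = { "1", "1.", "1.2", ".2", "-.2e-03", "abc" };
    for (String token : tokens) {
      if (looksNumeric(token)) {
        // Only parse tokens that passed the pre-check.
        System.out.println(token + " parses to " + Double.parseDouble(token));
      } else {
        System.out.println(token + " is treated as a non-numeric token");
      }
    }
  }
}
```

A pre-check of this kind lets a parser decide whether a token is a number or a label without relying on a NumberFormatException for control flow.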
    • Constructor Detail

      • CSVReaderFormat

        public CSVReaderFormat(java.util.regex.Pattern colSep,
                               java.lang.String quoteChars,
                               java.util.regex.Pattern comment)
        Constructor.
        Parameters:
        colSep - Column separator
        quoteChars - Quote characters
        comment - Comment pattern
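
To make the roles of the three constructor parameters concrete, the following is a minimal, JDK-only sketch of how a column separator pattern, a set of quote characters, and a comment pattern could interact when splitting one line. It is not ELKI's implementation: CsvFormatSketch, tokenize and example are hypothetical names, the regular expressions are example values assumed for this illustration, and the real tokenization is done by the TokenizedReader returned from makeReader().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in; not ELKI's CSVReaderFormat or TokenizedReader. */
final class CsvFormatSketch {
  final Pattern colSep;     // column separator pattern
  final String quoteChars;  // each character in this string acts as a quote
  final Pattern comment;    // lines fully matching this pattern are skipped

  CsvFormatSketch(Pattern colSep, String quoteChars, Pattern comment) {
    this.colSep = colSep;
    this.quoteChars = quoteChars;
    this.comment = comment;
  }

  /** Example settings assumed for this sketch only. */
  static CsvFormatSketch example() {
    return new CsvFormatSketch(Pattern.compile("\\s*[,;\\s]\\s*"), "\"'",
        Pattern.compile("^\\s*(#|//|;).*$"));
  }

  /** Returns the tokens of one line, or an empty list for a comment line. */
  List<String> tokenize(String line) {
    List<String> tokens = new ArrayList<>();
    if (comment != null && comment.matcher(line).matches()) {
      return tokens;
    }
    Matcher sep = colSep.matcher(line);
    int pos = 0;
    while (pos < line.length()) {
      char c = line.charAt(pos);
      if (quoteChars.indexOf(c) >= 0) {
        // Quoted token: runs to the next occurrence of the same quote character.
        int close = line.indexOf(c, pos + 1);
        int end = close < 0 ? line.length() : close;
        tokens.add(line.substring(pos + 1, end));
        pos = Math.min(end + 1, line.length());
        // Consume a separator directly after the closing quote, if present.
        sep.region(pos, line.length());
        if (sep.lookingAt()) {
          pos = sep.end();
        }
      } else if (sep.find(pos) && sep.end() > sep.start()) {
        // Unquoted token: runs to the next non-empty separator match.
        tokens.add(line.substring(pos, sep.start()));
        pos = sep.end();
      } else {
        // No further separator: the rest of the line is the last token.
        tokens.add(line.substring(pos));
        pos = line.length();
      }
    }
    return tokens;
  }

  public static void main(String[] args) {
    CsvFormatSketch format = example();
    System.out.println(format.tokenize("1.0, 2.5; 'hello world' label"));
    System.out.println(format.tokenize("# a comment line"));
  }
}
```

Passing compiled Pattern objects rather than raw strings, as the constructor signature above does, means the expressions are compiled once and can be reused for every line.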
    • Method Detail

      • makeReader

        public TokenizedReader makeReader()
        Make a reader for the configured format.
        Returns:
        A tokenized reader for this format.