Package org.apache.commons.csv



Apache Commons CSV

CSV files are widely used as interfaces to legacy systems and for manual data imports. CSV stands for "Comma Separated Values" (or sometimes "Character Separated Values"). The CSV data format is defined in RFC 4180, but many dialects exist.

All dialects share the same basic structure: the CSV data format is record-oriented, and each record starts on a new textual line. A record is built from a list of values. Keep in mind that records need not all have the same number of values:

       csv    := records*
       record := values*
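To make the grammar above concrete, here is a minimal sketch in plain Java. It is not the Commons CSV parser: the class name NaiveCsvGrammar and its read method are hypothetical, and quoting, comments and whitespace handling are deliberately left out. It only shows that a document is a sequence of records and that records may differ in their number of values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a naive reader for the grammar above, not the Commons CSV parser.
class NaiveCsvGrammar {

    // Splits text into records on "\r\n", "\r" or "\n", then splits each record on ','.
    // Quoting, comments and whitespace handling are deliberately ignored.
    static List<List<String>> read(String text) {
        List<List<String>> records = new ArrayList<>();
        for (String line : text.split("\r\n|\r|\n")) {
            // The limit -1 keeps trailing empty values, as in "a,b,".
            records.add(Arrays.asList(line.split(",", -1)));
        }
        return records;
    }

    public static void main(String[] args) {
        // Three records holding three, one and two values respectively.
        for (List<String> record : read("a,b,c\nd\r\ne,f")) {
            System.out.println(record.size() + " value(s): " + record);
        }
    }
}
```

A real parser cannot split on line breaks this early, because an encapsulated value may itself contain a line break.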
 

The following list contains the CSV aspects the Commons CSV parser supports:

Separators (for lines)
The record separators are hardcoded and cannot be changed. They must be '\r', '\n', or '\r\n'.
Delimiter (for values)
The delimiter for values is freely configurable (default ',').
Comments
Some CSV dialects support a simple comment syntax. A comment is a record that starts with a designated character (the commentStarter). Such a record is treated as a comment and removed from the input (default: none).
Encapsulator
Two encapsulator characters (default '"') enclose a complex value (see "Complex values").
Simple values
A simple value consists of all characters up to, but not including, the next delimiter or record terminator. Optionally, whitespace surrounding a simple value can be ignored (default: true).
Complex values
Complex values are enclosed in a pair of the defined encapsulator characters. The encapsulator itself must be escaped or doubled when used inside a complex value. Complex values preserve all formatting, including newlines, which allows multi-line values.
Empty line skipping
Optionally, empty lines in CSV files can be skipped. Otherwise, an empty line yields a record with a single empty value.
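The interplay of delimiter, encapsulator, simple values and complex values can be sketched in plain Java. This is not the Commons CSV implementation: the class RecordSketch and its split method are hypothetical, the input is assumed to be a single record that has already been isolated, simple values are always trimmed, and only the doubled-encapsulator form of escaping is handled.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical single-record splitter, not the Commons CSV parser.
class RecordSketch {

    // Splits one record into values. Inside a complex value the delimiter is
    // taken literally and a doubled encapsulator yields one encapsulator character.
    static List<String> split(String record, char delimiter, char encapsulator) {
        List<String> values = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        boolean inComplex = false;  // currently between two encapsulators
        boolean wasComplex = false; // the current value was encapsulated
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inComplex) {
                if (c != encapsulator) {
                    value.append(c); // delimiters and newlines are preserved
                } else if (i + 1 < record.length() && record.charAt(i + 1) == encapsulator) {
                    value.append(encapsulator); // doubled encapsulator
                    i++;
                } else {
                    inComplex = false; // closing encapsulator
                }
            } else if (c == encapsulator && value.length() == 0 && !wasComplex) {
                inComplex = true; // opening encapsulator at the start of a value
                wasComplex = true;
            } else if (c == delimiter) {
                values.add(wasComplex ? value.toString() : value.toString().trim());
                value.setLength(0);
                wasComplex = false;
            } else {
                value.append(c);
            }
        }
        // The last value has no trailing delimiter.
        values.add(wasComplex ? value.toString() : value.toString().trim());
        return values;
    }

    public static void main(String[] args) {
        // A trimmed simple value, a complex value with a delimiter and doubled quotes, a simple value.
        System.out.println(split(" a ,\"b,\"\"c\"\"\",d", ',', '"'));
        // The same rules with ';' as delimiter and '\'' as encapsulator.
        System.out.println(split("1;'2;3'", ';', '\''));
    }
}
```

Passing the delimiter and encapsulator as parameters mirrors the configurability described in the list; comment handling and empty-line skipping would sit one level up, where the input is divided into records.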

In addition to individually defined dialects, two predefined dialects (strict-csv and excel-csv) can be set directly.

Example usage:

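 // CSVFormat.parse(Reader) can throw java.io.IOException.
 Reader in = new StringReader("a,b,c");
 for (CSVRecord record : CSVFormat.DEFAULT.parse(in)) {
     // A record iterates over its values in order.
     for (String field : record) {
         System.out.print("\"" + field + "\", ");
     }
     System.out.println();
 }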