Parse-O-Matic

CSV stands for "Comma-Separated Values". It is a standard way of exchanging data, and virtually all spreadsheets and databases can import data presented in this format.

Sample CSV File

Here is an example of a CSV file:
"REVIEW_DATE","AUTHOR","ISBN","DISCOUNTED_PRICE"
"1985/01/21","Douglas Adams",0345391802,5.95
"1990/01/12","Douglas Hofstadter",0465026567,9.95
"1998/07/15","Timothy ""The Parser"" Campbell",0968411304,18.99
"1999/12/03","Richard Friedman",0060630353,5.95
"2001/09/19","Karen Armstrong",0345384563,9.95
"2002/06/23","David Jones",0198504691,9.95
"2002/06/23","Julian Jaynes",0618057072,12.50
"2003/09/30","Scott Adams",0740721909,4.95
"2004/10/04","Benjamin Radcliff",0804818088,4.95
"2004/10/04","Randel Helms",0879755725,4.50

As you can see, the four fields (i.e. data items) are separated by commas. This is why CSV files are sometimes referred to as "comma-delimited files", even though such files usually have a .CSV extension.

Explanation

Note that the first line of the file lists the field names. Strictly speaking, this header line is not part of the CSV format, but it is a common convention that makes the data easier to import into other programs.
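
As a rough illustration of how an importing program can use the header line, here is a minimal sketch in Python using the standard csv module. The names SAMPLE and read_records are made up for this example, and only the first three lines of the sample file are included.

```python
import csv
import io

# First three lines of the sample file (header line plus two records).
SAMPLE = '''"REVIEW_DATE","AUTHOR","ISBN","DISCOUNTED_PRICE"
"1985/01/21","Douglas Adams",0345391802,5.95
"1990/01/12","Douglas Hofstadter",0465026567,9.95
'''

def read_records(text):
    """Return one dict per data line, keyed by the names in the header line."""
    return list(csv.DictReader(io.StringIO(text)))

records = read_records(SAMPLE)
print(records[0]["AUTHOR"])  # Douglas Adams
print(records[0]["ISBN"])    # 0345391802 (read as text, leading zero intact)
```

Every value comes back as text. Converting the price or the date to another type is left to the importing program.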

Note that text fields are in quotes, whereas numeric fields are not. One reason for this is that a text field might contain a comma. Another reason is that quoting a field tells the importing program to treat the field "as-is", without alteration. Some programs honour this convention, while others do not. (Excel, for example, has a tendency to be terribly clever about the fields it imports.)
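
Python's csv module happens to offer a quoting mode, csv.QUOTE_NONNUMERIC, that follows the same convention. The sketch below is only one way to illustrate it, and write_row and read_row are hypothetical helper names. When writing, text values are quoted and numbers are left bare. When reading, every unquoted field is converted to a floating-point number, which shows what can happen to an unquoted ISBN that starts with a zero.

```python
import csv
import io

def write_row(values):
    """Write one CSV line, quoting text values and leaving numbers bare."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC).writerow(values)
    return buf.getvalue()

def read_row(line):
    """Read one CSV line; unquoted fields are converted to float."""
    return next(csv.reader([line], quoting=csv.QUOTE_NONNUMERIC))

line = write_row(["1985/01/21", "Douglas Adams", 5.95])
# line is '"1985/01/21","Douglas Adams",5.95\r\n'

row = read_row('"1985/01/21","Douglas Adams",0345391802,5.95')
# row is ['1985/01/21', 'Douglas Adams', 345391802.0, 5.95]
# The quoted fields are untouched; the unquoted ISBN lost its leading zero.
print(row)
```

If a value such as an ISBN must survive unchanged, putting it in quotes is the safer choice for a program that honours the convention.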

Note also that the fourth line of the sample file contains doubled-up quotes within a quoted field. When quotes appear in a quoted field, the quotes are doubled to indicate that they do not mark the end of the field. When the CSV field is read in, the doubled-up quotes should be converted back to single instances. Be warned, though, that even though this is the convention, not all programs do this.
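
The doubling and un-doubling can be seen with a short Python sketch. It assumes the default settings of the standard csv module, which follow the convention described here; other tools may behave differently.

```python
import csv
import io

# The fourth line of the sample file.
line = '"1998/07/15","Timothy ""The Parser"" Campbell",0968411304,18.99'

# Reading: the doubled-up quotes come back as single quotes.
fields = next(csv.reader([line]))
print(fields[1])  # Timothy "The Parser" Campbell

# Writing: a field that contains quotes is quoted, and its quotes are doubled.
buf = io.StringIO()
csv.writer(buf).writerow(['Timothy "The Parser" Campbell'])
written = buf.getvalue()
# written is '"Timothy ""The Parser"" Campbell"\r\n'
```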

Exceptions

While we're on the subject of exceptions, you will frequently see CSV files in which the text fields are not surrounded by quotes. In such cases the programmer has assumed that a text field will never contain a comma.

You will sometimes see CSV files in which the fields are separated by semicolons instead of commas. Such a file might look like this:

"Nom D'Employee";"Addresse";"Numero D'Employee"
"Jean Bonnehomme";"1234 rue Veritable";00001
"Marie Antoinette";"4321 rue Gateau";00002

The use of the semicolon in CSV files (as shown above) is particularly prevalent in Europe, since many European countries use a comma as the decimal separator. A North American might write 3.14159 while a European might write 3,14159 to represent the same value.
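
A minimal Python sketch of reading such a file follows. The helper names parse_semicolon_line and parse_decimal_comma are invented for this example, and the decimal conversion is deliberately naive: it assumes the value has no thousands separators.

```python
import csv

def parse_semicolon_line(line):
    """Split one line of a semicolon-separated file into its fields."""
    return next(csv.reader([line], delimiter=";"))

def parse_decimal_comma(text):
    """Convert a decimal-comma string such as '3,14159' to a float (naive)."""
    return float(text.replace(",", "."))

fields = parse_semicolon_line('"Jean Bonnehomme";"1234 rue Veritable";00001')
print(fields)  # ['Jean Bonnehomme', '1234 rue Veritable', '00001']

print(parse_decimal_comma("3,14159"))  # 3.14159

# Why the comma cannot do both jobs: an unquoted 3,14159 in a
# comma-separated line is split into two fields.
broken = next(csv.reader(['"pi",3,14159']))
print(broken)  # ['pi', '3', '14159']
```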


 
An alternative to the comma-separated-value file is the tab-delimited file. This puts a tab character (character value: decimal 9, hex $09) between each field.

The advantage of tab-delimiting is that you do not have to worry about commas or quotes. Tabs almost never appear in a human-readable, plain-text data field, except perhaps due to a data-entry mistake that wasn't filtered out.

The disadvantage of delimiting this way is that the tab character is treated differently by different programs. A CSV file can be loaded into any text editor program, and it will be obvious how each record (one line of fields) is divided. But different text editors treat tabs in different ways. In most cases the tab will look exactly like one or more spaces.
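
To illustrate both points, the sketch below builds one tab-delimited line by hand (the first record of the sample file, rewritten without quotes as an assumption for this example), shows a way to make the tabs visible, and then splits the line into fields with Python's standard csv module.

```python
import csv

# One record with a tab character (decimal 9) between each field.
line = "1985/01/21\tDouglas Adams\t0345391802\t5.95"

# repr() shows each tab as \t, so it cannot be mistaken for spaces.
print(repr(line))

# The tab character really is character value 9.
print(ord("\t"))  # 9

# Splitting on tabs; no quoting is needed even though a field contains a space.
fields = next(csv.reader([line], delimiter="\t"))
print(fields)  # ['1985/01/21', 'Douglas Adams', '0345391802', '5.95']
```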



 


Copyright © 1986-2011 National Data Parsing Canada Corporation. All rights reserved.