[comp.sys.ibm.pc] "comma-delimited" format for data files

aland@infmx.UUCP (Dr. Scump) (01/06/90)

I have heard the above term thrown around but can't find any hard info
on it.  I gather that the protocol is that all fields are delimited by
commas and that all character fields are enclosed in (double) quotes.

My questions:
   1) is the premise correct?  if not, what is the truth?
   2) what is the origin for the format, and how widely is it used?
      what product(s) use this format as their primary/only format
      for data unloaded to flat files?
   3) what is the protocol for embedding a double quote within a 
      character field to keep it from being taken as a delimiter?
   4) are DATE-type fields enclosed in quotes, for those products
      which support a DATE datatype and use comma-delimited format?
   5) why did De Anza have to start the Winter Quarter on January 2nd?
      (optional - extra credit)

I would appreciate it if people in the know email me a quick bit of
detail about this format.  We are considering supporting it in
our next DOS release, as customers do bring it up now and then.

Thanks muchly.

--
  Alan S. Denney  @  Informix Software, Inc.    "We're homeward bound
       {pyramid|uunet}!infmx!aland               ('tis a damn fine sound!)
 --------------------------------------------    with a good ship, taut & free
  Disclaimer:  These opinions are mine alone.    We don't give a damn, 
  If I am caught or killed, the secretary        when we drink our rum
  will disavow any knowledge of my actions.      with the girls of old Maui."

cramer@optilink.UUCP (Clayton Cramer) (01/09/90)

In article <3008@infmx.UUCP>, aland@infmx.UUCP (Dr. Scump) writes:
> I have heard the above term thrown around but can't find any hard info
> on it.  I gather that the protocol is that all fields are delimited by
> commas and that all character fields are enclosed in (double) quotes.

Character fields only need to be enclosed in double quotes if they
contain commas themselves.
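
A minimal sketch of this "quote only when needed" rule, assuming Python's standard csv module as a stand-in for whatever a given product actually does. The helper name to_minimal_csv is hypothetical. Note that csv.QUOTE_MINIMAL also quotes fields that contain a double quote or a line break, which goes slightly beyond the commas-only rule stated above.

```python
import csv
import io

def to_minimal_csv(rows):
    """Write rows as comma-delimited text, quoting only fields that need it."""
    buffer = io.StringIO()
    # QUOTE_MINIMAL quotes a field only if it contains the delimiter,
    # the quote character, or a line terminator character.
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

# Only the first field contains a comma, so only it is quoted.
print(to_minimal_csv([["Smith, John", "Portland", 42]]), end="")
# prints: "Smith, John",Portland,42
```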

> My questions:
>    1) is the premise correct?  if not, what is the truth?

Yes.  What is truth? :-)

>    2) what is the origin for the format, and how widely is it used?
>       what product(s) use this format as their primary/only format
>       for data unloaded to flat files?

I believe it derives from CP/M BASIC programs, since the strings
enclosed in double quotes are a traditional BASIC way of passing in
fields containing commas.

I know of at least one word processor (Volkswriter) that uses this
format for fill-in-the-blanks form letters.  PC-File III
(or whatever it is calling itself these days) both reads and writes
comma-delimited files, as does Microsoft Excel.

>    3) what is the protocol for embedding a double quote within a 
>       character field to keep it from being taken as a delimiter?

Two double quotes in a row.  I don't know if anyone recognizes
the use of a backslash as an escape.  The two double quotes rule
derives from ancient history -- it was standard when I learned
to program on an IBM 1401.
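
The doubling rule on the writing side can be sketched in a few lines of Python. The function name quote_field is hypothetical, and the sketch always quotes the field, which is an assumption made to keep it short:

```python
def quote_field(text):
    """Wrap text in double quotes, doubling each embedded double quote."""
    return '"' + text.replace('"', '""') + '"'

# The embedded quotes around hello are each written twice.
print(quote_field('He said "hello", then left'))
# prints: "He said ""hello"", then left"
```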

>    4) are DATE-type fields enclosed in quotes, for those products
>       which support a DATE datatype and use comma-delimited format?

I've never seen commas in DATE fields, so I would assume that DATE
fields wouldn't need quotes around them.

-- 
Clayton E. Cramer {pyramid,pixar,tekbspa}!optilink!cramer
"Power still comes from guns" -- Newsweek, 01/08/90, p. 25.
===============================================================================
Disclaimer?  You must be kidding!  No company would hold opinions like mine!

dsmith@simpact.com (01/11/90)

In article <3008@infmx.UUCP>, aland@infmx.UUCP (Dr. Scump) writes:
> I have heard the above term thrown around but can't find any hard info
> on it.  I gather that the protocol is that all fields are delimited by
> commas and that all character fields are enclosed in (double) quotes.
> 
> My questions:
>    1) is the premise correct?  if not, what is the truth?

You are correct.

>    2) what is the origin for the format, and how widely is it used?
>       what product(s) use this format as their primary/only format
>       for data unloaded to flat files?

The format originated back in the CP/M days when dBASE and other
BASIC programs used this as their internal format.  The format is
used today for interchange of information between programs with
different internal formats such as dBASE III and SuperCalc or 1-2-3.

>    3) what is the protocol for embedding a double quote within a 
>       character field to keep it from being taken as a delimiter?

Two double quotes in a row within a string.
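
On the reading side, a hand-rolled splitter for a single line might look like the sketch below. The name split_record is hypothetical, and the sketch assumes one record per line with no line breaks inside quoted fields. It treats two double quotes inside a quoted field as one literal quote:

```python
def split_record(line):
    """Split one comma-delimited line into fields, honoring "" inside quotes."""
    fields, current, in_quotes = [], [], False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(line) and line[i + 1] == '"':
                    current.append('"')   # doubled quote: keep one literal quote
                    i += 1                # skip the second quote of the pair
                else:
                    in_quotes = False     # lone quote: the quoted part ends
            else:
                current.append(ch)        # commas here are ordinary data
        elif ch == '"':
            in_quotes = True              # opening quote
        elif ch == ',':
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))       # the last field has no trailing comma
    return fields

print(split_record('"Smith, John","6"" nail",42'))
# prints: ['Smith, John', '6" nail', '42']
```

Every field comes back as a string; converting 42 to a number is left to the caller.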

>    4) are DATE-type fields enclosed in quotes, for those products
>       which support a DATE datatype and use comma-delimited format?

I have not seen a DATE type field.

>    5) why did De Anza have to start the Winter Quarter on January 2nd?
>       (optional - extra credit)

You got me!


warren@cs.pdx.edu (Warren Harrison) (01/13/90)

In article <2926@optilink.UUCP> cramer@optilink.UUCP (Clayton Cramer) writes:
>In article <3008@infmx.UUCP>, aland@infmx.UUCP (Dr. Scump) writes:
>> I have heard the above term thrown around but can't find any hard info
>> on it.  I gather that the protocol is that all fields are delimited by
>> commas and that all character fields are enclosed in (double) quotes.
>
>Character fields only need to be enclosed in double quotes if they
>contain commas themselves.

The most popular reason for using comma-delimited fields under MS-DOS is
for importing into Lotus or dBASE. I'm not sure about dBASE, but with Lotus,
things to be taken as labels (i.e., non-numerics) must be enclosed in quotes
even if they don't contain commas or quotes themselves (at least my version
of Quattro will ignore them if they're not quoted, and I assume it acts like
Lotus in this respect).
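
A file that follows this stricter convention (every label quoted, numbers left bare) can be produced with Python's standard csv module, as in the sketch below. The helper name to_import_csv is hypothetical, and whether any particular version of Lotus or Quattro accepts exactly this output is an assumption, not something tested here.

```python
import csv
import io

def to_import_csv(rows):
    """Write rows with every non-numeric field quoted and numbers unquoted."""
    buffer = io.StringIO()
    # QUOTE_NONNUMERIC quotes all fields that are not ints or floats.
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

# The label is quoted even though it contains no comma or quote.
print(to_import_csv([["Widget", 19.95, 3]]), end="")
# prints: "Widget",19.95,3
```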


==========================================================================
Warren Harrison                                          warren@cs.pdx.edu
Department of Computer Science                                503/725-3108
Portland State University