[comp.sys.ibm.pc] Why is comma delimiter in batch files?

rl@cbnewsl.ATT.COM (roger.h.levy) (01/02/90)

I've noticed that batch files regard both blanks and commas as delimiters
whereas C programs regard only blanks as delimiters.  (If it matters, I'm
using DOS 3.2.)  For example:

#include <stdio.h>

/* Print each command-line argument on its own line. */
int main(int argc, char **argv)
{
	while (--argc > 0)
		printf("%s\n", *++argv);
	return 0;
}

When this program is fed a parameter string of "one,two", it prints that
string back exactly, i.e. it regards the string as a single parameter.

However, when the following batch file is fed the same string:

echo off
:start
if x.%1 == x. goto exit
echo %1
shift
goto start
:exit
echo on

It prints "one" and "two" on different lines, i.e. it regards a comma
delimited string as multiple parameters.

My question: Is there a way to convince batch files to think of the comma
as part of the parameter, i.e. to regard only blanks as delimiters so as
to be consistent with C programs?

Roger H. Levy
Bell Labs, Whippany NJ
att!groucho!rl

cox@jolnet.ORPK.IL.US (Ben Cox) (01/02/90)

In article <3445@cbnewsl.ATT.COM> rl@cbnewsl.ATT.COM (roger.h.levy) writes:
>
>I've noticed that batch files regard both blanks and commas as delimiters
>whereas c programs regard only blanks as a delimiter (If it matters, I'm
>using DOS 3.2).  For example:
[a C program to echo all the command-line parameters]
>When this program is fed a parameter string of "one,two", it prints that
>string back exactly, i.e. it regards the string as a single parameter.
>
>However, when the following batch file is fed the same string:
[a batch file to do the same thing]
>It prints "one" and "two" on different lines, i.e. it regards a comma
>delimited string as multiple parameters.
>
>My question: Is there a way to convince batch files to think of the comma
>as part of the parameter, i.e. to regard only blanks as delimiters so as
>to be consistent with c programs?

No, unless you want to patch DOS.  An alternative, if you want to do it and
have source to your C startup code, is to patch that (Turbo C Pro comes with
this source, Microsoft may/may not -- Turbo C's name is "c0?.obj" where the ?
is the memory model) to parse a comma as a delimiter.

The problem is that when a DOS program gets a command line, it gets it all in
one string.  The C0S (for small) module contains code which sets up error
handlers, parses the command line out into the argv[] array, and calls your
main() function with a counter of args, the pointers to the args, and a pointer
to the environment: main(argc,argv,envp).  So the fact that "one,two" is being
treated as one arg is entirely the fault of the startup code.  When you run a
batch file, DOS parses the command-line string out for the batch file and lets
you reference the individual strings as %0-%9.  The difference is, of course,
in the way the things are parsed out.  Anyway :-), the point is that while you
can customize your C command-line parser, you can't customize DOS's command-
line parser (unless you want to patch DOS, which is generally a Bad Idea).
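
A minimal sketch of the kind of customizable C-side parser described above.
The helper name split_args and its signature are hypothetical; this is not
the Turbo C or Microsoft startup code, only an illustration of splitting one
command-line string into an argv[]-style array with a caller-chosen set of
delimiters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the real startup code): split a command
 * tail in place into at most max arguments, treating every character
 * in delims as a separator.  Returns the number of arguments found.
 */
int split_args(char *tail, const char *delims, char **argv, int max)
{
    int argc = 0;

    assert(tail != NULL && delims != NULL && argv != NULL);

    while (*tail != '\0' && argc < max) {
        tail += strspn(tail, delims);      /* skip leading separators */
        if (*tail == '\0')
            break;
        argv[argc++] = tail;               /* start of the next argument */
        tail += strcspn(tail, delims);     /* advance to its end */
        if (*tail != '\0')
            *tail++ = '\0';                /* terminate it and move on */
    }
    return argc;
}
```

Called on "one,two" with delims set to " \t", this yields a single argument,
as the C program in the original article does; with delims set to " \t,", it
yields "one" and "two", as the batch file does.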

>Roger H. Levy
>Bell Labs, Whippany NJ
>att!groucho!rl

Ben Cox
b-cox2@uiuc.edu (this gets forwarded to me wherever I am)

leonard@bucket.UUCP (Leonard Erickson) (01/03/90)

rl@cbnewsl.ATT.COM (roger.h.levy) writes:


>I've noticed that batch files regard both blanks and commas as delimiters
>whereas c programs regard only blanks as a delimiter (If it matters, I'm
>using DOS 3.2).  For example:

[example c program and comments deleted]

>When this program is fed a parameter string of "one,two", it prints that
>string back exactly, i.e. it regards the string as a single parameter.

>However, when the following batch file is fed the same string:

[example batch file and comments deleted]

>It prints "one" and "two" on different lines, i.e. it regards a comma
>delimited string as multiple parameters.

>My question: Is there a way to convince batch files to think of the comma
>as part of the parameter, i.e. to regard only blanks as delimiters so as
>to be consistent with c programs?

In a word, no. Batch files use COMMAND.COM. By its (and MS-DOS's) rules,
space, comma, and semicolon are all delimiters. This behavior is *defined*
in the DOS manuals, so I'm afraid you're out of luck. I wouldn't go that far
myself, but a case could be made that under MS-DOS, your C is showing the
broken behaviour.

I'm rather puzzled as to why you'd care that a C program and a batch file have
different behavior anyway. But since you do, you'd have to rewrite *at least*
COMMAND.COM. Possibly more.
-- 
Leonard Erickson		...!tektronix!reed!percival!bucket!leonard
CIS: [70465,203]
"I'm all in favor of keeping dangerous weapons out of the hands of fools.
Let's start with typewriters." -- Solomon Short

cs4g6ag@maccs.dcss.mcmaster.ca (Stephen M. Dunn) (01/04/90)

In article <3445@cbnewsl.ATT.COM> rl@cbnewsl.ATT.COM (roger.h.levy) writes:
$My question: Is there a way to convince batch files to think of the comma
$as part of the parameter, i.e. to regard only blanks as delimiters so as
$to be consistent with c programs?

   Well, on the command line you can surround strings with quotes, and
that will cause any special characters in them to be treated as normal
characters instead.  For example, you can type
echo "Three is greater than two:  3>2"
and DOS will respond with
"Three is greater than two:  3>2"
(or at least it will on my generic MS-DOS 3.20).  I would imagine that
this will also remove the special meaning from the comma.

   Your C program and the batch file respond differently because
different code parses the input string in each case.  For the batch file,
the string is parsed by COMMAND.COM using its idea of what a delimiter
is.  For the C program, the parsing is done by the C0.OBJ code using
its idea of what a delimiter is (the command tail passed to the C program
is passed straight to the program with the exception of redirection being
removed so that it's transparent to the program).  If the two don't agree,
you can have problems such as you show in your original article.

-- 
Stephen M. Dunn                               cs4g6ag@maccs.dcss.mcmaster.ca
          <std_disclaimer.h> = "\nI'm only an undergraduate!!!\n";
****************************************************************************
    If it's true that love is only a game//Well, then I can play pretend

pipkins@qmsseq.imagen.com (Jeff Pipkins) (01/05/90)

In article <2646@jolnet.ORPK.IL.US> b-cox2@uiuc.edu writes:
>In article <3445@cbnewsl.ATT.COM> rl@cbnewsl.ATT.COM (roger.h.levy) writes:
>>
>>I've noticed that batch files regard both blanks and commas as delimiters
>>whereas c programs regard only blanks as a delimiter (If it matters, I'm
>>using DOS 3.2).  For example:
[example deleted]
>>My question: Is there a way to convince batch files to think of the comma
>>as part of the parameter, i.e. to regard only blanks as delimiters so as
>>to be consistent with c programs?
>
>No, unless you want to patch DOS.
[correct.  Or write your own command.com...]
>An alternative, if you want to do it and
>have source to your C startup code, is to patch that (Turbo C Pro comes with
>this source, Microsoft may/may not -- Turbo C's name is "c0?.obj" where the ?
>is the memory model) to parse a comma as a delimiter.

Incidentally, the command tail is not parsed by c0?.obj, but by a function
that it calls.  I don't remember its name right now.  Same goes for MSC,
the name is setargv() I think.  You can link in your own version of this
function if you want.  The command tail is stored in the program's PSP
at offset 80h.  MSC has a global _psp that has the segment address of
the PSP.  { char far *p;  FP_SEG(p) = _psp; FP_OFF(p) = 0x80; }
The byte at 80h has the length of the string, which actually starts at
81h.  The first byte (at 81h) is always a blank, and the string always
has a 0x0D byte at the end.  { p[*p + 1] = 0;  p++;
printf("command tail is %s\n", p); }

Hope this helps.

Ralf.Brown@B.GP.CS.CMU.EDU (01/05/90)

In article <74@qmsseq.imagen.com>, pipkins@qmsseq.imagen.com (Jeff Pipkins) wrote:
}Incidentally, the command tail is not parsed by c0?.obj, but by a function
}that it calls.  I don't remember its name right now.  Same goes for MSC,
}the name is setargv() I think.  You can link in your own version of this
}function if you want.  The command tail is stored in the program's PSP
}at offset 80h.  MSC has a global _psp that has the segment address of
}the PSP.  { char far *p;  FP_SEG(p) = _psp; FP_OFF(p) = 0x80; }

TurboC also uses setargv() and _psp.

}The byte at 80h has the length of the string, which actually starts at
}81h.  The first byte (at 81h) is always a blank, and the string always
}has a 0x0D byte at the end.

81h is *not* always a blank.  If you type
       FOO/X
the command tail that FOO is passed at 80h is
       02h "/" "X" 0Dh
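
The layout above can be sketched in portable C by working on a 256-byte copy
of the PSP rather than on the real one, so no far pointers or _psp are
needed.  The helper name copy_command_tail is hypothetical, and the buffer
standing in for the PSP is an assumption made purely for illustration.  It
relies only on the length byte, so it does not care whether the first
character of the tail is a blank.

```c
#include <assert.h>
#include <string.h>

#define TAIL_OFFSET 0x80   /* length byte; the text itself starts at 0x81 */

/*
 * Hypothetical helper: copy the command tail out of a 256-byte image
 * of a PSP into out (of size outsize) as a NUL-terminated string.
 * The length byte does not count the trailing 0x0D.
 * Returns the number of characters copied.
 */
size_t copy_command_tail(const unsigned char *psp, char *out, size_t outsize)
{
    size_t len = psp[TAIL_OFFSET];

    assert(psp != NULL && out != NULL && outsize > 0);

    if (len > 127)              /* only 127 bytes follow the length byte */
        len = 127;
    if (len > outsize - 1)      /* leave room for the terminating NUL */
        len = outsize - 1;
    memcpy(out, psp + TAIL_OFFSET + 1, len);
    out[len] = '\0';
    return len;
}
```

With the bytes 02h "/" "X" 0Dh stored at offset 80h of the image, the helper
returns 2 and leaves "/X" in out.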
--
UUCP: {ucbvax,harvard}!cs.cmu.edu!ralf -=- 412-268-3053 (school) -=- FAX: ask
ARPA: ralf@cs.cmu.edu  BIT: ralf%cs.cmu.edu@CMUCCVMA  FIDO: Ralf Brown 1:129/46
"How to Prove It" by Dana Angluin              Disclaimer? I claimed something?
17. proof by mutual reference:
    In reference A, Theorem 5 is said to follow from Theorem 3 in reference B,
    which is shown to follow from Corollary 6.2 in reference C, which is an
    easy consequence of Theorem 5 in reference A.

richard@calvin.spp.cornell.edu (Richard Brittain) (01/06/90)

In article <74@qmsseq.imagen.com> pipkins@qmsseq.UUCP (Jeff Pipkins) writes:
>In article <2646@jolnet.ORPK.IL.US> b-cox2@uiuc.edu writes:
>>In article <3445@cbnewsl.ATT.COM> rl@cbnewsl.ATT.COM (roger.h.levy) writes:
>>>My question: Is there a way to convince batch files to think of the comma
>>>as part of the parameter, i.e. to regard only blanks as delimiters so as
>>>to be consistent with c programs?
>>
>>An alternative, if you want to do it and
>>have source to your C startup code, is to patch that (Turbo C Pro comes with
>>this source, Microsoft may/may not -- Turbo C's name is "c0?.obj" where the ?
>>is the memory model) to parse a comma as a delimiter.
>
>Incidentally, the command tail is not parsed by c0?.obj, but by a function
>that it calls.  I don't remember its name right now.  Same goes for MSC,
>the name is setargv() I think.  You can link in your own version of this

  At least in Turbo-C, the function is definitely setargv(), and you can 
replace it without messing with the startup .obj file.  The option to have
wildcards expanded in the argv array just links in a different setargv(), and
if you don't need command line parameters at all, you can save some memory
in the executable by saying  setargv() {};
I have versions of setargv() written in both C and assembler  (distributed, I
think, before the current release of Turbo-C, which gives you the source for
setargv.asm), and it is quite easy to change the parsing and/or wildcard 
expansion.  I hacked mine to recognise [] as part of a wildcard pattern, so the
wildunix TSR program worked better.

Richard Brittain,                   School of Elect. Eng.,  Upson Hall   
                                    Cornell University, Ithaca, NY 14853
ARPA: richard@calvin.spp.cornell.edu	
UUCP: {uunet,uw-beaver,rochester,cmcl2}!cornell!calvin!richard