[comp.os.msdos.programmer] Long command lines

bright@Data-IO.COM (Walter Bright) (08/23/90)

I've been looking at ways that a parent program can spawn a child program
with command lines that are > 127 bytes long (the max for DOS and OS/2
(why didn't OS/2 fix this?)). The most obvious approach is to pass the
long line via an environment variable.

I've heard that some MAKE programs do this.

What name do they use for the environment variable? Is there any sort
of de facto standard for this? What if both the environment variable
and a command line exist, does the environment variable come 'before'
or 'after' the command line?

If there is no commonly used environment variable, how about defining one?
I propose "CMDLINE".

(Interestingly, MSC6 CL passes long command lines to the compiler pass
C1 via the environment variable "_MSC_CMD_FLAGS".)

a269@mindlink.UUCP (Mischa Sandberg) (08/27/90)

4DOS does exactly that. 4DOS is a replacement for COMMAND.COM
that seriously improves on MS's command shell. It can only put
the first 127 bytes into the PSP, but up to 256 bytes go into
the environment, coincidentally into a variable named "CMDLINE".
Why they limit it to 256 bytes I haven't the foggiest. 4DOS
is supplied by JP Software of Arlington, MA. It is cheap, reliable
and innovative. A sample copy is on many BBSs and CompuServe.

ucbked@athena.berkeley.edu (Earl H. Kinmonth) (08/29/90)

In article <2954@mindlink.UUCP> a269@mindlink.UUCP (Mischa Sandberg) writes:

>that seriously improves on MS's command shell. It can only put
---------|
Valley speak?

Perhaps "seriously" means something different in your region.  I'm not
a math major (I teach Japanese).  256 is twice 128, what DOS usually
permits.  The MKS korn shell permits 5120 chars, 40 times the usual DOS
limit.  In the mathematics I learned while an EE major, 40 is larger
than 2 by a factor of 20.  Not being an authority on "valley speak," I
don't know the alphanumeric association between "seriously," "awesome,"
"humongous," etc.

>Why they limit it to 256 bytes I haven't the foggiest. 4DOS

Probably because the twit who wrote the programme never needed more
than 256 bytes for anything he/she did, and because he/she never did
anything more than compile the routine he/she was working on.

Thought for the day:

Programming is too important to leave to programmers.

Earl H. Kinmonth

History Department          Centre for Japanese Studies
Univ. of California         Univ. of Sheffield
Davis, California 95616     Sheffield, England S10 2TN

ucbked@athena.berkeley.edu

bright@Data-IO.COM (Walter Bright) (08/30/90)

In article <2954@mindlink.UUCP> a269@mindlink.UUCP (Mischa Sandberg) writes:
<4DOS does exactly that. 4DOS is a replacement for COMMAND.COM
<that seriously improves on MS's command shell. It can only put
<the first 127 bytes into the PSP, but up to 256 bytes go into
<the environment, coincidentally into a variable named "CMDLINE".
<Why they limit it to 256 bytes I haven't the foggiest.

Limiting it to 256 bytes only defers the problem for a little while...
A better limit would be 64k.

friedman@apple-gunkies.ai.mit.edu (Noah Friedman) (09/01/90)

In article <2677@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>In article <2954@mindlink.UUCP> a269@mindlink.UUCP (Mischa Sandberg) writes:
><4DOS does exactly that. 4DOS is a replacement for COMMAND.COM
><that seriously improves on MS's command shell. It can only put
><the first 127 bytes into the PSP, but up to 256 bytes go into
><the environment, coincidentally into a variable named "CMDLINE".
><Why they limit it to 256 bytes I haven't the foggiest.
>
>Limiting it to 256 bytes only defers the problem for a little while...
>A better limit would be 64k.

Among other problems, when the command string is stored in the PSP,
the first byte is used to store the length. I guess this value must be
treated as signed, because 8 bits allows for a maximum (unsigned)
value of 255.
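
For reference, the command tail lives in the 128 bytes at PSP offsets 80h-FFh:
a count byte at 80h, then the text from 81h, normally ended by a carriage
return (0Dh). That fixed area, rather than the width of the count byte, is
what bounds the tail. A minimal sketch in C of copying the tail out, assuming
the caller has already fetched those 128 bytes into an ordinary buffer (the
name psp_tail_copy is made up for illustration, not a DOS or compiler API):

```c
#include <stddef.h>

#define PSP_TAIL_AREA 128   /* bytes at PSP offsets 80h..FFh */

/* Copy the command tail out of a 128-byte image of PSP:80h.
 * area[0] is the count byte; the text starts at area[1] and is
 * normally terminated by a carriage return (0x0D).
 * Returns the number of characters copied (excluding the NUL). */
size_t psp_tail_copy(const unsigned char *area, char *out, size_t outsz)
{
    size_t count = area[0];
    size_t i;

    if (outsz == 0)
        return 0;
    /* The count byte could claim more than the area can hold. */
    if (count > PSP_TAIL_AREA - 1)
        count = PSP_TAIL_AREA - 1;

    for (i = 0; i < count && i + 1 < outsz; i++) {
        if (area[1 + i] == '\r')    /* stop at the terminator */
            break;
        out[i] = (char)area[1 + i];
    }
    out[i] = '\0';
    return i;
}
```

The clamp and the check for the carriage return are defensive choices of this
sketch, so that a bogus count byte cannot run the copy past the area.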

I started to write an MS-DOS compatible system, but I found that there
were so many bugs and stupid designs that I couldn't in good
conscience reproduce them. I hope to design a filesystem that allows
filenames of arbitrary length and characters (except for '/' and ASCII
0) and command-line length of a similar nature. I think I can even
improve on DOS's process management. 

Of course, this doesn't help the average DOS user, since my OS
wouldn't be able to run DOS programs.

I don't know of any way to get a commercial program to read a command
line greater than 128 characters, because they get this list from the
PSP. In your own programs, you might try using the two FCB blocks in
the PSP if they aren't being used for some other purpose (I don't
think DOS uses them anymore) to store arguments and retrieve them. Of
course, this also involves writing your own command shell to PUT those
arguments there, since command.com won't.

---
Noah Friedman
friedman@ai.mit.edu

phys169@canterbury.ac.nz (09/20/90)

In article <7192@hydra.Helsinki.FI>, kankkune@cs.Helsinki.FI (Risto Kankkunen) writes:
> In article <26da5df5@ralf> Ralf.Brown@B.GP.CS.CMU.EDU writes:
>>In article <FooBarataio.Data-IO.COM>, bright@dataio.UUCP wrote:
>>}[passing long command lines in an env var]
>>}What name do they use for the environment variable? Is there any sort
>>}of de facto standard for this? ...If there is no commonly used environment
>>}variable, how about defining one, I propose "CMDLINE".

There can be a problem using the environment to store long command lines with
conventional DOS: the environment space is small by default, and if you
increase it in config.sys you have to choose a number large enough for any long
command line you might need in the future, and this wastes space each time a
program is nested. 

>>}What if both the environment variable
>>}and a command line exist, does the environment variable come 'before'
>>}or 'after' the command line?
My software generally uses the conventions:
(a) if there is an environment variable with the same name as the program, it
inserts that string *before* the given command line in the PSP, assuming it
contains options (which traditionally go before other parameters, but in fact I
allow them anywhere).
(b) if the command line (or, for that matter, the environment string mentioned
in (a) above) contains " @" followed by a filename, the contents of that
file are inserted in the command line (where the "@" and filename were).
Although this potentially allows for *very* long command lines (I'm used to DG
RDOS & B.Basic with kilobytes worth of command parameters!), all my programs at
the moment are limited to a total of 255 bytes. Still, it's an easy way to get
around the fact that people may use the software from any version of DOS (2+)
and any environment allocation, and any command interpreter.
(c) I'm happy to look for an environment variable called CMDLINE; I suggest its
contents should go *after* whatever is in the command line already.
>
> An environment variable is simple to use, but I think it would be
> cleaner to place the command line arguments after all the variables and
> the terminating double null in C/Unix-fashion. After all, there is
> already the program name (=argv[0]) there and the word preceding it
> contains 1, so it seems to be argc...
(d) I'd like to see such a method become a standard, but this would really
need replacing COMMAND.COM. The idea of placing them after the program name in
the environment would be good, so long as enough people (and alternative command
interpreters) get together and agree on it. From DOS 2 onwards, it is possible
for each program called to get the environment space it actually needs, but so
far the memory allocation system seems a bit poor, to say the least.

>>The 4DOS COMMAND.COM replacement uses CMDLINE to store the full command line..
I haven't played much with 4DOS, but this brings up the question of what we
(the programming community, I guess), could do to set standards for command
interpreters. I suggest that, in addition to programs using (a) to (c) above,
and checking for a command interpreter doing (d), we define a few extra
punctuation features for command interpreters while we're at it, so everyone
can plan for them. So I suggest that command interpreters...

(e) expand any [@filename] to include the contents of the file in the command.
Notice the [ and ] around it; it would pass any plain "@" unchanged, for
compatibility with anything that presently wants a "@" in the command line.
Possibly %@filename% would be more consistent, but I prefer to see [ and ]
because it is easier to spot mistakes than with %, and is better for nesting,
and is similar to DG's AOS operating system - not that I'm going out of my way
to copy others, but if someone's got a standard already, why invent a new one?

(f) expand any [@filename number] to include the given line number from the
given file, as above.

(g) expand any [!command] by executing the command after the [! and placing
whatever it writes to the standard output into the new command line's text.
This is pretty much compatible with DG AOS/VS.

(h) expand [?prompt] by asking the user for input.

(i) define a "command continuation" character, e.g. ",", so command lines can
carry on over one line of text, as per decent operating systems.

(j) improve the piping facilities a little, e.g. a tee facility and perhaps a
special highly-buffered device driver for temporary files, and ">>", etc.

> Does anyone know if DOS 5.0 will set any standards in this area?
I don't know about MSDOS 5, but DRDOS 5 allows @filename in most of its own
utilities, (in the same way that my utilities do, it seems). It also allows /h
for help (I would have liked /?, but at least it's something).

Mark Aitchison, Physics, University of Canterbury, New Zealand.

ralf@b.gp.cs.cmu.edu (Ralf Brown) (09/20/90)

In article <1990Sep20.102811.9190@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
}There can be a problem using the environment to store long command lines with
}conventional DOS: the environment space is small by default, and if you
}increase it in config.sys you have to choose a number large enough for any long
}command line you might need in the future, and this wastes space each time a
}program is nested. 

Not at all.  The only full-size environment is the master environment kept
by the shell.  Whenever a program is EXECed, the copy it gets has exactly
the used amount of memory allocated.  People have been bitten by that,
in fact....
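
What "exactly the used amount" comes to can be made concrete: it is the total
length of the NAME=value strings plus the zero byte that ends the list, so a
child finds no slack for adding variables of its own. A rough sketch in C,
assuming the block has been copied into ordinary memory (env_block_used is an
illustrative name, and the word and program name that DOS 3.0 and later append
after the list are not counted here):

```c
#include <stddef.h>
#include <string.h>

/* Number of bytes occupied by the strings of a DOS-style environment
 * block, including the extra zero byte that ends the list.
 * Layout assumed: "NAME=value\0" "NAME=value\0" ... "\0"            */
size_t env_block_used(const char *block)
{
    const char *p = block;

    while (*p != '\0')
        p += strlen(p) + 1;     /* skip one "NAME=value" and its NUL */
    return (size_t)(p - block) + 1;
}
```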

}> cleaner to place the command line arguments after all the variables and
}> the terminating double null in C/Unix-fashion. After all, there is
}> already the program name (=argv[0]) there and the word preceeding it
}> contains 1, so it seems to be argc...
}(d) I'd like to see such a method become a standard, but this would really
}need replacing COMMAND.COM. The idea of placing them after the program name in
}the environemt would be good, so long as enough people (and alternative command
}interpreters) get together and agree on it. 

Actually, the place it would need to be implemented is in the EXEC DOS call,
since that is what builds the environment copy and adds the program name.
We would need all programs that call EXEC and want to pass a long command to
use a new subfunction that lets the caller specify a long command line
(wouldn't want to break the thousands of programs that use the existing EXEC
subfunctions...).  For backward compatibility, the PSP of the EXECed program
would still contain the first 126 bytes of the command tail.

}(j) improve the piping facilities a little, e.g. a tee facility and perhaps a
}special highly-buffered device driver for temporary files, and ">>", etc.

4DOS already has TEE and Y built in, but TEE is not nearly as useful under
MSDOS as it is under Unix (where you can use it to see the progress of a
long-running pipe).  Being single-tasking, all output from the previous pipe
stages will be accumulated and displayed at once with TEE.  For those not
using 4DOS, there are any number of TEE programs (trivial to write).
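
As an illustration of "trivial to write", the core of a TEE filter in portable
C might look like the sketch below. The name tee_copy is made up here, and a
real utility would add argument handling and open the output file itself:

```c
#include <stdio.h>

/* Copy everything from in to both out1 and out2.
 * Returns 0 on success, -1 on a read or write error. */
int tee_copy(FILE *in, FILE *out1, FILE *out2)
{
    char buf[512];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out1) != n || fwrite(buf, 1, n, out2) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

A filter built on this would call tee_copy(stdin, stdout, fp) with fp opened
on the file named on its command line.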

4DOS also has >&, >>&, >!, >>!, >&!, and >>&!.  It isn't DOS that limits
the redirection capabilities, it's COMMAND.COM....
-- 
{backbone}!cs.cmu.edu!ralf  ARPA: RALF@CS.CMU.EDU   FIDO: Ralf Brown 1:129/3.1
BITnet: RALF%CS.CMU.EDU@CMUCCVMA   AT&Tnet: (412)268-3053 (school)   FAX: ask
DISCLAIMER?  Did  | Everything is funny as long as it is happening to
I claim something?| someone else.  --Will Rogers

phys169@canterbury.ac.nz (09/21/90)

In article <10522@pt.cs.cmu.edu>, ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
> In article <1990Sep20.102811.9190@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
> }(d) I'd like to see such a method become a standard, but this would really
> }need replacing COMMAND.COM. The idea of placing them after the program name in
> }the environment would be good, so long as enough people (and alternative command
> }interpreters) get together and agree on it. 
> 
> Actually, the place it would need to be implemented is in the EXEC DOS call,
> since that is what builds the environment copy and adds the program name.

But a command processor could set up the environment with the parameters there,
without waiting for a new DOS, or requiring users to upgrade their whole o/s.
True, it would be nicer to get the DOS call to do some of the work, but I'm still
correct in saying that it is really essential for the command interpreters to do
their bit. I feel it is easier for "other" command interpreters (like 4DOS) to
lead the way, rather than to campaign for Microsoft to change (they seem to add
stuff only a long time after it has been available elsewhere).

In the meantime, people probably want to write programs that can receive long
command lines, and I suspect the answer for the moment is:

(a) if an environment variable called "CMDLINE" has been defined, then use that
(after anything in the PSP, perhaps), else
(b) if the number of parameters after the normal environment variables is
greater than 1 (and the version of the o/s is 3 or later), then get the
parameters from there instead of the command tail in the PSP, and
(c) if the particular program has some extra convention for getting more
information from the environment, stuff that in now (e.g. my LOCATE program
looks at the environment variable "LOCATE" for such things as the default list
of directories to search), and lastly
(d) if any parameters start with an "@", and your program has no special reason
for accepting parameters starting with an "@", expand the filename following
the "@" (if possible) by including the contents of that file in the command
line at that point.
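
Step (a) of this recipe can be sketched in a few lines of C. This is only one
reading of it: the PSP tail comes first and CMDLINE is appended after it, which
suits a parent that splits the arguments between the two. A shell that stores
the whole line in CMDLINE would instead call for using the variable alone.
The name effective_cmdline is made up for illustration; the caller would pass
getenv("CMDLINE") as the second argument:

```c
#include <stdio.h>
#include <stddef.h>

/* Build the effective command line from the PSP tail and, when it is
 * present, the CMDLINE variable appended after it.
 * cmdline_env may be NULL (variable not set).
 * Returns 0 on success, -1 if the result would not fit in out. */
int effective_cmdline(const char *psp_tail, const char *cmdline_env,
                      char *out, size_t outsz)
{
    int n;

    if (cmdline_env == NULL || *cmdline_env == '\0')
        n = snprintf(out, outsz, "%s", psp_tail);
    else
        n = snprintf(out, outsz, "%s %s", psp_tail, cmdline_env);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```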

> 4DOS also has >&, >>&, >!, >>!, >&!, and >>&!.  It isn't DOS that limits
> the redirection capabilities, it's COMMAND.COM....
Yep, COMMAND.COM is pretty pathetic really, and it is what many people judge
DOS by; DR's version is a bit better, and 4DOS significantly better still. Even
with gooey windows and possibly voice-input user interfaces, there is, and will
remain, the need for improved communication between user and command interface,
as well as command interface and  application programs. If Microsoft themselves
aren't leading a push for improvements, can enough people band together to
establish standards? (I hope so, but I'm open to comments).

Mark Aitchison, U of Canty, N.Z.

ralf@b.gp.cs.cmu.edu (Ralf Brown) (09/21/90)

In article <1990Sep21.104931.9205@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
}In article <10522@pt.cs.cmu.edu>, ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
}> In article <1990Sep20.102811.9190@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
}> }(d) I'd like to see such a method become a standard, but this would really
}> }need replacing COMMAND.COM. The idea of placing them after the program name in
}> }the environment would be good, so long as enough people (and alternative command
}> }interpreters) get together and agree on it. 
}> 
}>Actually, the place it would need to be implemented is in the EXEC DOS call,
}>since that is what builds the environment copy and adds the program name.
}
}But a command processor could set up the environment with the parameters there,
}without waiting for a new DOS, or requiring users to upgrade their whole o/s.
}True, it would be nicer to get the DOS call to do some of the work, but I'm still
}correct in saying that it is really essential for the command interpreters to
}do their bit.

Sorry, that won't work, since anything after the environment (such as extra
parameters) is *NOT* copied when making a copy of the environment for the
EXECed program.  That's why the DOS call has to add the extra parameters as
it now adds the program name, if you want to pass a long commandline after
the program name.

bright@Data-IO.COM (Walter Bright) (09/22/90)

In article <1990Sep21.104931.9205@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
<In the meantime, people probably want to write programs that can receive long
<command lines, and I suspect the answer for the moment is:
<(d) if any parameters start with an "@", and your program has no special reason
<for accepting parameters starting with an "@", expand the filename following
<the "@" (if possible) by including the contents of that file in the command
<line at that point.

After going around and around on this, I came up with what I feel is a
fine solution:

A parameter starting with @ is taken to be a 'response file'. The response
file is read in and inserted into the command line replacing the @filename
parameter. So far, this is pretty standard. I extended the concept so that
filename was first searched for in the environment, and then looked for
on the disk. This avoids the inefficiency of writing and reading files
when spawning a program. There is no need to reserve a special environment
variable name.
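
On the receiving side, the lookup order described here (environment first,
then disk) might be sketched in C as follows. This is not Zortech's code: the
function names are made up for illustration, and reading the whole file into
one string is a simplification of this sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of s, or NULL if memory runs out. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Expand one argument: "@NAME" is looked up in the environment first
 * and only then opened as a file on disk. Any other argument is
 * returned as a plain copy.
 * The caller frees the result; NULL means not found or out of memory. */
char *expand_response_arg(const char *arg)
{
    const char *value;
    char *buf;
    FILE *fp;
    long size;
    size_t got;

    if (arg[0] != '@')
        return copy_string(arg);

    value = getenv(arg + 1);            /* 1. environment variable */
    if (value != NULL)
        return copy_string(value);

    fp = fopen(arg + 1, "rb");          /* 2. response file on disk */
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    buf = malloc((size_t)size + 1);
    if (buf != NULL) {
        got = fread(buf, 1, (size_t)size, fp);
        buf[got] = '\0';
    }
    fclose(fp);
    return buf;
}
```

A caller would run each element of argv through expand_response_arg() and
then split the returned text into arguments with its usual parser.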

putenv() is used to set the environment variable, so there is no problem,
the command line can be up to 64k (!) long.

I have implemented this in all of Zortech's utilities, including MAKE, and
it appears to neatly solve this problem. Especially with MAKE, now the
command lines can be arbitrarily long. MAKE 'knows' that Zortech programs
can handle this. MAKE can be informed about other programs that can handle
passing the command line in the environment by a special flag on the rule
line (a '*').

This improvement will appear in Zortech's next release.

I propose this be a standard because:
1. It is a rather trivial extension to the @filename response file concept.
2. It works with all versions of DOS and OS/2.
3. No special environment variable name needs to be reserved.
4. No cleverness with undocumented environment manipulations is necessary.
5. No decision about the positioning of the long command line relative
   to the PSP command line is necessary.
6. Since most compilers support using putenv() in conjunction with spawn(),
   there shouldn't be any problem supporting this.
7. One problem with response files is avoiding file name collisions when
   using a network or multitasking software. This problem doesn't exist with
   the environment variable approach, because each task has its own copy
   of the environment.
8. It's much faster than reading and writing disk files.

The only fault with this scheme is that it doesn't enable you to type in
a long command line at the DOS prompt. I think Microsoft will have to fix
that one.

kankkune@cs.Helsinki.FI (Risto Kankkunen) (09/22/90)

In article <10535@pt.cs.cmu.edu> ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
>In article <1990Sep21.104931.9205@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
>}In article <10522@pt.cs.cmu.edu>, ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
>}>Actually, the place it would need to be implemented is in the EXEC DOS call,
>}>since that is what builds the environment copy and adds the program name.
>}
>}But a command processor could set up the environment with the parameters there
>}without waiting for a new DOS, or requiring users to upgrade their whole o/s.
>}True, it would be nicer to get the DOS call to do some of the work, but I'm still
>}correct in saying that it is really essential for the command interpreters to
>}do their bit.
>
>Sorry, that won't work, since anything after the environment (such as extra
>parameters) is *NOT* copied when making a copy of the environment for the
>EXECed program.  That's why the DOS call has to add the extra parameters as
>it now adds the program name, if you want to pass a long commandline after
>the program name.

Yes, you're right. When I made the original suggestion I had not tested
the method, only read the interrupt list. I got the impression that if
the execing program builds the environment itself (instead of passing 0
as environment segment), DOS would not touch the segment. After your
comment I made a test program and found out that this is not the case.

I then noticed that there is the (undocumented) call 4b01, that only
loads a program. I think it would be possible to pass a long command
line using this call. First load the program, letting DOS stuff the
program name at the end of the environment. Then find this position of
the environment and add the command line parameters after that. It might
be necessary to pad the environment before the call so there is enough
room for the parameters. Finally, transfer control to the child program.

I tried to do this in a small test program, and it seemed it could be
done. The only problem I had was how to call the child program. What is
needed to pass control to it? I tried a JMP or CALL to the initial
CS:IP, setting INT 22 to point to the parent's return address, but this
didn't quite work. Any help?
-- 
 Risto Kankkunen                   kankkune@cs.Helsinki.FI (Internet)
 Department of Computer Science    rkankkunen@finuh         (Bitnet)
 University of Helsinki, Finland   ..!mcvax!uhecs!kankkune   (UUCP)

Ralf.Brown@B.GP.CS.CMU.EDU (09/22/90)

In article <2728@dataio.Data-IO.COM>, bright@Data-IO.COM (Walter Bright) wrote:
}After going around and around on this, I came up with what I feel is a
}fine solution:
}
}A parameter starting with @ is taken to be a 'response file'. The response
}file is read in and inserted into the command line replacing the @filename
}parameter. So far, this is pretty standard. I extended the concept so that
}filename was first searched for in the environment, and then looked for
}on the disk. This avoids the inefficiency of writing and reading files
}when spawning a program. There is no need to reserve a special environment
}variable name.

What if you want to use an actual response file that just happens to
have the same name as an environment variable needed for some other
program?

}putenv() is used to set the environment variable, so there is no problem,
}the command line can be up to 64k (!) long.

Minor nit: the environment may only be 32K long, so the practical limit on
command lines will be around 31K due to other environment strings.
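
The arithmetic behind that nit, as a rough sketch in C. The 32K figure is the
one quoted above; the helper name and the idea of tracking the bytes already
used are assumptions for illustration, and the word and program path that DOS
appends after the variables are ignored:

```c
#include <stddef.h>
#include <string.h>

#define ENV_LIMIT 32768u    /* maximum environment size cited above */

/* Would "NAME=value" still fit if the block already uses `used` bytes
 * (counting the zero byte that ends the list)?
 * Returns 1 if it fits, 0 if it does not. */
int env_entry_fits(size_t used, const char *name, const char *value)
{
    /* name + '=' + value + terminating NUL */
    size_t need = strlen(name) + 1 + strlen(value) + 1;

    return used <= ENV_LIMIT && need <= ENV_LIMIT - used;
}
```

With about 1K of other variables already set, this leaves roughly 31K for the
value, which matches the estimate above.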

--
UUCP: {ucbvax,harvard}!cs.cmu.edu!ralf -=- 412-268-3053 (school) -=- FAX: ask
ARPA: ralf@cs.cmu.edu  BIT: ralf%cs.cmu.edu@CMUCCVMA  FIDO: 1:129/3.1
Disclaimer?    |   I was gratified to be able to answer promptly, and I did.
What's that?   |   I said I didn't know.  --Mark Twain

otto@tukki.jyu.fi (Otto J. Makela) (09/23/90)

In article <1990Sep21.104931.9205@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
[...]
   (d) if any parameters start with an "@", and your program has no special
   reason for accepting parameters starting with an "@", expand the filename
   following the "@" (if possible) by including the contents of that file in
   the command line at that point.

AAARRGGGH!  Exactly what we don't need is making a new set of filenames (that
is, the filenames starting with '@') unusable.  The filename space is small
enough in MS-DOS as it is (yes, I know, many programs do already use '@' as a
include-data-from-this-file character -- it is a BAD choice nonetheless).
I hate it when I have to type ".\@files" just to get at a file which happens
to have this name.  Some MS-DOS utilities are even "smart" enough to BREAK
the line after the ".\", try to include "@files" at the command line and THEN
COMPLAIN ABOUT THE LEADING ".\" BEING SYNTACTICALLY INCORRECT !
Why not use a character like '+' or some such which is already illegal in
MeSsy-DOS filenames in the first place ?
--
* * * Otto J. Makela <otto@jyu.fi> * * * * * * * * * * * * * * * * * * * * *
* Phone: +358 41 613 847, BBS: +358 41 211 562 (CCITT, Bell 2400/1200/300) *
* Mail: Kauppakatu 1 B 18, SF-40100 Jyvaskyla, Finland, EUROPE             *
* * * Computers Rule 01001111 01001011 * * * * * * * * * * * * * * * * * * *

kankkune@cs.Helsinki.FI (Risto Kankkunen) (09/24/90)

In article <2728@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>A parameter starting with @ is taken to be a 'response file'. The response
>file is read in and inserted into the command line replacing the @filename
>parameter. So far, this is pretty standard. I extended the concept so that
>filename was first searched for in the environment, and then looked for
>on the disk. This avoids the inefficiency of writing and reading files
>when spawning a program. There is no need to reserve a special environment
>variable name.

I don't see this as the final solution, because it is a purely
syntactic, command line convention. What if you use sh, csh, ksh or
some command interpreter other than COMMAND.COM? Those shells may
already have some special meaning for the @ character. Or what if you have
a program that wants an @ character as one of its parameters? Another
problem is that the command line is passed as a single string instead
of separate arguments.

Passing the arguments at the end of the area for environment variables
would solve these problems: It would not put any character into a
special role, and thus arbitrary strings could be passed to programs.
And because the command line would be broken up into arguments, the
program wouldn't have to parse it (and try to guess what command shell
the user has to do it right).
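
A sketch of how a program (or its startup code) might read arguments stored
this way. The layout is hypothetical: it assumes the 16-bit word after the
variables holds the number of strings that follow, with the program name
first. That extends what DOS 3.0 and later actually store there, which is a
word (normally 1) and the program path. The name env_block_args is made up
for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, following the proposal above:
 *   "NAME=value\0" ... "\0" <16-bit count> "argv0\0" "argv1\0" ...
 * Stores up to max pointers into argv and returns how many were stored.
 * Assumes the block holds at least one variable. */
size_t env_block_args(const unsigned char *block,
                      const char **argv, size_t max)
{
    const unsigned char *p = block;
    size_t count, i;

    while (*p != '\0')                      /* skip the variables */
        p += strlen((const char *)p) + 1;
    p++;                                    /* zero byte ending the list */

    count = p[0] | ((size_t)p[1] << 8);     /* little-endian word */
    p += 2;
    if (count > max)
        count = max;

    for (i = 0; i < count; i++) {
        argv[i] = (const char *)p;          /* already separated: no parsing */
        p += strlen((const char *)p) + 1;
    }
    return count;
}
```

Because each argument is its own zero-terminated string, one that contains
spaces or starts with "@" arrives unchanged.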

I think that if we can put the arguments at the end of the environment
with a little hackery, we should start using that. Otherwise we should
get Microsoft to add a new version of exec call that can do it.

kankkune@cs.Helsinki.FI (Risto Kankkunen) (09/25/90)

In article <1990Sep24.110816.9221@canterbury.ac.nz> phys169@canterbury.ac.nz writes:
>In article <7214@hydra.Helsinki.FI>, kankkune@cs.Helsinki.FI (Risto Kankkunen) writes:
>> In article <2728@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>>>A parameter starting with @ is taken to be a 'response file'. The response
>>>file is read in and inserted into the command line replacing the @filename...
>>
>> I don't see this as the final solution, because it is a purely
>> syntactic, command line convention.
>> 
>Quite right, it shouldn't be a "final solution", but it isn't all that bad for
>the moment, either.

The response file method is quite good and flexible, I agree. But I'd
like to see some effort put, for example, into the environment segment
method, so we could solve this long command line problem for good.
Otherwise, we end up having lots of unnecessary conventions that every
programmer must support for backward compatibility.

So let's first see if the environment method can be implemented (or
does someone see any problems with this method, or know a better way
to go?). If it can't be done without a new system call, then we may
have to use response files etc. while waiting for Microsoft to react (and
that might take a long time...).

>Ultimately, it looks like DOS
>will want to use the special area after the environment. I believe command
>interpreters could set this up now, even without waiting for DOS to do it, but
>with some large degree of mess (loading a program in "by hand", and all that).

I don't think it requires any funky stuff to get working. And you don't
have to load the program by hand, because there is this "undocumented"
call to do it. And this is the call by which DOS DEBUG, and I think some
other debuggers, load programs to debug. So it isn't some obscure
internal function that might disappear in the next release of DOS.

>What is important is what goes in on the application program side of things.
>
>> And because the command line would be broken up into arguments, the
>> program wouldn't have to parse it (and try to guess what command shell
>> the user has to do it right).
>
>But the parsing is going to make assumptions, e.g. some people consider that
>PRINT file1,file2,file3 file4 has 2 parameters after the "PRINT" (one is the
>concatenation of files 1,2 & 3, which would be printed as one file, then file4);
>others would expect 4 separate parameters. Some programs treat what others
>consider to be a delimiter as special, and need to know it.

I think most of the programs are written in high level languages, like
C and Pascal. In these languages you don't access the command line
directly. Instead, the startup code parses it and separates it into
parameters. You access these via the argv array in C and ParamStr in
TP.

So, every program written in a high level language parses the command
lines, and makes lots of unnecessary assumptions, before the program
gets them. If the command line is passed in parameters, it is the shell,
and ultimately the user that makes the assumptions and decisions.

For example, in your PRINT example above, you could quote, or some
other way indicate, that the comma separated list is one parameter, if
your shell normally splits parameters at commas. However, currently the
startup code always parses the line, and you couldn't even write a PRINT
program like the one above if your compiler treats commas as parameter
delimiters (ok, you can write your own startup code, but that's not the
point).
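
The PRINT example is easy to reproduce. In the C sketch below the same line
yields 2 or 4 parameters depending only on which characters the splitter
treats as delimiters. The name count_params is made up for illustration and
does not correspond to any compiler's startup code:

```c
#include <stddef.h>
#include <string.h>

/* Count the parameters in line when any character in delims
 * separates them; a run of delimiters counts as one separator. */
size_t count_params(const char *line, const char *delims)
{
    size_t n = 0;

    while (*line != '\0') {
        line += strspn(line, delims);       /* skip separators */
        if (*line == '\0')
            break;
        n++;
        line += strcspn(line, delims);      /* skip the parameter */
    }
    return n;
}
```

count_params("file1,file2,file3 file4", " ") gives 2, while the same call
with " ," as the delimiter set gives 4.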

>Hopefully, the start of the un-processed command line would still be there, in 
>the PSP, but somebody (everybody?) has to work out what you can expect a shell 
>to do and not do first, and allow time for program writers to adjust to that. At
>the moment, the majority of programs for PC's at least are written under the
>assumption that they will be called from COMMAND.COM, not some "foreign" shell.

The unprocessed line would of course be there, so old programs won't
stop working. As for new programs, I don't see any problems. In normal
high level languages any modifications shouldn't be needed. The only
difference would be that they now could receive more and longer
parameters.

>I realise the information content is dropping a bit as personal opinions take 
>over, so feel free to e-mail me with further comments, flames, whatever.

No need to flame anybody. I prefer to post, so that we can discuss this
together and hear what everyone thinks about this issue. I thought this
topic would draw many more opinions and I hoped to get some technical
help from you gurus of DOS innards. Maybe it is just that everyone is
happy with their DOS... 8-}

I'm sorry, but it seems I can't write short follow-ups...

jrwsnsr@nmt.edu (Jonathan R. Watts) (09/27/90)

From article <7658@hydra.Helsinki.FI>, by kankkune@cs.Helsinki.FI (Risto Kankkunen):
> I think most of the programs are written in high level languages, like
> C and Pascal. In these languages you don't access the command line
> directly. Instead, the startup code parses it and separates to
> parameters. You access these via the argv array in C and ParamStr in
> TP.
> 
> So, every program written in high level language parses the command
> lines, and makes lots of unnecessary assumptions, before the program
> gets them. If the command line is passed in parameters, it is the shell,
> and ultimately the user that makes the assumptions and decisions.

I use Turbo Pascal 5.5, and while it does have ParamStr to return your
command line arguments, I decided to write my own command-line parser
instead.  This way, I can parse the command-line in whatever fashion I
like.  For example, at the moment, my parser will check if 4DOS is the
parent shell, and if so, it will use the environment variable CMDLINE
instead of the passed command-line; the great thing about this is that
it is completely transparent...if the program wasn't loaded by 4DOS, it
just ignores the environment, even if a CMDLINE variable is present.
It works great!  I've picked up some good ideas to add to my parser from
this thread, too, such as the @<filename> expansion.  (BTW, if anyone wants
a copy of my parser, feel free to mail me; it's implemented as a TP 5.5
unit.)
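
The selection rule described above can be sketched as follows. This is only
an illustration of the logic, not the TP unit itself: the function name, the
example values and the parent_is_4dos flag are all made up here, and how the
check for 4DOS is actually performed is left outside the sketch.

```python
def pick_command_line(psp_tail, environ, parent_is_4dos):
    """Return the command line a program should parse.

    psp_tail       -- the short command tail the program received in the PSP
    environ        -- mapping of environment variable names to values
    parent_is_4dos -- result of a parent-shell check (assumed to be done
                      elsewhere; this sketch does not show how)
    """
    if parent_is_4dos and "CMDLINE" in environ:
        # Only trust CMDLINE when the shell that sets it launched us.
        return environ["CMDLINE"]
    # Any other parent: a CMDLINE variable may be stale or unrelated.
    return psp_tail


# Example values are invented for illustration.
env = {"CMDLINE": "/a /b a-very-long-argument", "PATH": "C:\\DOS"}
print(pick_command_line("/a /b a-very", env, parent_is_4dos=True))
print(pick_command_line("/x", env, parent_is_4dos=False))
```

The second call shows the "transparent" behaviour: with a different parent
shell the CMDLINE variable is ignored even though it is present.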
 
  - Jonathan Watts
 
jrwsnsr@jupiter.nmt.edu (Internet address)

kankkune@cs.Helsinki.FI (Risto Kankkunen) (10/01/90)

In article <1990Sep27.044522.8475@nmt.edu> jrwsnsr@nmt.edu (Jonathan R. Watts) writes:
>From article <7658@hydra.Helsinki.FI>, by kankkune@cs.Helsinki.FI (Risto Kankkunen):
>> So, every program written in high level language parses the command
>> lines, and makes lots of unnecessary assumptions, before the program
>> gets them. If the command line is passed in parameters, it is the shell,
>> and ultimately the user that makes the assumptions and decisions.
>
>I use Turbo Pascal 5.5, and while it does have ParamStr to return your
>command line arguments, I decided to write my own command-line parser
>instead.  This way, I can parse the command-line in whatever fashion I
>like.  For example, at the moment, my parser will check if 4DOS is the
>parent shell, and if so, it will use the environment variable CMDLINE
>instead of the passed command-line; the great thing about this is that
>it is completely transparent...if the program wasn't loaded by 4DOS, it
>just ignores the environment, even if a CMDLINE variable is present.

Yes, as I said, it is possible to bypass the command-line parser in
the run-time library, and in TP it is very easy to do so. But that
wasn't the point. The point was that currently it is the _program_ that
parses the line, whether it is done in the start-up code or in your own
routines. However, the program cannot know how to do this. To do it
right, your program would have to know every shell that will ever be
used under MS-DOS, and which one the user is currently running. Your
programs now support COMMAND.COM and 4DOS, but what if someone who is
using ms_sh wants to use your programs? So every time a new shell is
made, or a new syntax is added to an old one, you have to modify all
your programs. All this code to parse the command line in tens or
hundreds of programs also eats quite a lot of disk space and RAM.

Wouldn't it be nice, if the shell passed the command-line to your
program pre-parsed to ASCIIZ strings?
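
A minimal sketch of what "pre-parsed to ASCIIZ strings" could look like on
the wire. The layout used here (a 16-bit little-endian count followed by
that many NUL-terminated strings) is an assumption chosen for illustration,
and pack_args/unpack_args are hypothetical names; the thread does not fix a
format.

```python
import struct


def pack_args(args):
    """Encode a list of argument strings as a count word plus ASCIIZ strings."""
    block = struct.pack("<H", len(args))          # 16-bit little-endian count
    for arg in args:
        data = arg.encode("ascii")
        if b"\0" in data:
            raise ValueError("an ASCIIZ string cannot contain NUL")
        block += data + b"\0"                     # each argument ends in NUL
    return block


def unpack_args(block):
    """Decode a block produced by pack_args back into a list of strings."""
    (count,) = struct.unpack_from("<H", block, 0)
    args, pos = [], 2
    for _ in range(count):
        end = block.index(b"\0", pos)             # find this string's terminator
        args.append(block[pos:end].decode("ascii"))
        pos = end + 1
    return args


# Commas, blanks and "@" all survive, because nothing is re-parsed.
args = ["PRINT", "file1,file2,file3", "a file with spaces", "@literal"]
assert unpack_args(pack_args(args)) == args
```

The point of the round trip is that the receiving program never has to guess
at delimiters or quoting: whatever the shell decided is one argument arrives
as one string.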

-- 
 Risto Kankkunen                   kankkune@cs.Helsinki.FI (Internet)
 Department of Computer Science    rkankkunen@finuh         (Bitnet)
 University of Helsinki, Finland   ..!mcvax!uhecs!kankkune   (UUCP)

phys169@canterbury.ac.nz (09/24/12)

In article <7214@hydra.Helsinki.FI>, kankkune@cs.Helsinki.FI (Risto Kankkunen) writes:
> In article <2728@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>>A parameter starting with @ is taken to be a 'response file'. The response
>>file is read in and inserted into the command line replacing the @filename...
>
> I don't see this as the final solution, because it is a purely
> syntactic, command line convention. What if you use sh, csh, ksh or
> some other command interpreter than COMMAND.COM? Those shells may
> already have some special meaning for @-character. Or what if you have
> a program that wants @-character as one of its parameters? 
> 
Quite right, it shouldn't be a "final solution", but it isn't all that bad for
the moment, either. But it must be implemented in the individual programs - the
command interpreter itself shouldn't expand any "@filename" it finds, because
there are going to be programs that actually want a "@" for some good reason.
Any good shell worth its salt should have the ability to force through any
character (like "@") to the program, even though it would like to interpret it
in some special way. I guess the original thoughts behind the discussion
included what "standard" should a programmer try to adopt now. I think
individual programs should look for a variety of possibilities, the environment
"CMDLINE" seems reasonable (and even if you don't like it, you shouldn't use it
for anything else), the "@" in parameters seems good too, but some people like
to use "@" in filenames (see my comments below), and individual programs may
have their own ideas about gathering extra information from environment
variables and/or special configuration files. Ultimately, it looks like DOS
will want to use the special area after the environment. I believe command
interpreters could set this up now, even without waiting for DOS to do it, but
with some large degree of mess (loading a program in "by hand", and all that).
What is important is what goes in on the application program side of things.
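
As a rough sketch of the "@" convention handled on the program side, the
following expands response files in an argument list. The function name and
the choice to split the file on whitespace are assumptions for illustration;
real tools may treat quoting, comments or nesting differently.

```python
import os
import tempfile


def expand_response_files(args):
    """Replace each "@filename" argument with the whitespace-separated
    words read from that file; leave every other argument untouched."""
    expanded = []
    for arg in args:
        if arg.startswith("@") and len(arg) > 1:
            with open(arg[1:]) as response:
                expanded.extend(response.read().split())
        else:
            expanded.append(arg)      # a bare "@" is passed through as-is
    return expanded


# Usage: write a small response file, then expand a command line naming it.
fd, path = tempfile.mkstemp(suffix=".rsp")
with os.fdopen(fd, "w") as f:
    f.write("/c /Ox\nfoo.c bar.c\n")
print(expand_response_files(["/W3", "@" + path, "baz.c"]))
os.remove(path)
```

Note that this sketch gives the user no way to pass a literal argument that
begins with "@", which is exactly the objection raised earlier in the thread.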

Now, the question of "@" in filenames. Personally, I prefer to keep punctuation
out of filenames, mainly because I work in an environment where DOS and UNIX
and VMS have to co-exist, so taking filenames to their limit in one
environment can cause problems if the file ever has to be moved. Some operating
systems treat "@" as special (e.g. in AOS it implies a peripheral device, of
all things, and in many systems it means "the contents of a text file" rather
than the file itself). We shouldn't try to encourage all sorts of junk going
into filenames (i.e. filenames should be *names*, not !@&^$!?!! punctuation!).
However, I realise that these are just my opinions, and it is reasonable to be
able to override such restrictions - but it's also reasonable that anyone who
wants to do strange things with filenames should have to go to a little extra
effort in pushing them through the system.

The command shell itself probably does need a method of being told to take the
contents of a file and stuff it into a command line. My favourite method is to
avoid using a single character to do this, for compatibility reasons, but to
use a construct like: DELETE [@filename]  - i.e. both [ and @ are needed, so
few people can complain that they *really need* a filename starting with [@ in
their program. The actual method may change from shell to shell, but it should
generate the input to the program, under whatever o/s, that is appropriate for
the job, e.g. CMDLINE if it has to, or extra parameters after the environment,
or whatever.
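
To make the two-character idea concrete, here is a small sketch of how a
shell might expand a [@filename] construct before starting the program. The
regular expression, the function name and the read_file callback are all
hypothetical; a dictionary stands in for the disk so the example is
self-contained.

```python
import re

# "[@" followed by a name without blanks or "]", then "]".
RESPONSE_REF = re.compile(r"\[@([^\]\s]+)\]")


def expand_bracket_refs(line, read_file):
    """Replace each [@filename] in a command line with the file's words."""
    def substitute(match):
        # Join the file's words with single blanks so the result is one line.
        return " ".join(read_file(match.group(1)).split())
    return RESPONSE_REF.sub(substitute, line)


files = {"old.lst": "a.bak\nb.bak\nc.bak\n"}
print(expand_bracket_refs("DELETE [@old.lst] @keep.me", files.__getitem__))
# prints: DELETE a.bak b.bak c.bak @keep.me
```

A plain "@keep.me" is left alone, so programs that want a leading "@" in a
parameter still get it.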

> Passing the arguments at the end of the area for environment variables
> would solve these problems: It would not put any character into a
> special role, and thus arbitrary strings could be passed to programs.
> And because the command line would be broken up into arguments, the
> program wouldn't have to parse it (and try to guess what command shell
> the user has to do it right).
> 
But the parsing is going to make assumptions, e.g. some people consider that
PRINT file1,file2,file3 file4 has 2 parameters after the "PRINT" (one is the
concatenation of files 1, 2 & 3, which would be printed as one file, then
file4); others would expect 4 separate parameters. Some programs treat what
others consider to be a delimiter as special, and need to know it. I still like the
idea of pre-parsing the input, but there could be problems, simply because of
the gigantic lack of standardisation between and within operating systems!
Hopefully, the start of the un-processed command line would still be there, in 
the PSP, but somebody (everybody?) has to work out what you can expect a shell 
to do and not do first, and allow time for program writers to adjust to that. At
the moment, the majority of programs for PC's at least are written under the
assumption that they will be called from COMMAND.COM, not some "foreign" shell.
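
The PRINT ambiguity is easy to demonstrate. The two splitting functions
below are illustrative stand-ins for two different parsing conventions, not
the behaviour of any particular shell or compiler startup code.

```python
import re


def split_on_blanks(tail):
    """Treat only whitespace as a parameter delimiter."""
    return tail.split()


def split_on_blanks_and_commas(tail):
    """Treat whitespace and commas alike as parameter delimiters."""
    return [word for word in re.split(r"[\s,]+", tail) if word]


tail = "file1,file2,file3 file4"
print(split_on_blanks(tail))             # ['file1,file2,file3', 'file4']
print(split_on_blanks_and_commas(tail))  # ['file1', 'file2', 'file3', 'file4']
```

The same command tail yields 2 parameters under one convention and 4 under
the other, so whoever does the parsing has to know which one the program
expects.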

I imagine it is possible to come up with a small list of what a program can
expect its shell to pass to it, without restricting shell writers too much. It
may involve caveats like "this shell requires such lines to be prefixed by
(some obscure character)" or "that shell will handle such things by default".
Such a set of guidelines might involve suggestions on characters that are good
to put in filenames (consider a teaching environment where one o/s lets you put
anything into a filename at one stage, but prohibits you from deleting or
renaming it at another - it's all very well saying that an o/s should let you do
anything, but when a bunch of learners get fouled up because of that [partial]
freedom, it isn't much fun). So I don't think it is too much of a restriction
or infringement of civil liberties to say "don't make filenames with "@" at the
start, unless you (go through some strange Druid ritual) and are prepared to
handle problems down the line".

I realise the information content is dropping a bit as personal opinions take 
over, so feel free to e-mail me with further comments, flames, whatever. In
particular, I'd be interested to hear from anyone for whom a "@" (or even "[@"
sequence) at the start of a parameter is essential, and any serious problems
with any of the suggestions so far (or new ideas). 

Mark Aitchison, University of Canterbury, Gnu Zealand.