[comp.lang.fortran] Command line arguments?

eesnyder@boulder.Colorado.EDU (Eric E. Snyder) (05/30/91)

I need to write a program that takes arguments from the
command line and passes them to variables within the program.

eg:

	a.out arg1 arg2 ....

This must be amazingly simple but I can't find anything in
my SGI Iris 4D f77 manual.  What do you call this and where
can I find out about it?

Disclaimer:

	Be nice-- I'm just a biologist:-)

---------------------------------------------------------------------------
TTGATTGCTAAACACTGGGCGGCGAATCAGGGTTGGGATCTGAACAAAGACGGTCAGATTCAGTTCGTACTGCTG
Eric E. Snyder                            
Department of MCD Biology              ...making feet for childrens' shoes.
University of Colorado, Boulder   
Boulder, Colorado 80309-0347
LeuIleAlaLysHisTrpAlaAlaAsnGlnGlyTrpAspLeuAsnLysAspGlyGlnIleGlnPheValLeuLeu
---------------------------------------------------------------------------

jlg@cochiti.lanl.gov (Jim Giles) (05/30/91)

In article <eesnyder.675543629@beagle>, eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
|> I need to write a program that takes arguments from the
|> command line and passes them to variables within the program.
|> 
|> eg:
|> 
|> 	a.out arg1 arg2 ....

Yeah.  I've always wanted to be able to do this on UNIX (with _any_
language).  Unfortunately, the shell trashes the command line before
I can have a look at it.  I can turn that unsavory behaviour off - 
which is fine for me, but I can't be sure all the users of my software
will turn it off.  So, I have to do without a command line and deal
with the junk the shell gives me instead.

J. Giles

cochran@spam.rtp.dg.com (Dave Cochran) (05/30/91)

In article <24632@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
|> In article <eesnyder.675543629@beagle>, eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
|> |> I need to write a program that takes arguments from the
|> |> command line and passes them to variables within the program.
|> |> 
|> |> eg:
|> |> 
|> |> 	a.out arg1 arg2 ....
|> 
|> Yeah.  I've always wanted to be able to do this on UNIX (with _any_
|> language).  Unfortunately, the shell trashes the command line before
|> I can have a look at it.  I can turn that unsavory behaviour off - 
|> which is fine for me, but I can't be sure all the users of my software
|> will turn it off.  So, I have to do without a command line and deal
|> with the junk the shell gives me instead.

The shell trashes the command line?  What about argc and argv, which are 
pretty standard things in C?  Green Hills F77 also gives you a subroutine and
a function called getarg (corresponding to argv) and iargc (corresponding to
argc) that are used all the time to get the command line that executed the
F77 program in the first place.

-- 
+------------------------------------------------------+
|Dave Cochran (cochran@spam.rtp.dg.com)                |
|Data General Corporation, Research Triangle Park, NC  |
+------------------------------------------------------+
|Just suppose there were no hypothetical situations... |
+------------------------------------------------------+

pstowne@zargon.lerc.nasa.gov (Charlie Towne) (05/30/91)

In article <eesnyder.675543629@beagle> eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
>I need to write a program that takes arguments from the
>command line and passes them to variables within the program.
>
>eg:
>
>	a.out arg1 arg2 ....
>
>This must be amazingly simple but I can't find anything in
>my SGI Iris 4D f77 manual.  What do you call this and where
>can I find out about it?

See the getarg(3F) man page in the Iris-4D Fortran 77 Reference Manual 
Pages (Doc. number 007-0621-030).  It can be used as follows:

         character*1 iargc,jargc
   c
   c-----get arguments from command line
   c
         call getarg(1,iargc)
         call getarg(2,jargc)
         read (iargc,'(i1)') iarg
         read (jargc,'(i1)') jarg

The calls to getarg return the 1st and 2nd arguments in the character
variables iargc and jargc.  The internal reads convert the character
variables to integers.  Note that this example is for single digit
integer arguments.

>Disclaimer:
>
>	Be nice-- I'm just a biologist:-)

That's OK.  I'm just an engineer.

-- 
Charlie Towne                        Email: pstowne@zargon.lerc.nasa.gov  
MS 5-11                              Phone: (216) 433-5851
NASA Lewis Research Center
Cleveland, OH 44135

wggabb@sdrc.COM (Rob Gabbard) (05/30/91)

From article <eesnyder.675543629@beagle>, by eesnyder@boulder.Colorado.EDU (Eric E. Snyder):
> I need to write a program that takes arguments from the
> command line and passes them to variables within the program.
> 
> 	a.out arg1 arg2 ....
> 
> This must be amazingly simple but I can't find anything in
> my SGI Iris 4D f77 manual.  What do you call this and where
> can I find out about it?

Try getarg and iargc (section 3F in the IRIS-4D FORTRAN 77 Reference Manual)

	character*N c
	integer i,j

	call getarg(i,c)
	j = iargc()

	getarg returns the i-th command line argument
	iargc returns the index of the last argument

Most F77 compilers have some kind of getarg/iargc function. Some, at least HP,
have implemented it as an intrinsic.

-- 
The statements above are my own and do not necessarily reflect the opinion of
my employer.
-------------------------------------------------------------------------------
Rob Gabbard 				    wggabb@sdrc.sdrc.com
Technical Development Engineer
Structural Dynamics Research Corporation

mwette@csi.jpl.nasa.gov (Matt Wette) (05/30/91)

In article <1991May30.135749.10529@eagle.lerc.nasa.gov>, pstowne@zargon.lerc.nasa.gov (Charlie Towne) writes:
|> In article <eesnyder.675543629@beagle> eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
|> >I need to write a program that takes arguments from the
|> >command line and passes them to variables within the program.
|> >
|> >eg:
|> >
|> >	a.out arg1 arg2 ....
|> >
|> >This must be amazingly simple but I can't find anything in
|> >my SGI Iris 4D f77 manual.  What do you call this and where
|> >can I find out about it?
|> 
|> See the getarg(3F) man page in the Iris-4D Fortran 77 Reference Manual 
|> Pages (Doc. number 007-0621-030).  It can be used as follows:
|> 
|>          character*1 iargc,jargc
|>    c
|>    c-----get arguments from command line
|>    c
|>          call getarg(1,iargc)
|>          call getarg(2,jargc)
|>          read (iargc,'(i1)') iarg
|>          read (jargc,'(i1)') jarg
|> 
|> The calls to getarg return the 1st and 2nd arguments in the character
|> variables iargc and jargc.  The internal reads convert the character
|> variables to integers.  Note that this example is for single digit
|> integer arguments.
|> 
|> >Disclaimer:
|> >
|> >	Be nice-- I'm just a biologist:-)
|> 
|> That's OK.  I'm just an engineer.
|> 
|> -- 
|> Charlie Towne                        Email: pstowne@zargon.lerc.nasa.gov  
|> MS 5-11                              Phone: (216) 433-5851
|> NASA Lewis Research Center
|> Cleveland, OH 44135

The problem here is that getting command line arguments is a machine
dependent process.  The solution is to write a wrapper for each machine
that makes the process of getting command line arguments portable.  This
has been done in 
	R.K.Jones and T.Crabtree, "Fortran Tools for VAX/VMS and MS-DOS,"
	Wiley, 1988

I've written SunFortran versions of their routines.  Things work much
nicer in Fortran if you take the time to set up enough utility routines
as in the above reference....

Matt

-- 
 _________________________________________________________________
 Matthew R. Wette           | Jet Propulsion Laboratory, 198-326
 mwette@csi.jpl.nasa.gov    | 4800 Oak Grove Dr, Pasadena,CA 91109
 -----------------------------------------------------------------

silvert@cs.dal.ca (Bill Silvert) (05/31/91)

In article <24632@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>In article <eesnyder.675543629@beagle>, eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
>|> I need to write a program that takes arguments from the
>|> command line and passes them to variables within the program.
>|> 
>|> 	a.out arg1 arg2 ....
>
>Yeah.  I've always wanted to be able to do this on UNIX (with _any_
>language).  Unfortunately, the shell trashes the command line before
>I can have a look at it.  I can turn that unsavory behaviour off - 
>which is fine for me, but I can't be sure all the users of my software
>will turn it off.  So, I have to do without a command line and deal
>with the junk the shell gives me instead.

Can you clarify this?  I've been using getarg() and iargc() for years on
all kinds of systems, and I distribute software that uses these calls.
Could you explain under what circumstances they don't work?
-- 
William Silvert, Habitat Ecology Division, Bedford Inst. of Oceanography
P. O. Box 1006, Dartmouth, Nova Scotia, CANADA B2Y 4A2.  Tel. (902)426-1577
UUCP=..!{uunet|watmath}!dalcs!biome!silvert
BITNET=silvert%biome%dalcs@dalac	InterNet=silvert%biome@cs.dal.ca

silvert@cs.dal.ca (Bill Silvert) (05/31/91)

In article <1991May30.135749.10529@eagle.lerc.nasa.gov> pstowne@zargon.lerc.nasa.gov (Charlie Towne) writes:
>In article <eesnyder.675543629@beagle> eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
>>I need to write a program that takes arguments from the
>>command line and passes them to variables within the program.
>>	a.out arg1 arg2 ....
>
>See the getarg(3F) man page in the Iris-4D Fortran 77 Reference Manual 
>Pages (Doc. number 007-0621-030).  It can be used as follows:
>
>         character*1 iargc,jargc
>   c
>   c-----get arguments from command line
>   c
>         call getarg(1,iargc)
>         call getarg(2,jargc)
>         read (iargc,'(i1)') iarg
>         read (jargc,'(i1)') jarg
>
>The calls to getarg return the 1st and 2nd arguments in the character
>variables iargc and jargc.  The internal reads convert the character
>variables to integers.  Note that this example is for single digit
>integer arguments.

Whoa!  My manual says that iargc() returns an integer.  The usage is:
	CALL GETARG(I, STRING)
	and
	ICOUNT = IARGC()
Conversion with internal reads is not called for!  (I just checked it,
the manual description is correct.)
-- 
William Silvert, Habitat Ecology Division, Bedford Inst. of Oceanography
P. O. Box 1006, Dartmouth, Nova Scotia, CANADA B2Y 4A2.  Tel. (902)426-1577
UUCP=..!{uunet|watmath}!dalcs!biome!silvert
BITNET=silvert%biome%dalcs@dalac	InterNet=silvert%biome@cs.dal.ca

jlg@cochiti.lanl.gov (Jim Giles) (05/31/91)

In article <1991May30.174626.12965@cs.dal.ca>, silvert@cs.dal.ca (Bill Silvert) writes:
|> In article <24632@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
|> >[...]
|> >Yeah.  I've always wanted to be able to do this on UNIX (with _any_
|> >language).  Unfortunately, the shell trashes the command line before
|> >I can have a look at it.  I can turn that unsavory behaviour off - 
|> >which is fine for me, but I can't be sure all the users of my software
|> >will turn it off.  So, I have to do without a command line and deal
|> >with the junk the shell gives me instead.
|> 
|> Can you clarify this?  I've been using getarg() and iargc() for years on
|> all kinds of systems, and I distribute software that uses these calls.
|> Could you explain under what circumstances they don't work?

As I said above, the _shell_ processes the command line before I get to
look at it.  So, getarg() and iargc() don't give me the arguments that
the user typed, they give me the _processed_ arguments.  This means that
my command line syntax has to avoid quotes, backslashes, wildcard characters,
and probably a few other things.

If the shell passed the _unprocessed_ command line _in_addition_ to the
parsed argument list (either as another argument to main() or as an
environment variable) then I could design code which used any command 
line syntax I wanted.  In particular, I could apply the same wildcard 
and quoting conventions to alternative search spaces - not just to local
file names.  For example, I could have a phonebook program that listed
phone numbers for people specified as:

cochiti => phone s*m S*de

Which would find (say) Sam Spade and print his number.  It would also 
find Slim Shade, etc.. You get the picture.  At present, the only ways
to do this are to require the user to quote the patterns to get them
past the shell or to make the utility interactive: ignore the command
line and ask for the name interactively.  Neither one of these is really
what I want.  Quoting intuitively means _not_ to expand wildcards, that's
what I would want it to mean in a utility.  Going interactive is less
convenient (extra work to call from within a script, for example).
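[Editorial illustration] The processing described above can be seen with a small POSIX sh sketch. The helper show_args is a hypothetical stand-in for any program (a Fortran program calling getarg would receive the same strings), and the file names slam and Slide are made up for the demonstration.

```shell
# show_args: hypothetical stand-in for a program; prints each argument
# it actually receives, one per line, wrapped in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Scratch directory holding two files whose names happen to match the patterns.
demo_dir=$(mktemp -d)
touch "$demo_dir/slam" "$demo_dir/Slide"

# Unquoted: the shell replaces each pattern with the matching file names.
( cd "$demo_dir" && show_args s*m S*de )
# Quoted: the patterns reach the program exactly as typed.
( cd "$demo_dir" && show_args 's*m' 'S*de' )

rm -rf "$demo_dir"
```

In that scratch directory the first call prints [slam] and [Slide], while the second prints [s*m] and [S*de]. Neither call sees the line as the user typed it, which is the complaint being made here.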

J. Giles

pstowne@zargon.lerc.nasa.gov (Charlie Towne) (05/31/91)

In article <1991May30.175533.13315@cs.dal.ca> silvert%biome@cs.dal.ca writes:
>In article <1991May30.135749.10529@eagle.lerc.nasa.gov> I wrote:
>>In article <eesnyder.675543629@beagle> eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:

>>>I need to write a program that takes arguments from the
>>>command line and passes them to variables within the program.
>>>	a.out arg1 arg2 ....

>>
>>See the getarg(3F) man page in the Iris-4D Fortran 77 Reference Manual 
>>Pages (Doc. number 007-0621-030).  It can be used as follows:
>>
>>         character*1 iargc,jargc
                       ^^^^^
                 poor name choice
>>   c
>>   c-----get arguments from command line
>>   c
>>         call getarg(1,iargc)
>>         call getarg(2,jargc)
>>         read (iargc,'(i1)') iarg
>>         read (jargc,'(i1)') jarg
>>
>>The calls to getarg return the 1st and 2nd arguments in the character
>>variables iargc and jargc.  The internal reads convert the character
>>variables to integers.  Note that this example is for single digit
>>integer arguments.

>
>Whoa!  My manual says that iargc() returns an integer.  The usage is:
>	CALL GETARG(I, STRING)
>	and
>	ICOUNT = IARGC()
>Conversion with internal reads is not called for!  (I just checked it,
>the manual description is correct.)

I probably caused some confusion by the very poor choice of iargc as a
variable name, when there's also a function by that name.  (Which, until
about 10 minutes ago I'd never used.)  I'll try to clear things up.

The function IARGC returns the index of the last argument, as an integer.
The subroutine GETARG returns a specified argument itself, as a character 
string.  Unless I'm missing something, an internal read is needed to
convert the argument itself from a character string to an integer variable
(or floating point, if that's appropriate.)

The example I gave _should_ have been written as:

         character*1 ic,jc
   c
   c-----get arguments from command line
   c
         n = iargc()
         call getarg(1,ic)
         call getarg(2,jc)
         read (ic,'(i1)') i
         read (jc,'(i1)') j

If this code is part of a.out, then

   a.out 3 6

yields n=2, i=3, j=6.

As another poster noted, all this is system-dependent.  It's not standard
fortran.  I've used getarg (and now iargc) on an SGI Iris 4D.  Getarg is
also available on the Cray X-MP and Y-MP under UNICOS 5.0, but as a
function instead of a subroutine.  I don't know about other systems.

Sorry about the screw-up.  That's what I get for posting code I haven't 
actually run.
-- 
Charlie Towne                        Email: pstowne@zargon.lerc.nasa.gov  
MS 5-11                              Phone: (216) 433-5851
NASA Lewis Research Center
Cleveland, OH 44135

bernhold@red8 (David E. Bernholdt) (05/31/91)

In article <24713@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>As I said above, the _shell_ processes the command line before I get to
>look at it.

This is quite true, but it is actually a characteristic of unix shells
rather than unix implementation of Fortran -- the shell will process
the arguments before turning them over to _any_ program, no matter
what the language.  So at least it's not discriminating!

Naive users can easily get confused between regular expressions
and shell filename expansions.  Grep is a good example of a standard
unix utility which is ripe for such confusion.  I've found it helps in
explaining such things to stress the fact that the shell does
expansions related to file names before anything else -- so anything
you want to be treated as a regular expression (which is what Jim
wanted in his 'phone S*m S*de' example) must be protected (by quoting)
in order to survive the shell's onslaught.
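[Editorial illustration] A minimal sketch of the grep confusion, assuming a POSIX sh and a made-up phonebook file. The extra file named Sum is hypothetical; it is there only so that the unquoted pattern has a file name to expand to.

```shell
demo_dir=$(mktemp -d)
printf 'Sam Spade\nSlim Shade\n' > "$demo_dir/phonebook"
: > "$demo_dir/Sum"    # unrelated file whose name matches the glob S*m

# Unquoted: the shell turns this into `grep -c Sum phonebook`.
# grep exits non-zero when nothing matches, hence the `|| true`.
unquoted=$(cd "$demo_dir" && grep -c S*m phonebook || true)

# Quoted: grep itself receives the regular expression S.*m.
quoted=$(cd "$demo_dir" && grep -c 'S.*m' phonebook)

echo "unquoted: $unquoted matching line(s); quoted: $quoted matching line(s)"
rm -rf "$demo_dir"
```

The unquoted form counts 0 lines because grep ends up searching for the literal text Sum; the quoted form counts both entries.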

It doesn't change the situation, but it (in my experience) does help
users understand what's going on.

Although I don't consider it terribly useful in general, it is also worth
noting that many shells have a way to turn off "globbing", as it's
called.  This just reverses the problem -- regular expressions go
through okay, but you can't wildcard files...
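[Editorial illustration] In a POSIX sh the switch is `set -f` (csh spells it `set noglob`). The sketch below assumes a scratch directory with two made-up Fortran file names; count_args is a hypothetical helper that reports how many arguments arrived.

```shell
# count_args: hypothetical helper; prints the number of arguments received.
count_args() { echo "$#"; }

demo_dir=$(mktemp -d)
touch "$demo_dir/a.f" "$demo_dir/b.f"

# Globbing on (the default): *.f becomes two file names.
with_glob=$(cd "$demo_dir" && count_args *.f)
# Globbing off: *.f is passed through as a single literal argument.
# The `set -f` happens inside the command substitution, so it does not leak out.
without_glob=$(cd "$demo_dir" && set -f && count_args *.f)

echo "globbing on: $with_glob argument(s); globbing off: $without_glob argument(s)"
rm -rf "$demo_dir"
```

With globbing off the pattern survives, but the same setting also stops `*.f` from meaning "all my Fortran files", which is the trade-off noted above.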

We now return you to your regular Fortran discussions...
-- 
David Bernholdt			bernhold@qtp.ufl.edu
Quantum Theory Project		bernhold@ufpine.bitnet
University of Florida
Gainesville, FL  32611		904/392 6365

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/31/91)

In article <24713@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
> As I said above, the _shell_ processes the command line before I get to
> look at it.  So, getarg() and iargc() don't give me the arguments that
> the user typed, they give me the _processed_ arguments.  This means that
> my command line syntax has to avoid quotes, backslashes, wildcard characters,
> and probably a few other things.

Yes and no.  It is not the case that every program is invoked from 
``the'' shell (do you mean sh, csh, tcsh, ksh, bash, rc, tcl, or what?).
It may be that there never *was* any such thing as a line typed by
``the user''.  Your command line syntax, if it uses any of the usual
meta-characters {}[]()<>,;$\'", will need quoting, but the only character
which is actually forbidden to you is NUL.

> If the shell passed the _unprocessed_ command line _in_addition_ to the
> parsed argument list (either as another argument to main() or as an
> environment variable) then I could design code which used any command 
> line syntax I wanted.

Improved Giles-specific shell:

	while IFS= read -r line ; do
	    # Export the raw line under one name before the shell reparses it.
	    COMMAND_LINE="$line"
	    export COMMAND_LINE
	    eval "$line"
	done
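[Editorial illustration] A sketch of how a utility might consume such a variable, assuming the wrapper exports the raw line as COMMAND_LINE. Both run_line (one iteration of the loop, fed a string instead of the terminal) and this phone function are hypothetical.

```shell
# phone: hypothetical utility that reports the raw line, when a wrapper
# supplied one, alongside the count of shell-parsed arguments.
phone() {
    if [ -n "${COMMAND_LINE:-}" ]; then
        echo "raw line: $COMMAND_LINE"
    fi
    echo "parsed arguments: $#"
}

# run_line: one pass of the wrapper loop for a single input line.
run_line() {
    COMMAND_LINE=$1
    export COMMAND_LINE
    eval "$1"
}

run_line "phone 's*m' 'S*de'"
```

The utility sees the quotes in the raw line and two already-unquoted arguments in its argument list, so it could choose either view.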

> In particular, I could apply the same wildcard 
> and quoting conventions to alternative search spaces - not just to local
> file names.

Not quite.  You would have to program them yourself.  UNIX shell
wild-cards are specialised to file names (for example, none of them
will match a '/').  If the things you want to match against aren't
file names (as in your phone book example), that may not be appropriate.
My equivalent of 'phone' expands to a call to Awk, and the pattern
matching provided by Awk is not identical to shell file name wildcards.
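[Editorial illustration] The difference between the two pattern languages shows up even on the phone example. This is a sketch assuming POSIX sh and grep; glob_match and regex_match are hypothetical helpers. Note that `case` uses the shell's pattern syntax but, unlike file name expansion, does not treat '/' specially.

```shell
# glob_match PATTERN STRING: succeeds if the whole STRING matches the
# shell pattern (the unquoted $1 is deliberately used as a pattern).
glob_match() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# regex_match PATTERN STRING: succeeds if STRING contains a match for the
# basic regular expression, as grep would interpret it.
regex_match() {
    printf '%s\n' "$2" | grep -q -- "$1"
}

for name in Sam Slim mother; do
    glob_match  'S*m' "$name" && echo "glob  S*m matches $name"
    regex_match 'S*m' "$name" && echo "regex S*m matches $name"
done
```

As a shell pattern, S*m matches Sam and Slim only. As a regular expression it means "zero or more S, then m", so it also matches mother.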

> For example, I could have a phonebook program that listed
> phone numbers for people specified as:
> 
> cochiti => phone s*m S*de
> 
> Which would find (say) Sam Spade and print his number.  It would also 
> find Slim Shade, etc.. You get the picture.  At present, the only ways
> to do this are to require the user to quote the patterns to get them
> past the shell or to make the utility interactive: ignore the command
> line and ask for the name interactively.

Well, you _could_ do
	phone <<'.'
	s*m S*de
	.
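[Editorial illustration] A sketch of the receiving side of that here-document, assuming POSIX sh. The phone function is a hypothetical stand-in that reads a single line of patterns from standard input; quoting the delimiter ('.') is what stops the shell from touching the text.

```shell
# phone: hypothetical utility that takes its patterns from standard input
# instead of the argument list.
phone() {
    IFS= read -r patterns
    echo "patterns received: $patterns"
}

phone <<'.'
s*m S*de
.
```

The line arrives as typed, wildcards included, because here-document text is never subject to file name expansion.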

I've used TOPS-10 (where you _only_ get an uninterpreted command line)
and VM/CMS (where you can get both a tokenised and an uninterpreted
command line), and for that reason I very much like the approach taken
by the UNIX shells.  This divergence in the interfaces offered by
various operating systems is of course why "get command line arguments"
is not a standard feature of Fortran.

-- 
Should you ever intend to dull the wits of a young man and to
incapacitate his brains for any kind of thought whatever, then
you cannot do better than give him Hegel to read.  -- Schopenhauer.

joe@etac632 (Joe Fulson-Woytek) (05/31/91)

In article <eesnyder.675543629@beagle>, eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
> I need to write a program that takes arguements from the 
> command line and passes them to variables with in the program.
> 
> eg:
> 
> 	a.out arg1 arg2 ....
>
Most (actually all of the ones I've dealt with (SGI, SUN, HP, Cray UNICOS))
UNIX Fortran compilers have some form of a GETARG routine which returns
command line arguments. In the case of SGI there are 2 routines: iargc and
getarg. iargc is a function and returns the number of arguments:
      j = iargc()
getarg returns the i'th argument in the character string c:
     call getarg ( i,c)
Check the Fortran reference manual pages for more info.


Joe Fulson-Woytek
NASA/Goddard Space Flight Center
Code 932
joe@etac632.gsfc.nasa.gov

silvert@cs.dal.ca (Bill Silvert) (05/31/91)

In article <24713@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>As I said above, the _shell_ processes the command line before I get to
>look at it.  So, getarg() and iargc() don't give me the arguments that
>the user typed, they give me the _processed_ arguments.  This means that
>my command line syntax has to avoid quotes, backslashes, wildcard characters,
>and probably a few other things.

So write your own operating system.  Or insert the command "set noglob"
in your startup file (most shells have this or an equivalent).
Operating systems offer features, why use one that offers features you
don't like?  Switch to MS-DOS if "set noglob" is too hard for you.
-- 
William Silvert, Habitat Ecology Division, Bedford Inst. of Oceanography
P. O. Box 1006, Dartmouth, Nova Scotia, CANADA B2Y 4A2.  Tel. (902)426-1577
UUCP=..!{uunet|watmath}!dalcs!biome!silvert
BITNET=silvert%biome%dalcs@dalac	InterNet=silvert%biome@cs.dal.ca

jlg@cochiti.lanl.gov (Jim Giles) (05/31/91)

Here are some clarifications on command line arguments.

1) I can't write my own shell because I can't be sure the end users of
   my utilities will be using my shell.  Writing utilities that only work
   under one specific shell is unprofessional.

2) Quote and escape marks _should_ mean that the quoted or escaped
   characters will be treated _literally_: that no 'globbing'  or
   other reprocessing will be applied to them in _any_ context. 
   So, in my phone example:

      cochiti => phone "S*m"

   This _should_ look up any company, organization, or individual whose
   phonebook entry contains the letters S and m separated by an asterisk.
   Since this is the proper meaning for escapes and quotes, the use of
   them to get the characters past the gauntlet of the shell is not 
   acceptable.

3) I am _not_ asking for the shell not to 'glob'.  The modification I 
   want is completely backward compatible.  I want the unparsed command
   line _in_addition_ to the normal parsed arguments.

4) The comment has been made that if I get the raw command line I would 
   have to (horrors) parse it myself!  Since this is _exactly_ what I
   _want_ to do, I don't see the problem.  I thought the standard reply
   from UNIX supporters defending how arcane and clumsy it is was "it 
   empowers the programmer, it assumes that the programmer knows what 
   he's doing."  Is that _not_ the case here?  Is parsing a command line
   something that you want the system _not_ to assume that you know how 
   to do?  In any case the matter is easily settled: write a single 
   wildcard matching function which takes the pattern and a list of 
   strings and returns a list of those strings which match the wildcard
   pattern.  Then, everyone could use the same tool.

5) There have been references to the bogus argument that if the raw command
   line is passed to the program, then the command line syntax will become
   incompatible from one tool to the next.  There are three reasons this
   argument is bogus: a) UNIX tools (and those on any other system) are 
   _already_ inconsistent; b) consistency is better maintained by discipline
   among the authors of system utilities, not by arbitrary constraints on
   the environment; c) I can get the same inconsistent behaviour by just
   _ignoring_ the command line entirely and reading directives to the 
   utility from stdin.  In view of (c), the constraint on the command
   line visibility gains _nothing_ in terms of compatibility.  For example, 
   the phonebook tool could work like:

      cochiti => phone (arg list ignored, I can put anything here)
         Name? S*m s*de
            Sam Spade: (505) 677-7111
            Slim Shade: (505) 678-1234
         Name? quit
      cochiti =>

I guess it's only slightly less convenient.  And after all, that's the
whole point of UNIX isn't it?  To be just _slightly_ less convenient
(or _slightly_ less efficient, or _slightly_ more of a disk/memory 
hog, or _slightly_ harder to learn, or - wait a minute: some of these
aren't _slight_!).
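[Editorial illustration] The matching function proposed in point 4 (a pattern plus a list of strings in, the matching strings out) could be sketched in POSIX sh as follows. The name match_list and the phonebook entries are hypothetical, and the match is case-sensitive, unlike the phone example, which appears to ignore case.

```shell
# match_list PATTERN STRING...: print every STRING that matches the shell
# pattern PATTERN as a whole.
match_list() {
    pattern=$1
    shift
    for candidate in "$@"; do
        case $candidate in
            $pattern) printf '%s\n' "$candidate" ;;
        esac
    done
}

match_list 'S*m S*de' 'Sam Spade' 'Slim Shade' 'Philip Marlowe'
```

This prints Sam Spade and Slim Shade. The caller still has to quote the pattern to get it past the shell, which is the part of the problem a library function alone cannot remove.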

J. Giles            (UNIX: the OS of Dorian Grey - ttw)

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (06/01/91)

In article <24784@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>Here are some clarifications on command line arguments.
>
>1) I can't write my own shell because I can't be sure the end users of
>   my utilities will be using my shell.  Writing utilities that only work
>   under one specific shell is unprofessional.

Repeatedly whining on Usenet is also unprofessional, but that doesn't
stop you. Try alt.religion.computers if you would like to dispute the
philosophy behind the Unix shell.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/01/91)

In article <24784@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
> 2) Quote and escape marks _should_ mean that the quoted or escaped
>    characters will be treated _literally_: that no 'globbing'  or
>    other reprocessing will be applied to them in _any_ context. 

In _any_ context?  That would be radically unlike other programming
languages.  How, for example, can you quote a Fortran character literal
so that no subroutine it is passed to will treat the characters other
than "_literally_ ... _any_ context"?  In all the programming languages
I've used, quotation marks and escapes mean that the quoted or escaped
characters won't be treated literally in THIS context, with no guarantees
being made or possible for any other.  Why should a command language be
different?

>    So, in my phone example:
>       cochiti => phone "S*m"
> 
> 3) I am _not_ asking for the shell not to 'glob'.  The modification I 
>    want is completely backward compatible.  I want the unparsed command
>    line _in_addition_ to the normal parsed arguments.

If you're not asking for the shell to do filename expansion, you are in
big trouble.  There are at least three circumstances I can think of in
which file name expansion will do nasty things to you:

    a)	There is no file matching the pattern in question.
	Some shells report this as an error.
	Users who found that they were forbidden to use wild cards
	in names unless there was at least one matching _file_ would
	be very surprised by this.
	The alternative is to demand that every shell maintainer
	forsake backwards compatibility just in case one of your users
	wants to use his shell.
    b)	There may be so _many_ file names matching the pattern that
	the command is aborted.  Again, someone who doesn't intend his
	pattern to refer to files _at all_ is going to be very surprised
	when
		cochiti => phone S*m
	results in a message like "phone: not done: argument list too big".
	Again, the only way to prevent this is to require that all shells
	are changed so as to go ahead with the command anyway, somehow.
    c)  On a networked file system, if a server is heavily loaded or down,
	the execution of the command may be delayed until the server is
	ready again, despite the fact that the user does not intend the
	pattern to have any reference to that server.
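[Editorial illustration] Case (a) depends on the shell. The sketch below assumes a POSIX sh, where an unmatched pattern is passed through unchanged; csh-style shells instead report "No match" and do not run the command at all. The helper count_args is hypothetical.

```shell
# count_args: hypothetical helper; prints the argument count and the arguments.
count_args() { echo "$#: $*"; }

empty_dir=$(mktemp -d)
# No file here matches S*m, so a POSIX sh leaves the pattern as it stands.
result=$(cd "$empty_dir" && count_args S*m)
echo "$result"
rmdir "$empty_dir"
```

So the same command can deliver a literal pattern, a list of file names, or an error, depending on the shell and on what happens to be in the current directory.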

There are other problems.  Different shells have different wild-card
conventions, and _none_ of them matches what you get in a tool like
egrep, awk, or sed.

	This would mean that either
	a.1) Users of your program may not use wild cards in names
	     unless there is at least one *file* matching the name!
	a.2) OR 

> 4) The comment has been made that if I get the raw command line I would 
>    have to (horrors) parse it myself!

There is a much worse problem.  There may never have BEEN any wretched
command line.  If I do
	execlp("phone", "kill", "S*m S*e", (char*)NULL);
then what is the command line that you expect to see?

>    In any case the matter is easily settled: write a single 
>    wildcard matching function which takes the pattern and a list of 
>    strings and returns a list of those strings which match the wildcard
>    pattern.  Then, everyone could use the same tool.

Uh, do you _really_ want file name expansion to be done by having
people load into their programs one string for every file in the
system?  And it is rather strange for someone who insists on NOT doing
things the way the existing tools do it to suggest that everyone should
use ONE pattern matcher.

> 5) There have been references to the bogus argument that if the raw command
>    line is passed to the program, then the command line syntax will become
>    incompatible from one tool to the next.

It is not a bogus argument.  It was observably the case that TOPS-10
programs didn't even agree on the syntax of file names, let alone anything
else.  And UNIX is a _model_ of consistency compared with MS-DOS or VM/CMS.

>    c) I can get the same inconsistent behaviour by just
>    _ignoring_ the command line entirely and reading directives to the 
>    utility from stdin.

No you can't.  There may not BE any such thing as stdin.  Again:
	close(0);
	execlp("phone", "kill", "S*m S*e", (char*)NULL);
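[Editorial illustration] The same situation can be produced from a POSIX sh with the `<&-` redirection, which starts a command with standard input closed. The function ask_name is a hypothetical interactive fallback of the kind described in point 5(c).

```shell
# ask_name: hypothetical interactive fallback that reads one name from
# standard input and fails if nothing can be read.
ask_name() {
    if IFS= read -r name; then
        echo "looking up: $name"
    else
        echo "no input available" >&2
        return 1
    fi
}

# Standard input is a pipe: the prompt can be answered.
printf 'S*m\n' | ask_name
# Standard input is closed: the read fails and the fallback has nothing to use.
ask_name <&- 2>/dev/null || echo "stdin was closed; the prompt could not be answered"
```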

If you want VM/CMS you know where to find it.
-- 
Should you ever intend to dull the wits of a young man and to
incapacitate his brains for any kind of thought whatever, then
you cannot do better than give him Hegel to read.  -- Schopenhauer.

jlg@lanl.gov (Jim Giles) (06/02/91)

From article <1991May31.234623.7735@murdoch.acc.Virginia.EDU>, by gl8f@astsun.astro.Virginia.EDU (Greg Lindahl):
> [...]
> Repeatedly whining on Usenet is also unprofessional, but that doesn't
> stop you. Try alt.religion.computers if you would like to dispute the
> philosophy behind the Unix shell.

Then you should stop repeatedly whining.  The only people whining on
this issue are those who refuse to respond to a legitimate request for
functionality in a rational way.  Telling someone who has a legitimate
request for specific and useful functionality to post his request to
alt.religion.computers is also whining.  Claiming that useful functionality
should _not_ be provided because of some hypothetical "philosophy" behind
the UNIX shell is what belongs in alt.religion.computers.

In the meantime, no one has yet given a legitimate _technical_ reason
that the shell should not be modified to provide the _unparsed_ command
line to programs.  I conclude that there is no such technical reason
and that the resistance to adding this feature is totally religious
in origin.

J. Giles

jlg@cochiti.lanl.gov (Jim Giles) (06/02/91)

In article <6069@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
|> In article <24784@lanl.gov>, jlg@cochiti.lanl.gov (Jim Giles) writes:
|> > [...]
|> [...]                               In all the programming languages
|> I've used, quotation marks and escapes mean that the quoted or escaped
|> characters won't be treated literally in THIS context, with no guarantees
|> being made or possible for any other.  Why should a command language be
|> different?

Exactly.  So, in the context of my phone command, the meaning of the
parameter(s) is a pattern to match _phonebook_entries_, not file names.
So, the quotes should mean to treat that pattern literally in _this_
context.

|> [...]
|> If you're not asking for the shell to do filename expansion, you are in
|> big trouble.  There are at least three circumstances I can think of in
|> which file name expansion will do nasty things to you:

You left one out, so I'll fill it in before addressing your other cases:

     0) File name expansion is already doing nasty things to me by
        happening at all and not _at_least_ letting me get around it
        without arcane and non-intuitive syntax conventions.

|> [...]
|>     a)	There is no file matching the pattern in question.
|> 	Some shells report this as an error. [...]

This is a problem whether you let me see the raw command line or not.
Letting me see the command line won't fix or aggravate this failing
of the shell.  Frankly, I would recommend that users of such a shell
either switch or lobby whoever sold it to them to fix it.

|> [...]
|>     b)	There may be so _many_ file names matching the pattern that
|> 	the command is aborted.  [...]

Again, this is a problem whether you pass me the raw command line or not.

|> [...]
|>     c)  On a networked file system, if a server is heavily loaded or down,
|> 	the execution of the command may be delayed until the server is
|> 	ready again, despite the fact that the user does not intend the
|> 	pattern to have any reference to that server.

True, I didn't claim that my solution was as efficient as it ought to be.
The ideal solution would be to let each utility decide for itself whether
'globbing' should take place and not to have the shell _always_ do it. 
But, that wouldn't be upward compatible.  Passing the raw command line in
_addition_ to the 'globbed' arg-list _is_ upward compatible despite its
other faults.

|> [...]
|> There are other problems.  Different shells have different wild-card
|> conventions, and _none_ of them matches what you get in a tool like
|> egrep, awk, or sed.

Yet, later you go on and on about how consistent UNIX is.

|> [...]
|> There is a much worse problem.  There may never have BEEN any wretched
|> command line.  If I do
|> 	execlp("phone", "kill", "S*m S*e", (char*)NULL);
|> then what is the command line that you expect to see?

If I document a program with the condition that it expects to see a raw
command line somewhere (say, as an environment variable), and you exec
the program without providing a command line - that's _your_ problem.
The whole thing is a convenience issue for interactive users.  I see
no reason that an intended _interactive_ utility should be expected to
work the same as the "batch mode" commands that UNIX is saddled with.
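A minimal sketch of the convention described above, assuming the shell
(or some wrapper) exports the unparsed line in an environment variable.
The name RAW_CMDLINE and the helper pattern_argument are hypothetical;
no existing shell sets such a variable.  Unlike the position taken above,
this version falls back to argv[1] when no raw line is present, as when
the program is started directly with execlp().

```c
#include <stddef.h>
#include <string.h>

/* Picks the pattern the program should use.
   raw  - the unparsed command line, e.g. "phone S*m S*e", or NULL if the
          caller found no raw line (hypothetical RAW_CMDLINE convention).
   argv - the usual, already-globbed argument vector.
   Returns a pointer to the text after the command name in raw, or
   argv[1] when raw is missing or holds no argument, or NULL if neither
   source supplies a pattern. */
const char *pattern_argument(const char *raw, int argc, char *const argv[])
{
    if (raw != NULL) {
        const char *rest = strchr(raw, ' ');   /* end of the command name */
        if (rest != NULL) {
            rest += strspn(rest, " ");         /* skip the separating blanks */
            if (*rest != '\0')
                return rest;                   /* untouched by the shell */
        }
    }
    return argc > 1 ? argv[1] : NULL;
}
```

A caller would pass getenv("RAW_CMDLINE") (from <stdlib.h>) as the first
argument, so a direct exec with no such variable simply yields the argv
value.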

|> [...] And it is rather strange for someone who insists on NOT doing
|> things the way the existing tools do it to suggest that everyone should
|> use ONE pattern matcher.

Not at all.  I was addressing the bogus concern that people raise that
'utter chaos' will arise if the shell isn't granted exclusive rights over
command line parameter parsing.  It's bogus because consistency comes
from planning and discipline among the systems programmers - not arbitrary
and arcane constraints on the environment.  Providing a common tool to
do pattern matching is _one_ way of allowing that planning and discipline
to be practiced.  For example, if such a tool existed, I could use it for
_both_ file name 'globbing' and for matching the names in the phonebook
program.  That way I would _know_ that the wildcard syntax that each used
was identical.
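A minimal sketch of the shared-matcher idea, assuming the POSIX fnmatch()
routine (standardized after this thread) stands in for the common tool.
The helper names entry_matches and count_matches are made up for
illustration.  Because fnmatch() uses shell-style wildcards, a phonebook
lookup built on it would accept the same pattern syntax as file name
globbing.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 if the phonebook entry matches the shell-style wildcard
   pattern, 0 otherwise.  fnmatch() itself returns 0 on a match. */
int entry_matches(const char *pattern, const char *entry)
{
    return fnmatch(pattern, entry, 0) == 0;
}

/* Counts how many of the n entries match the pattern. */
int count_matches(const char *pattern, const char *const *entries, size_t n)
{
    size_t i;
    int hits = 0;

    for (i = 0; i < n; i++) {
        if (entry_matches(pattern, entries[i]))
            hits++;
    }
    return hits;
}
```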

|> 
|> > 5) There have been references to the bogus argument that if the raw command
|> >    line is passed to the program, then the command line syntax will become
|> >    incompatible from one tool to the next.
|> 
|> It is not a bogus argument.  It was observably the case that TOPS-10
|> programs didn't even agree on the syntax of file names, let alone anything
|> else.  And UNIX is a _model_ of consistency compared with MS-DOS or VM/CMS.

I've never used VM/CMS (I don't even know what type of hardware it 
runs on).  MS-DOS has a number of inconsistencies because of a lack
of communication between implementors (much less any planning and 
discipline).  UNIX is a model of chaos compared to the mainframe 
systems I'm used to.  From your comments above:  I can't count on the
same 'globbing' syntax (from one shell to the next), I can't count on
the same regular expression syntax (from one tool to the next), I can't
assume a legal command line will even cause a program to be executed
(if the pattern doesn't match or matches too much), etc.

|> > [... workaround using stdin ...]
|> No you can't.  There may not BE any such thing as stdin.  Again:
|> 	close(0);
|> 	execlp("phone", "kill", "S*m S*e", (char*)NULL);

If I don't have a way of getting the raw command line, I _will_
write tools which use stdin.  This is _already_ done (for exactly
the same reason) by tools such as dc and write.  If you close stdin
before you start up a program which reads stdin, that's _your_ 
problem.  Like your other example involving direct exec() calls, 
this is _irrelevant_ to the issue at hand.
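A minimal sketch of the stdin workaround, assuming the tool reads one
pattern per line.  The name read_pattern is hypothetical.  Since the text
arrives on a stream rather than on the command line, the shell never sees
it and no quoting is needed.  If the stream has been closed, the read
fails and the function reports that instead of returning a pattern.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into buf (capacity `size`) and drops the
   trailing newline.  Returns 1 on success, 0 on end of file or on a
   read error (for example, when the stream was closed beforehand). */
int read_pattern(FILE *in, char *buf, size_t size)
{
    if (in == NULL || size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if any */
    return 1;
}
```

A tool would call read_pattern(stdin, buf, sizeof buf) after printing a
prompt, and treat a 0 return as "no pattern supplied".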

|> [...]
|> If you want VM/CMS you know where to find it.

Actually, no I don't.

J. Giles

yfcw14@castle.ed.ac.uk (K P Donnelly) (06/02/91)

Here is my explanation for the trouble writing a program on Unix which
acquires parameters from the command line but which doesn't need ugly
quoting to stop the shell trying to do wildcarding on filenames:-

Standard Unix does not have dynamic loading or shareable libraries or
whatever you want to call it.  So standard Unix does not have a
"parameter acquisition mechanism" usable by all commands.  Without
shareable libraries, a parameter acquisition mechanism would probably
add 100K or more to the size of every executable.

One consequence of this is that every Unix command has its own command
line syntax.  There is very little consistency between commands, and
most of them have a very primitive and non-robust method of parameter
acquisition (or "option" acquisition, since most parameters are called
"options" on Unix).

Another consequence is that commands depend on the shell to do an
extremely primitive form of command line parsing.  The shell knows
nothing about the kind of parameters the command is expecting, so the
kinds of things it does to the command line may be completely inappropriate.
This is what is happening in the
            phone sm*th*
example.  The shell doesn't know whether or not the "phone" command is
expecting a filename parameter, so it just goes ahead and assumes that
it is anyway.

Contrast with this Edinburgh University's own operating system, EMAS,
which has had dynamic linking for a decade or more.  It has
a "parameter acquisition mechanism" (PAM) which all commands make use of.
It is linked in at load time so it doesn't add to the size of executables.
Commands like "cp" would tell PAM that they were expecting filename
parameters - in fact possibly two filenames which agree in their
wildcarding, so
            cp dir1/* dir2/*
would copy all files from directory dir1 to dir2.  Commands like "phone"
would not tell PAM that they were expecting a filename parameter, so
"sm*th*" would not be interpreted as a set of filenames.  No cumbersome
quoting is necessary to switch off such interpretation.  As well as
that, parameter acquisition is much more user friendly.  If the
parameter is not of the type expected (e.g. a filename which exists
and is readable) then PAM issues a meaningful error message and
reprompts the user for the parameter.
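A rough sketch of the idea that the command, not the shell, declares
which parameters are filenames.  This is not the EMAS PAM interface,
which is not described here.  The names param_kind and expansion_count
are invented, and the POSIX glob() routine is assumed to be available
for the filename case.

```c
#include <glob.h>
#include <stddef.h>

/* What the command says it expects for a given parameter. */
enum param_kind { PARAM_TEXT, PARAM_FILENAME };

/* Returns how many words the parameter expands to.  Text parameters
   are taken literally, so "sm*th*" stays a single word.  Filename
   parameters are expanded with glob(); 0 means nothing matched (or
   glob() failed), which a friendlier tool could report and reprompt. */
size_t expansion_count(enum param_kind kind, const char *arg)
{
    glob_t g;
    size_t n = 0;

    if (kind != PARAM_FILENAME)
        return 1;                     /* no wildcard interpretation */

    if (glob(arg, 0, NULL, &g) == 0)
        n = g.gl_pathc;               /* number of matching path names */
    globfree(&g);
    return n;
}
```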

Unfortunately Edinburgh University's EMAS operating system is
disappearing exactly one year from now.  They are moving over to Unix to
be the same as everyone else.

Does anyone know whether any kind of user friendly parameter acquisition
mechanism is available for versions of Unix like SunOS which do have
shareable libraries?

   Kevin Donnelly

silvert@cs.dal.ca (Bill Silvert) (06/03/91)

In article <24885@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> Repeatedly whining on Usenet is also unprofessional, but that doesn't
>> stop you. Try alt.religion.computers if you would like to dispute the
>> philosophy behind the Unix shell.
>
>Then you should stop repeatedly whining.  The only people whining on
>this issue are those who refuse to respond to a legitimate request for
>functionality in a rational way.  Telling someone who has a legitimate
>request for specific and useful functionality to post his request to
>alt.religion.computers is also whining.

Your postings have nothing to do with Fortran.  Whether your complaints
about Unix are valid or not belongs in another news group.  Please do
not waste bandwidth by posting complaints about operating systems to a
language-specific newsgroup.
-- 
William Silvert, Habitat Ecology Division, Bedford Inst. of Oceanography
P. O. Box 1006, Dartmouth, Nova Scotia, CANADA B2Y 4A2.  Tel. (902)426-1577
UUCP=..!{uunet|watmath}!dalcs!biome!silvert
BITNET=silvert%biome%dalcs@dalac	InterNet=silvert%biome@cs.dal.ca