[comp.arch] shell architecture

mike@bria.UUCP (Michael Stefanik) (01/20/91)

In article <1991Jan17.185527.9824@Neon.Stanford.EDU> Theory.Stanford.EDU!andy (Andy Freeman) writes:
>In article <360@bria> mike@bria.UUCP (Michael Stefanik) writes:
>>   with it, and compile.  Under what you are proposing, I would additionally
>>   have to tell the shell that the program has been changed.  This is a
>>   real hassle.
>
>So is updating documentation, but it is worthwhile.

The documentation I write at the *end* of the development phase; maintaining
command databases while I'm still developing is what I think is the pain.

>Besides, there's no particular reason that one can't include a
>description of the argv arguments to the executable.  Then, the shell
>can merely look at the program it is about to invoke to decide what it
>(the shell) should do with the typed arguments before invoking the
>program.

This puts an unnecessary burden on the shell and would degrade startup time
on the image.  The shell would have to open the program file, seek to this
mythical table, parse it, close the program, and then hand it to the OS, which
would open the file again, etc.  This is *far* too much overhead for the
minimal benefits that it would provide.

Going the opposite way, and having exec() deal with command arguments insofar
as qualifying them, etc. is a *worse* kludge and would force the COFF to
be revised.

>>2. In my view, it is the job of the shell to parse (and glob) args, not the
>>   programs that are being given the arguments.
>
>I'd love to have a shell that did something reasonable with arguments.
>Instead, all I can have is a shell that assumes that all arguments
>are filenames and expands them as filenames.
>
>The issue isn't whether or not I can pass "*" as an argument to a
>program, it is what to do about arguments that aren't filenames.
>Shells treat each and every argument as a file name.  If an argument
>isn't a filename, there's no way to have the shell expand it
>appropriately.
>

I may be nitpicking, but I don't believe that the shell thinks that *all*
arguments are filenames ... only those arguments that include special,
well-known characters that are not quoted.  The shell doesn't actually
give a dang what you are specifying as arguments.  Again, the shell follows
a predefined set of rules.  If you don't like those rules, then write your
own shell.  IMHO, this is the way it should be.  If you don't like something,
then you're free to do it yourself another way.  As a programmer, though,
the "standard" UNIX shells suit me just fine, thank you.
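
For instance (with a made-up directory holding chap1, chap2 and notes),
only the unquoted pattern gets expanded; the quoted one and the plain word
go through untouched:

	$ ls
	chap1  chap2  notes
	$ echo chap* "chap*" plain
	chap1 chap2 chap* plain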

>BTW - There are things one would like to see a command processor do
>beyond expand arguments.  For example, it could tell you what kind of
>argument is expected, possibly including a list of options.  It might
>even tell you what it is going to do with that argument.  (For
>example, the command parser could tell you that mv's first argument is
>a source, if you then asked what the second argument was, it could say
>"destination or another source, in which case the last argument must
>be a directory".)

This is decidedly what I *WOULD NOT* like to see in UNIX.  I don't want
the operating system, the shell, or your program second-guessing me.  If
I did something wrong, then a simple, short, straightforward error message
is all I want (along with a error status returned by the program).  I
find myself doing this so often in my scripts ...

	somecommand >/dev/null 2>&1

I know this is going to irk a few people, but I think that UNIX is *too*
verbose as it is (AIX is ridiculous with its multiline error messages).
The error messages (as is) are too complex for the ignorant user, and contain
too much meaningless fluff for the programmer.

The day I stop using UNIX is when I see messages like this:

cp: (meaning "copy"): I'm sorry, but I could not create your file
    /tmp/foo because there are not enough free blocks on that filesystem.
    Please do a "df -v /dev/hd3" to determine the free space and remove
    any extraneous files.

Or how about something as nauseating as this:

mv: (meaning "move"): You didn't supply a second argument, please enter one.
    File:

The one thing I *love* about UNIX is that it is terse, doesn't second guess,
and assumes that you know what you're doing (thereby giving me all the
freedom that I want).

The real challenge is to provide an end-user interface that is simple to
use, and covers most of the bases.  However, a fundamental deviation from 
the philosophy of "power to the programmer" would be a major mistake, in my
mind anyhow.
-- 
Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation
UUCP: ...!uunet!bria!mike
--
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."

kenw@skyler.arc.ab.ca (Ken Wallewein) (01/20/91)

> >Besides, there's no particular reason that one can't include a
> >description of the argv arguments to the executable.  Then, the shell
> >can merely look at the program it is about to invoke to decide what it
> >(the shell) should do with the typed arguments before invoking the
> >program.
> 
> This puts an unnecessary burden on the shell and would degrade startup
> time on the image.  The shell would have to open the program file, seek
> to this mythical table, parse it, close the program, and then hand it to
> the OS, which
> would open the file again, etc.  This is *far* too much overhead for the
> minimal benefits that it would provide.

  Well, it might, if it were done that way.  The implementation with which
I am familiar doesn't, and it works rather well.  The syntax of the command
is described in a special "concise command language", which is compiled
into a library, and which the shell keeps in shared memory.

  This has a number of benefits, including _simplified_ software
development (because the syntax of the commands can be decoupled from the
code itself), flexibility (you can call the parser from within your program
if you want to), flexibility (you can change the command syntax without
changing the program -- you don't even need source!), efficiency (you don't
even bother to load the program if the command doesn't follow the required
syntax), and so on.

> Going the opposite way, and having exec() deal with command arguments insofar
> as qualifying them, etc. is a *worse* kludge and would force the COFF to
>  be revised.

  IF it were done that way.  It needn't be, and such arguments distract
from the value of the central idea.

>...
> The one thing I *love* about UNIX is that it is terse, doesn't second guess,
> and assumes that you know what you're doing (thereby giving me all the
> freedom that I want).
> 
> The real challenge is to provide an end-user interface that is simple to
> use, and covers most of the bases.  However, a fundamental deviation from 
> the philosophy of "power to the programmer" would be a major mistake, in my
> mind anyhow.
>...

  I agree.  But there are many kinds of power, and many kinds of
simplicity.  Let's keep our minds open.  The idea of having the shell know
something about the syntax expected by programs, as long as it is
configurable, really shouldn't take anything away from you.  The idea is
powerful.  Don't sell it short.

--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca                                             R E S E A R C H
(403)297-2660                                                     C O U N C I L

mike (Michael Stefanik) (01/21/91)

In article <KENW.91Jan20003950@skyler.arc.ab.ca> skyler.arc.ab.ca!kenw (Ken Wallewein) writes:

[ discussion about shell looking at the program to determine usage ]

>  Well, it might, if it were done that way.  The implementation with which
>I am familiar doesn't, and it works rather well.  The syntax of the command
>is described in a special "concise command language", which is compiled
>into a library, and which the shell keeps in shared memory.
>
>  This has a number of benefits, including _simplified_ software
>development (because the syntax of the commands can be decoupled from the
>code itself), flexibility (you can call the parser from within your program
>if you want to), flexibility (you can change the command syntax without
>changing the program -- you don't even need source!), efficiency (you don't
>even bother to load the program if the command doesn't follow the required
>syntax), and so on.

This "special concise command language" is essentially what VMS has, and what
I argued against in the first place.  Consider:

1. It does *not* simplify development, because it adds yet another
   link in the already lengthy chain of project management

2. I *don't want* my command syntax decoupled from the program
   because it tends to disrupt the continuity of the program,
   and increases development and debugging time by adding "extras"
   into the development phase.

3. I *don't want* to call a parser or have my arguments spoon-fed
   to me in a fashion that some external entity (i.e., the shell)
   decides that I should receive them.  I simply want a list of
   arguments.  The whole VMS approach, as snazzy as it may look from
   the outside, can be damn frustrating when you are tied to it.
   I speak from personal experience.

4. The ability to change the command syntax externally of the program
   violates the autonomy of the tool. If I want to change the syntax,
   then I'll change the code.  The idea of giving an end user or
   administrator the ability to change how tools interact "on the fly"
   (as it were), gives me the willies.  This is just an approach that
   begs for endless debugging opportunities.

The only thing that I see of value is that the shell doesn't have to
load the image to know that there is a problem with the command
syntax.

A compromise would be that the shell may *optionally* have a command
prototype in some shared memory hash table; but if the command is not in
the list, you let it go by.  I would also want to be able to turn this
"feature" off and on at will.  Thus, prototypes could be defined in
/etc/rc (or wherever), such as:

	proto rm 'fr' '(file ...)'

Where the first argument is (of course) the command, the second argument
is the flags (with getopt() compatibility), followed by a syntax
description of some sort.  Some syntax handling (such as using rm to
remove a directory without the -r switch) should still be handled by
the tool in question. 

I suppose that you could even go one step further, and define usage strings
for mistyped commands as well:

	usage rm 'rm [-fr] file ...'

So that if the novice user would enter:

	$ rm
	sh: rm: missing filename in argument 1
	Usage: rm [-fr] file ...
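
For what it's worth, the usage-string half of this can already be faked with
an ordinary shell function; the following is only a sketch of the flavor
(assuming /bin/rm is the real binary), not how a built-in proto would work:

	# sketch: front-end the real rm with a function that supplies the
	# missing-filename message and usage string
	rm()
	{
		if [ $# -eq 0 ]
		then
			echo "sh: rm: missing filename in argument 1" >&2
			echo "Usage: rm [-fr] file ..." >&2
			return 1
		fi
		/bin/rm "$@"
	}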

This is all nice and wonderful, but again, I had better have a way to
turn all of this nonsense off, such as:

	$ shutup

Whaddaya think?
-- 
Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation
UUCP: ...!uunet!bria!mike
--
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."

andy@Theory.Stanford.EDU (Andy Freeman) (01/22/91)

In article <365@bria> mike@bria.UUCP (Michael Stefanik) writes:
>In article <1991Jan17.185527.9824@Neon.Stanford.EDU> Theory.Stanford.EDU!andy (Andy Freeman) writes:
>>In article <360@bria> mike@bria.UUCP (Michael Stefanik) writes:
>>Besides, there's no particular reason that one can't include a
>>description of the argv arguments to the executable.  Then, the shell
>>can merely look at the program it is about to invoke to decide what it
>>(the shell) should do with the typed arguments before invoking the
>>program.
>
>This puts an unecessary burden on the shell and would degrade startup time
>on the image.  The shell would have to open the program file, seek to this
>mythical table, parse it, close the program, and then hand it to the OS, which
>would open the file again, etc.  This is *far* too much overhead for the
>minimal benefits that it would provide.

The most expensive resource on my system is the users, and that's true
of every site with less expensive computers than Los Alamos and
Livermore.  In any event, strings(1) is pretty fast and the argument
description can be preloaded.  (My shell has a program/command hash
table already, so it isn't unreasonable to preload.)
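
For example (the binary name and the @ARGS: marker are invented here, purely
to show the mechanism), the description could be fished out of the executable
without ever loading it:

	# hypothetical: the program carries a marker string describing its
	# arguments, and the shell scrapes it out with strings(1)
	$ strings /usr/local/bin/frob | sed -n 's/^@ARGS: //p'
	-v  -o outfile  file ...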

>>I'd love to have a shell that did something reasonable with arguments.
>>Instead, all I can have is a shell that assumes that all arguments
>>are filenames and expands them as filenames.
>>
>>The issue isn't whether or not I can pass "*" as an argument to a
>>program, it is what to do about arguments that aren't filenames.
>>Shells treat each and every argument as a file name.  If an argument
>>isn't a filename, there's no way to have the shell expand it
>>appropriately.
>
>I may be nitpicking, but I don't believe that the shell thinks that *all*
>arguments are filenames ... only those arguments that include special,
>well-known characters that are not quoted.  The shell doesn't actually
>give a dang what you are specifying as aguments.  Again, the shell follows
>a predefined set of rules.

The point is that the shell CAN'T do anything useful with arguments
that aren't filenames.  If one of the arguments to a program is
uniquely specified by the prefix "a", the shell can do something
useful if the argument happens to be a filename.  However, if it
isn't, either the user gets to type more or the program has to do its
own expansion.  In any event, the shell can't help a user know what
kind of arguments are expected or what will be done with them.

There's no question that the shell is "complete", the question is
whether or not it is a useful command processor.  I note that menu
systems are becoming more popular.  I doubt that this is because
people like moving their hands from the keyboard to a mouse and back,
but because the argument parsing and instantaneous help provided are
markedly superior to the best that bare unix can provide.

-andy
--
UUCP:    {arpa gateways, sun, decwrl, uunet, rutgers}!neon.stanford.edu!andy
ARPA:    andy@neon.stanford.edu
BELLNET: (415) 723-3088

pdsmith@bbn.com (Peter D. Smith) (01/24/91)

In article <1991Jan21.200544.29795@Neon.Stanford.EDU> andy@Theory.Stanford.EDU (Andy Freeman) writes:
>The point is that the shell CAN'T do anything useful with arguments
>that aren't filenames.  [...] In any event, the shell can't help a user
>know what kind of arguments are expected or what will be done with them.
>
>-andy
>--

VMS, the 'shell' of overwhelming choice on DEC platforms :-), can already
do this.  It's just a question of time before everyone and their kid
Standards Organization writes any number of incompatible versions for Unix.
No smileys here, I'm afraid.

					Peter D. Smith

PS - it's also intelligent-user-extensible.

jesup@cbmvax.commodore.com (Randell Jesup) (01/24/91)

In article <KENW.91Jan20003950@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>  Well, it might, if it were done that way.  The implementation with which
>I am familiar doesn't, and it works rather well.  The syntax of the command
>is described in a special "concise command language", which is compiled
>into a library, and which the shell keeps in shared memory.

	AmigaDos 2.0 has a similar ability (ReadArgs()).  It greatly helps
improve the consistency of user-interfaces (quick, which unix commands take
'-' parameters only before others, for no really good reason except easier
parsing?)  It's part of a shared library (as are most things).

>  This has a number of benefits, including _simplified_ software
>development (because the syntax of the commands can be decoupled from the
>code itself), flexibility (you can call the parser from within your program
>if you want to), flexibility (you can change the command syntax without
>changing the program -- you don't even need source!), efficiency (you don't
>even bother to load the program if the command doesn't follow the required
>syntax), and so on.

	Well, in AmigaDos it's called from within the program, but otherwise
you get similar effects (for example, you could extend the library call to
allow it to optionally open a window with widgets for playing with all the
available arguments, etc).  It can also be used to parse files, if the format
is reasonably close to a command-line format (which is keyword based on the
Amiga).

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (01/25/91)

In article <62271@bbn.BBN.COM> pdsmith@spca.bbn.com (Peter D. Smith) writes:

| VMS, the 'shell' of overwhelming choice on DEC platforms :-), can already
| do this.  It's just a question of time before everyone and their kid
| Standards Organization writes any number of incompatible versions for Unix.

  My impression of the way the VMS DCL works is that it's something like
ANSI C procedure prototypes, specifying the valid options and
positional parameters, what's required, what's optional, etc.

  This isn't totally unique to VMS, although the implementation is.
Using functions, getopts, and typeset in ksh, one can get arg and option
checking and other stuff. It's certainly not table driven, but the
functionality is there, and people use it.
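
A small sketch of the idiom (the function and its options are invented for
the example):

	# ksh: option and argument checking with getopts and typeset
	function archive
	{
		typeset verbose=0 dir=.
		while getopts vd: opt
		do
			case $opt in
			v)	verbose=1 ;;
			d)	dir=$OPTARG ;;
			*)	print -u2 "usage: archive [-v] [-d dir] file ..."
				return 2 ;;
			esac
		done
		shift $((OPTIND - 1))
		if [ $# -lt 1 ]
		then
			print -u2 "usage: archive [-v] [-d dir] file ..."
			return 2
		fi
		print "dir=$dir verbose=$verbose files: $*"
	}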
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  "I'll come home in one of two ways, the big parade or in a body bag.
   I prefer the former but I'll take the latter" -Sgt Marco Rodrigez

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/14/91)

In article <378@bria> mike@bria.UUCP writes:

>    This "special concise command language" is essentially what VMS has, and
>    what I argued against in the first place.  Consider:
>
>    1. It does *not* simplify development, because it adds yet another
>       link in the already lengthy chain of project management

  Depends how complex your syntax is.  In the case of ANU News (a PD network
news reader), for example, CDL plays an important role, and managing the
syntax without it would be awkward.  Simple programs would probably be
better off without it.

>    2. I *don't want* my command syntax decoupled from the program
>       because it tends to disrupt the continuity of the program,
>       and increases development and debugging time by adding "extras"
>       into the development phase.

  You gotta parse commands somehow.  The question, as I see it, is whether
the overhead of working with a more formal tool is too great for the size
of your project.  To me, that depends on how the tool is implemented.

  If you could specify the syntax of the command _within_ the program, that
would be nice, I suppose.  But what if it were in a subroutine call?  Well,
no big deal, I guess.  But what if that subroutine were in a different
language -- one optimized for command specification?  Now we're starting to
talk increased programmer overhead, and poor suitability for small projects.

>    3. I *don't want* to call a parser or have my arguments spoon-fed
>       to me in a fashion that some external entity (i.e., the shell)
>       decides that I should receive them.  I simply want a list of
>       arguments.  The whole VMS approach, as snazzy as it may look from
>       the outside, can be damn frustrating when you are tied to it.
>       I speak from personal experience.

  Uh, I beg your pardon, but the VMS approach does _not_ "tie you to it".
Command parsing can be done in several ways: as a separate command
definition which is provided to the shell; as a definition which is instead
compiled and linked to the program, to be called when needed; with no
definition at all wherein you do all your parsing with whatever kludge you
hack up; and variations in between.

  CDU even works for parsing commands entered WHILE THE PROGRAM IS RUNNING.
In programs with an extensive command syntax, any other method of parsing
would be major work.

>    4. The ability to change the command syntax externally of the program
>       violates the autonomy of the tool. If I want to change the syntax,
>       then I'll change the code.  The idea of giving an end user or
>       administrator the ability to change how tools interact "on the fly"
>       (as it were), gives me the willies.  This is just an approach that
>       begs for endless debugging opportunities.

  Seems to me the principles of user interface design expressly distinguish
between program functionality and command syntax.

  In any case, experience with this implementation has shown it not to be a
problem.  Users have no particular desire to confuse themselves by messing
with command syntax -- even those who realise it's possible.  And it's
sufficiently non-trivial that those who know how to do it generally realise
the consequences.

>    The only thing that I see of value is that the shell doesn't have to
>    load the image to know that there is a problem with the command
>    syntax.

  It's a pretty trivial advantage, really (surprise :-).

>    A compromise would be that the shell may *optionally* have a command
>    prototype in some shared memory hash table; but if the command is not in
>    the list, you let it go by.  I would also want to be able to turn this
>    "feature" off and on at will.  Thus, prototypes could be defined in
>    /etc/rc (or wherever), such as:
> 
> 	   proto rm 'fr' '(file ...)'

  Interesting...

>    Where the first argument is (of course) the command, the second argument
>    is the flags (with getopt() compatibility), followed by a syntax
>    description of some sort.  Some syntax handling (such as using rm to
>    remove a directory without the -r switch) should still be handled by
>    the tool in question. 
> 
>    I suppose that you could even go one step further, and define usage strings
>    for mistyped commands as well:
> 
> 	   usage rm 'rm [-fr] file ...'
> 
>    So that if the novice user would enter:
> 
> 	   $ rm
> 	   sh: rm: missing filename in argument 1
> 	   Usage: rm [-fr] file ...
> 
>    This is all nice and wonderful, but again, I had better have a way to
>    turn all of this nonsense off, such as:
> 
> 	   $ shutup
> 
>    Whaddaya think?
>    -- 
>    Michael Stefanik, Systems Engineer (JOAT), Briareus Corporation

  I think it's a good idea.  Worth building on.  And I very much agree with
your point about such a facility being optional.  A tool one can't avoid
using is a straitjacket -- kinda like Pascal :-^).  Tools will be used if
they are worthwhile.

--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca                                             R E S E A R C H
(403)297-2660                                                     C O U N C I L

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/14/91)

  A side note.

  If the command has a syntax known to the agent that does the globbing,
and the program that receives the command knows where the globbing has
taken (or should take) place, there can be a significant increase in
command flexibility and power.

  Ever notice that all Unix commands that support globbing only allow it at
the end of the command, and only allow one (possibly list) argument to be
globbed?  That's why you can't say

	mv here/* there/*

for example. 

  There are a number of ways to approach this, when command parsing is done
cooperatively between the program and the system.  Globbing could be done
before the program receives the command line, wherein it is passed a more
complex data object than a simple string or array of strings.  It could be
passed a reference, or a handle to a call-back routine -- in which case,
the actual globbing could occur at various times.  If the facility is
used to parse commands entered while the program is running, this provides
a handy capability.

  I sometimes think that shell-based command line globbing is a violation
of the Unix "one tool, one job" philosophy.  And a kludge, at that.  It
presumes to know something about the syntax desired by the program.
--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca                                             R E S E A R C H
(403)297-2660                                                     C O U N C I L

barmar@think.com (Barry Margolin) (02/14/91)

In article <KENW.91Feb13135544@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>  Ever notice that all Unix commands that support globbing only allow it at
>the end of the command, and only allow one (possibly list) argument to be
>globbed?  

I never noticed that, because it isn't true.  First of all, there's no such
thing as "all Unix commands that support globbing", since the globbing is
independent of the command.  More to the point, though, you can glob as
many arguments as you want, and they can be anywhere on the line.  For
instance, I've done things like

	find foo/* bar/* -name "*frob*" -print

>	   That's why you can't say
>
>	mv here/* there/*

You can say it, but it won't do what you probably hope.  The wildcards are
expanded, and then all the pathnames are passed to the mv command.  The
command doesn't see the original wildcards, so it can't treat them
specially.  The above command is exactly equivalent to

	mv here/file1 here/file2 here/file3 there/filea there/fileb

Your reference to "only at the end of the command" is probably
confusion about the fact that most Unix commands have syntax of the form

	command -options files

This is merely a common syntax convention; it makes it a little easier to
parse the arguments, because it can stop processing options as soon as it
sees the first argument that doesn't begin with "-", and then simply loop
over the remaining arguments.  The find command is a good counterexample.
Another example of a command where the globbed argument is likely not to be
the last one is mv: one of its allowed syntaxes is

	mv -options file file file ... directory

(this moves all the files into the directory).  It is quite common to use
wildcards in the file arguments, e.g.

	mv here/* there/* nowhere
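
As for the parse-options-then-loop convention itself, in shell-script form it
looks roughly like this (option letters arbitrary, sketch only):

	# scan leading -x options; stop at the first argument that doesn't
	# begin with "-", then simply loop over whatever is left
	force=0 recurse=0
	while [ $# -gt 0 ]
	do
		case $1 in
		-f)	force=1 ;;
		-r)	recurse=1 ;;
		-*)	echo "unknown option: $1" >&2; exit 1 ;;
		*)	break ;;
		esac
		shift
	done
	for file
	do
		echo "would process $file"
	done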
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

mike (02/14/91)

In an article, skyler.arc.ab.ca!kenw (Ken Wallewein) writes:
>
>Ever notice that all Unix commands that support globbing only allow it at
>the end of the command, and only allow one (possibly list) argument to be
>globbed? [...]

The shell does the globbing, not the command itself.

>That's why you can't say
>
>	mv here/* there/*
>
>for example. 

Sure you can, just as long as the last expanded filename is a directory.
This is probably not what you want, however. :-)

>[...]
>I sometimes think that shell-based command line globbing is a violation
>of the Unix "one tool, one job" philosophy.  And a kludge, at that.  It
>presumes to know something about the syntax desired by the program.

Actually, the presumption is that arguments are files.  If you would
prefer to not use globbing, then set noglob or use quotes.  If there was
no way to disable the shell's argument globbing, then yes, I'd agree
with you.   So, if you want to do what you really intended above, write
a program that does the globbing the way you want, and use:

	movem 'here/*' 'there/*'
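
A throwaway version of such a movem (the name and behaviour are invented
above; treat this as a sketch with no error checking) might be:

	#!/bin/sh
	# movem 'here/*' 'there/*': move each file matching the first
	# pattern to the same basename under the directory part of the
	# second pattern; the program, not the shell, does the globbing
	pat=$1
	dest=`dirname "$2"`
	for f in $pat			# deliberately unquoted: glob happens here
	do
		base=`basename "$f"`
		mv "$f" "$dest/$base"
	done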

The joy of UNIX.  To be free to be who you want to be. :-)
-- 
Michael Stefanik                       | Opinions stated are not even my own.
Systems Engineer, Briareus Corporation | UUCP: ...!uunet!bria!mike
-------------------------------------------------------------------------------
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/15/91)

In article <1991Feb14.024803.1252@Think.COM> barmar@think.com (Barry Margolin) writes:

> independent of the command.  More to the point, though, you can glob as
> many arguments as you want, and they can be anywhere on the line.  For
> instance, I've done things like
> 
> find foo/* bar/* -name "*frob*" -print

  Syntactically, "foo/* bar/*" is a single list argument.  Granted, though,
it wasn't the last argument on the line.  See below...

> > That's why you can't say
> >
> >	mv here/* there/*
> 
>    You can say it, but it won't do what you probably hope.  The wildcards are
>    expanded, and then all the pathnames are passed to the mv command.  The
>    command doesn't see the original wildcards, so it can't treat them
>    specially.  The above command is exactly equivalent to
> 
> 	   mv here/file1 here/file2 here/file3 there/filea there/fileb

  Thank you -- that was exactly my point.  'mv' has no way of knowing which
of the original arguments were globbed.  That is potentially valuable
information about the syntax of the original command which is lost to the
program.  

  I submit that if globbing were not done in such a way as to lose this
information, mv _would_ have been written to do what you correctly assumed
was my intention (it was, as intended, intuitively obvious).  And even if one
used quotes, mv -- and most other programs -- don't take advantage of that.
Presumably because the parsing would be difficult, which is precisely to
the point.

> Your reference to "only at the end of the command" is probably
> confusion about the fact that most Unix commands have syntax of the form
> 
>    command -options files

  Well, no, it wasn't.

> This is merely a common syntax convention; it makes it a little easier to
> parse the arguments, because it can stop processing options as soon as it
> sees the first argument that doesn't begin with "-", and then simply loop
> over the remaining arguments.  The find command is a good counterexample.
> Another example of a command where the globbed argument is likely not to be
> the last one is mv: one of its allowed syntaxes is
> 
>     mv -options file file file ... directory
> 
> (this moves all the files into the directory).  It is quite common to use
>  wildcards in the file arguments, e.g.
> 
> 	   mv here/* there/* nowhere
>    --
>    Barry Margolin, Thinking Machines Corp.

  All good points, Barry, and thank you for responding thoughtfully rather
than simply rushing to the rebuttal.

  I especially like your point that it is _possible_ to separate multiple
globbed arguments while working with a simple shell globber.  But I think
you have made my point, in that the sophisticated parser and awkward syntax
required to do so mean that it is only done in rare cases, and then not in
a consistent manner.  I think better parsing and command syntax definition
tools might go a long way to correcting this.

  On the other hand, am _I_ missing the whole point of the Unix Way to even
want this?  Certainly there are workarounds -- shell loops, etc.  The
question is, are those workarounds really the Right Way, the way of Truth,
Beauty, and Simplicity?  Or are they merely kludges?
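
For the record, the sort of loop workaround in question -- renaming every
file to a .old copy of itself, say, something no single mv command can
express -- is the familiar:

	for f in *
	do
		mv "$f" "$f.old"
	done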

--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca                                             R E S E A R C H
(403)297-2660                                                     C O U N C I L

barmar@think.com (Barry Margolin) (02/15/91)

In article <KENW.91Feb14160614@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>  On the other hand, am _I_ missing the whole point of the Unix Way to even
>want this?  Certainly there are workarounds -- shell loops, etc.  The
>question is, are those workarounds really the Right Way, the way of Truth,
>Beauty, and Simplicity?  Or are they merely kludges?

If you've read my past posts on the subject of globbing, you'd know that I
am in favor of more context-sensitive wildcard matching.  It would support
more flexible use of wildcards (the "mv * *.old" example), and prevent
globbing of non-filename arguments containing wildcard characters
(such as the pattern argument to grep).  I used to be a Multics programmer,
and Multics successfully used a globbing library that was called by
commands, rather than automatically globbing all arguments.

But I also understand and appreciate the Unix Way, even though I don't
agree with it.  There's a consistency and simplicity to it.  It guarantees
that all commands will treat wildcards equivalently; most other mechanisms
are susceptible to bugs due to the programmer failing to glob arguments, or
doing it incorrectly.  Variants of "mv" that support the above usage have
been implemented as shell scripts, merely requiring that the user quote the
arguments.  Finally, by globbing in the shell, users may switch shells in
order to get different globbing behavior (some shells have more elaborate
wildcard patterns than others); dynamic linking may allow similar
customization, but not many systems provide dynamic linking, and not all
programs are distributed as dynamically-linked binaries.

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

marc@marc.watson.ibm.com (Marc Auslander) (02/16/91)

In article <434@bria> mike@bria.UUCP (Michael Stefanik) writes:
...
>Actually, the presumption is that arguments are files.  If you would
>prefer to not use globbing, then set noglob or use quotes.  If there was
>no way to disable the shell's argument globbing, then yes, I'd agree
>with you....

The design of many unix commands assumes globbing, and thus they are
functionally deficient when it is turned off.  The shell's ability to
turn off globbing is in fact there for writing shell scripts, which must
deal with file names and not prematurely glob them.  I don't think
anyone would be happy using command mode without it.

--


Marc Auslander       <marc@ibm.com>

throopw@sheol.UUCP (Wayne Throop) (02/18/91)

> jesup@cbmvax.commodore.com (Randell Jesup)
> If you have a richer expression space [..for arguments..]
> (ala regexp), you end up having to
> do a LOT of quoting.

There is an alternative to this, and that is to make the shell language
lexically simpler and more (let me call it) generic, and context
sensitive.  The reason that shell globbing and regular expressions (to
name the common case of this) end up needing so much quoting is that
their syntaxes overlap.  If the shell didn't react so aggressively to
so many special characters, less quoting would need be done.
--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

jesup@cbmvax.commodore.com (Randell Jesup) (02/18/91)


In article <KENW.91Feb13135544@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>  Ever notice that all Unix commands that support globbing only allow it at
>the end of the command, and only allow one (possibly list) argument to be
>globbed?  That's why you can't say
>
>	mv here/* there/*

	This is the classic problem with shell-provided globbing (it's what forces
you to use escapes to keep the shell from globbing things that aren't filenames).
If you have a richer expression space (ala regexp), you end up having to
do a LOT of quoting.
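
A small everyday case of the quoting tax:

	grep 'if.*goto' *.c	# quoted: the regexp reaches grep untouched
	grep if.*goto *.c	# unquoted: the shell tries to expand the
				# pattern as filenames first, with results
				# that depend on what happens to match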

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

jesup@cbmvax.commodore.com (Randell Jesup) (02/21/91)

In article <3210@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  Or write a shell which does nothing but start the process and pass the
>arguments... Of course you could just turn globbing off and get the same
>effect. The nice thing is that someone can just go ahead and do this if
>they want. We have sh, csh, ksh (AT&T and PD versions), vcsh, bash,
>clam, and room for more if people think they need them,

	The problem is that none of the unix utilities do their own
globbing: it wasn't a function in the kernel, and they could always assume
the shell did it.  For example, in AmigaDos (and Stratus VOS), that isn't
the case: there are standard file-pattern-matching and argument parsing
routines that are called by the program.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

bernie@metapro.DIALix.oz.au (Bernd Felsche) (02/25/91)

In <19062@cbmvax.commodore.com>
   jesup@cbmvax.commodore.com (Randell Jesup) writes:

>	This is the classic problem with shell-provided globbing (it's what forces
>you to use escapes to keep the shell from globbing things that aren't filenames).
>If you have a richer expression space (ala regexp), you end up having to
>do a LOT of quoting.

You don't have to escape it ad nauseam. A rich set of escapes can
potentially escape varying levels of glob patterns. e.g. UNIX's \'"
quoting.

Also, the ability to turn off globbing, and later to turn it back on,
is essential to maintain one's sanity. I'd be insane by now if that
wasn't possible!
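
For reference, the toggles are one-liners (sh and ksh shown; csh spells it
"set noglob" and "unset noglob"):

	set -f			# filename generation off
	echo *			# now prints a literal *
	set +f			# and back on again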

It's been mentioned previously that early UNIX shells did not have
globbing built in, and called a 'glob' program for the job.

IMHO, it makes more sense to glob in the shell, than in the program.

It saves typing, especially when one uses globbing to specify the
program name. :-)

-- 
Bernd Felsche,                 _--_|\   #include <std/disclaimer.h>
Metapro Systems,              / sale \  Fax:   +61 9 472 3337
328 Albany Highway,           \_.--._/  Phone: +61 9 362 9355
Victoria Park,  Western Australia   v   Email: bernie@metapro.DIALix.oz.au

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/26/91)

> In article <1991Feb25.052212.2338@metapro.DIALix.oz.au> bernie@metapro.DIALix.oz.au (Bernd Felsche) writes:
> 
> In <19062@cbmvax.commodore.com>
>    jesup@cbmvax.commodore.com (Randell Jesup) writes:
> 
> >	This is the classic problem with shell-provided globbing (it's what forces
> >you to use escapes to keep the shell from globbing things that aren't filenames).
> >If you have a richer expression space (ala regexp), you end up having to
> >do a LOT of quoting.
> 
> You don't have to escape it ad nauseam. A rich set of escapes can
> potentially escape varying levels of glob patterns. e.g. UNIX's \'"
> quoting.

  Sounds pretty confusing.  And as you pass through levels, each level
loses syntactical information that was available to the previous level.
It can be difficult syntactically (read "impossible") to distinguish
between expanded formerly-globbed arguments and multiple separate
arguments.  That's why some Unix commands which support a (possibly
globbed) filespec followed by a directory spec as the last argument can't
handle a globbed argument in that position.

  To me, loss of syntactical information is just as significant a problem as
"escape-nausea" (:-) with shell- or preprocessor-based globbing.  Some sort
of quoting or other syntactical convention could potentially bypass this
problem, making it possible to tell where the original argument was and how
much of the command line it comprised.  However, it would significantly
increase the level of program-level parsing required.

  As a trivial but amusing example, the other day I had a file whose name
started with '-'.  There was no way to tell programs which expect shell
globbing that "this is not a command option; this is a filename".

  Ambiguity...  is sometimes the price we pay for generality.  Globbing is
handy, but it sure ain't perfect.  Methinks it could be done better.
--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca  <-- replies (if mailed) here, please       R E S E A R C H
(403)297-2660                                                     C O U N C I L

tif@doorstop.austin.ibm.com (Paul Chamberlain) (02/27/91)

In article <KENW.91Feb25170431@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>  As a trivial but amusing example, the other day I had a file whose name
>started with '-'.  There was no way to tell programs which expect shell
>globbing that "this is not a command option; this is a filename".

This seems to be a trivial and slightly amusing example of the problems
of standardizing switch notation with a legal filename character.  Anyone
that understands the concepts of switch parsing would know that this has
nothing to do with globbing.

The current way of doing this in Unix is almost always intuitive and
is close to infinitely flexible.  A basic knowledge of simple quoting,
globbing behavior, and switch parsing goes a long way.

Paul Chamberlain | I do NOT speak for IBM.          IBM VNET: PAULCC AT AUSTIN
512/838-9662     | ...!cs.utexas.edu!ibmchs!auschs!doorstop.austin.ibm.com!tif

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/27/91)

In article <5615@awdprime.UUCP> tif@doorstop.austin.ibm.com (Paul Chamberlain) writes:

> In article <KENW.91Feb25170431@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
> >  As a trivial but amusing example, the other day I had a file whose name
> >started with '-'.  There was no way to tell programs which expect shell
> >globbing that "this is not a command option; this is a filename".
> 
> This seems to be a trivial and slightly amusing example of the problems
> of standardizing switch notation with a legal filename character.  Anyone
> that understands the concepts of switch parsing would know that this has
> nothing to do with globbing.

  It has _everything_ to do with globbing.  Certainly, if "-" wasn't a
valid filename character, the parser could use that as a parsing guide.
But even then, it shouldn't matter.

> The current way of doing this in Unix is almost always intuitive and
> is close to infinitely flexible.  A basic knowledge of simple quoting,
> globbing behavior, and switch parsing goes a long way.

  You are referring to the use of a './' prefix, I presume?  I don't find
that intuitive.  The existence of a hack workaround does not invalidate my
point.

  Because globbing changes the command line before the program sees it, the
program has _no way_ of determining the syntax of the original command.
The program must _assume_ that anything it sees in the command that _looks_
like a command option _is_ a command option.  It has no way of knowing
whether that "option" was entered by the user, or is actually a filename.
It must make an assumption which is _usually_ correct.

  Sure, one can apply escaping and quoting, etc.  But programs which rely
on shell globbing generally can't handle such arguments anyway.

  Consider another case (what the heck, I'm brave :-)

	> mv cmp* compress*

as a file rename operation, without shell hacks?  Same problem: how can the
program parse it properly?  It has simply no way to know what the original
command was.  It would see two lists of "deglobbed" filenames, with
no way to know which was which, or even that it wasn't a single list.
Sure, mv can check the last one to see if it is a directory, but that's an
inelegant kludge.

  Handling such a command requires either preventing globbing of the
original command line so that the program (script or otherwise) can see the
original version, or some "intuitive" syntactical trick such as interposing
a "place-holder" like "-" between the two arguments, and hoping there
aren't any files by that name.

  Don't get me wrong -- I'm not saying globbing is a bad idea.  It's slick.
But command line preprocessing which removes syntactical information _does_
sometimes make it hard to be unambiguous.  

--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca  <-- replies (if mailed) here, please       R E S E A R C H
(403)297-2660                                                     C O U N C I L

peter@ficc.ferranti.com (Peter da Silva) (02/27/91)

In article <5615@awdprime.UUCP> tif@doorstop.austin.ibm.com (Paul Chamberlain) writes:
> This seems to be a trivial and slightly amusing example of the problems
> of standardizing switch notation with a legal filename character.

Better than standardizing on unadorned keywords for switches. Try deleting
a file named "all" on AmigaDOS.

The problem here is that there are no spare characters you can type into the
shell that aren't legal in UNIX file names. This is true on many recent
operating systems, as the advantages of files named "Joe's Report Part #1"
have become obvious.

(what does this have to do with comp.arch?)
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

mike (02/28/91)

In an article, skyler.arc.ab.ca!kenw (Ken Wallewein) writes:
>> This seems to be a trivial and slightly amusing example of the problems
>> of standardizing switch notation with a legal filename character.  Anyone
>> that understands the concepts of switch parsing would know that this has
>> nothing to do with globbing.
>
>  It has _everything_ to do with globbing.  Certainly, if "-" wasn't a
>valid filename character, the parser could use that as a parsing guide.
>But even then, it shouldn't matter.

Wait a second here.  The dash is not part of globbing, per se (no cruft
about the brackets, now.  We all *know* that, don't we?)

The shell globs filenames.  Beyond that, it makes no decisions for the
program.  It does not re-order arguments, it does not qualify switches,
etc.  The way the program handles the dash has absolutely nothing to do
with the shell; it has to do with getopt() and whether or not the program
in question chooses to use it.

Mixeth not apples with thine oranges.

>> The current way of doing this in Unix is almost always intuitive and
>> is close to infinitely flexible.  A basic knowledge of simple quoting,
>> globbing behavior, and switch parsing goes a long way.
>
>  You are referring to the use of a './' prefix, I presume?  I don't find
>that intuitive.  The existence of a hack workaround does not invalidate my
>point.

Ahem.  "Hack workaround"?  I am absolutely sick to death of every gripe
that someone has with the shell and their innumerable references to "hacks".
If you don't like the way the shell does it, then roll your own.  In UNIX,
you see, you're free to do that.

People who groan about the complexity of UNIX, and all of the "hack
workarounds" deserve DOS.  Go play in that sandbox awhile.  You'll come
crawling back to UNIX soon enough.

>  Sure, one can apply escaping and quoting, etc.  But programs which rely
>on shell globbing generally can't handle such arguments anyway.
>
>  Consider another case (what the heck, I'm brave :-)
>
>	> mv cmp* compress*

Please.  Not *this* example *again*.  Yawwwwwwwwn.
Again, roll your own if you don't like it the way it is.

>  Don't get me wrong -- I'm not saying globbing is a bad idea.  It's slick.

Damn right it is.

-- 
Michael Stefanik, MGI Inc., Los Angeles| Opinions stated are not even my own.
Title of the week: Systems Engineer    | UUCP: ...!uunet!bria!mike
-------------------------------------------------------------------------------
Remember folks: If you can't flame MS-DOS, then what _can_ you flame?

kenw@skyler.arc.ab.ca (Ken Wallewein) (02/28/91)

In article <488@bria>...:

> >  It has _everything_ to do with globbing.  Certainly, if "-" wasn't a
> >valid filename character, the parser could use that as a parsing guide.
> >But even then, it shouldn't matter.
> 
> Wait a second here.  The dash is not part of globbing, per se (no cruft
> about the brackets, now.  We all *know* that, don't we?)
> 
> The shell globs filenames.  Beyond that, it makes no decisions for the
> program.  It does not re-order arguments, it does not qualify switches,
> etc.  The way the program handles the dash has absolutely nothing to do
> with the shell; it has to do with getopt() and whether or not the program
> in question chooses to use it.
> 
> Mixeth not apples with thine oranges.

** Flame on **

  I'm getting tired of this.  I try to make a reasonable point clearly,
and some Unix acolyte with blinkers steadfastly refuses to even try to
understand it.  Reminds me of Saddam Hussein claiming victory.

  Read my lips, or whatever. I am not attacking Unix, and I am not
attacking shell globbing.  I am simply trying to point out an apparently
rather subtle limitation of common implementations of command line
preprocessing -- that shell globbing tends to hide potentially useful
information about the users' intentions, and places restrictions on
one's choice of command line syntax.

  I think most of those who haven't already killed this thread understand
that.  I've said enough; I'm not going to pursue this thread any further.

** Flame off ** 

>   People who groan about the complexity of UNIX, and all of the "hack
> workarounds" deserve DOS.  Go play in that sandbox awhile.  You'll come
> crawling back to UNIX soon enough.

  ... aw, hell, that comment doesn't belong here, and it's not worth
replying to.  I'm not sure why I've bothered replying to the message at
all.
--
/kenw

Ken Wallewein                                                     A L B E R T A
kenw@noah.arc.ab.ca  <-- replies (if mailed) here, please       R E S E A R C H
(403)297-2660                                                     C O U N C I L

peter@ficc.ferranti.com (Peter da Silva) (03/01/91)

In article <KENW.91Feb25170431@skyler.arc.ab.ca>, kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>   As a trivial but amusing example, the other day I had a file whose name
> started with '-'.  There was no way to tell programs which expect shell
> globbing that "this is not a command option; this is a filename".

	command options-and-arguments -- filenames-only

If the command wasn't written to support this syntax, blame that
particular program: this has been a standard for years. There are badly
written programs in program-does-globbing systems too.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

Making programs do globbing is like making them handle expose events.

bernie@metapro.DIALix.oz.au (Bernd Felsche) (03/03/91)

In <KENW.91Feb26134115@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca
	(Ken Wallewein) writes:

>In article <5615@awdprime.UUCP> tif@doorstop.austin.ibm.com (Paul Chamberlain) writes:

>> In article <KENW.91Feb25170431@skyler.arc.ab.ca> kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>> >  As a trivial but amusing example, the other day I had a file whose name
>> >started with '-'.  There was no way to tell programs which expect shell
>> >globbing that "this is not a command option; this is a filename".
>> 
>> This seems to be a trivial and slightly amusing example of the problems
>> of standardizing switch notation with a legal filename character.  Anyone
>> that understands the concepts of switch parsing would know that this has
>> nothing to do with globbing.

>  It has _everything_ to do with globbing.  Certainly, if "-" wasn't a
>valid filename character, the parser could use that as a parsing guide.
>But even then, it shouldn't matter.

There is a convention that a "--" terminates command options. This is
how you can do an "rm -- -ha.ha" without rm complaining.

>  Because globbing changes the command line before the program sees it, the
>program has _no way_ of determining the syntax of the original command.
>The program must _assume_ that anything it sees in the command that _looks_
>like a command option _is_ a command option.  It has no way of knowing
>whether that "option" was entered by the user, or is actually a filename.
>It must make an assumption which is _usually_ correct.

Again, read the convention explained above. This convention is
explained in INTRO(1) of most UNIX manuals.
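
In a script built on getopts (where available), the convention comes for
free: getopts stops at "--" and steps over it, so whatever follows is left
as an ordinary argument.  A toy sketch:

	while getopts fr opt		# toy options -f and -r
	do
		case $opt in
		f)	force=1 ;;
		r)	recurse=1 ;;
		esac
	done
	shift `expr $OPTIND - 1`
	echo "filenames: $*"		# e.g. invoked as: script -f -- -ha.ha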

>  Sure, one can apply escaping and quoting, etc.  But programs which rely
>on shell globbing generally can't handle such arguments anyway.

>  Consider another case (what the heck, I'm brave :-)

>	> mv cmp* compress*

>as a file rename operation, without shell hacks?  Same problem: how can the
>program parse it properly?  It has simply no way to know what the original
>command was.  It would see two lists of "deglobbed" filenames, with
>no way to know which was which, or even that it wasn't a single list.
>Sure, mv can check the last one to see if it is a directory, but that's an
>inelegant kludge.

Not that mv does this now, but globbing would pass the unaltered
pattern to the program as an argument, if there are no matches. i.e.
if there are no compress* files, then mv could be clever about it,
because it would be passed the pattern compress* as an argument.

There could be highly amusing side effects, if the patterns are
matched, so the current implementation is probably safest.

>  Handling such a command requires either preventing globbing of the
>original command line so that the program (script or otherwise) can see the
>original version, or some "intuitive" syntactical trick such as interposing
>a "place-holder" like "-" between the two arguments, and hoping there
>aren't any files by that name.

As I mentioned before, all the shells I know of allow you to turn off
globbing.
-- 
Bernd Felsche,                 _--_|\   #include <std/disclaimer.h>
Metapro Systems,              / sale \  Fax:   +61 9 472 3337
328 Albany Highway,           \_.--._/  Phone: +61 9 362 9355
Victoria Park,  Western Australia   v   Email: bernie@metapro.DIALix.oz.au

srg@quick.com (Spencer Garrett) (03/03/91)

In article <KENW.91Feb25170431@skyler.arc.ab.ca>,
	kenw@skyler.arc.ab.ca (Ken Wallewein) writes:

	(ken complains about Unix globbing)

>   As a trivial but amusing example, the other day I had a file whose name
> started with '-'.  There was no way to tell programs which expect shell
> globbing that "this is not a command option; this is a filename".

But this has nothing to do with globbing.  The filename/option ambiguity
is the same however the arglist gets generated.  There's an easy way
around it, however.  For any filename "-foo" that exhibits the ambiguity,
the string "./-foo" is equivalent and avoids the ambiguity.

>   Ambiguity...  is sometimes the price we pay for generality.  Globbing is
> handy, but it sure ain't perfect.  Methinks it could be done better.

We'd love to hear how.  Having each program perform its own globbing in a
different way isn't it, though.  That isn't new and it isn't better.

firth@sei.cmu.edu (Robert Firth) (03/04/91)

In article <1991Mar3.020451.5596@metapro.DIALix.oz.au> bernie@metapro.DIALix.oz.au (Bernd Felsche) writes:

>There is a convention that a "--" terminates command options. This is
>how you can do an "rm -- -ha.ha" without rm complaining.

As seen, for example, in this little fragment:

	%rm -- -ha.ha
	rm: unknown option --

Gak.

jesup@cbmvax.commodore.com (Randell Jesup) (03/05/91)

In article <1991Mar3.020451.5596@metapro.DIALix.oz.au> bernie@metapro.DIALix.oz.au (Bernd Felsche) writes:
>There is a convention that a "--" terminates command options. This is
>how you can do an "rm -- -ha.ha" without rm complaining.
...
>Again, read the convention explained above. This convention is
>explained in INTRO(1) of most UNIX manuals.

	Unfortunately, the user must know in advance if a glob will result in
-xxx filenames BEFORE invoking the command, or must always use -- on all
command lines.  What would happen if someone created a file named -r in a
directory where you often delete some files with *, or *.c, or whatever?
Or -rf?  There are all sorts of other nasty possibilities that you can't know
about unless you do the globbing once to see which files it matches.

	BTW, according to the man page I get for rm, it doesn't follow
your standard: it wants rm - -ha.ha.

>Not that mv does this now, but globbing would pass the unaltered
>pattern to the program as an argument, if there are no matches. i.e.
>if there are no compress* files, then mv could be clever about it,
>because it would be passed the pattern compress* as an argument.

	What if there was a match?  The point is that it's an output spec,
not an input one, and therefore shouldn't be matched against existing files.
You could design an interface where the shell globbed, but preserved the
original argument so the program could pick.  That means you may spend a lot
of cycles/disk-accesses to glob arguments that don't need it.

>As I mentioned before, all the shells I know of, allow you to turn off
>globbing.

	An implicit admission that shell globbing can get in the way at times.
It's painful to turn it off just because you want to run a specific command.
Also, we're talking more about which design is better (on their own merits),
not whether you can at this date retrofit Unix to have this be the
default.  That's why we're not discussing this in comp.unix.shells.
(BTW, I'm sending followups to comp.os.misc).

	You can get around most any annoyance with enough work, quoting, etc.
However, which is better for the user?  This is why I reject the arguments
that making calls to a system globbing function is too much work for 
programmers: yes, it is (a very little) more work.  However, I think it
produces a far better and easier to use interface for the user of the
system.  It also makes it easier to have a richer globbing language (more than
merely * and ?, you can have more RE's - but if you do shell-globbing, you'll
end up with command lines that look a bit like Lisp code because of all the
quoting and escaping).
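
A familiar taste of that quoting burden is find(1), which matches names
itself and therefore needs its pattern protected from the shell:

	find . -name '*.c' -print	# quoted: find sees the pattern *.c
	find . -name *.c -print		# unquoted: the shell may expand it first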

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/05/91)

In article <KENW.91Feb28130436@skyler.arc.ab.ca>, kenw@skyler.arc.ab.ca (Ken Wallewein) writes:
>   Read my lips, or whatever. I am not attacking Unix, and I am not
> attacking shell globbing.

Reading your keystrokes, it came across as a rather strong attack on UNIX
and on file-name substitution (which is what the shell manuals call it
these days; shell globbing sounds as though the result of this processing
stage is a collection of shells).

> I am simply trying to point out an apparently
> rather subtle limitation of common implementations of command line
> preprocessing -- that shell globbing tends to hide potentially useful
> information about the users' intentions, and places restrictions on
> one's choice of command line syntax.

(a) Filename globbing places no real restrictions on one's choice of
    command line syntax.  In every shell I have used, you can switch
    it off.  (Place the right magic in your .profile or .cshrc, shown
    after this list, and you'll never see globbing ever again.)

(b) There is nothing to stop any program doing its own globbing.
    How, when there isn't any FIND_FIRST/FIND_NEXT in the library?
    It is very easy to do with system().

(c) It doesn't matter *what* shell you use, you are going to have
    subtle limitations.  Try to have your own command called "if".
    (A limitation that _does_ come up in practice:  try to call a
    program of yours "test".)

(d) *Which* user's intentions?  (By the way, I have been bitten
    *badly* by the equivalent of "mv foo* baz*" on non-UNIX systems,
    so much so that I now refuse to do it on any system, unless it
    will let me verify each separate move.)  How is the shell
    supposed to read people's minds?

(e) Whether you agree with Dan Bernstein about how to provide portable
    high-level access to machine-specific instructions (eh?) or not,
    he has proposed a specific technique for doing it.  What would you
    put in the place of file-name substitution?  (I have used Burroughs
    WFL, DEC's DCL, Prime's whatever-they-called it, VM/CMS, and a
    little JCL.  I'm actually rather fond of WFL, but I'll take sh any
    day.)  How about TCL -- available over the net; could you get what
    you want by starting from that?
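
The "magic" mentioned in (a) above would be something like the following
(a sketch; the exact spelling depends on your shell):

	set -f		# Bourne shell, e.g. in .profile: disable file name generation
	set noglob	# csh, e.g. in .cshrc: same effect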

-- 
The purpose of advertising is to destroy the freedom of the market.

boyd@necisa.ho.necisa.oz.au (Boyd Roberts) (03/06/91)

In article <21884@as0c.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:
>
>As seen, for example, in this little fragment:
>
>	%rm -- -ha.ha
>	rm: unknown option --

It depends whether your rm(1) uses that abortion getopt(3).  System V.2.2 rm(1)
certainly does.  However, I wouldn't say that all versions of rm(1) would.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

gsarff@meph.UUCP (Gary Sarff) (03/12/91)

In article <488@bria>, lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>In an article, skyler.arc.ab.ca!kenw (Ken Wallewein) writes:
>
>>  Sure, one can apply escaping and quoting, etc.  But programs which rely
>>on shell globbing generally can't handle such arguments anyway.
>>
>>  Consider another case (what the heck, I'm brave :-)
>>
>>	> mv cmp* compress*
>
>Please.  Not *this* example *again*.  Yawwwwwwwwn.
>Again, roll your own if you don't like it the way it is.
>

I don't know about the person posting above (that was D. Lindsay, not
K. Wallewein), but there are actually people who have to use Unix who are
not programmers and are not capable of rolling their own.  I assume
K. Wallewein probably is capable; I am capable, I have written a shell and
fixed up utilities to make things work more like the other OS I develop
for, but some people are not.  The point of this posting, though, is that
I am struck by D. Lindsay's statement, "not *this* example *again*".
Think about it, think about the "again" part.  That means it's common,
isn't it?  Many people say this, or want it, so might we wish to enquire
why?  Maybe because it seems intuitive to them?  After becoming a bit
familiar with wildcards and living in the shell for a while, they expect,
mistakenly but I believe with some little justification, things like the
mv example above to work.  They want this facility, and all they get from
people, as from D. Lindsay, is "go roll your own, I'm not interested."  I
thought that _our_ job (our being the technical people among us) is to
help our users, make their lives easier, and help them get their work
done.  Not to force them into a mold that they don't fit just because it
is frozen into a "standard" and it would be sacrilege to change it.

---------------------------------------------------------------------------
Do memory page swapping to floppies?, I said, yes we can do that, but you 
haven't lived until you see our machine do swapping over a 1200 Baud modem
line, and keep on ticking.
     ..uplherc!wicat!sarek!gsarff

throopw@sheol.UUCP (Wayne Throop) (03/18/91)

> gsarff@meph.UUCP (Gary Sarff)
>>	> mv cmp* compress*
>>Please.  Not *this* example *again*.  Yawwwwwwwwn.
> I am struck by D. Lindsay's statement, "not *this* example *again*"
> Think about it, think about the "again" part.  That means it's common,
> isn't it? Many people say this, or want it, so might we wish to enquire
> why? Maybe because it seems intuitive to them?

Just because something is "intuitive" doesn't mean it's right, reasonable,
or desirable.  For example, just because most people find Aristotelian
physics more "intuitive" than Newtonian physics doesn't mean that
we should use Aristotelian models to explain things to people.

Thus, the common "thinko"% that "mv *.x *.y" ought to do something
sensible doesn't mean that we should rush right out and supply
"the obvious" meaning to that construction.

Having said that, I agree that renaming groups of files is a common
enough task that *some* tool (whether mv or not) ought to be around to
handle it.  And I even agree that something like "mmv '*.x' '*.y'"
might well be a good syntax to support it%%.  But note that it would HAVE
to be quoted in anything like the current shells, because the second
'*' character is in NO way a wildcard expansion in this context. 
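
To see why, compare what the hypothetical mmv would receive with and
without the quotes:

	mmv '*.x' '*.y'		# mmv itself sees the two patterns
	mmv *.x *.y		# the shell expands *.x first; mmv sees a
				# list of existing names, not a pattern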

---

%  This common thinko comes from a distinct misfeature in current shell
   argument syntax, IMO.  Specifically, it is not lexically apparent how
   the positional argument binding is being done.  This leads to the
   sub-thinko that each *.x corresponds to some *.y as is lexically
   apparent in the unexpanded form (but not in the shell-based expanded
   form).   The rest of the thinko is a result of over-generalizing the
   meaning of the "*" to take the role of "&" (or perhaps \N) in Unix
   regular expressions.
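
   In ed/sed substitutions "&" and \1 really do carry the matched text
   into the replacement, which is roughly what the thinko expects the
   second "*" to do; a rough sketch:

	echo foo.x | sed 's/\(.*\)\.x$/\1.y/'	# prints: foo.y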

   But again, the fact that the thinko is common doesn't mean it
   ought to be catered to or enshrined.

%% I say *might* be... it would be nice to have (at least) the full power
   of ed-like regular expressions to express the renames, but it isn't
   clear how this can be reconciled with the goal of making the
   simple cases easy to express.  In any event, with the shell
   "promiscuously globbing" as it usually does, the quotes would
   be necessary, because what's going on is NOT wildcard expansion.
--
Wayne Throop  ...!mcnc!dg-rtp!sheol!throopw

drh@duke.cs.duke.edu (D. Richard Hipp) (03/18/91)

In article <1407@sheol.UUCP> throopw@sheol.UUCP (Wayne Throop) writes:
>...[R]enaming groups of files is a common
>enough task that *some* tool (whether mv or not) ought to be around to
>handle it.

Try this from the Bourne shell (or equivalent):

ls *.x | sed -e 's/\.x$//' | while read name; do mv "$name.x" "$name.y"; done

A similar idiom will handle most problems that Mr. Throop describes.

Why is this in comp.arch?