[net.cog-eng] standards, tools, history: options=?

gsp@circe.UUCP (Gary Perlman) (03/24/84)

Historically, development on UNIX was a public activity and universities
had near immediate access to work done at Bell Labs.  This is no longer true.
For something to get outside, it has to be approved by several levels of
management of several organizations in several companies.  This looks like
it will take several years on average.  My personal impression is that
there is some movement to release software supporting a standard UNIX
command line syntax.  Getopt, the parser referred to by several people
was a first attempt at this, but it was never released.  My conclusion is
that the worst way to put forth a standard is to keep it a secret.  My hope
is that Ralph Hill's wish of supporting software gets answered by the
powers that be.

Further elaborating on Hill's recent note, there is some consideration
of building command line parsing into the shell.  His suggestion is
full of merits, one of which not mentioned was program size and efficiency.
If the shell could have a parser added to it (adding pK text to the shell),
then some proportion of that could be recovered from many programs.
Further, if the shell knew that a command line call to a program was invalid,
it would not have to fork and exec the program.  There are many other
tradeoffs, but the full discussion is in progress.
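
To make the fork-and-exec point concrete, here is a minimal sketch of a
shell-side check. It is an illustration only, not something any shell does:
the OPTION_SPECS table and the check_before_exec name are hypothetical, and
Python's standard getopt module stands in for the C parser.

```python
import getopt

# Hypothetical table a shell might keep: command name -> getopt-style spec.
# A letter followed by ":" takes a value.  The entries are illustrative only.
OPTION_SPECS = {
    "sort": "nru",
    "cc": "co:O",
}

def check_before_exec(command, args):
    """Return None if args parse cleanly for command, else an error message."""
    spec = OPTION_SPECS.get(command)
    if spec is None:
        return None  # unknown command: leave parsing to the program itself
    try:
        getopt.getopt(args, spec)
    except getopt.GetoptError as err:
        return "%s: %s" % (command, err.msg)
    return None

# A shell would fork and exec only when the check returns None.
print(check_before_exec("sort", ["-n", "file"]))
print(check_before_exec("sort", ["-x", "file"]))
```

The second call reports an error without any process being created, which
is the saving described above.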

Also of substance is the proposal by Ian! D. Allen (-IAN! -> IAN=!)
that name=value more closely visually binds name and value than -name value.
I think he is right, Gestalt psychologists would claim he is right,
and there might even be some data supporting this, though I am not sure where.
I think it would be useful to bring up a distinction between ease of
comprehension and ease of expression in an artificial language like UNIX
syntax.  If name=value format is more easily comprehended by people,
it may still be the case that -n value format is just as easily expressed,
maybe even more so given the ability to bind single character names.
Most people are concerned with using UNIX syntax to express a command;
usually people do not read other users' commands.  An important exception to
this is when people are learning, either from watching others or reading
the UNIX manual.  The ability and preference of people to learn by example
makes the comprehension of command syntax important.  The example of APL
being highly expressive and incomprehensible comes to mind.  Allen's example
of flag options eating up following arguments when misused is a valid comment.
If the `c' flag takes no value, then:
	c=value
would be easy to catch, but
	-c value
would not.  Further, if `c' takes a value, but none is supplied,
	-c -f ...
would cause `-f' to be eaten as the value.  The fix proposed by Hemenway
and Armitage for this is that option values are not optional.  While this
is not a solution (people will make mistakes and these should be detected),
there is some evidence that standard command line parsers will have type and
range checking so that obvious errors would be detected.
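
The eaten-argument case can be reproduced with a getopt-style parser. The
sketch below uses Python's standard getopt module as a stand-in; the
checked_int helper is hypothetical and only suggests what type and range
checking might look like, not what any standard parser will provide.

```python
import getopt

# `c' takes a value (the "c:" in the spec), but the user forgot to supply it.
opts, operands = getopt.getopt(["-c", "-f", "file"], "c:f")
print(opts, operands)   # `-f' has been taken as the value of `-c'

def checked_int(name, text, low, high):
    """Convert text to an int, rejecting non-numbers and out-of-range values."""
    try:
        value = int(text)
    except ValueError:
        raise ValueError("%s: expected a number, got %r" % (name, text)) from None
    if not low <= value <= high:
        raise ValueError("%s: %d not in range %d..%d" % (name, value, low, high))
    return value
```

If `c' were declared numeric, passing the eaten "-f" through checked_int
would raise an error instead of being silently accepted.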

The argument (pardon the pun) about options (ugh) for options could be
volleyed back and forth until we are blue in the face.  I think most of
the major issues have been hashed, and they are good for people to know.
The final arbitrator is probably way up in AT&T for several reasons:
	1.  AT&T is in business and has paying customers for UNIX.
	2.  AT&T will support them by not pulling the rug out from under them.
	3.  AT&T will continue to be the largest single developer of UNIX tools.
	4.  All users, including universities, will want the improved tools.
	5.  Therefore, all users will accept the tools in AT&T's form, or
		spend their time reworking the tools to fit their own standards.
These seem to be the facts of business life, at least as I understand them.
My own opinion is that if someone were going to design a command language
from scratch, the flag system would not be the one chosen, and probably
the name=value system would.  But a business has to live with its history.
And anyways, you wouldn't want to give up bundling options, would you?

Maybe the UNIX license should be changed from:
	UNIX -- live free or die
to
	UNIX -- love it or leave it

Gary Perlman	Bell Labs MH 5D-105	ulysses!gsp	x3624

guy@rlgvax.UUCP (Guy Harris) (03/25/84)

> Getopt, the parser referred to by several people was a first attempt at
> this, but it was never released.

Gee, our System III and System V manuals have a page GETOPT(3C) describing
a routine that resembles the proposed AT&T standard an *awful* lot, and
they also have a manual page GETOPT(1) for a command which lets shell
files use this - hey, we even got the *source* to them on our S3 and S5
distributions!

In other words, a parser that implements 90% of the standard is available
to anyone with a license for S3 or any later release - it's on your tape.
It implements the following rules of the standard:

- rule 3 (one-letter option names, which is a historical crock - note "f77"'s
	"onetrip" option (which is in itself a historical crock forced on us
	by implementors of pre-F77 Fortran compilers, but we won't get into
	that))

- rule 4 (options must be delimited by "-")

- rule 5 (options with no arguments may be grouped behind one delimiter - this
	is the history that forces rule 3)

- rule 7 (option arguments cannot be optional)

- rule 8, partially (groups of option-arguments following an option must
	be separated by commas or separated by white space and quoted)

- rule 9 (all options precede operands on the command line)

- rule 10 ("--" may be used to delimit the end of the options)

Rule 6 (the first option-argument following an option must be preceded by
white space) is not supported by "getopt"; it permits "foo -ofrobozz" or
"foo -o frobozz".  It supports Rule 8 to the extent that the option-argument
must be recognized by the shell's lexical analyzer as one token.  All other
rules are supported by the code that uses "getopt".
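
A few of these rules can be seen in a short sketch. It uses Python's
standard getopt module, which follows the same conventions as the C routine,
rather than the S3/S5 code itself; the option letters in SPEC are made up
for the example.

```python
import getopt

SPEC = "abo:"   # hypothetical command: flags -a and -b, option -o with a value

# Rule 5: flags may be bundled behind one "-".
# Rule 6 is not enforced: "-ofrobozz" and "-o frobozz" are both accepted.
attached, _ = getopt.getopt(["-ab", "-ofrobozz"], SPEC)
separate, _ = getopt.getopt(["-a", "-b", "-o", "frobozz"], SPEC)
print(attached == separate)

# Rule 9: parsing stops at the first operand, so the later "-b" is an operand.
opts, operands = getopt.getopt(["-a", "file", "-b"], SPEC)
print(opts, operands)

# Rule 10: "--" ends the options, so "-b" after it is an operand.
opts2, operands2 = getopt.getopt(["-a", "--", "-b"], SPEC)
print(opts2, operands2)
```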

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

guy@rlgvax.UUCP (Guy Harris) (03/25/84)

> And anyways, you wouldn't want to give up bundling options, would you?

The main cost of bundling options is that options can only be one character
in length.  This means 1) that option names, at best, are a bit cryptic, and
2) at worst, you either have to pick *very* cryptic names as you run out of
letters or you have to go to other schemes if you need a lot of options -
admittedly, a command with more than 52 options is probably overdesigned
and/or overwritten.  The main benefit of bundling options is that it
cuts the number of times you have to hit the space bar.  Frankly, I'm used
to bundling options, but if I were starting from scratch I wouldn't put
in a feature which confined me to one-letter option names now and forever,
world without end, amen.  In this case, I think the tradeoff favors not
supporting bundling options.

While we're on the topic, though, the argument against "=" to indicate options
and values can be countered by mentioning the DEC standard command language,
where there is an "option indicator" character ("/" in DEC-land, which would be
a poor choice in UNIX-land for obvious reasons - DEC's pathname syntax is
just flat *UGLY*) and an "=" to indicate option arguments.

I.e.

FOO/MUMBLE/OINK=PIG BLETCH.C/WIDTH=132

which could be done in UNIX-land as

foo -mumble -oink=pig bletch.c -width=132

or somesuch.  (Also note that the DEC command language ties options to
specific tokens on the command line - either the command itself or its
arguments.)  (Further note, by the way, that "=" is an un-shifted key on
non-TTY-style keyboards, at least, and is right next to the "-" key.)
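
As a rough sketch of how the UNIX-land form above might be parsed, with
options tied to the token they follow: the function name and the choice to
store bare options as True are assumptions made for illustration, not part
of any DEC or UNIX parser.

```python
def parse_dec_style(args):
    """Split args into command-level options and per-operand options.

    An argument of the form -name or -name=value is an option.  It attaches
    to the most recent operand, or to the command if no operand has been seen.
    """
    command_opts = {}
    operands = []            # list of (operand, options) pairs
    for arg in args:
        if arg.startswith("-") and len(arg) > 1:
            name, sep, value = arg[1:].partition("=")
            target = operands[-1][1] if operands else command_opts
            target[name] = value if sep else True
        else:
            operands.append((arg, {}))
    return command_opts, operands

print(parse_dec_style(["-mumble", "-oink=pig", "bletch.c", "-width=132"]))
```

Note that under this scheme "-ab" names a single option called "ab", so
bundling is given up in exchange for multi-character names.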

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

henry@utzoo.UUCP (Henry Spencer) (03/27/84)

> Getopt, the parser referred to by several people was a first attempt at
> this, but it was never released.

Not only was getopt released with System N, as Guy Harris says, but there
is a public-domain re-implementation of it that was published in net.sources
quite some while ago.  If you want it and don't have it, either talk to
somebody who has news archives or send me mail.  (I wrote it.)  So stop
complaining that you can't get it; you can, you should, and you should
have done so quite some time ago!

(If I get enough "I'm interested" mail, I will re-post the thing.  It's
not large.)

Incidentally, another thing to scream at Berkeley about:  the public-domain
getopt was around in plenty of time to get onto 4.2BSD, but for some
reason the twits didn't include it.  I know it got to Berkeley.

So, there is *no* *excuse* for not using getopt, if your argument syntax
is anything like the normal Unix pattern.  Hereabouts, any time we work
on any existing program we fix it to use getopt.  And of course all our
new stuff uses it.  It's a huge relief to have everything accept the
same consistent syntax.  Just the uniformity is a large win.

This, incidentally, sums up my views on the proposed syntax standard.
Just the uniformity alone is a large win, well worth the trouble.  One
may argue that it is an inferior syntax, but it is *still* highly worth-
while to clean it up and make it uniform.  If for no other reason than
because a conversion to a new syntax will be a much lengthier process.

(The paper that gives details of the reasoning behind the standard is
quite interesting, by the way.  One of the things it says is that the
idea of a new and radically different, but perhaps superior, syntax was
explicitly rejected.  Why?  Because it was tried before, and nobody
paid any attention to it.  A standard that nobody follows is useless.)

I have promised, in front of a lot of witnesses, to keep the public-
domain implementation of getopt in step with whatever AT+T does in the
way of parsing software for the new standard.  The folks who did the
standard promised to keep me informed about this; nothing yet.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

stevens@inuxh.UUCP (W Stevens) (03/28/84)

Obviously, no one wants to type

cc input=foo.c output=foo define=4bsd define=localmachine=ut-ngp -optimize library=curses library=termlib

But then no one wants to type

cc foo.c -o foo -D4bsd -Dlocalmachine=ut-ngp -O -lcurses -ltermlib

either.  That's why people use make(1).

--
Scott Stevens
AT&T Consumer Products
Indianapolis, Indiana, USA
UUCP: inuxh!stevens

rpw3@fortune.UUCP (03/28/84)

fortune!rpw3    Mar 28 02:34:00 1984

Jim Knutson's example has a subtle but important difference from some of the
examples posted earlier. He showed file-valued options for each of his library
files:

	$ cc -output=foo -define=4bsd -define=localmachine -optimize \
		-input=foo.c -library=curses -library=otherlib

as compared with file-specific options which were "sticky through end-of-line
unless overridden". The latter would give

	$ cc -out=foo -def=4bsd,localmachine -opt foo.c -lib curses otherlib

(Since I recommend supporting abbreviations, I took the liberty of using them.)

While there are indeed places where file-valued options are useful (especially
places where you DON'T want "sticky" behavior like "-out=foo"), I would not
suggest using them as the normal option style.

File-specific (but standalone) "sticky" options fit traditional UNIX style
much better, for if you put them all up at the front of the command you
have the current command-global options. If you "spread them out" they then
apply to specific files or groups of files (as in the "number of copies to
print" example, given in another article).
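
A minimal sketch of the "sticky" behavior, assuming a hypothetical print
command with a -copies option (the names are invented here, and abbreviation
matching is left out): each file is paired with whatever settings are in
force at the point where it appears.

```python
def apply_sticky(args, defaults):
    """Pair each file with the option settings in force when it appears."""
    current = dict(defaults)
    result = []
    for arg in args:
        if arg.startswith("-") and len(arg) > 1:
            # An option changes the settings for every file that follows,
            # until a later option overrides it.
            name, sep, value = arg[1:].partition("=")
            current[name] = value if sep else True
        else:
            result.append((arg, dict(current)))
    return result

jobs = apply_sticky(
    ["-copies=2", "a.txt", "b.txt", "-copies=1", "c.txt"],
    {"copies": "1"},
)
for name, settings in jobs:
    print(name, settings)
```

With all the options placed before the first file, every file sees the same
settings, which gives the command-global behavior described above.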

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065

knutson@ut-ngp.UUCP (03/29/84)

Now for an opinion from the other side of the fence.  I started out in
a CDC world running a homegrown NOS lookalike operating system.  CDC
uses positional and keyword parameters (xxx=yyy) and keywords may
appear anywhere on a line.  Well, this is great except when you start
to provide several options to a command.  NOS finally fixed the problem
by having some flaky capability to continue a command on the next line.
We have yet to do this for our system.  By the way, have you ever heard
of UT-2D?  No?  Lucky dog.  Anyway, I prefer the terseness of 1 character
flags all bundled together.  Who wants to type:

cc input=foo.c output=foo define=4bsd define=localmachine=ut-ngp -optimize library=curses library=termlib
-- 
Jim Knutson
ARPA: knutson@ut-ngp
UUCP: {ihnp4,seismo,kpno,ctvax}!ut-sally!ut-ngp!knutson

henry@utzoo.UUCP (Henry Spencer) (04/01/84)

Since my posting a few days ago, there has been a deluge of letters
asking for my public-domain implementation of the getopt(3) argument
parser.  I will re-post it to net.sources shortly, and will mail copies
to those persons who've indicated they don't get net.sources.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry