[net.cog-eng] name=value or -n value?

gsp@circe.UUCP (Gary Perlman) (03/21/84)

I am impressed with the suggestions for standards which concentrate on
the name=value pair system.  While this is contrary to UNIX tradition,
it seems more consistent and allows multi-character names for options.
My own support for such a standard has changed over time and I will explain why.
At first I rebelled against the traditional "convention" in favor of the
clearer name=value convention, but after being convinced by an analysis
of usage and by some results of implementation, I have come to respect both.
I think the attempts at replacing the current "convention" with another
are an over-reaction.  What is needed is a consistent standard that gets
followed; the lack of any (or software supporting it) has resulted in poorly
implemented programs that give people grief they don't deserve.  I think
that the recently proposed system (mostly consistent with tradition)
is as good as any, and that poor and inconsistent implementations are to
blame for the problems so many have observed.

One of the problems with UNIX commands is that the use of command line options
is inconsistent.  Some of the `rules' to which there are numerous exceptions are:
	Options are preceded by a dash (-)
	Options can be bundled
	Options are single characters
	Option letters must be immediately followed by a value
	Flagged options must precede others
	Commands without arguments print their options (dangerous)
The exceptions are everywhere:
	ps (some versions) does not allow a -
	nroff does not bundle flags
	stty takes multi character options
	cc option values must be preceded by a space (cc -o file)
	nroff values must follow option flags (nroff -man is really the "an" macros!)
	dd uses name=value format
	find has brain damaged syntax
	cc doesn't care about where options to the loader go

Looking over these, one has to say: "If there were ONE way to do it,
things would be better,"  and "What way is the best way?"  Anyone
can see that UNIX command line arguments are a mess, but what
to do about it is more difficult to determine.  At the 1984 Winter USENIX
conference, Hemenway and Armitage presented a proposed standard for options.
It looked a lot like traditional UNIX command line handling, except that
standards were introduced:
	options are preceded by a dash
	options are single characters
	option values must be preceded by a space
	boolean options can be bundled
There are 13 rules, which are pretty easy to follow.
At first glance, I said "Ugh!"  I wanted to see name=value pairs
to allow multi-character option names.  I almost fully supported Brad
Templeton's posting of the Waterloo system (I preferred to have ALL args
as name=value pairs so that alphanumeric option names for logical flags
could be specified as name=t or name=false, and + and - would be synonymous
with true and false, respectively so that name+ and name- would be parsable
and fall out of the same name=value convention).  This is a standard on
some systems inside Bell Labs where a common computing environment for
several systems is needed.  I further supported name=value pairs because
I too had written software for parsing them in command lines and files.
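
A minimal Python sketch of the all-name=value convention described above,
in which name+ and name- fall out as shorthand for true and false.  This is
not the Waterloo or Bell Labs code; the function name parse_pairs and the
exact set of accepted true/false spellings are assumptions made for
illustration.

```python
# Assumed spellings; the post only mentions "t", "false", "+" and "-".
TRUE_WORDS = {"t", "true", "+"}
FALSE_WORDS = {"f", "false", "-"}


def parse_pairs(args):
    """Split args into (options, operands) under an all name=value convention."""
    options, operands = {}, []
    for arg in args:
        name, sep, value = arg.partition("=")
        if sep and name.isalnum():
            low = value.lower()
            if low in TRUE_WORDS:
                options[name] = True
            elif low in FALSE_WORDS:
                options[name] = False
            else:
                options[name] = value
        elif len(arg) > 1 and arg[-1] in "+-" and arg[:-1].isalnum():
            # name+ and name- are shorthand for name=true and name=false.
            options[arg[:-1]] = (arg[-1] == "+")
        else:
            operands.append(arg)
    return options, operands
```

One limitation of this sketch: a string value that happens to be spelled
"t" or "f" is read as a logical flag, so a real parser would need to know
which names are logical.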

But then I learned about the work of Hemenway and Armitage.  They looked
at hundreds of commands on UNIX and categorized the full range of usage
exemplified by the exceptions above.  Their goal was to come up with
recommendations about what to do about UNIX and its command line problems.
Their conclusion was that consistency was highly desirable, but so was
compatibility with existing programs.  Their analysis, which I consider to
be the most sane I have seen, points out that backward compatibility is
not only with command usage by people, but also by shell scripts.  People
at Bell Labs use shell more than anywhere else, especially since ksh,
the new shell by Dave Korn (see Summer USENIX, 1983), is so much faster.
I found it hard to believe, but there are systems here with thousands of
lines of shell.  To admit total defeat and say "Scrap the -options for
name=value pairs" would mean that a lot of systems depending on shell scripts
would not work any more.  This would not fare well with management concerned
with giving support to a naive user community; you can't say "Everything is
changed (for the better)" without alienating a lot of people (I cite BSD 4.2).

Hemenway and Armitage were not naive.  They realized that people would not
react well to old commands being changed to fit ANY standard, no matter which.
They have stated that standards would apply to new commands; old ones might
be reworked to make them more robust, but not different.  To maintain good
terms with programmers, they surveyed to find out what people liked about
the - system.  They found that people liked to bundle logical flags:
	ls -ltr
	ps aux
Imagine typing:
	ls +long +temporal +reverse (abbreviated ls +l +t +r)
The amount of typing, using abbreviations, is not much greater,
though any mnemonic enhancements of multi-character options are lost.

Another problem pointed out about name=value pairs is that global file
expansion of the shell might not work correctly (ls file=* passes the * to ls unexpanded)
though the name=value convention could be programmed in the shell so that
anything after = would be expanded.  This would work except that the shell
already has options to take name=value environment variables anywhere on
the command line; little known, but there all along.
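
A rough Python sketch of the expansion problem, using the standard fnmatch
module in place of the shell's own matcher.  In the Bourne shell the word
file=* is matched as a whole, so unless some file name starts with "file="
the pattern reaches the command unchanged.  The helper names expand_word and
expand_after_equals are hypothetical, and the second is only one guess at
how "anything after = would be expanded" might behave.

```python
import fnmatch


def expand_word(word, names):
    """Expand a shell-style pattern against names; keep the word if nothing matches."""
    matches = sorted(fnmatch.filter(names, word))
    return matches if matches else [word]


def expand_after_equals(word, names):
    """Hypothetical variant: expand only the part after the first '='."""
    name, sep, pattern = word.partition("=")
    if not sep:
        return expand_word(word, names)
    return [f"{name}={match}" for match in expand_word(pattern, names)]
```

With names such as a.c, b.c and notes, expand_word leaves file=*.c alone
while expand_after_equals turns it into one name=value word per match.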

At this point, you probably feel as I did.  There seems to be no complete
solution.  For every suggestion, there is an opposite as well founded.
I worked on an enhanced version of getopt (the little known standard
parser for UNIX command lines; little known because it was not released
to the public [To make a standard fail, keep it a secret]).  The main
point of the new parser was to enforce the simple standards of Hemenway
and Armitage.  As an exercise, I rewrote some of my own commands.
I found that my programs had exceptions to every rule, and that few commands
were consistent with others.  After doing the retrofitting, I found that
I could use my own commands more easily than before.  I was impressed
because it demonstrated to me that CONSISTENCY was so important, even with my
own programs.  I was also impressed that the use of "-X value" instead of X=value
no longer seemed to be much of an issue.  When the parser allowed on-line
help phrases for options, the advantage of multi-character option names
seemed less of an issue.
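
A small sketch of the idea of one table driving both the parser and the
help phrases.  It uses Python's standard getopt module, which follows the
conventions of the UNIX getopt() routine; it is not the enhanced parser
described above.  The option table and the names optstring, usage and parse
are hypothetical.

```python
import getopt

# Hypothetical option table: letter -> (takes a value?, help phrase).
OPTIONS = {
    "l": (False, "long listing"),
    "t": (False, "sort by time"),
    "r": (False, "reverse the order"),
    "o": (True, "write output to the named file"),
}


def optstring(table):
    """Build a getopt specification such as 'ltro:' from the table."""
    return "".join(letter + (":" if takes_value else "")
                   for letter, (takes_value, _) in table.items())


def usage(table):
    """One help line per option, taken from the same table the parser uses."""
    return "\n".join(f"-{letter}{' value' if takes_value else ''}\t{phrase}"
                     for letter, (takes_value, phrase) in table.items())


def parse(argv, table=OPTIONS):
    """Return (options, operands); getopt.GetoptError is raised on bad usage."""
    pairs, operands = getopt.getopt(argv, optstring(table))
    options = {}
    for name, value in pairs:
        letter = name.lstrip("-")
        options[letter] = value if table[letter][0] else True
    return options, operands
```

Because the help text and the option specification come from the same
table, they cannot drift apart, and an unknown letter or a missing value is
rejected in one place rather than by each command in its own way.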

My conclusion is that the syntax does not matter much, so long as there
really is a syntax.  To call the current usage of flag options in UNIX
a "standard" is dangerously mistaken; thus far on UNIX, the exception is the rule.
Given good implementations of a simple rule, whether the rule is -X value
or X=value, or maybe even if the rule were value#X, users should fare about
equally well.  I conclude that conventions like the Waterloo system are
successful because the rule is simple and well implemented, not because there
is any advantage to their particular rule.  This is a hypothesis that is
relatively easy to support with data and I would like to see some or maybe
even collect it myself.  Until then, it seems like the Hemenway and Armitage
analysis and proposal is the most practical because it incorporates simple
standards with the highest degree of backward compatibility.
	Gary Perlman	BTL MH 5D-105	(201) 582-3624	ulysses!gsp

ken@ihuxq.UUCP (ken perlow) (03/21/84)

--
About a year ago I was working with a severely cut-up Unix* clone
for 8-bit micros called UniFlex.  Aside from changing command names
to words that actually mean what the command does (viz. cat -> list,
pwd -> path, grep -> find), they (TSC Corp in N.C.) changed the
option flag from "-" to "+", so that you type "command +n value".
This made intuitive sense, as an option is generally a sort of add-on,
but I ultimately disassembled their shell and built my own with the
traditional Unix* "-", mostly because I couldn't stand having to
shift to hit the "+" when everything else was lower-case.  Most
terminals I've worked on have "+" as an upper-case character.  The
rhythm of typing a command convention is at least as important as the
convention itself.

* My employer suggests that I point out the obvious fact that Unix
  is a trademark of AT&T Bell Laboratories.
-- 
                    *** ***
JE MAINTIENDRAI   ***** *****
                 ****** ******    21 Mar 84 [1 Germinal An CXCII]
ken perlow       *****   *****
(312)979-7261     ** ** ** **
..ihnp4!ihuxq!ken   *** ***

brownell@harvard.UUCP (Dave Brownell) (03/21/84)

System V has a nice way of providing a standard:  they have a library
routine getopt(3) for use in C programming and a shell script version
getopt(1) for Bourne shell/Korn shell programming.  It standardizes
what most of us would agree is the current UNIX standard, with the
possible exception of forcing flag arguments before file arguments
(i.e., no mixing).  I've used getopt(3) and liked it.  I'd like to
see it on Berkeley systems.


	Dave Brownell
	...{decvax,linus,sdcsvax}!genrad!wjh12!harvard!brownell

idallen@watmath.UUCP (03/22/84)

The best use for "+" is in conjunction with "-"; "+" turns on and "-"
turns off.  Just changing all the "-" to "+" doesn't win you anything
except more familiarity with the dreaded shift key.
-- 
        -IAN!  (Ian! D. Allen)      University of Waterloo

idallen@watmath.UUCP (03/23/84)

Here's one reason I prefer "name=value" over "-name value".

A command that looks like:

    bleen -name value1 value2

is syntactically ambiguous.  You have to "know" whether "-name" does or does
not take a following "value1" parameter.  This means you have to know the
semantic behaviour of the "-name" flag to know whether BLEEN is receiving
both VALUE1 and VALUE2.  It also means the potential for error is greater
with "-name value", since the "-name" might eat up one of the otherwise
independent arguments if someone thinks it doesn't take a following parameter.

This syntax ambiguity isn't present with the syntax form:

    bleen name=value1 value2

I need know nothing about any of the parameters to tell that BLEEN is
receiving only one unflagged parameter: VALUE2.  Since there is no
ambiguity, I can't accidentally create a semantically meaningful command
if I misunderstand the behaviour of "name=".  If "name" doesn't take a
parameter, then "name=value" would be flagged as an error.  If "name"
does take a parameter, and I try to use it without one, this would again
be flagged as an error.  With "-name value", the program can't tell.

I prefer the syntactic un-ambiguity.
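
The difference can be sketched in a few lines of Python.  The function
names and the takes_value set are hypothetical; the point is only that the
first function needs per-option knowledge and the second does not.

```python
def operands_dash(argv, takes_value):
    """Operands of a '-name value' line; must know which flags eat a value."""
    operands, i = [], 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith("-") and len(arg) > 1:
            if arg[1:] in takes_value:
                i += 1  # skip the flag's value as well
        else:
            operands.append(arg)
        i += 1
    return operands


def operands_pairs(argv):
    """Operands of a 'name=value' line; no per-option knowledge is needed."""
    return [arg for arg in argv if "=" not in arg]
```

For "-name value1 value2" the first function returns one operand or two
depending on what it is told about "-name"; for "name=value1 value2" the
second always returns just value2.
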
-- 
        -IAN!  (Ian! D. Allen)      University of Waterloo

leiby@masscomp.UUCP (03/24/84)

At Uniforum in January a standard for command line
options was proposed.  I didn't get hold of a copy;
could someone who did please post same?

(I agree that it is better to simply define a standard
using the current style than switch to name=value.  This
seems to require less recoding, at least on the face of
it.)

	Mike Leibensperger
	{decvax,harpo,tektronix}!masscomp!leiby
-- 
Mike Leibensperger
{decvax,tektronix,harpo}!masscomp!leiby
Masscomp; One Technology Park; Westford MA 01886

rpw3@fortune.UUCP (03/25/84)

#R:circe:-4400:fortune:29300004:000:2612
fortune!rpw3    Mar 24 22:27:00 1984

As requested, I am posting a copy of the "quick reference card"
that AT&T has been giving out on the command language standard.
(Sorry, I don't have the full paper on-line.) As it says at the
bottom, send comments elsewhere -- I prefer the TOPS-10/SCAN
style (modified for UNIX). Since I'm posting this, I get my
licks in first (o.k. to send comments on THIS to me):

    -option		Full word, but abbreviations implemented in 'getopt'
			(as in either "-recursive" or "-rec")
    -nooption		As in "-norecursive" or "-norec".
    -option=yes		Always allowed where "-option" is
    -option=no		Always allowed where "-nooption" is (as in
			"-recursive=no")
    -option=keyword	As in "dd conv=ascii" (but "dd" doesn't like the "-")
			or (new "find") "find . -user=rpw3"
    -option=key,key,...	Lists of keywords or'd with comma (as in
			"find . -name=big,small,fat") as with csh "{,,}"

That's MY preference. Now for "equal time" to AT&T... (below)
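
A hedged Python sketch of the full-word style listed above (-option,
-nooption, -option=yes/no, -option=key,key,... with unique abbreviations).
It is not TOPS-10/SCAN or any real getopt; parse_word_options and its
arguments are hypothetical names, and letting options and operands mix
freely is an assumption based on the "find . -user=rpw3" example.

```python
def parse_word_options(argv, known):
    """Parse -option, -nooption, -option=yes/no and -option=key,key,...

    known is the list of full option names; unique prefixes are accepted.
    """
    def resolve(word):
        if word in known:
            return word
        matches = [name for name in known if word and name.startswith(word)]
        if len(matches) != 1:
            raise ValueError(f"unknown or ambiguous option: {word}")
        return matches[0]

    options, operands = {}, []
    for arg in argv:
        if not arg.startswith("-") or arg == "-":
            operands.append(arg)
            continue
        word, sep, value = arg[1:].partition("=")
        if sep:
            name = resolve(word)
            if value in ("yes", "no"):
                options[name] = (value == "yes")
            else:
                options[name] = value.split(",")  # list of keywords
        else:
            try:
                options[resolve(word)] = True
            except ValueError:
                if not word.startswith("no"):
                    raise
                options[resolve(word[2:])] = False  # -nooption form
    return options, operands
```
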

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065

----------------------Attachment (Re-Typed from Quick Reference Card)-------

							   ====   
							 ==---===
							=------===
							=------===  A T & T
							 ==--====
							   ====   
Proposed Syntax Standard for UNIX System Commands:

Rule 1:     Command names must be between two and nine characters.
Rule 2:     Command names must include lower case letters and digits only.
Rule 3:     Option names must be a single character in length.
Rule 4:     All options must be delimited by "-".
Rule 5:     Options with no arguments may be grouped behind one delimiter.
Rule 6:     The first option argument following an option must be preceded
	    by white space.
Rule 7:     Option arguments cannot be optionable.
Rule 8:     Groups of option-arguments following an option must be separated
	    by commas or separated by white space and quoted.
Rule 9:     All options precede operands on the command line.
Rule 10:    "--" may be used to delimit the end of the options.
Rule 11:    The order of options relative to each other should not matter.
Rule 12:    The order of operands may matter and position-related inter-
	    pretations should be determined on a command-specific basis.
Rule 13:    "-" preceded and followed by white space should be used only
	    to mean standard input.
							November 1983
---------
*UNIX is a trademark of AT&T Bell Laboratories

Send comments to: Software Sales and Marketing
		  PO Box 25000
		  Greensboro, North Carolina  27420

		  nwuxd!UNIXSYS
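
A minimal Python sketch of how some of the rules on the card could be
checked mechanically.  It covers Rules 1 and 2 (command names) and Rules 3
to 6, 9, 10 and 13 (splitting options from operands); Rule 8 is left out.
The function names are hypothetical, and refusing to group an option that
takes an argument is a strict reading of Rule 5, not something the card
states outright.

```python
import re

COMMAND_NAME = re.compile(r"[a-z0-9]{2,9}")


def valid_command_name(name):
    """Rules 1 and 2: two to nine characters, lower case letters and digits."""
    return COMMAND_NAME.fullmatch(name) is not None


def split_options(argv, with_value=""):
    """Split argv into (options, operands).

    with_value holds the option letters that take an option argument.
    """
    options, i = [], 0
    while i < len(argv):
        word = argv[i]
        if word == "--":  # Rule 10: explicit end of options
            i += 1
            break
        if word == "-" or not word.startswith("-"):
            break  # Rule 9: first operand ends options; Rule 13: "-" is an operand
        letters = word[1:]
        if len(letters) > 1 and any(c in with_value for c in letters):
            raise ValueError(f"{word}: options that take arguments cannot be grouped")
        if len(letters) == 1 and letters in with_value:
            if i + 1 >= len(argv):
                raise ValueError(f"{word}: missing option argument")
            options.append((letters, argv[i + 1]))  # Rule 6: value is the next word
            i += 2
            continue
        options.extend((c, None) for c in letters)  # Rule 5: grouped flags
        i += 1
    return options, argv[i:]
```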

guy@rlgvax.UUCP (Guy Harris) (03/25/84)

The AT&T card also contained the L.sys file entry for "nwuxd" (by the way,
on my card "unixsys", as in "nwuxd!unixsys", was all lower-case); it was

nwuxd Any ACU 1200 13122601844 in-BREAK-in-BREAK-in unixml word bellmail

or, for those of you less fortunate,

nwuxd Any ACU 300 13122601844 in-BREAK-in-BREAK-in unixml word bellmail

(also, you put the Death Star in the wrong corner - it was in the upper left
corner :-))

My preferences are the same as Rob's; the only advantages of the AT&T
standard are 1) less typing, because you have all those "lovely" one-character
flag names (Data General had that in RDOS but dumped it in AOS in favor of
longer names; DEC had it in RT-11 and had two-character flags in RSX but
dumped it in VMS in favor of longer names) and 2) it's compatible with the
commands that already exist.  (Assume that you had a restriction of one- or
two-character *command* names in UNIX; wouldn't that be fun?)

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

geoff@callan.UUCP (Geoff Kuenning) (03/27/84)

Another issue when designing command-language interfaces is the ease of
typing a particular character.  For this reason, I am in love with the Unix
convention of separating arguments with blank, rather than "," or some such.
This applies to the discussion of Unix arguments because "-" is unshifted,
while "+" requires typing a shift key--much less convenient and much more
error-prone.  Many systems use "/" to introduce command switches, because it
is not only unshifted but located where it is easy to type without missing
the key.  Unfortunately, Unix pre-empted this with the pathname syntax (this
isn't a complaint, I like it better than the other characters they could
have chosen).

The following unshifted characters are available on non-braindamaged keyboards
(VT100/Selectric layout):  -=`'[];,.\/

Most of these are totally unacceptable (can you see typing "ls ,l" for a long
listing?).  The only acceptable ones seem to me to be -=.\/  "-" and "=" both
require a long reach;  "\" and "/" are already used.  "." isn't TOO bad, but
it sure isn't great.  "-" has a mnemonic problem that has already been pointed
out, and "=" as in the dd syntax seems to turn people off.  I vote that we
stay with "-":  it has serious problems, but as has been pointed out, it saves
us from the horrible fate of rewriting all of our shell scripts.

By the way, I am getting tired of seeing people posting notes to the effect of
"if you don't like Unix syntax, go use JCL for a while".  We are all aware of
the deficiencies of JCL;  most of us are also experienced enough to realize
that UNIX and JCL are not the only command interpreters available.  It is
clear that UNIX has problems;  sticking your head in the sand won't cure them.
If you want to take an attitude of "it's perfect the way it is because God
(Dennis Ritchie, apparently) did it that way", I suggest that YOU go to work
for IBM and write JCL--IBM seems to like that kind of attitude.

	Geoff Kuenning
	Callan Data Systems
	...!ihnp4!sdcrdcf!trwrb!wlbr!callan!geoff

jsq@ut-sally.UUCP (John Quarterman) (03/29/84)

It's rather amazing how long an argument can continue over the relative
merits of - and + based almost solely on one being shifted on most keyboards.
People use <, >, !, &, $, etc. all the time with no complaints about shifts.
If dealing with a keyboard is *really* that much of a problem, wouldn't it
be more appropriate to go to a mouse and menu system?
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq

geoff@callan.UUCP (Geoff Kuenning) (04/07/84)

>	It's rather amazing how long an argument can continue over the relative
>	merits of - and + based almost solely on one being shifted on most
>	keyboards.  People use <, >, !, &, $, etc. all the time with no
>	complaints about shifts. If dealing with a keyboard is *really* that
>	much of a problem, wouldn't it be more appropriate to go to a mouse
>	and menu system?
		- John Quarterman

The problem is that switch indicators and argument separators are typed much
more often than <, >, !, &, $, etc.  I can notice a definite speed difference
when I type the aforementioned characters (< and > are faster because they at
least don't require major hand motions even though they are shifted).  I don't
mind this with a character I type once per command at most like < and &, but
I mind it a lot with characters like the argument separator or the switch
indicator, which I type a lot more.

As to the mouse suggestion, I can only say "hear, hear".  What I *REALLY* want
(I think, I haven't tried it) is a two-handed chord keyboard where one or
even both chord keyboards are mounted on a mouse.  Then I could have the best
of all worlds.

	Geoff Kuenning
	Callan Data Systems
	...!ihnp4!sdcrdcf!trwrb!wlbr!callan!geoff

Vax?  Is that a 68000 with the bytes going the other way?

jsq@ut-sally.UUCP (John Quarterman) (04/08/84)

As to -option (or +option) being more common than >, <, or & in commands,

make >& errs &
(Bourne shell equivalent is "make >errs 2>&1 &")

is probably the most common command I type, with the possible
exception of

!!
(means replay last command to the Cshell)

so I'm not convinced.  Also, anything with more than one option per
command usually ends up in a makefile anyway, so where's the argument?


I'd kind of like to have a chording keyboard too:  it seems silly to
have to use two hands all the time.
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq

olson@fortune.UUCP (04/11/84)

#R:circe:-4400:fortune:29300014:000:552
fortune!olson    Apr 10 18:36:00 1984

>	make >& errs &
>	is probably the most common command I type, with the possible
>	exception of
>	!!
>	John Quarterman, CS Dept., University of Texas, Austin, Texas

If that is the most common command you type, why don't you alias it
to something like 'mkbg' (for make background)?

I have 
alias mkbg 'make \!* >& MADE &' 
alias mkbgW 'mkbg \!* ; tail -f MADE'

for running it in the background, and for running it in the background
while watching the (saved) output.

What's the point in having the csh and aliases if you don't use them?
	Dave Olson

jsq@ut-sally.UUCP (John Quarterman) (04/12/84)

I do use Cshell aliases, I was just trying to avoid bringing them up
when they weren't necessary to make the point, which is that option
arguments aren't much of a problem whatever character you use to make them.
Aliases make them even less of a problem.
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq