[net.unix] name=value or -n value?

gsp@circe.UUCP (Gary Perlman) (03/21/84)

I am impressed with the suggestions for standards which concentrate on
the name=value pair system.  While this is contrary to UNIX tradition,
it seems more consistent and allows multi-character names for options.
My own support for such a standard has changed over time and I will explain why.
At first I rebelled against the traditional "convention" in favor of the
clearer name=value convention, but after some convincing by analysis
of usage and by results of implementation, I have come to respect both.
I think the attempts at replacing the current "convention" with another
are an over-reaction.  What is needed is a consistent standard that gets
followed; the lack of any (or software supporting it) has resulted in poorly
implemented programs that give people grief they don't deserve.  I think
that the recently proposed system (mostly consistent with tradition)
is as good as any, and that poor and inconsistent implementations are to
blame for the problems so many have observed.

One of the problems with UNIX commands is that the use of command line options
is inconsistent.  Some of the `rules' to which there are numerous exceptions are:
	Options are preceded by a dash (-)
	Options can be bundled
	Options are single characters
	Option letters must be immediately followed by a value
	Flagged options must precede others
	Commands without arguments print their options (dangerous)
The exceptions are everywhere:
	ps (some versions) does not allow a -
	nroff does not bundle flags
	stty takes multi character options
	cc option values must be preceded by a space (cc -o file)
	nroff values must follow option flags (nroff -man is really the "an" macros!)
	dd uses name=value format
	find has brain damaged syntax
	cc doesn't care about where options to the loader go

Looking over these, one has to say: "If there were ONE way to do it,
things would be better,"  and "What way is the best way?"  Anyone
can see that UNIX command line arguments are a mess, but what
to do about it is more difficult to determine.  In the 1984 Winter USENIX
conference, Hemenway and Armitage presented a proposed standard for options.
It looked a lot like traditional UNIX command line handling, except that
standards were introduced:
	options are preceded by a dash
	options are single characters
	values of options must be preceded by a space
	boolean options can be bundled
There are 13 rules, which are pretty easy to follow.
At first glance, I said "Ugh!"  I wanted to see name=value pairs
to allow multi-character option names.  I almost fully supported Brad
Templeton's posting of the Waterloo system (I preferred to have ALL args
as name=value pairs so that alphanumeric option names for logical flags
could be specified as name=t or name=false, and + and - would be synonymous
with true and false, respectively so that name+ and name- would be parsable
and fall out of the same name=value convention).  This is a standard on
some systems inside Bell Labs where a common computing environment for
several systems is needed.  I further supported name=value pairs because
I too had written software for parsing them in command lines and files.
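
A minimal sketch of the all-name=value scheme described above, in Python.
The function name parse_pair and the exact spellings accepted for true and
false are assumptions made for illustration; this is not the Waterloo or
Bell Labs code.

```python
# Words accepted as boolean values (an assumed set, chosen for this sketch).
TRUE_WORDS = {"t", "true", "+"}
FALSE_WORDS = {"f", "false", "-"}


def parse_pair(arg):
    """Return (name, value) for one command-line word, or None for an operand.

    name=value  -> (name, value), with t/true/+ and f/false/- read as booleans
    name+       -> (name, True)
    name-       -> (name, False)
    """
    if "=" in arg:
        name, _, value = arg.partition("=")
        if not name.isalnum():
            return None          # e.g. "=x" is not an option in this sketch
        low = value.lower()
        if low in TRUE_WORDS:
            return name, True
        if low in FALSE_WORDS:
            return name, False
        return name, value
    # name+ and name- are shorthand for name=+ and name=-
    if len(arg) > 1 and arg[-1] in "+-" and arg[:-1].isalnum():
        return arg[:-1], arg[-1] == "+"
    return None


for word in ["long+", "reverse=false", "width=80", "file.c"]:
    print(word, "->", parse_pair(word))
```

Note that in this sketch a file whose name happens to contain "=" would be
taken for an option, which is one of the costs of the convention.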

But then I learned about the work of Hemenway and Armitage.  They looked
at hundreds of commands on UNIX and categorized the full range of usage
exemplified by the exceptions above.  Their goal was to come up with
recommendations about what to do about UNIX and its command line problems.
Their conclusion was that consistency was highly desirable, but so was
compatibility with existing programs.  Their analysis, which I consider to
be the most sane I have seen, points out that backward compatibility is
not only with command usage by people, but also by shell scripts.  People
at Bell Labs use shell more than anywhere else, especially since ksh,
the new shell by Dave Korn (See Summer USENIX, 1983) is so much faster.
I found it hard to believe, but there are systems here with thousands of
lines of shell.  To admit total defeat and say "Scrap the -options for
name=value pairs" would mean that a lot of systems depending on shell scripts
would not work any more.  This would not fare well with management concerned
with giving support to a naive user community; you can't say "Everything is
changed (for the better)" without alienating a lot of people (I cite BSD 4.2).

Hemenway and Armitage were not naive.  They realized that people would not
react well to old commands being changed to fit ANY standard, no matter which.
They have stated that standards would apply to new commands; old ones might
be reworked to make them more robust, but not different.  To maintain good
terms with programmers, they surveyed to find out what people liked about
the - system.  They found that people liked to bundle logical flags:
	ls -ltr
	ps aux
Imagine typing:
	ls +long +temporal +reverse (abbreviated ls +l +t +r)
The amount of typing, using abbreviations, is not much greater,
though any mnemonic enhancements of multi-character options are lost.

Another problem pointed out about name=value pairs is that global file
expansion of the shell might not work correctly (ls file=* passes * to ls)
though the name=value convention could be programmed in the shell so that
anything after = would be expanded.  This would work except that the shell
already has options to take name=value environment variables anywhere on
the command line; little known, but there all along.
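
The expansion problem can be imitated in a few lines of Python.  This is
only a sketch: the directory listing is made up, the function name expand
is hypothetical, and it assumes the Bourne shell habit of passing a
pattern through unchanged when nothing matches.

```python
import fnmatch

# A made-up directory listing used in place of a real file system.
NAMES = ["a.c", "b.c", "notes"]


def expand(word, names=NAMES):
    """Imitate shell file-name expansion of one command-line word.

    The whole word is the pattern, so "file=*" only matches names that
    begin with "file=".  With no match the word is passed on untouched.
    """
    matches = sorted(fnmatch.filter(names, word))
    return matches or [word]


print(expand("*.c"))       # the pattern is replaced by the matching names
print(expand("file=*"))    # nothing starts with "file=", so * reaches the command
```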

At this point, you probably feel as I did.  There seems to be no complete
solution.  For every suggestion, there is an opposite as well founded.
I worked on an enhanced version of getopt (the little known standard
parser for UNIX command lines; little known because it was not released
to the public [To make a standard fail, keep it a secret]).  The main
point of the new parser was to enforce the simple standards of Hemenway
and Armitage.  As an exercise, I rewrote some of my own commands.
I found that my programs had exceptions to every rule, and that few commands
were consistent with others.  After doing the retrofitting, I found that
I could use my own commands more easily than before.  I was impressed
because it demonstrated to me that CONSISTENCY was so important, even with my
own programs.  I was also impressed that the use of "-X value" instead of X=value
no longer seemed to be much of an issue.  When the parser allowed on-line
help phrases for options, the advantage of multi-character option names
seemed less of an issue.

My conclusion is that the syntax does not matter much, so long as there
really is a syntax.  To call the current usage of flag options in UNIX
a "standard" is dangerously mistaken; thus far on UNIX, the exception is the rule.
Given good implementations of a simple rule, whether the rule is -X value
or X=value, or maybe even if the rule were value#X, users should fare about
equally well.  I conclude that conventions like the Waterloo system are
successful because the rule is simple and well implemented, not because there
is any advantage to their particular rule.  This is a hypothesis that is
relatively easy to support with data and I would like to see some or maybe
even collect it myself.  Until then, it seems like the Hemenway and Armitage
analysis and proposal is the most practical because it incorporates simple
standards with the highest degree of backward compatibility.
	Gary Perlman	BTL MH 5D-105	(201) 582-3624	ulysses!gsp

ken@ihuxq.UUCP (ken perlow) (03/21/84)

--
About a year ago I was working with a severely cut-up Unix* clone
for 8-bit micros called UniFlex.  Aside from changing command names
to words that actually mean what the command does (viz. cat -> list,
pwd -> path, grep -> find), they (TSC Corp in N.C.) changed the
option flag from "-" to "+", so that you type "command +n value".
This made intuitive sense, as an option is generally a sort of add-on,
but I ultimately disassembled their shell and built my own with the
traditional Unix* "-", mostly because I couldn't stand having to
shift to hit the "+" when everything else was lower-case.  Most
terminals I've worked on have "+" as an upper-case character.  The
rhythm of typing a command convention is at least as important as the
convention itself.

* My employer suggests that I point out the obvious fact that Unix
  is a trademark of AT&T Bell Laboratories.
-- 
                    *** ***
JE MAINTIENDRAI   ***** *****
                 ****** ******    21 Mar 84 [1 Germinal An CXCII]
ken perlow       *****   *****
(312)979-7261     ** ** ** **
..ihnp4!ihuxq!ken   *** ***

brownell@harvard.UUCP (Dave Brownell) (03/21/84)

System V has a nice way of providing a standard:  they have a library
routine getopt(3) for use in C programming and a shell script version
getopt(1) for Bourne shell/ Korn shell programming.  It standardizes
what most of us would agree is the current UNIX standard, with the
possible exception of forcing flag arguments before file arguments
(i.e., no mixing).  I've used getopt(3) and liked it.  I'd like to
see it on Berkeley systems.
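
For readers without access to getopt(3), the conventions it enforces can be
tried with the getopt module in the Python standard library, which is
modelled on the C routine.  The option string below describes a
hypothetical command with three boolean flags (-l, -t, -r) and one option
that takes a value (-o); it is an illustration, not any real command's
interface.

```python
import getopt

# Hypothetical option string: l, t and r are flags; the colon after o
# means that -o takes a value.
OPTSTRING = "ltro:"


def parse(argv):
    """Split argv into (option, value) pairs and the remaining operands."""
    opts, operands = getopt.getopt(argv, OPTSTRING)
    return opts, operands


# Bundled flags, then an option whose value follows after a space.
print(parse(["-ltr", "-o", "out.txt", "a", "b"]))
# ([('-l', ''), ('-t', ''), ('-r', ''), ('-o', 'out.txt')], ['a', 'b'])

# Parsing stops at the first operand, so a later "-t" is not a flag.
print(parse(["-l", "a", "-t"]))
# ([('-l', '')], ['a', '-t'])
```

The second call shows the "no mixing" rule: once a file argument is seen,
everything after it is treated as a file argument.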


	Dave Brownell
	...{decvax,linus,sdcsvax}!genrad!wjh12!harvard!brownell

idallen@watmath.UUCP (03/22/84)

The best use for "+" is in conjunction with "-"; "+" turns on and "-"
turns off.  Just changing all the "-" to "+" doesn't win you anything
except more familiarity with the dreaded shift key.
-- 
        -IAN!  (Ian! D. Allen)      University of Waterloo

idallen@watmath.UUCP (03/23/84)

Here's one reason I prefer "name=value" over "-name value".

A command that looks like:

    bleen -name value1 value2

is syntactically ambiguous.  You have to "know" whether "-name" does or does
not take a following "value1" parameter.  This means you have to know the
semantic behaviour of the "-name" flag to know whether BLEEN is receiving
both VALUE1 and VALUE2.  It also means the potential for error is greater
with "-name value", since the "-name" might eat up one of the otherwise
independent arguments if someone thinks it doesn't take a following parameter.

This syntax ambiguity isn't present with the syntax form:

    bleen name=value1 value2

I need know nothing about any of the parameters to tell that BLEEN is
receiving only one unflagged parameter: VALUE2.  Since there is no
ambiguity, I can't accidentally create a semantically meaningful command
if I misunderstand the behaviour of "name=".  If "name" doesn't take a
parameter, then "name=value" would be flagged as an error.  If "name"
does take a parameter, and I try to use it without one, this would again
be flagged as an error.  With "-name value", the program can't tell.

I prefer the syntactic un-ambiguity.
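
The difference can be shown with a short Python sketch.  Both function
names are hypothetical.  The first needs a table of which flags take a
value before it can say what the unflagged arguments are; the second needs
no such table.

```python
def operands_dash(argv, takes_value):
    """Unflagged arguments of a "-name value" style command line.

    takes_value is the set of flag names that consume the following word;
    the answer changes when that set changes.
    """
    operands = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith("-") and len(arg) > 1:
            if arg[1:] in takes_value:
                i += 1           # skip the word that belongs to this flag
        else:
            operands.append(arg)
        i += 1
    return operands


def operands_pairs(argv):
    """Unflagged arguments of a "name=value" style command line."""
    return [arg for arg in argv if "=" not in arg]


line = ["-name", "value1", "value2"]
print(operands_dash(line, {"name"}))    # "-name" takes a value
print(operands_dash(line, set()))       # "-name" is a plain flag
print(operands_pairs(["name=value1", "value2"]))
```

The same "-name" line yields one unflagged argument or two depending on
the table, while the "name=" line always yields one.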
-- 
        -IAN!  (Ian! D. Allen)      University of Waterloo

dick@tjalk.UUCP (Dick Grune) (03/23/84)

How do I get this getopt parser Gary Perlman describes?
					Dick Grune
					Vrije Universiteit
					...!decvax!mcvax!vu44!tjalk!dick
					Amsterdam

leiby@masscomp.UUCP (03/24/84)

At Uniforum in January a standard for command line
options was proposed.  I didn't get hold of a copy;
could someone who did please post same?

(I agree that it is better to simply define a standard
using the current style than switch to name=value.  This
seems to require less recoding, at least on the face of
it.)

	Mike Leibensperger
	{decvax,harpo,tektronix}!masscomp!leiby
-- 
Mike Leibensperger
{decvax,tektronix,harpo}!masscomp!leiby
Masscomp; One Technology Park; Westford MA 01886

geoff@callan.UUCP (Geoff Kuenning) (03/27/84)

Another issue when designing command-language interfaces is the ease of
typing a particular character.  For this reason, I am in love with the Unix
convention of separating arguments with blank, rather than "," or some such.
This applies to the discussion of Unix arguments because "-" is unshifted,
while "+" requires typing a shift key--much less convenient and much more
error-prone.  Many systems use "/" to introduce command switches, because it
is not only unshifted but located where it is easy to type without missing
the key.  Unfortunately, Unix pre-empted this with the pathname syntax (this
isn't a complaint, I like it better than the other characters they could
have chosen).

The following unshifted characters are available on non-braindamaged keyboards
(VT100/Selectric layout):  -=`'[];,.\/

Most of these are totally unacceptable (can you see typing "ls ,l" for a long
listing?).  The only acceptable ones seem to me to be -=.\/  "-" and "=" both
require a long reach;  "\" and "/" are already used.  "." isn't TOO bad, but
it sure isn't great.  "-" has a mnemonic problem that has already been pointed
out, and "=" as in the dd syntax seems to turn people off.  I vote that we
stay with "-":  it has serious problems, but as has been pointed out, it saves
us from the horrible fate of rewriting all of our shell scripts.

By the way, I am getting tired of seeing people posting notes to the effect of
"if you don't like Unix syntax, go use JCL for a while".  We are all aware of
the deficiencies of JCL;  most of us are also experienced enough to realize
that UNIX and JCL are not the only command interpreters available.  It is
clear that UNIX has problems;  sticking your head in the sand won't cure them.
If you want to take an attitude of "it's perfect the way it is because God
(Dennis Ritchie, apparently) did it that way", I suggest that YOU go to work
for IBM and write JCL--IBM seems to like that kind of attitude.

	Geoff Kuenning
	Callan Data Systems
	...!ihnp4!sdcrdcf!trwrb!wlbr!callan!geoff

jsq@ut-sally.UUCP (John Quarterman) (03/29/84)

It's rather amazing how long an argument can continue over the relative
merits of - and + based almost solely on one being shifted on most keyboards.
People use <, >, !, &, $, etc. all the time with no complaints about shifts.
If dealing with a keyboard is *really* that much of a problem, wouldn't it
be more appropriate to go to a mouse and menu system?
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq

Laws@SRI-AI.ARPA (03/30/84)

From:  Ken Laws <Laws@SRI-AI.ARPA>

I have grown used to "-" introducing options, although it does play
havoc with programs that can take negative numbers or negated strings
as arguments (e.g., grep looking for "-1").  In my own parsing routines
I have kept the "-" convention, and have provided routines for checking
the number of terms following a flag (i.e., before the next flag) so
that the programmer can provide defaults or can print an error message
if the wrong number of arguments are given.  This is integrated with
my package for parsing ordinary (nonflag) command-line arguments.
The programmer-specified defaulting options can be rather complex, as
in: use the command-line argument if given and legal; if given and
illegal, use an interactive query to offer the default; if not given,
use the default [with or] without asking the user, unless no default
was specified in which case query the user with no default.  All this
is wrapped in a fairly simple C subroutine call interface that lets the
programmer specify the query string, criteria for legal parameters,
and optional help messages for illegal input and for user queries.

One thing that is missing in Unix syntax is the notion of nested arguments,
as in Lisp.  Strings provide the ability to group terms into a single
argument, but there is no other mechanism for passing down a list
(e.g., a coordinate pair) as a single parameter that the program can
parse later.  Such capabilities are needed for "intelligent" interfaces
that are able to recognize different ways of specifying an entity
from the form of the specification.  We'll just have to live with using
different flags to introduce different syntactic forms of an argument.
(Example: I use "-f 10 10 -n 15 15", and "-w 10 10 25 25" as two ways
of specifying a 15 x 15 screen window extending from (10,10) to
(25,25).  With an argument grouping capability one could specify
something like "((10 10) + (15 15))" or "((10 10) to (25 25))", or
"(pointA to pointB)", or "windowA", or whatever other syntax the program
was prepared to accept.  This can be done now with string arguments,
but only if the programmer is willing to write a fairly hairy ad hoc
parser.)
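
A rough Python sketch of the term-counting idea follows.  It is not the
package described above; the names is_flag, group_by_flag and check_counts
are invented for illustration, and treating a word such as "-1" as a
negative number rather than a flag is an assumption of the sketch.

```python
def is_flag(word):
    """True for words like "-f"; false for "-", "-1", "-.5" and plain terms."""
    if len(word) < 2 or not word.startswith("-"):
        return False
    return not (word[1].isdigit() or word[1] == ".")


def group_by_flag(argv):
    """Collect the terms that follow each flag, up to the next flag.

    Terms that come before any flag are grouped under None.
    """
    groups = [(None, [])]
    for word in argv:
        if is_flag(word):
            groups.append((word, []))
        else:
            groups[-1][1].append(word)
    return groups


def check_counts(groups, expected):
    """Raise ValueError if a flag has the wrong number of terms."""
    for flag, terms in groups:
        if flag in expected and len(terms) != expected[flag]:
            raise ValueError(
                f"{flag} expects {expected[flag]} terms, got {len(terms)}")


EXPECTED = {"-f": 2, "-n": 2, "-w": 4}   # term counts from the example above

groups = group_by_flag("-f 10 10 -n 15 15".split())
check_counts(groups, EXPECTED)
print(groups)
```

A call such as check_counts(group_by_flag(["-w", "10", "10", "25"]), EXPECTED)
raises ValueError, which is the point at which a program could offer a
default or query the user instead.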

I have also come to prefer spaces to commas as argument separators, and would
recommend that the csh {a,b}.c notation be converted to use spaces.  There
needs to be a way to denote missing arguments, however, as well as a way
to test for them in programs.  I use "?", "??", and related entities
to trigger default values, prompting, help facilities, etc.  Unfortunately
the quotes are necessary to get these symbols past the shell syntax
expansion.  Perhaps "@" or a lone "-" could be used to stand for a missing
argument.

					-- Ken Laws
-------

geoff@callan.UUCP (Geoff Kuenning) (04/07/84)

>	It's rather amazing how long an argument can continue over the relative
>	merits of - and + based almost solely on one being shifted on most
>	keyboards.  People use <, >, !, &, $, etc. all the time with no
>	complaints about shifts. If dealing with a keyboard is *really* that
>	much of a problem, wouldn't it be more appropriate to go to a mouse
>	and menu system?
		- John Quarterman

The problem is that switch indicators and argument separators are typed much
more often than <, >, !, &, $, etc.  I can notice a definite speed difference
when I type the aforementioned characters (< and > are faster because they at
least don't require major hand motions even though they are shifted).  I don't
mind this with a character I type once per command at most like < and &, but
I mind it a lot with characters like the argument separator or the switch
indicator, which I type a lot more.

As to the mouse suggestion, I can only say "hear, hear".  What I *REALLY* want
(I think, I haven't tried it) is a two-handed chord keyboard where one or
even both chord keyboards are mounted on a mouse.  Then I could have the best
of all worlds.

	Geoff Kuenning
	Callan Data Systems
	...!ihnp4!sdcrdcf!trwrb!wlbr!callan!geoff

Vax?  Is that a 68000 with the bytes going the other way?

jsq@ut-sally.UUCP (John Quarterman) (04/08/84)

As to -option (or +option) being more common than >, <, or & in commands,

make >& errs &
(Bourne shell equivalent is "make >errs 2>&1 &")

is probably the most common command I type, with the possible
exception of

!!
(means replay last command to the Cshell)

so I'm not convinced.  Also, anything with more than one option per
command usually ends up in a makefile anyway, so where's the argument?


I'd kind of like to have a chording keyboard too:  it seems silly to
have to use two hands all the time.
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq