[net.unix] Comments on UNIX command option syntax

perlman@wanginst.UUCP (Gary Perlman) (10/30/85)

I tried to post this to mod.std.unix, but it got bounced by the mailer.  Sigh.
------------------

                 Proposed Syntax Standard
                 For UNIX* System Commands

RULE  1:  Command names must be between 2 and 9 characters.

RULE  2:  Command names must include lower case letters and
          digits only.

RULE  3:  Option names must be a single character in length.

RULE  4:  All options must be delimited by ``-''.

RULE  5:  Options with no arguments may be grouped behind
          one delimiter.

RULE  6:  The first option-argument following an option
          must be preceded by white space.

RULE  7:  Option arguments cannot be optional.

RULE  8:  Groups of option-arguments following an option must be
          separated by commas or separated by white space and quoted.

RULE  9:  All options precede operands on the command line.

RULE 10:  ``--'' may be used to delimit the end of the options.

RULE 11:  The order of options relative to one another
          should not matter.

RULE 12:  The order of operands may matter and position-related
          interpretations should be determined on a
          command-specific basis.

RULE 13:  ``-'' preceded and followed by white space should be used
          only to mean the standard input.

                                                  November 1983
*UNIX is a trademark of AT&T Bell Laboratories

--------------------------------------------------------------------

The above  is a  direct quote  of the  quick reference  card
handed out in  conjunction with  a talk at  the 1984  Winter
Usenix conference by K. Hemenway & H. Armitage.  This set of
rules is sometimes called the H&A standard.  Any proposal of
a standard is  going to  cause controversy, and  this is  no
exception.  Although I at first was opposed to the standard,
I came to appreciate the thought that went into it.  In this
commentary, I hope to convey that to you.

General comments: The H&A standard tries to maintain as much
compatibility with existing programs as possible while
improving the consistency of UNIX command line syntax.  This
is much harder than designing a command line syntax from scratch.
It is important to understand the rationale behind the whole
set of  conventions  before  making  judgements  about  them
individually.

H&A recorded the  syntax for  all the commands  in UNIX  (at
least System V UNIX).  They tried to come up with a standard
that was  as  close to  most  of the  existing  commands  as
possible.  Their analysis, summarized in their USENIX paper,
but much better covered in an unavailable internal Bell Labs
tech report, is an  excellent example of backing  statements
with facts.  The most common example of an objection to  the
standard is of the  form, "I don't like  RULE X. What  about
the zz command?"  to which  H&A could  say, "That  exception
happens in  only N  (few) commands."  Here are  my  comments
about the rules.  They contain my reaction to the rules  and
some of H&A's reasons for the rules.

I want to start by saying that this standard is much  better
than no standard.   If  I know  that a  command follows  the
standard, then there are no surprises about how options  are
requested and that makes life easier  for me.  I don't  have
to  worry  about  inconsistency,  and  that  overwhelms  the
quirks of the standard.

RULE 1:
          I see no reason for not allowing single  character
          command names like e, f,  w, and S, but there  are
          not many of  these.   There is  not much  mnemonic
          value to  single  character commands,  nor  for  2
          character commands, but there are a lot of those.

RULE 2:
          The restriction to lower case letters only is  for
          case insensitive systems.   One notable  exception
          is a.out, but that is  not really a command  name.
          Not allowing  special characters  like  underscore
          simplifies the rules.
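
As a rough illustration, Rules 1 and 2 together amount to a single
pattern check.  This is a sketch in Python, not anything taken from the
H&A paper; the function name is made up for the example.

```python
import re

# RULE 1 (2 to 9 characters) and RULE 2 (lower case letters and
# digits only) folded into one pattern.
COMMAND_NAME = re.compile(r"[a-z0-9]{2,9}")


def is_valid_command_name(name: str) -> bool:
    """Return True if name satisfies RULE 1 and RULE 2."""
    return COMMAND_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["ls", "nroff", "w", "S", "a.out"]:
        print(name, is_valid_command_name(name))
```

Under this check "ls" and "nroff" pass, while "w" (too short), "S"
(upper case) and "a.out" (contains a dot) do not.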

RULE 3:
          Single  character  option   names  are  not   very
          mnemonic, but  they are  necessary to  be able  to
          bundle options.  They are also used in most of the
          commands.     Their   lack   of   mnemonicity   is
          compensated somewhat when on-line help is  readily
          available, which unfortunately is not common.

RULE 4:
          The convention of preceding options with - started
          to distinguish  options  from file  names.    Some
          commands that do not  take operands like files  or
          expressions  do  not  require  the  -  sign.    My
          experience is  that  this  is  an  extra  rule  to
          explain to new  users that is  not worth saving  a
          keystroke here and there.

RULE 5:
          Bundling of options  was a rule  demanded by  UNIX
          fans inside Bell Labs.  Once you accept this rule,
          you can't  have  multiple character  options,  and
          this is unfortunate.  Still,  I would not like  to
          have to type: ls -l -t -r.
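
A small sketch of what bundling buys, using Python's getopt module
(which is modeled on the C getopt routine).  The parse function is a
stand-in for a hypothetical ls-like command, not real ls code.

```python
import getopt


def parse(argv):
    """Parse the flags -l, -t and -r, none of which take arguments."""
    opts, operands = getopt.getopt(argv, "ltr")
    # Sort the flag names so that the result does not depend on
    # the order in which the flags were typed.
    return sorted(name for name, _ in opts), operands


if __name__ == "__main__":
    print(parse(["-ltr", "dir"]))
    print(parse(["-l", "-t", "-r", "dir"]))
```

Both calls give the same flags and the same operand, so the bundled
form is only a shorter spelling of the separated one.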

RULE 6:
          Many programs  require  that  an  option  argument
          immediately follow the option (e.g., cc -lm, nroff
          -man) while  some require  a  space (e.g.,  cc  -o
          pgm).  This is the inconsistency that causes  the
          most problems  for me, especially when  there  are
          inconsistencies inside a command  (cf.  cc,  which
          passes the  tightly  grouped  option-arguments  to
          other  programs).     Rather   than  deciding   on
          no-space, a space is required in the H&A standard.
          This is to make sure that filename expansion works
          properly.   For example,  if  the argument  to  an
          option is a  file like  "extralongname", then  the
          option -fextra*  would not  work, while  having  a
          space in there would.   You could make the  syntax
          "space-optional" but that  would require that  the
          documentation cover more  than one  case, which  I
          argue would make the syntax harder to learn.
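
The filename expansion point can be imitated in Python with the fnmatch
module.  This is only a sketch: the expand function is hypothetical,
the directory listing is invented, and it assumes the Bourne shell
behavior of leaving a pattern unchanged when nothing matches.

```python
import fnmatch

# Invented directory listing containing the file from the example.
FILES = ["extralongname", "other"]


def expand(word, names):
    """Return the names matching word, or word itself if none match."""
    matches = sorted(fnmatch.filter(names, word))
    return matches if matches else [word]


if __name__ == "__main__":
    print(expand("-fextra*", FILES))  # no file name starts with "-f"
    print(expand("extra*", FILES))    # the pattern stands alone
```

With no space, the pattern includes the option letter and so matches no
file; with a space, the pattern is a word of its own and expands.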

RULE 7:
          Because option  arguments must  be separated  from
          options,  there  is  no  way  to  make  an  option
          argument optional, except in the special case where
          the option is at the end of a command line with no
          operands (but I think this rare exception would be
          hard to explain).  There are few commands that allow
          optional option-arguments (e.g., pr -h), and
          supplying a null argument (i.e., "") works as
          well.

RULE 8:
          This rule does not allow for syntax like:
              pgm -i file1 file2 file3 -o file4 file5
          but this  is  not  very common.    Placing  quotes
          around the files is not too bad.
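
Under Rule 8 the example above would be typed as something like
pgm -i file1,file2,file3 -o "file4 file5".  A sketch of how a program
might take such groups apart, in Python; pgm, its -i and -o options,
and both function names are hypothetical.

```python
import getopt
import re


def split_option_arguments(value):
    """Split one option-argument word on commas or white space."""
    return [part for part in re.split(r"[,\s]+", value) if part]


def parse(argv):
    """Parse -i and -o, each taking a group of file names."""
    opts, operands = getopt.getopt(argv, "i:o:")
    groups = {name: split_option_arguments(value) for name, value in opts}
    return groups, operands


if __name__ == "__main__":
    # The quoted group reaches the program as one word with spaces.
    print(parse(["-i", "file1,file2,file3", "-o", "file4 file5"]))
```

Either separator yields the same list, so the program needs only one
splitting routine for both forms the rule allows.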

RULE 9:
          When options must  precede operands (e.g.,  files)
          several practices  are  not  supported.    One  is
          choosing a set  of options for  one file and  then
          some options for another.  Instead of this, two or
          more command lines are needed,  but this is not  a
          serious penalty  for  most  commands,  and  not  a
          common need.  The  second unsupported practice  is
          that of thinking of  options after typing most  of
          the  command   line;  if   options  must   precede
          operands, then they must be inserted.  While  this
          can be awkward  for some primitive  shells, it  is
          best handled with  command line  editing, such  as
          that in ksh.

RULE 10:
          You really need -- to  delimit the end of  options
          so that files or expressions that begin with - can
          be processed.  The string  -- was used because  of
          getopt's use.   This is not  a strong  motivation,
          because at the time of the standard, only about 40
          commands used getopt.  Still,  it seems as good  a
          delimiter as any.
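
A sketch of the -- delimiter at work, using Python's getopt module,
which follows the C getopt convention here.  The command and its single
-v flag are hypothetical.

```python
import getopt


def parse(argv):
    """Parse a command whose only option is the flag -v."""
    return getopt.getopt(argv, "v")


if __name__ == "__main__":
    # After "--" the word "-file" is taken as an operand.
    print(parse(["-v", "--", "-file"]))
    # Without "--" it is read as bundled options and rejected.
    try:
        parse(["-v", "-file"])
    except getopt.GetoptError as err:
        print("error:", err)
```

Without the delimiter there is no way to name a file that begins
with - as an operand of this command.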


RULE 11:
          I do not know why the order of options should  not
          matter.  It does matter in commands like cc (i.e.,
          ld) that require a special ordering of libraries.

RULE 12:
          This rule says that the programmer can choose  any
          meaning to what follows the options.  Makes  sense
          to me.

RULE 13:
          There is some tradition and a definite need to  be
          able to insert the standard  input into a list  of
          files.  The - has been used in a few commands, and
          there were no likely contenders.
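
A sketch of the Rule 13 convention in Python: a cat-like routine that
reads the standard input wherever - appears among the operands.  The
function name and its stdin parameter are assumptions made for the
example.

```python
import sys


def read_operands(operands, stdin=None):
    """Concatenate the named files, reading stdin where the name is "-"."""
    stdin = sys.stdin if stdin is None else stdin
    chunks = []
    for name in operands:
        if name == "-":
            chunks.append(stdin.read())
        else:
            with open(name) as f:
                chunks.append(f.read())
    return "".join(chunks)


if __name__ == "__main__":
    # Operands come from the command line, e.g.: header - trailer
    sys.stdout.write(read_operands(sys.argv[1:]))
```

Because - is an ordinary operand, the standard input can be placed
anywhere in the list of files.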

My impression is that  the H&A standard is  one we can  live
with.   It is  not the  sort of  syntax that  someone  might
design from scratch, but there  is a need for  compatibility
with old  syntax, not  just for  user comfort,  but also  to
avoid breaking  thousands  of shell  scripts  and  system(3)
calls to UNIX command lines.  Yes, there is some more typing
required, but I think it  is not a high  price to pay for  a
set of conventions you can fit on a small card.  To get  all
the  time-savers  we  like,   the  syntax  gets  much   more
complicated, which  I  think  is  one  reason  for  the  bad
reputation UNIX has earned.

What about existing commands?  Last I heard, the plan was to
first work on the  easy cases, the  commands that were  very
close to the standard.   Some commands would not be  changed
but be replaced  by new  programs that would  phase out  the
old.   Some  examples of commands  with extremely  difficult
syntax are "pr" and "sort".  The "test" command is  finessed
by saying  that  it does  not  use options,  but  a  special
expression language.  The "find" command could be dealt with
similarly.    The  "dd"  command,  with  name-value   format
options, originally  designed as  a parody  of the  IBM  DD,
would not change.
-- 
Gary Perlman  Wang Institute  Tyngsboro, MA 01879  (617) 649-9731
UUCP: decvax!wanginst!perlman             CSNET: perlman@wanginst

jsq@im4u.UUCP (John Quarterman) (11/04/85)

In article <1260@wanginst.UUCP> perlman@wanginst.UUCP (Gary Perlman) writes:
>I tried to post this to mod.std.unix, but it got bounced by the mailer.  Sigh.
>------------------

I'll put it in mod.std.unix.  It should fit nicely after the copy
of the public domain AT&T getopt(3) which I just posted.

The posting address for mod.std.unix is ut-sally!std-unix.
Sally talks to gatech, harvard, ihnp4, seismo, topaz, and others.
-- 
John Quarterman,   UUCP:  {ihnp4,seismo,harvard,gatech}!ut-sally!jsq
ARPA Internet and CSNET:  jsq@sally.UTEXAS.EDU, formerly jsq@ut-sally.ARPA

friesen@psivax.UUCP (Stanley Friesen) (11/05/85)

In article <1260@wanginst.UUCP> perlman@wanginst.UUCP (Gary Perlman) writes:
>
>                 Proposed Syntax Standard
>                 For UNIX* System Commands
>
>RULE  4:  All options must be delimited by ``-''.

	Doesn't allow on/off toggles.
>
>RULE  6:  The first option-argument following an option
>          must be preceded by white space.

	I feel that space optional is more compatible with existing
practice; too many programs do it both ways. I would certainly hate to
say "nroff -m e file" instead of "nroff -me file", since I think of
the 'me' as a unit, that is I think the latter is more readable.
>
>RULE  9:  All options precede operands on the command line.
>
	This cannot be applied to compilers and loaders, where the
library options *must* come after or among the files to get the
correct semantics. It also makes switching options during processing
impossible, which is necessary for some sorts of sequential
processing.
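
To make the loader case concrete, here is a sketch in Python of a
command line walked strictly in order, so that each -l library keeps
its position among the files.  The function name and the tags are
invented for the example; this is not how any particular cc or ld is
written.

```python
def link_order(argv):
    """Return the words in command-line order, tagged by kind."""
    items = []
    for word in argv:
        if word.startswith("-l") and len(word) > 2:
            # Keep the library at the point where it was named.
            items.append(("library", word[2:]))
        else:
            items.append(("file", word))
    return items


if __name__ == "__main__":
    print(link_order(["a.o", "-lm", "b.o"]))
```

A parser that collected all options first would lose the fact that the
library was named between a.o and b.o.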

>RULE 11:  The order of options relative to one another
>          should not matter.

	Rules out sequential processing.
-- 

				Sarima (Stanley Friesen)

UUCP: {ttidca|ihnp4|sdcrdcf|quad1|nrcvax|bellcore|logico}!psivax!friesen
ARPA: ttidca!psivax!friesen@rand-unix.arpa

peter@graffiti.UUCP (Peter da Silva) (11/09/85)

Re: Stanley Friesen's comments about the proposed UNIX command syntax standard.

I'm glad that there's someone else on the net who believes that you can't
improve UNIX by adding more restrictions!
-- 
Name: Peter da Silva
Graphic: `-_-'
UUCP: ...!shell!{graffiti,baylor}!peter
IAEF: ...!kitty!baylor!peter

mike@whuxl.UUCP (BALDWIN) (11/11/85)

> Re: Stanley Friesen's comments about proposed UNIX command syntax standard.
> 
> I'm glad that there's someone else on the net who believes that you can't
> improve UNIX by adding more restrictions!
> -- 
> Name: Peter da Silva

<flame on>
Oh, bug off!  The purpose of the syntax standard is to make things more
uniform and easier to use.  Would you rather every command parses args
differently for no good reason??  Do you really want to remember FOR EACH
COMMAND whether it uses -, single letter options, lets you bundle args,
and allows/disallows/doesn't_care about whitespace for option arguments??

The syntax standard DOESN'T PUT RESTRICTIONS on how you want to parse args;
your program won't be silently removed if it doesn't adhere to it.  But if
you want to be CONSISTENT and not have to tell people YET ANOTHER WAY TO
PASS OPTIONS, then you can use getopt(3C) and not worry.  Besides, adding
getopt to programs usually REMOVES restrictions; lots of programs do it
one way or another (bundle/no_bundle, space/no_space), but not both because
it is just a pain to take care of all the cases.
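
A sketch of that point using Python's getopt module, which is modeled
on the C routine.  The command, with a flag -v and an option -o that
takes an argument, is hypothetical.

```python
import getopt


def parse(argv):
    """Parse the flag -v and the option -o, which takes an argument."""
    opts, operands = getopt.getopt(argv, "vo:")
    return dict(opts), operands


if __name__ == "__main__":
    print(parse(["-v", "-o", "pgm", "x.c"]))  # separated, with a space
    print(parse(["-vopgm", "x.c"]))           # bundled, with no space
```

Both spellings parse to the same result, and the program itself
contains no code for either case.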

There are always going to be commands that don't adhere to the standard,
like cc, pr, and sort.  But that's OK.  The purpose of writing down the
standard is that when you go to write a NEW command, you have something
to aim for.  And it's not intended to be the best way ever to do option
handling; it is meant to encapsulate the most prevalent means of option
handling currently found in UNIX.

It is NOT meant to encompass every single way options are dealt with in
UNIX.  If it DID, it would be so vague and wishy-washy that it would be
useless.

Do YOU have a counter proposal?  If you do, I'd like to hear it.  It should
NOT be radically different from the way things are done now.  I.e., programs
like ls, cat, ed, grep, sed, etc., should be able to use it without any
problems.  It should be easy to teach people.  If you DON'T have a proposal,
then just SHUT UP.
<flame off>
-- 
						Michael Baldwin
						{at&t}!whuxl!mike

friesen@psivax.UUCP (Stanley Friesen) (11/13/85)

In article <791@whuxl.UUCP> mike@whuxl.UUCP (BALDWIN) writes:
>> Re: Stanley Friesen's comments about proposed UNIX command syntax standard.
>
><flame on>
>Oh, bug off!  The purpose of the syntax standard is to make things more
>uniform and easier to use.  Would you rather every command parses args
>differently for no good reason?? 

	I agree, but I think it could do this better than it does.
Uniformity is *nice*, but it can be carried too far. Actually you seem
to have misunderstood the purpose of my remarks. I was trying to
propose a few minor changes in the standard which I feel would improve
its utility and flexibility, without harming its value as a standard.
>
>The syntax standard DOESN'T PUT RESTRICTIONS on how you want to parse args;
>your program won't be silently removed if it doesn't adhere to it.  But if
>you want to be CONSISTENT and not have to tell people YET ANOTHER WAY TO
>PASS OPTIONS, then you can use getopt(3C) and not worry.

	Oh, I intend to use getopt(3C) as much as possible, but I
would hate to see it enforce *all* of the proposed rules. And yes the
full set of rules in the standard is restrictive, too restrictive. For
me to be willing to actually use the standard it would have to be much
less restrictive. As it stands, I will ignore the standard whenever it
is convenient to do so! It will not get very far as a standard, and
will not produce much uniformity, if loads of people ignore it as
being too restrictive. Whoever is in charge of it should rewrite it to
be more flexible.

>getopt to programs usually REMOVES restrictions; lots of programs do it
>one way or another (bundle/no_bundle, space/no_space), but not both because
>it is just a pain to take care of all the cases.
>
	Agreed, this is why I will probably use getopt(), but *not*
the syntax standard!

>There are always going to be commands that don't adhere to the standard,
>like cc, pr, and sort.  But that's OK.  The purpose of writing down the
>standard is that when you go to write a NEW command, you have something
>to aim for.

	Now, when I invent a new language, say 'Blaise', and write a
compiler for it(a NEW command) I must either cripple it by leaving
out the '-l' option capability, or violate the standard by allowing
post argument options with position dependent meaning! The standard
should include all *major* ways of using options in current commands.
I consider the compilers to be *major* commands, even though there are
only a few of them. The standard needs to include them!

> And it's not intended to be the best way ever to do option
>handling; it is meant to encapsulate the most prevalent means of option
>handling currently found in UNIX.
>
	And it misses; it fails to incorporate the very commonly used
option handling syntax of every existing compiler on UNIX! Just
because it covers the syntax of the greatest *number* of separate
commands does not mean it covers the syntax of the *most* *used*
commands. It needs to do *both*.

>It is NOT meant to encompass every single way options are dealt with in
>UNIX.  If it DID, it would be so vague and wishy-washy that it would be
>useless.
>
	Agreed, but see above, it misses some *important* current uses.

>Do YOU have a counter proposal?  If you do, I'd like to hear it.  It should
>NOT be radically different from the way things are done now.  I.e., programs
>like ls, cat, ed, grep, sed, etc., should be able to use it without any
>problems.

	Yes, I do have some suggestions, and I gave them in my
original article. To start with, drop the rule about *all* options
preceding *all* arguments.

-- 

				Sarima (Stanley Friesen)

UUCP: {ttidca|ihnp4|sdcrdcf|quad1|nrcvax|bellcore|logico}!psivax!friesen
ARPA: ttidca!psivax!friesen@rand-unix.arpa