[mod.std.unix] a bit more on getopt

jsq@ut-sally.UUCP (John Quarterman) (07/16/85)

From: John Quarterman <std-unix-request@ut-sally.UUCP> (moderator)
Topic: command line arguments (getopt)

There's getting to be a bit of repetition and a certain amount of
flamage on this subject.  Several things seem clear to me, at least:

1) Keith Bostic's getopt is the de facto standard public domain getopt
	because a) it implements everything the System V one does,
	down to specific error messages and global variables, and
	b) it's going to be in 4.3BSD.  It also may or may not be
	more efficient or smaller when seen in perspective.
2) Henry Spencer's getopt, in the version that I posted that Ian Darwin
	sent, is just about as good, since its earlier bug is fixed,
	though it lacks some undocumented System V features which
	Bostic's includes.
3) There are numerous minor functional changes which might be desirable
	for one reason or another, but they would *be* changes, and are
	not in the "standard" getopt.  The existing getopt is good
	enough for most purposes, and is widely available:  there is
	no need for another implementation.

While these are my personal opinions, they appear to agree with
those of the two getopt authors mentioned above.  Since I have
to go by something in moderating the newsgroup, from now on I will
discourage submissions which merely argue the above points again.
In other words, let's try to think of something new to say, or
go on to something else.


----------------------------------------------------------------------

From: ihnp4!utzoo!henry (Henry Spencer)
Date: 15 Jul 85 16:10:12 CDT (Mon)
To: ihnp4!ut-sally!std-unix
Subject: Re: What to do about extraneous arguments?
References: <251@mcc-db.UUCP>

> Common practice seems to be to ignore extraneous arguments.  A user here
> has requested that cmp(1) be modified to generate a diagnostic if more
> than 2 filenames are provided.  ...

The standard program skeleton for using getopt(3) includes, after the
big while-switch for the options, the code:

	if (errflg) {
		fprintf(stderr, "Usage: ...\n");
		exit(2);
	}

It's pretty simple to change that "if (errflg) {" to something like
"if (errflg || optind+2 != argc) {".  This is what we do in such cases.
Easy and effective.
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

From: ihnp4!utzoo!henry (Henry Spencer)
Date: 15 Jul 85 16:09:35 CDT (Mon)
To: ihnp4!ut-sally!std-unix
Subject: Re: command line arguments
References: <2210@ut-sally.UUCP>, <244@mcc-db.UUCP>

> Regarding getopts and the 'all arguments are preceded by a "-"':
> What about arguments that can be on/off switches, or can be positive or
> negative numbers? In other words, what is wrong with allowing '+' as an
> indicator of arguments? There are some commands that use it already.

The AT&T people considered this.  (It would be nice to see a posting of
the AT&T paper, in fact, if someone has it machine-readable [I don't]
[ I don't either, but if someone would mail it to me, I would post it
(if it's reasonably short) -mod ]; it would shorten this discussion
considerably.)  They thought the following were reasonable criteria for
accepting + :

1. Clear, simple, well-defined rules for when + should be used.
2. Rules should be applicable to more than just a few atypical commands.
3. Use of + should complement, not conflict with, the general use of -
	as an option marker (*not* as a "negation" indicator, note).

Their observations were that the existing uses of + meet none of these
criteria, that compatibility would prevent cleaning up existing uses,
and that criterion #3 seemed impossible to satisfy.  So, no +.

> Incidentally, what happens with getopts if the command line was
> command -n -30
> and:
> Option n is fetched
> option 3 is fetched
> option 0 is fetched
> 
> (No well written program would do all this, but essentially, what happens
> if an argument looks like a flag? Or have you never grep'ed for a string
> beginning with -?)

If -n is an option taking a value, then the next argument is its value,
even if it happens to have a - on the front.  The apparent ambiguity is
resolved by knowing which options take values and which don't.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

----------------------------------------------------------------------

The moderated newsgroup mod.std.unix is for discussions of UNIX standards,
in particular of the draft standard in progress by the IEEE P1003
"UNIX Standards" Committee.

Submissions to the newsgroup to:	ut-sally!std-unix
Comments about the newsgroup to:	ut-sally!std-unix-request
Permission to post to the newsgroup is assumed for mail to the former address,
but not for mail to the latter address, nor for mail to my personal addresses.
-- 

John Quarterman,   UUCP:  {ihnp4,seismo,harvard,gatech}!ut-sally!jsq
ARPA Internet and CSNET:  jsq@ut-sally.ARPA, soon to be jsq@sally.UTEXAS.EDU

jsq@ut-sally.UUCP (John Quarterman) (07/18/85)

From: John Quarterman <std-unix-request@ut-sally.UUCP> (moderator)
Topic: command line arguments (getopt)

> There's getting to be a bit of repetition and a certain amount of
> flamage on this subject.  Several things seem clear to me, at least:
> 
> 1) Keith Bostic's getopt is the de facto standard public domain getopt

Well, it seems I may have been hasty to state that.  More follows.

----------------------------------------------------------------------

Date: Tue, 16 Jul 85 19:10:43 edt
From: ihnp4!cmcl2!edler (Jan Edler)
To: ihnp4!ut-sally!std-unix
Subject: another getopt

Another important public-domain implementation of getopt(3)
is the one from AT&T available from the UNIX system toolchest.

	Jan Edler
	New York University
	ihnp4!cmcl2!edler
	edler@nyu.arpa

[ The toolchest number is 201-522-6900, where you can log in as guest.
Getopt is listed as for $0.00, though there is evidently a $100.00
registration fee, a transfer fee ($10?) and tax.

If the source for this was actually published in the Dallas /usr/group
meeting proceedings, could someone who has them please type it in and
submit it to this newsgroup?  I could assume that the getopt in my
System V sources is the same as that published at Dallas and post it,
but it might not be.  -mod ]

----------------------------------------------------------------------

From: seismo!cbosgd!pegasus!hansen (Tony Hansen)
Date: Thu, 18 Jul 85 01:29:42 EDT
To: ut-sally!std-unix, cbosgd!seismo!keith
Subject: Re: getopt(3) (again...)
In-Reply-To: <250@mcc-db.UUCP>
Organization: AT&T-IS Labs, Lincroft, NJ

In article <250@mcc-db.UUCP> you write:
>Date: Thu, 11 Jul 85 14:07:41 EDT
>From: Keith Bostic <keith@seismo.CSS.GOV>
>Subject: getopt(3) (again...)
>
>Just when I thought it was safe to read my mail...
>
>> From: harvard!talcott!wjh12!mirror!rs@ut-sally.ARPA (Rich Salz)
>>
>> i made a couple of changes.  esthetics, absolutely no stdio if
>> desired, and the opterr variable.  here's my revision:
>
>I'm getting pretty tired of this whole issue -- in fact, I kept starting
>to reply to your mail and then deciding not to, figuring that if I didn't
>maybe the whole thing would die off.  *sigh*  Well, my friend, here's
>a reply.  The content is simple.  You are wrong.  Pure-d, absolutely,
>wrong.

Actually, the recently posted rewrite by Rich Salz is closer to AT&T's code
than is yours and his is more accurate.

>Point by point:
>
    ... (no comment on aesthetics)
>
>absolutely no stdio if desired:
>	Well, for an error condition that's going to happen once before the
>program exits, it's gonna be faster.  You saved about 2 routine calls, near
>as I can figure.  You didn't save any library space, which is why I changed
>the original fprintf() calls to fputs() calls.

Actually this is important in some applications which do not already use
stdio and do not wish to load in the 10k or so overhead that using stdio
incurs. AT&T's code does not use stdio in getopt(3).

>the opterr variable:
>	The other two items, I could live with.  Here, on the other hand,
>you have single-handedly created a real pain in the ass in terms of
>portability.
>
>Scenario #1:
>	Bell Labs doesn't ever decide to use opterr.  Fine and dandy,
>	except that people who use your new flag will find that their
>	code will not run as expected on USG UNIX.

Sigh. Here's the real crux of the matter: USG UNIX already has and uses
opterr exactly as used by Rich's code. It is poorly documented,
unfortunately.

>I would have been much more amenable to changes two months ago; if you
>can get Mike Karels to use your version rather than mine, I will again
>be much more amenable to changes.  Well, with the exception of your use
>of opterr.

I thought UCB had a System V license. Couldn't they have checked your
public-domain version against the code that was in the System V source
easily enough for incompatibilities?

In fact, why go with yours or Rich's version at all and not use the
public-domain version that AT&T published at January's Uni-Forum in Dallas?
That would have gotten rid of all thought of incompatibility!
[ They may not have been aware of it:  few other people seem to be.
Perhaps you could type it in and submit it to the newsgroup? -mod ]

>  The world does not need another getopt.

You're right. Why'd you bother adding one? :-)

>	..., or, of course, we could just diverge the two systems
>	again.  Fun, fun!

I hope 4.3BSD picks up AT&T's public-domain version of getopt(3) for use
rather than creating yet-another branching of ways by using yours, Keith, or
Rich's.  [ You could submit the AT&T source to Berkeley as a bug fix.  -mod ]

					Tony Hansen
					ihnp4!pegasus!hansen

----------------------------------------------------------------------

From: Dave Berry <seismo!cstvax.ed.ac.uk!mcvax!db>
Date: Tue, 16 Jul 85 15:43:56 bst
To: ut-sally!std-unix
Subject: Command lines

It's probably way too late for this to be suggested, but the longer it's
left, the worse it will be.
How about completely revamping the unix command line syntax to be
	command {{-option ...} {file ...} ...}
with command names & option names being full words (e.g. remove, not rm)
and using command (and argument) completion a la VMS?  I used UNIX for three
years before using VMS, and I far prefer this approach to command line syntax
(though VMS filenames & DCL are appalling!).
	A couple of MMI articles I've read (in CACM I think) report evidence
that users far prefer command completion to cryptic abbreviations in the
UNIX tradition.
	The rest of UNIX is being dragged kicking & screaming into the
"real world" - I'd like to see this change happen too.

[ Command and file name completion has been added to the C shell in
several steps and posted to net.sources over the last couple of years.
4.3BSD will include both (made quite a bit more efficient) as an option
in the distributed C shell (according to what the Berkeley CSRG people
said at the 4BSD BOF at the Portland USENIX).  I don't know if such
has been done in any version of the Bourne or Korn shells.  -mod ]

----------------------------------------------------------------------

From: jsq@ut-sally.ARPA (John Quarterman)
Date: Thu Jul 18 10:51:48 CDT 1985
To: ut-sally!std-unix
Subject: Re: Command lines

It seems to me that general command argument completion would have to
be implemented in each command, and would require each command to be
able to interact with terminals.  Doesn't seem worth it to me, but then
I've always thought argument completion to be one of TENEX/TOPS-20/VMS's
most annoying features.  Argument completion would also seem to rule
out multiple options in the same word, which is a bit of a compatibility
problem.

----------------------------------------------------------------------

This moderated newsgroup, mod.std.unix, is for discussions of UNIX standards,
in particular of the draft standard in progress by the IEEE P1003 Committee.
The newsgroup is also gatewayed to an ARPA Internet mailing list.

Submissions to:	ut-sally!std-unix	  or std-unix@ut-sally.ARPA
Comments to:	ut-sally!std-unix-request or std-unix-request@ut-sally.ARPA
Permission to post to the newsgroup is assumed for mail to std-unix,
but not for mail to std-unix-request, nor for mail to my personal addresses.
-- 

John Quarterman,   UUCP:  {ihnp4,seismo,harvard,gatech}!ut-sally!jsq
ARPA Internet and CSNET:  jsq@ut-sally.ARPA, soon to be jsq@sally.UTEXAS.EDU

jsq@ut-sally.UUCP (John Quarterman) (07/19/85)

From: John Quarterman (moderator) <jsq@ut-sally.UUCP>
Topic: still more on command line arguments (getopt)

----------------------------------------------------------------------

From: seismo!nsc!idi!bene!luke!itkin (Steven List)
Date: Thu, 18 Jul 85 09:20:51 pdt
To: ut-sally.ARPA!std-unix
Subject: extraneous arguments
Organization: Benetics Corp, Mt. View, CA

>From: ihnp4!tektronix!uucp@ut-sally.ARPA
>Date: Saturday, 13 Jul 85 18:43:47 PDT
>Subject: What to do about extraneous arguments?
>
>Another aspect of command arguments is: after all the necessary arguments
>have been processed, what if some are left?
>

I'm in agreement with tektronix!rdoty.  I believe no program should produce
unexpected results without some explanation.  In the case of programs
like cmp and diff, a diagnostic AND a nonzero exit status would seem to
be appropriate.  The diagnostic message would tend to satisfy checks on
the size of the output being nonzero, and the status would satisfy
status checks.

----------------------------------------------------------------------

Date: Wed, 17 Jul 85 12:00:54 cdt
From: neuro1!baylor!peter@rice.uucp (Peter da Silva)
Subject: Re: Re: command line arguments
References: <245@mcc-db.UUCP>

> Date: Mon, 8 Jul 85 00:52:46 pdt
> From: nsc!turtlevax!ken@ihnp4.UUCP (Ken Turkowski)
> Subject: Re: command line arguments
> 
> Someone suggested that parsing arguments in shell scripts was difficult.
> I include the following shell scripts, one for the Bourne shell and one
> for the C-shell, which parse arguments of the form:
> 	-v -O -o outfile file1 file2 file3
> as well as
> 	-vOooutfile file1 file2 file3
> 

Sure, you can make shell scripts do almost anything. When I get a source with
that sort of stuff in it I generally rip it out & put up with weirdness. Why?
Well, our system is badly overloaded. Commands like that take 30 seconds to
a minute to start up!

----------------------------------------------------------------------

Date: Wed, 17 Jul 85 12:04:53 cdt
From: neuro1!baylor!peter@rice.uucp (Peter da Silva)
Subject: Re: Re: command line arguments
References: <246@mcc-db.UUCP>

> > I doubt the necessity and even the wisdom of separating an argument from
> > the option by whitespace.
> 
> As I recall it, the AT&T standard does it this way on the grounds of
> readability, not necessity.  The "-t/dev/tty" example is an easy one
> to pick out, but what about "-dfaglop"?  Which of those letters are
> options, and which are an option argument?

OK, instead of forcing whitespace, how about requiring that there only be one
flag if you are going to do this sort of stuff? I have had shell scripts
totally broken by this requirement, and workarounds take up so much overhead
(yes, some people have systems smaller than vaxen) that it's not worth the
hassle.

----------------------------------------------------------------------


jsq@ut-sally.UUCP (John Quarterman) (07/19/85)

Date: Thu, 18 Jul 85 20:29:59 EDT
From: Keith Bostic <seismo!keith>
To: /dev/null
Subject: Re: getopt(3) (again...)
Cc: pegasus!hansen, ut-sally!std-unix

> Actually, the recently posted rewrite by Rich Salz is closer to AT&T's code
> than is yours and his is more accurate.

You're right, I apologize.  I totally missed the USG use of opterr and have
updated my code appropriately.  I am currently trying to get 4.3 to use
the correct code.

> Actually this is important in some applications which do not already use
> stdio and do not wish to load in the 10k or so overhead that using stdio
> incurs. AT&T's code does not use stdio in getopt(3).

Not true.  The size difference between:

	main() { puts("foo"); }
and
	main() { write(0,"foo",3); }

is exactly zero.

> In fact, why go with yours or Rich's version at all and not use the
> public-domain version that AT&T published at January's Uni-Forum in Dallas?
> That would have gotten rid of all thought of incompatibility!

Amen, I didn't know about it in January or I would have said something when
Berkeley asked to use mine.

--keith

jsq@ut-sally.UUCP (John Quarterman) (07/20/85)

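From: seismo!BBN-LABS-B.ARPA!dan (Dan Franklin)
To: ut-sally!BBN-LABS-B.ARPA!std-unix
Subject: Re: a bit more on getopt
Date: 19 Jul 85 11:32:03 EDT (Fri)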

>  > Actually this is important in some applications which do not already use
>  > stdio and do not wish to load in the 10k or so overhead that using stdio
>  > incurs. AT&T's code does not use stdio in getopt(3).
> 
>  Not true.  The size difference between:
> 
>  	main() { puts("foo"); }
>  and
>  	main() { write(0,"foo",3); }
> 
>  is exactly zero.

Your second one-liner is still using stdio.  The difference between
       main() { puts("foo"); }
and
       main() { write(1, "foo", 3); }   exit(n) { _exit(n); }
on the other hand, is substantial, at least on my 4.2 VAX system (and, in my
experience, on other UNIX systems as well):

text	data	bss	dec	hex
2048	1024	15988	19060	4a74	stdio
1024	1024	0	 2048	 800	nostdio

1024	0	15988	17012		difference

The point about not using stdio in a library routine if it's not necessary
still stands.

	Dan Franklin
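
As a rough illustration of the point, a library routine can issue its
diagnostic with write(2) alone.  This is a sketch under assumptions,
not any of the getopt implementations discussed here: the function name
report_bad_option is hypothetical, and the message wording is only
modelled on the familiar "illegal option" diagnostic.

```c
#include <string.h>
#include <unistd.h>	/* write */

/* Write exactly n bytes or report failure. */
static int put(int fd, const char *s, size_t n)
{
	return write(fd, s, n) == (ssize_t)n ? 0 : -1;
}

/*
 * Hypothetical helper: report a bad option letter on file descriptor
 * fd (2 for standard error) without calling any stdio function.
 * Returns 0 on success, -1 if a write fails.
 */
int report_bad_option(int fd, const char *prog, char letter)
{
	static const char msg[] = ": illegal option -- ";
	char tail[2];

	tail[0] = letter;
	tail[1] = '\n';
	if (put(fd, prog, strlen(prog)) < 0 ||
	    put(fd, msg, sizeof msg - 1) < 0 ||
	    put(fd, tail, sizeof tail) < 0)
		return -1;
	return 0;
}
```

Whether this actually saves space depends on how the C library is
archived, which is the question the rest of the thread argues over.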

jsq@ut-sally.UUCP (John Quarterman) (07/21/85)

From: John Quarterman (moderator) <std-unix-request@ut-sally>
Topic: command line arguments (getopt)
	Is the AT&T getopt public domain, or just published?
	AT&T getopt(3) man page is inaccurate.
	Inclusion of stdio and size of programs.
	Options, white space, and shell scripts.
	Full word command names and arguments, and completion.

----------------------------------------------------------------------

From: ihnp4!utzoo!henry (Henry Spencer)
Date: 19 Jul 85 20:14:44 CDT (Fri)
To: ihnp4!seismo!ut-sally!std-unix
Subject: Is AT&T getopt public-domain?  Not clear!

The document I have from the /usr/group standards meeting at Dallas
does *not* say that the AT&T getopt is being made public domain.  What
it says is:

	The [getopt] source code is being published by AT&T Bell
	Laboratories to encourage adherence to the command syntax
	standard and to satisfy requests from [everyone in sight].

Note that publishing something does *not* put it into the public domain
unless this is stated explicitly.  That may have been AT&T's intent, but
they didn't say it that way.  The document in question includes the AT&T
source, but I am reluctant to submit it to mod.std.unix until its status
is clarified.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

Date: 20 Jul 85 11:03:28 EDT (Sat)
From: topaz!packard!nomad!ggr (Guy Riddle)
Subject: getopt: the Code vs. the Manual Page
To: ut-sally!std-unix
Cc: seismo!keith

I hope you haven't been modelling the 4.3 version of getopt(3) too
closely after the SVR2 manual page, for it lies about the details.

It states

	"Because optind is external, it is normally initialized to
	zero automatically before the first call to getopt."

Well, I'll grant that optind is external, but it is initialized to one.

Also, the claim that

	"This error message may be disabled by setting opterr
	to a non-zero value."

is also a lie.  Opterr is initialized to one, and to disable the error
message you must set it to zero.

		=== Guy Riddle == AT&T Bell Laboratories, New Jersey ===

----------
|Rebuttal
|Corner
|
	Keith's assertion that

	> Not true.  The size difference between:
	> 
	> 	main() { puts("foo"); }
	> and
	> 	main() { write(0,"foo",3); }
	> 
	> is exactly zero.

	might be valid for 4.2, but it's not for SVR2, where the size of the
	puts(3) version is

		2432 + 456 + 2232 = 5120

	while the write(2) version is only

		128 + 12 + 0 = 140

------------------------------

From: ihnp4!decvax!borman (Dave Borman)
Date: Sat, 20 Jul 85 21:01:42 edt
To: decvax!ihnp4!ut-sally!std-unix
Subject: getopt(3) & stdio

>  >  > Actually this is important in some applications which do not already use
>  >  > stdio and do not wish to load in the 10k or so overhead that using stdio
>  >  > incurs. AT&T's code does not use stdio in getopt(3).
>  > 
>  >  Not true.  The size difference between:
>  > 
>  >  	main() { puts("foo"); }
>  >  and
>  >  	main() { write(0,"foo",3); }
>  > 
>  >  is exactly zero.
>  
>  Your second one-liner is still using stdio.  The difference between
>     main() { puts("foo"); }
>  and
>     main() { write(1, "foo", 3); }   exit(n) { _exit(n); }
>  on the other hand, is substantial, at least on my 4.2 VAX system (and, in my
>  experience, on other UNIX systems as well):

The first two examples are different: the puts() will pull in stdio and
the write() should not.  If you have to explicitly re-declare exit() to
avoid pulling in the stdio package, then your /lib/libc.a is mucked up.
exit() calls _cleanup, of which there are two copies in the /lib/libc.a.
The stdio package has a function _cleanup which flushes all the buffers.
There is also a dummy _cleanup routine (usually fakcu.s) which just
returns.  In /lib/libc.a, exit() must be archived after all the stdio
functions, and the dummy _cleanup must be archived after exit.  If you
have exit() before the stdio functions, then the reference to _cleanup
will pull in the whole stdio package.  If exit() is after the stdio
package and the dummy _cleanup after exit(), then if you don't reference
any stdio functions you will only pull in the dummy cleanup routine.

		-Dave Borman, Digital UNIX Engineering Group.
		decvax!borman

------------------------------

Date: Sat, 20 Jul 85 16:31:33 PDT
From: seismo!sun!guy (Guy Harris)
To: ut-sally!jsq
Subject: Re: a bit more on getopt

> Not true.  The size difference between:
>
>	main() { puts("foo"); }
> and
>	main() { write(0,"foo",3); }
>
> is exactly zero.

Only true on certain UNIX implementations.  Under Sun UNIX 2.0
(4.2BSD-based), there is a slight difference.  Under System V there is a
significant difference.  The problem is that 4.2BSD *always* drags in the
Standard I/O Library while System V doesn't.  4.xBSD could be changed, with
about 30 minutes work, to work the way System V does here, so the assumption
should be made that the Standard I/O Library does consume a nonzero amount
of code and data space.  (About 13788 bytes in one test I did; this doesn't
count buffers which are "malloc"ed when the first read/write is done.)

	Guy Harris

------------------------------

From: ihnp4!utzoo!henry (Henry Spencer)
Date: 20 Jul 85 20:45:50 CDT (Sat)
To: ihnp4!seismo!ut-sally!std-unix
Subject: Re: a bit more on getopt
References: <251@mcc-db.UUCP> <2365@ut-sally.UUCP> <2392@ut-sally.UUCP>, <2399@ut-sally.UUCP>

> > ...  The "-t/dev/tty" example is an easy one
> > to pick out, but what about "-dfaglop"?  Which of those letters are
> > options, and which are an option argument?
> 
> OK, instead of forcing whitespace, how about requiring that there only be one
> flag if you are going to do this sort of stuff? I have had shell scripts
> totally broken by this requirement, and workarounds take up so much overhead
> (yes, some people have systems smaller than vaxen) that it's not worth the
> hassle.

We do a *lot* of shell programming, and our experience is that the
separating blank makes life easier, not harder.  Of course, we generally
use getopt(1) for the argument parsing, which makes life simpler.  utzoo
is a PDP11, by the way.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

From: ihnp4!utzoo!henry (Henry Spencer)
Date: 19 Jul 85 20:15:23 CDT (Fri)
To: ihnp4!seismo!ut-sally!std-unix
Subject: Command lines

> It's probably way too late for this to be suggested, but the longer it's
> left, the worse it will be.
> How about completely revamping the unix command line syntax to be
> 	command {{-option ...} {file ...} ...}
> with command names & option names being full words (e.g. remove, not rm)...

The AT&T command-line-syntax people have alluded to attempts to do this
in the past at AT&T.  They were failures.  It is not enough to decree a
standard; one must also convince people to accept it.  The getopt standard
has been widely accepted precisely *because* it tidies up and standardizes
the existing situation, rather than trying to impose radical change.

There are also problems with full-word options, and worse problems with
full-word options that can be arbitrarily abbreviated, but I won't get
into that since it seems a digression.

I've thought about this at considerable length, and concluded that radical
change will require more incentive than a simplistic revision of command
syntax would provide.  VMS-style "completion" isn't enough.  What one wants
is much more sophisticated help in command construction, including things
like interactive "help" information for options, knowledge of the semantics
of arguments so that error repair can be applied, etc.  Imbedding this into
every program seems dubious; it would seem better to have a sophisticated
shell which uses a database describing the commands.  Note that such an
interface could completely hide the details of the *actual* command syntax.
Someday...
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

Discussions-Of: UNIX standards, particularly the IEEE P1003 draft standard.
Submissions-To:	ut-sally!std-unix	or std-unix@ut-sally.ARPA
Comments-To: ut-sally!std-unix-request	or std-unix-request@ut-sally.ARPA
UUCP-Routes: {ihnp4,seismo,harvard,gatech}!ut-sally!std-unix
Archives-In: ~ftp/pub/mod.std.unix on ut-sally.ARPA (soon sally.UTEXAS.EDU)
