[net.lang.c] C standard revisions

jbn@wdl1.UUCP (07/12/84)

        Some comments on the C revision follow; my previous posting of
	similar comments was reportedly garbled in transmission.

    1.	It is permissible to change the language incompatibly if machine
	conversion is possible and straightforward, and it is better to 
	change the language incompatibly than to introduce painful mechanisms 
	for backward compatibility.  If this is unacceptable, the obsolete
	constructs should be explicitly defined as features retained for
	backward compatibility in this revision and subject to removal in
	a later revision, and compilers should issue warning messages
	when such constructs appear.  The way the transition from "=+" to
	"+=" was handled is an example of the right way to do this.
    
    2.  If syntax for checking procedure calls across compilation units
	is to be introduced at all, it should be mandatory.  Otherwise it
	is almost useless as a mechanism for preventing bugs.  Also, if
	it is mandatory, compilers can generate better code when passing
	small types; the present definition requires that all chars, for
	example, be expanded to ints in parameter passing, which makes
	calling functions more expensive for C than it should be.

    3.  The problem of functions with variable numbers of arguments is
	an important one.  The (argc,argv) mechanism has proved very
	effective in dealing with variable numbers of arguments on the
	command line, and something like this is probably better than
	an omitted-argument approach.  The type of the entries in argv
	remains a problem.  But the idea of passing an array of pointers
	and a count is a good one.  Few languages deal with this problem
	in a clean way; in most, I/O is a hack for this reason.  Lisp
	nlambdas are probably the cleanest solution in a widely used language.
	Ada has a sound approach to omitted arguments, but it is complex
	and unsuited to C.

	The basic test for this mechanism is, of course, whether "stdio" can
	be written portably in C.  The omitted-argument approach fails this
	test, because there is going to be some limit on the number of
	arguments to the "printf" family of routines.  Worse, the approach
	of defining "printf" with a large number of arguments, some of
	which are omitted, doesn't allow all the "printf"-like routines
	to use the same common routine ("doprint" in some implementations)
	in a sound way, because it implies references to parameters that
	weren't passed.  This is formally an error, and may be an actual
	error if your system doesn't allow stack growth on fetches, as well
	as stores.

	The (argc,argv) approach, on the other hand, handles this quite
	nicely.  This is in fact near to the way the present routines for UNIX 
	work, except that they use an implementation-dependent trick to access
	the stack directly, something which depends upon knowledge of 
	exactly how parameters are passed.  (There are other ways to
	pass parameters to subroutines than pushing them on the stack;
	there are compilers, for example, that put the first three arguments
	in registers and only use the stack if there are more than three
	arguments.  This provides a significant performance improvement
	on many machines.)  In fact, the UNIX library routines here are
	not portable across machines which grow the stack in different
	directions.
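
As a rough illustration of the (argc,argv) idea applied to a "printf"-like
routine, here is a minimal sketch in present-day C.  The name format_args,
its signature, and the rule that a bare '%' consumes the next argument are
all assumptions made for this example, not anything proposed in the posting.
Entries are restricted to strings, which sidesteps rather than solves the
entry-type problem noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical (argc, argv)-style formatter: each '%' in fmt is replaced
 * by the next string in argv.  Returns the number of arguments consumed,
 * or -1 if the output buffer is too small (contents then unspecified). */
int format_args(char *out, size_t outsize, const char *fmt,
                int argc, const char *const argv[])
{
    size_t n = 0;   /* characters written so far; always < outsize */
    int used = 0;   /* arguments consumed so far */

    assert(out != NULL && fmt != NULL);
    if (outsize == 0)
        return -1;

    for (; *fmt != '\0'; fmt++) {
        if (*fmt == '%' && used < argc) {
            /* The count bounds every access, so the callee never
             * touches an argument that was not passed. */
            const char *s = argv[used++];
            size_t len = strlen(s);

            if (len >= outsize - n)      /* keep room for the terminator */
                return -1;
            memcpy(out + n, s, len);
            n += len;
        } else {
            if (outsize - n <= 1)        /* keep room for the terminator */
                return -1;
            out[n++] = *fmt;
        }
    }
    out[n] = '\0';
    return used;
}
```

Because the callee sees only a count and an array, nothing here depends on
how the machine passes parameters or on which way its stack grows.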

Other than the above, the standard revision looks good from the point of
view of someone concerned with portability.

					John Nagle

henry@utzoo.UUCP (Henry Spencer) (07/15/84)

Some replies to some of wdl1!jbn's comments:

    1.	It is permissible to change the language incompatibly if machine
	conversion is possible and straightforward, and it is better to 
	change the language incompatibly than to introduce painful mechanisms 
	for backward compatibility.  If this is unacceptable, the obsolete
	constructs should be explicitly defined as features retained for
	backward compatibility in this revision and subject to removal in
	a later revision, and compilers should issue warning messages
	when such constructs appear.  The way the transition from "=+" to
	"+=" was handled is an example of the right way to do this.

In fact, this is very difficult.  The transition from "=+" to "+=" took
perhaps five years, and I gather it caused considerable unhappiness
among some of the customers.  Yet this was pretty trivial, and it was
exactly the sort of thing where machine translation can do the whole
job for you.  I fear that the tremendous body of existing C code already
has huge inertia; we are stuck with backward compatibility whether we
like it or not.  If you want a (semi)formal proof of this, consider:

(a) The C standard is most unlikely to succeed if it faces
	active opposition from AT&T.  Unix is still the home
	of C, although many C implementations have left home,
	and AT&T's C compilers remain very influential.
(b) AT&T is most unlikely to endorse something for external
	release if it's unacceptable for internal use.  This
	implies that the new C standard has to pass muster
	within AT&T.  (Please don't tell me that this is really
	distasteful.  True, but beside the point.)
(c) AT&T won't endorse something for internal use that breaks
	huge masses of existing applications code.  Note that
	I didn't waffle with "most unlikely" on this one, because there
	is already one case on record where a seemingly-desirable
	change in AT&T C compilers had to be withdrawn because the
	internal customers absolutely refused to accept it -- it
	broke too much code.

Changes can be made, but even for trivia like "+=" it's already a very
slow and painful process.  Radical change is out.
    
    2.  If syntax for checking procedure calls across compilation units
	is to be introduced at all, it should be mandatory.  Otherwise it
	is almost useless as a mechanism for preventing bugs.  Also, if
	it is mandatory, compilers can generate better code when passing
	small types; the present definition requires that all chars, for
	example, be expanded to ints in parameter passing, which makes
	calling functions more expensive for C than it should be.

Problems with mixing old and new function-declaration arrangements worry
many people.  But the break-all-the-old-code problem is severe here.
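
To make the two arrangements concrete, here is a small sketch written in the
syntax C later standardized (prototypes and <stdarg.h>), which did not exist
in this form when the thread was written.  The function names are
hypothetical.  The first function receives its arguments with no declared
types, so each char is widened to int on the way in, as under the old rules;
the second declares its parameter types, which is what would let a compiler
pass a char as a char.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` small integers passed without
 * parameter type information, so each one is widened before the call. */
int sum_widened(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* a char argument arrives as an int */
    va_end(ap);
    return total;
}

/* With the parameter types declared, a compiler is free to pass each
 * argument in whatever way is cheapest for a char. */
int sum_chars(char a, char b)
{
    return a + b;
}
```

The worry about mixing follows from this: a caller compiled under one
convention and a callee compiled under the other may disagree about the
size of what was passed.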

    3.  The problem of functions with variable numbers of arguments is
	an important one...

Agreed.  The committee is interested in trying to define a standard way
of handling this problem.  It's very hard.  (Please don't tell me that
it's really easy because there is a simple solution, namely X; believe
me, I have seen all the simple ideas, and none of them work everywhere.)
(If you *must* send me your latest brainstorm in the belief that I've
never seen it before, **PLEASE** do so by private mail, not by posting it
to the whole newsgroup!)  I suspect that the only fully satisfactory way to
do it is to assume that variable-argument procedures are fully declared
(e.g. "extern printf(char *,);") before use, so that the compiler can
do something special about calls to them.  Given this, something like
the "(argc, argv)" convention would be possible.  I don't know if the
committee is planning to do exactly that, but it seems the right direction.
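
For readers who want to see what "fully declared" variable-argument
procedures can look like, here is a minimal sketch using the ellipsis
declaration and <stdarg.h> that C later standardized; it is not the syntax
suggested above and not necessarily what the committee had in mind at the
time.  The names doprint, log_message, and log_error are hypothetical, and
vsnprintf stands in for the real formatting work.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical common routine, in the role the thread calls "doprint":
 * every printf-like wrapper forwards its argument list here. */
static int doprint(char *out, size_t outsize, const char *fmt, va_list ap)
{
    assert(out != NULL && fmt != NULL);
    return vsnprintf(out, outsize, fmt, ap);
}

/* Declared as variadic, so the compiler knows that calls to it need
 * special handling. */
int log_message(char *out, size_t outsize, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = doprint(out, outsize, fmt, ap);
    va_end(ap);
    return n;
}

/* A second wrapper sharing the same common routine. */
int log_error(char *out, size_t outsize, const char *fmt, ...)
{
    static const char prefix[] = "error: ";
    size_t plen = strlen(prefix);
    va_list ap;
    int n;

    if (outsize <= plen)
        return -1;
    memcpy(out, prefix, plen);
    va_start(ap, fmt);
    n = doprint(out + plen, outsize - plen, fmt, ap);
    va_end(ap);
    return n < 0 ? n : n + (int)plen;
}
```

Neither wrapper refers to a parameter that was not passed; each hands the
common routine an opaque argument list whose layout is the compiler's
business.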
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry