[net.lang.c] X3J11 thoughts

dmr@research.UUCP (07/23/84)

A few comments on issues raised in net.lang.c.  (I do read it).
First, I am in general pleased with the work of the
X3J11 ANSI committee, and am especially content that they are now the
ones who have to worry about the kinds of questions raised in this group,
and that I don't have to rewrite the manual.

Henry Spencer's summary of Larry Rosler's Usenix presentation
on the current state of things was excellent.

I have only one serious qualm about the way things are proceeding in X3J11:
it concerns argument declarations in function declarations.  This
is the only important change in the language as usually implemented.
Let's concede that this should have been done long ago; the only
interesting question is whether it is useful to do it now.
Recognizing that it was not practical to demand that all existing
programs be changed to declare function arguments, the committee
leans to allowing declarations but not requiring them.  For example:

   OLD					   NEW

extern	double sin();			extern double sin(double);
...					...
main() {				main() {
	double x = sin(1);			double x = sin(1);
}					}

In the new version the argument of sin will be coerced to double.
In the old version (still legal syntactically) there is no such
coercion; the program is just wrong.  Lint will tell you so, but
not everyone uses lint, and not everyone even has it.  The problem
is that because both programs will be accepted by most new compilers,
there is ample opportunity for confusion.  Thus, users of new
compilers will soon come to have include files that
nicely declare the arguments of sin() and other functions, and to depend
on the coercions.  Often their programs will appear to compile
properly under old compilers, but won't work.
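
A minimal sketch of the coercion described above, written in the
prototype syntax as it was later standardized.  The names half,
call_with_int and self_check are made up for illustration.  Only the
prototyped call is exercised, because the old-style mismatch has no
defined result to show.

```c
#include <assert.h>

/* Prototype: every caller that sees this knows the parameter is a double. */
double half(double x);

/* The int constant 1 is converted to 1.0 because the prototype is in scope. */
double call_with_int(void)
{
    return half(1);
}

double half(double x)
{
    return x / 2.0;
}

/* Small self-check: the coerced call behaves like half(1.0). */
void self_check(void)
{
    assert(call_with_int() == 0.5);
}
```

With the declaration written as "double half();" instead, the same call
would pass an int where a double is expected, which is the silent
failure the paragraph above warns about.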

The other problem with the proposal is that by allowing a mixture
of the old and new syntax, the compiler can't be sure whether actual
arguments were declared and coerced at all call sites; this cuts
off some useful optimizations.  For example, the float->double widening
in arguments is very costly if you use a software implementation
of IEEE floating point.  If the compiler could be absolutely sure
that each caller knew that a function argument was declared float
everywhere, it wouldn't have to convert.
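
A hedged sketch of the widening in question, using the later
standardized <stdarg.h> interface as an assumption.  The function
names twice and first_vararg are hypothetical.  When the parameter
type is visible, a float can stay a float; when it is not (as in the
variable part of an argument list), the float is widened to double
and has to be fetched as a double.

```c
#include <assert.h>
#include <stdarg.h>

/* Parameter type is declared: a float argument may be passed as a float. */
float twice(float x)
{
    return x + x;
}

/* No parameter type is known for the trailing arguments, so a float
 * passed here is widened to double by the default promotions. */
double first_vararg(int count, ...)
{
    va_list ap;
    double d;

    va_start(ap, count);
    d = va_arg(ap, double);   /* must ask for double, not float */
    va_end(ap);
    return d;
}

/* Small self-check of both paths. */
void widening_check(void)
{
    float f = 0.25f;
    assert(twice(f) == 0.5f);
    assert(first_vararg(1, f) == 0.25);
}
```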

The committee had three choices:

1) Leaving things alone, with a subchoice of allowing argument declarations
   for checking purposes (no coercion);
2) The proposed scheme;
3) Requiring functions to be fully declared including all arguments
   (like Pascal, A68, indeed all other modern languages in which
   the question comes up).

Choice 1 with advisory declarations is not worth the trouble.
Leaving things completely alone was quite possible.  We have managed to
muddle along so far.

Choice 3 is in an obvious way the correct one.  It has some costs
in complexity (see below).  The only problem is that it is
utterly impractical because it breaks every C program in existence.

Choice 2's problem is that it is neither fish nor fowl; it trips on
the same technical complexities of variable argument lists encountered
by choice 3 and complicates the language (e.g. "int f(void)" had to be
invented to make it work).  Rather than clearly stating
that getting argument types right is the programmer's responsibility
(as with 1) or that a mandatory previous declaration will coerce
and check the actual arguments (as with 3), it leaves everyone somewhat
confused as to what will happen at any particular call.	

All in all, I have to think that 3 is best but impossible, and that 1
is marginally better than 2.  Supporters of 2 are secretly hoping to be
able to go for 3 in the future.  Unfortunately, I suspect that
instead of having either 1 or 3 forever we will have a mishmash forever.
Appendix B of the current draft says that the compiler is entitled
to warn you if "a function is called but no prototype has been supplied";
this still seems to let you say   extern double sin();  ... sin(1);
with no warning.

Variable argument lists:
One of the problems you get into when argument types are supposed
to be known in advance, and even when they're not, is what to
say about the printf family in the language description.
The old manual said that actual arguments (after some widening)
had to agree in type and number with formals.  Unstated, but implicit
in all implementations, was that somehow printf had to work.
I don't know of any way to formalize this in the context of C.  A good
side effect of having syntax to notify the compiler that a function has an
unknown number and type of argument is that everyone is on notice
that something funny is going on.  A bad effect is that programmers
will come to believe that such things are in any way portable.
They are not.  Unless someone comes up with a brilliant invention,
neither ANSI nor I will promise anything in writing.  Suppliers
of C compilers and libraries are responsible for making printf work.
Users can't expect to do it themselves reliably.  Macros like va_arg
(which came from BTL, not Berkeley) improve things in practice,
and can often make variable-argument functions more exportable.
What you will not get is a description of the complete semantics.
It's just too machine dependent.
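
For illustration only, here is a small variable-argument function
written with <stdarg.h>, the standardized descendant of the varargs
macros mentioned above (using that header is an assumption relative
to the original post).  The function name sum_ints is hypothetical.
The caller has to state how many arguments follow, because nothing in
the language lets the callee find out.

```c
#include <assert.h>
#include <stdarg.h>

/* Adds up `count` int arguments.  The callee trusts `count` and trusts
 * that every trailing argument really is an int; nothing checks this. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int k;

    va_start(ap, count);
    for (k = 0; k < count; k++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Small self-check. */
void sum_check(void)
{
    assert(sum_ints(3, 1, 2, 3) == 6);
}
```

The macros hide how arguments are laid out, which is what makes such a
function more exportable; they do not make a wrong count or a wrong
type detectable.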

Open-ended structures:
	Someone else asked about structure declarations with
things like  char x[1];  at the end, where the intent is to have
a fixed header with variable stuff tacked on at the end.
Once again, you are not likely to find a discussion of this in a C
reference manual.  Writers of language descriptions (even me) like
to have a firm idea of what various declarations of objects mean
and what operations can be performed on the objects.  Unless
people have a good alternative to unions or PL/I's iSUB, or a better
idea, please don't ask for formal blessing on this.
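
A sketch of the fixed-header-plus-tail idiom being asked about.  It is
written with the flexible array member that C99 later adopted, which
is an assumption relative to this 1984 discussion; the post itself
describes the older  char x[1];  spelling.  The struct and function
names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message {
    size_t len;      /* fixed header */
    char   text[];   /* variable part; C99 flexible array member */
};

/* Allocates header and tail in one block and copies the string in.
 * Returns NULL if allocation fails; the caller frees the result. */
struct message *make_message(const char *s)
{
    size_t n = strlen(s);
    struct message *m = malloc(sizeof *m + n + 1);

    if (m == NULL)
        return NULL;
    m->len = n;
    memcpy(m->text, s, n + 1);   /* includes the terminating NUL */
    return m;
}

/* Small self-check. */
void message_check(void)
{
    struct message *m = make_message("hi");
    assert(m != NULL && m->len == 2 && strcmp(m->text, "hi") == 0);
    free(m);
}
```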

Enums:
	Was this the place I grossed out at Usenix?  I did say "botch."
In the current proposed standard, enum types are ints and there is no
restriction on their use, except that Appendix B says that a compiler
may warn you if you assign something to an enum variable except
a value of that enum type.  This is very close to my original design.
The choice with enums (as has been reported) was between

1) making them a neater way of specifying integer constants

2) making each a unique type as in Pascal

I decided against the second choice because to make them useful would
have required larger language changes than I was prepared for
(arrays indexed by enum values, arithmetic on sparse
values, that sort of thing).  I proceeded to put enums as integers
into the PDP-11 compiler, while publicly worrying about the choice,
and saying that it might be nice if lint warned about implausible
enum assignments.  At just that instant the Sys III compiler was being
completed, the same program that took a trip to California
to become the BSD compiler.  And unfortunately, it incorporated
halfway thoughts about what enums should be.  (So did the
manual; it said that enums were a unique type but also were ints.)
Let it be recorded that earlier Sys III and BSD compilers are buggy and
incorporate no useful realization of enums.
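
A short sketch of the first choice, enums as a neater way of naming
integer constants.  The enum and function names are made up for
illustration.  Because the values are plain ints, they can index an
array and take part in arithmetic without any extra language
machinery.

```c
#include <assert.h>
#include <string.h>

enum color { RED, GREEN, BLUE, NCOLORS };   /* 0, 1, 2, 3 */

static const char *const color_name[NCOLORS] = { "red", "green", "blue" };

/* An enum value used directly as an array index. */
const char *name_of(enum color c)
{
    return color_name[c];
}

/* Ordinary integer arithmetic on an enum value, wrapping around. */
enum color next_color(enum color c)
{
    return (enum color)((c + 1) % NCOLORS);
}

/* Small self-check. */
void enum_check(void)
{
    assert(strcmp(name_of(GREEN), "green") == 0);
    assert(next_color(BLUE) == RED);
}
```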

typeof:
	is a good deal.  Write the committee.

Grace Hopper:
	I think Kuenning meant Jean Sammet.

How to complain:
	If you feel strongly about something in the standard, it is
advisable to write to the committee instead of grousing here.
Some of them read this group, but I doubt if they save the submissions
away and take them to meetings.  Try a real, paper letter. Pick one of

	Larry Rosler
	Room 1337
	AT&T Bell Laboratories
	190 River Rd.
	Summit NJ 07901

	X3 Secretariat: CBEMA
	Suite 500
	311 First St NW
	Washington DC 20001

Letters to X3 should probably refer prominently to X3J11.
Suggestions should be specific and to the point.

		Dennis Ritchie

bprice@bmcg.UUCP (08/09/84)

As a 'charter member' of X3J9 and of the IEEE Pascal Standards committee, and 
thus of the Joint X3J9/IEEE Pascal Standards Committee, I recognize a few 
inaccuracies in Dunn's essay on the control variable standardization 
considerations.
These are my own personal recollections and opinions, not representing anything
(except as noted) that ANSI, X3, X3J9, or any IEEE body has passed on--but the
record is available for anyone who finds fault in my recollection.

>From: rcd@opus.UUCP
>This quite strongly resembles something that happened in the Pascal
>standardization effort:  In Pascal, the controlled variable of a `for'
>statement was originally just a variable.  
Although the original language definition required "The control variable...
must not be altered by the repeated statement."  This requirement was clarified
and used in the development of the standard.
 
>                                           The standards committee
>recognized (mostly correctly) that this can introduce some very odd
>effects; it also makes loop optimization very difficult.  
As the requirement was originally stated, loop optimization was both easy
and dangerous:  there was no way, in practice, to verify that the control
variable had not been altered.  Without such verification, though, really
strange bugs could have hidden out for long periods.

>                                                         The "ideal"
>solution would have been to have the controlled variable be local to, and
>declared by, the `for' statement itself.  
The alternative ways of typing the control variable made this solution far
from ideal.  If the type were given in the for-statement, then all for-
statements would have been broken.  If the new variable were given the type
of another variable, say one having the same spelling for its identifier, 
other obscure bugs could be hidden.  If the new variable were given a type
derived from the types and values of the initial and final values, then a
runtime determination of program validity would be required--if, say, the
control variable were passed to a variable parameter.  The other alternatives,
and kludges to allow one of the above to work, were even more unacceptable
to user and implementor alike on the committee.

>                                           However, this represented too
>much of a change, 
See above.  Breaking all existing for-statements or all existing compilers
was indeed "too much of a change."

>                  SO they took a middle ground and required only that it be
>a local variable. 
Not quite so--the 'middle ground' requires it to be a local variable, to be 
sure, but it also requires that there be no alteration of it (e.g., 
assignment to it) in any procedure which the forstatement can call, and that
the forstatement itself may not alter it.  The effect of this requirement is
that full optimizability is achieved without breaking too many existing
programs.  The requirement is not difficult to implement in most pre-existing
compilers, either.

>                   Thus they removed the possibly useful effects of an
>Algol 60 style of general use (which we also have in C, of course) - you
>can't count with a var parameter or a global.  Yet they failed to provide
>the useful clarity and optimizability of a completely local ALGOL 68 style
>of controlled variable.
Since clarity is strictly subjective, I can't respond to that point in an
objective fashion.  In my experience, the clarity of the Pascal design is
sufficient to be useful.  The optimizability is no less than that of ALGOL 68,
since the restrained control-variable approach of Pascal is, in the respects
that matter, indistinguishable from the local approach.

>                         Finally, they broke some programs in the process
>of making the change.
But the number of programs broken was the smallest that we could manage.  Any
other approach would have broken more.  The response that I got, personally,
to the compiler that I retrofitted was "thanks!".  This is because every one
of the programs that was "broken" by the change was wrong!  The change pointed
out lurking bugs, that hadn't been seen yet, in every instance.

-- 
--Bill Price    uucp:   {decvax!ucbvax  philabs}!sdcsvax!bmcg!bprice
                arpa:?  sdcsvax!bmcg!bprice@nosc

andrew@orca.UUCP (Andrew Klossner) (08/09/84)

> >                   Thus they removed the possibly useful effects of an
> >Algol 60 style of general use (which we also have in C, of course) - you
> >can't count with a var parameter or a global.  Yet they failed to provide
> >the useful clarity and optimizability of a completely local ALGOL 68 style
> >of controlled variable.
> Since clarity is strictly subjective, I can't respond to that point in an
> objective fashion.  In my experience, the clarity of the Pascal design is
> sufficient to be useful.  The optimizability is no less than that of ALGOL 68,
> since the restrained control-variable approach of Pascal is, in the respects
> that matter, indistinguishable from the local approach.

Algol 68 is still more optimizable.  Since the index variable is
strictly local to the loop, it can be held in a register and not
assigned a memory location, as its value upon loop termination is not
available.  In Pascal, it is much more difficult to assign a register
to the index variable, since its scope is the entire block, and so,
even if computations are maintained in the register, the computed value
must be stored to memory so as to be available outside the loop.
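
As a rough analogy only, C99 later allowed a loop variable to be
declared in the for statement itself, which gives it the loop-local
scope described here for Algol 68.  This is a C sketch, not Pascal or
Algol 68, and the function name sum_below is hypothetical.

```c
#include <assert.h>

/* Sums 0 .. n-1.  The inner i exists only inside the loop, so nothing
 * outside can observe its final value; the outer i is never touched. */
int sum_below(int n)
{
    int i = -1;                    /* outer variable, same spelling */
    int total = 0;

    for (int i = 0; i < n; i++)    /* loop-local i shadows the outer one */
        total += i;

    assert(i == -1);               /* outer i is unchanged by the loop */
    return total;
}
```

Unlike the Algol 68 controlled variable, the C loop variable can still
be assigned inside the body, so the analogy covers scope only.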

However, I agree that making a change of this magnitude to Pascal would
have been inappropriate.  The Algol 68 committee had the luxury of
designing a language for which there were no existing programs (once
they decided not to pursue compatibility with Algol 60).

  -- Andrew Klossner   (decvax!tektronix!orca!andrew)      [UUCP]
                       (orca!andrew.tektronix@rand-relay)  [ARPA]

rcd@opus.UUCP (Dick Dunn) (08/16/84)

Responding to Bill Price, on the effects of standardization wrt the Pascal
`for' statement.  Bill offered to correct a few inaccuracies.  >>=me, >=Price.
>>In Pascal, the controlled variable of a `for'
>>statement was originally just a variable.  
>Although the original language definition required "The control variable...
>must not be altered by the repeated statement."  This requirement was clarified
>and used in the development of the standard.

Bill is right, but there's nothing inaccurate about what I said--the
controlled variable WAS just a variable.  The requirement about not
altering the controlled variable quite naturally needed to be clarified,
since it was impossible to detect in any reasonable fashion as originally
stated.
 
>>                                                         The "ideal"
>>solution would have been to have the controlled variable be local to, and
>>declared by, the `for' statement itself.  
>The alternative ways of typing the control variable made this solution far
>from ideal.  If the type were given in the for-statement, then all for-
>statements would have been broken...<<and other difficulties>>

I didn't mean to imply that this would have been a viable approach for an
existing language.  It wouldn't.  What I meant was an "ideal" way to do
things in a new language (according to certain goals).  Standards efforts
(except for Ada:-) are not supposed to develop new languages.  It IS
possible to surmount the difficulties of declaring the controlled variable
implicitly at the opening of the for-loop.  As I said before, see ALGOL 68
for an example.

>>                  SO they took a middle ground and required only that it be
>>a local variable. 
>Not quite so--the 'middle ground' requires it to be a local variable, to be 
>sure, but it also requires that there be no alteration of it (e.g., 
>assignment to it) in any procedure which the forstatement can call, and that
>the forstatement itself may not alter it...

This IS a middle ground between making the controlled variable something
which is entirely owned by the for-statement and unalterable (i.e., not
really a variable, in the ALGOL 68 style) and something which is a general
variable (in the Algol 60 style or conventional C usage).  It is not a
middle ground on the alteration issue.  The original Pascal was MORE
permissive here--it only required that there be no alteration of the
controlled variable within the loop.  The standard (unless it changed since
the last time I saw it; Bill may correct me) requires, in effect, that
there be no potential for alteration of the variable--e.g., that it not be
passed as a VAR parameter.  The standard is more restrictive in principle
but in practice it hurts very little here, and it makes violations easily
detectable at compilation time.  I find it hard to fault this aspect of the
standard.

>>                         Finally, they broke some programs in the process
>>of making the change.
>But the number of programs broken was the smallest that we could manage.  Any
>other approach would have broken more...

This is emphatically false.  Had they not changed the definition to
restrict the controlled variable to a variable local to the procedure in
which the for statement occurs, fewer programs (probably none) would have
been broken as a result of the for-statement definition in the standard.
And if a couple of my programs hadn't been among the ones that got broken,
I wouldn't be so pissy about it.  But they did break, and the fixes weren't
entirely trivial.  My original point was about the occasional halfway-fix
that sometimes comes out of standardization efforts, and I stand by it.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
	...Never attribute to malice what can be explained by stupidity.

rcd@opus.UUCP (Dick Dunn) (08/29/84)

>Consider the analysis that a compiler would have to perform
>to deduce that subprograms called within a for loop did not
>change a non-locally declared control variable (by "side effect").
>Speaking as a user, I would rather have a restriction that has a
>chance of being implemented according to the standard than make
>a requirement that few compilers are ever likely to comply with.

I'll try to state my case again:  The (arguably detrimental) change to the
language made during standardization was one which required that the
control variable of a "for" be a local variable of the procedure in which
the "for" occurs.  It is this change (and NOT the constraint on changing
the variable within the loop) which was significant to users and which
broke some programs.

The questions of (a) where the control variable is declared and (b) whether
a modifying reference to it might exist within the loop are separate (but
slightly related) questions.

Of course, all of this is still somewhat peripheral; I intended it as an
example of the standardization process making changes in the language being
standardized.

>As a member of the British Standards Institute committee
>that drafted the BSI/ISO Pascal standard  (the ANSI standard
>being essentially identical to the ISO level 0 subset), I can
>assure him that the restriction of control variables to be local
>variables is due neither to malice, nor to stupidity.

I didn't apply the malice/stupidity analysis.  If you want to interpret my
.signature as part of the content of the article, that's your problem.  If
you would like to avoid doing so, I'll give you a clue:  I (as most people)
do not sign articles in the middle.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...I'm not cynical - just experienced.

wf@glasgow.UUCP (Bill Findlay) (09/03/84)

As a member of the British Standards Institute committee
that drafted the BSI/ISO Pascal standard  (the ANSI standard
being essentially identical to the ISO level 0 subset), I can
assure him that the restriction of control variables to be local
variables is due neither to malice, nor to stupidity.
Consider the analysis that a compiler would have to perform
to deduce that subprograms called within a for loop did not
change a non-locally declared control variable (by "side effect").
Speaking as a user, I would rather have a restriction that has a
chance of being implemented according to the standard than make
a requirement that few compilers are ever likely to comply with.
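
A small C sketch of the side effect in question, since C permits the
non-local counter that the Pascal standard rules out.  The names
helper and count_iterations are hypothetical.  Looking at the loop
alone, a reader (or a compiler) cannot tell that the counter moves
twice per trip; that is only visible by analysing the called routine.

```c
#include <assert.h>

static int i;                 /* non-locally declared loop counter */

/* Looks harmless at the call site, but alters the counter. */
static void helper(void)
{
    i++;
}

/* Returns how many times the loop body actually ran. */
int count_iterations(int n)
{
    int trips = 0;

    for (i = 0; i < n; i++) {
        helper();             /* hidden second increment of i */
        trips++;
    }
    return trips;
}

/* Small self-check: the body runs 5 times, not 10. */
void side_effect_check(void)
{
    assert(count_iterations(10) == 5);
}
```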