[comp.lang.c] draft ANSI standard: needs your tomatoes

gnu@hoptoad.uucp (John Gilmore) (12/02/86)

[This is posted to comp.lang.c because mod.std.c seems to be dead.  Love
those mod groups!]

I am afraid that, since this is the last mandatory comment period for
the draft proposed ANSI C standard, the current document, or something
like it, will become the first and only C standard.  I'm calling
on all of you out there to read it and pick it to pieces, because it is
not up to the quality of a national standard.  My comments alone will
not carry enough weight to stop it.  If they get a flood of negative
comments, they will be forced to revise it and go through another
public review, when we will possibly get to comment on something close
enough to a real standard to matter.

Be sure to print out your comments and send them in; don't just post
them.  The first page of the standard specifies that comments should be
sent to:
	X3 Secretariat/CBEMA, 311 First St NW #500, Washington DC 20001
with a copy to:
	Board of Standards Review, ANSI, 1430 Broadway, NY NY 10018
The committee requests that you number your comments, starting with 1,
and indicate the relevant section number in each comment.

I think that the low quality of the standard is partly due to its not
getting enough early review.  I think that if earlier drafts had been
posted to the net, it would be in much better shape.

In general, I find the draft standard to be the least precise proposed
standard I have seen.  Most standards start off by defining the model
of the language environment (e.g. what attributes a name or value can
have) and then clearly define for every language construct what
attributes are relevant and how they are modified.  In this standard,
you have to read the whole thing to make sense of it.  It's like the
K&R "C reference manual" but 200 pages long instead of 40 pages, and
written by committee.  For example, to determine what constraints are
on order of execution, you can turn to 3.3, but there are exceptions in
3.5.2.4 under "volatile", commentary in 3.6 under "statements", more
constraints in 4.7 under "signal", and others elsewhere (that I can't
find right now!  That's the problem).

The preceding 4 messages in comp.lang.c are just the tip of the iceberg;
they were found within about 5 hours of reading (partly with Guy Harris),
and we only covered a tenth of the document.

Other things that I will comment on when I've researched them better:

 * Many terms are used but are not well defined, or are misused
   (e.g. "full expression", "lvalue", "object".  Is a character string
   an lvalue?  Is it an object?)

 * you can compare (int *) == (void *)
	   but not (int *) >= (void *).

 * you can declare a function to be const or volatile.
 
 * There seems to be no automatic conversion of const to normal or volatile
to normal, e.g. you can't pass a const char * or a char *const or a
volatile char * or a char *volatile to a function expecting a char *.
I presume this is why the type of string constants was not made "const".

 * you can't cast a void to type (void).

 * sizeof (2+2) is valid, as is sizeof ("abcd"+2).

 * sizeof (array) returns the size of a pointer to the first element.
(sec. 3.2.2.1 switches it to a pointer before sec. 3.3.3.4 can use the array)

 * multiple character char constants (e.g. 'abc') are legal and encouraged

 * empty arrays are explicitly disallowed

I wish X3J11 had offered prizes like the POSIX committee, but I don't
think they could afford to.
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa
Call +1 800 854 7179 or +1 714 540 9870 and order X3.159-198x (ANSI C) for $65.
Then spend two weeks reading it and weeping.  THEN send in formal comments! 

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/03/86)

In article <1384@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>...  I'm calling
>on all of you out there to read it and pick it to pieces, because it is
>not up to the quality of a national standard.  My comments alone will
>not carry enough weight to stop it.  If they get a flood of negative
>comments, they will be forced to revise it and go through another
>public review, when we will possibly get to comment on something close
>enough to a real standard to matter.

John, this is amazingly irresponsible of you.  X3J11 has not been
trying to railroad anything or sneak garbage past the unsuspecting
public, which is what the tone of your remarks seems to imply.

The major purpose of public review is indeed to obtain feedback to
help improve the quality of the standard.  HOWEVER, for this to work,
it is essential that criticisms be basically CONSTRUCTIVE in nature.
A "flood of negative comments" will really not help the process;
careful evaluations accompanied by specific proposals will.

>I think that the low quality of the standard is partly due to its not
>getting enough early review.  I think that if earlier drafts had been
>posted to the net, it would be in much better shape.

Every interested party, including you, had the opportunity to
participate in the drafting process.  Judging by the generally
low caliber of net.lang.c postings over the past few years, I
doubt very much that input from the newsgroup readership would
have helped much.  Most people who really cared about the process
managed to obtain draft copies somehow, and several X3J11 members
monitored the net.lang.c discussions.  I should also point out
that X3J11 includes a large number of top-notch C experts, so
flaws in the draft are not due to ignorance or inexperience.

>In general, I find the draft standard to be the least precise proposed
>standard I have seen.  Most standards start off by defining the model
>of the language environment (e.g. what attributes a name or value can
>have) and then clearly define for every language construct what
>attributes are relevant and how they are modified.  In this standard,
>you have to read the whole thing to make sense of it.  It's like the
>K&R "C reference manual" but 200 pages long instead of 40 pages, and
>written by committee.  ...

Judgements about "precision" are rather subjective.  The draft C
standard seems to me rather precise in comparison to other
standards I have seen.  Your simple model of what is appropriate
breaks down pretty badly if you actually try to follow it for C.

The idea of a formal semantic specification was considered and
decided against; the K&R Appendix A model was chosen deliberately
to make the eventual standard maximally useful to a wide audience.

The fact that a linear sequential reading of the draft can't be
used to specify everything fully is a feature of the C language
itself.  The "Forward references" in the draft are an attempt to
help the reader cope with this.  If the document were a tutorial,
which it explicitly is not, then a different presentation would
be appropriate.

The additional size is due to several factors:
	(1) K&R had ambiguities, omissions, etc. which required
	additional text to clarify.
	(2) X3J11 included additional facilities such as run-time
	library routines, enums, (void *), and function prototypes.
	(3) Appendices, an index, and so forth make the page count
	greater.

> * Many terms are used but are not well defined, or are misused
>   (e.g. "full expression", "lvalue", "object".  Is a character string
>   an lvalue?  Is it an object?)

The quoted terms are defined in the draft.  If you have better
definitions, by all means propose them.

A string is an array of characters (including null terminator) and
is indeed an object.  So is a string literal, which is what I assume
you meant by "character string".  In Section 3.3.1 one finds that a
string literal is an lvalue (among other things).  I determined all
this in less than a minute, simply by using the index (part of the
200 pages you complained about).

> * you can compare (int *) == (void *)
>	   but not (int *) >= (void *).

That's correct.  (void)s have no size.  (void *)s therefore have
limited use.  If you want to argue that sizeof(void) should be 0,
I would agree with you, but it still is not clear how to compare
"magnitude" of pointers to data of different types.

> * you can declare a function to be const or volatile.

That appears to be correct, but it's harmless since they are
meaningful only for expressions that are lvalues.

> * There seems to be no automatic conversion of const to normal or volatile
>to normal, e.g. you can't pass a const char * or a char *const or a
>volatile char * or a char *volatile to a function expecting a char *.
>I presume this is why the type of string constants was not made "const".

If a function formal parameter is non-const (char *), that's because
the function expects to be able to modify the data, so of course
feeding it a const actual parameter (which may be in ROM!) is an
error (actually, "undefined" behavior).  Why should an "automatic"
(usually erroneous!) conversion be silently performed?
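
A short sketch of the two kinds of parameter in question. The function names are hypothetical, and the remarks about which calls are accepted describe C as it was later standardized, which may not match the draft under discussion.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical function that modifies its argument in place, which is
   why its parameter is a plain char *. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical read-only counterpart: const char * promises the caller
   that the data is not written through this pointer. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (isupper((unsigned char)*s))
            n++;
    return n;
}
```

With `char buf[] = "abc"; char *const p = buf;`, the call `upcase(p)` is accepted in later standard C, because the const applies to the pointer p itself and the argument is passed by value. With `const char *q = "abc";`, the call `upcase(q)` would need an explicit cast, while `count_upper(q)` needs none.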

The main reason string literals are not const is that some applications
really do overwrite their contents (e.g. mktemp() on UNIX).  The way
the draft ended up, an implementation can choose to make string literal
data ROMmable or not; this was deliberately unspecified.
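
Since the draft leaves it unspecified whether string literal data is writable, code that overwrites a template can sidestep the question by working on an array initialized from the literal. The sketch below uses a hypothetical helper, fill_template(), standing in for a mktemp()-style routine; it is not the real mktemp().

```c
#include <string.h>

/* Hypothetical mktemp()-style helper: overwrites each trailing 'X' in
   tmpl with the given fill character and returns how many characters
   were replaced.  The caller must pass writable storage. */
int fill_template(char *tmpl, char fill)
{
    size_t len = strlen(tmpl);
    int replaced = 0;

    while (len > 0 && tmpl[len - 1] == 'X') {
        tmpl[--len] = fill;
        replaced++;
    }
    return replaced;
}
```

Calling it as `char name[] = "/tmp/fooXXXXXX"; fill_template(name, 'a');` writes into the array's own copy of the characters. Passing the literal itself would depend on the implementation choice described above.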

> * you can't cast a void to type (void).

This is a necessary consequence of the decision that a void expression
has no value (Section 3.2.2.2).  This is essentially the same decision
as that (void)s are not objects.  If you want to dispute that decision,
you need to address all the problems that are raised by having (void)s
be objects.  Do you really think X3J11 didn't consider this issue?

> * sizeof (2+2) is valid, as is sizeof ("abcd"+2).

Yes, they are valid and (at least in the second case) possibly useful.
My pre-X3J11 compiler accepts these, too.
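
For illustration, a sketch of what these two expressions yield. sizeof looks only at the type of its operand, so the result follows from the type of each expression. The function names are hypothetical.

```c
#include <stddef.h>

/* 2+2 has type int, so this is the size of an int. */
size_t size_of_sum(void)
{
    return sizeof (2+2);
}

/* "abcd"+2 is pointer arithmetic, so the operand has type char * and
   the result is the size of a pointer, not of the remaining text. */
size_t size_of_offset_literal(void)
{
    return sizeof ("abcd"+2);
}

/* By contrast, a bare string literal is an array of five chars
   (four letters plus the null terminator). */
size_t size_of_literal(void)
{
    return sizeof ("abcd");
}
```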

> * sizeof (array) returns the size of a pointer to the first element.
>(sec. 3.2.2.1 switches it to a pointer before sec. 3.3.3.4 can use the array)

That's not what Section 3.2.2.1 says.  The conversion to pointer is
done ONLY in contexts where an lvalue is not permitted (but not for
a string literal used as a char[] initializer).
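
A small sketch of the behavior at issue, as C compilers in practice treat it: sizeof applied to an array yields the size of the whole array, and a pointer size only shows up once the array has been passed to a function. The names are hypothetical, and the code does not settle which reading of the draft's wording is right.

```c
#include <stddef.h>

static int table[10];

/* Here table is still an array when sizeof sees it. */
size_t table_bytes(void)
{
    return sizeof table;
}

/* The usual element-count idiom relies on that behavior. */
size_t table_count(void)
{
    return sizeof table / sizeof table[0];
}

/* An array passed to a function arrives as a pointer to its first
   element, so sizeof here is the size of an int *. */
size_t param_bytes(int *arr)
{
    return sizeof arr;
}
```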

> * multiple character char constants (e.g. 'abc') are legal and encouraged

They always have been legal in C.  Since they are legal, their
properties had to be defined.  They are NOT "encouraged"; Section
3.1.3.4.Semantics even warns that the value of such a construction is
"implementation-defined", which is like waving a red flag for the
programmer concerned about portability.

I have a problem with the assumed maximum number of bits in a char,
and have proposed an alternative as part of my multi-byte character
proposal.

> * empty arrays are explicitly disallowed

I don't think they were ever legal.

>I wish X3J11 had offered prizes like the POSIX committee, but I don't
>think they could afford to.

X3J11 is not so frivolous.  If they were to offer prizes, it would
take more careful reading of the draft proposed standard than you
have exhibited to win one.

I'm sure there are errors, oversights, etc.  But let's get some
constructive suggestions based on a thorough understanding of what
is actually in the draft, rather than shoot-from-the-hip flames.

faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) (12/04/86)

I haven't seen the latest C draft, but I remember a few things from the
older one that bothered me, namely some of the things defined in
limits.h.  The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
(4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
inadequate.  How were they decided upon?  Did the committee find the
implementation with the lowest limits for each of these, and then use
the values for the minimum allowable values, thus giving the vendors an
excuse not to fix their compilers?  A compiler that won't accept a
600-line input file is clearly sub-standard and almost useless, and I
object to the ANSI committee approving of this limitation.  The
argument that the standard is only codifying existing practice and not
dictating to vendors how to write their compilers doesn't hold water --
how many compilers supported function prototyping before X3J11?
Anyway, maybe it is too late for me to make this objection...  If
anybody can enlighten me on the rationale behind this I would
appreciate it.

	Wayne

henry@utzoo.UUCP (Henry Spencer) (12/04/86)

> ... The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
> (4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
> inadequate...

The EXTERNAL_NAME_LENGTH minimum is a consequence of the extreme political
undesirability of making it impossible to implement conforming compilers
on systems that have prehistoric object-module formats.  (Let us not get
into the war about the desirability of that again; this *is* the reason,
however inadequate it may seem to some.)  I consider INCLUDE_FILES_NEST
adequate -- multiply nested include files become unmanageable quickly --
but it is a bit low.  SOURCE_LINE_LENGTH is a bit curious, and the choice
of minimum value for it is truly bizarre.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/04/86)

In article <1155@ucbcad.BERKELEY.EDU> faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) writes:
>...  A compiler that won't accept a
>600-line input file is clearly sub-standard and almost useless, and I
>object to the ANSI committee approving of this limitation.

They never did.

markb@sdcrdcf.UUCP (Mark Biggar) (12/05/86)

In article <1155@ucbcad.BERKELEY.EDU> faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) writes:
>(4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
>...
>excuse not to fix their compilers?  A compiler that won't accept a
>600-line input file is clearly sub-standard and almost useless, and I

SOURCE_LINE_LENGTH is the length of 1 line of input (i.e., the number of
chars between newlines) not the number of lines in the input file.
I believe the 509 comes from taking 512 (2**9) and taking away the end of
line sequence which can, on some machines, be as much as 3 chars long (e.g.,
the DEC RSTS system that has 3 different line terminating sequences:
"\033" (^[), "\015\012" (\r\n), and "\012\015\000" (\n\r\0).)

Mark Biggar
{allegra,burdvax,cbosgd,hplabs,ihnp4,akgua,sdcsvax}!sdcrdcf!markb

faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) (12/08/86)

Numerous people have replied to my SOURCE_LINE_LENGTH comment and said that
this is in fact the maximum length of one source line.  In the draft of about
a year ago (it's on my desk now and I'm at home) the description was "the
maximum number of lines in a source input file" or something very similar.
Was this a misprint?  It would make a lot more sense if it were.

	Wayne

throopw@dg_rtp.UUCP (Wayne Throop) (12/08/86)

> henry@utzoo.UUCP (Henry Spencer)

>> ... The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
>> (4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
>> inadequate...
> SOURCE_LINE_LENGTH is a bit curious, and the choice
> of minimum value for it is truly bizarre.

"Bizarre?" Really?  Isn't that the longest ANSI varying length format
record which will fit into a 512-byte block?  Oops, no that would be
508... hmmmmmmmm... where DOES that limit come from?

--
A LISP programmer knows the value of everything, but the cost of nothing.
                                --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

tim@ism780c.UUCP (Tim Smith) (12/12/86)

> * you can declare a function to be const or volatile.

Hey, this could be nice for those people who write code that changes
itself.  If a function is declared volatile, then the compiler could
follow each instruction with a "flush instruction cache" instruction.

Or the compiler could arrange to put the function in a page that is
marked non-cached if the mmu supports that sort of thing.

I can hardly wait for the next obfuscated C code contest...
-- 
emordnilapregnolanalpanama

Tim Smith       USENET: sdcrdcf!ism780c!tim   Compuserve: 72257,3706
                Delphi or GEnie: mnementh

meissner@dg_rtp.UUCP (Michael Meissner) (12/13/86)

In article <738@dg_rtp.UUCP> throopw@dg_rtp.UUCP (Wayne Throop) writes:
>> henry@utzoo.UUCP (Henry Spencer)
>
>>> ... The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
>>> (4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
>>> inadequate...
>> SOURCE_LINE_LENGTH is a bit curious, and the choice
>> of minimum value for it is truly bizarre.
>
>"Bizarre?" Really?  Isn't that the longest ANSI varying length format
>record which will fit into a 512-byte block?  Oops, no that would be
>508... hmmmmmmmm... where DOES that limit come from?
>
>Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

As I remember it, it came from a record-oriented system that stored a length
word (2 bytes) at the front, and some sort of trailer at the end.  The last
byte (trailer) may have been the fortran carriage control character (0, 1, +,
etc.) that is prepended.

Michael Meissner   <the-known-world>!mcnc!rti-sel!dg_rtp!meissner

decot@hpisod2.HP (Dave Decot) (12/16/86)

> The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
> (4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
> inadequate.  ...  		A compiler that won't accept a
> 600-line input file is clearly sub-standard and almost useless, and I
> object to the ANSI committee approving of this limitation.

I don't know about the others, but SOURCE_LINE_LENGTH is not the minimum
allowable maximum number of lines in a source file, it's the minimum allowable
maximum length of *each* source line.  Personally, I'd like to see that
particular value set to 128, so I can fit listings on my lineprinter.

Dave Decot
hpda!decot

dragheb@isis.UUCP (Darius Ragheb) (12/20/86)

In article <2550002@hpisod2.HP> decot@hpisod2.HP (Dave Decot) writes:
>> The constants EXTERNAL_NAME_LENGTH (6), INCLUDE_FILES_NEST
>> (4), and SOURCE_LINE_LENGTH (509) in particular seem to be very
>> object to the ANSI committee approving of this limitation.

>
>maximum length of *each* source line.  Personally, I'd like to see that
>particular value set to 128, so I can fit listings on my lineprinter.
>

Hmm. Why not just stick to the zero, one, infinite principle: if you are
going to support something (like the length of a line (:-) or the nesting
levels), why, either allow one level, or an infinite number (obviously there
will be an upper limit, probably machine-dependent, that will never be reached,
but why build an upper limit that is some arbitrary number like 4 or 6....
that is as bad as the old FORTRAN limit of 7 dimensions for an array....
where did that number come from?)

-- 
Functionality, Efficiency, Luxury.

isis!dragheb  |  dragheb@isis.cs.du.edu

meissner@dg_rtp.UUCP (Michael Meissner) (12/22/86)

/* context is talking about line length */

In article <1502@isis.UUCP> dragheb@isis.UUCP (Darius Ragheb) writes:
>
> Hmm. Why not just stick to the zero, one, infinite principle: if you are
> going to support something (like the length of a line (:-) or the nesting
> levels), why, either allow one level, or an infinite number (obviously there
> will be an upper limit, probably machine-dependent, that will never be reached,
> but why build an upper limit that is some arbitrary number like 4 or 6....
> that is as bad as the old FORTRAN limit of 7 dimensions for an array....
> where did that number come from?)

There are MANY operating systems out there that have maximum line length
restrictions, because a line is a record, and records have maximum sizes.
As the rationale says, these limits are maxima minima and are a treaty point
between the compiler vendor and user.  Thus the limit says to the vendor
that s/he must support AT LEAST 509 bytes/line (the minima part), and at the
same time tells the users that the maximum line size they can count on is
509 for portable programs (the maxima part).  Programs can still have more
than 509 bytes/line, but they are not maximally portable.
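
The user's side of that treaty can be checked mechanically. The following is a minimal sketch, assuming the 509 figure quoted in this thread and assuming the newline itself is not counted; the macro and function names are hypothetical.

```c
#include <stddef.h>

/* Minimum guaranteed line length, as quoted in this thread. */
#define PORTABLE_LINE_LENGTH 509

/* Returns the length of the longest line in text, not counting the
   newline characters (an assumption about how the limit is measured). */
size_t longest_line(const char *text)
{
    size_t longest = 0, current = 0;

    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            if (current > longest)
                longest = current;
            current = 0;
        } else {
            current++;
        }
    }
    /* Account for a final line with no trailing newline. */
    return current > longest ? current : longest;
}

/* Nonzero if every line fits within the portable limit. */
int is_portable_width(const char *text)
{
    return longest_line(text) <= PORTABLE_LINE_LENGTH;
}
```

A source file that fails is_portable_width() may still compile on a given implementation; it just exceeds what every conforming implementation is required to accept.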
-- 
	Michael Meissner, Data General
	...mcnc!rti-sel!dg_rtp!meissner

karl@haddock.UUCP (Karl Heuer) (12/25/86)

In article <2550002@hpisod2.HP> decot@hpisod2.HP (Dave Decot) writes:
>I don't know about the others, but SOURCE_LINE_LENGTH is not the minimum
>allowable maximum number of lines in a source file, it's the minimum allowable
>maximum length of *each* source line.  Personally, I'd like to see that
>particular value set to 128, so I can fit listings on my lineprinter.

Note that this constant refers to the line length *after* macro expansion.  I
just tried running the preprocessor on the line "putchar(getchar());" (with
<stdio.h>) and found it to be 642 characters.

This may be a pathological case, but I think it illustrates the need for a
moderately large value for this constant, even if "real" source files are 80
columns wide.

If you want to fit *processed* listings on your printer, I suggest you obtain
(or write) a filter to fold long lines.
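
The core of such a filter might look like the sketch below. It is an illustration only: the function name is hypothetical, every character (including a tab) is counted as one column, and the caller must size the output buffer.

```c
#include <stddef.h>

/* Copies in to out, inserting a newline whenever a line would grow
   past width columns.  Returns the number of newlines inserted.
   out must have room for the input, one extra byte per inserted
   newline, and the terminator.  A width of 0 disables folding. */
size_t fold_text(const char *in, char *out, size_t width)
{
    size_t col = 0, breaks = 0;

    for (; *in != '\0'; in++) {
        if (*in == '\n') {
            col = 0;                 /* existing line break */
        } else {
            if (width != 0 && col == width) {
                *out++ = '\n';       /* fold before this character */
                breaks++;
                col = 0;
            }
            col++;
        }
        *out++ = *in;
    }
    *out = '\0';
    return breaks;
}
```

With a width of 3, the input "abcdefg" comes out as "abc", "def", and "g" on separate lines. A real filter would read standard input line by line and take the width (128 in the lineprinter case above) as an argument.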

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint