[comp.std.c] X3J11 interpretations

gwyn@smoke.BRL.MIL (Doug Gwyn) (04/07/90)

Since a month has elapsed with nobody else posting the information,
I thought I'd give you a summary of what I think X3J11 decided
during the first meeting of our "interpretations phase".  However,
NOTE THAT THESE ARE MY OWN NOTIONS AND MAY NOT AGREE WITH OFFICIAL
X3J11 POSITIONS.  In particular, don't risk anything valuable based
on what I say; if you need an interpretation ruling, send your
request to X3J11 via CBEMA X3.  (If you already submitted such a
request, don't take what I say as necessarily reflecting the
official X3J11 response that you should receive some day.)

As previously reported, there were some additional edits other than
simple formatting changes made to the C standard between the
previous time that most of X3J11 saw it and the time that it became
an ANSI standard.  At the New York X3J11 meeting last month, all
the changes were unanimously approved as truly editorial, meaning
that nobody on the committee believed that any of them had changed
what we had thought we were trying to say all along.  Some of the
editorial changes addressed issues raised by ISO WG14 members, most
notably a clarification of the distinction between "undefined" and
"unspecified" behavior, which so far as I can see is the only real
point that needed to be addressed by the so-called "BSI concerns
about instances of undefined behavior" that were previously reported.
To reassure you that the unanimous approval of the changes wasn't
simply rubber-stamping a fait accompli, I came to the meeting with a
list of specific concerns about the changes, and met with the draft
editor to review them; I became satisfied that even the "borderline
editorial" changes did in fact agree with previous committee intent.
Also, there was a motion to officially commend the X3J11 members
who had worked on these changes, but there were objections from the
floor to this, and it was withdrawn.  I was one of the objectors:
while I agreed with the changes that were made, I disliked the
process that had occurred, especially the fact that it seemed mainly
motivated by the desire to cater to ISO comments, given that WG14
members had been engaging in political moves counter to the decisions
reached by formal vote at WG14 meetings and counter to agreements we
had established at the last joint X3J11/WG14 meeting, now effectively
thrown out the window by this maneuvering, which appears to be
occurring contrary to civilized procedures.  Instead of accepting the
results of the established procedures, some members of WG14 are
acting like spoiled children who didn't get the specific toys they
clamored for.  I'm utterly disgusted with their antics; the fact that
they are now trying political maneuvers indicates that they must
realize that they don't have convincing technical arguments.  More
on this near the end of this posting, when I recover my aplomb.

Anyway, there were a handful of official requests for interpretation
discussed at the X3J11 New York meeting last month, plus some
unofficial ones, including several that I had collected from various
past Usenet discussions.  I'll report what I believe the committee
position was on as many of these as I can recall, but note that I am
NOT guaranteeing correctness of any of these.  Some of the answers
turned out to be surprising to many of us -- note that we now are
obliged to back what the standard explicitly states, when it IS
explicit, not what we wished we had made it say.  There is room for
interpretation of intent only where the official standard wording
leaves room for interpretation.

The following were discussed at the New York meeting, involving
anywhere from a few members to full committee debate, depending
on how controversial each interpretation seemed to the groups
reviewing the issues.

+ Behavior is "undefined" only where explicitly labeled as such.

+ To satisfy BSI, a footnote was added (in the document before it
reached ANSI) to the definition of null pointer constant to refer
to the NULL macro.  Note that this is misleading, since while NULL
must be defined as a null pointer constant, the converse is not true.
Null pointer constants are completely specified as intended without
the footnote; programmers do not have to ever use the NULL macro if
they prefer to roll their own using "0" etc.

+ Similarly, to satisfy ITSCJ, "header name" was explicitly added
to a list in 2.2.1 in the ANSI standard, even though it was already
logically covered.  This might lead to more confusion than it
addresses; I think a footnote would have been preferable.

+ Bit fields may have type qualifiers.

+ Initializers disregard the qualifiers.

+ Incomplete typedefs are not completed by using them to declare
objects that are eventually completed.

+ If an external identifier is never used, there shall not be more
than one definition for it.

+ Preprocessors take the subsequent source file context into account;
they don't have to look backward.  ("rest of the source file" issue)

+ fscanf() numeric field widths do not include the leading white
space.  This is unfortunate, but it's the way existing implementations
actually work.

+ Most of the BSI comments on the ISO ballot were off the mark; those
that had merit were addressed via editorial changes that appear in
the ANSI standard.

+ Most of the ITSCJ comments on the ISO ballot were addressed by
editorial changes in the ANSI standard.

+ Trigraphs are only for convenience in transporting code, not for
routine use while typing in programs -- where the problems are far
better addressed by solutions specific to the equipment, operating
systems, etc. that are involved than by some mechanism in the C
language itself.

+ The sentence in parentheses on line 22 page 69 of the Dec. 1988
draft refers to the whole paragraph.

+ Composite types are NOT formed across scopes; surprise!

+ Default initializers for arrays set all elements to zero.

+ Default initializers for incomplete objects don't matter, because
no strictly conforming program could depend on the behavior anyway.

+ Considerable discussion on whether struct-valued functions must in
effect return values by copying failed to lead to a conclusive
agreement yet; however, I'm of the opinion that the license granted
for simple assignment where the RHS and LHS involve imperfectly
overlapping objects cannot reasonably be said to apply across
sequence points outside the assignment expression, in particular
the one at the full expression constituting the return statement.
I.e. I think GCC is wrong here.  The author of the request on this
issue is to be invited to attend the Sep 1990 X3J11 meeting where
discussion will resume.

+ The \ pp-token is not handled specially in stringizing; only the
\ parts of string literals and character constants are specially
handled.

+ Proposals for changes in the definition of pp-numbers are too late,
and, yes, the standard says what X3J11 deliberately intended here.

+ It's also too late for changes in macro replacement rules.

+ And it's too late to allow preprocessor directives within macros,
even if X3J11 had wanted to require this, which we didn't.

+ In fact, all proposals for changes will be rejected at this point.

+ Implementations are deliberately allowed to give meaning to empty
arguments to parameterized macros (in fact AT&T's SVR4 "cc" does),
which is why we made it undefined behavior instead of requiring a
diagnostic.

+ Conforming implementations can accept $ etc. in identifiers so
long as at least one diagnostic is produced.  ("Thank you for using
Digital's superior features!" would suffice.)

+ Copies of a document extracted apparently from "BSI Quality
Assurance 1989, C Validation Report" were circulated; it listed
what BSI thought were the implementation-defined features that
vendors were required to document.  However, when I started to check
it I discovered that wording had been changed in ways that resulted
in meaning changes, specific questions were asked that the C standard
does NOT require be answered, section headings were misassigned, at
least one requirement was left out, etc.  Consequently I recommend
vendors simply use the Appendixes of the ANSI C standard for this list.

+ NCEG is proceeding with X3J11's blessing.  Judging by Rex's last
issue of The Journal of C Language Translation, he still hasn't fully
understood the objection I raised during the X3J11 ballot on the
admission of NCEG as a working group within X3J11; he certainly
misrepresented it in the Journal.  Perhaps I should submit an article
to the Journal about this.

+ FOPEN_MAX must take into account additional "preopened" streams.

+ Discussion about required floating-point precision and equality
testing was eventually tabled.  (Basically the issue was when guard
bits have to be stripped.)

+ Strictly conforming programs cannot contain #pragma (surprise!)...

+ ... because #pragma may affect semantics.  (#pragma Ada, anyone?)

+ strtoul() overflow can occur only on conversion of the magnitude,
not on the negation phase of the conversion.

+ A question about constraining optimizations and setjmp() relates
to an ongoing off-line discussion I've recently been involved in,
where the stakes have been raised by considering parallel threads
with shared memory and heavy per-thread caching.  While X3J11 has
not responded to this latter issue, my current impression is that
optimization is controllable by proper use of volatile, especially
via casts for fine-grained synchronization, although it's not
particularly convenient for the programmer to exploit this technique.
X3J11 did vote on the interaction between volatile and setjmp()/
longjmp(), however, and agreed unanimously that the standard clearly
specifies the requirements there.

+ An ambiguity in the use of typedef names in parameter declarations
is to be resolved by "if it could be validly taken as a typedef, do
so".  This is the first real case where I think an interpretation
was truly necessary, due to an unintended ambiguity in the spec.

+ The problem of the "type" in the offsetof() spec is correctly
resolved by using the Rationale as a guide to what was intended;
NOT every arbitrary literal substitution for "type" is intended.
The type is meant to be formally a type-specifier, as well as a
struct or union type.

+ main() must support recursive use, like any normal C function.
If VMS C doesn't support this then it is non-conforming (it should
be easy to fix though).

+ const foo_t array[][] consists at the inner level of const foo_t
members, since the rule about distributing const applies recursively.

+ Diagnostics for implementation-defined behavior do not have to
look different from ones for constraint/syntax violations.  This
is a "quality of implementation" issue: at any time, any form of
diagnostics MAY be produced, but useful ones would be preferable.

+ Use of LC_TIME environment variable as part of the implementation's
secret method for determining how to set tm_isdst is beyond the scope
of the standard.  (I.e. it is allowed.)

+ If sizeof(int)==sizeof(char), testing EOF can indeed be a problem
if it happens to be a valid data value, in which case feof() will
have to be used to tell the two cases apart.

+ There are no technical problems with using malloc()ed objects.
I.e. my dirent implementation is portable: bounds checking is NOT
allowed in cases like the common struct record_header {...buffer[1];
/*more than 1 actually allocated*/} technique where the right size
is allocated for the buffer in the object via malloc() (as per
Karl Heuer's argument).

+ "Blue paint" persists for the duration of the translation.
The example in the standard IS right, and furthermore such
examples are not supposed to reflect only one implementation
choice, but work as shown for all conforming implementations.

+ The problems Ed Keizer mentioned with <limits.h> definitions
aren't real problems; note the spec says "integral promotions".
It doesn't matter that an unwary programmer could be surprised by
what happens if he uses these incautiously; they're specified the
way we intended after discussing the issues some time ago.  (Given
the way C integral types get promoted when int and unsigned
collide, there is no fully satisfactory solution possible.)

+ Semantic violations are not forced simply by use of expressions
as operands of sizeof, in cases where the expressions would produce
semantic violations if actually executed.  I.e. sizeof(&a[N]), which
expands to sizeof(&*(a+N)), is valid since a+N is a valid pointer,
even though *(a+N) cannot be executed in a strictly conforming
program.  Strange but true: semantically invalid expressions may
occur in strictly conforming programs in some contexts so long as
they are never executed!
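
As a minimal sketch of the sizeof point just above (the array a, the
bound N, and both function names are illustrative assumptions, not
anything taken from the standard or from the committee discussion):
the operand of sizeof is not evaluated, so only its type matters,
while a+N on its own is an ordinary one-past-the-end pointer.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

static int a[N];

/* The operand &a[N] is never evaluated; sizeof only needs its type,
   which is "pointer to int".  *(a+N) is therefore never executed. */
size_t size_of_end_pointer(void)
{
    return sizeof(&a[N]);
}

/* a + N itself may be computed, compared, and subtracted from,
   as long as it is not dereferenced. */
ptrdiff_t distance_to_end(void)
{
    int *end = a + N;

    assert(end - a == N);
    return end - a;
}
```

Under these assumptions size_of_end_pointer() yields the same value as
sizeof(int *), and distance_to_end() yields N.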

The following weren't discussed at the meeting as I recall, but
here are my tentative interpretations based on other discussions:

+ There seems to be some sentiment that fscanf fails to match "1.2e-x"
altogether with the %f format, even though "1.2x" would presumably
match three characters.  I don't think this issue has been resolved
yet, although an example in the standard indicates this interpretation.
I think the example is simply wrong, if you carefully trace through
the specs for strtod() that are supposed to apply here.  However, I
see a way around the need for three characters of pushback, since it
is the "conflicting input character" that is "left unread", not all
the characters that aren't part of the match, which lets Chris Torek
push back the 'x' in the above example and drop the 'e' and '-' into
the trash can.  I wish a formal request for interpretation would be
filed so the committee can rule officially on this one.

+ strncpy() must transfer all requested bytes when the source string
is longer than the request.  ("not more than n" does not mean "less
than n is okay")
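
A small sketch of what the strncpy() item implies in practice (the
helper name fill_then_strncpy and the '#' sentinel are assumptions made
up for this illustration): when the source is at least n characters
long, exactly n bytes are stored and no terminating null is added;
when it is shorter, the remainder of the n bytes is padded with nulls.

```c
#include <assert.h>
#include <string.h>

/* Pre-fill dst with '#' so that the bytes strncpy() actually writes
   can be told apart from the bytes it leaves alone. */
void fill_then_strncpy(char *dst, size_t dstsize, const char *src, size_t n)
{
    assert(n <= dstsize);      /* caller must supply room for n bytes */
    memset(dst, '#', dstsize);
    strncpy(dst, src, n);      /* stores exactly n bytes into dst */
}
```

With an 8-byte buffer and n == 4, copying "abcdefgh" should leave
"abcd####" (all four requested bytes transferred, nothing beyond them
touched), while copying "ab" should leave 'a', 'b', two null bytes,
and then "####".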

The following are miscellaneous interesting things from the NY meeting:

+ Tom Plum volunteered to be the X3J16 (C++) liaison.

+ Russell Hansberry filed another protest at the ANSI level, but after
the 15-day window had passed, so it was too late to delay ANSI approval.
There is now an official ANSI C standard.

+ X3 lawyers have discovered that X3 has no copyright interest in the
standard.  ANSI may not realize this yet.  Someone should ping them.

+ The ANSI C standard is supposed to include the Rationale in some
form, although we were having trouble contacting its editor to get
the best master copy for the printers.

+ ISO DP9899 was approved with no "no" votes, producing a DIS which
will be considered by ISO SC22 JTC1; if there are no "no" votes there,
it would automatically become an IS.  At the Sep 1989 Berlin plenary,
the Brits and Danes in the absence of e.g. the U.S. WG14 member tried
to block ISO approval, but Follett got SC22 to proceed with DIS
balloting with "work items" split off to produce "normative addenda"
for the "undefined behavior" and "digraphs" issues.  Work item 01 is
DP/DIS continued, and 02 is "integrity addendums" (ironic, given the
lack of integrity shown in the behavior of the WG14 members).  The
6-month DIS ballot started in Dec 1989 and ends 21 Jun 1990.
Meanwhile the Brits and Danes are reported to have been circulating
draft documents to a selected subset of WG14, excluding comment from
those who know what is wrong with their ideas.  There is no longer
any plan to produce the joint "information bulletin" that X3J11 had
agreed to work with BSI on.  Personally I think that the editorial
change made in section 1.6 of the ANSI standard takes care of the
only thing that might have been a real issue here.  The digraph issue
is simply a situation where the Danes appear to have not been
listening to the responses but simply insisting that they're right,
even though they were given careful attention then outvoted on this
in both WG14 and X3J11.  I think SC22 made a serious error in allowing
this closed issue to be reopened, especially as the proposal cannot
be included in the IS without conflicting with the ANS (so that a
mutually-conforming implementation would be impossible).

+ Some ISO representatives are going to comment on their ballots that
they require that the IS be technically identical to the ANS, and not
to the earlier draft that ISO picked for DP9899.

+ CBEMA is going to start handling X3J11 mailings and billing members
for mailing costs, instead of relying on members to volunteer their
companies to cover the costs as in the past.

+ The next X3J11 meeting will be in Pleasanton CA, 24-25 Sep 1990,
and tentatively the one after that would be 04-05 Mar 1991 although
we have no sponsor signed up for it yet.  More interpretations...

richard@aiai.ed.ac.uk (Richard Tobin) (04/11/90)

In article <12547@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>Meanwhile the Brits and Danes are reported to have been circulating
>draft documents to a selected subset of WG14, excluding comment from
>those who know what is wrong with their ideas.

Could someone tell us a bit more about what's going on here?  If I had
realised that there was significant disagreement between BSI and ANSI,
I might have gone to some BSI meetings, but unfortunately I didn't.

Anyone from BSI out there to put the other point of view?

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin