gnu@hoptoad.uucp (John Gilmore) (12/24/87)
While testing the GNU C compiler using the MetaWare C Validation Suite,
I found a bunch of things that are not clearly marked in the Oct 86 copy
of the C draft standard. I'm interested in clarification from the group
and/or from the standards committee on these:
* I've heard a rumor that in newer drafts, hex escape sequences inside
strings are no longer limited to 3 characters, e.g. "abc\x00345"
produces "abcE" since 'E' (0x45) is (char)0x00345. This strikes me as odd.
* Is a "const void *" a void pointer? Is a "volatile void *"? How about
a "const volatile void *"? How about a "void *const"? A "pointer to void"
gets special treatment in a few places (e.g. in assignment) but it's not
clear whether these are "pointers to void". (How about "volatile void
*noalias const foo", just for fun?)
* It appears from the text in section 3.5.3.3 that variable-argument-list
functions can ONLY be defined in the
    foo(int c, float bar, ...)
syntax, and not in the
    foo(c, bar, ...)
    int c;
    float bar;
syntax. GCC implements it this way. However, this makes it impossible
to support <varargs.h> since no existing code uses the new declaration
method. It also seems to be a silly inconsistency.
* Though a union can now be initialized, you can only initialize one
member, but you have to surround it with { } anyway. To correctly
initialize the structure below, ALL the braces used are required:
    struct s{union {int x,y; int z;} u; int q;} s[2] =
        {{{1}, 2}, {{3}, 4}};
I would have expected:
        {1, 2, 3, 4};
to work, but the wording of the standard does not support it. Furthermore,
the standard does not allow extra braces (e.g. around 2 and 4, or around the
whole thing), so you have to get it exactly right. Is this what was intended?
(These next two aren't from the validation suite.)
* When calling a function, side effects caused by evaluating the arguments
must be complete before the call takes place. What about side effects
caused by evaluating the function name? Ron Light gave this example:
    typedef int (*Inst)();  /* machine instruction */
    Inst *pc;               /* program counter during execution */
    execute(p)              /* run the machine */
    Inst *p;
    {
        for(pc = p;;)
            (*(*pc++))();
    }
If I reference "pc" from inside a function called from the for loop, is
its value guaranteed to be incremented, to not be incremented, or not
guaranteed?
* Are null statements (extra semicolons) allowed between declarations?
Between struct/union members' declarations?
--
{pyramid,ptsfa,amdahl,sun,ihnp4}!hoptoad!gnu gnu@toad.com
I foresee a day when there are two kinds of C compilers: standard ones and
useful ones ... just like Pascal and Fortran. Are we making progress yet?
-- ASC:GUTHERY%slb-test.csnet
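
The union initializer question above can be tried directly. The following is
a minimal sketch, not part of the original post: the array names braced and
flat and the function check_initializers are hypothetical, and the claim that
the brace-free form is accepted rests on the initialization rules of later
published versions of the standard and on current compilers, not on the Oct 86
draft under discussion. Some compilers may warn about the missing braces.

```c
#include <assert.h>

/* The declaration from the post; x, y and z all overlap in the union. */
struct s { union { int x, y; int z; } u; int q; };

/* Fully braced form, as the post says the Oct 86 draft required. */
struct s braced[2] = {{{1}, 2}, {{3}, 4}};

/* Brace-free form the poster expected to work.  ASSUMPTION: accepted
   under the brace-elision rules of later standards, not the 1986 draft. */
struct s flat[2] = {1, 2, 3, 4};

/* Confirm that both spellings initialize the first union member (x)
   and the trailing int identically. */
void check_initializers(void)
{
    int i;
    for (i = 0; i < 2; i++) {
        assert(braced[i].u.x == flat[i].u.x);
        assert(braced[i].q == flat[i].q);
    }
}
```

Calling check_initializers() from main should return silently if the two
forms agree; only the first member of the union is initialized in either case.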
gnu@hoptoad.uucp (John Gilmore) (12/27/87)
I wrote:
> * When calling a function, side effects caused by evaluating the arguments
> must be complete before the call takes place. What about side effects
> caused by evaluating the function name?

I found this answer myself: section 3.3.2.2 says:

    The order of evaluation of the function designator, the arguments,
    and subexpressions within the arguments is unspecified, but there
    is a sequence point before the actual call.

Thus all side effects in the function name and/or arguments must take
place before the call.
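
The sequence point rule quoted above can be observed with a small variant of
Ron Light's interpreter loop. This is only a sketch under stated assumptions:
the names base, seen_offset, first, second and check_increment_visible are
hypothetical, the instructions are declared with (void) so that current
compilers accept them, and the loop stops when an instruction returns 0,
whereas the original loops forever.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Inst)(void);      /* machine instruction */

static Inst *pc;                /* program counter during execution */
static Inst *base;              /* start of the program (hypothetical helper) */
ptrdiff_t seen_offset[2];       /* value of pc - base seen inside each call */

/* Each instruction records where pc points while it is running. */
static int first(void)  { seen_offset[0] = pc - base; return 1; }
static int second(void) { seen_offset[1] = pc - base; return 0; }

/* Run instructions until one returns 0 (ASSUMPTION: added stop rule). */
void execute(Inst *p)
{
    base = p;
    for (pc = p; (*(*pc++))() != 0; )
        ;
}

/* The increment in *pc++ is complete before the callee starts, so the
   callee sees pc already pointing at the next instruction. */
void check_increment_visible(void)
{
    static Inst program[2] = { first, second };
    execute(program);
    assert(seen_offset[0] == 1);
    assert(seen_offset[1] == 2);
}
```

Inside first(), pc already points one slot past the instruction being run,
which matches the reading that the side effect of evaluating the function
designator must take place before the call.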
mnc@m10ux.UUCP (Michael Condict) (12/28/87)
In article <3725@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> While testing the GNU C compiler using the MetaWare C Validation Suite,
> I found a bunch of things that are not clearly marked in the Oct 86 copy
> of the C draft standard. I'm interested in clarification from the group
> and/or from the standards committee on these:
>
> . . .
>
> * It appears from the text in section 3.5.3.3 that variable-argument-list
> functions can ONLY be defined in the
>
>     foo(int c, float bar, ...)
>
> syntax, and not in the
>
>     foo(c, bar, ...)
>     int c;
>     float bar;
>
> syntax. GCC implements it this way. However, this makes it impossible
> to support <varargs.h> since no existing code uses the new declaration
> method. It also seems to be a silly inconsistency.

Another silly inconsistency (although it is probably too late to fix in the
current standard) is the use of comma instead of semicolon in the new
declaration syntax. There already were at least three places in the existing
language where one could write a sequence of declarations of names, e.g. in
struct declarations between { and }, and in the old-style declaration of the
formal args of a function. In all these places, a semicolon is used to
terminate each declaration, not a comma. This is crucial, because comma is
also allowed in these constructs and serves to concisely declare a list of
identifiers of the same type:

    int a,b,c;

The ANSI committee's adopted syntax introduces two defects in the language:

(1) It confuses users by being inconsistent with these other, highly
analogous syntactic constructs, and for the same reason makes parsers
unnecessarily complex.

(2) It eliminates the possibility of allowing the concise declaration
syntax shown above in the new declaration syntax.

My guess is that the committee's choice of syntax was based on reasoning
something like: "currently, commas are required between formal arguments in
the header of the function (i.e., between '(' and ')'), so we must preserve
that to avoid confusing users." This argument however is less than persuasive
if we note that the addition of the new declaration syntax so radically alters
what is allowed inside the () that some users are bound to be confused anyway.
And besides, it is easy to describe my advocated version of the new
declaration syntax in words that make it out to be a natural extension of the
old syntax:

(1) In K&R C, the construct "f(a,b,c) ... {" is an abbreviation for
"f(int a,b,c;) ... {". (Note that these same two abbreviations are allowed
elsewhere in the language.) The meaning is that a, b and c are all declared
to be ints, as is the case elsewhere in the language, except that their
declaration(s) can be overridden by a redeclaration in the ... stuff between
')' and '{'. This is consistent with how things work now.

(2) In ANSI C, the type word "int" may not be omitted, just as it may no
longer be omitted elsewhere in the language. The trailing ";" is still
optional.

(3) Furthermore, in ANSI C, we extend the syntax and semantics to allow an
arbitrary sequence of declarations inside the (), with the semicolon optional
for the last one. E.g.:

    f(int a,b; struct {float i,r;}) {

(We should probably also allow bit field declarations, since our syntax and
semantics are essentially equivalent to the case where every function takes
one argument, but that argument is a struct type. No new implementation
difficulties arise, since it is already legal to declare a function that
takes as argument a struct with bitfields.)

(4) Now, since any types of arguments can be declared without putting stuff
between ')' and '{', we don't allow you to redeclare args there, with one
exception: for backward compatibility, we allow the old style declaration, at
least until the next version of the standard, but it is a deprecated feature.
That is, you can still abbreviate "int a,b,c" to "a,b,c", and if your entire
declaration sequence within () consists of such an abbreviation, you can
redeclare the types of some or all of the args, using declarations occurring
between ')' and '{'.

Described this way, it is clear why (my version of) the new syntax is to be
preferred to K&R syntax: it doesn't make sense to be declaring the args as
ints inside of the () then redeclaring them afterwards.

Am I the only one bothered by this? I've noticed no other discussion of it.
--
Michael Condict         {ihnp4|vax135|cuae2}!m10ux!mnc
AT&T Bell Labs          (201)582-5911    MH 3B-416
Murray Hill, NJ
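
The inconsistency Condict describes can be seen by putting the two kinds of
declaration list side by side in the syntax the committee adopted. This is a
small sketch only; struct triple, sum3, sum_triple and check_parameter_forms
are hypothetical names, and none of his proposed semicolon syntax appears in
it because no compiler accepts that form.

```c
#include <assert.h>

/* In a struct, one declaration may name several members of one type,
   and each declaration ends with a semicolon. */
struct triple { int a, b, c; };

/* In a prototype-style parameter list, commas separate the parameters,
   so the type must be repeated: "int a, b, c" is not accepted here. */
int sum3(int a, int b, int c)
{
    return a + b + c;
}

/* A struct parameter carries the same three ints as a single argument. */
int sum_triple(struct triple t)
{
    return sum3(t.a, t.b, t.c);
}

void check_parameter_forms(void)
{
    struct triple t = { 1, 2, 3 };
    assert(sum3(1, 2, 3) == 6);
    assert(sum_triple(t) == 6);
}
```

The repeated "int" in sum3 is the cost Condict's point (2) refers to: the
concise "int a,b,c;" form available for struct members has no counterpart
inside the parentheses.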
msb@sq.uucp (Mark Brader) (12/29/87)
Michael Condict (mnc@m10ux.UUCP) expresses regret that the new function
prototype syntax uses commas rather than semicolons as delimiters, and asks:
> Am I the only one bothered by this? I've noticed no other discussion of it.
No, I asked the same question well over a year ago. As I recall, the answer
given was that if semicolons were allowed then error recovery became very hard.
Notice that the following would be VALID input:
    int f (int a, b;
           float c;
           char *p, s[20];
           int p (int q, r;);
          );
Now that I think of it, the force of this argument seems somewhat weakened
since, if I understand correctly (my copy of the latest Draft being at the
office, and me not), even under the existing syntax a declaration such as
    int f (struct {int p, q;} r);
is legal and does contain embedded semicolons. Hmm.
Mark Brader                 "C takes the point of view
SoftQuad Inc., Toronto       that the programmer is always right"
utzoo!sq!msb, msb@sq.com                        -- Michael DeCorte
OWENSJ%VTVM1.BITNET@CUNYVM.CUNY.EDU (John Owens) (12/30/87)
[Michael Condict suggests allowing declaration syntax, separated by
semicolons, in the function definition argument lists.]

While this may sound clean from a language-design perspective, it makes the
resulting definitions harder to use. With the current proposed syntax,
someone wanting to know the types and number of arguments can see them
easily, separated by commas, just as they are specified in the calling
sequence. Michael's syntax would lose the one-to-one correspondence both with
the calling sequence *and* the function prototypes. I think this
correspondence is important to preserve, unless we want to see C go the way
of Algol 68....

-John Owens             +1 703 961 7827
Virginia Tech Communications Network Services
OWENSJ@VTVM1.CC.VT.EDU  OWENSJ@VTVM1.BITNET
gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)
In article <3725@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
> * I've heard a rumor that in newer drafts, hex escape sequences inside
> strings are no longer limited to 3 characters, e.g. "abc\x00345"
> produces "abcE" since 'E' (0x45) is (char)0x00345. This strikes me as odd.

Yes, hex escapes are arbitrarily long now. I initiated the action that ended
up with this, although it's not what I originally proposed. The problem was
that a 3-character limit is not enough for implementations with char sizes
> 12 bits. I proposed that the implementation define what the limit is, but
the committee preferred to remove the limit (for hex escapes only; it's too
late to change octals). \x00345 is no weirder than \x345 on an 8-bit machine.
The problem of wanting to follow a hex sequence with a digit character can be
solved by using string concatenation: "\x003""45".

> * Is a "const void *" a void pointer? Is a "volatile void *"? How about
> a "const volatile void *"? How about a "void *const"? A "pointer to void"
> gets special treatment in a few places (e.g. in assignment) but it's not
> clear whether these are "pointers to void".

I think they're all just "void pointers". I didn't find any special meaning
specified for qualified void pointer types.

> * It appears from the text in section 3.5.3.3 that variable-argument-list
> functions can ONLY be defined in the
>     foo(int c, float bar, ...)
> syntax, and not in the
>     foo(c, bar, ...)
>     int c;
>     float bar;
> syntax. GCC implements it this way. However, this makes it impossible
> to support <varargs.h> since no existing code uses the new declaration
> method. It also seems to be a silly inconsistency.

Yes, the ", ..." is not existing practice. Existing practice (non-prototype
declarations) was retained simply to "grandfather" in existing code, but it
has been flagged "obsolescent" to permit its removal in some future revision
of the standard. There was little sentiment for propping up the obsolescent
syntax by adding ", ..." to it. If you have to add the variadic indicator
", ..." to a declaration, you might as well convert it to prototype form at
the same time.

> * Though a union can now be initialized, you can only initialize one
> member, but you have to surround it with { } anyway. To correctly
> initialize the structure below, ALL the braces used are required: ...

Logically, it could have been made more convenient for unions, but it
apparently didn't occur to anyone to do so.

> Furthermore,
> the standard does not allow extra braces (e.g. around 2 and 4, or around the
> whole thing), so you have to get it exactly right. Is this what was intended?

I think extra {} are allowed by the grammar. There was some small change made
to the bracketing wording at the December meeting, but I don't recall what it
was. (It seemed correct at the time, so I quit worrying about it.)

> * When calling a function, side effects caused by evaluating the arguments
> must be complete before the call takes place. What about side effects
> caused by evaluating the function name? Ron Light gave this example: ...

Using a pointer to a function does not constitute evaluating the function
name. Quoting the latest draft: "The order of evaluation of the function
designator [postfix-expression], the arguments, and subexpressions within the
arguments is unspecified, but there is a sequence point before the actual
call." I added the [] remark for clarity. It seems simple enough to me:
because of the sequence point, the increment must occur before the actual
call.

> If I reference "pc" from inside a function called from the for loop, is
> its value guaranteed to be incremented, to not be incremented, or not
> guaranteed?

Guaranteed to be incremented.

> * Are null statements (extra semicolons) allowed between declarations?
> Between struct/union members' declarations?

I don't see how they can be; a null statement is an expression-statement,
which involves "evaluation".

P.S. Of course, the above are merely my own opinions. Send in comments to
X3J11 during the next formal public review if you remain unsatisfied about
any of these issues.
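
The string concatenation workaround for hex escapes mentioned above can be
checked with a few assertions. A minimal sketch, assuming a compiler that
implements unlimited-length hex escapes and adjacent string literal
concatenation as described in the post; the names split_escape and
check_hex_escape are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* The escape \x003 ends at the closing quote of the first literal, so
   the '4' and '5' in the second literal stay ordinary characters. */
const char split_escape[] = "\x003" "45";

void check_hex_escape(void)
{
    /* Three characters plus the terminating null. */
    assert(sizeof split_escape == 4);
    /* First character has value 3, from the hex escape. */
    assert(split_escape[0] == 3);
    /* The rest is the two-character string "45". */
    assert(strcmp(split_escape + 1, "45") == 0);
}
```

Written as one literal, "\x00345", the escape would instead swallow all five
hex digits, which is the behavior the original question found odd.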
gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)
In article <458@m10ux.UUCP> mnc@m10ux.UUCP (Michael Condict) writes:
> Another silly inconsistency (although it is probably too late to fix in the
> current standard) is the use of comma instead of semicolon in the new
> declaration syntax.

In C, semicolon has always been a statement terminator (it can be considered
as such even in the "for(;;)" kludge), while comma has been used as a
separator in lists (and, much less often, as a sequencing operator). If
you're going to make "consistency" arguments, you should also deal with this
one.
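
The two roles of the comma mentioned above, list separator and sequencing
operator, can appear within a few lines of each other. The following sketch
is illustrative only; add, comma_roles and check_comma_roles are hypothetical
names.

```c
#include <assert.h>

/* Commas here separate the parameters of the function. */
int add(int a, int b)
{
    return a + b;
}

int comma_roles(void)
{
    int x = 1, y = 2;       /* comma separates declarators in a list */
    int z;

    /* Comma operator: evaluate x += 10 first, then yield x + y. */
    z = (x += 10, x + y);   /* x becomes 11, z becomes 13 */

    return add(x, z);       /* commas separate the arguments: 11 + 13 */
}

void check_comma_roles(void)
{
    assert(comma_roles() == 24);
}
```

Every statement above ends in a semicolon, while each comma either separates
items in a list or, inside the parenthesized expression, sequences two
evaluations.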