minow@decvax.UUCP (Martin Minow) (12/14/86)
This is one of a collection of comments on the Draft Standard, posted to comp.lang.c for discussion before I mail a final draft to the Ansi C committee. Each message discusses one problem I have found with the Draft Standard that I feel warrants a "no" vote. Note that this message is my personal opinion, and does not reflect on the opinions of my employer.

This message lists concerns -- these are questions or problems, but are not sufficiently serious as to preclude my acceptance of the standard.

----

Page 1, line 14. The standard should specify the total list of words reserved to the compiler and its libraries.

Page 6, line 40ff. It is unclear whether the main() function may be declared or invoked with more than 2 parameters. One common extension is to invoke main with a third parameter which specifies a list of "environment variables."

Page 7, line 12. Must the string in argv[0] be modifiable?

Page 10, line 27. The horizontal tab, vertical tab, and form feed characters are not needed by the language. The standard should declare that horizontal tab is identical to space except in character and string literals, and vertical tab and form feed are everywhere identical to newline.

Page 11, lines 29ff. The standard should specify the internal representations for the predefined escape sequences for implementations that use the USASCII or Latin 1 alphabets.

Page 12, line 29. The minimum significance for external identifiers should be changed to ``6 significant monocase initial characters in an external identifier.''

Page 14, line 20ff. FLT should be FLOAT. DBL should be DOUBLE. etc. As the first 31 characters of macro definitions are significant, there is no need to sacrifice legibility (and maintainability) for conciseness.

Page 26, line 13. The exceptions (the characters that may not appear in string literals) should include the vertical tab character and the form feed character, as these are equivalent to newlines.

Page 74, line 28.
Horizontal tab does not have an independent existence during preprocessing. The example should note that comments may precede or follow the # that introduces a preprocessing directive.

Page 75, line 36. An arithmetic error in an #if expression (such as divide by zero) shall result in a diagnostic error message. However, a sequence such as:

    #if (foo == 0) ? 0 : (10 / foo)

should not result in a diagnostic error message for any value of foo.

Page 82, line 24ff. I would recommend the following clarifications to the definition of the predefined macro names:

    __LINE__    The line number shall be as defined in section 3.8.4, page 81, line 30.

    __FILE__    There is no presumption that this string can be used to open a file during execution of the program.

    __DATE__    Neither this value nor the value of __TIME__ changes during compilation.

A predefined name should be redefinable (by #undef). (The identifier "defined" may not be redefined.)

Page 83, line 15ff. Function prototypes with separate parameter identifier and declaration lists offer a better environment for documentation than the more concise function prototype format. I would strongly recommend that they not be marked obsolescent.

Page 85, line 35. The ability to redefine any function declared in a header as a macro may break existing programs that write, e.g.,

    #include <stdlib.h>
    extern long rand();

If rand() is declared as a macro,

Page 89, line 13 (footnote 64): The Standard should note that, in an implementation that uses the Latin 1 character set, the printing characters are those whose values lie from 0x20 through 0x7E or from 0xA0 through 0xFF. Control characters are those whose values lie from 0x00 through 0x1F, 0x7F, or from 0x80 through 0x9F. The ranges for the other <ctype.h> macros should be similarly extended.

Page 91, line 46ff. Note that, in a Latin 1 environment, the ispunct() and isspace() functions should test for the non-breaking space at 0xA0.

Page 102, line 46ff.
If longjmp() is called from a signal handler, volatile objects may have indeterminate values as they cannot always be updated by atomic (one machine cycle) operations. It is unrealistic to require an implementation to lock interrupts before modifying a volatile object. The Standard should note that volatile objects are indeterminate when longjmp() is called from an interrupt or signal handler.

Page 128, line 7. Is one character of pushback guaranteed even before anything has been read from the stream or after end of file or error? The standard should be clarified on this point. (I don't care either way, but would prefer permitting one character pushback at any time.)

Page 140, line 21ff. Predefined values for "successful termination" and "unsuccessful termination" (arguments to exit()) should be provided.

Page 142, line 16ff. An unsigned division function analogous to div() would be useful.

----

Martin Minow
decvax!minow
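For readers wondering what the last request (Page 142) might look like, here is a minimal sketch of an unsigned counterpart to div(). The names struct udiv_result and udiv_sketch() are hypothetical and appear nowhere in the draft; this only illustrates the shape of the idea.

```c
#include <assert.h>

/* Hypothetical result type modeled on div_t; not part of any standard. */
struct udiv_result {
    unsigned int quot;  /* quotient  */
    unsigned int rem;   /* remainder */
};

/* Hypothetical unsigned analogue of div(); the name is an assumption. */
struct udiv_result udiv_sketch(unsigned int numer, unsigned int denom)
{
    struct udiv_result r;

    assert(denom != 0u);        /* dividing by zero is undefined */
    r.quot = numer / denom;
    r.rem  = numer % denom;
    return r;
}

/* Usage: quotient and remainder come back from a single call. */
void udiv_sketch_demo(void)
{
    struct udiv_result r = udiv_sketch(17u, 5u);

    assert(r.quot == 3u);
    assert(r.rem == 2u);
}
```

Call udiv_sketch_demo() from main() to run the checks.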
gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/15/86)
In article <112@decvax.UUCP> minow@decvax.UUCP (Martin Minow) writes:
>Page 1, line 14. The standard should specify the total list of words
>reserved to the compiler and its libraries.

While this would be "nice", one can pretty much find this out from the index, and the standard isn't intended to be either a tutorial or a user reference manual. I would hope that vendors and textbook authors will consider providing such a list.

>Page 6, line 40ff. It is unclear whether the main() function may be
>declared or invoked with more than 2 parameters.

I thought this was clear: main() can be defined with either 0 or 2 parameters. Other schemes are not defined, which allows extensions such as UNIX's envp but does not mandate them for all implementations. (Note that envp is not normally necessary, given getenv().)

>Page 7, line 12. Must the string in argv[0] be modifiable?

That's what the draft says. Is this a problem?

>Page 10, line 27. The horizontal tab, vertical tab, and form feed characters
>are not needed by the language. The standard should declare that
>horizontal tab is identical to space except in character and string
>literals, and vertical tab and form feed are everywhere identical to newline.

There are several flavors of whitespace in C (including the preprocessor). Some generalization was done where possible; did we miss any?

>Page 11, lines 29ff. The standard should specify the internal representations
>for the predefined escape sequences for implementations that use the
>USASCII or Latin 1 alphabets.

So long as we don't mandate ASCII/ISO character sets, this is infeasible.

>Page 12, line 29. The minimum significance for external identifiers
>should be changed to ``6 significant monocase initial characters in
>an external identifier.''

Section 3.1.2 permits implementations to ignore case distinctions.
2.2.4.1 is merely to establish that at least 6 significant characters can be used in external identifiers simultaneously with meeting other implementation limit requirements, and nothing is gained by mentioning case-mapping in this context.

>Page 14, line 20ff. FLT should be FLOAT. DBL should be DOUBLE. etc. As
>the first 31 characters of macro definitions are significant, there is no
>need to sacrifice legibility (and maintainability) for conciseness.

That would be nice, but we also have SHRT_MAX, for example, which is defined in a header that is shared between two standards bodies and is therefore difficult to redefine. (It's also possible that these names were chosen to agree with the new Fortran standard; I forget.)

>Page 26, line 13. The exceptions (the characters that may not appear
>in string literals) should include the vertical tab character
>and the form feed character, as these are equivalent to newlines.

Where are these characters declared to be "equivalent to newlines"?

>Page 74, line 28. Horizontal tab does not have an independent existence
>during preprocessing. The example should note that comments may precede
>or follow the # that introduces a preprocessing directive.

Section 2.1.1.2 (Translation phases) states that an implementation MAY retain distinct white-space characters at the point of preprocessing. However, comments must have been turned into single space characters at that point.

>Page 75, line 36. An arithmetic error in an #if expression (such as divide
>by zero) shall result in a diagnostic error message. However, a sequence
>such as:
>
>    #if (foo == 0) ? 0 : (10 / foo)
>
>should not result in a diagnostic error message for any value of foo.

[The page/line reference seems wrong.] I think the error handling is already implied by the syntax, but perhaps explicit wording would help. (Note that the example is correct code and should not cause a diagnostic in any case.)

>Page 82, line 24ff. I would recommend the following clarifications to
>the definition of the predefined macro names:
>
>    __LINE__    The line number shall be as defined in section 3.8.4,
>                page 81, line 30.

That's already my understanding of the draft.

>    __FILE__    There is no presumption that this string can be used to
>                open a file during execution of the program.

That's the way it is now. The sources clearly need not even exist in the run-time environment!

>    __DATE__    Neither this value nor the value of __TIME__ changes during
>                compilation.

That might be nice, but how important is such a constraint on implementations? I bet there even are people who would prefer the __TIME__ clock to continue to tick during compilation.

>A predefined name should be redefinable (by #undef). (The identifier
>"defined" may not be redefined.)

No, since these names begin with underscore, the user cannot safely redefine them anyway; they're not in his "allowable name space".

>Page 83, line 15ff. Function prototypes with separate parameter identifier
>and declaration lists offer a better environment for documentation than
>the more concise function prototype format. I would strongly recommend
>that they not be marked obsolescent.

The intent is to eliminate any requirement that old-style function parameter declarations be supported in a future revision of the standard. The only way (it appears) that we can do that is by calling them "obsolescent" in a previous draft.

>Page 85, line 35. The ability to redefine any function declared in
>a header as a macro may break existing programs that write, e.g.,
>
>    #include <stdlib.h>
>    extern long rand();
>
>If rand() is declared as a macro,

First of all, I doubt that existing programs #include <stdlib.h>. When adding such an #include to existing source, you should also remove any explicit redundant declarations (except when they are really necessary, in which case use #undef or one of the other usual tricks to force use of a genuine function).
I'll be among the first to admit that this approach has its problems, but I don't know of anything better. If you can suggest a better way to handle this, please write it up and mail it in to ANSI.

>Page 89, line 13 (footnote 64): The Standard should note that, in an
>implementation that uses the Latin 1 character set, the printing
>characters are those whose values lie from 0x20 through 0x7E or from
>0xA0 through 0xFF. Control characters are those whose values lie from
>0x00 through 0x1F, 0x7F, or from 0x80 through 0x9F. The ranges for the other
><ctype.h> macros should be similarly extended.
>
>Page 91, line 46ff. Note that, in a Latin 1 environment, the ispunct() and
>isspace() functions should test for the non-breaking space at 0xA0.

No particular character set is required, so we can't make such remarks in the standard itself. Perhaps the Rationale should give such examples.

>Page 102, line 46ff. If longjmp() is called from a signal handler, volatile
>objects may have indeterminate values as they cannot always be updated by
>atomic (one machine cycle) operations. It is unrealistic to require an
>implementation to lock interrupts before modifying a volatile object. The
>Standard should note that volatile objects are indeterminate when longjmp()
>is called from an interrupt or signal handler.

I don't know that anything needs to be said about this. The only object for which atomic operations are guaranteed is sig_atomic_t. [longjmp() vs. signal handlers was discussed in a previous note]

>Page 128, line 7. Is one character of pushback guaranteed even before
>anything has been read from the stream or after end of file or error?
>The standard should be clarified on this point. (I don't care either way,
>but would prefer permitting one character pushback at any time.)

Yes, since this is not specifically excepted it is required.

>Page 140, line 21ff. Predefined values for "successful termination"
>and "unsuccessful termination" (arguments to exit()) should be provided.
Done at last week's meeting, via a compromise solution that requires that 0 also always be taken to mean success.

>Page 142, line 16ff. An unsigned division function analogous to
>div() would be useful.

This keeps getting proposed and defeated. Basically, the only reason div() etc. are defined is because we didn't want to insist that / and % work "correctly"; that's not an issue for unsigned integers. (It's also nice that both the quotient and remainder are returned simultaneously; this can be exploited by some implementations to improve efficiency in the frequent situation where both values are needed.)

A lot of proposals for new features have been rejected in an attempt to keep the size of the language and its environment relatively small. (This attempt hasn't been totally successful, but it's certainly a worthwhile goal.) Therefore, please don't interpret failure to adopt a suggestion as necessarily implying that there is something wrong with the idea, although often there is (in which case the response should point out what).

Reminder: Current public review period ends 07-Mar-1986. There WILL be another, 2-month, public review, since X3J11 has decided to make substantive changes to the current draft [as reported in another note].
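The "#undef or one of the other usual tricks" mentioned in the reply to the Page 85 comment can be sketched as follows. This is only an illustration, assuming a <stdlib.h> that declares rand() as a function and may additionally define it as a function-like macro; the wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Parentheses around the name keep a function-like macro from
 * expanding, so this reaches the genuine library function. */
int rand_via_parens(void)
{
    return (rand)();
}

/* Removing any macro definition leaves only the function
 * declaration that <stdlib.h> supplied. */
#undef rand

int rand_via_undef(void)
{
    return rand();
}

/* Usage: either route yields an ordinary rand() result. */
void genuine_rand_demo(void)
{
    int a = rand_via_parens();
    int b = rand_via_undef();

    assert(a >= 0 && a <= RAND_MAX);
    assert(b >= 0 && b <= RAND_MAX);
}
```

Neither route rescues a conflicting redeclaration such as "extern long rand();", which is why the reply recommends removing redundant declarations when adding the #include.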
faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) (12/17/86)
Regarding the requirement that exit(0) be success -- this will break a lot of VMS C programs, which use 1 for success and 0 for "undefined error" (I think -- I'm not a big VMS fan...)

	Wayne
minow@decvax.UUCP (Martin Minow) (12/17/86)
Sorry about the length of this, but my original comments apparently require clarification. I'm grateful to Doug Gwyn (@ brl-smoke.arpa) for his comments.

1. I recommend that the total list of words be standardized (excepting those defined with an initial underscore). This (hopefully) prevents proliferation of new quasi-reserved words. I.e. I want a guarantee that Ansi will never add a foo() function to <math.h>.

2. Horizontal tab has an independent existence in the pre-processor (page 74, lines 26-29). It shouldn't (if I understand translation phases).

3. (Specifying internal representations linked to Latin 1) -- I understand that it is infeasible to *require* ANSI (or Latin 1), but I would recommend defining Latin 1 as a reference, and stating that, for implementations supporting Latin 1, the internal representations of the specified characters *shall* be that given by Latin 1. (You are also free to give representations for EBCDIC, if you can find a standard.)

4. Doug asks where VT and FF are declared "equivalent to newlines." That's my reading of section 2.2.2 (page 11) defining character display semantics. If this is not the case, perhaps a clarification is in order.

5. I note that a sequence such as

    #define foo 0
    #if 0 && 10 / foo
    int this;
    #else
    int that;
    #endif

should not result in an error. Doug seems to agree. Unfortunately, this bugchecks at least one C compiler. (And I had to work hard in Decus cpp to prevent it.) The problem is that some preprocessors do not properly "short-circuit" evaluate && || and ?:. Also, the standard should clarify just what happens if you do write

    #if 10 / 0

I doubt that bugchecking is correct behavior. I don't see anything in section 3.8.1 (pp. 75ff) discussing this.

Martin Minow
decvax!minow
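Point 5 can be turned into a small self-checking translation unit. This is a sketch under the assumption that the preprocessor short-circuits &&, || and ?: the way Minow asks for; the macro and function names below are made up for the illustration.

```c
#include <assert.h>

#define foo 0

/* The division sits in the branch of ?: that is not selected. */
#if (foo == 0) ? 0 : (10 / foo)
#define TERNARY_TAKEN 1
#else
#define TERNARY_TAKEN 0
#endif

/* The division sits to the right of a false && operand. */
#if 0 && 10 / foo
#define AND_TAKEN 1
#else
#define AND_TAKEN 0
#endif

int ternary_taken(void) { return TERNARY_TAKEN; }
int and_taken(void)     { return AND_TAKEN; }

/* Usage: both conditions are false, so both #else arms were chosen. */
void short_circuit_demo(void)
{
    assert(ternary_taken() == 0);
    assert(and_taken() == 0);
}
```

A preprocessor that evaluates both operands eagerly would be expected to reject this file at the two #if lines, which is the failure described above. The bare "#if 10 / 0" case is deliberately left out, since its treatment is the open question.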
joemu@nscpdc.NSC.COM (Joe Mueller) (12/17/86)
In article <1171@ucbcad.BERKELEY.EDU>, faustus@ucbcad.BERKELEY.EDU (Wayne A. Christopher) writes:
> Regarding the requirement that exit(0) be success -- this will break a lot
> of VMS C programs, which use 1 for success and 0 for "undefined error"
> (I think -- I'm not a big VMS fan...)

The question of exit status came up again during the last meeting. The position the committee eventually adopted is this:

    exit(0)                 always indicates success (for unix code)
    exit(EXIT_SUCCESS)      always indicates success
    exit(EXIT_FAILURE)      always indicates failure
    exit(anything else)     implementation defined

The EXIT* macros will be defined in (I believe) stddefs.h.

Joe Mueller
...!nsc!nscpdc!joemu
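A short sketch of how a program would use the adopted scheme. In the standard as eventually published, EXIT_SUCCESS and EXIT_FAILURE are defined in <stdlib.h>, which is what the code includes; the helper name portable_status() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>   /* EXIT_SUCCESS, EXIT_FAILURE */

/* Hypothetical helper: choose a portable exit status from a flag,
 * without assuming any particular numeric value for either macro. */
int portable_status(int succeeded)
{
    return succeeded ? EXIT_SUCCESS : EXIT_FAILURE;
}

/* Usage: compare against the macros, never against literal numbers. */
void portable_status_demo(void)
{
    assert(portable_status(1) == EXIT_SUCCESS);
    assert(portable_status(0) == EXIT_FAILURE);
}
```

A program would then end with exit(portable_status(ok)) or return the same value from main().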
bzs@bu-cs.BU.EDU (Barry Shein) (12/18/86)
>Regarding the requirement that exit(0) be success -- this will break a lot
>of VMS C programs, which use 1 for success and 0 for "undefined error"
>(I think -- I'm not a big VMS fan...)
>
>	Wayne

There's no reason that the run-time support for VMS/C can't return 1 to the O/S if the program exits 0. Unfortunately, there's really no other resolution. Unix and IBM systems both treat zero exits as success; lord knows why VMS decided to be different. But the problem is not a problem: the O/S can be handed whatever's correct.

	-Barry Shein, Boston University
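The run-time translation suggested here might look roughly like the sketch below. Everything in it is an assumption made for illustration: the host value 1 for success comes only from the description earlier in the thread, and host_status_from_c() is a hypothetical hook, not an actual VMS C library routine.

```c
#include <assert.h>

/* Assumed host convention, taken from the thread: 1 means success.
 * Illustrative only. */
#define HOST_SUCCESS 1

/* Hypothetical hook a run-time library could apply to the value
 * passed to exit() before handing it to the operating system. */
int host_status_from_c(int c_status)
{
    if (c_status == 0)
        return HOST_SUCCESS;    /* C-level success becomes host success */
    return c_status;            /* other values pass through unchanged  */
}

/* Usage: exit(0) is reported to the host as success. */
void host_status_demo(void)
{
    assert(host_status_from_c(0) == HOST_SUCCESS);
    assert(host_status_from_c(44) == 44);
}
```

The pass-through for nonzero values is a simplification; a real run-time library would still have to decide what an explicit exit(1) should mean on such a host.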