osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (01/09/85)
ANSI Draft of Proposed C Language Std.

Mail your replies to the author(s) below or to cbosgd!std-c.
Cbosgd is reachable via most of the USENET nodes, including ihnp4,
ucbvax, decvax, hou3c.... Administrivia should be mailed to
cbosgd!std-c-request.

ARPA -> mail to cbosgd!std-c@BERKELEY.ARPA (+++ NOT TO INFO-C +++)

**************** mod.std.c Vol. 2 No. 6 1/8/85 ********************

Today's Topics:
	reply to Arnold's comments on the X3J11/84-161 draft
----------------------------------------------------------------------

Date: 8 Jan 85 06:20:06 CST (Tue)
From: cbosgd!ihnp4!utzoo!henry
Subject: reply to Ken Arnold's comments on the X3J11/84-161 draft
References: <596@homxa.UUCP>

Toward the end of his contribution, Ken makes a key observation that
will be relevant when discussing his specific points (emphasis added):

> ... The standard should not break existing
> code, *except where such code takes advantage of bugs, non-standard
> extensions, or implementation or machine dependent code (such as asm
> statements)*.

I think we are all in agreement on that. Not breaking correct code was
a major objective of the committee; it is my contention that they have
in fact achieved it, and Ken's points are adequately rebutted by his
own observation. In detail...

> Not scanning strings in the program for macro names is proper and
> current. However, not scanning strings in the token sequence in the
> macro definition will break several current programs, including the
> 4.2bsd operating system (c.f. "CTRL" in <sys/ttychars.h>). ...

To quote the K&R C reference manual (henceforth "CRM"), section 12.1
(emphasis added):

	A [#define] causes the preprocessor to replace subsequent
	instances of the identifier with the given string of tokens...
	Each occurrence of a [macro parameter] is replaced by the
	corresponding token string from the call... *Text inside a
	string or a character constant is not subject to replacement*.

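A minimal sketch of what that rule means for a CTRL-style macro, assuming an ASCII machine and a preprocessor that follows the CRM. The macro and function names here are hypothetical illustrations, not the definitions in <sys/ttychars.h>.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */

/* In-string style: written in the hope that the parameter c is
   substituted inside the character constant.  Under the CRM rule the
   'c' is left alone, so every call means ('c' & 037). */
#define CTRL_INSTRING(c) ('c' & 037)

/* Portable style: the caller supplies the character constant, so no
   replacement inside a constant is needed. */
#define CTRL_PORTABLE(c) ((c) & 037)

/* Self-check for a CRM-following preprocessor on an ASCII machine. */
void check_ctrl(void)
{
    /* The argument is ignored: both calls expand to ('c' & 037). */
    assert(CTRL_INSTRING(a) == CTRL_INSTRING(z));

    /* The portable form depends on its argument, as intended. */
    assert(CTRL_PORTABLE('a') == 1);   /* control-A */
    assert(CTRL_PORTABLE('c') == 3);   /* control-C */
}
```

On a Reiser-style preprocessor the first assertion would be expected to fail, since CTRL_INSTRING(a) and CTRL_INSTRING(z) would expand to different constants there; the portable form behaves the same either way.
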
In other words, replacement inside strings -- be it for macros or macro
parameters -- is a non-standard extension. It's a "feature" of the
Reiser C preprocessor, which is omnipresent in Unix C compilers but not
in others. The closest thing we have to an implementation-independent
standard for C is the CRM, which explicitly outlaws replacement inside
strings. I agree that this will break a number of things, including
4.2BSD. How sad. Those programs, including 4.2BSD, were
implementation-dependent to begin with, and the authors have no right
to cry about it.

It should be clear from this that I disagree with the committee's
expressed intent to add such a capability later. The current draft
standard's neat new string-concatenation convention (adjacent string
literals -- note this is literals only -- are concatenated at compile
time) eliminates the need for in-string replacement as a way to build
filenames out of #defined pieces, which to my mind was the only real
need for in-string replacement.

> "As indicated by the syntax, a token must not follow a #else or
> #endif directive before the terminating new-line character.
> However, comments may appear anywhere on any source line,
> including on a preprocessor directive."
>
> This breaks many existing programs, including rmail, deroff, diction,
> efl, eqn, learn, lint, nroff, refer, struct, troff, uucp, and ingres.

Interestingly enough, I find *no* occurrences of the trouble-causing
syntax in rmail, deroff, eqn, learn, lint, nroff, refer, struct, troff,
or uucp on my system. A quick inspection of the System V sources (we
have, but don't run, System V) also comes up empty. So, this change
breaks Berklix and only Berklix programs; everybody else has been
following the CRM, which makes no provision for trailing tokens on
#else and #endif. This is a non-standard and implementation-dependent
extension.

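A short sketch of the two conventions just discussed, assuming a compiler that implements the draft's string concatenation. The macro names, the path, and the DEBUG symbol are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names and path, for illustration only. */
#define LIBDIR  "/usr/lib"
#define LOGNAME "/errlog"

/* Adjacent string literals are concatenated at compile time, so the
   filename is built from #defined pieces without any replacement
   inside a string. */
static const char logfile[] = LIBDIR LOGNAME;

/* The label after #else and #endif goes in a comment, which the draft
   allows, rather than appearing as a bare trailing token. */
#ifdef DEBUG
static const int verbose = 1;
#else /* DEBUG */
static const int verbose = 0;
#endif /* DEBUG */

const char *log_path(void) { return logfile; }
int is_verbose(void) { return verbose; }

/* Self-check: the two pieces were joined into one literal. */
void check_log_path(void)
{
    assert(strcmp(logfile, "/usr/lib/errlog") == 0);
}
```

Writing `#endif DEBUG` instead of `#endif /* DEBUG */` is the trailing-token form that the quoted draft text rules out.
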
I have no personal objections to this one, although I think the syntax
ought to be specific (i.e., one identifier only) rather than wide-open
(any random tokens).

> The implementation may further restrict the significance of an
> *external name* (an identifier that has external linkage) to
> six characters and may ignore distinctions of alphabetical case
> for such names. ...
>
> ... If one is asking
> someone to rewrite a compiler (and many of the extensions would require
> some extensive modifications to existing compilers), asking them to
> modify a loader is not too much to add. ...

As various people (including me) have pointed out, modifying (say) the
OS/360-aka-MVS linker is politically impossible, however desirable and
technically-simple it may be. I also note that the CRM addresses this
point with a (partial) list of implementations, and it is obvious at a
glance that "six chars monocase" is the lowest common denominator. For
those wishing to write portable code, the conclusion is clear.

I agree that a lot of code was written under the old pdp11 assumptions,
and minimally-conforming implementations of the standard will break
such code. I observe, however, that minimally-conforming-to-the-CRM
implementations which break such code already exist. So the standard is
not making the situation any worse. For the reason mentioned two
paragraphs up, the standard is not in a position to make things any
better.

P.S.: To rebut an unpleasant misinterpretation that has been going
around, I do *not* like identifier-length limits. Any identifier-length
limits. The time has long since passed when there was any technical
justification for them, if indeed there ever was. But it is of great
importance that the ANSI C standard be widely accepted, and that cannot
happen if none of the major manufacturers can implement it fully
without breaking hundreds of other programs. Standards are necessarily
compromises; "can I live with it?" is a much more important question
than "do I like it?". I don't like it, but I think we can live with it.
I'm getting tired of people who refuse to grasp the distinction.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

--------------------------------------

End of Vol. 2, No. 6. Std-C (Jan. 8, 1985 22:30:00)
-- 
Orlando Sotomayor-Diaz /AT&T Bell Laboratories, Red Hill Road
/Middletown, New Jersey, 07748 (HR 1B 316)
Tel: 201-949-9230 /UUCP: {ihnp4, houxm}!homxa!osd7