osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (01/29/85)
mod.std.c Digest        Mon, 28 Jan 85        Volume 2 : Issue 12

Today's Topics:
                   What is the current standard?
                 (notes) Standard C Digest - V2 #6
                      #endif token-string (2)
----------------------------------------------------------------------

Date: Thu, 10 Jan 85 20:02:24 pst
From: cbosgd!ucbvax!ucsfcgl!arnold (Ken Arnold)
Subject: What is the current standard?
To: @ucbvax.UCB-VAX:cbosgd!std-c

In article Henry Spencer writes
>To quote the K&R C reference manual (henceforth "CRM"), section 12.1
>(emphasis added):
>
>	A [#define] causes the preprocessor to replace subsequent
>	instances of the identifier with the given string of tokens...
>	Each occurrence of a [macro parameter] is replaced by the
>	corresponding token string from the call...  *Text inside a
>	string or a character constant is not subject to replacement*.
>
>In other words, replacement inside strings -- be it for macros or
>macro parameters -- is a non-standard extension.  It's a "feature" of
>the Reiser C preprocessor, which is omnipresent in Unix C compilers
>but not in others.  The closest thing we have to an implementation-
>independent standard for C is the CRM, which explicitly outlaws replacement
>inside strings.
>
>I agree that this will break a number of things, including 4.2BSD.  How
>sad.  Those programs, including 4.2BSD, were implementation-dependent
>to begin with, and the authors have no right to cry about it.  It should
>be clear from this that I disagree with the committee's expressed intent
>to add such a capability later.  The current draft standard's neat new
>string-concatenation convention (adjacent string literals -- note this
>is literals only -- are concatenated at compile time) eliminates the
>need for in-string replacement as a way to build filenames out of #defined
>pieces, which to my mind was the only real need for in-string replacement.
>

Well, we all remember that K&R is not a standard, but it is an attempt
to describe how the language works.  At best it is a *de-facto* standard.
However, since this feature is "omnipresent in UNIX C compilers" (it is
also true in DECUS C, and probably in others), that sort of seems like
a de-facto standard, too.  So, which de-facto standard are you going to
follow?  If all UNIX C compilers use it, and many other C compilers use
it, too, it seems to me that we should be centering the standard around
the language *as used*, not according to K&R.

Note that, as it stands, the existence of parameter replacement inside
strings cannot be a non-standard extension since there is no standard
and it is extremely common.  You might, at worst, call it a standard
extension.  But the ANSI standard ought to encompass normal usage, and
this is part of normal usage.

Also, if the committee has been willing to bend over backwards to
accommodate people with 6 character loaders who don't want to work
around the problem, we surely can do something as reasonable as not
break C code written under UNIX.  I consider myself an informed C
person, but this is the first time it has come to my attention that
this is not in K&R.  How many other people do you think have used
this?  Shall we hew to a widely ignored statement in a non-standard, or
to broad, nearly universal, actual usage?

I must also disagree with the last statement about any replacement for
this feature.  There is no replacement in the standard.  If there is no
parameter substitution in strings, concatenation of strings doesn't
solve anything.  I have used replacement inside strings for many other
valid purposes, not including concatenating existing strings.  The
"assert" macro, for example, can use this feature to print out which
assertion it was that botched.

	# define assert(x) { \
		if (!(x)) \
			fprintf(stderr, \
			    "assertion \"x\" botched, line %d, \"%s\"\n", \
			    __LINE__, __FILE__); \
	}

>> "As indicated by the syntax, a token must not follow a #else or
>> #endif directive before the terminating new-line character.
>> However, comments may appear anywhere on any source line,
>> including on a preprocessor directive."
>>
>> This breaks many existing programs, including rmail, deroff, diction,
>> efl, eqn, learn, lint, nroff, refer, struct, troff, uucp, and ingres.
>
>Interestingly enough, I find *no* occurrences of the trouble-causing
>syntax in rmail, deroff, eqn, learn, lint, nroff, refer, struct, troff,
>or uucp on my system.  A quick inspection of the System V sources (we
>have, but don't run, System V) also comes up empty.  So, this change
>breaks Berklix and only Berklix programs; everybody else has been
>following the CRM, which makes no provision for trailing tokens on
>#else and #endif.  This is a non-standard and implementation-dependent
>extension.
>
>I have no personal objections to this one, although I think the syntax
>ought to be specific (i.e., one identifier only) rather than wide-open
>(any random tokens).
>

Again, the committee is willing to protect people with archaic loaders
from working around it, but it's okay to break a bunch of Berkeley 4BSD
code?  That's weird.  This seems also to be part of the infamous :->
Reiser C Preprocessor (since it works on our System V system, and in
DECUS C).  See above discussion.

Also, restricting it to one token doesn't handle a normal usage, which
is:

	# if A || B
	...
	# endif A || B

Since this "addition" can break no existing code following the supposed
"standard", let's just do it so it doesn't break anything at all...

		Ken Arnold

------------------------------

Date: Sun, 13 Jan 85 16:55:56 est
From: cbosgd!ima.UUCP!haddock!ism780!ism780b!jim
Subject: (notes) Standard C Digest - V2 #6
To: ima!cbosgd!std-c

Henry Spencer writes:
>[...]
>The current draft standard's neat new
>string-concatenation convention (adjacent string literals -- note this
>is literals only -- are concatenated at compile time) eliminates the
>need for in-string replacement as a way to build filenames out of #defined
>pieces, which to my mind was the only real need for in-string replacement.

He has confused the in-string replacement problem with the string
concatenation problem.  String concatenation in macros has previously
been achieved by

	foo/**/bar
or
	#define IDENT(x)x
	IDENT(foo)bar

and the standard's string concatenation does deal with this
sufficiently.  However, it does nothing for in-string replacement.

Henry's point that the feature was not standard by K&R is well taken
(however snidely put), although it is not reasonable to assert that K&R
*outlaw* replacement within strings in the macro definition, since it
is fairly clear that they were referring to strings in the running
text.  However, it is important to recognize the pragmatic side of
Ken's code-breaking admonition.  In particular, it would be impossible
without in-string replacement to implement the UNIX assert library
facility, which is a macro defined as

	#define assert(EX) if (EX) ; else _assert("EX", __FILE__, __LINE__)

Therefore, I think the committee simply does not have the freedom to
exclude the in-string replacement feature.

As for

>As various people (including me) have pointed out, modifying (say) the
>OS/360-aka-MVS linker is politically impossible, however desirable and
>technically-simple it may be.

that linker does not restrict one to 6 character externals, and so I
wish Henry would quit mentioning IBM (and DEC, which allows 32
character externs in VMS) in that context.  People might question this
limitation a bit more if they realized that it was accounting for GCOS,
not MVS.
Also, several proposals for accounting for such limited environments
without modifying the host linker have been made that Henry has not
addressed (mostly he has indulged in condescending attacks).  The major
criticism of these methods has been their inconvenience, especially for
debugging.  However, such inconvenience *on these specific systems*
must be weighed against the costs to *all* C developers of a
six-character limit.

-- Jim Balter, INTERACTIVE Systems (ima!jim)

------------------------------

Date: Sat, 12 Jan 85 11:19:48 est
From: cbosgd!pegasus.UUCP!hansen
Subject: Standard C Digest - V2 #6
To: cbosgd!std-c, ihnp4!utzoo!henry

Re: #endif token-string

< Henry Spencer @ U of Toronto Zoology
< Interestingly enough, I find *no* occurrences of the trouble-causing
< syntax in rmail, deroff, eqn, learn, lint, nroff, refer, struct, troff,
< or uucp on my system.  A quick inspection of the System V sources (we
< have, but don't run, System V) also comes up empty.  So, this change
< breaks Berklix and only Berklix programs; everybody else has been
< following the CRM, which makes no provision for trailing tokens on
< #else and #endif.  This is a non-standard and implementation-dependent
< extension.

I disagree.  I too did a grep of the System Vr2 sources and found a
number of occurrences of this useful construct.  In particular, the
system header files curses.h, term.h and sys/xtproto.h all used this
construct.  So it isn't just Berklix programs that get broken.  (Note
that the curses.h is AT&T's (Mark Horton's) version of curses.h and NOT
Berkeley's version.)

I too use this construct in most of my programs and would have to make
considerable changes to get my code to compile under ANSI unless this
restriction were lifted.

Besides, what does it hurt to lift the restriction?

					Tony Hansen
					pegasus!hansen

------------------------------

Date: 12 Jan 85 23:59:14 CST (Sat)
From: cbosgd!ihnp4!utzoo!henry
Subject: #endif token-string
To: ihnp4!pegasus!hansen

> I disagree.
> I too did a grep of the System Vr2 sources and found a number of
> occurrences of this useful construct.  In particular, the system header files
> curses.h, term.h and sys/xtproto.h all used this construct.

My grep was on SysV, not SysV.2, since we don't have SysV.2 yet.  I
also observe that the examples you cite are in an area where the
Berkeley influence on SysV.2 has been strongest.

> I too use this construct in most of my programs and would have to make
> considerable changes to get my code to compile under ANSI unless this
> restriction were lifted.
>
> Besides, what does it hurt to lift the restriction?

My point was not that I'm opposed to the trailing-tokens notion -- I
tend to agree that it's a reasonable thing, although I would like to
see some restrictions (e.g. "one identifier only") for error catching
-- but that existing code which uses this construct is relying on an
implementation-dependent local extension.  While it would be nice if
the ANSI standard didn't break your code, I don't see that you have a
legitimate cause for complaint if it does.  People who want their code
to be portable simply have to pay attention to such issues, and avoid
nonstandard extensions.  Even if it hurts.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

End of mod.std.c Digest - Mon, 28 Jan 85 20:33:21 EST
******************************
USENET -> posting only through cbosgd!std-c.
ARPA -> replies to cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) below.
-- 
Orlando Sotomayor-Diaz	/AT&T Bell Laboratories, Red Hill Road
			/Middletown, New Jersey, 07748 (HR 1B 316)
Tel: 201-949-9230	/UUCP: {ihnp4, houxm}!homxa!osd7