ok@cs.mu.oz.au (Richard O'Keefe) (09/04/89)
I've just tried to compile a medium-sized program using a compiler which tries to be ANSI-compliant. The program was full of large comments like

    #if 0
    <oodles of English text and C examples>
    #endif

The compiler sees words like "don't" in the English text and snarls that these are unterminated character constants. This used to be perfectly good C, and whatever the reason for ANSI C breaking well-commented programs, I personally think the cost is way too high. Spilled milk, alas.

The editor I use has a "comment-out-region" command which inserts spaces between asterisks and slashes so that I can fix this by deleting the #if 0 and #endif and using that command. I'd rather not: having / * a nested comment * / in a file is asking for some clown to come along later and "correct" it.

Is there a way of having mixed English text and C fragments (possibly including C comment delimiters) in a C program which _will_ work in ANSI C?
karl@haddock.ima.isc.com (Karl Heuer) (09/05/89)
In article <2014@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: > #if 0 ... [English text and C examples] ... #endif >The compiler sees words like "don't" in the English text and snarls that >these are unterminated character constants. This used to be perfectly >good C... No, what you really mean is that you used to use a compiler that happened not to complain about that, and now you use one that does. The introduction of ANSI C between the two events is largely coincidental. It never was good C. Generally, if you want to hide English text from the compiler, you should use /*...*/, whereas if you want to hide C code, you should use #if...#endif. I infer from your posting that this particular instance is English text that happens to include C code within the running text, i.e. this C code is meant to be read by humans rather than being a piece of former code that's been inactivated. Probably your best bet is to move the comments into the English text, omitting the comment delimiters. >Is there a way of having mixed English text and C fragments (possibly >including C comment delimiters) in a C program which _will_ work in ANSI C? Alas, there is no perfect commenting convention; unless it's all done with an awkward prefix like a Hollerith constant, there's always something that cannot appear literally in a comment. In C it's the string "*/", and I'm afraid you'll have to live with it, or find a new language. (C++ comes to mind. It has remainder-of-line comments, which are okay if you don't mind putting the prefix on each line.) Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
ok@cs.mu.oz.au (Richard O'Keefe) (09/05/89)
In article <2014@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: > #if 0 ... [English text and C examples] ... #endif >The compiler sees words like "don't" in the English text and snarls that >these are unterminated character constants. This used to be perfectly >good C... In article <14512@haddock.ima.isc.com>, karl@haddock.ima.isc.com (Karl Heuer) writes: > No, what you really mean is that you used to use a compiler that happened not > to complain about that, and now you use one that does. The introduction of > ANSI C between the two events is largely coincidental. It never was good C. It is not at all coincidental. There was never any suggestion in any book that #if 0-ed code had to have balanced single quotes, and the only C compiler I've ever come across (I've used about a dozen) that complains about it is the only one I've used that tries to be ANSI-compliant. I've even seen the device recommended in this newsgroup (I'd name names, but I've forgotten them). There are good reasons why a conditional inclusion facility should not treat omitted text as tokens (e.g. some new ANSI tokens were not legal before, like 2U, and some old ones, like 09, aren't legal now, and some compilers support extended syntax). For all programming languages X, good comments are good X.
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/05/89)
In article <2014@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
> #if 0
> <oodles of English text and C examples>
> #endif
>The compiler sees words like "don't" in the English text and snarls that
>these are unterminated character constants. This used to be perfectly
>good C, and whatever the reason for ANSI C breaking well-commented programs,
>I personally think the cost is way too high. Spilled milk, alas.
There were C compilers that would have the same problem with such "code" before ANSI C. Generally, it's a problem for tokenizing preprocessors.
>Is there a way of having mixed English text and C fragments (possibly
>including C comment delimiters) in a C program which _will_ work in ANSI C?
/*
    English stuff
        sample_code();  // sample comment
    more English stuff
*/
C is not WEB.
diomidis@ecrcvax.UUCP (Diomidis Spinellis) (09/06/89)
In article <2023@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
>In article <2014@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
>> #if 0 ... [English text and C examples] ... #endif
>>The compiler sees words like "don't" in the English text and snarls that
>>these are unterminated character constants. This used to be perfectly
>>good C...
[...]
>It is not at all coincidental. There was never any suggestion in any book
>that #if 0-ed code had to have balanced single quotes, and the only C
>compiler I've ever come across (I've used about a dozen) that complains
>about it is the only one I've used that tries to be ANSI-compliant.
ecrcvax% cat t.c
#if 0
#funny_text
/*
#endif
ecrcvax% /bin/cc -c t.c
t.c: 2: undefined control
t.c: 5: unterminated comment
Vanilla pcc complains about unbalanced comments and undefined preprocessor controls inside #if 0 blocks. Commenting out arbitrary text with #if 0 is not a safe practice for most compilers (both ANSI and classic C).
Diomidis
--
Diomidis Spinellis European Computer-Industry Research Centre (ECRC) Arabellastrasse 17, D-8000 Muenchen 81, West Germany +49 (89) 92699199 USA: diomidis%ecrcvax.uucp@pyramid.pyramid.com ...!pyramid!ecrcvax!diomidis Europe: diomidis@ecrcvax.uucp ...!unido!ecrcvax!diomidis
henry@utzoo.uucp (Henry Spencer) (09/06/89)
In article <2014@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: >The compiler sees words like "don't" in the English text and snarls that >these are unterminated character constants. This used to be perfectly >good C... No, it used to be borderline C that some compilers would accept but some wouldn't. It is now definitely not legal C, X3J11 having cleared up most of the borderline areas. -- V7 /bin/mail source: 554 lines.| Henry Spencer at U of Toronto Zoology 1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
henry@utzoo.uucp (Henry Spencer) (09/06/89)
In article <2023@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: >... There was never any suggestion in any book >that #if 0-ed code had to have balanced single quotes, ... There was never any suggestion that it didn't, either. One should avoid the mistake of assuming that books contain all the answers, and that the apparent lack of a definite answer can always be resolved by a sufficiently narrow and legalistic reading. The fact is, this little detail simply was never specified properly. >the only C >compiler I've ever come across (I've used about a dozen) that complains >about it is the only one I've used that tries to be ANSI-compliant. This is coincidence. Let me guess -- most of those dozen were Unix compilers, right? Then you haven't used a dozen compilers, you've used one or two, because most of the Unix ones are the same compiler under the hood. Compilers that would not accept your construct existed long before X3J11 got started. -- V7 /bin/mail source: 554 lines.| Henry Spencer at U of Toronto Zoology 1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
oz@yunexus.UUCP (Ozan Yigit) (09/07/89)
In article <10935@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes: > >C is not WEB. This is extremely pompous. That peculiar (WEB-like) utilization of CPP is not new. It is unusable now, and all you can say is mumble about "tokenizing preprocessors". if you have something useful to say, why not say it like the respectable oldtimer you are, instead of being boorish ?? oz -- The king: If there's no meaning Usenet: oz@nexus.yorku.ca in it, that saves a world of trouble ......!uunet!utai!yunexus!oz you know, as we needn't try to find any. Bitnet: oz@[yulibra|yuyetti] Lewis Carroll (Alice in Wonderland) Phonet: +1 416 736-5257x3976
lehners@uniol.UUCP (Joerg Lehners) (09/07/89)
diomidis@ecrcvax.UUCP (Diomidis Spinellis) writes:
>[some text deleted]
>ecrcvax% cat t.c
>#if 0
>#funny_text
>/*
>#endif
>ecrcvax% /bin/cc -c t.c
>t.c: 2: undefined control
>t.c: 5: unterminated comment
>Vanilla pcc complains about unbalanced comments and undefined preprocessor
>controls inside #if 0 blocks. Commenting out arbitrary text with #if 0 is
>not a safe practice for most compilers (both ANSI and classic C).
I think the preprocessor must look at the code following the #if 0 because it has to find the corresponding #endif. And the rules for the preprocessor are: don't do any substitution or interpretation inside string literals (""), character constants ('') and comments (/* */). Still, a non-interpreting #if 0 / #endif pair would be very nice: it would let you (totally) comment out a whole chunk of untested code that is not balanced with respect to " and ' but is balanced with respect to /* */, without worrying about nested comments.
Joerg
--
/ Joerg Lehners | Fachbereich 10 Informatik ARBI \ | | Universitaet Oldenburg | | BITNET/EARN: 066065@DOLUNI1.BITNET | Ammerlaender Heerstrasse 114-118 | | UUCP/Eunet: lehners@uniol.uucp | D-2900 Oldenburg | +-------------------------------------+----------------------------------+ \ Unix-Wizards: let's zap away all stupid users. /
ok@cs.mu.oz.au (Richard O'Keefe) (09/07/89)
[I wrote]
: >> #if 0 ... [English text and C examples] ... #endif
: >>The compiler sees words like "don't" in the English text and snarls that
                                  ^^^^^
: >It is not at all coincidental. There was never any suggestion in any book
: >that #if 0-ed code had to have balanced single quotes, and the only C
                                  ^^^^^^^^^^^^^^^^^^^^^^
In article <766@ecrcvax.UUCP>, diomidis@ecrcvax.UUCP (Diomidis Spinellis) writes:
> Vanilla pcc complains about unbalanced comments and undefined preprocessor
                              ^^^^^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^
> controls inside #if 0 blocks.
I wasn't whining about unbalanced comments, just English text like "don't". The bottom line is that my belief that #if 0 used to work in C is wrong, I had just been lucky in never running into any of the nasty bits, and there isn't any safe way of including large chunks of text in a C program. Gad, this makes F-----N look good. To think that C is a descendant of BCPL, which had //end-of-line comments.
ok@cs.mu.oz.au (Richard O'Keefe) (09/07/89)
In article <1989Sep6.163608.20143@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes: > >the only C > >compiler I've ever come across (I've used about a dozen) that complains > >about it is the only one I've used that tries to be ANSI-compliant. > > This is coincidence. Let me guess -- most of those dozen were Unix > compilers, right? I guessed wrong about what the books didn't say (#if 0 is no good). You guessed wrong about what I didn't say (which compilers). Not all UNIX compilers are PCC, either. I've used four that weren't. Is Apollo's C a PCC?
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/07/89)
In article <3626@yunexus.UUCP> oz@yunexus.UUCP (Ozan Yigit) writes: >In article <10935@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes: >>C is not WEB. >This is extremely pompous. That peculiar (WEB-like) utilization of CPP is >not new. It is unusable now, and all you can say is mumble about >"tokenizing preprocessors". if you have something useful to say, why not >say it like the respectable oldtimer you are, instead of being boorish ?? Excuse me -- I didn't realize that X3J11's printing of a bunch of pieces of paper caused your compiler to stop accepting your already nonportable abuse of the preprocessor. A lot of people say "the Standard broke [whatever]". The proposed Standard breaks nothing. Standard-conforming C implementations may well not produce the same results as others have been producing, but there has always been that degree of variation among C compilers. The one thing the Standard does is make it simpler to assure program portability among Standard-conforming compilers, for which the rules are relatively clearly defined. I think a lot of these gripes are the result of enhanced awareness about variations in (already existing!) C environments that discussion about the proposed Standard brings out.
news@ism780c.isc.com (News system) (09/08/89)
In article <1989Sep6.163608.20143@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >In article <2023@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: >>... There was never any suggestion in any book >>that #if 0-ed code had to have balanced single quotes, ... > >There was never any suggestion that it didn't, either. One should avoid >the mistake of assuming that books contain all the answers, ... This assertion puzzles me. Are you saying I won't be able to rely on the ANSI standard to answer questions about the language? I assume it will be in book form if and when it is published. Marv Rubinstein
news@ism780c.isc.com (News system) (09/08/89)
In article <10969@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>A lot of people say "the Standard broke [whatever]". The proposed
>Standard breaks nothing. Standard-conforming C implementations may
>well not produce the same results as others have been producing, but
>there has always been that degree of variation among C compilers.
Here is an example of *well defined* code that produces different results:

    main()
    {
        unsigned char c=1;
        if (c-2>0) <true-part> else <false-part>
    }

The 'usual conversion rules' in K&R require that the expression c-2 be unsigned, and therefore the <true-part> executes. The ANSI 'usual conversion rules' require that the expression c-2 be signed, and therefore the <false-part> executes. This change and other similar ones are called 'quiet changes'. These are direct quotes from the Rationale:

    "a string of the form "\078" is valid, but now has a different meaning"
    "A string of the form "\a" or "\x" now has a different meaning"
    "A program that depends upon unsigned preserving arithmetic conversions will behave differently, probably without complaint. This is considered the most serious semantic change made by the Committee to a wide spread current practice".

The standard has clarified the semantics of many cases where different compilers produced different results. But it also changed the semantics of several cases where there was no ambiguity in the base document (K&R). The standard does indeed 'break' things. The only way I can read Doug's assertion (such that it is true) is to say: since there was no standard for C before THE standard, no programs had a well defined behavior so nothing got broken.
Marv Rubinstein
PS: I have heard that Unix release 5.4 uses two compilers. One to compile Unix and one (ANSI) for new programs. Does anyone know if this is true?
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/08/89)
In article <32905@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes: >In article <10969@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes: >>A lot of people say "the Standard broke [whatever]". The proposed >>Standard breaks nothing. Standard-conforming C implementations may >>well not produce the same results as others have been producing, but >>there has always been that degree of variation among C compilers. >Here is an example of *well defined* code that produces different results: Different from K&R I, perhaps, but not necessarily from widespread practice. Even AT&T's own PCC didn't follow all the specifications in K&R I. Many times it was necessary for X3J11 to consider all the existing implementations and weigh them against the "base document". In the case of unsigned- preserving vs. value-preserving default promotion rules, both methods were in use, and the committee opted for the one that would cause the fewest unpleasant surprises for the programmer (i.e. value-preserving). > "a string of the form "\078" is valid, but now has a different > meaning" Yes, it now means what the programmer would reasonably expect, since 8 is no longer considered an "octal digit". There were implementations that did this right instead of the way Ritchie originally implemented it. > "A string of the form "\a" or "\x" now has a different meaning" However, nobody is likely to have written "a" as "\a" in existing code. >The standard has clarified the semantics of many cases where different >compilers produced different results. But it also changed the semantics of >several cases where there was no ambiguity in the base document (K&R). The >standard does indeed 'break' things. As I tried to make clear, these things were already "broken" one way or another, if no other way than by having widespread variation in existing implementations. 
>The only way I can read Doug's assertion (such that it is true) is to say: >since there was no standard for C before THE standard, no programs had >a well defined behavior so nothing got broken. That is not far from the truth. PCC was widely regarded as the criterion for "what C was", even though it differed from the spec in K&R in several ways. Other people would have argued that K&R I was "what C was", even though several important widely-accepted useful extensions occurred after the book was published. And so on. Anyway, you DROPPED THE CONTEXT to which my comment was a response. The complaint that existing nonportable code that exploited a specific interpretation of a "grey area" of the language spec was "broken" by the proposed Standard was unfounded. The grey area was clarified in the best way we could, weighing many factors in the process, in such a way that that particular exploitation has been shown up for the risky affair that it always was.
henry@utzoo.uucp (Henry Spencer) (09/08/89)
In article <32896@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes: >>There was never any suggestion that it didn't, either. One should avoid >>the mistake of assuming that books contain all the answers, ... > >This assertion puzzles me. Are you saying I won't be able to rely on the >ANSI standard to answer questions about the language? I assume it will be in >book form if and when it is published. You will be able to rely on the ANSI standard to answer *most* questions about the language, since it has been prepared with far greater care than most books about C. However, there will undoubtedly still be questions that it won't answer, given that it was prepared by human beings and not by gods. (There's only one C god around -- DMR -- and he wasn't deeply involved. :-)) When you encounter such a question, you should not say "well, the standard doesn't answer this directly, but if I twist the wording and put strange interpretations on a few of the terms, I can construe the footnote on page 357 to be a partial answer". You should say "the standard does not answer this question". Period. At which point either you put a formal query into the ANSI "interpretation of standards" queue, or you conclude that your code should not rely on any specific answer to that question. -- V7 /bin/mail source: 554 lines.| Henry Spencer at U of Toronto Zoology 1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
tneff@bfmny0.UU.NET (Tom Neff) (09/09/89)
In article <1989Sep8.154522.17068@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >You will be able to rely on the ANSI standard to answer *most* questions >about the language, since it has been prepared with far greater care than >most books about C. However, there will undoubtedly still be questions >that it won't answer, given that it was prepared by human beings and not >by gods... So X3J11 spent all those years to be able to answer *most* questions about C. I thought we could already do that when they started.
scjones@sdrc.UUCP (Larry Jones) (09/09/89)
In article <32905@ism780c.isc.com>, news@ism780c.isc.com (Marv Rubinstein) writes:
> Here is an example of *well defined* code that produces different results:
>
> main()
> {
>     unsigned char c=1;
>     if (c-2>0) <true-part> else <false-part>
> }
>
> The 'usual conversion rules' in K&R require that the expression c-2 be
> unsigned, and therefore the <true-part> executes. The ANSI 'usual conversion
> rules' require that the expression c-2 be signed, and therefore the
> <false-part> executes. This change and other similar ones are called 'quiet changes'.
On the contrary, K&R has no concept of unsigned char, only unsigned int. That's why different implementers picked different promotion rules for unsigned char and unsigned short, and thus why X3J11 had to decide one way or the other. Value preserving promotion rules were not just made up on the spot!
> PS: I have heard that Unix release 5.4 uses two compilers. One to compile
> Unix and one (ANSI) for new programs. Does anyone know if this is true?
AT&T compiler people are fond of saying that every command line switch creates a new compiler. I suspect that is the genesis of the rumor you heard.
----
Larry Jones UUCP: uunet!sdrc!scjones SDRC scjones@SDRC.UU.NET 2000 Eastman Dr. BIX: ltl Milford, OH 45150-2789 AT&T: (513) 576-2070 "I have plenty of good sense. I just choose to ignore it." -Calvin
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)
In article <14640@bfmny0.UU.NET> tneff@bfmny0.UU.NET (0000-Admin(0000)) writes: -In article <1989Sep8.154522.17068@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: ->You will be able to rely on the ANSI standard to answer *most* questions ->about the language, since it has been prepared with far greater care than ->most books about C. However, there will undoubtedly still be questions ->that it won't answer, given that it was prepared by human beings and not ->by gods... -So X3J11 spent all those years to be able to answer *most* questions -about C. -I thought we could already do that when they started. Maybe "most" is now 99% instead of 51%.
ok@cs.mu.oz.au (Richard O'Keefe) (09/09/89)
In article <1989Sep8.154522.17068@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes: > ... When you encounter such a question, ... At which > point either you put a formal query into the ANSI "interpretation of > standards" queue, How (once the standard comes out) will we do that? comp.std.c? send paper-mail somewhere? Is this going to be done the way Ada was? [I imagine the practical answer for many people will continue to be "ask comp.lang.c", but if there's a right way I'd like to do it that way.]
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)
In article <2066@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes: >In article <1989Sep8.154522.17068@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes: >> you put a formal query into the ANSI "interpretation of standards" queue, >How (once the standard comes out) will we do that? Instructions will be posted, but basically you'll send a request for interpretation of a particular point in the C Standard to the X3 Secretariat of CBEMA, referring to ANSI Std X3.159-1989 (we hope it's -1989!). Make sure you actually do need an interpretation, because many of the questions people have been answering have clear answers in the Standard and may receive essentially a "RTFS" reply, or perhaps "RTFR" (R=Rationale).
henry@utzoo.uucp (Henry Spencer) (09/10/89)
In article <14640@bfmny0.UU.NET> tneff@bfmny0.UU.NET (0000-Admin(0000)) writes: >>You will be able to rely on the ANSI standard to answer *most* questions... > >So X3J11 spent all those years to be able to answer *most* questions >about C. > >I thought we could already do that when they started. The semantics of "most" have changed. :-) It is undoubtedly true that the standard will not answer all questions about the language. However, it gives single, specific, unambiguous answers to a far greater fraction of the questions than any previous document. Just *try* using K&R1 to figure out the exact output from a tricky preprocessor example. -- V7 /bin/mail source: 554 lines.| Henry Spencer at U of Toronto Zoology 1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
seanf@sco.COM (Sean Fagan) (09/10/89)
In article <14640@bfmny0.UU.NET> tneff@bfmny0.UU.NET (0000-Admin(0000)) writes: >So X3J11 spent all those years to be able to answer *most* questions >about C. >I thought we could already do that when they started. X3J11 *created* a standard; they did not recognize one. In order to increase acceptance, they tried to make their standard as "standard" as possible. However, since C has been implemented on many different machines and operating systems, of various architectures (from the sane CDC Cyber 170-state machines to the insane nameless ones 8-)), and has features that few other languages offer (other than assembly), making an instantly portable standard is (and was) quite impossible. For the *most* part, the language which X3J11 created is upwardly compatible with a sort of merging of most of the then-currently existing C compilers. It is not downward compatible. Do not expect non-conforming compilers to behave as the dpANS dictates, although, in most cases, it will be close enough that the only C reference I use is the draft (and the sources, of course 8-)). X3J11 spent all of those years trying to come up with something that could (and, hopefully, would) be implemented by people who had already written compilers, so that it could be in use soon enough. Changing the language is Bad: consider FORTRAN 8x. There. Now I feel better 8-). -- Sean Eric Fagan | "Time has little to do with infinity and jelly donuts." seanf@sco.COM | -- Thomas Magnum (Tom Selleck), _Magnum, P.I._ (408) 458-1422 | Any opinions expressed are my own, not my employers'.
ckl@uwbln.UUCP (Christoph Kuenkel) (09/11/89)
In article <32905@ism780c.isc.com>, news@ism780c.isc.com (News system) writes:
> PS: I have heard that Unix release 5.4 uses two compilers. One to compile
> Unix and one (ANSI) for new programs. Does any one know if this is true?
It seems to be one compiler that supports three modes:
- transitional mode (supports ``full ansi'', accepts any k&r source, uses k&r semantics, warns if semantics differ)
- ansi mode (supports ``full ansi'', uses ansi semantics, warns if semantics differ)
- ansi conforming mode (supports ``full ansi'', uses ansi semantics, warns if semantics differ, no non ansi names in namespace)
modes are selected with a compiler flag (-X{t,a,c}). this information is from the at&t/sun software developer conference.
christoph