sdm@cs.brown.edu (Scott Meyers) (05/22/89)
Consider the following C++ source line:
//**********************
How should this be treated by the C++ compiler? The GNU g++ compiler
treats this as a comment-to-EOL followed by a bunch of asterisks, but the
AT&T compiler treats it as a slash followed by an open-comment delimiter.
I want the former interpretation, and I can't find anything in Stroustrup's
book which indicates that any other interpretation is to be expected.
Actually, compiling -E quickly shows that the culprit is the preprocessor,
so my questions are:
1. Is this a bug in the AT&T preprocessor? If not, why not? If so,
will it be fixed in 2.0, or are we stuck with it?
2. Is it a bug in the GNU preprocessor? If so, why?
Scott Meyers
sdm@cs.brown.edu
chapman@eris.berkeley.edu (Brent Chapman) (05/23/89)
In article <6957@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
>Consider the following C++ source line:
>
>    //**********************
>
>How should this be treated by the C++ compiler?  The GNU g++ compiler
>treats this as a comment-to-EOL followed by a bunch of asterisks, but the
>AT&T compiler treats it as a slash followed by an open-comment delimiter.
>I want the former interpretation, and I can't find anything in Stroustrup's
>book which indicates that any other interpretation is to be expected.

I'm new to C++, but C has always used a "greedy" parsing algorithm (that is,
it always takes the longest possible next token); I don't know why C++ would
do otherwise.  From K&R, p. 179:

    If the input stream has been parsed into tokens up to a given character,
    the next token is taken to include the longest string of characters which
    could possibly constitute a token.

g++ is doing the right thing, and the AT&T compiler is wrong.

-Brent
--
Brent Chapman                           Capital Market Technology, Inc.
Computer Operations Manager             1995 University Ave., Suite 390
{lll-tis,ucbvax!cogsci}!capmkt!brent    Berkeley, CA  94704
capmkt!brent@{lll-tis.arpa,cogsci.berkeley.edu}    Phone:  415/540-6400
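The longest-token rule quoted from K&R can be sketched in a few lines of C++. This is only an illustration: the token table `kTokens` and the helpers `longest_token_at` and `tokenize` are hypothetical names invented here, the table holds a handful of tokens rather than the full C++ set, the input is assumed to be a single line, and block comments are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, deliberately tiny token table (an assumption for this sketch).
static const std::vector<std::string> kTokens = {
    "//", "/*", "++", "+", "/", "*"
};

// Return the longest entry of kTokens that matches src at position pos,
// or an empty string if no entry matches there.
std::string longest_token_at(const std::string& src, std::size_t pos) {
    std::string best;
    for (const std::string& tok : kTokens) {
        if (src.compare(pos, tok.size(), tok) == 0 && tok.size() > best.size()) {
            best = tok;
        }
    }
    return best;
}

// Split one line into tokens, always taking the longest match first.
// A "//" token swallows the rest of the line as a single comment token.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos < src.size()) {
        std::string tok = longest_token_at(src, pos);
        if (tok == "//") {
            out.push_back(src.substr(pos));  // comment runs to end of line
            break;
        }
        if (tok.empty()) {
            tok = src.substr(pos, 1);        // fall back to a single character
        }
        out.push_back(tok);
        pos += tok.size();
    }
    return out;
}
```

Under these assumptions, `tokenize("//**********************")` produces one comment token, because `//` is found at position 0 before `/*` at position 1 is ever considered, and `tokenize("a+++++b")` produces `a`, `++`, `++`, `+`, `b`.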
ark@alice.UUCP (Andrew Koenig) (05/23/89)
In article <6957@brunix.UUCP>, sdm@cs.brown.edu (Scott Meyers) writes:
> Consider the following C++ source line:
>     //**********************
> How should this be treated by the C++ compiler?

It's a // comment followed by a bunch of *'s.

However, many C preprocessors strip comments out of your
program and also don't recognize C++ comments.  Thus by the
time the C++ compiler sees this, it looks like this:

    /

AT&T does not supply a preprocessor with its C++ translator,
any more than it supplies a linker, assembler, or C compiler.
It's up to whoever ports C++ to deal with the preprocessor.
--
                --Andrew Koenig
                  ark@europa.att.com
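The effect described here, where only a lone slash survives, can be reproduced with a small C++ sketch of a comment stripper that knows about /* ... */ but not about //. The function name `strip_c_comments_only` is hypothetical, and the sketch assumes (as the post describes) that comments are simply deleted and that an unterminated comment swallows the rest of the input. It ignores string literals, which a real preprocessor would not.

```cpp
#include <cstddef>
#include <string>

// Delete /* ... */ comments only, the way a preprocessor that predates
// C++ "//" comments might. An unterminated comment consumes the rest of
// the input. String literals are NOT handled (assumption of this sketch).
std::string strip_c_comments_only(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "/*") == 0) {
            std::size_t end = src.find("*/", i + 2);
            if (end == std::string::npos) {
                break;                // comment never closes: drop the rest
            }
            i = end + 2;              // skip past the closing delimiter
        } else {
            out += src[i++];          // ordinary character, copy through
        }
    }
    return out;
}
```

Given the line from the original question, the first `/` is copied, the following `/*` opens a comment, and nothing closes it, so `strip_c_comments_only("//**********************")` returns `"/"`.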
hansen@pegasus.ATT.COM (Tony L. Hansen) (05/23/89)
<>Consider the following C++ source line:
<>
<>    //**********************
<>
<>How should this be treated by the C++ compiler?  The GNU g++ compiler
<>treats this as a comment-to-EOL followed by a bunch of asterisks, but the
<>AT&T compiler treats it as a slash followed by an open-comment delimiter.
<>I want the former interpretation, and I can't find anything in Stroustrup's
<>book which indicates that any other interpretation is to be expected.
<
<I'm new to C++, but C has always used a "greedy" parsing algorithm (that is,
<it always takes the longest possible next token); I don't know why C++ would
<do otherwise.  From K&R, p. 179:
<
<    If the input stream has been parsed into tokens up to a given character,
<    the next token is taken to include the longest string of characters which
<    could possibly constitute a token.
<
<g++ is doing the right thing, and the AT&T compiler is wrong.

Actually, the problem is with the C preprocessor being used with the cfront
compiler, not with the AT&T compiler. The G++ preprocessor deals with //
comments. Whatever vendor supplied your port of cfront should have also used
a preprocessor which understands // comments.

                    Tony Hansen
                att!pegasus!hansen, attmail!tony
                    hansen@pegasus.att.com
diamond@diamond.csl.sony.junet (Norman Diamond) (05/23/89)
In article <6957@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:

>Consider:    //**********************
>The GNU g++ compiler
>treats this as a comment-to-EOL followed by a bunch of asterisks, but the
>AT&T compiler treats it as a slash followed by an open-comment delimiter.
>I want the former interpretation,

Greedy lexing suggests that you should get what you want.  Carelessness
(imprecision) in specifying grammars permits the AT&T interpretation,
but it really should not be allowed, especially if they still can't
parse a+++++b.

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-implementing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?
skinner@saturn.ucsc.edu (Robert Skinner) (05/23/89)
In article <2900@pegasus.ATT.COM>, hansen@pegasus.ATT.COM (Tony L. Hansen) writes:
> <>Consider the following C++ source line:
> <>
> <>    //**********************
> <>
> <>How should this be treated by the C++ compiler?  The GNU g++ compiler
> <
> <    If the input stream has been parsed into tokens up to a given character,
> <    the next token is taken to include the longest string of characters which
> <    could possibly constitute a token.
> <
> <g++ is doing the right thing, and the AT&T compiler is wrong.
>
> Actually, the problem is with the C preprocessor being used with the cfront
> compiler, not with the AT&T compiler. The G++ preprocessor deals with //
> comments. Whatever vendor supplied your port of cfront should have also used
> a preprocessor which understands // comments.

the AT&T compiler uses /lib/cpp, which knows nothing about the // comment.
This is great for portability: every C compiler has a preprocessor.
Unfortunately, it is a royal PAIN when using a package like curses that
uses lots of #define macros.  You can't put the name of the macro in a //
comment, e.g.

    // this routine uses move
or
    // move the object down

without getting an argument mismatch error from cpp.

Such is the price we pay for portability and building on previous tools.

Robert
skinner@saturn.ucsc.edu
shap@polya.Stanford.EDU (Jonathan S. Shapiro) (05/23/89)
In article <24700@agate.BERKELEY.EDU> chapman@eris.berkeley.edu (Brent Chapman) writes:
>In article <6957@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
>>Consider the following C++ source line:
>>
>>    //**********************
>>
>>How should this be treated by the C++ compiler?

According to the C++ lang. definition, this is a single comment
extending to the end of the line.  Remember, however, that a
translator-based implementation applies the C preprocessor first,
which sees the /*... and eliminates it before the compiler gets a shot
at the input.

Moral of the story is: don't do this.  Whether it's right or wrong, it
isn't portable.

Jon
hansen@pegasus.ATT.COM (Tony L. Hansen) (05/23/89)
< the AT&T compiler uses /lib/cpp, which knows nothing about the // comment.

No it doesn't! The AT&T compiler, as sold, is a shell script (CC), a
compilation pass (cfront), some post-compilation pass programs (patch and
munch), and some libraries. It is up to the vendor who buys AT&T's compiler
to add in the preprocessor, C compiler and linker, and to modify CC
accordingly to find those pieces. It sounds like your vendor chose to use
/lib/cpp.

                    Tony Hansen
                att!pegasus!hansen, attmail!tony
                    hansen@pegasus.att.com
nichols@cbnewsc.ATT.COM (robert.k.nichols) (05/24/89)
In article <9383@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
|In article <6957@brunix.UUCP>, sdm@cs.brown.edu (Scott Meyers) writes:
|> Consider the following C++ source line:
|>     //**********************
|> How should this be treated by the C++ compiler?
|
|It's a // comment followed by a bunch of *'s.
|
|However, many C preprocessors strip comments out of your
|program and also don't recognize C++ comments.  Thus by the
|time the C++ compiler sees this, it looks like this:
|
|    /

If cpp is invoked with the "-C" option it will leave comments as is,
which should solve problems like the above.  This won't solve problems
with // comments in macro definitions, though.
--
.sig included at no extra charge.           |  Disclaimer: My mind is my own.
Cute quote:  ``  ''                         |
>> Bob Nichols   nichols@iexist.att.com <<  |
easterb@ucscb.UCSC.EDU (William K. Karwin) (05/24/89)
In article <6957@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
>Consider the following C++ source line:
>
>    //**********************
>
>How should this be treated by the C++ compiler?  The GNU g++ compiler
>treats this as a comment-to-EOL followed by a bunch of asterisks, but the
>AT&T compiler treats it as a slash followed by an open-comment delimiter.

Some students ran into this problem, and the
"macros-expanded-even-though-they're-in-comments" problem this school
term, in a class using C++.  We think one way to solve it is to have
in a makefile:

.c.o:
	@sed s/\\/\\/.\*// $< > $*.C
	CC $(CFLAGS) -c $*.C
	@/bin/rm -f $*.C

The sed command strips // comments and all characters following on a
line.  We are using the .c suffix for our C++ code files.

William Karwin, ...ucbvax!ucscc!ucscb!easterb
ark@alice.UUCP (Andrew Koenig) (05/24/89)
In article <7636@saturn.ucsc.edu>, easterb@ucscb.UCSC.EDU (William K. Karwin) writes:

> The sed command strips // comments and all characters following on a
> line.  We are using the .c suffix for our C++ code files.

What will it do with this?

    a = b /* *// c;
--
                --Andrew Koenig
                  ark@europa.att.com
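To see what happens to that line, here is a small C++ stand-in for the sed substitution. The function name `strip_naive` is hypothetical; it is assumed to behave like the quoted pattern on a single line, deleting everything from the first `//` to the end with no knowledge of C++ syntax.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the quoted sed substitution applied to one line:
// delete from the first "//" to end of line, knowing nothing about
// comments or string literals.
std::string strip_naive(const std::string& line) {
    std::size_t pos = line.find("//");
    return pos == std::string::npos ? line : line.substr(0, pos);
}
```

The first `//` in the example is formed by the closing `*/` of the block comment and the division operator that follows it, so `strip_naive("a = b /* *// c;")` returns `"a = b /* *"`. The division, the operand, and the semicolon are gone, and the block comment is left unterminated.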
gsf@ulysses.homer.nj.att.com (Glenn Fowler[drew]) (05/24/89)
In article <958@cbnewsc.ATT.COM>, nichols@cbnewsc.ATT.COM (robert.k.nichols) writes:
> > > Consider the following C++ source line:
> > >     //**********************
> If cpp is invoked with the "-C" option it will leave comments as is,
> which should solve problems like the above.  This won't solve problems
> with // comments in macro definitions, though.

even with -C, the lines following a //*** may be treated as a single comment:

    #define X Y
    //*****
    X
    /* X is not expanded */

to get this right the // must be recognized by each component
of the C++ compilation system
--
Glenn Fowler    (201)-582-2195    AT&T Bell Laboratories, Murray Hill, NJ
uucp: {att,decvax,ucbvax}!ulysses!gsf       internet: gsf@ulysses.att.com
jima@hplsla.HP.COM (Jim Adcock) (05/25/89)
> AT&T does not supply a preprocessor with its C++ translator,
> any more than it supplies a linker, assembler, or C compiler.
> It's up to whoever ports C++ to deal with the preprocessor.
> --
>                 --Andrew Koenig

So are preprocessor commands to be considered part of C++ *the language*,
or not?
cline@sunshine.ece.clarkson.edu (Marshall Cline) (05/25/89)
In article <7636@saturn.ucsc.edu> easterb@ucscb.UCSC.EDU (William K. Karwin) writes:
>Summary: one of many possible fixes
>Some students ran into this problem, and the
>"macros-expanded-even-though-they're-in-comments" problem this school
>term, in a class using C++.  We think one way to solve it is to have
>in a makefile:
>.c.o:   @sed s/\\/\\/.\*// $< > $*.C
>        CC $(CFLAGS) -c $*.C
>        @/bin/rm -f $*.C
>The sed command strips // comments and all characters following on a
>line.  We are using the .c suffix for our C++ code files.
>William Karwin, ...ucbvax!ucscc!ucscb!easterb

As you said, this is _one_ of _many_ fixes.  But it should be pointed out
that "sed" is ignorant of the appropriate language constructs.  Thus a
printf which is supposed to print the string constant
    "double slash (//) starts a C++ comment"
would be bashed into
    "double slash (
which would undoubtedly cause numerous syntax errors.

Any regular expression parser (like "sed") is limited to regular languages.
Even a push-down automaton (recognizing _context_free_languages_) is
insufficient.  The only correct "fix" is then a context *sensitive* language
recognizer, which is nearly as complex as a Turing Machine.

In other words, somebody's gonna have to buckle down and write a "c++pp"
(like Gnu apparently has done).

Marshall
--
    ________________________________________________________________
    Marshall P. Cline       ARPA:   cline@sun.soe.clarkson.edu
    ECE Department          UseNet: uunet!sun.soe.clarkson.edu!cline
    Clarkson University     BitNet: BH0W@CLUTX
    Potsdam, NY  13676      AT&T:   (315) 268-6591
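As a rough idea of what a // stripper needs to know about the language in order to leave that printf string alone, the C++ sketch below tracks whether it is inside a string literal, a character literal, or a block comment. The function name `strip_line_comments` is hypothetical, and the sketch makes simplifying assumptions: it does not handle backslash-newline continuations or trigraphs, and it is not a substitute for a preprocessor that understands //.

```cpp
#include <cstddef>
#include <string>

// Remove "//" comments while copying string literals, character literals,
// and /* ... */ comments through unchanged. Newlines are kept.
// Assumption: no backslash-newline continuations and no trigraphs.
std::string strip_line_comments(const std::string& src) {
    enum class State { Code, String, Char, Block, Line };
    State st = State::Code;
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (st) {
        case State::Code:
            if (c == '/' && next == '/') {
                st = State::Line;              // start of a // comment
                ++i;
            } else if (c == '/' && next == '*') {
                st = State::Block;             // start of a block comment
                out += "/*";
                ++i;
            } else {
                if (c == '"') st = State::String;
                else if (c == '\'') st = State::Char;
                out += c;
            }
            break;
        case State::String:
        case State::Char:
            out += c;
            if (c == '\\' && next != '\0') {   // copy escaped character as-is
                out += next;
                ++i;
            } else if ((st == State::String && c == '"') ||
                       (st == State::Char && c == '\'')) {
                st = State::Code;              // closing quote
            }
            break;
        case State::Block:
            out += c;
            if (c == '*' && next == '/') {     // end of block comment
                out += '/';
                ++i;
                st = State::Code;
            }
            break;
        case State::Line:
            if (c == '\n') {                   // comment ends at newline
                out += c;
                st = State::Code;
            }
            break;
        }
    }
    return out;
}
```

With this sketch, a line such as `printf("double slash (//) starts a C++ comment"); // note` keeps its string constant intact and loses only the trailing `// note`, and a line whose only `//` is formed by the end of a block comment followed by a division operator passes through unchanged.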
jmm@eci386.uucp (John Macdonald) (05/26/89)
In article <7636@saturn.ucsc.edu> easterb@ucscb.UCSC.EDU (William K. Karwin) writes:
|In article <6957@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
|>Consider the following C++ source line:
|>
|>    //**********************
|>
|
|... We think one way to solve it is to have in a makefile:
|
|.c.o:
|	@sed s/\\/\\/.\*// $< > $*.C
|	CC $(CFLAGS) -c $*.C
|	@/bin/rm -f $*.C

This will cause rare and therefore surprising problems whenever a
program has a string containing // (for example, generator programs
for: ed scripts (or any other generating programs that use pattern
matches, sed, perl, ...); JCL (did I really admit that I thought
of that example?); checks for doubled slashes in pathnames generated
by concatenating a bunch of strings).