"//" comments and #define's : problems

shankar@hpclscu.HP.COM (Shankar Unni) (07/12/89)

When fixing up a "cpp" to handle "//"-style comments, I came across an
interesting problem.

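Comments on #define's are preserved by cpp when sucking up the define, and
later, after substitution, subsequent rescans of the line simply strip out
the comment. (This is how it makes token-pasting, etc. work: in a classic
cpp, an empty comment between two names, as in a/**/b, vanishes on rescan
and leaves the names joined.)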

When the comment is a "//" comment, the following happens:

    #define a b // comment
    x a y

turns into

    x b // comment y

and then into

    x b

(I.e. the "y" is stripped out).

This is an intrinsic problem with "//"-comments on #defines, because if you
use the "pass comments" option on cpp (usually -C on sysV-based systems),
the same thing will happen (you'll get stage two above, and then cfront
will strip out the "y" token).
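
The two stages above can be reproduced with a toy model. This is only a
minimal sketch, not the real cpp: expand_keeping_comment and
strip_line_comment are hypothetical helpers, the substitution works on
whitespace-delimited words, and the macro body is assumed to be stored with
its comment intact. String literals are ignored entirely.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Stage 1 (hypothetical): replace each whitespace-delimited word equal to
// `name` with `body`, where `body` still carries its trailing comment.
std::string expand_keeping_comment(const std::string& line,
                                   const std::string& name,
                                   const std::string& body) {
    std::istringstream in(line);
    std::string word, out;
    while (in >> word) {
        if (!out.empty()) out += ' ';
        out += (word == name) ? body : word;
    }
    return out;
}

// Stage 2 (hypothetical): the rescan drops everything from "//" to the end
// of the line, then trims trailing blanks.
std::string strip_line_comment(const std::string& line) {
    std::string::size_type pos = line.find("//");
    std::string out = (pos == std::string::npos) ? line : line.substr(0, pos);
    while (!out.empty() && out.back() == ' ') out.pop_back();
    return out;
}
```

Feeding "x a y" through stage 1 with the body "b // comment" and then through
stage 2 loses the "y", because nothing marks where the comment was supposed
to end once the body has been spliced into the middle of a line.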

Question:

How should a "//"-comment on a #define work?

  (a) Completely stripped out when processing #define. Not passed through
      even when requesting "pass comments". (This is my preference).

  (b) Leave it as is. I.e. specify that a "//"-comment on a #define will
      have "undefined" (i.e. nasty and unpredictable) effects. Warn the
      user when detected.

  (c) Do some fudge like have "cpp" insert a "/*" and "*/" around the
      "//"-style comment, as in

      x b /* // comment */ y

I'm interested in hearing opinions on this (esp. from AT&T, since they are
asking their cfront customers/VARs to provide their own cpp).  Shouldn't this
be standardized somewhere?

This is one more reason to adopt a *standard* cpp (preferably an ANSI C style
cpp), with well-defined "//"-comment semantics.
-----
Shankar Unni                                   E-Mail: 
Hewlett-Packard California Language Lab.     Internet: shankar@hpda.hp.com
Phone : (408) 447-5797                           UUCP: ...!hplabs!hpda!shankar

paulsc@radio_flyer.WV.TEK.COM (Paul Scherf;685-2734;61-028;;orca) (07/12/89)

In article <1000019@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>When fixing up a "cpp" to handle "//"-style comments, I came across an
>interesting problem.

>Comments on #define's are preserved by cpp when sucking up the define, and
>later, after substitution, subsequent rescans of the line simply strip out
>the comment. (This is how it makes token-pasting, etc work).
...
>This is an intrinsic problem with "//"-comments on #defines, because if you
>use the "pass comments" option on cpp (usually -C on sysV-based systems),
>the same thing will happen (you'll get stage two above, and then cfront
>will strip out the "y" token).

>Question:

>How should a "//"-comment on a #define work?

>  (a) Completely stripped out when processing #define. Not passed through
>      even when requesting "pass comments". (This is my preference).

On the C compiler I wrote for my homebrew system, cpp doesn't preserve
comments on #defines when sucking up the define, unless the "pass
comments" option is used.  I decided this was probably ok.  I don't
particularly like the token pasting syntax specified in ANSI C, but I
can live with that, when I need to paste tokens together.

>  (b) Leave it as is. I.e. specify that a "//"-comment on a #define will
>      have "undefined" (i.e. nasty and unpredictable) effects. Warn the
>      user when detected.

This is what Bjarne Stroustrup says C++ should do in his book "The C++
Programming Language" (I only know about one edition) on page 130.  I
think my scheme limits the nasty and unpredictable effects to
programmers who are trying to be tricky (e.g. using the "pass comments"
option).

>  (c) Do some fudge like have "cpp" insert a "/*" and "*/" around the
>      "//"-style comment, as in

>      x b /* // comment */ y

I considered this, but I rejected it when I discovered that an old
C-style comment (/* ... */) can't always comment out a new C++-style
comment (// ... ).
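
One case where the wrapping breaks, presumably the kind meant here, is a
"//" comment whose text itself contains "*/". The sketch below is a toy
under that assumption: wrap_line_comment and strip_first_block_comment are
hypothetical helpers, not part of any real cpp.

```cpp
#include <cassert>
#include <string>

// Option (c), hypothetically: wrap the "//" comment in "/*" and "*/".
std::string wrap_line_comment(const std::string& comment) {
    return "/* " + comment + " */";
}

// Replace the first old-style comment with one space. An old-style comment
// ends at the FIRST "*/" after its opening "/*".
std::string strip_first_block_comment(const std::string& text) {
    std::string::size_type open = text.find("/*");
    if (open == std::string::npos) return text;
    std::string::size_type close = text.find("*/", open + 2);
    if (close == std::string::npos) return text.substr(0, open);
    return text.substr(0, open) + " " + text.substr(close + 2);
}
```

With the comment "// comment" the wrapped text strips cleanly. With
"// see */ here" the old-style comment ends early, and the tail "here */"
leaks into the program text.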

shankar@hpclscu.HP.COM (Shankar Unni) (07/13/89)

> When fixing up a "cpp" to handle "//"-style comments, I came across an
> interesting problem.

I've already received several messages pointing out page 130 of the Bible
to me.

I guess my reading of that part of the book was a little perfunctory.
However, after reading that page, I'm still uncomfortable with it. That
page treats the preprocessing phase as something separate from the C++
language, and crams the C preprocessing semantics down the throat of C++
(since the C language does not have "//" comments, let's make them work
only on the lines unaffected by the preprocessor).

The first thing we need to do is to look at the whole setup in a unified
fashion. Either macros and ifdef's are in the language, or they are not.
If they are, then the preprocessing phase is an essential part of the
language, and should work coherently with the other aspects of the
language, like "//" comments. If they are not, then that fact should be
mentioned explicitly up front. Since they are mentioned on pages 129 and
130, and also in sec 7.3.5 (generic classes), I assume that they are part
of the language.

It does not take a great effort to define precise semantics for the
preprocessing phase. For instance, in ANSI C, the precise semantics are
defined in sections 2.1.1.2 (translation phases) and 3.7 (preprocessor
directives). Incidentally, in the ANSI standard, the comments are stripped
from the macro replacement text UP FRONT (i.e. even before processing the
#define directive), so page 130 of the Bible will have to change a bit.
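
The up-front stripping can be observed with a real preprocessor by
stringizing the expansion. The sketch below assumes a compiler that accepts
"//" comments and removes them in the same phase as "/* */" comments (ANSI C
itself has no "//" comments). MAC_A, MAC_C, TO_TEXT, and TO_TEXT_RAW are
illustrative names, not anything from the standard.

```cpp
#include <cassert>
#include <string>

#define MAC_A b // comment
#define MAC_C d /* comment */

// Two levels, so the argument is macro-expanded before it is stringized.
#define TO_TEXT_RAW(s) #s
#define TO_TEXT(s) TO_TEXT_RAW(s)

// Each function returns the text that "x MACRO y" expands to.
std::string expand_with_line_comment() { return TO_TEXT(x MAC_A y); }
std::string expand_with_block_comment() { return TO_TEXT(x MAC_C y); }
```

Because each comment has already been replaced by a space before the #define
is processed, neither replacement list contains the comment, and the "y"
survives in both expansions.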

What do you say?

wrl@apple.com (Wayne Loofbourrow) (07/18/89)

In article <1000019@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) 
writes:
> How should a "//"-comment on a #define work?
> 
>   (a) Completely stripped out when processing #define. Not passed through
>       even when requesting "pass comments". (This is my preference).
> 
>   (b) Leave it as is. I.e. specify that a "//"-comment on a #define will
>       have "undefined" (i.e. nasty and unpredictable) effects. Warn the
>       user when detected.
> 
>   (c) Do some fudge like have "cpp" insert a "/*" and "*/" around the
>       "//"-style comment, as in
> 
>       x b /* // comment */ y

Another alternative is to keep all of the text of the #define, _including_ 
the "\n", as in:
        x b // comment
        y

This is somewhat like (c), although simpler.
Would this work in all cases, or would one need to detect the presence of 
a // comment first?
(Note: it would only be used if user specifies the "pass comments" option)

----------------
Wayne Loofbourrow
wrl@apple.com
Apple Computer

scott@cs.ua.oz.au (Scott Davis) (07/21/89)

In article <1000019@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>   When fixing up a "cpp" to handle "//"-style comments, I came across an
>   interesting problem.
>
>   Comments on #define's are preserved by cpp when sucking up the define, and
>   later, after substitution, subsequent rescans of the line simply strip out
>   the comment. (This is how it makes token-pasting, etc work).
>
>	[further discussion of problem]
>
>   Question:
>
>   How should a "//"-comment on a #define work?
>
>     (a) Completely stripped out when processing #define. Not passed through
>	 even when requesting "pass comments". (This is my preference).
>
>     (b) Leave it as is. I.e. specify that a "//"-comment on a #define will
>	 have "undefined" (i.e. nasty and unpredictable) effects. Warn the user
>	 when detected.
>
>     (c) Do some fudge like have "cpp" insert a "/*" and "*/" around the
>	 "//"-style comment, as in
>
>	 x b /* // comment */ y
>
>   This is one more reason to adopt a *standard* cpp (preferably an ANSI C 
>   style cpp), with well-defined "//"-comment semantics.
>

What about:
      (d) Insert the carriage return which terminates the comment as well, as
in 
	x b // comment
	y

OK, so this doesn't look quite the same, but at least it is semantically
correct, and the comments *are* still there if you want to read them.

This is the method I would expect, since the comment is terminated by end of
line; it doesn't just fade off to some indeterminate point. The carriage return
is as much a part of the comment as the '//' which started it.

You don't strip off the */ from the end of a C-style comment, just leaving the
/* at the beginning, so why treat C++-style comments any differently?
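
A toy model of suggestion (d), assuming the macro body is stored together
with the newline that ends its comment. Both helpers are hypothetical, and
string literals are ignored.

```cpp
#include <cassert>
#include <string>

// Hypothetical: splice in a body that still ends in its newline.
std::string expand_with_newline(const std::string& before,
                                const std::string& body_with_newline,
                                const std::string& after) {
    return before + body_with_newline + after;
}

// Strip "//" comments line by line, keeping every newline.
std::string strip_line_comments(const std::string& text) {
    std::string out;
    bool in_comment = false;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\n') {
            in_comment = false;
            out += '\n';
            continue;
        }
        if (!in_comment && text[i] == '/' &&
            i + 1 < text.size() && text[i + 1] == '/')
            in_comment = true;
        if (!in_comment) out += text[i];
    }
    return out;
}
```

The "y" now survives the later comment stripping, at the price of one source
line becoming two output lines, so anything downstream that counts lines
would see them shift.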

Comments?

Scott.
--
Scott Davis				ACSnet: scott@cs.ua.oz{.au}
Honours student,			 other: scott@cs.ua.oz@uunet.uu.net
Department of Computer Science,
University of Adelaide.
Australia.

shankar@hpclscu.HP.COM (Shankar Unni) (07/25/89)

> What about:
>       (d) Insert the carriage return which terminates the comment as well,
>            as in 
> 	x b // comment
> 	y

I'm uncomfortable about this, for a couple of reasons:

  (a) You're overloading the Newline. If you suck up the newline as part
      of the comment, then the #define mechanism should not see the newline.

      Consider, for example,
      
        #define foo bar  /*
                         */
      
      You surely wouldn't treat the embedded newline as a #define terminator.
      In your example, if you treat the newline as part of the comment, then
      you've got the same thing - a newline embedded in a comment, devoid of
      its special properties with respect to #define's. Strictly speaking,
      the preprocessor should then continue on to the next line and tack
      those tokens onto the expansion list.
      
  (b) Consider, also, tokenizing preprocessors (not the character-based ones
      like the classic /lib/cpp). These preprocessors would convert input
      into discrete tokens (and would convert newlines into a unique,
      white-space-like token so that the #define mechanism would see it
      separately from other whitespace).
      
      If you asked such a preprocessor to suck up the newline as part of
      the "comment" token, it would "lose" the newline (after all, the same
      character cannot be a part of two different tokens).
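
Point (b) can be illustrated with a small tokenizer sketch. It assumes a
deliberately simplified token model (words, "//" comments, and newlines
only); Kind, Token, and tokenize are hypothetical names, not taken from any
real preprocessor.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Word, LineComment, Newline };

struct Token {
    Kind kind;
    std::string text;
};

// Hypothetical tokenizer: a "//" comment token stops just before the
// newline, which is always emitted as its own token.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::string::size_type i = 0;
    while (i < src.size()) {
        if (src[i] == '\n') {
            tokens.push_back({Kind::Newline, "\n"});
            ++i;
        } else if (src[i] == ' ' || src[i] == '\t') {
            ++i;  // other whitespace is simply dropped in this sketch
        } else if (src.compare(i, 2, "//") == 0) {
            std::string::size_type end = src.find('\n', i);
            if (end == std::string::npos) end = src.size();
            tokens.push_back({Kind::LineComment, src.substr(i, end - i)});
            i = end;
        } else {
            std::string::size_type start = i;
            while (i < src.size() && src[i] != ' ' && src[i] != '\t' &&
                   src[i] != '\n' && src.compare(i, 2, "//") != 0)
                ++i;
            tokens.push_back({Kind::Word, src.substr(start, i - start)});
        }
    }
    return tokens;
}
```

In this model the newline after "// comment" belongs to the Newline token
that ends the directive, so it cannot also travel with the comment into the
macro body.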

In fact, the approach that I ended up taking is to strip out "//" comments
from #defines entirely. They will not show up even if "-C" is requested.
It's a minor aesthetic wart, but semantically it's much more satisfying.
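
A minimal sketch of that kind of stripping, not the actual code that went
into the cpp described here. strip_define_line_comment is a hypothetical
helper that is handed the replacement text of one #define; it leaves "//"
alone inside a string literal or a "/* */" comment, and does not attempt to
handle character constants.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop a trailing "//" comment from the replacement
// text of a #define, then trim trailing blanks and tabs.
std::string strip_define_line_comment(const std::string& body) {
    bool in_string = false, in_block = false;
    std::string::size_type cut = body.size();
    for (std::string::size_type i = 0; i < body.size(); ++i) {
        char ch = body[i];
        char next = (i + 1 < body.size()) ? body[i + 1] : '\0';
        if (in_block) {
            if (ch == '*' && next == '/') { in_block = false; ++i; }
        } else if (in_string) {
            if (ch == '\\') ++i;  // skip the escaped character
            else if (ch == '"') in_string = false;
        } else if (ch == '"') {
            in_string = true;
        } else if (ch == '/' && next == '*') {
            in_block = true;
            ++i;
        } else if (ch == '/' && next == '/') {
            cut = i;  // the "//" comment starts here
            break;
        }
    }
    std::string out = body.substr(0, cut);
    while (!out.empty() && (out.back() == ' ' || out.back() == '\t'))
        out.pop_back();
    return out;
}
```

Since the comment is gone before the body is ever stored, later expansion
and rescanning never see it, with or without "-C".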