[comp.std.c] warning: '/*' within comment

darcy@druid.uucp (D'Arcy J.M. Cain) (06/02/90)

I have the following in a program I compiled with GNU C 1.36 on a 386
system.

/*
Sample usage:
	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
*/

This gives me the above mentioned warning.  Since I like silent compiles
I modified the code by enclosing the problem line in quotes as follows:

/*
Sample usage:
	"mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt"
*/

However I got the same warning.  Is there anything in the standard that
allows the compiler to ignore the quotes while inside a comment?  After
all, quotes are required to be balanced.  It seems to me that it should
work both ways and the compiler should be cognizant of quoted strings
and not look for nested comments within them.

-- 
D'Arcy J.M. Cain (darcy@druid)     |   Government:
D'Arcy Cain Consulting             |   Organized crime with an attitude
West Hill, Ontario, Canada         |
(416) 281-6094                     |

henry@utzoo.uucp (Henry Spencer) (06/02/90)

In article <1990Jun1.200433.6919@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>However I got the same warning.  Is there anything in the standard that
>allows the compiler to ignore the quotes while inside a comment?  After
>all, quotes are required to be balanced...

The stuff inside a comment doesn't have to even resemble normal C tokens.
Remember that a comment like /* isn't this subtle */ contains, by normal
C standards, an unbalanced quote.  The standard specifically commands
that the interior of comments be totally ignored except for looking for
the terminating `*/'.  (However, the standard does not constrain extra
warning messages from compilers.)
-- 
As a user I'll take speed over|     Henry Spencer at U of Toronto Zoology
features any day. -A.Tanenbaum| uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (06/04/90)

In article <1990Jun1.200433.6919@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>However I got the same warning.  Is there anything in the standard that
>allows the compiler to ignore the quotes while inside a comment?

There is nothing in the standard that prohibits an implementation from
blathering even about the most perfect, portable code.

>After all, quotes are required to be balanced.

No, they are not, and especially not within a comment.

My advice to you is to tell your compiler vendor that you don't
appreciate gratuitous warning messages, and that
	/*	stuff;	/* comment */
is a fairly common usage for avoiding code generation when the
programmer wants to indicate an operation that would normally
be necessary, but fortuitously happens to be already taken care
of in a particular case.
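
A minimal sketch of why that idiom works, using a hypothetical function
name: comments do not nest, so the first closing delimiter ends the
comment and the inner opener is never treated as one.  Compilers that
offer the optional warning (current gcc and clang report it under
-Wcomment) will flag the marked line, but the code is valid C.

```c
/* Sketch: a comment ends at the first closing delimiter, so the inner
 * opener below is just two more ignored characters. */
int live_statements(void)
{
    int n = 0;

    /*  n += 100;   /* already handled by the caller */
    n += 1;         /* live code: the comment above has already ended */

    return n;
}
```

Calling live_statements() yields 1, because only the second statement
is ever compiled.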

diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/04/90)

In article <1990Jun1.200433.6919@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:

>Is there anything in the standard that
>allows the compiler to ignore the quotes while inside a comment?

In fact, a compiler is REQUIRED to ignore quotes while inside a comment.

/* If a compiler doesn't ignore this comment, it is broken.
   Though if it wishes to be unpopular, it may warn about the apostrophe. */

char *x = "Address of " /* Beginning of string; comment with extra " mark */
          "string";     /* and x MUST point to "Address of string".       */

-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
Proposed group comp.networks.load-reduction:  send your "yes" vote to /dev/null.
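
The second example above as a sketch that can be compiled and checked
(the function names are hypothetical).  Each comment is replaced by one
space in translation phase 3, well before adjacent string literals are
concatenated, so the stray quote mark is never treated as the start of
a string.

```c
#include <string.h>

/* Two adjacent literals separated only by comments and white space. */
const char *address_string(void)
{
    return "Address of " /* beginning of string; extra " mark here */
           "string";     /* the two literals form one string */
}

/* Returns 1 when the concatenated result is the expected text. */
int address_string_ok(void)
{
    return strcmp(address_string(), "Address of string") == 0;
}
```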

karl@haddock.ima.isc.com (Karl Heuer) (06/04/90)

In article <13040@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>[correctly points out that quotes are meaningless in a comment]
>My advice to you is to tell your compiler vendor that you don't
>appreciate gratuitous warning messages,

The compiler in question was gcc, and it doesn't produce this warning by
default.  You have to enable it.

>and that
>	/*	stuff;	/* comment */
>is a fairly common usage...

Certainly, people who use this common-yet-questionable style should leave the
warning disabled.  Those of us who find it atrocious can enable the warning,
thereby catching the common error of an unclosed comment.

[D'Arcy originally wrote:]
>/*
>Sample usage:
>	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
>*/

Although you can silence the warning easily enough, note that changing the
example to include something like `*/foo.c' could be surprising.

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint

ghoti+@andrew.cmu.edu (Adam Stoller) (06/04/90)

Use the alternate style of commenting - i.e. instead of:

/*
Sample usage:
	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
*/

Try:

#if 0
Sample usage:
	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
#endif /* 0 */

I believe that this should pass through lint, compiler, linker, etc.
without any problem - and not conflict with ANSI in any way (at least
none that I recall reading)

--fish
( I'm not an authority - and I don't play one on tv )

darcy@druid.uucp (D'Arcy J.M. Cain) (06/04/90)

In article <13040@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>In article <1990Jun1.200433.6919@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>>However I got the same warning.  Is there anything in the standard that
>>allows the compiler to ignore the quotes while inside a comment?
> [...]
>My advice to you is to tell your compiler vendor that you don't
>appreciate gratuitous warning messages, and that
>	/*	stuff;	/* comment */
>is a fairly common usage for avoiding code generation when the
>programmer wants to indicate an operation that would normally
>be necessary, but fortuitously happens to be already taken care
>of in a particular case.

But I do appreciate it most of the time.  The proper way to do what you
are suggesting is to use "#if 0" instead of a comment.  What I want is
a way to tell the compiler that in this case I really mean what I am
saying so please don't warn me.  It is sort of like the construct "if
(a = getval())" which may generate a warning.  It's a good warning and
can be avoided by "if ((a = getval()) != 0)" if you really mean it.

I considered using the "#if 0" construct to handle my example but that
would probably mean closing the regular comment and reopening it after the
#endif.  I am sure that the compiler would ignore any #if statement in
a comment.
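
A sketch of the explicit spelling mentioned above, with a hypothetical
getval() standing in for the real function.  The comparison against 0
does not change the behaviour; it only documents that the assignment is
intended, which is what quiets compilers that offer this warning.

```c
/* getval() is a hypothetical stand-in for the function in the post. */
static int getval(void)
{
    return 7;
}

/* Returns the value read, or -1 when getval() reports zero. */
int read_value(void)
{
    int a;

    /* The explicit comparison says "yes, I mean assignment". */
    if ((a = getval()) != 0)
        return a;
    return -1;
}
```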


steve@taumet.COM (Stephen Clamage) (06/05/90)

In article <1990Jun1.200433.6919@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>/*
>Sample usage:
>	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
>*/
>
>This gives me the above mentioned warning.
> [ more about quotes in comments, should they be balanced ]

The standard says (section 3.1.9):

"Except within a character constant, a string literal, or a comment,
the characters /* introduce a comment.  The contents of a comment are
examined only to identify multibyte characters and to find the
characters */ that terminate it."

A footnote clarifies that comments do not nest.

Some compilers allow comment nesting, and warn about apparently nested
comments.  This is not standard behavior.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com
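
A small sketch of the first sentence quoted above, using a hypothetical
function name and a shortened copy of the usage line: inside a string
literal the characters /* are ordinary data, so they can be counted
like any other pair of characters.

```c
/* Inside a string literal the slash-star pair is plain data. */
static const char usage[] = "mkscript src/*.c src/*.h man/* readme";

/* Returns how many times a slash is directly followed by a star. */
int count_openers(void)
{
    int n = 0;
    const char *p;

    for (p = usage; p[0] != '\0'; p++)
        if (p[0] == '/' && p[1] == '*')
            n++;
    return n;
}
```

For the string shown, count_openers() returns 3.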

karl@haddock.ima.isc.com (Karl Heuer) (06/05/90)

In article <AaOZGd_00as98coVU9@andrew.cmu.edu> ghoti+@andrew.cmu.edu (Adam Stoller) writes:
>Use the alternate style of commenting:
>#if 0
>Sample usage:
>	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
>#endif /* 0 */

Won't work.  The `#endif' has been commented out by the first `/*' inside.

More generally, `#if 0...#endif' should not be considered a `comment', except
in the sense of `commenting out code'.  The contents are still lexed into C
tokens, which is why it's also illegal to say
	#if 0
	The compiler won't like this
	#endif
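
A sketch of both points, assuming a compiler that tokenizes skipped
groups as the standard describes (the macro and function names are made
up for illustration).  The comment opened inside the skipped text hides
the first #endif, so the #define that follows is still part of the
skipped group and never takes effect.

```c
#if 0
Sample usage: /* the comment opened here swallows the directive below
#endif
and only closes here: */
#define STILL_SKIPPED 1
#endif

/* Record whether the #define above was ever processed. */
#ifdef STILL_SKIPPED
#define SKIPPED_DEFINE_SEEN 1
#else
#define SKIPPED_DEFINE_SEEN 0
#endif

int skipped_define_seen(void)
{
    return SKIPPED_DEFINE_SEEN;
}
```

If the first #endif had been honoured, the line after it would be live
text and the file would not compile at all.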


harrison@necssd.NEC.COM (Mark Harrison) (06/06/90)

In article <16786@haddock.ima.isc.com>, karl@haddock.ima.isc.com
 (Karl Heuer) writes:
> More generally, `#if 0...#endif' should not be considered a `comment', except
> in the sense of `commenting out code'.  The contents are still lexed into C
> tokens, which is why it's also illegal to say
> 	#if 0
> 	The compiler won't like this
> 	#endif
> 
> Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint

Are you sure about this?  I tried your example, and it both compiled and
linted.  If this is true, then the following should also not work:

#if MICROSOFT
extern far char * x;  /* however it's done */
#endif

#if VMS
extern char * x$something;  /* however it's done */
#endif
-- 
Mark Harrison             harrison@necssd.NEC.COM
(214)518-5050             {necntc, cs.utexas.edu}!necssd!harrison
standard disclaimers apply...

diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/07/90)

In article <371@necssd.NEC.COM> harrison@necssd.NEC.COM (Mark Harrison) writes:
>In article <16786@haddock.ima.isc.com>, karl@haddock.ima.isc.com
   (Karl Heuer) writes:
 >>More generally, `#if 0...#endif' should not be considered a `comment', except
 >>in the sense of `commenting out code'.  The contents are still lexed into C
Should be:                 The contents MIGHT still be lexed into PREPROCESSOR
 >>tokens, which is why it's also illegal to say
 >> 	#if 0
 >> 	The compiler won't like this
Should be:  SOME perfectly valid compilers won't like this
 >> 	#endif
>Are you sure about this?  I tried your example, and it both compiled and
>linted.

Your compiler is lazy, legally (i.e. just being efficient, unless you asked
for extra checking).  Your lint is broken.  Get a refund for your lint.

>If this is true, then the following should also not work:
>#if MICROSOFT
>extern far char * x;  /* however it's done */
>#endif

No, because "far" is a valid preprocessor token.  The compiler can't
know if you also if'ed out something legal like "#define far unsigned".

>#if VMS
>extern char * x$something;  /* however it's done */
>#endif

Yes, compilers can legally complain about this.


karl@haddock.ima.isc.com (Karl Heuer) (06/07/90)

In article <371@necssd.NEC.COM> harrison@necssd.NEC.COM (Mark Harrison) writes:
>[Karl Heuer writes:]
>>The contents [of #if...#endif] are still lexed into C tokens, which is why
>>it's also illegal to say   #if 0...The compiler won't like this...#endif
>
>Are you sure about this?  I tried your example, and it both compiled and
>linted.

Yes, I'm sure.  You just happen to be using a preprocessor that chooses to
recover from the error of an unterminated character constant by silently
resetting the lexical state at the newline.

>If this is true, then the following should also not work:
>	#if MICROSOFT
>	extern far char * x;  /* however it's done */
>	#endif
>	#if VMS
>	extern char * x$something;  /* however it's done */
>	#endif

Does not follow.  The tokenizing that continues to occur through excluded code
is the same as normal; it sees the `/*...*/' as a comment.

The `$' is a more interesting case, but I'll factor it out into a separate
article.


msb@sq.sq.com (Mark Brader) (06/08/90)

> /*
> Sample usage:
> 	mkscript src/*.c src/*.h src/makefile man/* readme > dist.txt
> */

Assuming that your source is in ASCII, the *appearance* of a /* inside
a comment may be obtained by using the sequence /, space, backspace, *.
Of course, some programs for displaying the file may elect to show ^H
for the backspace, so this won't always work.  There is no ANSI C problem
with this approach, at any rate, as anything is legal inside a comment.

The appearance of a */ can be obtained inside a comment similarly, but
I would term this usage so bizarre as to be misleading.

I would write the above comment as:

/*
 * Sample usage:
 * 	mkscript src/_*.c src/_*.h src/makefile man/_* readme > dist.txt
 *
 * (Ignore the _ characters; they are there to prevent "unclosed comment?"
 * warnings when this code is compiled!)
 */

And similarly using *_/file.c if */file.c was the usage.


-- 
Mark Brader	"Many's the time when I've thanked the DAG of past years
utzoo!sq!msb	for anticipating future maintenance questions and providing
msb@sq.com	helpful information in the original sources."	-- Doug A. Gwyn

This article is in the public domain.

seanf@sco.COM (Sean Fagan) (06/09/90)

In article <371@necssd.NEC.COM> harrison@necssd.NEC.COM (Mark Harrison) writes:
>Are you sure about this?  I tried your example, and it both compiled and
>linted.  If this is true, then the following should also not work:

>#if MICROSOFT
>extern far char * x;  /* however it's done */
>#endif
>#if VMS
>extern char * x$something;  /* however it's done */
>#endif

This lexes just fine.  For most purposes, all we care about is the
preprocessor (or, if it's built into the compiler, the preprocessing stage).
Since you say you ran lint, I suspect you were on a traditional unix box,
and /lib/cpp was used.

/lib/cpp is notorious for being easy.  If neither MICROSOFT nor VMS is
defined, then the compiler will never see them, and, therefore, it can't
care.

If, however, you were to use a compiler with a built-in preprocessor, then
it might not work.  (uSoft C accepts it, I suspect because it considers $ a
valid lexable character [support for other languages, I suspect].)

Remember that no semantic analysis is being done.

-- 
-----------------+
Sean Eric Fagan  | "It's a pity the universe doesn't use [a] segmented 
seanf@sco.COM    |  architecture with a protected mode."
uunet!sco!seanf  |         -- Rich Cook, _Wizard's Bane_
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (06/09/90)

karl@haddock.ima.isc.com (Karl Heuer) writes:
<More generally, `#if 0...#endif' should not be considered a `comment', except
<in the sense of `commenting out code'.  The contents are still lexed into C
<tokens ...

diamond@tkou02.enet.dec.com (diamond@tkovoa) writes:
<Should be: The contents MIGHT still be lexed into PREPROCESSOR tokens

<> 	#if 0
<> 	The compiler won't like this
<Should be:  SOME perfectly valid compilers won't like this
<> 	#endif

If you know of an ANSI C compiler (I don't think it's ``perfectly
valid'' otherwise) which does not lex #if'fed-out blocks into
pptokens, please explain how it handles this conformant (I think)
program:
----------- cut -----------
#include <stdio.h>
#if 0
C's weird\
#if 1		/* isn't it */
#else
int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
#endif
----------- cut -----------

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk

diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/11/90)

In article <1990Jun8.224827.23783@diku.dk> thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:

>If you know of an ANSI C compiler (I don't think it's ``perfectly
>valid'' otherwise) which does not lex #if'fed-out blocks into
>pptokens, please explain how it handles this conformant (I think)
>program:
>#include <stdio.h>
>#if 0
>C's weird\
>#if 1		/* isn't it */
>#else
>int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
>#endif

What's the problem?
 #include <stdio.h>
 #if 0
 C's weird#if 1 /* isn't it */
 #else
 int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
 #endif
preprocesses to
 [contents of <stdio.h>]
 int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
regardless of whether or not the preprocessor really tokenizes the line
 C's weird#if 1 /* isn't it */
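
The same example as a compilable sketch, with main renamed to a
hypothetical spliced_answer so that it can sit in a larger file.
Backslash-newline pairs are deleted in translation phase 2, before
directives are recognized, so the "#if 1" ends up in the middle of a
line and the #else pairs with the "#if 0".

```c
#if 0
C's weird\
#if 1   /* isn't it */
#else
/* This branch is the live one: the spliced line above held no directive. */
int spliced_answer(void)
{
    return 42;
}
#endif
```

Had the "#if 1" been treated as a directive, the "#if 0" would be left
without a matching #endif and the file would be rejected.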


tkacik@rphroy.uucp (Tom Tkacik) (06/12/90)

In article <1777@tkou02.enet.dec.com>, diamond@tkou02.enet.dec.com
(diamond@tkovoa) writes:
 
|> What's the problem?
|>  #include <stdio.h>
|>  #if 0
|>  C's weird#if 1 /* isn't it */
|>  #else
|>  int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
|>  #endif
|> preprocesses to
|>  [contents of <stdio.h>]
|>  int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
|> regardless of whether or not the preprocessor really tokenizes the line
|>  C's weird#if 1 /* isn't it */

The problem is the C's.  If the preprocessor tokenizes, it will think that
the 's is the start of 's'.  It may issue an error about a missing
closing quote.  Even though the text is ignored, this must be analyzed. 
--
Tom Tkacik			GM Research Labs,   Warren MI  48090
Work phone: (313)986-1442	uunet!edsews!rphroy!megatron!tkacik
"I'm president of the United States, and I'm not going to eat anymore
broccoli."
						--- George Bush

thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (06/13/90)

diamond@tkou02.enet.dec.com (diamond@tkovoa) points out that I gave a
bad counterexample, so I'll try again.

If you know of an ANSI C compiler (I don't think it's ``perfectly
valid'' otherwise) which does not lex #if'fed-out blocks into
pptokens, please explain how it handles this conformant (I think)
program:
#include <stdio.h>
#if 0
I'll put a /* inside 's
#endif
#if 1
Now it's a */. This program doesn't use comments.
#else
int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
#endif


diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/13/90)

In article <1990Jun12.222745.3015@diku.dk> thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:

>If you know of an ANSI C compiler (I don't think it's ``perfectly
>valid'' otherwise) which does not lex #if'fed-out blocks into
>pptokens, please explain how it handles this conformant (I think)
>program:
>#include <stdio.h>
>#if 0
>I'll put a /* inside 's
>#endif
>#if 1
>Now it's a */. This program doesn't use comments.
>#else
>int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
>#endif

This is not a conformant program.  If processed by an ANSI C compiler,
here is the result of preprocessing:
 [contents of <stdio.h>]
 Now it [character constant] t use comments.
There isn't necessarily whitespace next to the character constant, and
we don't know if brute-force tokenization actually occurs.
We still don't know if the preprocessor really tokenizes the if'ed-out
lines, aside from recognizing characters, strings, and comments:
 I [character constant] s
 int main(int c, char *v[]) { printf( [string constant] ); return 0; }


diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/14/90)

The original poster of this topic has already changed the example
twice, but let's wrap up this tangent.

In article <25176@rphroy.UUCP> tkacik@rphroy.uucp (Tom Tkacik) writes:
||| What's the problem?
|||  #include <stdio.h>
|||  #if 0
|||  C's weird#if 1 /* isn't it */
|||  #else
|||  int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
|||  #endif
||| preprocesses to
|||  [contents of <stdio.h>]
|||  int main(int c, char *v[]) { printf("Hello, world!\n"); return 0; }
||| regardless of whether or not the preprocessor really tokenizes the line
|||  C's weird#if 1 /* isn't it */
>
>The problem is the C's.  If the preprocessor tokenizes, it will think that
>the 's is the start of 's'.  It may issue an error about a missing
>closing quote.  Even though the text is ignored, this must be analyzed. 

Huh?
|||  C's weird#if 1 /* isn't it */
      '                   '
it has opening and closing 's.
The VALUE of a character constant containing more than one character is
implementation-defined, but its syntactic effect is not.  It is a character
constant.  (And it is #if'ed out.)


martin@mwtech.UUCP (Martin Weitzel) (06/15/90)

In article <25176@rphroy.UUCP> tkacik@rphroy.uucp (Tom Tkacik) writes:
>
>In article <1777@tkou02.enet.dec.com>, diamond@tkou02.enet.dec.com
>(diamond@tkovoa) writes:
[refer to above article for deleted lines]
>
>The problem is the C's.  If the preprocessor tokenizes, it will think that
>the 's is the start of 's'.  It may issue an error about a missing
                                 ^^^
>closing quote.  Even though the text is ignored, this must be analyzed. 
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Very true. I think this is what all related questions can be reduced to:
  1) In ANSI-C there are certain "phases of translation" which (at least
     logically) occur *before* parts of the source are "#if-ed out".
  2) There are certain sequences of input characters which have to be
     analyzed during these early phases and which would clearly be
     erroneous *if* they had to be fully compiled later.
Now: Is it "required", "unacceptable" or a "quality of implementation
     issue", that such a sequence of characters in ignored parts of
     the source makes the whole program fail to compile?

Note that the answer to this must be known if you have to decide what
may follow a #pragma in a "conformant program". To my understanding
#pragma-s *are* allowed in such programs *if* they are enclosed in
some #if !(__STDC__ == 1) -- #endif context.
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (06/16/90)

martin@mwtech.UUCP (Martin Weitzel) writes:
>Now: Is it "required", "unacceptable" or a "quality of implementation
>     issue", that such a sequence of characters in ignored parts of
>     the source makes the whole program fail to compile?

The compiler must behave as if the ignored parts were parsed into
preprocessing tokens. A conforming compiler must produce at least one
diagnostic message for a program that violates a syntax rule or a
constraint in the standard. The syntax states that a character or
string literal does not contain newline characters.

On the other hand, the compiler could just say "non-conforming syntax
used" once and let end-of-line close the literal; that would be an
extension to the syntax. It might alternatively allow multiline
literals; that would perhaps be better quality of implementation, but
it would still have to warn about it.

Strictly speaking, a conforming preprocessor will not be able to find
a ``real'' literal on a line with only one quote character. In that
case, the quote falls into the same category of preprocessing token as
$, @, etc., but with undefined behaviour. It is actually this
undefined behaviour which allows the compiler to find a ``nonreal''
literal instead. But I wonder if it will also allow the preprocessor
to throw the quote away with no message? In any case, it may just let
the pptoken be; if it is ever converted to a token, an error will
occur; but ignored parts are never converted. (Note: This is according
to the MAY 88 draft; it may have changed). But I don't think anybody
will do a lexer like that: When they get to end-of-line with an
unclosed literal, they'd have to go back and rescan for comment starts
and the other quote char. If they did, their (conforming) compiler
might accept the following (conforming, but not strictly conforming)
fragment _without_error_messages_:

#define Q '
#define CTRL(a) (Q##a##Q&037)
	putchar(CTRL(g));

Somebody please tell me that the final standard rules this out. (I
know that it doesn't require it; there are at least two undefined's
and an interpretation issue in here).
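
For contrast, a sketch of the conventional way to write such a macro,
in which the caller passes a complete character constant and no token
pasting is involved.  This is not the construct being asked about, and
the function name is hypothetical; the expected value assumes an ASCII
execution character set.

```c
/* Conventional form: the caller supplies a complete character constant. */
#define CTRL(c) ((c) & 037)

/* On an ASCII system this is 7, the BEL (control-G) character. */
int control_g(void)
{
    return CTRL('g');
}
```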
