[comp.lang.c] Why nested comments not allowed?

ly@prism.TMC.COM (02/15/90)

	I'm just curious to know why nested comments are not allowed in many 
	languages.

raw@math.arizona.edu (Rich Walters) (02/15/90)

In article <236100027@prism> ly@prism.TMC.COM writes:
>
>	I'm just curious to know why nested comments are not allowed in many 
>	languages.

Nested comments are not supported because it is difficult to tell where the
nesting ends.  Have you ever written a paren checker in C? (in any language?)

Reason 2) Why waste the computing power??  After all, it's only a comment!!

mike@hpfcso.HP.COM (Mike McNelly) (02/16/90)

	> I'm just curious to know why nested comments are not allowed in many 
	> languages.

1.  They have limited usefulness.  For most of the occasions where they
are useful, conditional compilation seems to work better.  In C, for
example,

	# if condition1
	...
	#	 if condition2
	...
	#	 endif
	...
	# endif 

seems to fill the void (pardon the pun) pretty well for me.

2.  For those languages with the concept of lines, I find end-of-line
comments to be the least bug inducing form.  Incorrectly terminated
comment blocks, whether in a language that supports comment nesting or
not, seem to be a great source of bugs.  Note that I am NOT saying that
I prefer line oriented languages.

I guess it's pretty much a matter of personal preference on the part of
the language designers.

Mike McNelly
mike%hpfcla@hplabs.hp.com

bethge@wums.wustl.edu (02/17/90)

In article <1523@wacsvax.OZ>, chris@wacsvax.OZ (chris mcdonald) writes:
> raw@math.arizona.edu (Rich Walters) writes:
>>In article <236100027@prism> ly@prism.TMC.COM writes:
>>>	I'm just curious to know why nested comments are not allowed in many 
>>>	languages.
>>Nested comments are not supported because it is difficult to tell where the
>>nesting ends.  Have you ever written a paren checker in C? (in any language?)
>>Reason 2) Why waste the computing power??  After all, it's only a comment!!
> What a stupid response!
> I don't know why they are not supported but agree that they are damn
> useful. They are easy to parse in syntactically correct programs (ever
> heard of counting?) and, after all, the computer/compiler is supposed to
> do what we tell it, not for us to bow down and minimize its work.
> If it's really too hard for you to count comments I'll sell you a little
> parser for a ridiculous amount.

I must be missing something here. What are nested comments good for? The only
use I can think of is "commenting out" a section of code which already
contains comments.  But C has #if ... #endif for this purpose.  So what's the
problem?

One minor thing that has always annoyed me about C is that it takes *four*
keystrokes, two shifted and two unshifted, to make a comment.  I know, *if*
one has a smart enough editor one can define a macro to do it.  But it also
doesn't leave much space for end-of-line comments in indented code.

IMO the only suitable comment delimiters are <some-reserved-single-character>
and <end-of-line>. This convention saves keystrokes and space, and avoids the
hazard of accidentally "commenting out" a section of code with a defective
comment terminator.  Of course, it's a problem for C, which already uses just
about every character in the ASCII set! :-)

raymond@twinkies.berkeley.edu (Raymond Chen) (02/20/90)

Dear everyone who wants to argue that nested comments are "good", "easy to
implement", "has no hidden surprises":

Please present a coherent rule for nested comments for which the following
lines of code produce "expected" results:

	int openquote = 34; /* " */ int closequote = 34; /* Also " */
	int quote = 34; /* for a good time, type printf("*/ %c",quote) */
	/* int doublequote = 34; /* " */ */ /* don't need this one: '"' */
	/* printf("*/ is the close-comment token\n"); 
	   printf("/* is the \"open-comment\" token\n"); */

Any set of rules for making these pathological cases work "right" will
probably be so complicated that nobody will understand them, and in
fact you'll have MORE problems with nested comments than you do today
(because the current rules are easy to remember).

Conclusions:  

(1) Don't use comments to comment out code.  If you need to 
    comment out code, use #if 0 .. #endif.  They nest nicely.

(2) There exist programs out there whose job is to help catch runaway 
    comments.  Use them.  I'll send you mine if you want one.
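
To illustrate what a runaway-comment catcher might look for, here is a
minimal sketch (it is not Raymond's program, which does not appear in this
thread).  It assumes standard non-nesting C rules, where a comment opener
seen inside an open comment is legal but often means an earlier comment was
never closed.  The name count_suspect_opens is made up for this example.

```c
#include <stddef.h>

/* Counts comment-open tokens that appear inside an already open comment,
 * using standard (non-nesting) C rules.  String and character literals
 * in code are skipped so that tokens inside them are not counted. */
size_t count_suspect_opens(const char *src)
{
    size_t suspects = 0;
    int in_comment = 0;
    char quote = 0;              /* '"' or '\'' while inside a literal */

    while (*src != '\0') {
        if (in_comment) {
            if (src[0] == '*' && src[1] == '/') {
                in_comment = 0;
                src += 2;
                continue;
            }
            if (src[0] == '/' && src[1] == '*')
                suspects++;                 /* opener inside a comment */
            src++;
        } else if (quote) {
            if (src[0] == '\\' && src[1] != '\0')
                src++;                      /* skip the escaped character */
            else if (src[0] == quote)
                quote = 0;
            src++;
        } else if (src[0] == '/' && src[1] == '*') {
            in_comment = 1;
            src += 2;
        } else {
            if (src[0] == '"' || src[0] == '\'')
                quote = src[0];
            src++;
        }
    }
    return suspects;
}
```

A nonzero result is only a hint: a real tool would also report line numbers
so the suspect comment can be inspected by hand.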
--
 raymond@math.berkeley.edu         mathematician by training, hacker by choice

schaefer@ogicse.ogc.edu (Barton E. Schaefer) (02/21/90)

In article <1990Feb19.221039.4243@agate.berkeley.edu> raymond@math.berkeley.edu (Raymond Chen) writes:
} Dear everyone who wants to argue that nested comments are "good", "easy to
} implement", "has no hidden surprises":
} 
} Please present a coherent rule for nested comments for which the following
} lines of code produce "expected" results:

Actually, the cases presented are not in themselves very interesting.
The interesting thing is trying to parse "code" like that after
commenting it out:

	/* Commented out ...
	/* ... with extra nesting just for fun
	int openquote = 34; /* " */ int closequote = 34; /* Also " */
	int quote = 34; /* for a good time, type printf("*/ %c",quote) */
	/* int doublequote = 34; /* " */ */ /* don't need this one: '"' */
	/* printf("*/ is the close-comment token\n"); 
	   printf("/* is the \"open-comment\" token\n"); */
	*/ end extra */

-- 
Bart Schaefer          "February.  The hangnail on the big toe of the year."
                                                                    -- Duffy

schaefer@cse.ogi.edu (used to be cse.ogc.edu)

ok@goanna.oz.au (Richard O'keefe) (02/21/90)

In article <236100027@prism>, ly@prism.TMC.COM writes:
> 	I'm just curious to know why nested comments are not allowed in many 
> 	languages.

To start with, some languages _do_ allow them.  For example,
Common Lisp has   #|...comment...|#  which nests.

There is the obvious point that nested structures of any kind are
not definable with regular expressions (and LEX is not the _only_
r.e. tool around, you know).

But the *real* reason is that they simply don't work.  Imagine a
Pascal dialect which admits nested comments.  Comments are used to include
natural-language text in the program, so we have to allow things like
	{This is a `quotation'}
But program text may legitimately contain
	fred := '}';
and when we comment it out by wrapping {..} around it we get
	{ fred := '}'; }
In order to handle this code fragment, we mustn't take the "}" following
the "'" as a closing bracket, but in order to handle the text fragment
we *must* take the "}" following the "'" as a closing bracket.
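
The conflict can be seen with a small C sketch that treats the comment body
as unstructured text and only counts braces.  The function name
skip_brace_comment is hypothetical, and the Pascal-like inputs are the two
fragments above.

```c
#include <stddef.h>

/* Treats the comment body as unstructured text: only braces matter.
 * `s` must point at an opening brace.  Returns a pointer just past the
 * brace that closes it, or NULL if none does. */
const char *skip_brace_comment(const char *s)
{
    int depth = 0;

    for (; *s != '\0'; s++) {
        if (*s == '{') {
            depth++;
        } else if (*s == '}' && --depth == 0) {
            return s + 1;       /* first brace that balances the opener */
        }
    }
    return NULL;
}
```

Given "{This is a `quotation'}" the whole comment is consumed, as wanted.
Given "{ fred := '}'; }" the comment ends at the brace inside the quotes,
and the leftover "'; }" is handed back to the compiler as program text.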

We could easily arrange for comments to be viewed as unstructured
except for comment brackets being significant.  That's what Common Lisp
does, and it's what's usually done when nested comments are provided.
But that means that wrapping a *valid* statement in comment brackets
may produce *invalid* text.

We could easily arrange for comments to be viewed as sequences of
programming language tokens.  Pop-2 did that.  Commenting out code
fragments would work well done that way, but you'd have trouble with
text.  In fact Pop-2 programmers used to have to write
	comment `This is text written as a string so that it can'
		`be included in a comment without being parsed as'
		`Pop tokens';
Not good.

We conclude that there are two *different* things:
    (a) marking a sequence of tokens so that the processor will behave
	as though those tokens were not present
    (b) including text which does not follow the lexical rules of the
	programming language in question

In C, we use #if/#endif (which nest!) for (a) and /**/ for (b).

Another clue that (a) and (b) are different is that there is usually
some _reason_ why the sequence of tokens in (a) is not to be included,
but no reason is needed for (b) because non-token text could _never_
have been part of the program proper.  This also suggests that it might
be a good idea to explicitly label type (a) "comments" with the reason.
In C, for example, we would have
	#if	DEBUGGING
		....
	#endif

cmp8118@sys.uea.ac.uk (D.S. Cartwright) (02/21/90)

Forgive me if I'm wrong (for this is just a guess) but perhaps the availability
of nested comments is simply a decision by the original language designer based
on the eventual implementation of the thing.

	For example, someone heavily into recursion would probably include
nested comments, as they can then be dealt with recursively, something like :

	Function CommentAnalyse
	   Do
	      Get token
	      If token = 'Open Comment'
	          Call CommentAnalyse
	   Until token = 'Close Comment'
	End Function

	(in a completely non-existent language that I just made up).

  On the other hand, they might not consider recursion. This would mean that
an iterative method would have to be used to find the end of a comment. If,
then, nested comments were required, a counter would have to be stored with
details of how many 'Open Comment' tokens we had had so far. With no nested
comments, it's much easier to do (no counter is needed):

	Function CommentAnalyse
	   Do
	      Get token
	   Until token = 'Close Comment'
	End Function

	(in the same non-existent language).
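
For comparison, the counter-based variant might look like this in C.  This
is only a sketch under my own assumptions: the name skip_comment and the
`nesting` flag are invented for illustration, and the input is scanned as
raw characters rather than tokens.

```c
#include <stddef.h>

/* `s` points just past a comment opener.  Returns a pointer just past
 * the matching closer, or NULL if the comment is never closed.
 * With nesting == 0 the first closer ends the comment, as in C. */
const char *skip_comment(const char *s, int nesting)
{
    int depth = 1;              /* the opener already consumed */

    while (*s != '\0') {
        if (s[0] == '*' && s[1] == '/') {
            s += 2;
            if (--depth == 0)
                return s;
        } else if (nesting && s[0] == '/' && s[1] == '*') {
            s += 2;
            depth++;            /* only counted when nesting is enabled */
        } else {
            s++;
        }
    }
    return NULL;
}
```

The two modes differ by one branch and one integer, which suggests the
implementation cost is small either way.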

   Perhaps I am talking utter rubbish, and language designers have better
things to think about than how their design is going to be implemented. It
just seems rather a coincidence that the two basic methods of processing
(iteration and recursion) fall into line with the two types of comment (nested
and non-nested).

	Anyone else got any thoughts ?? Can anyone categorically state that
I'm wrong (my Programming Systems lecturer will probably rip my argument to
bits, but that's nothing new) ?? If so, I'd appreciate your views.

	Dave C.
-- 
 Dave Cartwright,               | cmp8118@uk.ac.uea.sys    <- Here
 School of Information Systems, | cmp8118@uk.ac.uea.cpc780 <- Somewhere else
 University of East Anglia,     |  Just 'cos I'm thick ...
 Norwich, ENGLAND. NR4 7TJ.     |         ... don't take it out on ME !!

martin@mwtech.UUCP (Martin Weitzel) (02/21/90)

In article <236100027@prism> ly@prism.TMC.COM writes:
>
>	I'm just curious to know why nested comments are not allowed in many 
>	languages.

While other posters have shown that forbidding nested comments is
no *disadvantage* because of C's "#if-#endif" capability, here's
a reason why non-nested comments could be considered an
advantage:

	................... <--- 100 lines of C-Code
	/* here is some comment */
	................... <--- some statement
	................... <--- another 100 lines of C-Code

If this source compiles without errors and nested comments *were*
allowed, I could never be sure that the compiler had really seen
the marked statement after the comment without checking the source
before and after it for enclosing comments. The only remaining
concern is conditionally compiled sections, which are far easier to
track because in the average program they appear far less often
than comments.
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

brad@SSD.CSD.HARRIS.COM (Brad Appleton) (02/22/90)

Personally, I prefer what C++ does, using "/*", "*/" for long comments
and "//", end-of-line for one-liners. This makes it easier to comment
out code using "/*", "*/" if I only use "//" comments inside them
(that's a *big* IF though -- I would still probably use "#if 0").

IMHO, I'm not sure the issue is worth all the attention we seem to have
given it (some obviously disagree). Couldn't we let this topic die now?
(Am I the only one who feels this way?)
+=-=-=-=-=-=-=-=-=- "... and miles to go before I sleep." =-=-=-=-=-=-=-=-=-=-+
|  Brad Appleton   MS-161              |    PHONE  : (305) 973-5007           |
|  Harris Computer Systems Division    |    DOMAIN : brad@ssd.csd.harris.com  |
|  2101 West Cypress Creek Road        |    UUCP   : ...!novavax!hcx1!brad    |
|  Fort Lauderdale, FL 33309 USA       |         or  ...!uunet!hcx1!brad      |
+=-=-=-=-=-=-=-=-= DISCLAIMER: I said it, not my company! =-=-=-=-=-=-=-=-=-=-+

adrian@mti.mti.com (Adrian McCarthy) (02/23/90)

In article <7060002@hpfcso.HP.COM> mike@hpfcso.HP.COM (Mike McNelly) writes:
>
>	> I'm just curious to know why nested comments are not allowed in many 
>	> languages.
>
>1.  They have limited usefulness.  For most of the occasions where they
>are useful, conditional compilation seems to work better.  In C, for
>example,

Often, while developing a program, it is useful to comment out a section of
code.  Using conditional compilation for this is about as clumsy as using
nested comments.  Sometimes you want to comment out *part* of a line,
but that is awkward since #ifdef must be at the beginning of a line.  When
you want to comment out larger chunks of code, the #ifdef - #endif lines
are easy to lose sight of (especially if your code has lots of #ifdef's for
other reasons).  On top of all that, not all languages have preprocessors.

In practice, it seems we don't really need true nested comments, but simply
two kinds of comments:  regular comments and code omitting comments.  It
seems Pascal almost had it right.  You could use "{" and "}" to delimit
regular comments, and "(*" and "*)" to comment out code.  Alas, most
implementations of Pascal will allow a "*)" to end a comment introduced by
"{".

Aid.

P.S.  I'm working on a macro preprocessor that, unlike the C preprocessor
and m4, will be useful on nearly all types of source code *and*
documentation.  In its default mode, it is an ANSI compliant C preprocessor.
Don't hold your breath though; it'll be a while before it's done.

karl@haddock.ima.isc.com (Karl Heuer) (02/26/90)

In article <4601@jarthur.Claremont.EDU> dfoster@jarthur.Claremont.EDU (Derek R. Foster) writes:
>I wasn't saying that these characters should never be used as string data.
>I said that they should not be placed LITERALLY in a string, since they may
>be mistaken (by the parser) for comments.  [So, when it is desirable to
>put them in a string constant, they should be encoded] in some way that
>breaks up the /* and */ pairs ... my favorite so far is this:
>  #define CS "/""* "
>  #define CE " *""/"
>  printf("before comment"CS"comment"CE"after comment");

If we're talking about C, then of course this is not necessary since the
scanner already knows about strings.  Therefore, I assume we're talking about
a hypothetical language which is a lot like C, but which has nestable comments
and therefore must worry about the interaction between comments and strings.

Surely if such a language existed, it would also have the escape sequences
`\*' (literal star) and `\/' (literal slash), so you could simply write
  printf("before comment/\* comment *\/after comment");
(note that `\?' was added to ANSI C for essentially this reason).

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint

dfoster@jarthur.Claremont.EDU (Derek R. Foster) (02/27/90)

In article <16023@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <4601@jarthur.Claremont.EDU> dfoster@jarthur.Claremont.EDU (Derek R. Foster) writes:
>>I wasn't saying that these characters should never be used as string data.
>>I said that they should not be placed LITERALLY in a string, since they may
>>be mistaken (by the parser) for comments.  [So, when it is desirable to
>>put them in a string constant, they should be encoded] in some way that
>>breaks up the /* and */ pairs ... my favorite so far is this:
>>  #define CS "/""* "
>>  #define CE " *""/"
>>  printf("before comment"CS"comment"CE"after comment");
>
>If we're talking about C, then of course this is not necessary since the
>scanner already knows about strings.  Therefore, I assume we're talking about
>a hypothetical language which is a lot like C, but which has nestable comments
>and therefore must worry about the interaction between comments and strings.

NO!!!!!!!
Just because the scanner knows about strings doesn't mean this can't still
cause problems! With or without nested comments! In perfectly ordinary C!

Try this:
printf("some C code"); /* that code could be printf("/* with comments */"); */

Question: What is the result of trying to compile the above?

Answer: (assuming no nested comments, although it is perfectly easy to
construct a different example which will fail with nested comments)

printf("some C code"); "); */
                       ^
UNTERMINATED STRING CONSTANT

The scanner only knows about strings THAT ARE NOT STARTED INSIDE
COMMENTS. It can't recognize strings inside comments, because it can't
assume that comments contain valid, parsable C code. As someone pointed
out, if it did do this, it would crash on something like

  /* These are quotation marks --> " */

in the input file.

The scanner has to interpret strings and comments based on whichever it
thinks it is in the middle of. For instance:

1) If it is neither parsing a string nor a comment, and it finds a ", then
   it is now parsing a string. Otherwise, if it finds a /* then it is now
   parsing a comment. Otherwise, it continues parsing neither.
2) if it is parsing a string, it will continue to do so until it finds a " .
   (note: /* and */ are treated as STRING DATA under this condition)
3a) if it is parsing a comment and the compiler does not allow nested comments,
    it will continue to do so until it finds a */ . (note: " is treated like
    all other characters and simply IGNORED under this condition.)
3b) if it is parsing comments, and the compiler allows nested comments, it
    will continue to do so until it has encountered the same number of */
    as it has /* . (note: " is treated like all other characters and simply
    IGNORED under this condition.)
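
Rules 1 through 3b can be sketched as a small state machine in C.  The
names strip_comments and scan_state and the `nesting` flag are my own for
this illustration.  Two things go beyond the rules as stated: backslash
escapes inside strings are honoured, and comments are deleted outright
rather than replaced by a space.  Character constants are not handled.

```c
#include <stddef.h>

enum scan_state { IN_CODE, IN_STRING, IN_COMMENT };

/* Copies src to dst with comments removed; dst must have room for
 * strlen(src) + 1 bytes.  Returns the state the scanner ends in;
 * anything but IN_CODE means an unterminated string or comment. */
enum scan_state strip_comments(const char *src, char *dst, int nesting)
{
    enum scan_state state = IN_CODE;
    int depth = 0;

    while (*src != '\0') {
        switch (state) {
        case IN_CODE:                       /* rule 1 */
            if (src[0] == '"') {
                state = IN_STRING;
                *dst++ = *src++;
            } else if (src[0] == '/' && src[1] == '*') {
                state = IN_COMMENT;
                depth = 1;
                src += 2;
            } else {
                *dst++ = *src++;
            }
            break;
        case IN_STRING:                     /* rule 2 */
            if (src[0] == '\\' && src[1] != '\0') {
                *dst++ = *src++;            /* keep the backslash ... */
                *dst++ = *src++;            /* ... and the escaped char */
            } else {
                if (src[0] == '"')
                    state = IN_CODE;
                *dst++ = *src++;
            }
            break;
        case IN_COMMENT:                    /* rules 3a and 3b */
            if (src[0] == '*' && src[1] == '/') {
                src += 2;
                if (--depth == 0)
                    state = IN_CODE;
            } else if (nesting && src[0] == '/' && src[1] == '*') {
                src += 2;
                depth++;
            } else {
                src++;                      /* quotes are ignored here */
            }
            break;
        }
    }
    *dst = '\0';
    return state;
}
```

Fed the "Try this" line with nesting off, the sketch leaves
`printf("some C code"); "); */` and ends in IN_STRING, which is the
unterminated string shown above.  With nesting on, that particular line
comes through cleanly.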

>Surely if such a language existed, it would also have the escape sequences
>`\*' (literal star) and `\/' (literal slash), so you could simply write
>  printf("before comment/\* comment *\/after comment");
>(note that `\?' was added to ANSI C for essentially this reason).

I think this is a VERY good idea. I wasn't talking about a hypothetical
language, however, and unfortunately, I haven't been able to
find any documentation for a feature like this in C, which seems like 
rather a glaring omission, in my opinion. I definitely prefer this method
to the one I showed above -- I just can't find any documentation that
states definitively that \* or \/ are allowable escape sequences.

>Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint

Please! If people are going to post on this topic, realize the following
facts! I have stated them multiple times!

1) Whether or not C parses comments in strings DOES NOT BEAR ON MY ARGUMENT.
2) Whether or not C comments can be nested DOES NOT BEAR ON MY ARGUMENT.
3) In fact, part of my argument is that if you've written your code
   well, nesting/not nesting comments WILL NOT MATTER TO THE PARSER. 
   If people wish to argue about the aesthetics of nested comments vs.
   #ifdef/#endif, that's fine, but arguments like

"But with nested (or "but with non-nested...") comments, how can I do 
  printf("/* some horrible abomination with lots of /* and */ */");

   are NOT MEANINGFUL, since the same problems exist with/without nested
   comments, just in different situations! And it's the code-writer's fault!
4) If you write code that depends on whether comments are nested or not,
   it's not the compiler's fault, it's YOURS, and you deserve what you get!
   Encode /* and */ before putting them in strings!

If you aren't sure you understand the above, please reread this
posting before posting one of your own on this topic. If you still don't
understand, please e-mail me.

P. S. Karl - please understand that this is not directed at you personally.

Derek Riippa Foster

dalenber@cbnewsc.ATT.COM (Russel Dalenberg) (03/03/90)

In article <4686@jarthur.Claremont.EDU>, dfoster@jarthur.Claremont.EDU (Derek R. Foster) writes:
>    If people wish to argue about the aesthetics of nested comments vs.
>    #ifdef/#endif, that's fine, but arguments like
> 
> "But with nested (or "but with non-nested...") comments, how can I do 
>   printf("/* some horrible abomination with lots of /* and */ */");
> 
>    are NOT MEANINGFUL, since the same problems exist with/without nested
>    comments, just in different situations!

I agree.  I only posted my printf example to point out that parsing
nested comments would not be "simple".

>                                            And it's the code-writer's fault!

Well, in the sense that all the code a programmer writes is his "fault", I
agree. But I think it's perfectly reasonable to put comment delimiters in
a string (and quotes in comments).

> 4) If you write code that depends on whether comments are nested or not,
>    it's not the compiler's fault, it's YOURS, and you deserve what you get!

No. Comments in C are *defined* not to nest. It's perfectly reasonable to
expect them not to nest, and program accordingly.

>    Encode /* and */ before putting them in strings!

I think this is foolish overkill. It can't hurt your code, but it can
make it harder to read and maintain.


Russel Dalenberg

att!ihlpb!dalenber
dalenber@ihlpb.att.com

Disclaimer: These are my opinions, not AT&T's