[comp.lang.c] ANSI C and comment preprocessing

donald@cae780.csi.com (Donald Maffly) (01/08/91)

In pre-ANSI C compilers, I noticed that it was possible
to place a comment within an identifier without
splitting the identifier in two.  

The following code segment illustrates this, and 
compiles "sans erreur" on all of our C compilers
(which claim to support ANSI C):

main()
{
	int my/* comment */var;
	myvar = 1;
}

This example may seem pretty obscure to some of you, 
but there do exist people who know this
to be a trusted and accepted feature in older C compilers, so
it can't be too obscure.

Well, "The C Programming Language" (2nd Edition), which
purports to lay down the ANSI C standard, reads in section 
"A12.  Preprocessing", item #5:
	
	"[...] comments are replaced by a single space;"

Now, if this were true, then the code segment above
wouldn't compile.  

Can anyone help me answer this koan????

Donald Maffly

bhoughto@hopi.intel.com (Blair P. Houghton) (01/08/91)

In article <11228@cae780.csi.com> donald@cae780.csi.com (Donald Maffly) writes:
>	"[...] comments are replaced by a single space;"
>
>Can anyone help me answer this koan????

It's not a koan, it's a feature.

ANSI standardized existing practice: relatively few compilers
replaced comments with nothing at all, and most treated a comment
as whitespace.  The philosophy is that a comment is legal only
where whitespace would be legal, and adhering to that philosophy
matters more than enabling an obfuscatory feature.  Whitespace is
not legal within an identifier, so a comment placed there is
treated as whitespace and splits the identifier in two.

Did it break existing code?  Sure it did, but such code was
unportable to all but a few platforms, anyway.

As compensation, however, ANSI included explicit token-pasting.

				--Blair
				  "I asked, 'Master, why should I code
				   for(;;) when what I mean is while(1)?'
				   and the master said, 'the pot of tea
				   comes with hot water at no additional
				   charge,' and I was enlightened."

henry@zoo.toronto.edu (Henry Spencer) (01/09/91)

In article <11228@cae780.csi.com> donald@cae780.csi.com (Donald Maffly) writes:
>In pre-ANSI C compilers, I noticed that it was possible
>to place a comment within an identifier without
>splitting the identifier in two.  

In *some* pre-ANSI compilers this was possible.  Not by any means all.
The pre-ANSI specs for the preprocessor were just plain vague.

(Note that a lot of people's idea of "all the pre-ANSI compilers in the
world" is "all the ports of PCC that I normally use".  PCC is only one
compiler, for such non-code-generation issues, no matter how many machines
you run it on.  There were a good many non-PCC pre-ANSI compilers.)
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry

gwyn@smoke.brl.mil (Doug Gwyn) (01/09/91)

In article <11228@cae780.csi.com> donald@cae780.csi.com (Donald Maffly) writes:
>	int my/* comment */var;
>Can anyone help me answer this koan????

The Reiser C preprocessor incorrectly implemented comment stripping,
as well as several other portions of the C language specification used
at the time, and some over-clever programmers decided to exploit this
bug in their C code.  It was never valid C usage, and still isn't.
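
For anyone who wants to see what a conforming ANSI preprocessor does with
the comment, here is a minimal sketch.  The macro STR and the function
comment_splits_tokens are hypothetical names made up for this
illustration; nothing like them appears in the original code.  It relies
on the ANSI "#" stringizing operator, so it says nothing about how a
pre-ANSI preprocessor would behave.

```c
#include <string.h>

/* Hypothetical helper: turns its argument into a string literal. */
#define STR(x) #x

/*
 * Returns 1 if the comment between "my" and "var" was treated as
 * white space, i.e. the preprocessor saw two tokens rather than one.
 */
int comment_splits_tokens(void)
{
    /* The comment becomes a single space before the macro argument is
       collected, so the stringized argument should read "my var". */
    return strcmp(STR(my/* comment */var), "my var") == 0;
}
```

On a conforming compiler the function should return 1, which is the same
reason "int my/* comment */var;" declares nothing usable: it reads as
"int my var;".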

greywolf@unisoft.UUCP (The Grey Wolf) (01/17/91)

[ Tried to yank back Blair's article on the ANSI implementation of
  comments, but rn bletched (no rn flames, please!). ]

If comments are expanded to spaces, this kind of breaks things like

#define	operate_with(x) \
	dbm_put(p->pw_/**/e); \
	munge_data(p->pw_/**/e, munge_factor[1]); \
	login->pw_/**/e = p->pw_/**/e;

doesn't it?

How does ANSI token-pasting work? (DON'T say RTFM because I don't *have*
TFM!)  Is it something similar to what I've provided above as an example?

-- 
On the 'Net:  Why are more and more fourth-level wizard(-wannabe)s trying to
invoke ninth-level magic, instead of taking the time to climb the other
(quite essential) thirteen levels so they can do this properly?
...!{ucbvax,acad,uunet,amdahl,pyramid}!unisoft!greywolf

henry@zoo.toronto.edu (Henry Spencer) (01/18/91)

In article <3303@unisoft.UUCP> greywolf@unisoft.UUCP (The Grey Wolf) writes:
>If comments are expanded to spaces, this kind of breaks things like
>
>#define	operate_with(x) \
>	dbm_put(p->pw_/**/e); \
>	munge_data(p->pw_/**/e, munge_factor[1]); \
>	login->pw_/**/e = p->pw_/**/e;
>
>doesn't it?

This macro is already broken; even before ANSI C, there were many C compilers
that wouldn't do what you're expecting with it.

>How does ANSI token-pasting work? (DON'T say RTFM because I don't *have*
>TFM!)  Is it something similar to what I've provided above as an example?

No.  This revolting kludge has been flushed, in favor of a language feature
that doesn't rely so heavily on implementation accidents.  "##" is the
ANSI C token-concatenation operator.
-- 
If the Space Shuttle was the answer,   | Henry Spencer at U of Toronto Zoology
what was the question?                 |  henry@zoo.toronto.edu   utzoo!henry