[net.lang.c] Bug

stephenf@elecvax.SUN (Stephen Frede) (05/29/84)

Newsgroups: net.bugs,net.lang.c
Subject: Bug (?) in C preprocessor

Note the following behaviour of the C preprocessor (V7 and System V):

#define FRED jim
#define QUOTE(x) "x"

QUOTE(FRED)
______________________

	The output of the C preprocessor when fed this fragment is

"FRED"

	Obviously, it is applying the QUOTE() macro to FRED before
re-scanning, and since "FRED" is then quoted, no substitution
takes place. If the order of evaluation were the other way, i.e. to
evaluate what is inside the parentheses first, before applying the
macro QUOTE to it, the result would be

"jim"

	(which is actually what I wanted when I discovered this). I
suggest that this order of evaluation is more sensible, and conforms
more closely to what C itself (and almost all other programming
languages) does (evaluating expressions from within the most deeply
nested level of parentheses first).
	I will probably be changing the preprocessor here to do this,
but would like to see what the net thinks first. Has anyone else
come across this problem?
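
	For comparison, here is a minimal sketch of both expansion orders
as they can be written under the later ANSI/ISO C rules. It assumes a
preprocessor that supports the # (stringize) operator, which the V7 and
System V preprocessors discussed here do not have; under those rules a
parameter inside "x" is no longer replaced at all. XQUOTE and the two
function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <string.h>

#define FRED jim

/* ISO C spelling of QUOTE(): the # operator turns the argument into a
   string literal exactly as written, without expanding it first. */
#define QUOTE(x) #x

/* Hypothetical helper: passing the argument through one extra macro
   level lets it be fully expanded before QUOTE() stringizes it. */
#define XQUOTE(x) QUOTE(x)

/* Argument stringized as written: yields "FRED". */
const char *quoted_unexpanded(void) { return QUOTE(FRED); }

/* Argument expanded first, then stringized: yields "jim". */
const char *quoted_expanded(void) { return XQUOTE(FRED); }
```

	So QUOTE(FRED) matches the output shown above, and XQUOTE(FRED)
gives the "inner first" result without changing the preprocessor.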

Aside:
	If this change is made, then it allows constructs like

#define QUOTE(x) "x"
#include QUOTE(NAME.h)

	where NAME is defined to the preprocessor on the command line
at compile time. Whether this is a terrible thing to do or not is
another story. I wanted it so I could modify maketerms.c in the nroff
system so that I could make terminal driving tables one at a time (a
single "make" at the top of our source directory would then install
a new terminal driving table in /usr/lib/terms without having to create
a whole set). Also, new driving tables could then be added without
modifying maketerms.c.
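
	A sketch of the same idea in ANSI/ISO C terms, assuming a
preprocessor with the # operator and a two-level macro so that NAME is
expanded before it is quoted. NAME would normally be supplied on the
command line (for example with -DNAME=something); the default of
"string" below is an assumption made only so that the fragment builds
on its own by pulling in the standard string.h. XQUOTE and
header_was_included are hypothetical names.

```c
#include <assert.h>

/* Normally given on the command line; this default is only for the sketch. */
#ifndef NAME
#define NAME string
#endif

#define QUOTE(x) #x          /* stringize the argument as written */
#define XQUOTE(x) QUOTE(x)   /* expand the argument, then stringize */

/* With NAME defined as string, this line becomes: #include "string.h" */
#include XQUOTE(NAME.h)

/* Uses strlen() to show that the computed header really was included. */
size_t header_was_included(void) { return strlen("ok"); }
```

	Note that this only works if NAME (and the "h" suffix) are not
themselves macros with some other meaning on the target system.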


					Stephen Frede
					Dept. Computer Science
					University of NSW
					P.O. Box 1
					Kensington 2033
					AUSTRALIA


				...!decvax!mulga!stephenf:elec70b

boyd@basser.UUCP (05/29/84)

No.  The reference manual states:

	"Each occurrence of an identifier mentioned in the
	formal parameter list of the definition is replaced
	by the corresponding token string from the call."

	"...the replacement string is rescanned for more
	defined identifiers."

	"Text inside a string or character constant is not
	subject to replacement."

Hence:

#define quote(x)	"x"
#define gooeys		sooterkin

quote(gooeys) expands to "gooeys" and not "sooterkin".

I have written a C preprocessor that adheres to the most
recent standard.  The standard has these "additions":

	#elif expression
and
	defined(name) in #if expressions

I wouldn't advise a change to the preprocessor to do this alternate
expansion because by definition it would NOT be the C preprocessor.


---
From the VB can of Boyd Roberts.	...!decvax!mulga!boyd:basser

stephenf@elecvax.UUCP (05/29/84)

>   From: boyd@basser.SUN (Boyd Roberts)
>   References: <204@elecvax.SUN>
>   Organization: Dept of C.S., University of Sydney
>   
>   No.  The reference manual states:
>   
>   	"Each occurrence of an identifier mentioned in the
>   	formal parameter list of the definition is replaced
>   	by the corresponding token string from the call."

It also states that subsequent occurrences of identifiers used in
ordinary "#define"s are replaced by the appropriate token-string
wherever they occur. What the manual doesn't specify is any order
for replacement.
    
>    	"...the replacement string is rescanned for more
>    	defined identifiers."

Exactly, but which ones come first?
    
>    	"Text inside a string or character constant is not
>    	subject to replacement."

The identifier is not inside a string or character constant. It happens
to be inside another macro, which may introduce quotes at that point,
when replacement occurs. But not until replacement occurs.
    
>    Hence:
>    
>    #define quote(x)	"x"
>    #define gooeys		sooterkin
>    
>    quote(gooeys) expands to "gooeys" and not "sooterkin".

Your "hence" does not follow at all.
	quote(gooeys)
may first be expanded to
	quote(sooterkin)
and thence to "sooterkin" in complete accordance with the reference
manual. It's not defined.
    
>    I wouldn't advise a change to the preprocessor to do this alternate
>    expansion because by definition it would NOT be the C preprocessor.

By definition where? Not the C reference manual.
    
					- Stephen Frede
					...!decvax!mulga!stephenf:elecvax

ed@mtxinu.UUCP (05/29/84)

If you change the behaviour of the preprocessor, remember that
C requires that the order of evaluation of expressions be
undefined.  Therefore, depending on any evaluation order
will yield unportable code.
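
To illustrate the point about expressions in C proper (this says
nothing about the preprocessor itself), here is a small sketch. The
names left, right and call_order are hypothetical. In the terminology
of the later standards the order is "unspecified": the result of the
sum is fixed, but which operand is evaluated first is not.

```c
#include <assert.h>
#include <string.h>

static char trace[3];   /* records the order in which the calls ran */
static int pos;

static int left(void)  { trace[pos++] = 'L'; return 1; }
static int right(void) { trace[pos++] = 'R'; return 2; }

/* The sum is always 3, but the compiler may call left() or right()
   first, so the trace may come back as "LR" or as "RL". */
const char *call_order(void)
{
    pos = 0;
    memset(trace, 0, sizeof trace);
    (void)(left() + right());
    return trace;
}
```

Code that depends on seeing one particular trace is not portable.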

-- 
Ed Gould
ucbvax!mtxinu!ed

guy@rlgvax.UUCP (Guy Harris) (05/30/84)

> I have written a C preprocessor that adheres to the most
> recent standard.  The standard has these "additions":

> 	#elif expression
> and
> 	defined(name) in #if expressions

The former is very nice - I wish it were available in the Reiser "cpp"
that comes with just about every UNIX under the sun.  The latter is actually
in the most recent standard, if the S5 man page CPP(1) is considered the
most recent standard (it also documents __FILE__ and __LINE__).  The
parentheses are optional (shades of the "return(x)" vs. "return x" debate!)
according to CPP(1).
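
A short sketch of the features mentioned above, assuming a preprocessor
that accepts #elif (the Reiser "cpp" of the time does not). TERM_WIDE,
TERM_NARROW, COLUMNS and the function names are hypothetical and exist
only for the example.

```c
#include <assert.h>

#if defined(TERM_WIDE)          /* defined with parentheses */
#define COLUMNS 132
#elif defined TERM_NARROW       /* defined without parentheses */
#define COLUMNS 40
#else                           /* neither macro is defined */
#define COLUMNS 80
#endif

/* Returns whichever width the conditionals above selected. */
int columns(void) { return COLUMNS; }

/* __LINE__ expands to the current source line number, a positive integer. */
int line_is_positive(void) { return __LINE__ > 0; }
```

Compiled with neither TERM_WIDE nor TERM_NARROW defined, columns()
returns 80; without #elif the same logic needs a nested #if/#else pair.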

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy