[comp.lang.c] preprocessor bug

erik@tcom.stc.co.uk (Erik Corry) (09/12/90)

In article <141513@sun.Eng.Sun.COM> eager@ringworld.Eng.Sun.COM (Michael J. Eager) writes:
>
>The preprocessor is supposed to insert whitespace after the expansion
>of a macro.  If it doesn't and the define is as described, there seems
>to be a preprocessor bug.  
>
Are you sure? The ULTRIX VAX I am using will preprocess:

    #define TRUE -1
    int c;
    c=TRUE;

into

    int c;
    c=-1;

Which gives a compiler warning because it is interpreted as an
old-fashioned version of c-=1. Does this mean the compiler here
is not pcc compliant? Not ANSI compliant? (no surprises there).
Which standard defines the behaviour you describe?

(I am not interested in hearing comments on #define TRUE -1. It
is irrelevant.)

steve@taumet.com (Stephen Clamage) (09/13/90)

erik@tcom.stc.co.uk (Erik Corry) writes:

>In article <141513@sun.Eng.Sun.COM> eager@ringworld.Eng.Sun.COM (Michael J. Eager) writes:
>>
>>The preprocessor is supposed to insert whitespace after the expansion
>>of a macro.  If it doesn't and the define is as described, there seems
>>to be a preprocessor bug.  
>>
>Are you sure? The ULTRIX VAX I am using will preprocess:

>    #define TRUE -1
>    int c;
>    c=TRUE;

>into

>    int c;
>    c=-1;

>Which gives a compiler warning because it is interpreted as an
>old-fashioned version of c-=1. Does this mean the compiler here
>is not pcc compliant? Not ANSI compliant? (no surprises there).
>Which standard defines the behaviour you describe?

One reason the ANSI committee spent so much time on defining preprocessor
behavior was that, so far as I know, there was no complete published
specification of what the preprocessor was supposed to do.  When a
programmer wondered what the effect of a construct was, the solution was
to try it and see; in the programmer's mind, that became the definition.
As long as no other preprocessor was ever used on that code, the code
was ok.  The next preprocessor took a different approach, and the code
failed.  There was no specification at which to point and say "this
proves the preprocessor is wrong."

It is not exactly true that ANSI requires the preprocessor
"to insert whitespace after the expansion of a macro."  What ANSI does
require is that macro expansion never combine two tokens into one, as in
	c=TRUE;
becoming equivalent to
	c =- 1;
In that sense, a preprocessor which translates text into text (as opposed
to one which converts into internal tokens or other data structures)
must insert white space in some cases.  If the compiler is going to
warn about =-, then this is such a case.

>(I am not interested in hearing comments on #define TRUE -1. It
>is irrelevant.)

Sorry, I'm going to comment anyway.  Because not all compilers are ANSI-
conforming, and because among buggy ANSI compilers and non-ANSI compilers
there are differences in preprocessor behavior, it is safer to use
	#define TRUE (-1)
This will prevent the above class of problem from occurring with any kind
of preprocessor.  Because -1 is an expression (negation of constant 1),
it should for safety be enclosed in parentheses, just as you would with
	#define SUM (b + c)
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

burley@world.std.com (James C Burley) (09/14/90)

In article <452@taumet.com> steve@taumet.com (Stephen Clamage) writes:

   erik@tcom.stc.co.uk (Erik Corry) writes:

   >(I am not interested in hearing comments on #define TRUE -1. It
   >is irrelevant.)

   Sorry, I'm going to comment anyway.  Because not all compilers are ANSI-
   conforming, and because among buggy ANSI compilers and non-ANSI compilers
   there are differences in preprocessor behavior, it is safer to use
	   #define TRUE (-1)
   This will prevent the above class of problem from occurring with any kind
   of preprocessor.  Because -1 is an expression (negation of constant 1),
   it should for safety be enclosed in parentheses, just as you would with
	   #define SUM (b + c)
   -- 

   Steve Clamage, TauMetric Corp, steve@taumet.com

I think people would be best advised to #define <whatever> (-1) even when
using true ANSI compilers -- it rules out any chance of a nearby operator
binding to part of the expansion and changing the precedence.  Looking at
my precedence chart, I can't find a realistic case where this matters for
-1 in C, but it is a good habit to get into -- parenthesize your macros to
ensure that the precedence you want is the precedence you'll get.

The #define SUM (b + c) is another good example.  In this case, it seems ok
because (I assume) b and c are not themselves macros (beyond simple
constants, perhaps).

But suppose you do

#define SUM(b,c) (b + c)

Seems ok, right?  Well...now consider this invocation:

SUM(bool ? 5 : 4,i)

This expands to

(bool ? 5 : 4 + i)

Which is equivalent (in terms of precedence) to:

(bool ? 5 : (4 + i))

Hardly what is expected.  Replace the macro definition with

#define SUM(b,c) ((b) + (c))

And you end up with what you expect:

((bool ? 5 : 4) + (i))

This serves to illustrate two very important points about C macros: 1) Use
parentheses to enclose any entity within the expansion that is not a
"constant" (i.e. either an argument of the macro or itself another macro that
might not have its own parentheses); 2) Switch to C++ and use inline functions
instead, where things are much clearer!  (-:

James Craig Burley, Software Craftsperson    burley@world.std.com