[net.lang.c] expressions and #if

craig@BBN-LABS-B.ARPA (07/10/84)

From:  Craig Partridge <craig@BBN-LABS-B.ARPA>


[I don't believe I have seen this issue discussed before, though I was
off the net for a while].

    I have been working a little bit on extending a program that reads
C programs and replaces all #ifdef'ed code with the code for which
the condition applies.  For example, given the code
-------------------
#ifdef FOO
    a = b;
#else
    b = a;
#endif
-------------------
and FOO not defined, the program replaces all this stuff with simply
-------------------
    b = a;
-------------------
My extension is to add support for the #if statement, and I have run
into several definitional problems.  The problems come from several
different directions -- I will try to break them up logically.
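
A rough sketch of the #ifdef-only case described above, to make the
starting point concrete.  This is not the actual program; the function
name resolve_ifdefs(), the MAX_DEPTH limit, and the "list of defined
symbols" interface are all invented for illustration.  It assumes the
'#' is in column one and knows only #ifdef, #else, and #endif.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 32   /* assumed nesting limit; deeper nesting is not handled */

/* Return 1 if the len-byte name appears in the list of defined symbols. */
static int is_defined(const char *name, size_t len,
                      const char *const *defs, size_t ndefs)
{
    for (size_t i = 0; i < ndefs; i++)
        if (strlen(defs[i]) == len && strncmp(defs[i], name, len) == 0)
            return 1;
    return 0;
}

/* Copy src to dst (capacity cap >= 1), dropping the directive lines and
 * every line that sits in a dead #ifdef/#else branch.  Returns the
 * number of bytes written, not counting the terminating NUL. */
size_t resolve_ifdefs(const char *src, const char *const *defs, size_t ndefs,
                      char *dst, size_t cap)
{
    int live[MAX_DEPTH];   /* live[d]: are lines at nesting depth d emitted? */
    int depth = 0;
    size_t out = 0;

    live[0] = 1;
    while (*src) {
        const char *eol = strchr(src, '\n');
        size_t len = eol ? (size_t)(eol - src) + 1 : strlen(src);

        if (strncmp(src, "#ifdef", 6) == 0 && depth + 1 < MAX_DEPTH) {
            const char *name = src + 6;
            name += strspn(name, " \t");
            size_t nlen = strcspn(name, " \t\n");
            depth++;
            live[depth] = live[depth - 1] && is_defined(name, nlen, defs, ndefs);
        } else if (strncmp(src, "#else", 5) == 0 && depth > 0) {
            /* flip the branch, but only if the enclosing level is live */
            live[depth] = live[depth - 1] && !live[depth];
        } else if (strncmp(src, "#endif", 6) == 0 && depth > 0) {
            depth--;
        } else if (live[depth] && out + len < cap) {
            memcpy(dst + out, src, len);
            out += len;
        }
        src += len;
    }
    dst[out] = '\0';
    return out;
}
```

Fed the five-line example above with FOO absent from the list, this
leaves just the "    b = a;" line.  Supporting #if means replacing
is_defined() with an expression evaluator, which is where the questions
below come from.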

    The definition of #if is that it evaluates a constant-expression
(K&R 12.3 and 15).  But constant-expression is not very well defined.
First, it is only defined for case, array bounds and initializers.
The definitions differ between the first two and initializers.  Since
initializers seem to be viewed as a very exceptional case, I have
assumed that #if is akin to case or array bounds.  In these cases
a constant-expression is defined as involving

	"only integer constants, character constants, and sizeof
	expressions, possibly connected by the binary operators

		+ - * / % & | ^ << >> == != < > <= >=

	or by the unary operators

		- ~

	or by the ternary operator

		?:

	Parentheses can be used for grouping, but not for function calls"

O.K.  now for the problems: 

    (1) the unary operator ! is not included in the list of legitimate
    operators.  This makes #if much less useful.  Any views on whether
    this is intentional or just a typo?  Same question for && and ||.

    (2) The Berkeley code seems to include a new macro operator
    defined(x) which returns 1 if x is defined, 0 otherwise.
    Is this now a generally accepted feature or just a Berkeley extension?

    (3) the fact that there are two types of constants raises some
    interesting troubles.  Is one required to support type casts in the
    constant-expression?
    
    (4) Related to (3), must sizeof(expression) be supported?  If
    so can "expression" be any expression or just a constant-expression?
    Can "expression" contain type casts?

    (5) /lib/cpp on 4.2 supports a few features that are not mentioned
    in the standards.  For example, the comma operator is supported in
    the constant-expression.  Are there other extensions people know
    of -- are they generally accepted?

    (6) A nice definitional problem to finish off the list.
    #if is clearly a preprocessor statement.  But there is no
    definition in the manual of how much the preprocessor is
    supposed to know about.  The manual says simply that the preprocessor
    is "capable of macro substitution, conditional compilation and
    inclusion of include files."  No mention is made of expression
    handling, although #if clearly requires it.  Just how bright
    is the preprocessor required to be?

Interested to hear people's views,

Craig Partridge
craig@bbn-unix  (ARPA)
bbncca!craig	(USENET)

buck@NRL-CSS.ARPA (07/11/84)

From:  Joe Buck <buck@NRL-CSS.ARPA>


A look at the yacc grammar for cpp on 4.1bsd shows that yes, the !, &&,
and || operators are supported.  The C preprocessor for 4.1bsd knows
about the following operators:

*  /  %  +  -  <<  >>  <  >  >=  <=  ==  !=  &  ^  |  &&  ||  ?:  ,
UMINUS  !  defined

where UMINUS is the unary minus.  "defined" is handled as an operator,
not a macro; defined(foo) returns 1 if foo is defined and zero
otherwise. As an operator, it is only evaluated on lines beginning with
"#".  Apparently, "defined foo" is equivalent to "defined(foo)".

Once it's decided that any cpp must support some of the unary and binary
operators, the extra work to include others is trivial, so it seems to
me that the K&R listing of legal operators was an error on their part.
"defined", on the other hand, I don't know about.
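
To illustrate how little extra work each operator is, here is a small
precedence-climbing evaluator.  It is only a sketch and is not taken
from the 4.1bsd yacc grammar: the names (eval_if_expr, ops, cur) are
made up, it handles integer constants only, and it leaves out ?:, the
comma operator, and "defined".  Adding a binary operator means adding
one table entry and one case in apply().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct op { const char *text; int prec; };

/* Binary operators; two-character spellings come first so that "<="
 * is never mistaken for "<".  Higher prec binds tighter. */
static const struct op ops[] = {
    {"||", 1}, {"&&", 2}, {"==", 6}, {"!=", 6}, {"<=", 7}, {">=", 7},
    {"<<", 8}, {">>", 8}, {"|", 3}, {"^", 4}, {"&", 5}, {"<", 7},
    {">", 7}, {"+", 9}, {"-", 9}, {"*", 10}, {"/", 10}, {"%", 10},
};

static const char *cur;   /* cursor into the expression text */

static void skip_ws(void) { while (isspace((unsigned char)*cur)) cur++; }

static const struct op *peek_op(void)
{
    skip_ws();
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++)
        if (strncmp(cur, ops[i].text, strlen(ops[i].text)) == 0)
            return &ops[i];
    return NULL;
}

static long apply(const char *op, long a, long b)
{
    switch (op[0]) {
    case '|': return op[1] ? (a || b) : (a | b);
    case '&': return op[1] ? (a && b) : (a & b);
    case '^': return a ^ b;
    case '=': return a == b;
    case '!': return a != b;
    case '<': return op[1] == '<' ? (a << b) : op[1] == '=' ? (a <= b) : (a < b);
    case '>': return op[1] == '>' ? (a >> b) : op[1] == '=' ? (a >= b) : (a > b);
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return b ? a / b : 0;   /* division by zero yields 0 here */
    default:  return b ? a % b : 0;   /* '%' */
    }
}

static long parse_expr(int min_prec);

static long parse_unary(void)
{
    skip_ws();
    if (*cur == '!') { cur++; return !parse_unary(); }
    if (*cur == '-') { cur++; return -parse_unary(); }
    if (*cur == '~') { cur++; return ~parse_unary(); }
    if (*cur == '(') {
        cur++;
        long v = parse_expr(1);
        skip_ws();
        if (*cur == ')') cur++;
        return v;
    }
    char *end;
    long v = strtol(cur, &end, 0);   /* base 0: decimal, octal, or hex */
    cur = end;
    return v;
}

static long parse_expr(int min_prec)
{
    long lhs = parse_unary();
    const struct op *o;
    while ((o = peek_op()) != NULL && o->prec >= min_prec) {
        cur += strlen(o->text);
        long rhs = parse_expr(o->prec + 1);   /* all operators left-associative */
        lhs = apply(o->text, lhs, rhs);
    }
    return lhs;
}

long eval_if_expr(const char *expr) { cur = expr; return parse_expr(1); }
```

Both operands of && and || are always evaluated here; with nothing but
constants the only visible difference is division by zero, which this
sketch quietly turns into 0.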


ARPA: buck@nrl-css.ARPA
UUCP: ...!{decvax,linus,umcp-cs}!nrl-css!buck

-Joe

guy@rlgvax.UUCP (07/14/84)

"defined" is given in the System V documentation for "cpp" (along with
__FILE__ and __LINE__; anybody for a "used, but not documented" error
message from "lint"? :-)  They've been around since V7, and the "assert.h"
include file depends on __FILE__ and __LINE__).  I suspect it may have made
it into C language references later than the one issued in K&R.  All three
of those features, I believe, first entered C in V7; the C preprocessor was
completely rewritten, making it considerably faster (I verified the figures
John Reiser - the author of the V7 preprocessor - quotes for speed improvement
a while ago) and adding those features.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

guy@rlgvax.UUCP (07/14/84)

"cpp" doesn't support "sizeof" in #if expressions; it would be nice, but
would also mean that "cpp" would have to swallow a good deal more of C's
grammar, would have to know what machine it was running on, and would have
to parse all "typedef" statements that passed by it - in effect, it would
have to absorb a good deal of the C compiler.  A C compiler with macro
expansion built into it might be better able to support "sizeof" in #if.
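
A sketch of the usual workaround, assuming a compiler that supplies
<limits.h>: test the range macros in #if, and leave anything that needs
sizeof to the compiler proper.  WIDE_INT, int_is_wide(), and the typedef
name are made up for illustration.

```c
#include <limits.h>

/* Preprocessor-time test: uses the INT_MAX constant, not sizeof. */
#if INT_MAX >= 2147483647
#define WIDE_INT 1
#else
#define WIDE_INT 0
#endif

/* Compiler-time test: sizeof is available here.  A negative array size
 * is an error, so this typedef refuses to compile if long is narrower
 * than int. */
typedef char long_not_narrower_than_int[(sizeof(long) >= sizeof(int)) ? 1 : -1];

int int_is_wide(void) { return WIDE_INT; }
```

The typedef trick can reject a build but cannot select between two
pieces of code, which is the part that would need sizeof in #if.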

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

brownell@harvard.ARPA (Dave Brownell) (07/16/84)

Contrast smarter and smarter CPP constant evaluation with the approach
taken in Ada, which is to insist that the compilers optimize out unreachable
code.  I frankly prefer this;  contrast

    #ifdef	FEATURE
	if (not_otherwise_indicated()) {    /* el hacko gross-me-out */
    #endif	FEATURE
	process ();
	...
    #ifdef	FEATURE
	}
    #endif	FEATURE

(which I have seen in some System V code, by the way) with code like

    if (FEATURE == disabled || not_otherwise_indicated()) {
	process ();
    }

I don't think I'm alone in preferring the second option.  In code heavily
parameterized with #ifdefs or #ifs this seems sooo much more readable ...

Thoughts/flames, anyone ???


Dave Brownell
{allegra,floyd,ihnp4,seismo}!harvard!brownell

ron@brl-tgr.ARPA (Ron Natalie <ron>) (07/16/84)

Or maybe it would be better if the compiler swallowed CPP's functionality.

henry@utzoo.UUCP (Henry Spencer) (07/20/84)

> Contrast smarter and smarter CPP constant evaluation with the approach
> taken in Ada, which is to insist that the compilers optimize out unreachable
> code.  ...

The problem is that this doesn't work at all for declarations.
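
A minimal illustration, with made-up names (struct record,
feature_count, record_extra): an "if" on a constant can be optimized
away inside a function body, but no run-time test can add or remove a
structure member.

```c
#define FEATURE 1   /* hypothetical configuration switch */

struct record {
    int id;
#if FEATURE
    int feature_count;   /* member exists only when FEATURE is on */
#endif
};

int record_extra(const struct record *r)
{
#if FEATURE
    return r->feature_count;
#else
    (void)r;             /* no such member in this configuration */
    return 0;
#endif
}
```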
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

thoth@tellab2.UUCP (Marcus Hall) (07/24/84)

How about using the cpp #ifdef this way:

#ifdef	FEATURE
	if (not_otherwise_indicated())
#endif	FEATURE
		{
		process ();
		...
		}

To me it is just as clear as (maybe even a little clearer than) saying:

	if (FEATURE == disabled || not_otherwise_indicated()) {
		process ();
		...
		}

Plus, it is more obvious that the test is optionally compiled.  I know that
  this gets unwieldy when there are lots of #ifs around in a file.

My objection to the latter form is that I keep trying to relate the expression
   'FEATURE == disabled' to the execution of the code, not to the execution
   of the test.

marcus
..!ihnp4!tellab1!tellab2!thoth

bsafw@ncoast.UUCP (The WITNESS) (07/30/84)

> How about using the cpp #ifdef this way:
> 
> #ifdef	FEATURE
>	if (not_otherwise_indicated())
> #endif	FEATURE
> 		{
> 		process ();
> 		...
> 		}
> 
> To me it is just as clear as (maybe even a little clearer than) saying:
> 
>	if (FEATURE == disabled || not_otherwise_indicated()) {
>		process ();
>		...
>		}
>

But not_otherwise_indicated() will be executed in the second version, and
I've seen a LOT of code that uses side-effects... in which case the second
version may do something disastrous.  If not_otherwise_indicated() is just
a simple Boolean test, however, it'll work (if not as fast as it might).
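
For what it's worth, C's || stops as soon as its left operand is
nonzero, so whether the call happens depends on the value of FEATURE:
it is skipped when FEATURE == disabled and made otherwise.  A minimal
sketch to check this; the counter, the enum values, and would_process()
are invented for the test, and "feature" stands in for the FEATURE
macro.

```c
int calls;   /* counts invocations of the test function */

int not_otherwise_indicated(void) { calls++; return 0; }

enum { disabled = 0, enabled = 1 };

/* Returns nonzero if process() would run for this setting. */
int would_process(int feature)
{
    return feature == disabled || not_otherwise_indicated();
}
```

Calling would_process(disabled) leaves calls untouched; calling
would_process(enabled) bumps it by one.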
-- 
		Brandon Allbery: decvax!cwruecmp{!atvax}!bsafw
		  6504 Chestnut Road, Independence, OH 44131

		  Witness, n.  To watch and learn, joyously.

jim@ism780b.UUCP (08/02/84)

ism780b!jim    Jul 17 22:35:00 1984

The way I did this was to add a flag to cpp to only instantiate those
symbols given with -U and -D.  It was a lot of work.  I actually rewrote
cpp from scratch, since the original code was too sick to deal with directly.
I avoided all your questions of what the semantics of cpp are, by retaining
the behavior of the Bell version (I have source) when not clearly buggy and by
guaranteeing that my unifer and cpp had identical semantics by in fact being
the same code.  But, for specific answers:

>   (1) the unary operator ! is not included in the list of legitimate
>    operators.  This makes #if much less useful.  Any views on whether
>    this is intentional or just a typo?  Same question for && and ||.

Typos.  Bell's cpp supports all the C operators, including "!", "&&", "||",
"?:", and ",".

>    (2) The Berkeley code seems to include a new macro operator
>    defined(x) which returns 1 if x is defined, 0 otherwise.
>    Is this now a generally accepted feature or just a Berkeley extension?

It is a general feature, and is in the SysV manual and the proposed
ANSI standard.
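
A short illustration of the operator in use, assuming a preprocessor
that accepts both the parenthesized and the bare spelling.  FOO, BAR,
CONFIG, and config() are placeholder names.

```c
#define FOO          /* defined, with an empty expansion */

#if defined(FOO) && !defined BAR    /* both spellings of "defined" */
#define CONFIG 1
#else
#define CONFIG 2
#endif

int config(void) { return CONFIG; }
```

Unlike #ifdef, this form lets several symbols be tested in one
condition.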

>    (3) the fact that there are two types of constants raises some
>    interesting troubles.  Is one required to support type casts in the
>    constant-expression?
    
Bell's cpp does not support type casts in #if expressions; since the only kind
of constants it deals with are ints (even if there is a trailing "l" or "L"!),
there isn't much point.

>    (4) Related to (3), must sizeof(expression) be supported?  If
>    so can "expression" be any expression or just a constant-expression?
>    Can "expression" contain type casts?

Bell's cpp does not support sizeof().  And it really should be
"sizeof lvalue" or "sizeof(type)".  There are no lvalues in cpp, and it
would not be reasonable for it to handle other than basic types without
being built into the compiler.

>    (5) /lib/cpp on 4.2 supports a few features that are not mentioned
>    in the standards.  For example, the comma operator is supported in
>    the constant-expression.  Are there other extensions people know
>    of -- are they generally accepted?

Berkeley hasn't added any functionality that wasn't in some Bell version.
And since there is no formal documentation of cpp, it is hard to talk
about extensions.  comma and defined have been there since the PTS
(slightly pre-V7) version; they just haven't ever been documented.

>    (6) A nice definitional problem to finish off the list.
>    #if is clearly a preprocessor statement.  But there is no
>    definition in the manual of how much the preprocessor is
>    supposed to know about.  The manual says simply that the preprocessor
>    is "capable of macro substitution, conditional compilation and
>    inclusion of include files."  No mention is made of expression
>    handling, although #if clearly requires it.  Just how bright
>    is the preprocessor required to be?

The brightness is determined by the functional spec, which includes
macro definitions and use ("macro substitution"), #if, #ifdef, etc.
("conditional compilation"; actually cpp isn't capable of any compilation at
all!), #include ("inclusion of include files" (like the Department of
Redundancy Department?)), and #if constant-expression, which certainly
implies "expression handling" (are you somehow expecting it to do something
with expressions elsewhere in the file?).  For the best (but not very good)
spec available, see the cpp.1 article in the SysV manual.

-- Jim Balter, INTERACTIVE Systems (ima!jim)