[comp.std.c] Questions about ANSI constant expressions

diamond@csl.sony.co.jp (Norman Diamond) (09/26/89)

According to the standard, a compiler may optionally fold the
following constants:

[a]    a = 5.0 + 6.0;

[b]    b = 5 + 6;

[c]    c = 5000000000 + 6000000000;  /* five billion + six billion */

If the compiler chooses to fold [a], it is required to provide at least
the same range and precision at compile time as the target machine
will provide at execution time.
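
(For instance, on a hypothetical cross-compiler whose host doubles are
narrower than the target's:

	double third = 1.0 / 3.0;	/* if folded on the host, the result
					   must carry at least the accuracy
					   the target would produce at
					   runtime */

so folding with host arithmetic alone could lose low-order bits.)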

If the compiler chooses to fold [b] or [c], the standard does not seem
to place any restriction on compile time accuracy.

Intuitively, [b] might be expected to fold properly, because the host
machine might be required to handle longs of up to around two billion
(there might be a self-compiler on the host machine).  But does the
standard actually require this?
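
(Section 2.2.4.2 does set that floor, and a self-compiler could rely
on it; a sketch:

	#include <limits.h>
	/* every conforming implementation promises LONG_MAX >= 2147483647,
	   i.e. longs of at least 32 bits -- about two billion */
	long floor_check = LONG_MAX;

but whether folding must then be *correct* is the question.)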

Case [c] becomes interesting if the host machine has 32-bit longs but
the target machine has 64-bit longs.  A sensible implementor might
decline to fold the constant.  I believe the standard permits a lack
of folding, and it permits correct folding.  But does it actually
require one of these?  As far as I can tell, it permits incorrect
folding as well.

The situation becomes more important in the case of a preprocessor
"#if" directive.  Cases [b] and [c] are required to be computed at
compile-time, and accuracy must be guaranteed up to a range of about
two billion.  So case [b] must be folded properly in the case of an
"#if".  What about case [c]?  Is the preprocessor required to simulate
the target machine's range, or is it allowed to produce incorrect
results when they exceed approx. two billion?

(Thanks to a friend, who probably prefers to remain anonymous, for
kindly providing an illegal copy of the draft standard.)

--
Norman Diamond, Sony Corporation (diamond@ws.sony.junet)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

dfp@cbnewsl.ATT.COM (david.f.prosser) (09/28/89)

In article <10880@riks.csl.sony.co.jp> diamond@ws.sony.junet (Norman Diamond) writes:
>According to the standard, a compiler may optionally fold the
>following constants:
>
>[a]    a = 5.0 + 6.0;
>
>[b]    b = 5 + 6;
>
>[c]    c = 5000000000 + 6000000000;  /* five billion + six billion */
>
>If the compiler chooses to fold [a], it is required to provide at least
>the same range and precision at compile time as the target machine
>will provide at execution time.

It is worth noting that the pANS does not require the folding of any
floating expression.

>If the compiler chooses to fold [b] or [c], the standard does not seem
>to place any restriction on compile time accuracy.

This is covered by all the descriptions in section 3.3 (expressions)
and the final paragraph of section 3.4 (constant expressions):

	The semantic rules for the evaluation of a constant expression
	are the same as for non-constant expressions.

>Intuitively, [b] might be expected to fold properly, because the host
>machine might be required to handle longs of up to around two billion
>(there might be a self-compiler on the host machine).  But does the
>standard actually require this?

If the expression's evaluation is completely well-defined (no
overflow, no shift by a negative count, and so on), the folded result
must be the same answer as would have been produced had the expression
been evaluated at runtime.
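
A sketch of the dividing line, assuming a host and target with 16-bit
ints (the section 2.2.4.2 minimum):

	int ok     = 100 * 300;	/* 30000 fits: completely well-defined,
				   so folding must match runtime
				   evaluation */
	int toobig = 200 * 300;	/* 60000 overflows a 16-bit int:
				   undefined, so no particular folded
				   value is owed */
	int shifty = 1 << -1;	/* shift by a negative count: likewise
				   undefined */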

>Case [c] becomes interesting if the host machine has 32-bit longs but
>the target machine has 64-bit longs.  A sensible implementor might
>decline to fold the constant.  I believe the standard permits a lack
>of folding, and it permits correct folding.  But does it actually
>require one of these?  As far as I can tell, it permits incorrect
>folding as well.

Only the situations specified as requiring constant expressions force
the translator to attempt the constant folding.  (These include all
contexts in which one might have a null pointer constant, since the
translator must notice a zero-valued integral constant expression; but
in that case, an inability to evaluate the expression at compile time
is not an error.)  If the target machine has "n" bits for longs, the
host implementation must be able to evaluate constant expressions that
require n-bit values.

Since overflowing the target's appropriate integral size results in
undefined behavior, an implementation need not emulate all aspects
of the target's behavior.
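
For instance (a sketch), given a target with 64-bit longs:

	/* both operands fit in the target's long, but the sum overflows
	   it; that overflow is undefined, so the host need not reproduce
	   the wraparound the target's hardware might produce (though in a
	   context that forces folding, a diagnostic is in order) */
	long huge = 9000000000000000000 + 9000000000000000000;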

>The situation becomes more important in the case of a preprocessor
>"#if" directive.  Cases [b] and [c] are required to be computed at
>compile-time, and accuracy must be guaranteed up to a range of about
>two billion.  So case [b] must be folded properly in the case of an
>"#if".  What about case [c]?  Is the preprocessor required to simulate
>the target machine's range, or is it allowed to produce incorrect
>results when they exceed approx. two billion?

The pANS has special rules for the evaluation of #if and #elif
constant expressions.  These exist for two reasons: many preprocessing
implementations are completely divorced from the compiler proper, and
not all of the necessary translation phases (see section 2.1.1.2) have
occurred by the time #if and #elif directives are handled.

For these directives, all evaluation occurs either in long or unsigned
long, and the only guarantees about the sizes of these types are the
minimums of section 2.2.4.2 (i.e. at least 32 bits).  Moreover,
character constants need not have the same value as they do in the
compiler proper.
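
For example, on an ASCII host preprocessing for an EBCDIC target (a
hypothetical pairing):

	#if 'A' == 65
	/* may be taken here even though 'A' is 193 in expressions that
	   the compiler proper evaluates for the target */
	#endif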

However, a diagnostic is still required whenever a forced evaluation
of a constant expression produces a value that does not fit in an
object of the expression's type.

For a host machine that provides 32-bit #if and #elif arithmetic,

	#if 5 + 6 == 11

must evaluate to true.  However,

	#if 5000000000 + 6000000000 == 11000000000

must cause at least one diagnostic (provided no prior problems were
found, and this directive is not being skipped) because none of these
constants fit in 32 bits.  If the constants were small enough to fit,
but the sum wasn't, the preprocessing portion must still complain.
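
For example, with the same 32-bit preprocessor arithmetic:

	#if 2000000000 + 2000000000 > 0
	#endif

must draw a diagnostic: each constant fits in a signed 32-bit long,
but their sum exceeds LONG_MAX.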

Given the same 32-bit host compiling for a 64-bit target, it must handle

	switch (<expr>)
	{
	case 5 + 6:
		<stmt>
	case 5000000000 + 6000000000:
		<stmt>
	}

But even this host must complain about

	enum { big = 5000000000 * 6000000000 };

since 5000000000 * 6000000000 is 3e19, which is too big even for 64
bits (2^64 - 1 is about 1.8e19).

Dave Prosser	...not an official X3J11 answer...