[comp.std.c] Semi constant expressions

wittig@gmdzi.UUCP (Georg Wittig) (08/30/89)

I haven't seen this topic in comp.lang.c and comp.std.c in the last months ...
 
My question is about semi constant expressions, i.e. about expressions of the
form
	0 * x			(x non-constant)
	0 / x
	x << (100000000L)
	etc.

Does (p)ANSI C say anything about which code should be generated for such
expressions? K&R I and Harbison&Steele II don't seem to cover this. Does ANSI
C? (I don't have the ANSI C papers, so please `cat flames > /dev/null`)

Three possibilities come to my mind, how code can be generated for those
expressions:

1)	Compilers are free to generate code as if the constant expression `0'
	had been seen, even if `x' can have side effects (`x' could be a
	function call `foo()').

2)	If `x' has no side effects, prentending a constant expression `0'
	might be o.k.; however if `x' can have side effects, compilers should
	interpret `0 * foo()' as if they had seen `(foo(), 0)'.

3)	Compilers must not think about such binary expressions and must
	always generate code for multiplication, division. etc.

H&S II p. 162: "both operands are fully evaluated (but in no particular order)
before the operation is performed." However, nothing is said about optimization
in those semi constant expressions. Nevertheless, that sentence seems to imply
that case 3 above (and may be case 2 also) is the correct solution. Is ANSI C
more specific at these cases?

BTW:
- How about `int x,y; x=0; y=x*foo()' thrown to a good optimizing compiler?
  What is it allowed to optimize away?

- Does there exist an offical word for what I called `semi constant
  expression'?


-- 
Georg Wittig		     email:wittig@gmdzi.uucp   phone:(+49 2241) 14-2294
German National Research Laboratory for Computer Science (GMD)
P.O. Box 1240				|"Freedom's just another word for noth-
D-5205 St. Augustin 1 (West Germany)	| ing left to lose" (K. Kristofferson)

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/31/89)

In article <1237@gmdzi.UUCP> wittig@gmdzi.UUCP (Georg Wittig) writes:
>	0 * x			(x non-constant)
>	0 / x
>	x << (100000000L)
>	etc.
>Does (p)ANSI C say anything about which code should be generated for such
>expressions?

This falls naturally out of the specification.  The operands ARE evaluated,
so if `x' has side-effects they do occur.  If evaluation would produce what
the Standard labels as "undefined behavior", then the implementation can do
anything it likes including taking short cuts, but in cases where the
behavior is well defined the only short cuts permitted are those that
produce the same results as doing it the long way.

The Standard neither prohibits nor requires optimizations.  It prescribes
what the program behavior must be.

msb@sq.sq.com (Mark Brader) (09/01/89)

>	x << (100000000L)

Unlike the 0*x and 0/x examples, this one is explicitly undefined behavior
(assuming that x is of an integral type less than 100000000 bits wide!).

See 3.3.7:
	If the value of the right operand is negative or is greater
	than or equal to the width in bits of the promoted left
	operand, the behavior is undefined.

The reason for this restriction is to simplify implementation on computers
where the shift instruction's amount-to-shift field is only lg2(wordsize)
bits long -- that is, is only 5 bits if ints are 32 bits, for example --
or where only that many bits are actually examined, or where a software
shift is done and has similar restrictions.

By the way, the L on the right operand has no effect.  In the proposed
standard the type of the result of << is the promoted type of the left
operand, so it doesn't matter whether the right operand is int or long,
and the plain constant 100000000 would be one or the other depending on the
size of ints.


-- 
Mark Brader		    "...most mistakes are made the last thing before
SoftQuad Inc., Toronto	        you go to bed.  So go to bed before you do
utzoo!sq!msb, msb@sq.com	the last thing."	-- David Jacques Way

This article is in the public domain.

flaps@dgp.toronto.edu (Alan J Rosenthal) (09/01/89)

msb@sq.sq.com (Mark Brader) writes:
>>	x << (100000000L)
>By the way, the L on the right operand has no effect.  ...
>and the plain constant 100000000 would be [int] or [long] depending on the
>size of ints.

But a good compiler might warn you if you say "100000000" and it gets promoted
to long.  (For example, not that this compiler is otherwise good, later
versions of the QNX C compiler.)  Of course, compilers can warn about anything,
but this sounds like a reasonable warning.

ajr

dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) (09/05/89)

In article <10885@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <1237@gmdzi.UUCP> wittig@gmdzi.UUCP (Georg Wittig) writes:
>>	0 * x			(x non-constant)
>>	0 / x
>>	x << (100000000L)
>>	etc.
>>Does (p)ANSI C say anything about which code should be generated for such
>>expressions?
>This falls naturally out of the specification.  The operands ARE evaluated,
>so if `x' has side-effects they do occur.  If evaluation would produce what
>the Standard labels as "undefined behavior", then the implementation can do
>anything it likes including taking short cuts, but in cases where the
>behavior is well defined the only short cuts permitted are those that
>produce the same results as doing it the long way.

How can you tell what will happen when 'doing it the long way' ? Even C
has contructs which behave quite different from using the short cut and
the long way, e.g. the next construct won't work the long way:
      if ((i != 0) && (x/i > 0))     /* assume i and x integers */
(Yes I know the && is defined to always use the short cut, this example is
used only to make my point clear, at least I hope :-).
So I would say (from a compiler builders standpoint of view): 
   once you see a way to determind the result of an expression on compile
   time then use this result, and do not execute the expression on runtime.
This means that in the examples of Georg, I would like to have the results
of the expressions (i.e. 0) in my code.
I even think this should be used on constructs like (x always becomes 0):
    x = 0 * i++;     /* no auto-increment of i  */
    x = 0 * foo();   /* no function call to foo */
    x = 0 * foo(i++);/* no function call to foo, and no inc on i */
A compiler warning seems appropriate to me on this point.
Another problem given by Georg was:
    x = 0;
    y = x * foo(i++);
This is a bit harder. The side effects are hidden. I hink in this case the
function has to be called and the i incremented. I do not think it is allowed
to compile this last line as:
   if (x != 0) foo(i++);
   y = 0;
-- 
Dolf Grunbauer          Tel: +31 55 432764  Internet dolf@idca.tds.philips.nl
Philips Telecommunication and Data Systems  UUCP ....!mcvax!philapd!dolf
Dept. SSP, P.O. Box 245, 7300 AE Apeldoorn, The Netherlands

chris@mimsy.UUCP (Chris Torek) (09/05/89)

In article <242@ssp1.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl
(Dolf Grunbauer) writes:
>I even think this should be used on constructs like (x always becomes 0):
>    x = 0 * i++;     /* no auto-increment of i  */
>    x = 0 * foo();   /* no function call to foo */
>    x = 0 * foo(i++);/* no function call to foo, and no inc on i */

The point of Doug Gwyn's answer is that the pANS constrains the
effects of legal C code, not the actual code generated by the compiler.
In particular, in well-defined expressions (which includes `0 * x',
for all well-defined x), all side effects must take place.  So all
three of the above comments are wrong.

>A compiler warning seems appropriate to me on this point.

Here I agree.

>Another problem given by Georg was:
>    x = 0;
>    y = x * foo(i++);
>This is a bit harder. The side effects are hidden.

The side effects must always take place, whether optimised or not.  The
only thing an optimiser can do about

	x = 0 * i++;

is replace it with the sequence

	i++, x = 0;

(The `i++' can then be deleted if and only if i is a dead variable.)
Likewise, for

	x = 0 * foo();

the compiler can use

	foo(), x = 0;

and the call can only be deleted if foo() is a pure function (has no
side effects).  Most compilers do not even attempt to begin to consider
thinking about possibly looking for `function purity', although some
are starting to cogitate on the possibility of maybe doing so.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/05/89)

In article <242@ssp1.idca.tds.philips.nl> dolf@idca.tds.PHILIPS.nl (Dolf Grunbauer) writes:
>In article <10885@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>In article <1237@gmdzi.UUCP> wittig@gmdzi.UUCP (Georg Wittig) writes:
>>>	0 * x			(x non-constant)
>>>Does (p)ANSI C say anything about which code should be generated for such
>>>expressions?
>>This falls naturally out of the specification.  The operands ARE evaluated,
>>so if `x' has side-effects they do occur.  If evaluation would produce what
>>the Standard labels as "undefined behavior", then the implementation can do
>>anything it likes including taking short cuts, but in cases where the
>>behavior is well defined the only short cuts permitted are those that
>>produce the same results as doing it the long way.
>How can you tell what will happen when 'doing it the long way' ?

I thought this was clear from what I said and from the context.  In the
case
	0 * x
if evaluation of the expression `x' would have observable side effects
then they MUST occur for a correct implementation of C.  In such a case,
an implementation that replaced the expression "0 * x" with the constant
"0" as an "optimization" would not be standard conforming.

The run-time behavior of a C program is defined in terms of the operation
of a "virtual machine" (with some deliberately fuzzy areas in its
specification, so it's really a class of virtual machines) and input/output
operations (which are usually what makes the program's behavior observable).
A correct implementation must produce observable results identical to those
that would be produced by the virtual machine (actually, by some member of
the class of virtual machines).

henry@utzoo.uucp (Henry Spencer) (09/06/89)

In article <10885@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>	0 * x			(x non-constant)
>
>This falls naturally out of the specification.  The operands ARE evaluated,
>so if `x' has side-effects they do occur... in cases where the
>behavior is well defined the only short cuts permitted are those that
>produce the same results as doing it the long way.

A historical note on this...

-------------
From decvax!harpo!eagle!mhtsa!alice!research!dmr Mon Jan 31 02:52:38 1983
Subject: foo()*0
Newsgroups: net.lang.c

A couple of years ago I changed my C compiler not to throw out
0*x, 0&x, and the like where  x  is an expression with side effects.
I believed then and now that anyone who depended on such things was
mad, and the recent examples have not convinced me otherwise.
However, it was much easier to change the compiler than to attempt
to argue the implausibility of each carefully crafted example...

		Dennis Ritchie
-------------
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/06/89)

In article <1989Sep5.180315.25627@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>From decvax!harpo!eagle!mhtsa!alice!research!dmr Mon Jan 31 02:52:38 1983
>A couple of years ago I changed my C compiler not to throw out
>0*x, 0&x, and the like where  x  is an expression with side effects.
>I believed then and now that anyone who depended on such things was
>mad, and the recent examples have not convinced me otherwise.
>However, it was much easier to change the compiler than to attempt
>to argue the implausibility of each carefully crafted example...

I think most of us would agree that it is "mad" to write such an
expression.  However, C provides a preprocessor which is perfectly
capable of rewriting an innocuous expression so that such a case
results.  Depending on configuration parameters, one's code could
mysteriously malfunction yet look reasonable when you stare at the
source code.  It is "friendlier" for patent operations in  the
source code to actually be performed as the programmer intended.

bvs@light.uucp (Bakul Shah) (09/06/89)

In article <10937@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>  ...
>The run-time behavior of a C program is defined in terms of the operation
>of a "virtual machine" (with some deliberately fuzzy areas in its
>specification, so it's really a class of virtual machines) and input/output
>operations (which are usually what makes the program's behavior observable).
>A correct implementation must produce observable results identical to those
>that would be produced by the virtual machine (actually, by some member of
>the class of virtual machines).

I have some questions about `observable behavior'.  How does this
concept interact with `volatile' variables? with debuggers? with
shared memory?

Are volatile variables considered to be observable within their
scope at all times (or at sequence points)?  Is this mechanism
sufficient for indicating that an object is observable?

While debugging a piece of code I may want everything accessed from
it to be observable.  Should a compiler treat such variables as
volatile?

Is a program with *no* output of any kind observable?

Some discussion on this subject would be helpful.  Thanks

-- Bakul Shah <..!{ames,sun,ucbvax,uunet}!amdcad!light!bvs>

henry@utzoo.uucp (Henry Spencer) (09/07/89)

In article <10949@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>I believed then and now that anyone who depended on such things was
>>mad, and the recent examples have not convinced me otherwise.
>
>I think most of us would agree that it is "mad" to write such an
>expression.  However, C provides a preprocessor which is perfectly
>capable of rewriting an innocuous expression so that such a case
>results...

Dennis's objection may have been the same as the one I would have to such
an expression:  exploiting internal side effects at all is a lousy idea,
barring a few well-defined idioms (which don't run into this problem).
There are no "innocuous" expressions which can experience such rewriting,
although there are plenty of poorly-written ones which can.
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/07/89)

In article <1989Sep6.160709.4890@light.uucp> bvs@light.UUCP (Bakul Shah) writes:
>I have some questions about `observable behavior'.  How does this
>concept interact with `volatile' variables? with debuggers? with
>shared memory?

"Observable behavior" was my own terminology.  The proposed Standard
doesn't go into this in great detail.  It does mention output of a
strictly conforming program, acceptance of a program by the
implementation, accessing volatile objects, and modifying files.
Section 2.1.2.3 contains the main relevant constraints, e.g. the
implementation must ensure that at program termination all data
written into files be identical with that which would have resulted
according to the abstract machine's semantics.

>Are volatile variables considered to be observable within their
>scope at all times (or at sequence points)?  Is this mechanism
>sufficient for indicating that an object is observable?

I believe the constraints on volatile variables are sufficient to
ensure that accesses really do occur according to the abstract
machine.  There is a bit of deliberate vagueness about the atomicity
of the access.

>While debugging a piece of code I may want everything accessed from
>it to be observable.  Should a compiler treat such variables as
>volatile?

Debugging environments are beyond the scope of the Standard.
How useful yours is depends on the care put into it by the vendor.

>Is a program with *no* output of any kind observable?

My interpretation is that a purely computational process could be
optimized into a no-op and meet the constraints of the Standard.
Undoubtedly some vendors will do just that for common benchmarks!

Generally, it would be too hard for a compiler to perform the analysis
necessary to determine how much code generation could be omitted due
to its having no observable effect.  Some local optimizations are
expected, but without a very smart compiler the only safe thing would
be for it to ensure that side effects have code executed to cause them.