[comp.lang.misc] Optimizations possible in FORTRAN but not C

pmontgom@sonia.math.ucla.edu (Peter Montgomery) (01/19/90)

In article <8960004@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
>
>>Fortran very specifically prohibits invisible aliasing among arguments and
>>common, the optimizer is allowed to make the most optimistic assumptions in
>>this case.
>
>Show me the optimizer that makes these assumptions, and I'll show you one that
>breaks the code of every customer we have, and I am not sure I believe it, in
>any case.  Do you mean it is even illegal to pass the same argument in two
>different positions of a formal parameter list?  For instance:
>
>subroutine foo (x, y)
>
>...
>
>call foo (a, a)

	Yes, this is illegal if foo modifies x or y.  It is permitted
if both are read only.  This allows the optimizer to make local copies 
of x and y when compiling foo.  As a FORTRAN programmer, I do occasionally
violate the rule, as in

	subroutine VECADD(x, y, z)
	real x(3), y(3), z(3)
	z(1) = x(1) + y(1)
	z(2) = x(2) + y(2)
	z(3) = x(3) + y(3)
	end

	real vel(3), delvel(3)
	call VECADD(vel, delvel, vel)

but only when I "know" (as here) that no element of array x 
will be needed again after the corresponding element of z is set.
Even then, I cannot complain if my code runs incorrectly, 
because I am violating the standard.
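
As a rough illustration of why this particular violation tends to work in practice, here is a C rendering of VECADD (the name vecadd and the pointer signature are assumptions made for this sketch). In C the aliased call is legal, and the result is the expected one because each output element depends only on the matching input elements. The Fortran call remains nonconforming all the same.

```c
/* C rendering of VECADD, for illustration only; vecadd is a hypothetical name. */
void vecadd(const float *x, const float *y, float *z)
{
    /* Each z[i] depends only on x[i] and y[i], so overwriting x[i]
       through z cannot disturb any element that is read later. */
    z[0] = x[0] + y[0];
    z[1] = x[1] + y[1];
    z[2] = x[2] + y[2];
}
```

Calling vecadd(vel, delvel, vel) updates vel in place, element by element, just as a call with a separate output array would.
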
The advantage of this restriction comes if instead I write

	subroutine CROSS(x,y,z)
	real x(3), y(3), z(3)
	z(1) = x(2)*y(3) - x(3)*y(2)
	z(2) = x(3)*y(1) - x(1)*y(3)
	z(3) = x(1)*y(2) - x(2)*y(1)
	end

Now a good compiler will fetch x(1), x(2), x(3), y(1), y(2), and y(3)
all at once, keeping them in registers during the multiplies and subtracts
(each is referenced twice).  This optimization is safe since it is 
illegal for z to overlap x or y.  This also allows the six multiplies 
to be done in parallel (i.e., pipelined) on suitable hardware.  
In C, the compiler must worry about whether z[0] or z[1] overlaps 
an element of x or y that is read after the store, and so cannot make 
these optimizations unless it can prove there is no overlap.
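
To make the aliasing concern concrete, here is a small C sketch. The names cross_naive and cross_hoisted are invented for illustration. The first function is a direct transliteration of CROSS; the second performs all six loads up front, which is what a Fortran compiler is entitled to do. Both are legal C, but they disagree when z points at the same storage as x, so a C compiler cannot turn the first into the second on its own.

```c
/* Hypothetical illustration; the function names are made up for this sketch. */

/* Direct transliteration of CROSS: stores to z are interleaved
   with loads from x and y. */
void cross_naive(const float *x, const float *y, float *z)
{
    z[0] = x[1]*y[2] - x[2]*y[1];
    z[1] = x[2]*y[0] - x[0]*y[2];   /* x[0] must be reloaded: z may alias x */
    z[2] = x[0]*y[1] - x[1]*y[0];   /* likewise for x[0] and x[1] */
}

/* What the Fortran rule permits: fetch all six inputs first. */
void cross_hoisted(const float *x, const float *y, float *z)
{
    const float x0 = x[0], x1 = x[1], x2 = x[2];
    const float y0 = y[0], y1 = y[1], y2 = y[2];
    z[0] = x1*y2 - x2*y1;
    z[1] = x2*y0 - x0*y2;
    z[2] = x0*y1 - x1*y0;
}
```

With a = (1, 2, 3) and b = (4, 5, 6), the call cross_hoisted(a, b, a) leaves a = (-3, 6, -3), the true cross product, while cross_naive(a, b, a) leaves a = (-3, 30, -135), because the later lines read elements that the earlier lines have already overwritten.
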
A C programmer needing this optimization would need to do something like

    {
	register const float z0 = x[1]*y[2] - x[2]*y[1];
	register const float z1 = x[2]*y[0] - x[0]*y[2];
	register const float z2 = x[0]*y[1] - x[1]*y[0];
	z[0] = z0; z[1] = z1; z[2] = z2;
    }

and hope the C compiler realizes the local variables z0,z1,z2
cannot overlap x or y.  Even then, the need for three more
floating point variables in registers may overflow the register 
set and produce worse code than FORTRAN gives.
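
For completeness, here is the fragment above wrapped in a full function, as a sketch only (the name cross_temps and the pointer signature are assumptions). Since every read of x and y finishes before the first store to z, the function also returns the correct cross product when the caller passes the same array as input and output.

```c
/* The fragment above as a complete function; cross_temps is a hypothetical name. */
void cross_temps(const float *x, const float *y, float *z)
{
    /* All reads of x and y are done before anything is stored to z. */
    register const float z0 = x[1]*y[2] - x[2]*y[1];
    register const float z1 = x[2]*y[0] - x[0]*y[2];
    register const float z2 = x[0]*y[1] - x[1]*y[0];
    z[0] = z0; z[1] = z1; z[2] = z2;
}
```

Whether z0, z1, and z2 actually stay in registers is up to the compiler; the register keyword is only a hint.
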
--------
        Peter Montgomery
        pmontgom@MATH.UCLA.EDU 
	Department of Mathematics, UCLA, Los Angeles, CA 90024