pmontgom@sonia.math.ucla.edu (Peter Montgomery) (01/19/90)
In article <8960004@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
>
>>Fortran very specifically prohibits invisible aliasing among arguments and
>>common, the optimizer is allowed to make the most optimistic assumptions in
>>this case.
>
>Show me the optimizer that makes these assumptions, and I'll show you one that
>breaks the code of every customer we have, and I am not sure I believe it, in
>any case.  Do you mean it is even illegal to pass the same argument in two
>different positions of a formal parameter list?  For instance:
>
>subroutine foo (x, y)
>
>...
>
>call foo (a, a)

Yes, this is illegal if foo modifies x or y.  It is permitted if both
are read-only.  This allows the optimizer to make local copies of
x and y when compiling foo.

As a FORTRAN programmer, I do occasionally violate the rule, as in

      subroutine VECADD(x, y, z)
      real x(3), y(3), z(3)
      z(1) = x(1) + y(1)
      z(2) = x(2) + y(2)
      z(3) = x(3) + y(3)
      end

      real vel(3), delvel(3)
      call VECADD(vel, delvel, vel)

but only when I "know" (as here) that no element of array x will be
needed again after the corresponding element of z is set.  Even then,
I cannot complain if my code runs incorrectly, because I am violating
the standard.

The advantage of this restriction comes if instead I write

      subroutine CROSS(x,y,z)
      real x(3), y(3), z(3)
      z(1) = x(2)*y(3) - x(3)*y(2)
      z(2) = x(3)*y(1) - x(1)*y(3)
      z(3) = x(1)*y(2) - x(2)*y(1)
      end

Now a good compiler will fetch x(1), x(2), x(3), y(1), y(2), and y(3)
all at once, keeping them in registers during the multiplies and
subtracts (each is referenced twice).  This optimization is safe
since it is illegal for z to overlap x or y.  This also allows the
six multiplies to be done in parallel (i.e., pipelined) on suitable
hardware.  In C, the compiler must worry about whether z(1) or z(2)
overlaps the other variables, and cannot make these optimizations.
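
To make the C hazard concrete, here is a minimal sketch.  The names
cross_naive and aliasing_changes_result are made up for this
illustration and do not come from any library.  It assumes 0-based C
arrays and shows that a direct translation of CROSS gives a different
answer when the output array overlaps an input, which is why a C
compiler cannot hoist all six loads ahead of the stores.

```c
#include <assert.h>

/* Direct C translation of CROSS (hypothetical name, 0-based indices).
   Each z[i] is stored before the later right-hand sides are read,
   so the result depends on whether z overlaps x or y. */
void cross_naive(const float *x, const float *y, float *z)
{
    z[0] = x[1]*y[2] - x[2]*y[1];
    z[1] = x[2]*y[0] - x[0]*y[2];
    z[2] = x[0]*y[1] - x[1]*y[0];
}

/* Returns 1 if calling cross_naive in place (z aliasing x) gives a
   different answer from calling it with a separate output array. */
int aliasing_changes_result(void)
{
    float a[3] = {1.0f, 2.0f, 3.0f};
    float b[3] = {4.0f, 5.0f, 6.0f};
    float separate[3];
    int i, differs = 0;

    cross_naive(a, b, separate);        /* no overlap */
    /* Sanity check on the ordinary cross product of a and b. */
    assert(separate[0] == -3.0f && separate[1] == 6.0f
           && separate[2] == -3.0f);

    cross_naive(a, b, a);               /* legal C: z overlaps x */
    for (i = 0; i < 3; i++)
        if (a[i] != separate[i])
            differs = 1;
    return differs;
}
```

With x = (1, 2, 3) and y = (4, 5, 6), the call with a separate output
yields (-3, 6, -3), while the in-place call yields (-3, 30, -135),
because the second and third statements read elements of x that were
already overwritten.  Both calls are legal C, so the compiler has to
preserve this difference.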
A C programmer needing this optimization would need to do something like

      {
          register const float z0 = x[1]*y[2] - x[2]*y[1];
          register const float z1 = x[2]*y[0] - x[0]*y[2];
          register const float z2 = x[0]*y[1] - x[1]*y[0];
          z[0] = z0;  z[1] = z1;  z[2] = z2;
      }

and hope the C compiler realizes that the local variables z0, z1, z2
cannot overlap x or y.  Even then, the need for three more floating
point variables in registers may overflow the register set and
produce worse code than FORTRAN gives.
--------
        Peter Montgomery
        pmontgom@MATH.UCLA.EDU
        Department of Mathematics, UCLA, Los Angeles, CA 90024