prl@eiger.uucp (04/21/89)
achhabra@uceng.uc.edu (atul k chhabra):
> I chanced upon a segment of code that runs approximately 300 times faster
> in FORTRAN than in C. I have tried the code on Sun3(OS3.5) and on
> Sun4(OS4.0) (of course, on Sun4 the -f68881 flag was not used.) The
> results are similar on both machines. Can anyone enlighten me on this
> bizarre result?...
> [[ ...For f77, "-O" defaults to "-O3", which
> includes global optimization. Presumably it also does dead code
> elimination.... --wnl ]]
> Atul Chhabra, Dept. of Electrical & Computer Engineering, ML 030,
> University of Cincinnati, Cincinnati, OH 45221-0030.

Wnl is correct in his analysis, but the suggestion that placing a

        print *,tmp

in the program will defeat global optimisation is wrong (demonstrably so
in some non-Sun compilers).

There are a number of optimisations which can be applied to the program.

1. (cos(2.5)**4) is a constant which can be calculated at compile time.

2. The assignment tmp = cos(2.5)**4 is independent of the loop variable
   and so can be moved out of the loop.

3. After the assignment is moved out of the loop, the loop is empty, so
   it can be replaced by the assignment i = 262144

4. Since neither `tmp' nor `i' is used in expressions which affect the
   global state of the program or the outside world, their calculation
   can be discarded entirely.

The statement print *,tmp only disables optimisation 4, and only for
`tmp'. (3 and 4 are the only optimisations being applied by Sun's
compiler, in the order: (4) on tmp, (3) on the now-empty loop, then (4)
on i. Alliant's compiler with global optimisation turned on applies 1, 2
and 4, but not 3.)

In defense of Alliant's compiler writers, optimisation 3 is of little
practical use, since in any reasonable program *all* loops should have
some code which can't be moved outside the loop (otherwise **WHY** is
there a loop there anyway?).

A good optimising compiler should be able to turn Atul's program
(including the print statement) into the equivalent of:

        tmp=0.411947....        (cos(2.5)**4)
        print *,tmp

Writing synthetic benchmarks which cannot be defeated in `unreasonable'
ways by good optimising compilers is **very** difficult. You will often
need to look at the assembly code, or better, run some real application
set that is important to you, or a benchmark set which is very careful
about such things....

Peter Lamb
uucp:  uunet!mcvax!ethz!prl     eunet: prl@ethz.uucp    Tel: +411 256 5241
Integrated Systems Laboratory
ETH-Zentrum, 8092 Zurich
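
For readers who want to see the transformations spelled out, here is a
rough hand-applied sketch in C. Atul's program is not quoted in this
post, so the loop shape, the function names and the `i_out' parameter are
all assumptions made for illustration; only the expression cos(2.5)**4
and the count 262144 come from the text above. Each function returns
`tmp' so that the versions can be compared. A real compiler would do this
internally and not necessarily in this order.

```c
#include <math.h>

#define N 262144   /* loop count quoted in the post; the loop shape is assumed */

/* The benchmark as presumably written: the same value is recomputed
   on every iteration and nothing else happens in the loop. */
double original(int *i_out)
{
    double tmp = 0.0;
    int i;
    for (i = 0; i < N; i++)
        tmp = pow(cos(2.5), 4.0);
    *i_out = i;
    return tmp;
}

/* After optimisation 2: the loop-invariant assignment is hoisted,
   which leaves an empty loop behind. */
double hoisted(int *i_out)
{
    double tmp = pow(cos(2.5), 4.0);
    int i;
    for (i = 0; i < N; i++)
        ;                       /* nothing left to do */
    *i_out = i;
    return tmp;
}

/* After optimisation 3: the empty loop is replaced by the value the
   loop variable would have had on exit. */
double loop_removed(int *i_out)
{
    double tmp = pow(cos(2.5), 4.0);
    *i_out = N;
    return tmp;
}

/* Optimisation 1 would further replace pow(cos(2.5), 4.0) by its
   compile-time value (0.411947... in the post). Optimisation 4 would
   then delete whichever of tmp and i the caller never looks at. */
```

All three functions hand back the same `tmp' and the same final `i'; only
the amount of work differs, which is why timing the first version tells
you little once the optimiser has been at it.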