mccalpin@MASIG1.OCEAN.FSU.EDU ("John D. McCalpin") (03/17/89)
Thanks to Len Lattanzi (len@Synthesis.com) of Synthesis Software Solutions, Inc., for information about the feedback optimization.

Some more questions:

(1) Can both of these options (cache reorganization and inlining) be used at the same time?

(2) The -O3 with the -feedback option worked. I got no speedup on this particular code, but that is not necessarily a problem.

(3) The -cord option does NOT work. The compiler gets almost to the end before bombing: it could not find /usr/bin/ftoc. What does ftoc do, and should it be there?

By the way, I am not running dhrystone :-). I am running LINPACK and a variety of floating-point intensive finite-difference PDE codes.

So far it looks like loop unrolling buys a lot on this machine. On 32-bit LINPACK (order 100 case) with full optimization (-O3), unrolling the innermost loops (the BLAS subroutines) to a depth of 16 gives a speedup from 1.4 to 1.9 MFLOPS. I still can't recover the 3.0 MFLOPS in the published LINPACK results for the MIPS M-800 (which should be the same CPU and clock speed).

--
----------------------   John D. McCalpin   -------------------------
Mesoscale Air-Sea Interaction Group & Department of Oceanography &
Supercomputer Computations Research Institute - Fl State Univ.
mccalpin@masig1.ocean.fsu.edu
mccalpin@nu.cs.fsu.edu
mccalpin@fsu (BITNET or MFENET)
SCRI::MCCALPIN (SPAN)
------------------------------------------------------------------
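
For anyone who has not seen the transformation, here is a minimal sketch of what unrolling a SAXPY-style inner loop looks like. It is an illustration only: the LINPACK BLAS are Fortran and the runs above used a depth of 16, while this sketch is in C and uses a depth of 4 to stay short. The function names saxpy_simple and saxpy_unrolled4 are made up for the example and are not the code that was timed.

```c
/* Reference loop: y[i] += a * x[i], one element per trip. */
void saxpy_simple(int n, float a, const float *x, float *y)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Same computation unrolled to a depth of 4 (illustrative depth only).
   A short cleanup loop handles the n % 4 leftover elements first, so the
   main loop always processes whole groups of four. */
void saxpy_unrolled4(int n, float a, const float *x, float *y)
{
    int i;
    int m = n % 4;

    /* Cleanup: the first m elements, one at a time. */
    for (i = 0; i < m; i++)
        y[i] += a * x[i];

    /* Main loop: four independent updates per trip, which cuts the
       loop-control overhead (increment, compare, branch) per element. */
    for (i = m; i < n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}
```

Both routines give identical results for any n >= 0; whether the unrolled one is faster depends on the compiler and the machine, so it has to be timed rather than assumed.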