[comp.sys.dec] When is -O2 faster than -O3 for MIPS compiler?

hsong@nvuxl.UUCP (g hugh song) (12/29/90)

I wrote yesterday about the f77 compiler 2.1 for the DECstation 5000/200
running Ultrix 4.0:
> It runs with -O2 -C.  However, it does not run without -C.  Usually
> the reverse is the case.

("-C" is for the run-time subscript range checking, BTW)
I finally pinpointed the routine that is affected by the compiler
bug.  (It was not easy to isolate just one routine among
about 100.)

The same program ran with -O3 without -C, but more slowly.
Yes, really: it ran more than twice as slowly at the higher
optimization level.  The other compiler switches I used in both cases
are "-Olimit 2000 -align32", plus "-c" with -O2 or "-j" with -O3.
The loader switch is "-G 4800".

So my question is:
In what situations is "-O2" faster than "-O3"?

My conclusion so far is that there is a critical bug in the
compiler's optimizer.  Also, there is no handy mechanism for detecting
floating-point errors.  Someone pointed out "fpc" in <mips/fpu.h>,
but I still do not understand how it works.
If you know how to use it, please let us know.

Thanks.

	-hsong-
	nvuxl!hsong@bellcore.bellcore.com
	hosng%nvuxl@bellcore.bellcore.com

PS: You need to type "nvuxl\!hsong@..." on Csh command line.

meissner@osf.org (Michael Meissner) (12/29/90)

In article <765@nvuxl.UUCP> hsong@nvuxl.UUCP (g hugh song) writes:

	...

| I finally pinpointed the routine that is affected by the compiler
| bug.  (It was not easy to isolate just one routine among
| about 100.)

I have some Perl scripts that help track down these sorts of things.
They find which .o produces the unexpected behavior, and then which
routine within that .o is the faulty one (the latter script is tuned
to gcc on MIPS; I imagine it could be adapted to the MIPS compiler
if needed).
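
The scripts themselves are not shown in this thread, so the following is
only a minimal sketch of the general idea in Python: a binary search over
object files, assuming that exactly one of them is miscompiled. The
function name and the `fails_when_optimized` hook are hypothetical; the
hook stands for "rebuild with only this subset at the suspect optimization
level, run, and report whether the program misbehaves".

```python
from typing import Callable, List, Sequence


def find_faulty_unit(
    units: Sequence[str],
    fails_when_optimized: Callable[[List[str]], bool],
) -> str:
    """Return the one unit whose optimized build triggers the failure.

    `fails_when_optimized(subset)` is a hypothetical hook supplied by the
    caller. It should rebuild the program with only `subset` compiled at
    the suspect optimization level (everything else at a known-good
    level), run it, and return True if the program misbehaves.
    """
    candidates = list(units)
    if not candidates:
        raise ValueError("no units to search")
    if not fails_when_optimized(candidates):
        raise ValueError("failure does not reproduce with all units optimized")

    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Assumes a single culprit: if the first half builds and runs
        # cleanly, the culprit must be in the second half.
        candidates = first if fails_when_optimized(first) else second
    return candidates[0]


if __name__ == "__main__":
    # Simulated build-and-run: pretend "solve.o" is the miscompiled file.
    objects = ["r%03d.o" % i for i in range(99)] + ["solve.o"]
    print(find_faulty_unit(objects, lambda subset: "solve.o" in subset))
```

With about 100 routines this needs on the order of log2(100) rebuilds
rather than 100. The same search can be repeated one level down, over the
routines inside the offending file, if each routine can be compiled
separately.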

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

fred@mips.COM (Fred Chow) (01/03/91)

In article <765@nvuxl.UUCP> hsong@nvuxl.UUCP (g hugh song) writes:
>
>I wrote yesterday about the f77 compiler 2.1 for the DECstation 5000/200
>running Ultrix 4.0:
>> It runs with -O2 -C.  However, it does not run without -C.  Usually
>> the reverse is the case.
>
>("-C" is for the run-time subscript range checking, BTW)
>I finally pinpointed the routine that is affected by the compiler
>bug.  (It was not easy to isolate just one routine among
>about 100.)
>
>The same program ran with -O3 without -C, but more slowly.
>Yes, really: it ran more than twice as slowly at the higher
>optimization level.  The other compiler switches I used in both cases
>are "-Olimit 2000 -align32", plus "-c" with -O2 or "-j" with -O3.
>The loader switch is "-G 4800".
>
>So my question is:
>In what situations is "-O2" faster than "-O3"?

From your description, it seems that you compiled your program in the
following two modes:

(1) -O2 -C -Olimit 2000 -align32

(2) -O3 -Olimit 2000 -align32

And you said the program compiled under (2) is twice as slow.  This is
totally impossible, unless your program runs differently and produces
different results.  Compiling with -C introduces a lot of extra
code to do bounds checking, and thus slows your program down, so (1)
should be the slower of the two.
Sometimes bugs in the program itself show up only
at -O2 and not at -O3, or vice versa.  This may have to do with
uninitialized variables, but there can be other reasons.  The fact
that your program has run on other machines does not give you a 100%
guarantee, because different compilers generate different code.  With
the MIPS compiler, compiling with the -trapuv option will help you
detect uninitialized variables.  If it is a compiler bug, the -C
option can also cause the bug to disappear.  By the way, have
you tried compiling with -g and without -O2/-O3/-C to see whether
the program runs?
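
One way to check whether the two builds really "run differently and
produce different results" is to capture the output of each run and look
for the first place where they disagree. The Python sketch below is only
an illustration under assumptions: the function names and tolerances are
made up for this example, and it assumes the program prints its results as
whitespace-separated text, possibly with Fortran-style D exponents.

```python
import math
from itertools import zip_longest


def _as_float(token):
    """Parse a number, accepting Fortran-style D exponents (1.5D+03)."""
    try:
        return float(token.replace("D", "E").replace("d", "e"))
    except ValueError:
        return None


def first_divergence(ref_text, test_text, rel_tol=1e-6, abs_tol=1e-12):
    """Return (line_number, ref_token, test_token) for the first mismatch,
    or None if the two outputs agree within the given tolerances."""
    ref_lines = ref_text.splitlines()
    test_lines = test_text.splitlines()
    pairs = zip_longest(ref_lines, test_lines, fillvalue="")
    for lineno, (ref, test) in enumerate(pairs, start=1):
        for a, b in zip_longest(ref.split(), test.split(), fillvalue=""):
            x, y = _as_float(a), _as_float(b)
            if x is not None and y is not None:
                # NaN never compares close, so a NaN in either run
                # is reported as a divergence.
                if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol):
                    return lineno, a, b
            elif a != b:
                # Non-numeric tokens (labels, missing fields) must match.
                return lineno, a, b
    return None


if __name__ == "__main__":
    run_a = "iter 1 0.5D+01\niter 2 0.25D+01\n"
    run_b = "iter 1 5.0\niter 2 NaN\n"
    print(first_divergence(run_a, run_b))
```

If the two runs diverge early (different iteration counts, a NaN, a
different convergence path), the timing comparison says little about the
optimizer itself, since the two executables are no longer doing the
same work.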

The speed difference between -O2 and -O3 is not great, and is usually
less than 20%.  In many programs, -O2 and -O3 yield approximately the
same speed.  Occasionally there are programs where -O3 is slower than
-O2, but never by more than 10%.

The -align32 option will also slow down your program, because it makes
the compiler issue two instructions to load or store a double in many
memory references.
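
As a rough illustration of why that costs time, the Python sketch below
models memory as a byte buffer and fetches the same double two ways: as
one 64-bit access, and as two 32-bit accesses that are then recombined.
This is only a conceptual model, not the code the compiler emits; the
function names are hypothetical and little-endian byte order is assumed.

```python
import struct


def load_double_one_access(mem, offset):
    """Fetch a double with a single 64-bit access.

    On real hardware this form generally requires `offset` to be a
    multiple of 8; Python's struct module does not enforce that.
    """
    return struct.unpack_from("<d", mem, offset)[0]


def load_double_two_accesses(mem, offset):
    """Fetch the same double as two 32-bit words and recombine them.

    Each access only needs 4-byte alignment, but there are twice as
    many memory operations per double.
    """
    low = struct.unpack_from("<I", mem, offset)[0]
    high = struct.unpack_from("<I", mem, offset + 4)[0]
    return struct.unpack("<d", struct.pack("<II", low, high))[0]


if __name__ == "__main__":
    mem = bytearray(16)
    # Store a double at offset 4: 4-byte aligned, but not 8-byte aligned.
    struct.pack_into("<d", mem, 4, 2.5)
    print(load_double_one_access(mem, 4), load_double_two_accesses(mem, 4))
```

Both functions return the same value; the difference is only in how many
accesses are needed, which is where the slowdown in a loop over arrays of
doubles would come from.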

- Fred Chow (fred@mips.com)

>My conclusion so far is that there is a critical bug in the
>compiler's optimizer.  Also, there is no handy mechanism for detecting
>floating-point errors.  Someone pointed out "fpc" in <mips/fpu.h>,
>but I still do not understand how it works.
>If you know how to use it, please let us know.
>