[comp.sys.mips] mips2 option for the compiler

peregier@vlsi.waterloo.edu (Phil Regier) (02/13/91)

In the man page for cc and f77, there are two options '-mips1'
and '-mips2' mentioned.  The first generates code using the instruction
set of the R2000/R3000 RISC architecture.  It is the default for
all MIPS systems.  The second generates code for the R6000 architecture.
The only libraries that appear to differ between the two are libm,
libm43 and libfastm; the others are the same.  In the assembly code
I've looked at from some test files, the only difference appears to
be that some '.align 0' directives are missing when using '-mips2'.

But I can't find any more information than this.  Can somebody tell
me what advantage there is to using the -mips2 option?  Is there
a significant performance improvement if one is actually running
on an R6000 machine?

cprice@mips.COM (Charlie Price) (02/16/91)

In article <1991Feb12.205718.22168@vlsi.waterloo.edu> peregier@vlsi.waterloo.edu (Phil Regier) writes:
>
>But I can't find any more information than this.  Can somebody tell
>me what advantage there is to using the -mips2 option?  Is there
>a significant performance improvement if one is actually running
>on an R6000 machine?

MIPS-II includes some new instructions that can make
some programs faster or (very slightly) smaller.

The amount of improvement in a program depends a lot on what it does.
The largest improvements are in double precision FP codes.
The biggest win is probably if you do a lot of FP square roots.
You need to trade off the performance improvement, which may be
negligible, against the effort of creating two executables and figuring
out how to run the correct one if you have anything besides MIPS-II
machines.  This effort is probably only worthwhile for very
long-running codes, and even then I would measure it.
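
A minimal sketch of such a measurement, assuming a double-precision
dot product as a stand-in for the real workload.  The names dot_kernel
and time_kernel are hypothetical; substitute your own inner loop.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in workload: a double-precision dot product. */
double dot_kernel(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/*
 * Run the kernel `reps` times and return the CPU time used, in seconds.
 * The results are accumulated into *result so the compiler cannot
 * discard the calls as dead code.
 */
double time_kernel(const double *a, const double *b, size_t n,
                   int reps, double *result)
{
    clock_t start, stop;
    double acc = 0.0;
    int k;

    start = clock();
    for (k = 0; k < reps; k++)
        acc += dot_kernel(a, b, n);
    stop = clock();

    *result = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Build the same source once with -mips1 and once with -mips2, run both
on the MIPS-II machine with a repetition count large enough to run for
many seconds, and compare the two times.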

Borrowing from an article by Michael Z Slater talking about what
MIPS-II adds to MIPS-I...

> Loads are interlocked; you don't need a no-op if the next instruction uses
> the data from the load.

Though you still execute load delay slot instructions if you don't
use the result before it arrives.
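
The code-size effect can be sketched with a toy model.  The struct
and function below are hypothetical and deliberately simplified: they
ignore branches and assume the scheduler found nothing else to put in
the slot.  The function counts the no-ops a MIPS-I instruction stream
needs because the instruction right after a load reads the loaded
register.  With MIPS-II interlocks those no-ops can be dropped; the
hardware stalls instead.

```c
#include <stddef.h>

#define NO_REG (-1)   /* marks an unused register field */

/* Toy instruction description, just enough for this check. */
struct insn {
    int is_load;      /* nonzero if this is a load instruction */
    int dest;         /* destination register, or NO_REG */
    int src1;         /* first source register, or NO_REG */
    int src2;         /* second source register, or NO_REG */
};

/*
 * Count the load-delay no-ops a MIPS-I stream would need: one for
 * every load whose destination is read by the very next instruction.
 */
size_t count_load_delay_nops(const struct insn *code, size_t n)
{
    size_t i, nops = 0;

    for (i = 0; i + 1 < n; i++) {
        const struct insn *ld = &code[i];
        const struct insn *next = &code[i + 1];

        if (ld->is_load && ld->dest != NO_REG &&
            (next->src1 == ld->dest || next->src2 == ld->dest))
            nops++;
    }
    return nops;
}
```

Each no-op counted is one instruction word saved under -mips2, which
is why the size difference is usually small.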

> Annulling branches and conditional traps are added.

The annulling branches are useful for loops if you can't fill the
branch-delay slot with something useful.

> FP additions include 64-bit load and store, square root, and float-to-fixed
> conversion with rounding mode (ceiling, floor, etc.) explicitly specified.

On the R6000, the data bus is 32 bits, so a double load/store is faster
than two single ones, but not as much faster as you might expect.
The square root is a big win if you do a lot of them.
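
For reference, a sketch of what the four explicit-rounding
float-to-fixed conversions compute, written in portable C.  The
function names are hypothetical, and whether a given compiler maps
such source onto the new instructions is an assumption, not something
stated above.  Each function is only valid when the result fits in
an int.

```c
/* Round toward zero; this is what a plain C cast does. */
int to_int_trunc(double x)
{
    return (int)x;
}

/* Round toward minus infinity. */
int to_int_floor(double x)
{
    int i = (int)x;

    if ((double)i > x)   /* the cast rounded a negative value up */
        i--;
    return i;
}

/* Round toward plus infinity. */
int to_int_ceil(double x)
{
    int i = (int)x;

    if ((double)i < x)   /* the cast rounded a positive value down */
        i++;
    return i;
}

/* Round to nearest, with exact halves going to the even integer. */
int to_int_nearest(double x)
{
    int i = to_int_floor(x);
    double frac = x - (double)i;   /* always in [0, 1) */

    if (frac > 0.5 || (frac == 0.5 && i % 2 != 0))
        i++;
    return i;
}
```

Having the rounding mode in the instruction itself means a conversion
like these does not need to change the FP rounding mode and restore
it afterwards.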
-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave. / Sunnyvale, CA   94086-23650