[comp.sys.mips] What are the new features in MIPS-II instruction set?

k2@bl.physik.tu-muenchen.de (Klaus Steinberger) (01/25/91)

Hi,

I'm interested in the differences between MIPS-I and MIPS-II
instruction sets. The assembler manual doesn't mention it.
Are there any advantages compiling a program with "-mips2"
for the CD4680 (RC6280)?

Sincerely,
Klaus Steinberger

--
Klaus Steinberger               Beschleunigerlabor der TU und LMU Muenchen
Phone: (+49 89)3209 4287        Hochschulgelaende
FAX:   (+49 89)3209 4280        D-8046 Garching, Germany
BITNET: K2@DGABLG5P             Internet: k2@bl.physik.tu-muenchen.de

meissner@osf.org (Michael Meissner) (01/25/91)

In article <k2.664739175@woodstock> k2@bl.physik.tu-muenchen.de (Klaus
Steinberger) writes:

| Hi,
| 
| I'm interested in the differences between MIPS-I and MIPS-II
| instruction sets. The assembler manual doesn't mention it.
| Are there any advantages compiling a program with "-mips2"
| for the CD4680 (RC6280)?

I too would be interested in such a list, including pipeline delays
and such (this is for the scheduling code that will be going into
GCC).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

mslater@cup.portal.com (Michael Z Slater) (01/28/91)

>I'm interested in the differences between MIPS-I and MIPS-II
>instruction sets. The assembler manual doesn't mention it.
>Are there any advantages compiling a program with "-mips2"
>for the CD4680 (RC6280)?

I assume someone from MIPS will answer in more detail, but here's a quick
summary.

Loads are interlocked; you don't need a no-op if the next instruction uses
the data from the load.
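
To make the load-interlock point concrete, here is a minimal sketch in Python of the padding a MIPS-I toolchain has to do when it cannot find a useful instruction for the load delay slot. The tuple format, the `LOADS` set, and the function name `pad_load_delays` are all hypothetical, chosen only for this illustration; real assemblers and compilers also try to reorder instructions rather than just insert no-ops.

```python
# Hypothetical mini-model: each instruction is (opcode, dest_reg, source_regs).
LOADS = {"lw", "lb", "lh", "lbu", "lhu"}

def pad_load_delays(program):
    """Insert a nop after any load whose result the very next instruction reads.

    This mimics what MIPS-I software must do when nothing useful can be
    scheduled into the load delay slot.
    """
    padded = []
    for i, (op, dest, srcs) in enumerate(program):
        padded.append((op, dest, srcs))
        if op in LOADS and i + 1 < len(program):
            _, _, next_srcs = program[i + 1]
            if dest in next_srcs:
                padded.append(("nop", None, ()))
    return padded

program = [
    ("lw",   "r2", ("r3",)),        # lw   r2, 10(r3)
    ("addu", "r4", ("r2", "r5")),   # addu r4, r2, r5   (uses r2 immediately)
    ("lw",   "r6", ("r3",)),        # lw   r6, 14(r3)
    ("addu", "r7", ("r4", "r5")),   # addu r7, r4, r5   (does not use r6)
]

mips1 = pad_load_delays(program)   # MIPS-I: software separates load and use
mips2 = list(program)              # MIPS-II: hardware interlock, no padding

print(len(mips1), len(mips2))      # prints: 5 4
```

In this toy case the interlocked version is one instruction shorter. The hardware still has to wait for the load, so the saving shown here is in code size, not necessarily in cycles.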

Annulling branches and conditional traps are added.

FP additions include 64-bit load and store, square root, and float-to-fixed
conversion with rounding mode (ceiling, floor, etc.) explicitly specified.
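
As a rough illustration of what "rounding mode explicitly specified" means, the sketch below uses Python's standard functions as stand-ins for the four float-to-fixed conversions (round to nearest, truncate, ceiling, floor). The function name `convert_to_fixed` is hypothetical, and the mapping from Python functions to hardware conversions is an assumption made for this example only.

```python
import math

def convert_to_fixed(x):
    """Return the integer each explicit rounding mode would produce for x."""
    return {
        "round": round(x),       # to nearest, ties to even
        "trunc": math.trunc(x),  # toward zero
        "ceil":  math.ceil(x),   # toward +infinity
        "floor": math.floor(x),  # toward -infinity
    }

for value in (2.5, -2.5, 3.7):
    print(value, convert_to_fixed(value))
```

The four modes only disagree on values with a fractional part; for -2.5, for instance, floor gives -3 while the other three give -2.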

Code should be faster and smaller if compiled for the MIPS-II instruction set,
but of course it won't work on anything but an R6000-based machine. I expect
double-precision FP code would see the biggest improvement.

Rumors are that the R4000 will support the same MIPS-II instructions, but it's 
going to be a while before you see R4000-based machines.

Michael Slater, Microprocessor Report   mslater@cup.portal.com

qzhe1@cs.aukuni.ac.nz (Qun Zheng) (02/05/91)

MIPS-II does not use delayed-load. Why?

Most MIPS instructions are RRR/RRI type.  So how is the MEM stage justified?  I
know the address calculation is done in the EX stage.  But this calculation
could be done in one instruction, with another instruction then doing the
indirect access.  The EX stage would then use the ALU for ALU operations and
access memory for loads/stores.  Note that RISC never encourages people to
program at the assembly level, so there is not much inconvenience here.  Maybe
it is because lots of loads/stores are used in programs?

There does not have to be any performance gain from changing the pipeline.  But
there must be some important factors in the design.  Could you
informative/intelligent netters give me some insight?

Thanks in advance.

Chuck Q ZHENG
Dept CompSci
Auckland University
New Zealand

mash@mips.COM (John Mashey) (02/08/91)

In article <qzhe1.665730055@aucs7.cs.aukuni.ac.nz> qzhe1@cs.aukuni.ac.nz (Qun Zheng) writes:
>MIPS-II does not use delayed-load. Why?
>
>Most MIPS instructions are RRR/RRI type.  So how is the MEM stage justified?  I
>know the address calculation is done in the EX stage.  But this calculation
>could be done in one instruction, with another instruction then doing the
>indirect access.  The EX stage would then use the ALU for ALU operations and
>access memory for loads/stores.  Note that RISC never encourages people to
>program at the assembly level, so there is not much inconvenience here.  Maybe
>it is because lots of loads/stores are used in programs?
>
>There does not have to be any performance gain from changing the pipeline.  But
>there must be some important factors in the design.  Could you
>informative/intelligent netters give me some insight?

I think what you're proposing is akin to the AMD 29K, where
loads and stores cannot have offsets, i.e. 0(RX) is the only addressing
mechanism. Instead of:
	lw	r2, 10(r3)
you do something like:
	add	r4, r3, 10
	lw	r2, 0(r4)

The reason you don't do it is that loads and stores are usually
30% of a program, and if you do the cycle count analysis, it's better
to include the offset calculation inside the instruction
than to use 2 instructions much of the time.
I'd guess that you would need the 2-instruction sequence at least 70% of
the time.
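
The cycle count argument can be sketched as a back-of-the-envelope calculation. The Python below is only an illustration: the function name `relative_instruction_count` is hypothetical, and it assumes every load/store that needs a nonzero offset costs exactly one extra add, with no reuse of computed addresses and no other pipeline effects.

```python
def relative_instruction_count(mem_fraction, nonzero_offset_fraction):
    """Instruction count without offset addressing, relative to a baseline of 1.0.

    Assumption: each load/store with a nonzero offset needs one extra add.
    """
    return 1.0 + mem_fraction * nonzero_offset_fraction

# 30% loads/stores, with the offset needed 50%, 70%, or 90% of the time.
for f in (0.5, 0.7, 0.9):
    growth = relative_instruction_count(0.30, f) - 1.0
    print(f"offset needed {f:.0%} of the time -> {growth:.0%} more instructions")
```

With the figures above (30% loads/stores, 70% needing an offset) that is 0.30 * 0.70 = 0.21, i.e. roughly 21% more instructions executed under these simplifying assumptions.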
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086