[comp.arch] IQFs

mash@mips.COM (John Mashey) (08/01/88)

In article <408@ma.diab.se> pf@ma.UUCP (Per Fogelström) writes:
>In article <941@srs.UUCP> srs!matt@cs.rochester.edu (Matt Goheen) writes:
>>I have always been led to believe that Sun's rating of the Sun 4/200
>>series as 10 MIPS to be "Vax" MIPS (this goes for the 7 MIPS 4/110....

>Recently ELECTRONICS had an article series on RISCs. One of the articles
>compared different CPUs, introducing something called an "instruction quality
>factor". In this comparison the VAX 780 was rated 1.0 with its 0.47 MIPS
>performance. The funny thing that puzzles me was that most RISC designs
>had an "IQF" of 0.8 - 0.9, except SPARC, which was rated 0.6. Assuming SPARC is
>a native 10 MIPS CPU, this would yield a 6 normalized MIPS CPU. I did the same
>comparison with my 32532 design. This 25 MHz prototype executes about 6-8
>real MIPS (yes, there is a pin on the chip that pulses every time a new
>instruction is started, where I connect the counter). Okay, the IQF for this
>CPU must be very close to 1.0, since the architecture resembles the VAX
>architecture.  In this case my 25 MHz 32532 design is a 15 VAX MIPS processor.
>Well then, MIPS: what the heck is it, really? At least it's useless when
>comparing different processor architectures.
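For concreteness, here is one way to reconstruct the arithmetic in the quoted post; the 0.47 figure and the IQF values come from the quote, but the reconstruction of the two normalization rules is my own reading, not anything from the ELECTRONICS article:

```python
# A sketch (my reading, not from the article) of the two incompatible
# normalizations apparently mixed together in the quoted post.

VAX780_NATIVE_MIPS = 0.47  # 11/780 instruction rate; 1.0 "VAX MIPS" by convention

def work_normalized_mips(native_mips, iqf):
    """Scale native MIPS by IQF: VAX-equivalent instructions per second."""
    return native_mips * iqf

def vax_relative_mips(native_mips, iqf):
    """Express the same work rate as a multiple of an 11/780."""
    return native_mips * iqf / VAX780_NATIVE_MIPS

# SPARC by instruction "work": 10 * 0.6 = 6 (the "6 normalized mips")
print(work_normalized_mips(10, 0.6))
# 32532 measured against the 780's rate: 7 * 1.0 / 0.47, about 15
# (the claimed "15 VAX MIPS")
print(vax_relative_mips(7, 1.0))
# The same second rule applied to SPARC gives about 12.8, not 6 --
# the two quoted numbers were not computed the same way.
print(vax_relative_mips(10, 0.6))
```

That inconsistency, rather than anything about the chips themselves, is what makes the quoted comparison unanswerable.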

All of this illustrates how confused things get when you:
	start with engineering reality, sort of
	look at it thru the marketing viewpoint
	and then stir in some absolutely bogus guesses

1) I read the ELECTRONICS article.  I don't recall seeing any sensible
derivation for the IQF; in fact, there is no earthly reason for SPARC
to be that much less than the other RISCs.  Those numbers were guesses,
as far as I can tell.  There is nothing particularly wrong with
SPARC instructions [the performance hits come from other things, mostly].

2) It might be reasonable to compute IQFs, but you need access to very
good architectural simulations.  It is very hard to measure them by running
benchmarks.

3) Native MIPS is a useful and interesting number, for the people who
build computers and run the architectural simulations.  It's fairly
useful when comparing the same executable object code across
architecturally compatible machines.

4) It is sad but true that a particular game is played in this business.
	a) Compute and publish some kind of mips-number for your processor.
	b) Describe the number by saying it's:
		native mips
		integer sustained mips
		normalized average integer mips
	c) When these things get summarized, in marketing brochures, or
	in the press, the qualifications usually get dropped.
	(To be fair to the press, it is HARD to disentangle what's
	being told to them).
	d) When people read these things, they sort of expect that these
	have either turned into VAX-relative mips, or that the numbers
	somehow indicate actual relative performance on *something*.
	e) Many people claim to use VAX-mips:
		sometimes using MicroVAXII == 1 (FYI, uVAX != 11/780)
		without having the foggiest idea how DEC computes the
			numbers themselves
		based on one or two benchmarks

	f) The result is that you should NEVER, EVER believe that
	mips-ratings mean anything at all unless you can obtain
	substantial backup data, including enough benchmarks in common
	that you can compare and figure out what individual mips-ratings
	really mean. By "enough", just for CPU benchmarks, one would
	like at least 10 each of integer and floating-point benchmarks,
	and real programs, not toys. 
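As an illustration of what "figure out what individual mips-ratings really mean" involves, one common way to summarize such a set of common benchmarks is the geometric mean of per-program speedups. This is a sketch of that method, not anything prescribed in the post; the benchmark times are invented:

```python
from math import prod

def relative_performance(times_ref, times_new):
    """Geometric mean of per-benchmark speedups of the new machine over
    the reference machine (larger = faster). Times are in seconds."""
    ratios = [r / n for r, n in zip(times_ref, times_new)]
    return prod(ratios) ** (1.0 / len(ratios))

# Invented times for the same four programs on two machines:
ref = [10.0, 20.0, 5.0, 8.0]
new = [5.0, 8.0, 2.5, 4.0]
# Per-program speedups are 2.0, 2.5, 2.0, 2.0; the geometric mean
# summarizes them without letting one outlier dominate.
print(round(relative_performance(ref, new), 2))  # prints 2.11
```

With ten or more real integer and floating-point programs, as suggested above, a summary like this at least has some hope of meaning something.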

5) Expect things to be real messy for a while.  1988 is the year of true
mipsflation and trade press ooh-and-ahhing over each new entry in
the mips-race, and it is very hard to tell what is real, what is almost-real,
and what is near-fantasy, or heavy pre-announcement, especially when the
press writes about everything in the present tense :-)

6) IS THERE ANY HOPE?
	Well, not much, but there are a few hopeful signs amidst the mess:

a) Some people gather and publish benchmarks as a public service.
Notable examples include Richardson [Dhrystone], Dongarra [LINPACK],
McMahon [Livermore Loops].

b) Some of the trade press has gotten tired of just quoting vendor-supplied
mips-ratings, and are acquiring/developing useful benchmarks to be run.
This doesn't mean the benchmarks are necessarily GOOD ones, but sometimes they
are reasonable, and in any case, these folks are at least trying, and they
deserve support and encouragement from the rest of us.
Particular examples (there are more, of course) include:
	Digital Review magazine has a large FORTRAN benchmark they use to
		compare systems.  It contains 33 sub-benchmarks, some of which
		have problems.  However, the folks at DR continue to analyze
		the results and improve the usefulness of the suite,
		and they've started weeding out problems by looking at
		the statistical effects of the various benchmarks.
	Byte magazine has done benchmarks for years.  Some of the benchmarks
		are not very applicable to higher-performance systems,
		but they're also trying hard, by generating more
		applications-oriented benchmarks.
	UNIX Review has a column provided by Workstation Laboratories,
		which runs quite a few benchmarks and compares machines.
		The July 88 issue covers the Sun 4/200, and a MIPS M/120
		will appear soon.

	Other magazines are looking for help from vendors in selecting good
	applications benchmarks.

c) Everybody [especially those who BUY computers, rather than those of
us who sell them :-)] can help by refusing to accept meaningless numbers
	1) From vendors
	2) From the press
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

yuval@taux02.UUCP (Gideon Yuval) (08/25/88)

In article <2693@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>2) It might be reasonable to compute IQFs, but you need access to very
>good architectural simulations.  It is very hard to measure them by running
>benchmarks.

Ori Danieli's M.Sc. Thesis (Tel Aviv University, 7/88) has some simulations
on two architectures: one is almost the full NS32K architecture (except
that memory-to-memory operations are not there); the other is a load/store
RISC subset of the NS32K architecture. He finds a 12% average difference in
the number of instructions needed to run the same program.  In the 5
programs run, this difference was 7% for DC, PTC (a Pascal-to-C converter)
and SORT, 17% for GREP, and 23% for SED.
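The quoted 12% is consistent with a plain (unweighted) average of the five per-program differences:

```python
# Per-program instruction-count differences (percent) from the thesis summary.
diffs = {"dc": 7, "ptc": 7, "sort": 7, "grep": 17, "sed": 23}
average = sum(diffs.values()) / len(diffs)
print(average)  # prints 12.2, which rounds to the reported "12% average"
```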

-- 
Gideon Yuval, yuval@taux01.nsc.com, +972-2-690992 (home) ,-52-522255(work)
 Paper-mail: National Semiconductor, 6 Maskit St., Herzliyah, Israel
                                                TWX: 33691, fax: +972-52-558322

mash@mips.COM (John Mashey) (08/26/88)

In article <99@taux02.UUCP> yuval@taux02.UUCP (Gideon Yuval) writes:
>In article <2693@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>>2) It might be reasonable to compute IQFs, but you need access to very
>>good architectural simulations.  It is very hard to measure them by running
>>benchmarks.
>
>Ori Danieli's M.Sc. Thesis (Tel Aviv University, 7/88) has some simulations
>on two architectures: one is almost the full NS32K architecture (except
>that memory-to-memory operations are not there); the other is a load/store
>RISC subset of the NS32K architecture. He finds a 12% average difference in
>the number of instructions needed to run the same program.  In the 5
>programs run, this difference was 7% for DC, PTC (a Pascal-to-C converter)
>and SORT, 17% for GREP, and 23% for SED.

Interesting numbers.  Please say some more on methodology, if possible,
and what sort of compiler technology was being used.

I think this is saying that deleting the non-RISC-subset adds 12%
in number of instructions.  Did the thesis say anything about
total cycle counts (i.e., incl. memory system?)
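For context on why the cycle-count question matters: instruction count alone isn't execution time. A minimal sketch of the usual accounting (all parameter names and numbers here are hypothetical, not from the thesis):

```python
def total_cycles(instructions, base_cpi, mem_refs, miss_rate, miss_penalty):
    """Cycles = pipeline cycles plus memory-system stall cycles."""
    return instructions * base_cpi + mem_refs * miss_rate * miss_penalty

# A machine executing 12% fewer instructions can still be slower if each
# instruction costs more cycles or the memory system stalls more often.
# Invented numbers: full-architecture run vs. a 12%-longer RISC-subset run.
full = total_cycles(1.00e6, 2.0, 0.40e6, 0.05, 10)   # about 2.20M cycles
risc = total_cycles(1.12e6, 1.2, 0.45e6, 0.05, 10)   # about 1.57M cycles
print(full, risc)
```

Which is why a 12% instruction-count difference, by itself, says little about delivered performance until the per-instruction cycle costs are known.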
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

walter@garth.UUCP (Walter Bays) (08/26/88)

>(John Mashey) writes:
>>2) It might be reasonable to compute IQFs, but you need access to very
>>good architectural simulations.

In article <99@taux02.UUCP> yuval@taux02.UUCP (Gideon Yuval) writes:
>Ori Danieli's M.Sc. Thesis (Tel Aviv University, 7/88) has  some  simulations
>on [NS32K versus a RISC subset of NS32K]

This seems very difficult to me, since you can only measure the
compiler + instruction set, not just the instruction set.  (Or if you
resort to hand-coded assembler, you measure programmer + instruction set.)
Even if you use the same compiler front-end, the differences in code
generators will influence the results.  A RISC code generator that is a
modified version of a CISC code generator probably won't do as good a
job of register allocation, register targeting, and handling intermediate
values.  A CISC code generator that is a modified version of a RISC code
generator will probably use mostly a RISC subset of the instructions.

This is not a criticism of Mr. Danieli's thesis, which I have not read.
It's now on my "read sometime" list.  If you tell me he went to great
lengths to address the compiler effects, it goes on my "read as soon
as I can find a copy!" list.
-- 
------------------------------------------------------------------------------
My opinions are my own.  Objects in mirror are closer than they appear.
E-Mail route: ...!pyramid!garth!walter		(415) 852-2384
USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California 94303
------------------------------------------------------------------------------