[comp.arch] VAX MIPS VUPS

eugene@pioneer.arpa (Eugene N. Miya) (03/23/88)

One tack on this problem is to run as small a kernel as possible.
I call this a "minimal equivalent benchmark."  This is a form of calibration
rather than application benchmarking.  The "minimal equivalent benchmark" or
"program" should be the smallest program kernel written in a high-level language
which can reproduce 1 million instructions (use a macro processor like cpp
or m4 to paper over differences between 1,000,000 and 1,048,576 (2^20)).
Assume the appearance of a variable in an expression is either a load or
store from or to memory.  (Yes, smart compilers can optimize the loop out
entirely, pre-compute the work, keep everything in registers, or apply
numerous other optimizations; you take this into account with the idea
of "equivalence," then seek other testing/calibration strategies, see below.)

	i = 0
c	sync clock
c	start clock
 10	i = i + 1
	if (i .le. ONE_MILLION) goto 10
c	stop clock

Question: how long does this take?  Can something be written in Fortran or C
or Pascal (Ada) which is tighter?  (Sure: while (i++ < ONE_MILLION) {})
Now, ONE_MILLION might actually be some other value; the idea is that it is
set by deciding how many "virtual" instructions are executed.  The idea is to
get a minimum loop to "execute" 1 million instructions.  For floating point OPs:
change i to a REAL (float/double), make 1 into 1.0, etc.  Tighter loops in
assembly language are possible, but these aren't portable.  The idea is to
get the NO-OP case, to see whether you really can get 1 MIP or whatever.
The point is: what does a chunk of code have to look like to get 1 MIP?

From the Rock of Ages Home for Retired Hackers:

--eugene miya, NASA Ames Research Center, eugene@ames-aurora.ARPA
  "You trust the `reply' command with all those different mailers out there?"
  "Send mail, avoid follow-ups.  If enough, I'll summarize."
  {uunet,hplabs,hao,ihnp4,decwrl,allegra,tektronix}!ames!aurora!eugene