[net.micro] Lisa benchmark

STERNLIGHT@usc-ecl.arpa (02/05/83)

Berry Kercheval's simple benchmark is said to run in 56.7
seconds on an unloaded Lisa, and over 4 minutes on an IBM
System one.  I find this incredible, since I just compiled
it with BDS/C and ran it on my TRS-80 Mod II running a
4 MHz Z-80 and it took less than 4 seconds to run.  The
run time increased only slightly when I added one, two,
and three zeroes to the loop count.  It's gotta
be either an error or some silliness in the way the C compilers
react to that code on the Lisa and System one.
If you missed his message, the code is:
main()
{
	register int i = 10000000;
	while (i--);
}
The BDS version needs the declaration (register int i;) and the
assignment as separate statements, and runs in the same time whether
you declare i as a register int or just an int.
--david
-------

SHULMAN@rutgers.arpa (02/05/83)

From:  Jeffrey Shulman <SHULMAN@rutgers.arpa>


	That should be easy to explain.  Assuming 16 bit numbers on
the Z-80 the maximum integer is 2^16-1 = 65,535.  Assuming 32 bit
numbers on the M68000 the maximum integer is 2^32 - 1 = 4,294,967,295.
So, assuming nothing funny was going on, the TRS-80 II was only
doing the loop 65,535 times and the Apple Lisa did do the loop
10,000,000 times.  BTW, the same loop written in Pascal (unoptimized)
took exactly 16 seconds on an unloaded DEC-20.

							Jeff
-------

nkm (02/07/83)

I find it pretty hard to believe that your Z-80 was able to count down
from 10 million in 4 seconds. Even if your code compiled to 1 instruction
to do the whole shebang, it seems that your Z-80 would be running at
2.5 MIPS, making it closer to a Cray or a Dorado than something that
runs toasters. Having done a similar benchmark (counting to 50,000,000)
on any number of workstations (HP 9000, Xerox Dolphin, Dandelion, Apollo,
Sun, VAX,...) and gotten numbers on the order of 3 to 7 minutes, I don't
think I'm ready to buy Tandy stock quite yet. My guess is that your BDS/C
compiler thinks ints are 16 bits long, i = 10000000 really assigns 0 to
i, and it takes 4 seconds to test and fail the while condition. Only
a guess...

Norm Meyrowitz
Brown University CS
!{decvax,cornell,vax135}!brunix!nkm

peachey (02/07/83)

	Let's look at these CPU benchmarks logically for a moment.
	First of all, setting an integer to 10 million is not very
	useful if the integer size is 16 bits.  On a 16 bit
	machine, the maximum number of loops before hitting zero
	is 64K, which gives the 16 bit machine an advantage of
	more than 152 times.  This makes it easy for the Z80 to
	do the benchmark in a phenomenally short time.

	At 5 MHz with no wait states, a hand-optimized loop on
	the MC68000 should take about 3.6 microseconds.  Thus,
	36 seconds would be a reasonable time for 10 million
	loops.  Presumably the C compiler didn't generate as
	good code as I did, or some other hardware/software
	factor interfered, but the Lisa time of 56.7 seconds
	is still very reasonable.

	I don't recall which model of Series/1 processor
	IBM had on display at UNICOM.  Judging from the
	average instruction time described in their brochure,
	which ranges from 9.3 microseconds for the slow
	processor, to 2.65 microseconds for the fast one,
	the Series/1 is quite slow.  Even so, if the Series/1
	C compiler is using a 16-bit integer, you would
	expect a run time of a few seconds for the benchmark.
	If it is using a 32-bit integer, run time might
	be as long as ten minutes or so.  However, I suspect
	that it's more likely that something broke in the
	benchmark, and put it into a nonterminating loop.

	In case you're interested, here are some more measurements
	of run times on 10000000 loops with a "long" (32 bit) counter:

	LSI-11/23          206 seconds
	PDP-11/40          171 seconds (300 ns MOS memory)
	Z8000               63 seconds (Zilog System 8000; unknown clock rate)
	PDP-11/70           60 seconds
	10 MHz MC68000      18 seconds (no wait states)
	VAX-11/780          11 seconds

	Of course, the 16-bit machines in this test had to
	do more instructions in each loop to simulate 32-bit
	integers.

				Darwyn Peachey
				Hospital Systems Study Group

				harpo!utah-cs!sask!hssg40!peachey

dag (02/08/83)

I just ran the same benchmark program in Decus-C on an unloaded LSI-11 and
got back in about 14 seconds (hard to tell with a digital clock).  I hand-wrote
the thing in macro to do double integer and it took a little over 30 seconds.
The PDP-11 macro looked like the following...

	mov	#1000,r1
	clr	r0
1$:	dec	r0
	bne	1$
	dec	r1
	bne	1$
	halt
(that was on an 11/23 stand-alone)
						Daniel Glasser
						...!decvax!sultan!dag

dag (02/08/83)

Correction:
	The PDP-11 Macro section that I submitted was the slow version...
	The version that had the good timings looked like:

	begin:	mov	#1000,r1	;r1 is the outer count
		clr	r0
	1$:	sob	r0,1$
		sob	r1,1$
		halt

				...!decvax!sultan!dag
				Daniel Glasser

bernie (02/11/83)

Just for the record, ints under BDS-C are 16 bit.  (Why would they be 8-bit?)

frank (02/14/83)

#R:sri-arpa:-37400:zinfandel:15200005:000:467
zinfandel!frank    Feb 12 10:48:00 1983

/* The test should be */

main()
{
	register long i = 10000000;	/* 10 million, with 32 bit long */
	while(i--)
		;
}
/*
The best I have timed in person was the SUN CPU with the UNISOFT UNIX & C
compiler, running in onboard memory (10 MHz CPU clock), at 63 seconds.

Note: BDS C ver 1.43 (or so) did not support longs
	( ints = 16 bits, char = 8 bits )
*/

Frank Berry	(decvax!sytek!zinfandel!frank)	(415) 932-6900
Zehntel Inc.
2625 Shadelands Drive
Walnut Creek, CA 94598

billn (02/15/83)

Frank must have misread his notes.  AGAIN, timings are:

with register var, 	27 sec.
without register var,	58 sec.

(SUN bd, on-board mem, 10 MHz, Unisoft compiler and UNIX).  I just telnetted
over to one of Zehntel's 68k's and ran the test again.
/b(ill Northlich)

mclure (03/09/83)

#R:sri-arpa:-37400:sri-unix:16000001:000:475
sri-unix!mclure    Feb  5 13:47:00 1983

I thought BDS C on a Z-80 had ints as 8-bit rather than 16-bit?
I've never used BDS C so I can't say for sure.

For comparison purposes, a DEC-20 assembly program to do that takes 5.8
cpu seconds.  That's all in a register: (movei 1, =10000000; sojn 1, .)
When moving it to and from a memory location, it takes 15.9 cpu seconds.

The equivalent program in C with i as a long and with cc -O took 85.7
cpu seconds on a 11/70.  Pretty good performance by Lisa I'd say.

	Stuart

mclure (03/09/83)

#R:sri-arpa:-37400:sri-unix:16000003:000:154
sri-unix!mclure    Feb 13 13:40:00 1983

There was a typo in that DEC-20 program.

	num: =10000000
	move 1, num
	sojn 1, .

since a movei only gets a half-word (18 bits). The time is ~6 seconds.

bernie (03/17/83)

Why on earth would BDS C use 8-bit ints?  It's always had 16-bit ints,
like any other decent compiler.  (Is it even *possible* to write a C compiler
where ints are only 8 bits wide?)
				--Bernie Roehl
				...decvax!watmath!watarts!bernie