[comp.arch] Benchmarks wanted

kruger@16bits.dec.com (I've got 50nS memory. What did you say?) (02/27/88)

I'd like to see the dhrystones benchmark, preferably implemented in C. If you
have it, please mail it to me. Other popular benchmarks also accepted -- e.g.
the Byte test suite and other such nonsense.

Thanks!
dov

eugene@pioneer.arpa (Eugene N. Miya) (02/28/88)

If you want benchmarks, use NBSLIB.
	nbslib@icst-cmr.arpa
Body of message should be like:
	send index
will show available directories.  Do this to see what you get.
	send index from stones
Will show what stones are available.
	send dhryc from stones
Gets you the version 1 dhrystones.

In recent days, I've promised to help them update and maintain this
thing.  I've "refer'ed" their benchmark bibliography and added our
references (Alan Smith's and mine); the NBS benchmark record
will be updated shortly.  The NBS needs our support in this.
You want to add something?  Fine.

From the Rock of Ages Home for Retired Hackers:

--eugene miya, NASA Ames Research Center, eugene@ames-aurora.ARPA
  "You trust the `reply' command with all those different mailers out there?"
  "Send mail, avoid follow-ups.  If enough, I'll summarize."
  {uunet,hplabs,hao,ihnp4,decwrl,allegra,tektronix}!ames!aurora!eugene

fritz@vlsi.Caltech.Edu (fritz nordby) (05/31/90)

Hello.

I'm looking for all the benchmarks I can find.
I know about SPEC, dhrystone, whetstone, linpack,
the Livermore kernels, and a few others.  What I'm
looking for are (1) what people have used as benchmarks,
and (2) where I can get the code they ran.

The types of benchmarks I'm primarily interested in are
CPU/memory benchmarks, but please, don't hold back, if
you know of something, send it to me.

I'm particularly interested (at the moment) in a group of
small CPU benchmarks: the Sieve of Eratosthenes, Ackermann's
function, Towers of Hanoi, Baskett's puzzle, quick-sort,
the 8-queens problem.  These benchmarks were popular around
the time of the original RISC papers, and in some cases I
have been able to find the algorithm, but I'm searching for
some of the algorithms and most of the input data sets.
If you know anything about these benchmarks, I'd very much
appreciate knowing what you can tell me.

Again, this is a rather peculiar request or rather particular
interest.  Please respond via e-mail.  If there is a general
interest, I will summarize what I find.

Thank you very much for your time, and for any help you can give.

	Fritz Nordby.	fritz@vlsi.caltech.edu	cit-vax!cit-vlsi!fritz

jan@ivory.SanDiego.NCR.COM (Jan Stubbs) (06/02/90)

In article <15049@cit-vax.Caltech.Edu> fritz@vlsi.caltech.edu (fritz nordby) writes:
>
>Hello.
>
>I'm looking for all the benchmarks I can find.
>I know about SPEC, dhrystone, whetstone, linpack,
>the Livermore kernels, and a few others.  What I'm
>looking for are (1) what people have used as benchmarks,
>and (2) where I can get the code they ran.
>

IOCALL RESULTS  10,001 Iteration Version, 9/27/87

SYSTEM				UNIX VERSION		SYSTEM TIME SECONDS
-----------			----------------	-------------------
Dec Pro-380			2.9 BSD			184
MicroVax I			Ultrix V1.1		180
Altos 68000			SIII, Altos v2.0a	178.6
Dec Rainbow			Venix/86 v2.0		177.5
DEC Rainbow100 w/NECV20 	Venix/86		148 *d
Onyx C8002s Z8000		SIII			137 	
Onyx C8002 Z8000		v7			130
Symmetric 375 10MHz NS32016	4.2BSD			128.7
TIL NS32016 9MHz No Wait states	Local Port		122
VAX 11/750			BSD 4.2			114.7
PC/AT w/Sritek M68000		SV/68 Rel2 V1.0		114
Sequent Balance 8000  6 CPU	Dynix V2.1		113.8 *b
Sequent Balance 21000 6 CPU				110.5 *c
Symmetric 375						101.4
PC Limited 286-6 6MHz 80286	SCO Xenix SV/286 V2.1.3	100.5
Tandy 6000 8Mhz M68000		Xenix 3.0		109 
ATT 3b2/300			SV			103
VAX 11/750			4.2 BSD			100
VAX 11/750			Ultrix V1.2
8 MHz 80286, 1 wait state	Microport SV/AT286 2.2u	98.5
PLexus P35 12.5 MHz M68000	SIII			98
ICM-32016 10 MHz NS 32016	SV.2			98
PDP 11/44			ISR 2.9 BSD		95
Motorola S2000 10MHz M68000	SV/68 Rel1		94
Concurrent XF/200 (PE7350A)	?			93
ICM32332 Beta version					90.2
VAX 11/750			4.3 BSD			90
Sun-2 10MHz 68010		4.2 BSD Rel 2.0		90
Sun-2				4.2BSD Rel 3.2		88
VAX 11/750			SV.2			88
Sun-2 10MHz 68010		4.2 BSD Rel 3.0 	87
Plexus P60 M68000		SIII			87
PE 3220				V7 Workbench		85 *a
ATT 3b2/400			SV.2			83
VAX 11/750			research version 8	81
HP9000-530 2 CPU's					75.0 *c
HP9000-320 16MHz 68020					73.4
Apple Mac II 16MHz M68020	A/UX 5.2 r1 (SV.2)	73.1		*n
NCR PC-8 8MHz 80286		SCO Xenix-286 SV R2.0.4	72.7
VAX 11/750			4.1 BSD			72
Radio Shack 16A			Xenix (v7)		72 
Altos 886 80286			Xenix 3.2fs3		71.1
Sperry IT 8MHz 80286		Xenix 5.0		71
ICL DRS300 8MHz 80286		DRS/NX (SV.2)		70.7
VAX 11/750			4.1BSD (lightly hacked)	70
PC/AT 				Venix 5.2		68
Encore Supermax NS32032		SV.2 Version 4		68.0 *c
VAXStation 2000			4.3BSD+NFS (wisconsin)	67.5
Arete 1100 M680?0		SV.2			65 *c
ATT7300 Unix PC 10MHz 68010	SV.2			64
IBM PC/RT 170Ns			4.2BSD			64
Concurrent 3230			Xelos Rel R01 (SV)	64 
MicroVAXII			4.3BSD+NFS (wisconsin)	63.8
Convergent MiniFrame		CTIX 3.20		63.3
Gould PN6080			UTX 1.1C		62 
MicroVaxII			Mach4.3			61.7
Pyramid 90x w/cache		OSx2.5			61
Apollo DN300 10 MHz M68000	Domain/IX		60 *e
IBM PC/RT 170MHz		?			60
Pyramid 90x w/cache		OSx2.3			58
ATT 3B15			SV.2			56.1
Plessey Mantra 12.5Mhz 68000	Uniplus SV Release 0	55
MicroVax II			Ultrix/32-m V1.2	53.4
VAX 11/780			4.2 BSD			53
Concurrent 3250XP		Xelos Rel R01 (SV)	52 
MicroVax II			Ultrix 1.1		52
HP9000-550 3cpu's		HP-UX 5.01		51 *c 
PC/AT 7.5 Mhz			Venix286 SV.2		51
Sun 3/50 			SunOS 3.2		50.9
Sun 3/52 16MHz 68020					50.1
VAX 11/780			SV.2			50 
Convex C-1			4.2 BSD			46
AT&T 3b20S			SV.2			44.4
Alliant FX/8 2 IPs, 4 CEs	Concentrix 2.0 (4.2BSD)	43.3 *c
IBM PC/RT			4.2A (4.2BSD)		42.8
IBM 4341 II			UTS 2.4(V7 on VM)	42   
PC/AT 9MHz 80286		Venix SV.2		40.9
Gould PN 6040			UTX/32 1.2		39.7
Plexus P/60 12.5 MHz 68020	Rel 1.5 of SV.2		38.3
VAX 11/785			4.3 BSD			36
Sun-3/75 16.67Mhz 68020		4.2 BSD			36
Sun-3/160M-4 16.67Mhz 68020	4.2 BSD Rel 3.0 Alpha	36
Altos 3068 16.67 MHz		SV.2			34.7
DG MV10000			DG/UX 3.00		33
CT MightyFrame S/320 68020	CTIX (SV.2)		32.1
Compaq 386-130 16Mhz 80386	SCO Xenix-386 2.2.1 SV3	31.2
HP9000/350 25MHz M68020		HP-UX 5.22 (SV)		30.2
Apollo Dn330 12Mhz M68020	Domain/IX		30  *e
VAX 11/785			SV.2			30
Celerity 1230 Accel/32 (NCR/32)	4.2 BSD			28.6  
Masscomp 5600 16Mhz 68020	RTU 3.1 (SV)		28.5
Gould Concept/97		UTX/32 1.2 (4.3BSD/SV2)	28
GEC 63/40			S 5.1			27
Gould PN9080			UTX 1.2	(4.3BSD)	25
ICL Clan 4 Model245 12Mhz 68020	Uniplus+ V.2		24.5
Diab DS 90-20 16MHz 68020	D-NIX 5.2.1.2 (SV)	23.9
Sun 3/260			SunOS 3.2		20.2
Gould PN9080			UTX 2.0 (4.3BSD)	19.4
Sperry 7000/40 aka CCI 6/32	4.2 BSD			19
ConvergentServer/PC 20MHz 80386 SV.3			19.0
MIPS M/500, 8Mhz R2000		UMIPS-BSD 2.0 (4.3+NFS)	19.0
VAX 11/785			Ultrix 2.0		18.9
Sun 3/280 25MHz M68020		SunOS 3.3		18.5
HP9000-840 (RISC)		HP-UX (4.2BSD)		17.2
Edge I/Model 1100 130NS.	GSX 3.1 (SV2)		14.9
Harris HCX-7			HCX-UX Vers. 2.2 (SV+)	14.5	*n
MIPS M/500, 8MHz R2000		UMIPS-V 1.1 (SVR3)	13.7
VAX 8600			4.3 BSD			12
VAX 8600			Ultrix 1.2-1		11
Sun 4/260 Sunrise		SunOS			11.0
IBM 3083			UTS SV			10  
Cray 2, 4NS Clock, 4 CPU	Unicos			10.4 *c, *d
Cray 2, 4NS Clock, 1 CPU	Unicos			9.4 *d
Amdahl 470/V8			UTS/V (SV Rel 2,3)V1.1+ 9  
MIPS M/800, 12.5Mhz R2000	UMIPS-BSD 2.0 (4.3+NFS)	8.4
MIPS M/1000, 15MHz R2000	UMIPS-BSD 2.01		7.1
VAX 8550			Ultrix 2.0		3.8
Cray X/MP-24			SysV (Pre release 8)    3.8
Amdahl 5890			UTS SV.2		3.63


Notes:
*c
Multi-cpu system. IOCALL was run single thread, which probably did not
utilize all cpu's. This system probably has considerably more power than
is reflected by the result. A better measurement of this system's capability
is to run as many copies of IOCALL as there are processors, in a script
with unique file names, and report the longest of the resulting system times
divided by the number of cpus, which should be about the same on each processor,
but longer than the single copy time.
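
A minimal sketch of the multi-copy run described in this note, assuming a
POSIX system with fork(), waitpid() and getrusage(). The names iocall_copy
and iocall_multi, the iters parameter, the MAX_COPIES cap and the
/tmp/iocall.<pid>.<k> file names are illustrative assumptions, not part of
the posted IOCALL. It reads "divided by the number of cpus" as the longest
child system time divided by the number of copies started.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_COPIES 64	/* arbitrary cap for this sketch */

/* One copy of the IOCALL loop on its own scratch file.  Runs in a child
   and never returns.  The posted benchmark corresponds to iters = 10001. */
static void iocall_copy(const char *path, int iters)
{
	char buf[512] = {0};
	int i, j;
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		_exit(1);
	unlink(path);
	for (i = 0; i < iters; i++) {
		lseek(fd, 0L, SEEK_SET);
		if (write(fd, buf, 500) != 500)
			_exit(1);
		lseek(fd, 0L, SEEK_SET);
		for (j = 0; j <= 3; j++)
			if (read(fd, buf, 100) < 0)
				_exit(1);
	}
	_exit(0);
}

static double sys_seconds(const struct rusage *r)
{
	return (double)r->ru_stime.tv_sec + r->ru_stime.tv_usec / 1e6;
}

/* Start ncopies children, one unique file name each, and return the
   longest child system time divided by ncopies, or -1.0 on any error. */
double iocall_multi(int ncopies, int iters)
{
	pid_t pids[MAX_COPIES];
	double longest = 0.0;
	int started = 0, failed = 0, k;

	if (ncopies < 1 || ncopies > MAX_COPIES)
		return -1.0;
	for (k = 0; k < ncopies; k++) {
		char path[64];

		snprintf(path, sizeof path, "/tmp/iocall.%ld.%d",
		    (long)getpid(), k);
		pids[k] = fork();
		if (pids[k] < 0) {
			failed = 1;
			break;
		}
		if (pids[k] == 0)
			iocall_copy(path, iters);
		started++;
	}
	assert(started <= ncopies);
	for (k = 0; k < started; k++) {
		struct rusage before, after;
		int status;
		double t;

		/* RUSAGE_CHILDREN grows only when a child is waited for,
		   so the difference is this one child's system time. */
		getrusage(RUSAGE_CHILDREN, &before);
		if (waitpid(pids[k], &status, 0) < 0 ||
		    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failed = 1;
		getrusage(RUSAGE_CHILDREN, &after);
		t = sys_seconds(&after) - sys_seconds(&before);
		if (t > longest)
			longest = t;
	}
	return failed ? -1.0 : longest / ncopies;
}
```

Calling iocall_multi(6, 10001) would correspond to a six-copy run on a
six-cpu machine; whether the copies really land on separate processors is
up to the scheduler, which this sketch does not control.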

*d
This time was run on a busy system, so it probably is not the best time
that is attainable on an idle system. A busy VAX time is about 20% worse
than the idle time, but whether this applies to other machines is unknown.

*e
Real time reported because system time appeared to be unreasonable.
Some implementations of the Unix kernel don't properly charge for the
CPU time spent doing IO.

*m
This time uses a modified version of IOCALL for a mapped file under the
Mach variant of Unix. The time is much better because the mapped file interface
doesn't use system calls, and is ideal for this repetitive and unlikely
program.

Send your results to me directly. The benchmark is a "C" program
which measures Unix kernel performance. 

To run it, put the source below in iocall.c, then:
cc iocall.c -o iocall
time iocall     

Send all 3 times (user, system, real), but I am reporting the system
time only. The user time for this benchmark should be insignificant.
The real time should be about equal to system time plus user time; if not,
you aren't running a real Unix, or your Unix has a bug. (Some people have
reported finding a bug in their port after running IOCALL.) On BSD systems,
which report the number of IO's from the time command, the number of IO's
should be 2 or 3 for the open, and a few paging IO's to read in the program.
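
The real-versus-CPU sanity check above can be sketched as a small C
function. This is only an illustration: the name times_consistent and the
slack parameter (the allowed mismatch, as a fraction) are assumptions, and
the post does not say how close "about equal" has to be.

```c
#include <assert.h>

/* Returns 1 if real time agrees with user + system time to within
   `slack` (a fraction such as 0.10), else 0.  All times in seconds. */
int times_consistent(double user, double sys, double real, double slack)
{
	double cpu = user + sys;
	double diff = real > cpu ? real - cpu : cpu - real;
	double larger = real > cpu ? real : cpu;

	assert(slack >= 0.0);
	return diff <= slack * larger;
}
```

With made-up figures, user 0.1, system 53.0 and real 54.0 pass at a slack
of 0.10, while a real time of 120.0 against the same user and system times
does not, which would be a hint to look at the load on the machine or at
how the kernel is charging IO time.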

Please also send:
1) Type of machine and model #.
2) Brand, model and clock rate of Microprocessor if any.
3) Version and name of OS, and its ancestry (e.g. SV2 or BSD 4.2)
         
The opinions expressed herein are those of the author. Your mileage may vary.
The times herein are obtained from unreliable sources, rely on them at 
your peril.
 
The benchmark should be run on an otherwise idle machine. If you can, please
run it that way; it does improve the timings.

COMMENTARY:
What does this benchmark measure? It attempts to simulate a typical mix
of reading, writing and seeking. The cpu time used in the Unix kernel is
reported by the kernel.

It exercises the system call interface in a way less trivial than the
getpid benchmark. It does not measure, and is independent of, your IO hardware
and drivers. It does seem to show differences in Unix kernel efficiency on
the same hardware.  It will heavily exercise your caches, and perhaps
your block move bandwidth.

NO BENCHMARK IS PERFECT, (except your application),
but this one shows what a very IO intensive workload with good buffer
cache hit rates runs like on your cpu. 
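
As a rough worked example of what the reported number means, the system
time can be turned into an average cost per system call. The source below
makes 2 lseeks, 1 write and 4 reads on each of 10,001 passes, i.e. 70,007
calls in the loop. The function and macro names here are assumptions for
illustration, the four setup calls (creat, close, open, unlink) are
ignored, and the average lumps cheap and expensive calls together.

```c
#include <assert.h>

#define IOCALL_ITERATIONS	10001	/* i = 0..10000 */
#define CALLS_PER_ITERATION	7	/* 2 lseek + 1 write + 4 read */

/* Average microseconds of system time per call in the IOCALL loop. */
double iocall_usec_per_call(double system_seconds)
{
	double calls = (double)IOCALL_ITERATIONS * CALLS_PER_ITERATION;

	assert(system_seconds >= 0.0);
	return system_seconds * 1e6 / calls;
}
```

Fed the 53 second VAX 11/780 (4.2 BSD) figure from the table, this comes
to roughly 757 microseconds per call; the 3.63 second Amdahl 5890 figure
comes to roughly 52.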

Many synthetic benchmarks are criticized for giving unrealistic
results when run through optimizers that may throw out stuff that
does nothing useful. This is NOT a problem with IOCALL. If your
compiler finds something in the UNIX kernel that does nothing useful
and throws it out, MORE POWER TO IT!


-------cut----cut------cut-------------------------------

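/* This benchmark tests speed of Unix system call interface
   and speed of cpu doing common Unix io system calls. */

#include <fcntl.h>	/* creat, open, O_RDWR */
#include <unistd.h>	/* close, unlink, lseek, read, write, SEEK_SET */

char buf[512];
int fd,count,i,j;

int main(void)
{
 fd = creat("/tmp/testfile",0777);
 close(fd);
 fd = open("/tmp/testfile",O_RDWR);
 unlink("/tmp/testfile");	/* file stays usable until fd is closed */
 for (i=0;i<=10000;i++) {	/* 10,001 passes */
  	/* do seek, write, seek, then four reads (j = 0..3) */
 	lseek(fd,0L,SEEK_SET);
 	count = write(fd,buf,500);
 	lseek(fd,0L,SEEK_SET);

 	for (j=0;j<=3;j++)
	 	count = read(fd,buf,100);
  }
 return 0;
}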
-----cut---cut---cut---cut-----------------------------------------
"There are lies, damn lies, and benchmarks."

Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
619 485-3052
NCR Corporation Advanced Development
16550 W. Bernardo Drive MS4010
San Diego, CA. 92127