[net.arch] IOCALL Benchmark

stubbs@ncr-sd.UUCP (Jan Stubbs) (10/21/86)

	IOCALL, A UNIX SYSTEM PERFORMANCE BENCHMARK
Results as of 10/17/86.

This version of the benchmark does 10,000 iterations instead of
1000. Unix machines are getting so fast that 1000 iterations ran too
quickly to measure accurately on some CPUs. Just divide your results
by 10 on this benchmark to compare to my list. Next time I will report
times normalized to this benchmark, i.e., they will all be 10 times
bigger.
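
The divide-by-10 rule above can be written down as a small helper. This is only a sketch: the name normalize_to_list and the two constants are made up for illustration and are not part of the posted benchmark.

```c
/* Hypothetical helper, not part of the posted benchmark. */
#define LIST_ITERATIONS      1000.0   /* scale used by the results list */
#define BENCHMARK_ITERATIONS 10000.0  /* scale of this version */

/* Scale a system time measured with the 10,000-iteration version
   down to the 1000-iteration scale of the results list. */
double normalize_to_list(double system_seconds)
{
    return system_seconds * (LIST_ITERATIONS / BENCHMARK_ITERATIONS);
}
```

For example, a measured system time of 53.0 seconds would be compared against the list as roughly 5.3 seconds.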

Send your results to me directly. The benchmark is a "C" program
which measures Unix kernel performance. Run it as:
time iocall     Send all 3 times (user, system, real).
                I am reporting the system time only.
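
Where no time command is handy, the three figures can be collected with a small wrapper. The following is a minimal sketch, assuming a system that provides getrusage(), gettimeofday() and waitpid(); the names run_times and time_command are hypothetical and do not come from the benchmark.

```c
#define _XOPEN_SOURCE 700   /* ask for the XSI declarations used below */
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result record: the three figures "time" prints. */
struct run_times { double user, sys, real; };

static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Run argv[0] with arguments argv, wait for it, and fill *out with the
   child's user, system and elapsed times in seconds.
   Returns 0 if the child ran and exited normally, -1 otherwise. */
int time_command(char *const argv[], struct run_times *out)
{
    struct rusage before, after;
    struct timeval start, end;
    int status;
    pid_t pid;

    if (getrusage(RUSAGE_CHILDREN, &before) != 0) return -1;
    if (gettimeofday(&start, NULL) != 0) return -1;

    pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0) return -1;

    if (gettimeofday(&end, NULL) != 0) return -1;
    if (getrusage(RUSAGE_CHILDREN, &after) != 0) return -1;

    /* RUSAGE_CHILDREN is cumulative, so report the difference. */
    out->user = tv_seconds(after.ru_utime) - tv_seconds(before.ru_utime);
    out->sys  = tv_seconds(after.ru_stime) - tv_seconds(before.ru_stime);
    out->real = tv_seconds(end) - tv_seconds(start);

    return (WIFEXITED(status) && WEXITSTATUS(status) != 127) ? 0 : -1;
}
```

Calling time_command with an argv whose first element is the path to the compiled iocall binary fills in all three times; the sys field corresponds to the figure reported in the list.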
         
"The opinions expressed herein are those of the author. Your mileage may vary".
 
The benchmark should be run on an otherwise idle machine. Please run
it that way if you can; it does improve the timings.

COMMENTARY:
I have had some criticism of this benchmark, to wit:
It unfairly penalizes machines which do not have CPU data caching on the
Unix buffer pool.

My Response:
It does penalize such machines, because it heavily emphasizes the function
of copying data from the buffer pool and back again. Whether this is unfair
depends on whether references to such data are sufficiently heavy in your
application that caching is a good idea. I bet that they are, if you have
a heavily interactive, multiuser, and IO intensive environment.

In particular, things like the superblock and your home directory are
probably referenced a bunch.

Also, in reference to systems which cache other stuff, but not the buffer
pool, keep in mind that there are (in my opinion) two reasons for a CPU
data cache: one is to improve memory read latency on cache hits; the other
is to improve memory bandwidth by optimizing for sequential memory access
on a cache line fill after a cache miss.

Thus even if your cache is so small that you have a poor hit rate, 
you may win on sequential memory access when, for example, you are 
moving data from a Unix buffer pool into the user program area on a 
read system call.
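
The sequential copy described above can be imitated in user space. The sketch below is only a stand-in: it copies one 512-byte block between two ordinary arrays with memcpy() and does not touch the kernel or its buffer pool, so it says nothing about any particular system's read path. The names pool_block, user_area and time_block_copies are hypothetical.

```c
#include <string.h>
#include <time.h>

#define BLOCK_SIZE 512

static char pool_block[BLOCK_SIZE];  /* stands in for a buffer pool block */
static char user_area[BLOCK_SIZE];   /* stands in for the user's buffer */

/* Copy one block `copies` times and return the CPU seconds used.
   One source byte changes on every pass and one destination byte is
   summed into *checksum, so the copies cannot simply be optimized out. */
double time_block_copies(long copies, unsigned long *checksum)
{
    clock_t start, end;
    unsigned long sum = 0;
    long n;

    start = clock();
    for (n = 0; n < copies; n++) {
        pool_block[n % BLOCK_SIZE] = (char)n;
        memcpy(user_area, pool_block, BLOCK_SIZE);  /* sequential access */
        sum += (unsigned char)user_area[n % BLOCK_SIZE];
    }
    end = clock();

    *checksum = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing the time for a large number of copies on two machines gives a rough feel for how much of the difference in IOCALL results might come from the copy itself rather than from system call overhead.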

Also, many synthetic benchmarks are criticized for giving unrealistic
results when run through optimizers that may throw out stuff that
does nothing useful. This is NOT a problem with IOCALL. If your
compiler finds something in the UNIX kernel that does nothing useful
and throws it out, MORE POWER TO IT!


-------cut----cut------cut-------------------------------

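/* This benchmark tests the speed of the Unix system call interface
   and the speed of the CPU doing common Unix I/O system calls. */

#include <fcntl.h>	/* creat, open */
#include <unistd.h>	/* close, unlink, lseek, read, write */

char buf[512];
int fd, count, i, j;

int main(void)
{
	/* Create the scratch file, reopen it read/write, then unlink it
	   so that it disappears when the program exits. */
	fd = creat("/tmp/testfile", 0777);
	close(fd);
	fd = open("/tmp/testfile", O_RDWR);
	unlink("/tmp/testfile");

	/* Loop bounds are kept as posted so that timings stay comparable. */
	for (i = 0; i <= 10000; i++) {
		lseek(fd, 0L, SEEK_SET);	/* rewind before the write */
		count = write(fd, buf, 500);
		lseek(fd, 0L, SEEK_SET);	/* rewind before the reads; offset is a long */

		for (j = 0; j <= 3; j++)
			count = read(fd, buf, 100);
	}
	return 0;
}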
-----cut---cut---cut---cut-----------------------------------------
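
To turn a reported system time into a rough cost per system call, count the calls in the loop as posted: two lseeks, one write and four reads on each pass, and the loop bounds give 10,001 passes. The sketch below does that arithmetic. The helper names are hypothetical, and the four setup calls outside the loop (creat, close, open, unlink) are ignored, so the result is only an average.

```c
/* Counts taken from the benchmark loop as posted. */
#define PASSES          10001L  /* for (i=0;i<=10000;i++) */
#define LSEEKS_PER_PASS 2L
#define WRITES_PER_PASS 1L
#define READS_PER_PASS  4L      /* for (j=0;j<=3;j++) */

/* Total system calls issued inside the timed loop. */
long calls_in_loop(void)
{
    return PASSES * (LSEEKS_PER_PASS + WRITES_PER_PASS + READS_PER_PASS);
}

/* Average microseconds of system time per call, given the system time
   of one full run of the 10,000-iteration version. */
double usec_per_call(double system_seconds)
{
    return system_seconds * 1e6 / (double)calls_in_loop();
}
```

The loop issues 70,007 calls, so a system time of about 7 seconds for a full run works out to roughly 100 microseconds per call.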

"There are lies, damn lies, and benchmarks."

Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
619 485-3052
NCR Corporation Advanced Development
16550 W. Bernardo Drive MS4010
San Diego, CA. 92127

              IOCALL RESULTS

SYSTEM				UNIX VERSION		SYSTEM TIME (SECONDS)
-----------			----------------	-------------------

DEC Pro-380			2.9 BSD			18.4
MicroVax I			Ultrix V1.1		18.0
DEC Rainbow100 w/NECV20 	Venix/86		14.8 *d
Onyx C8002s Z8000		SIII			13.7 *a	
Onyx C8002 Z8000		v7			13.0
TIL NS32016 9MHz No Wait states	Local Port		12.2
PC/AT w/Sritek M68000		SV/68 Rel2 V1.0		11.36
Tandy 6000 8MHz M68000		Xenix 3.0		10.9
ATT 3b2/300			SV			10.3
VAX 11/750			4.2 BSD			10.0
Plexus P35 12.5 MHz M68000	SIII			9.8
ICM-3216 10 MHz NS 32016	SV.2			9.8
PDP 11/44			ISR 2.9 BSD		9.5
Motorola S2000 10MHz M68000	SV/68 Rel1		9.42
Concurrent XF/200 (PE7350A)	?			9.3
VAX 11/750			4.3 BSD			9.0
Sun-2 10MHz 68010		4.2 BSD Rel 2.0		9.0
VAX 11/750			SV.2			8.8
NCR Tower/XP			Tower 3.0 (SV.2)	8.8
Sun-2 10MHz 68010		4.2 BSD Rel 3.0 	8.7
Plexus P60 M68000		SIII			8.7
ATT 3b2/400			SV.2			8.3
VAX 11/750			research version 8	8.1
VAX 11/750			4.1 BSD			7.2
Radio Shack 16A			Xenix (v7)		7.2 
Sperry IT 8MHz 80286		Xenix 5.0		7.1
VAX 11/750			4.1BSD (lightly hacked)	6.97
PC/AT 				Venix 5.2		6.8
Arete 1100 M680?0		SV.2			6.5 *c
ATT7300 Unix PC 10MHz 68010	SV.2			6.4
IBM PC/RT 170MHz		4.2BSD			6.4
Concurrent 3230			Xelos Rel R01 (SV)	6.4 *b
Gould PN6080			UTX 1.1C		6.2 *b
Pyramid 90x w/cache		OSx2.5			6.1
Apollo DN300 10 MHz M68000	Domain/IX		6.0 *e
IBM PC/RT 170MHz		?			6.0
Pyramid 90x w/cache		OSx2.3			5.8
Plessey Mantra 12.5MHz 68000	Uniplus SV Release 0	5.5
VAX 11/780			4.2 BSD			5.3
Concurrent 3250XP		Xelos Rel R01 (SV)	5.2 *b
MicroVax II			Ultrix 1.1		5.2
HP9000-550 3cpu's		HP-UX 5.01		5.1 *c 
PC/AT 7.5 MHz			Venix286 SV.2		5.1
VAX 11/780			SV.2			5.0 
Convex C-1			4.2 BSD			4.6
IBM 4341 II			UTS 2.4(V7 on VM)	4.23 *b
VAX 11/785			4.3 BSD			3.6
Sun-3/75 16.67MHz 68020		4.2 BSD			3.6
Sun-3/160M-4 16.67MHz 68020	4.2 BSD Rel 3.0 Alpha	3.6
Apollo DN330 12MHz M68020	Domain/IX		3.0 *e
VAX 11/785			SV.2			3.0
Celerity 1230 Accel/32 (NCR/32)	4.2 BSD			3.0 *b, *c
Gould Concept/97		UTX/32 1.2 (4.3BSD/SV2)	2.78
GEC 63/40			S 5.1			2.7
Gould PN9080			UTX 1.2	(4.3BSD)	2.5
Sperry 7000/40 aka CCI 6/32	4.2 BSD			1.9
VAX 8600			4.3 BSD			1.2
VAX 8600			Ultrix 1.2-1		1.1
IBM 3083			UTS SV			1.0 *b
Amdahl 470/V8			UTS/V (SV Rel 2,3)V1.1+ .98 *b
Cray X/MP-24			SysV (Pre release 8)    .38 *b


Notes:

*b
This result was obtained on a system which probably had other
programs running at the time. Submitter is requested to rerun,
if possible, when the system is idle. This will improve
the result somewhat.

*c
Multi-CPU system. IOCALL ran as a single thread, which probably did not
utilize all CPUs. This system probably has considerably more power than
is reflected by the result.

*e
Real time reported because system time appeared to be unreasonable.