[net.arch] IOCALL results

stubbs@ncr-sd.UUCP (Jan Stubbs) (12/18/85)
In article <457@rna.UUCP> dan@rna.UUCP (Dan Ts'o) writes:
>	- I have trouble understanding the point of the benchmark program.
>this whole routine just does user/kernel buffer copies, back and forth. If the
>performance of the system call interface and user/kernel memory copies is the
>what is trying to be measured, then the results may be okay, although strangely
>obtained. I don't believe it measures much else in the way of kernel
>performance, or system performance. 

This is exactly what IOCALL is intended to measure. More reads are done than
writes because that's what users do the most of. I purposely avoided doing
physical I/O to the extent possible because I wanted the results to be
independent of disk access time and buffer cache size.

>	- The numbers are way to small to interpret with any substantial
>significance(i.e. you should run the benchmark with say 10000, rather than 1000

This is a valid criticism. I encourage everyone to do this on fast machines,
and divide the results by 10. Nevertheless I find this doesn't affect the
ratings much.

>	- That a Radio Shack 16A performs 25% better than a VAX 11/750 is cute
>but little practical interest (read ridiculous, a benchmark that tells me that
>is probably not going to be very useful, are we really to think that an
>Amdahl 470/V8 is only 12% faster than a VAX8600, that a Pyramid is slower than
>a VAX 11/780).

I believe the results show that the major difference between a Radio Shack
16A and a VAX 750 is not in the CPU; it is more likely in the disk I/O
subsystem. They also suggest that for CPU-intensive work a Radio Shack may be
a better deal than a VAX 750, or that for some workloads the size of your
buffer cache is more important than the speed of your disk.

The Amdahl times were probably adversely affected by other users on the
system. See note *b. A Pyramid is apparently slower than an 11/780 at moving
bytes to and from the buffer cache, though it more than makes up for this in
other areas such as context switching, calls and parameter passing.

The latest results so far are below. Thanks everybody. Merry Xmas.
Send your results to me directly. The benchmark is a "C" program
which measures Unix kernel performance. 
time iocall     Send all 3 times (user, system, real)
                I am reporting the system time only.
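
For anyone who would rather read the system time from inside a program than
from time(1), here is a minimal sketch using the standard times() call. It is
not part of IOCALL; system_seconds is a hypothetical helper name, and the
sketch assumes a system that provides sysconf(_SC_CLK_TCK).

```c
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper, not part of IOCALL.
   Returns the system (kernel) CPU time charged to this process so far,
   in seconds, or -1.0 if the tick rate or the times are unavailable. */
double system_seconds(void)
{
    struct tms t;
    long ticks = sysconf(_SC_CLK_TCK);   /* clock ticks per second */

    if (ticks <= 0 || times(&t) == (clock_t)-1)
        return -1.0;
    return (double)t.tms_stime / (double)ticks;
}
```

Calling it once before the benchmark loop and once after, and subtracting,
should give a figure comparable to the system time that time(1) prints.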
         
"The opinions expressed herein are those of the author. Your mileage may vary".
 
The benchmark should be run on an otherwise idle machine. Please do so if you
can; it does improve the timings.


-------cut----cut------cut-------------------------------

/*This benchmark tests speed of Unix system call interface
  and speed of cpu doing common Unix io system calls. */

char buf[512];
int fd,count,i,j;

main()
{
  fd = creat("/tmp/testfile",0777);
  close(fd);
  fd = open("/tmp/testfile",2);	/* 2 = open for reading and writing */
  unlink("/tmp/testfile");
  for (i=0;i<=1000;i++) {
    lseek(fd,0L,0);		/* add this line! */
    count = write(fd,buf,500);
    lseek(fd,0L,0);		/* second argument must be long */

    for (j=0;j<=3;j++)
      count = read(fd,buf,100);
  }
}
-----cut---cut---cut---cut-----------------------------------------

              IOCALL RESULTS

SYSTEM				UNIX VERSION		SYSTEM TIME SECONDS
-----------			----------------	-------------------

Dec Pro-380			2.9 BSD			18.4
MicroVax I			Ultrix V1.1		18.0
DEC Rainbow100 w/NECV20 	Venix/86		14.8 *d
Onyx C8002s Z8000		SIII			13.7 *a	
Onyx C8002 Z8000		v7			13.0
TIL NS32016 9MHz No Wait states	Local Port		12.2
Tandy 6000 8Mhz M68000		Xenix 3.0		12.0 *e
ATT 3b2/300			SV			10.3
VAX 11/750			4.2 BSD			10.0
PDP 11/44			ISR 2.9 BSD		9.5
VAX 11/750			4.3 BSD			9.0
Sun-2 10MHz 68010		4.2 BSD Rel 2.0		9.0
VAX 11/750			SV.2			8.8
Sun-2 10MHz 68010		4.2 BSD Rel 3.0 	8.7
PE 3220				V7 Workbench		8.5 *a
VAX 11/750			research version 8	8.1
VAX 11/750			4.1 BSD			7.2
Radio Shack 16A			Xenix (v7)		7.2 *a
PC/AT 				Venix 5.2		6.8
ATT7300 Unix PC 10MHz 68010	SV.2			6.4
Concurrent 3230			Xelos Rel R01 (SV)	6.4 *b
Gould PN6080			UTX 1.1C		6.2 *b
Pyramid 90x w/cache		OSx2.5			6.1
Pyramid 90x w/cache		OSx2.3			5.8
Plessey Mantra 12.5Mhz 68000	Uniplus SV Release 0	5.5
VAX 11/780			4.2 BSD			5.3
Concurrent 3250XP		Xelos Rel R01 (SV)	5.2 *b
MicroVax II			Ultrix 1.1		5.2
HP9000-550 3cpu's		HP-UX 5.01		5.1 *c 
PC/AT 7.5 Mhz			Venix286 SV.2		5.1
VAX 11/780			SV.2			5.0 *d
Convex C-1			4.2 BSD			4.6
VAX 11/785			SV.2			4.4
IBM 4341 II			UTS 2.4(V7 on VM)	4.23 *b
VAX 11/785			4.3 BSD			3.6
Sun-3/75 16.67Mhz 68020		4.2 BSD			3.6
Sun-3/160M-4 16.67Mhz 68020	4.2 BSD Rel 3.0 Alpha	3.6
Gould Concept/97		UTX/32 1.2 (4.3BSD/SV2)	2.78
GEC 63/40			S 5.1			2.7
Gould PN9080			UTX 1.2			2.5
Sperry 7000/40 aka CCI 6/32	4.2 BSD			1.9
VAX 8600			4.3 BSD			1.2
VAX 8600			Ultrix 1.2-1		1.1
IBM 3083			UTS SV			1.0 *b
Amdahl 470/V8			UTS/V (SV Rel 2,3)V1.1+ .98 *b
Cray X/MP-24			SysV (Pre release 8)    .38 *b


Notes:

*a 
This result was obtained with the original version of IOCALL, which crosses
the 512 byte buffer boundary, and this version of Unix has buffers of 512
bytes. This is believed to be the case with all Version 7 and SIII derived
OS's. It results in 1001 writes being done, which uses significantly more cpu
time and makes these results comparable only to others with the same problem.
See discussion above. 2.9 BSD????
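
The boundary crossing can be checked with a little arithmetic. The sketch
below assumes the original IOCALL differed only in lacking the first lseek,
so that every write after the first starts at offset 400, where the four
100 byte reads left the file pointer. blocks_touched is a hypothetical helper
written for this illustration; it is not part of IOCALL.

```c
/* Hypothetical helper, not part of IOCALL.
   Number of fixed-size buffers touched by a transfer of len bytes
   starting at byte offset `offset`, with buffers of bsize bytes. */
long blocks_touched(long offset, long len, long bsize)
{
    long first, last;

    if (len <= 0 || bsize <= 0 || offset < 0)
        return 0;
    first = offset / bsize;              /* buffer holding the first byte */
    last = (offset + len - 1) / bsize;   /* buffer holding the last byte */
    return last - first + 1;
}
```

With 512 byte buffers, blocks_touched(400L, 500L, 512L) is 2 (the write
covers bytes 400 through 899), while blocks_touched(0L, 500L, 512L) is 1,
which is the case the extra lseek restores.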

*b 
This result was obtained with a system which probably had other
programs running at the time the result was obtained. Submitter is
requested to rerun if possible when system is idle. This will improve
the result somewhat.

*c
Multi-cpu system. IOCALL was run single thread, which probably did not
utilize all cpu's. This system probably has considerably more power than
is reflected by the result.

*d
Result obtained with new version of IOCALL which has an extra lseek to
prevent crossing 512 byte buffer boundary on older versions of Unix.

*e
Real time reported because system time appeared to be unreasonable.