stubbs@ncr-sd.UUCP (Jan Stubbs) (10/21/86)
IOCALL, A UNIX SYSTEM PERFORMANCE BENCHMARK
Results as of 10/17/86.

This version of the benchmark does 10,000 iterations instead of 1000. Unix machines are getting so fast that 1000 was too quick to measure accurately on some CPUs. Just divide your results from this version by 10 to compare them to my list. Next time I will report times normalized to this version, i.e., they will all be 10 times bigger.

Send your results to me directly. The benchmark is a "C" program which measures Unix kernel performance. Run it as:

        time iocall

Send all 3 times (user, system, real). I am reporting the system time only.

"The opinions expressed herein are those of the author. Your mileage may vary".

The benchmark should be run on an otherwise idle machine. If you can, please run it that way; it does improve the timings.

COMMENTARY:

I have had some criticism of this benchmark, to wit: it unfairly penalizes machines which do not have CPU data caching on the Unix buffer pool.

My response: It does penalize such machines, because it heavily emphasizes the function of copying data from the buffer pool and back again. Whether this is unfair depends on whether references to such data are sufficiently heavy in your application that caching is a good idea. I bet that it is, if you have a heavily interactive, multiuser, and IO intensive environment. In particular, things like the superblock and your home directory are probably referenced a bunch.

Also, in reference to systems which cache other stuff, but not the buffer pool, keep in mind that there are (in my opinion) two reasons for a CPU data cache: one is to improve memory read latency on cache hits, the other is to improve memory bandwidth by optimizing for sequential memory access on a cache line fill after a cache miss. Thus even if your cache is so small that you have a poor hit rate, you may win on sequential memory access when, for example, you are moving data from a Unix buffer pool into the user program area on a read system call.
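As a worked illustration of the divide-by-10 rule stated at the top of this posting, here is a minimal sketch of the arithmetic in C. The function names and the CALLS_PER_PASS constant are hypothetical helpers, not part of IOCALL. The count of 7 system calls per pass (2 lseek, 1 write, 4 read) is read off the posted loop, and the per-call figure is only a rough average that ignores loop overhead and the four setup calls.

```c
#include <assert.h>

/* Hypothetical helpers; none of these names appear in IOCALL itself. */

/* System calls per pass of the posted loop: 2 lseek + 1 write + 4 read. */
#define CALLS_PER_PASS 7

/* The posted loop bound is inclusive (i <= n), so a nominal count of n
   iterations actually makes n + 1 passes. */
long iocall_passes(long iterations)
{
    assert(iterations >= 0);
    return iterations + 1;
}

/* Convert a system time measured with the 10,000-iteration version to
   the scale of the 1000-iteration results list: divide by 10. */
double to_list_scale(double seconds_10k)
{
    return seconds_10k / 10.0;
}

/* Rough average cost of one system call, in microseconds. Assumes the
   run used the loop body as posted; setup calls and loop overhead are
   ignored. */
double usec_per_call(double system_seconds, long iterations)
{
    long calls = iocall_passes(iterations) * CALLS_PER_PASS;
    return system_seconds * 1e6 / (double)calls;
}
```

For example, a machine that reports 53.0 seconds of system time on this version would be listed at about 5.3, and usec_per_call(53.0, 10000) gives its approximate cost per system call.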
Also, many synthetic benchmarks are criticized for giving unrealistic results when run through optimizers that may throw out stuff that does nothing useful. This is NOT a problem with IOCALL. If your compiler finds something in the UNIX kernel that does nothing useful and throws it out, MORE POWER TO IT!

-------cut----cut------cut-------------------------------
/* This benchmark tests speed of Unix system call interface
   and speed of cpu doing common Unix io system calls. */

char buf[512];
int fd,count,i,j;

main()
{
        fd = creat("/tmp/testfile",0777);
        close(fd);
        fd = open("/tmp/testfile",2);
        unlink("/tmp/testfile");
        for (i=0;i<=10000;i++) {
                lseek(fd,0L,0);         /* add this line! */
                count = write(fd,buf,500);
                lseek(fd,0L,0);         /* second argument must be long */
                for (j=0;j<=3;j++)
                        count = read(fd,buf,100);
        }
}
-----cut---cut---cut---cut-----------------------------------------

"There are lies, damn lies, and benchmarks."

Jan Stubbs    ....sdcsvax!ncr-sd!stubbs
619 485-3052
NCR Corporation
Advanced Development
16550 W. Bernardo Drive MS4010
San Diego, CA. 92127

                          IOCALL RESULTS

SYSTEM                           UNIX VERSION             SYSTEM TIME SECONDS
-----------                      ----------------         -------------------
Dec Pro-380                      2.9 BSD                  18.4
MicroVax I                       Ultrix V1.1              18.0
DEC Rainbow100 w/NECV20          Venix/86                 14.8   *d
Onyx C8002s Z8000                SIII                     13.7   *a
Onyx C8002 Z8000                 v7                       13.0
TIL NS32016 9MHz No Wait states  Local Port               12.2
PC/AT w/Sritek M68000            SV/68 Rel2 V1.0          11.36
Tandy 6000 8Mhz M68000           Xenix 3.0                10.9
ATT 3b2/300                      SV                       10.3
VAX 11/750                       4.2 BSD                  10.0
PLexus P35 12.5 MHz M68000       SIII                     9.8
ICM-3216 10 MHz NS 32016         SV.2                     9.8
PDP 11/44                        ISR 2.9 BSD              9.5
Motorola S2000 10MHz M68000      SV/68 Rel1               9.42
Concurrent XF/200 (PE7350A)      ?                        9.3
VAX 11/750                       4.3 BSD                  9.0
Sun-2 10MHz 68010                4.2 BSD Rel 2.0          9.0
VAX 11/750                       SV.2                     8.8
NCR Tower/XP                     Tower 3.0 (SV.2)         8.8
Sun-2 10MHz 68010                4.2 BSD Rel 3.0          8.7
Plexus P60 M68000                SIII                     8.7
ATT 3b2/400                      SV.2                     8.3
VAX 11/750                       research version 8       8.1
VAX 11/750                       4.1 BSD                  7.2
Radio Shack 16A                  Xenix (v7)               7.2
Sperry IT 8MHz 80286             Xenix 5.0                7.1
VAX 11/750                       4.1BSD (lightly hacked)  6.97
PC/AT                            Venix 5.2                6.8
Arete 1100 M680?0                SV.2                     6.5    *c
ATT7300 Unix PC 10MHz 68010      SV.2                     6.4
IBM PC/RT 170MHz                 4.2BSD                   6.4
Concurrent 3230                  Xelos Rel R01 (SV)       6.4    *b
Gould PN6080                     UTX 1.1C                 6.2    *b
Pyramid 90x w/cache              OSx2.5                   6.1
Apollo DN300 10 MHz M68000       Domain/IX                6.0    *e
IBM PC/RT 170MHz                 ?                        6.0
Pyramid 90x w/cache              OSx2.3                   5.8
Plessey Mantra 12.5Mhz 68000     Uniplus SV Release 0     5.5
VAX 11/780                       4.2 BSD                  5.3
Concurrent 3250XP                Xelos Rel R01 (SV)       5.2    *b
MicroVax II                      Ultrix 1.1               5.2
HP9000-550 3cpu's                HP-UX 5.01               5.1    *c
PC/AT 7.5 Mhz                    Venix286 SV.2            5.1
VAX 11/780                       SV.2                     5.0
Convex C-1                       4.2 BSD                  4.6
IBM 4341 II                      UTS 2.4(V7 on VM)        4.23   *b
VAX 11/785                       4.3 BSD                  3.6
Sun-3/75 16.67Mhz 68020          4.2 BSD                  3.6
Sun-3/160M-4 16.67Mhz 68020      4.2 BSD Rel 3.0 Alpha    3.6
Apollo Dn330 12Mhz M68020        Domain/IX                3.0    *e
VAX 11/785                       SV.2                     3.0
Celerity 1230 Accel/32 (NCR/32)  4.2 BSD                  3.0    *b, *c
Gould Concept/97                 UTX/32 1.2 (4.3BSD/SV2)  2.78
GEC 63/40                        S 5.1                    2.7
Gould PN9080                     UTX 1.2 (4.3BSD)         2.5
Sperry 7000/40 aka CCI 6/32      4.2 BSD                  1.9
VAX 8600                         4.3 BSD                  1.2
VAX 8600                         Ultrix 1.2-1             1.1
IBM 3083                         UTS SV                   1.0    *b
Amdahl 470/V8                    UTS/V (SV Rel 2,3)V1.1+  .98    *b
Cray X/MP-24                     SysV (Pre release 8)     .38    *b

Notes:

*b  This result was obtained with a system which probably had other programs running at the time the result was obtained. Submitter is requested to rerun if possible when system is idle. This will improve the result somewhat.

*c  Multi-cpu system. IOCALL was run single thread, which probably did not utilize all cpu's. This system probably has considerably more power than is reflected by the result.

*e  Real time reported because system time appeared to be unreasonable.
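
For readers whose compilers no longer accept the pre-ANSI listing above, the following is a rough sketch of the same loop with headers and prototypes added. It is not the benchmark used for the results list. The names iocall_run and system_seconds are hypothetical, the file path and iteration count are parameters here rather than fixed values, the return-value checks are additions, and getrusage is assumed to be available as a way to read system time from inside the process instead of using time(1).

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

static char buf[512];

/* Hypothetical wrapper around the posted loop. Returns the total number
   of bytes read, or -1 on error. */
long iocall_run(const char *path, long iterations)
{
    long i, total = 0;
    int j, fd;

    assert(path != 0 && iterations >= 0);

    /* Same setup sequence as the listing: creat, close, open, unlink. */
    fd = creat(path, 0777);
    if (fd < 0)
        return -1;
    close(fd);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file persists until close */

    /* Inclusive bound kept from the listing: iterations + 1 passes. */
    for (i = 0; i <= iterations; i++) {
        lseek(fd, 0L, SEEK_SET);
        if (write(fd, buf, 500) != 500) {
            close(fd);
            return -1;
        }
        lseek(fd, 0L, SEEK_SET);
        for (j = 0; j <= 3; j++) {      /* four reads of 100 bytes */
            ssize_t n = read(fd, buf, 100);
            if (n < 0) {
                close(fd);
                return -1;
            }
            total += n;
        }
    }
    close(fd);
    return total;
}

/* System CPU time used so far by this process, in seconds, or -1.0 if
   getrusage fails. */
double system_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
}
```

Calling system_seconds() before and after iocall_run("/tmp/testfile", 10000) and subtracting should give a figure comparable to the system time that time(1) reports, though the added checks make this a slightly different program from the one measured in the table.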