dymm@b.cs.wvu.wvnet.edu (David Dymm) (10/25/88)
*** Question 1 ***
I have been timing the relative speeds of the functions "strlen" and
"strcpy" as executed on UNIX and VMS. Why is one system so much faster?
Is it just the machine MIPS that accounts for the differences?

*** Question 2 ***
On VMS, I am also concerned with the differences between the "system"
functions and my own functions.

VMS  - We are running VMS 4.7 with version 2.3 of the C compiler
       on a VAXstation 2000.
UNIX - We are running Sun 3.5 (UNIX BSD 4.2) on a 3/280 set up
       as a multi-user system with 14 users connected via a multiplexor.

For example, here are the times (in seconds) for strlen computing the
length of an 80-character string in a loop 200,000 times:

        UNIX           VMS
                  system    mine
         10         68       51

Here are the times (in seconds) for 200,000 string copies with strcpy,
with the copied string 50 characters in length:

        UNIX           VMS
                  system    mine
         6.5        57       29

Does anyone have a good explanation for the differences in times?
Perhaps the differences between UNIX and VMS are due to the speeds
of the machines. But why are my functions on VMS faster than the
system-supplied functions? I had assumed that the system functions
were written in assembler so that they are truly optimized.
Not true????

David Dymm                      Software Engineer

*************************************
*  Violence is the last refuge of   *
*  the incompetent.                 *
*                        Hardin     *
*************************************

USMAIL: Bell Atlantic Knowledge Systems,
        145 Fayette Street, Morgantown, WV 26505
PHONE: 304 291-9898 (8:30-4:30 EST)
USENET: {allegra,bellcore, cadre,idis,psuvax1}!pitt!wvucsb!dymm
INTERNET: dymm@b.cs.wvu.wvnet.edu
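The test program behind these numbers was not posted, so the following is only a minimal sketch of what such a timing loop might look like, assuming a hosted C compiler and the standard `clock()` function. The names `simple_strlen` and `time_strlen_loop` are hypothetical and do not come from the original post.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical hand-written strlen: scan bytes until the terminator. */
size_t simple_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/*
 * Call len_fn on s `iterations` times and return the elapsed CPU time
 * in seconds. The summed lengths are stored through `total` so that
 * the calls have a visible result.
 */
double time_strlen_loop(size_t (*len_fn)(const char *), const char *s,
                        long iterations, unsigned long *total)
{
    volatile unsigned long sum = 0;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sum += (unsigned long)len_fn(s);
    clock_t end = clock();
    *total = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_strlen_loop(strlen, buf, 200000L, &total)` and `time_strlen_loop(simple_strlen, buf, 200000L, &total)` on an 80-character string would mirror the "system" versus "mine" comparison in the table. A modern optimizing compiler may inline or hoist either call, so the results should be treated as indicative only.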
chris@mimsy.UUCP (Chris Torek) (10/25/88)
In article <82@h.cs.wvu.wvnet.edu> dymm@b.cs.wvu.wvnet.edu (David Dymm) asks
why strcpy and strlen on a Sun-3/280 are so much faster than the same
functions on a Vax running VMS 4.7, and why his own simple versions of
strcpy and strlen are faster than the ones in the C Runtime Library.

A Sun 3/280 is faster than a uVAX II in general; but in particular,
VMS is cleverly using the `locc' instruction to find the length of
the string. This is quite a bit faster than simply looking at each
character for a '\0', since the locc instruction is in microcode ...
except on the MicroVAX, where locc must be emulated by the kernel!
4.3BSD suffers from the same cleverness.

(The CVAX chip, used in the 3000 and 6000 series machines, does
locc in microcode again.)

Chris
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
debra@alice.UUCP (Paul De Bra) (10/25/88)
In article <82@h.cs.wvu.wvnet.edu> dymm@b.cs.wvu.wvnet.edu (David Dymm) writes:
-
-*** Question 1 ***
-I have been timing the relative differences between the speeds
-of the functions "strlen" and "strcpy" as executed on UNIX
-and VMS. Why is one system so much faster? Is it just the machine
-mips that accounts for the differences?
-
-*** Question 2 ***
-I am also concerned on VMS with the differences between the use of the
-"system" functions and my own function.
-
-
-VMS - We are running vms 4.7 with version 2.3 of the C compiler
-on a Vaxstation 2000.
-UNIX - We are running Sun 3.5 (UNIX BSD 4.2) on a 3/280 set up
-as a multi-user system with 14 users connected via a multiplexor.
-
-For example, the following times (seconds) for strlen computes the length
-of an 80 character string in a loop 200,000 times:
-        UNIX           VMS
-                  system    mine
-         10         68       51
-

I ran a similar program on my Microvax II with the ninth edition Unix.
Result: 66.5 sec.

-The following times (seconds) for strcpy does 200,000 string copies
-with the copied string 50 characters in length:
-        UNIX           VMS
-                  system    mine
-         6.5        57       29
-

The Microvax II did this in 51.5 seconds.

So VMS is not doing a really lousy job. The Microvax II (or VaxStation 2000)
is not nearly as fast as a Sun 3/280. But the difference is more than I can
explain from the raw cpu-speed. I did some more benchmarking with Suns and
Vaxen and the difference should not exceed a factor of 4 or 5.

So Sun must have done some optimization to the routines, or generates
efficient code for string operations. Your code for strcpy already indicates
that one can write faster routines than the standard libraries.
In fact, in BSD 4.3 a number of libraries have been rewritten to make them
more efficient.

Paul.
--
-------------------------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     | att!grumpy!debra |
-------------------------------------------------------------------------
iglesias@orion.cf.uci.edu (Mike Iglesias) (10/25/88)
A Sun 3/280 is probably about 2-3 times faster than a VS 2000. Also,
remember that not all the VAX instruction set is in hardware on the
uVAX II chip set, so some of the instructions the library (and you)
may be using are emulated in software.

Mike Iglesias
University of California, Irvine
guy@auspex.UUCP (Guy Harris) (10/26/88)
>I have been timing the relative differences between the speeds
>of the functions "strlen" and "strcpy" as executed on UNIX
>and VMS. Why is one system so much faster? Is it just the machine
>mips that accounts for the differences?

Partially, but probably not entirely.

>VMS - We are running vms 4.7 with version 2.3 of the C compiler
>on a Vaxstation 2000.
>UNIX - We are running Sun 3.5 (UNIX BSD 4.2) on a 3/280 set up
>as a multi-user system with 14 users connected via a multiplexor.

The 3/280 has a 25 MHz zero-wait-state (unless you get a cache miss....)
68020, which may well be faster than a VAXstation 2000 (is the 2000 about
equal to a 780? If so, the 3/280 has a much faster CPU - about 4x
faster...).

However, there is another issue. C strings are zero-terminated, not
counted. VAX string instructions don't, as I remember, know about zero
terminator bytes (unless you use "movtuc", but that adds extra overhead
for table lookup). The 4.3BSD (and, I think, S5) VAX version of "strcpy"
first does a "locc" to find the null terminator, and then a "movc3" to
copy the string. This requires two passes over the string.

The 68K, however, isn't "helped" by having those string instructions, so
the 68K assembler-language version searches for the zero terminator
while it copies, so it only makes one pass over the string. The loop
fits into the on-chip instruction cache on the 68020, so it runs fairly
fast.
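The two strategies described above can be sketched in portable C. This is an illustration only, not the actual 4.3BSD or Sun library source: `strlen` and `memcpy` stand in for the VAX "locc" and "movc3" instructions, and the function names `strcpy_two_pass` and `strcpy_one_pass` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Two passes: find the terminator first, then block-copy
 * (the shape of the locc + movc3 approach). */
char *strcpy_two_pass(char *dst, const char *src)
{
    size_t n = strlen(src);     /* pass 1: locate the '\0' */
    memcpy(dst, src, n + 1);    /* pass 2: copy n bytes plus the terminator */
    return dst;
}

/* One pass: copy each byte and test it for '\0' in the same loop
 * (the shape of the 68K approach). */
char *strcpy_one_pass(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}
```

Both functions produce the same result; the difference is only in how many times each source byte is touched. Which one is faster depends on the machine, as the thread shows: the two-pass form wins when the scan and block copy are fast hardware operations, and loses when they are emulated in software.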
bzs@encore.com (Barry Shein) (10/26/88)
Is it possible you're running on a Vax which doesn't have the hardware
instructions that the strlen (et al) you are using need, so it's trapping
into an emulator?

That is, you might have the wrong subroutine library on the VMS Vax.

Just a guess.

        -Barry Shein, ||Encore||