T S Lande <bassen@oslo-vax.ARPA> (11/30/84)
A user on our VAX-11/780 is running performance tests on different machines. The program is approx. 7000 lines of FORTRAN. The performance on the 780 is only 0.75 of that of a 750 running VMS. The comparison was done between compute-bound parts of the program. When it comes to I/O it's even worse.

I have heard that the f77 compiler is not too good. Is this the main reason, or is UNIX giving less throughput than VMS?

What about C? Would C-coded programs increase performance significantly?

Bassen
donn@utah-gr.UUCP (Donn Seeley) (11/30/84)
From T S Lande <bassen@oslo-vax.ARPA>:

> A user on our VAX-11/780 is running performance tests on
> different machines. The program is approx. 7000 lines of
> FORTRAN. The performance on the 780 is only 0.75 of that of
> a 750 running VMS. The comparison was done between
> compute-bound parts of the program. When it comes to I/O
> it's even worse.

I assume you're talking about the 4.2 BSD compiler since your machine prompts with '4.2 BSD UNIX (oslo-vax)' when I telnet to it... There are a few things you should know about f77 performance:

+ The distributed f77 compiler is a mess. I have installed more than 50 bug fixes in the compiler since I got it a year or so ago. Some of these fixes and some minor performance enhancements have been posted to the net.

+ The performance of f77 routines which don't use I/O or math functions is reasonable if the optimizer is enabled when you compile. Reasonable means within 10-20% of code compiled under VMS Fortran. Optimized f77 code can run twice as fast as unoptimized code, or faster.

+ I/O is slow. Formatted I/O is beastly, although that is partly due to the nature of the problem. Unformatted I/O will get a big boost when the new Berkeley C library appears, because the limiting factor currently is the speed of fread() and fwrite(). For large writes, the improvement is amazing: on the order of 10 times faster. The new fread() and fwrite() have essentially the same modifications as the System V r2 versions. Not much can be done to help until the next release, though...

+ The portable math library is slow. Any program that calls it loses horribly compared to the same program compiled with the very nice math library on VMS. The speed of math functions can be approximately doubled by using the 'native math library' (/usr/lib/libnm.a). Unfortunately both of these libraries do all of their computations in double precision, another major lose.
  I have spent some time recently hacking at f77 to use a new single-precision library sqrt() from Berkeley and have found that at least one benchmark runs 5.5 times faster when it uses the enhanced single-precision native sqrt() instead of the portable math library sqrt(). There is some hope that these hacks will find their way into f77 in the next Berkeley distribution.

I distribute f77 fixes via ftp to ARPAnet sites; contact me by mail if you want a set. I have been providing tapes to the occasional desperate person without ARPAnet access but as my boss reminds me on a regular basis, no one is paying me (or him) for the work I do on the f77 (or C!) compilers. This is also my excuse for the fact that the production of bug fixes and bug reports is irregular at best... If it's any reassurance, most of the work that I have done, together with work from numerous other people who have contributed to the current version of the compiler, will be incorporated in the next Berkeley release.

> What about C? Would C-coded programs increase performance
> significantly?

It depends on the nature of the program. Some programs will benefit vastly, others less so, and if you fail to optimize your C by hand then you can certainly do much worse than f77. f77 does the job of allocating register variables, moving invariant code out of loops, finding common subexpressions and so on automatically, but programmers must do these things to a C program themselves (with most current C compilers). C loses badly for applications which require single-precision floating point, a problem which the ANSI C committee is addressing. It's a difficult decision to make and it can depend a lot on issues other than raw compute speed, such as portability, programmer productivity and so on.

Hope this helps,

Donn Seeley    University of Utah CS Dept    donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W    (801) 581-5668    decvax!utah-cs!donn
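
As a rough illustration of the hand optimizations named in the post above (register variables, moving invariant code out of loops, sharing common subexpressions), here is a minimal sketch in C. The function names sum_naive and sum_tuned and the arithmetic in the loop are hypothetical, chosen only to have something to optimize; they do not come from the program under discussion. Note that `register` is only a hint, and a compiler is free to ignore it.

```c
#include <stddef.h>

/* Straightforward version (hypothetical example): the loop-invariant
   product a * b and the subexpression (a * b) * x[i] are each written
   out twice inside the loop. */
double sum_naive(const double *x, size_t n, double a, double b)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += (a * b) * x[i] + (a * b) * x[i] * x[i];
    return sum;
}

/* Hand-optimized version of the same computation. */
double sum_tuned(const double *x, size_t n, double a, double b)
{
    register double sum = 0.0;       /* ask for a register (a hint only) */
    register const double *p;        /* pointer walk instead of indexing */
    const double *end = x + n;
    const double ab = a * b;         /* invariant moved out of the loop */

    for (p = x; p < end; p++) {
        const double xi = *p;        /* load the element once */
        const double t = ab * xi;    /* common subexpression computed once */
        sum += t + t * xi;
    }
    return sum;
}
```

Both functions perform the same floating-point operations in the same order, so they should agree on their results; whether the second is actually faster depends on how much of this work the compiler already does by itself.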
mikem@uwstat.UUCP (11/30/84)
> A user on our VAX-11/780 is running performance tests on
> different machines. The program is approx. 7000 lines of
> FORTRAN. The performance on the 780 is only 0.75 of that of
> a 750 running VMS. The comparison was done between
> compute-bound parts of the program. When it comes to I/O
> it's even worse.

I don't know about the I/O, but compute times could easily be like this if one machine had a floating-point unit and the other did not. My limited timings indicate that with FPA units, a 750 and a 780 are about equivalent on floating-point-intensive code.

-- Mike Meyer --
Phone (608) 262-1157
EASY ARPA: mikem@statistics
CORRECT ARPA: mikem@wisc-stat.arpa
UUCP: ...!{allegra,ihnp4,seismo,ucbvax,pyr_chi,heurikon,uwm-evax}!uwvax!uwstat!mikem
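
For anyone wanting to repeat this kind of floating-point timing on two machines, a minimal sketch in C follows. The workload (a harmonic sum) and the names fp_workload and time_workload are assumptions made for illustration; they are not the timings or the code referred to in the posts above. It uses only the standard clock() call, which reports processor time rather than wall-clock time.

```c
#include <time.h>

/* Hypothetical floating-point workload: the harmonic sum 1 + 1/2 + ... + 1/n.
   Each iteration costs one conversion, one divide and one add. */
double fp_workload(long n)
{
    double acc = 0.0;
    long i;

    for (i = 1; i <= n; i++)
        acc += 1.0 / (double)i;
    return acc;
}

/* Run the workload once and return the CPU seconds it took.
   The computed sum is stored through 'result' so the work has a
   visible effect and is not trivially discarded. */
double time_workload(long n, double *result)
{
    clock_t start, stop;

    start = clock();
    *result = fp_workload(n);
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_workload with the same n on each machine, several times over, gives a crude comparison. The resolution of clock() is coarse, so n needs to be large enough for the run to take at least a few seconds.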