tim@VAX1.CC.UAKRON.EDU (Timothy H Smith) (04/25/89)
We have just obtained a 4D/240 and are using it for scalar floating point operations. We do not need a vector machine. We were told the floating point speed was about 12 mflops. Well, it turns out to be about 4 mflops.

When the Fortran compiler runs at any optimization level, the numbers it generates are all bad. Turn off the optimization and all is fine.

Also, when the machine is running some reasonable jobs, the interactive response goes to nothing. I mean 5 minutes for an ls. I know the machine needs more memory, but I don't expect this.

What is the state of the SGI compilers? Not so good as far as I can tell. Try to run Power Fortran and none of the results are good. Power Fortran was bought to utilize all 4 processors. It would be nice if it worked. Can anyone from SGI comment on this?

Our machine is a 4D/240S with 16 meg of memory.

thanks,
tim@vax1.cc.uakron.edu
*** my comments are mine and do not reflect my organization ***
bron@bronze.SGI.COM (Bron Campbell Nelson) (04/28/89)
In article <139@VAX1.CC.UAKRON.EDU>, tim@VAX1.CC.UAKRON.EDU (Timothy H Smith) writes:
> We have just obtained a 4D/240 and are using it for scalar
> floating point operations. We do not need a vector machine.
> We were told the floating point speed was about 12 mflops.
> Well it turns out to be about 4 mflops.

The double-precision LINPACK benchmark number for 1 processor is 4 mflops. The salesperson was probably talking about collective speed for the machine as a whole (although why they didn't therefore claim it was a 16 mflop machine I don't know). I have run some benchmark jobs that see a 3x speedup when running on 4 CPUs (using the automatic parallelizing FORTRAN tools). Such a job can indeed run at 12 mflops. In fact, I get over 20 mflops on one of the LLNL kernels (disclaimer: parallel speedup is *very* application dependent; your mileage may vary).

> When the fortran compiler runs at any optimization level the numbers
> it generates are all bad. Turn off the optimization and all is fine.

There is a known bug in the optimizer. If you have the Power Fortran option, there is something you can try: move /usr/lib/uopt to /usr/lib/uopt.orig, and link /usr/lib/uopt_mp to /usr/lib/uopt. This will cause the Multi-Processing optimizer to be used in place of the normal optimizer (uopt_mp also works on normal scalar code). It should do all the same optimizations as the normal optimizer, and in addition it has the bug we know about fixed.

We do not (yet) ship this as the standard optimizer because, at the time of the software release, we had not had enough time to be sure the bug fix wouldn't break something else in one of the other languages (C, Pascal, PL/I, etc.). Fortran codes should be fine. In fact, I believe that no problems have been uncovered in any other language up to this point, so the other languages should be fine too.

> ... Try to run power fortran and none of the results are good.
> Power fortran was bought to utilize all 4 processors.. It would
> be nice if it worked. Can anyone from SGI comment on this.

I've been running Power Fortran for a long time and I get quite good results; perhaps I can help. If you have a specific question or problem, send me email and we'll try to resolve it.

--
Bron Campbell Nelson
bron@sgi.com or possibly ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.
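
The arithmetic behind the 4 versus 12 versus 16 mflops figures in the post above can be sketched as follows. This is only an illustration, assuming the numbers quoted there (4 mflops per processor on double-precision LINPACK, a 3x speedup on 4 CPUs); the function names are hypothetical and not part of any SGI tool.

```python
# Illustrative arithmetic only; the inputs are the figures quoted in the post.

def aggregate_mflops(per_cpu_mflops, speedup):
    """Rate of a parallel job, given the single-CPU rate and the observed speedup."""
    return per_cpu_mflops * speedup

def parallel_efficiency(speedup, n_cpus):
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    return speedup / n_cpus

per_cpu = 4.0   # mflops on one processor (double-precision LINPACK, per the post)
speedup = 3.0   # observed speedup on 4 CPUs (per the post)
n_cpus = 4

print(aggregate_mflops(per_cpu, speedup))     # 12.0, the figure the buyer was told
print(parallel_efficiency(speedup, n_cpus))   # 0.75
print(aggregate_mflops(per_cpu, n_cpus))      # 16.0, the ideal case with linear speedup
```

A purely scalar job that is not parallelized stays at the single-processor rate, which matches the roughly 4 mflops reported in the original question.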
thant@horus.SGI.COM (Thant Tessman) (04/28/89)
In article <139@VAX1.CC.UAKRON.EDU>, tim@VAX1.CC.UAKRON.EDU (Timothy H Smith) writes:
> We have just obtained a 4D/240 and are using it for scalar
> floating point operations. We do not need a vector machine.
[stuff deleted]
> Also when the machine is running some reasonable jobs the interactive
> response goes to nothing. I mean 5 minutes for a ls.
> I know the machine needs more memory, but I don't expect this.
[stuff deleted]
>
> Our machine is a 4D/240S with 16meg of memory.
>

I once worked with a 4D/120 with 8 meg. It was next to useless. I don't think they should sell them like that (with that little memory). You have twice as much memory, but you also have twice as many processors. Run gr_osview to see if it is spending all its time swapping.

The compilers are from MIPS and are generally considered excellent. If you really are getting different answers with optimised versus non-optimised code, you should report it to the hotline as a bug. (Narrow it down and post it to the net?)

> > thanks,
> > tim@vax1.cc.uakron.edu
> > *** my comments are mine and do not reflect my organization ***

ditto

thant@sgi.com