ssroy@phoenix.Princeton.EDU (Steve Scot Roy) (05/03/89)
We are considering buying a 4D/120SX with 4 processors as a compute server here, and I would like to know if anyone out there has more information about them.

How well do these things parallelize? There were some postings a while ago with one person claiming 75% (a factor of 3 speedup with 4 processors) and another claiming that his parallelizer had bugs. How code dependent is the speedup?

These things claim to be -executable- compatible with single processor machines; how is this done? If you use the parallelizing optimizer on it and then run it on a single processor machine, does it work? Can you tell each processor to run independently, so that each of many users sees a single processor?

Do all languages parallelize? Specifically, does C parallelize?

How much better is the 2XX series than the 1XX series? What is the difference? What speed do people -really- see on these beasts?

How much memory do these things need? Some postings said they were useless with 8Meg, another was complaining at 16Meg; how much is enough and how much does it cost?

In general, how stable and bug free are these things? Will I wonder every time a program isn't working whether it is my fault or its?

Thanks a lot in advance.

Steve Roy
ssr@acm.princeton.edu
bron@bronze.SGI.COM (Bron Campbell Nelson) (05/03/89)
In article <8089@phoenix.Princeton.EDU>, ssroy@phoenix.Princeton.EDU (Steve Scot Roy) writes:
>
> We are considering buying a 4D/120SX with 4 processors as a compute
> server here and I would like to know if anyone out there has more
> information about them.

Actually, I believe the marketing designation is 4D/140 for the 4cpu version (the 4D/120 has 2cpus).

> How code dependent is the speedup <for parallelism> ?

Totally code dependent. Parallelization is done at the Fortran DO loop level. Different iterations of a loop are executed in parallel on different processors. If your code spends most of its time executing inside such a loop (or loops), and the loop(s) can be parallelized (not all can be), you should see good speed up.

> These things claim to be -executable- compatible with single processor
> machines, how is this done? If you use the parallelizing optimizer on
> it and then run it on a single processor machine, does it work?

Yes, it does. In fact, you can compile the code on any of the 4D series of machines and run the same executable on any other. For example, do code development on a Personal Iris (4D/20), and run the result on the multi-processor.

As to how it works: When the program starts up, the initialization routines figure out how many processes you want. The default is to ask the o.s. how many cpus are on the machine, and use that. Alternatively, you can set a shell environment variable specifying the number. This number is remembered. When the parallel loop is encountered, the iterations are divided among the processes that are participating in the job. Division by 1 is perfectly ok. Each process does its piece, and then they all synchronize at the end.

This means that if you run the version compiled for a multi-processor on a single processor machine, it works, but runs a little bit slower than that same program compiled for a uni-processor (you incur the multi-processing overhead without benefit of an extra processor). However, that executable can now be transported and run on a multi-processor without change.

> Do all languages parallelize? Specifically, does C parallelize?

Right now, only Fortran has compiler support for parallelism. C (or any other language) can make use of the multi-processing library routines, but you have to do the parallelism "by hand".

> How much better is the 2XX series than the 1XX series? What is the
> difference?

The 2xx series uses the 25MHz chips, the 1xx uses 16MHz. The 2xx also has some significant memory interface changes, and a bigger 2nd level cache. For the number crunching codes I run, the 2xx cpus are pretty much uniformly twice as fast as the 1xx cpus (much more than the ratio of their clock speeds would make you think). Of course, your mileage may vary.

> How much memory do these things need? Some postings said they were
> useless with 8Meg, another was complaining at 16Meg; how much is enough

Depends on the applications you run of course, but I for one wouldn't put any less than 16meg on a 2cpu system, nor less than 24meg with 4cpus. (Of course, more is always better! :-)

> In general, how stable and bug free are these things? Will I wonder
> every time a program isn't working whether it is my fault or its?

I have never had the production hardware fail me. There is (was) the optimizer bug mentioned earlier, but amusingly enough, this was never a problem for multi-processed codes! (The mp optimizer had the bug fix.) Of course, I have a strong bias.

--
Bron Campbell Nelson
bron@sgi.com or possibly ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.
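
The startup and loop-division scheme described in the reply above (pick a process count, divide the loop iterations among the processes, synchronize at the end) can be illustrated with a rough sketch. This is not SGI's runtime or its Fortran directives. It is a minimal Python analogue under stated assumptions: the environment variable name `NUM_WORKERS`, the helper names, and the choice of contiguous chunks are all hypothetical, since the post does not say how iterations are assigned. Threads are used only to show the structure; in Python they will not necessarily speed up CPU-bound work.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(env_var="NUM_WORKERS"):
    """Use the environment variable if set, else ask the OS for the CPU count.

    NUM_WORKERS is a made-up name for illustration, not the real SGI variable.
    """
    value = os.environ.get(env_var)
    if value is not None:
        return max(1, int(value))
    return os.cpu_count() or 1


def split_iterations(n_iters, n_workers):
    """Divide range(n_iters) into n_workers contiguous half-open chunks.

    Contiguous chunking is an assumption; the post only says the
    iterations are "divided among the processes".
    """
    base, extra = divmod(n_iters, n_workers)
    chunks = []
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def parallel_loop(body, n_iters, n_workers):
    """Run body(i) for every i in range(n_iters), one chunk per worker."""
    def run_chunk(bounds):
        lo, hi = bounds
        for i in range(lo, hi):
            body(i)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list() waits for every chunk to finish: the synchronization point.
        list(pool.map(run_chunk, split_iterations(n_iters, n_workers)))


def demo(n_workers):
    """Double each element of a small array using n_workers workers."""
    a = list(range(10))
    b = [0] * 10

    def body(i):
        b[i] = 2 * a[i]  # iterations are independent of one another

    parallel_loop(body, len(a), n_workers)
    return b
```

Calling `demo(1)` and `demo(4)` yields the same list, which mirrors the point in the post: the same program works whether the iterations are divided by 1 or by 4, and only the running time differs. A loop whose iterations depend on each other (for example, `b[i] = b[i-1] + a[i]`) could not be split this way, which is one reason not all loops can be parallelized.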