[comp.arch] Scalability?

brooks@maddog.llnl.gov (Eugene Brooks) (05/02/90)

In article <49622@lanl.gov> ryg@lanl.gov (Richard S Grandy) writes:
>I've got a glossy from SGI that shows the POWER center performance on 
>LINPACK (100x100, coded) as:
>		1 CPU	3.8 DP MFLOPS
>		4 CPU	16  DP MFLOPS
>		8 CPU	28  DP MFLOPS
>Does this mean that with 4 cpus you really get GREATER than a linear speedup?? 
I have documented superlinear speedups on linear system solvers for
machines with coherent caches hooked to a bus.  The effect can occur
when the problem size is larger than the cache on one processor, but small
enough to allow distribution of the data set in several caches without
cache spilling.  The data set in this case is 80K bytes.  As I recall, the
POWER series uses a 64K first level cache which is write-through backed up
with a 256K copy-back cache hooked to the bus.  One would expect that a
superlinear effect would be possible given the size of the first level caches.
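
The arithmetic behind the question and the cache argument can be checked with
a short sketch.  The MFLOPS figures are the ones quoted from the glossy; the
64K first level cache size is the recalled figure above, not a verified spec.
The function names are made up for illustration, and splitting the matrix
evenly across CPUs is a simplifying assumption, not a model of how LINPACK
actually touches memory.

```python
# LINPACK 100x100 figures quoted from the SGI glossy (DP MFLOPS).
mflops = {1: 3.8, 4: 16.0, 8: 28.0}

# A 100x100 matrix of 8-byte doubles: the "80K bytes" data set.
MATRIX_BYTES = 100 * 100 * 8
# First level cache size as recalled in the post (assumption).
L1_BYTES = 64 * 1024


def speedup(cpus):
    """Speedup relative to the single-CPU rate."""
    return mflops[cpus] / mflops[1]


def efficiency(cpus):
    """Speedup per CPU; a value above 1.0 means superlinear."""
    return speedup(cpus) / cpus


def share_fits_in_l1(cpus):
    """True if an even 1/cpus share of the matrix fits in one L1 cache."""
    return MATRIX_BYTES / cpus <= L1_BYTES


for cpus in sorted(mflops):
    print(cpus, round(speedup(cpus), 2), round(efficiency(cpus), 2),
          share_fits_in_l1(cpus))
```

With these numbers the 4 CPU case comes out slightly above linear and the
8 CPU case below it, and the whole matrix does not fit in one 64K cache while
a quarter of it does.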


brooks@maddog.llnl.gov, brooks@maddog.uucp

rminnich@super.ORG (Ronald G Minnich) (05/02/90)

In article <1990May1.154558.24009@cs.rochester.edu> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>A common trap in measuring parallel speedup is to use a "loaded" CPU.  For
>instance, the single processor might have been serving interrupts, clock
>icons, etc.  The result is that the performance for the single processor case
>is artificially low.  After taking care, the numbers usually turn out slightly
>less than linear.  
This is true, but it is also true that you can get superlinear speedup 
if the problem was so big that the single-processor machine just sat and
thrashed its guts out. I have seen this with my Mether DSM, where 
splitting the problem up decreased the working set enough that a 
two- or four-processor system could actually run, where a one-processor
version would sit and exercise disk arms. Kai Li also found a similar 
situation with Ivy. 
   To put it another way, if the runtime is infinity for one processor, 
and you can just about run it on two, then you get superlinear speedup :-)
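
A toy model of the thrashing case, as a sketch only.  Every number here is
made up for illustration (none of it is measured on Mether or Ivy), and the
single `thrash_factor` slowdown is a crude stand-in for real paging behavior.

```python
def run_time(work, working_set, memory_per_node, nodes, thrash_factor=1000.0):
    """Hypothetical run time when work and working set split evenly.

    If a node's share of the working set exceeds its memory, the node is
    assumed to thrash and its compute time is multiplied by thrash_factor.
    """
    per_node_work = work / nodes
    per_node_set = working_set / nodes
    if per_node_set > memory_per_node:
        return per_node_work * thrash_factor  # paging dominates
    return per_node_work


# Illustrative values: working set of 16 units, 10 units of memory per node.
t1 = run_time(work=100.0, working_set=16.0, memory_per_node=10.0, nodes=1)
t2 = run_time(work=100.0, working_set=16.0, memory_per_node=10.0, nodes=2)
print("speedup on two nodes:", t1 / t2)
```

In this model the one-node run thrashes and the two-node run does not, so the
ratio is far above 2; the size of the ratio depends entirely on the assumed
thrash factor.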
ron
P.S. BTW, if anyone on this list is going to ICDCS, and you are doing 
     DSM implementations too, I would enjoy meeting you at ICDCS! 
     Send mail .... rminnich@super.org
-- 
rminnich@super.org