reid@Glacier.ARPA (Brian Reid) (05/08/85)
I've been watching people make claims about the wonderfulness of multiprocessors for 20 years. I have also watched a much smaller number of people actually build and measure the performance of multiprocessors. This newsgroup has recently been alive with various informed and uninformed metaphysical babbling about multiprocessors. I don't believe most of it.

I have never seen a single particle of evidence, not one number, that says that a tightly-coupled (e.g. shared-memory) multiprocessor is in any way better than a uniprocessor of the equivalent aggregate speed. If you know how to build a 100-MIP uniprocessor CPU, or 10 10-MIP processors for the same instruction set, or 100 1-MIP processors, then it is always much better to have the uniprocessor. It might be cheaper to build the multiprocessor, but the uniprocessor is a better computer.

For loosely-coupled architectures there are sometimes arguments about reliability through redundancy, though they tend not to hold water in practice because of peripherals. But for a shared-memory machine, the only reason to build a multiprocessor instead of a uniprocessor is to make it cheaper. Otherwise, the uniprocessor is easier to program, faster (no synchronization cost), has a higher burst speed, and can perform any parallel computation that the multiprocessor can perform.

Of course it is not always possible to build a uniprocessor as fast as one would like, so multiprocessors and vector machines have always been at the leading edge of the speed wars, but this is not because they are better computers but because people know how to build them.

I am always interested in seeing hard data about computer architecture (or anything else, for that matter). I invite any of the proponents of radical architecture multiprocessors to show me numbers demonstrating their superiority over uniprocessors.
--
Brian Reid    decwrl!glacier!reid
Stanford      reid@SU-Glacier.ARPA
pop@mtu.UUCP (Dave Poplawski) (05/09/85)
> I have never seen a single particle of evidence, not one number, that says
> that a tightly-coupled (e.g. shared-memory) multiprocessor is in any way
> better than a uniprocessor of the equivalent aggregate speed. If you know
> how to build a 100-MIP uniprocessor CPU, or 10 10-MIP processors for the
> same instruction set, or 100 1-MIP processors, then it is always much better
> to have the uniprocessor. It might be cheaper to build the multiprocessor,
> but the uniprocessor is a better computer.

I don't think you will get any arguments about this - anybody would rather have a 10,000 MFLOPS uniprocessor than a 10,000 MFLOPS multiprocessor (shared memory, hypercube, mesh or whatever) - I would if the price were comparable or even if the uniprocessor were more expensive. Avoiding the reprogramming effort to express the parallelism in sequential programs in sequential languages would probably make up the difference in cost. Even for new programs, most of us still find it easier to write a sequential program than a parallel (especially massively parallel) program, and in many cases there isn't that much parallelism in the problem in the first place.

> Of course it is not always possible to build a uniprocessor as fast as one
> would like, so multiprocessors and vector machines have always been at the
> leading edge of the speed wars, but this is not because they are better
> computers but because people know how to build them.

Exactly - you answered your own question! As long as the fastest uniprocessor available is a couple of orders of magnitude slower than available multiprocessors, and there are people who want (need) to solve problems that are adaptable to the multiprocessor, then the multiprocessor will be better (for those people and those applications). There is no religion here, just technology. As soon as you can build a uniprocessor that is as fast as any multiprocessor, the multiprocessors will go away.
--
Dave Poplawski
Michigan Technological University
uucp: {lanl, ihnp4, glacier}!mtu!pop
arpa/csnet: pop%mtu@csnet-relay
chuck@dartvax.UUCP (Chuck Simmons) (05/10/85)
> ...But for a shared-memory machine, the only
> reason to build a multiprocessor instead of a uniprocessor is to make it
> cheaper....
>
> Of course it is not always possible to build a uniprocessor as fast as one
> would like...
>
> Brian Reid

Seems like that makes 2 very good reasons to build hypercubes and friends.

Chuck
nather@utastro.UUCP (Ed Nather) (05/11/85)
> As soon as you can build a uniprocessor
> that is as fast as any multiprocessor, the multiprocessors will go away.
> --
> Dave Poplawski

This makes no sense to me. If you can build a really fast uniprocessor, why can't you run a bunch in parallel and get more throughput? Are you suggesting there may be a computer so fast no problem can keep it busy? My friends, the astrophysicists, don't believe that for a minute.
--
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
sambo@ukma.UUCP (Inventor of micro-S) (05/13/85)
In article <7202@Glacier.ARPA>, reid@Glacier.ARPA (Brian Reid) writes:
> I invite any of the proponents of radical
> architecture multiprocessors to show me numbers demonstrating their
> superiority over uniprocessors.

Perhaps you ought also to be interested in those that are equal to uniprocessors in performance, etc., but cost less.
jans@mako.UUCP (Jan Steinman) (05/13/85)
In article <7202@Glacier.ARPA> reid@Glacier.ARPA (Brian Reid) writes:
>For loosely-coupled architectures there are sometimes arguments about
>reliability through redundancy, though they tend not to hold water in
>practice because of peripherals.

There are companies who are making piles of money doing exactly this. Most notable is Tandem. (I'm a former employee.) The peripherals in a Tandem (and I assume, their recent competitors) follow the scheme, with performance benefits in the case of disks. Except for some magic concerning defect mapping, the mirrored disks have identical images. Writes are performed in parallel, but the task of reading is given to whichever disk is currently positioned closest to the desired data, effectively cutting average read access in half.

Some of Tandem's recent upstart competition have tried to spread this philosophy to other peripherals, but it was determined that the difficulty of reading the alternating-line listings produced by parallel printers offset the speed advantage. (I can't believe I wrote that! :-)
--
:::::: Jan Steinman		Box 1000, MS 61-161	(w)503/685-2843 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::
doug@terak.UUCP (Doug Pardee) (05/13/85)
[I consider the word "always" as a personal challenge...]

> If you know
> how to build a 100-MIP uniprocessor CPU, or 10 10-MIP processors for the
> same instruction set, or 100 1-MIP processors, then it is always much better
> to have the uniprocessor.

It's not what everyone else is discussing, but there is *too* an application where 10 10MIPS CPUs (MIMD) will beat 1 100MIPS CPU. That's where there are 10 totally independent jobs to be done. For example, multi-user operating systems. A single CPU would have to deal with the overhead of context switching.

Which leaves me kinda confused... I gather from the preceding discussion that the Cray is a SIMD (vector) machine, and does quite nicely on achieving high performance working on a single job. So why then would anyone want to bog it down with a multi-user operating system? Wouldn't it make more sense to build a multi-micro system to run the operating system and for program development (one CPU for each user), thereby freeing up the vector CPU to actually *run* jobs (one after the other)?
--
Doug Pardee -- Terak Corp. -- !{ihnp4,seismo,decvax}!noao!terak!doug
               ^^^^^--- soon to be CalComp
pop@mtu.UUCP (Dave Poplawski) (05/14/85)
> > As soon as you can build a uniprocessor
> > that is as fast as any multiprocessor, the multiprocessors will go away.
> > --
> > Dave Poplawski
>
> This makes no sense to me. If you can build a really fast uniprocessor,
> why can't you run a bunch in parallel and get more throughput? Are you
> suggesting there may be a computer so fast no problem can keep it busy?
> My friends, the astrophysicists, don't believe that for a minute.
>
> --
> Ed Nather

The statement was made in a whimsical voice (couldn't you hear it?). I don't think that anybody will ever do it, probably because it is impossible for the reason you stated. However, don't count out very fast uniprocessors - I wouldn't want to try to get a 100-fold speedup on something like troff by putting it on a 100 (or even 200, or 300, or ...) CPU multiprocessor. Some problems just don't seem to be very amenable to parallel solution, at least not that parallel.

An interesting question is whether the throughput you mention is realized on a single problem, or several independent ones. On a single problem that must be broken into cooperating processes, it is possible that a multiprocessor would be slower than the uniprocessor because of contention, communication costs, synchronization overhead and delay, etc. It all depends on the problem, the algorithm, the program, the multiprocessor, ...
--
Dave Poplawski
Michigan Technological University
uucp: {lanl, ihnp4, glacier}!mtu!pop
arpa/csnet: pop%mtu@csnet-relay
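One way to put rough numbers on the troff remark is Amdahl's law. The post does not name it, but it is the standard bound for this situation: if only a fraction p of a job can run in parallel, n processors give a speedup of at most 1/((1-p) + p/n). The sketch below is illustrative only. The function name and the sample fractions are assumptions, not measurements of troff or of any real machine, and the model ignores the contention and synchronization costs listed above, which can only lower the result.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of a job can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full length; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only for illustration.
    for p in (0.50, 0.90, 0.99):
        for n in (10, 100, 300):
            print(f"p={p:.2f}  n={n:3d}  speedup <= {amdahl_speedup(p, n):6.2f}")
```

Under this model, even a job that is 99% parallelizable gets only about a 50-fold speedup from 100 processors.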
wcs@ho95b.UUCP (Bill Stewart) (05/14/85)
Multiprocessing has a number of advantages over uniprocessing, which in many circumstances outweigh the disadvantages. The primary reason for multiprocessor machines is of course technology - it's a lot easier to combine 100 10-MFLOP processors than to build a 1-GFLOP processor (and if you build one, you can combine 100 of THEM.)

If the processing you really want to do is true uniprocessing, then probably the fast uniprocessor will win. But most computing is inherently multiprocessing - either there are multiple users, or the problem has a reasonable degree of parallel structure that can better be exploited on a multiprocessor (e.g. finite element calculations, network modelling, etc.) On the multiprocessor, each processor can potentially take its one job and grind away; the uniprocessor wastes a lot of overhead doing process switches, swapping and paging. On the other hand, the multiprocessor wastes power when there are idle processors, whereas the uniprocessor goes faster if it has fewer jobs to do.

The value of either approach really depends on the application environment and the tradeoffs you have to make; neither can be ruled out.

Bill Stewart
--
Bill Stewart 1-201-949-0705
AT&T Bell Labs, Room 4K-435, Holmdel NJ
{ihnp4,allegra,cbosgd,vax135}!ho95c!wcs
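The idle-processor tradeoff can be made concrete with a little arithmetic. The sketch below compares a 100-MIPS uniprocessor against ten 10-MIPS processors on k independent, equal-sized jobs. It assumes that a job cannot be split across processors and that context switching, paging and synchronization cost nothing, which is exactly the overhead this thread is arguing about. All names and numbers are illustrative assumptions.

```python
import math


def uniprocessor_time(jobs: int, work_mi: float, mips: float = 100.0) -> float:
    """Seconds to finish all jobs on one fast CPU (no switching cost modelled)."""
    return jobs * work_mi / mips


def multiprocessor_time(jobs: int, work_mi: float,
                        cpus: int = 10, mips_each: float = 10.0) -> float:
    """Seconds to finish all jobs when each job is pinned to one slow CPU."""
    rounds = math.ceil(jobs / cpus)  # jobs beyond `cpus` wait for a free CPU
    return rounds * work_mi / mips_each


if __name__ == "__main__":
    work = 1000.0  # millions of instructions per job (hypothetical)
    for k in (1, 5, 10, 15, 20):
        print(k, uniprocessor_time(k, work), multiprocessor_time(k, work))
```

Under these assumptions the two machines tie only when the job count is a multiple of ten; otherwise idle processors leave the multiprocessor behind. Charging the uniprocessor a per-switch cost would tip the tied cases toward the multiprocessor.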
henry@utzoo.UUCP (Henry Spencer) (05/15/85)
> >For loosely-coupled architectures there are sometimes arguments about
> >reliability through redundancy, though they tend not to hold water in
> >practice because of peripherals.
>
> There are companies who are making piles of money doing exactly this. Most
> notable is Tandem. (I'm a former employee.) The peripherals in a Tandem
> (and I assume, their recent competitors) follow the scheme, with performance
> benefits in the case of disks. Except for some magic concerning defect
> mapping, the mirrored disks have identical images...

Yes, but most multiprocessor systems do *not* duplicate all the peripherals. Replication of peripherals is unusual except on systems (like Tandem's) whose major goal in life is high reliability. I think Brian's point was that "reliability through redundancy" is not a significant advantage for multiprocessors unless peripherals are duplicated too, which they usually aren't. I know that C.mmp -- one of the multiprocessor systems that Brian has worked on -- was plagued by fast-but-unreliable disks and insufficient funding for full replication.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
jqj@cornell.UUCP (J Q Johnson) (05/16/85)
I guess I believe that, at least in principle, a multiprocessor could be preferable to an equivalent uniprocessor (= same aggregate throughput, same total cache, etc.). The argument is that both multiprocessor and uniprocessor suffer scheduler overhead, but that in some cases the overhead on a multiprocessor will be less. A trivial example is a uniprocessor versus an n-fold multiprocessor running n completely independent tasks; presumably the scheduler overhead in that case will be linear in the length of the task for the uniprocessor, but sublinear (perhaps even a constant) for the multiprocessor (the assumption here is that the scheduler only gets invoked when someone isn't currently running but wants to be).
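The scheduler-overhead argument can be sketched by counting scheduler invocations. The model below is an assumption made for illustration: the uniprocessor time-slices n equal tasks round-robin with a fixed quantum, while the n-way multiprocessor dispatches each task once and never preempts it. The quantum, the task length (CPU time on the machine in question) and the function names are all hypothetical.

```python
import math


def uniprocessor_dispatches(tasks: int, task_len: float, quantum: float) -> int:
    """Scheduler invocations when one CPU round-robins all tasks."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    slices_per_task = math.ceil(task_len / quantum)
    return tasks * slices_per_task  # grows linearly with task length


def multiprocessor_dispatches(tasks: int, cpus: int) -> int:
    """Scheduler invocations when every task has its own CPU and runs to completion."""
    if cpus < tasks:
        raise ValueError("this sketch only covers cpus >= tasks")
    return tasks  # one dispatch each, independent of task length


if __name__ == "__main__":
    n = 4
    for length in (100.0, 1000.0, 10000.0):  # hypothetical task lengths
        print(length,
              uniprocessor_dispatches(n, length, quantum=10.0),
              multiprocessor_dispatches(n, cpus=n))
```

In this model the uniprocessor count scales with task length while the multiprocessor count stays at n. It says nothing about cache or kernel coordination costs.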
mash@mips.UUCP (John Mashey) (05/17/85)
J. Q. Johnson (..!cornell!jqj) writes:
> I guess I believe that, at least in principle, a multiprocessor could
> be preferable to an equivalent uniprocessor (= same aggregate throughput,
> same total cache, etc.). The argument is that both multiprocessor and
> uniprocessor suffer scheduler overhead, but that in some cases the
> overhead on a multiprocessor will be less. A trivial example is a
> that scheduler only gets invoked when someone isn't currently running
> but wants to be).......

One can certainly find examples of this; in particular it is true if one has n independent tasks that 1) are very compute bound and 2) are small enough that paging traffic is minimal. Otherwise, what you find is that the OS has to pay the price not in scheduling overhead, but in other coordination overhead. For example, either you use snoopy caches [and chew up bandwidth and basic cycle time] or handle cache consistency by various software mechanisms. Next, you have to handle TLB consistency, and then you must interlock terminal I/O, disk cache I/O, etc. In every MP implementation I've seen, you always had to add code to interlock against rare events. Hence, as long as you stay in user programs, you can be OK, but as soon as you spend significant time in the kernel, you pay some coordination price. [Note: the above applies to general-use, not special-case systems.]

Complexity is like garbage. Hard work can keep the amount down, but won't make it go away. If you sweep it under the rug you'll be sorry. At best, you can at least choose a good location for the garbage dump.
--
-john mashey
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash
DDD: 415-960-1200
USPS: MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043