chuck@amdahl.amdahl.com (Charles Simmons) (06/10/87)
In my recent posting, I forgot to mention that I was exposed to video tapes of computer generated animation by Eugene Miya at NASA. Eugene also mentions that I didn't nearly cover the possible applications for high-powered computers running graphics programs. Some of the other applications he mentions are weather forecasting, medical imaging, CAD/CAM, computer enhancement of satellite photos, and I think there were one or two more that I'm still forgetting. Eugene also asks me to mention the upcoming SIGGRAPH meeting to be held on Thursday, June 18, 1987 at the Exploratorium in San Francisco. The speaker will be Alvy Ray Smith, the vice president of Pixar. He will be talking on the history of computer graphics. A recent posting (5 June 87) in comp.graphics gives more details. -- Chuck
cowburn@rocky2.UUCP (06/10/87)
In article <8270@amdahl.amdahl.com>, chuck@amdahl.amdahl.com (Charles Simmons) writes: > In my recent posting, I forgot to mention that I was exposed to video > tapes of computer generated animation by Eugene Miya at NASA. Eugene > also mentions that I didn't nearly cover the possible applications > for high-powered computers running graphics programs. Some of the > other applications he mentions are weather forecasting, medical imaging, > CAD/CAM, computer enhancement of satellite photos, and I think there > were one or two more that I'm still forgetting. > Perhaps another point to consider is that the need for graphics in the applications mentioned is still just the first part of the interpretative effort, which has yet to be reduced satisfactorily to machine activity. Oil companies like to look at the maps before oil wells are dug; the day when the Cray spits out "Drill Here" (and is accepted) is distant. The radiologist still likes to hold the film to the lightbox, and no doubt we would all be unhappy with some SUN in the operating room directing the surgeon where to make the first incision for the appendectomy. Yet these tasks are clearly reducible to machine activity by appropriate scene analysis and "AI"-style incorporation of preexisting expertise. There is, of course, some severe lag in incorporating even existing algorithms into satisfactory code for supercomputers.
oconnor@sunray.steinmetz (Dennis Oconnor) (06/12/87)
[ Are there really line eaters? ] Enuff on how graphics will chew MIPS. It will, but not like SIMULATION ! A revolution is occurring today in engineering, where more and more computers, chemicals, parts and systems are thoroughly SIMULATED before even a first prototype is built. On the horizon are even MORE complex simulations, like the effect of drugs on biological systems, and accurate faster-than-life atmospheric simulation. Even really simple simulations ( like 100k-transistor CPUs ) eat MIPS. Simulation will gobble up parallelism as well - what a bargain ! When it comes to producing new products in competitive arenas, simulation has been shown to be faster and cheaper than traditional methods. The more MIPS you have, the more complex a system people will be trying to simulate. So bring on the MIPS ! ( MFLOPS too, of course. ) -- Dennis O'Connor oconnor@sungoddess.steinmetz.UUCP ?? ARPA: OCONNORDM@ge-crd.com "Everything { used_to_have | has | will_have } a niche, even my opinions."
dzzr@beta.UUCP (06/15/87)
Does somebody REALLY think we are going to run out of work before we run out of MIPS? Fat chance. My group is developing large-scale discrete event simulations in KEE (Knowledge Engineering Environment - an AI shell) with lots of extra LISP code defining events. We are running on Symbolics 36xx machines. Unfortunately, we have run the technology into the ground several times in the last few years (run times growing out of bound). Luckily, every time we ended up against the wall, technology would spring to the rescue with some kind of hardware speed-up or another. One of our current projects is causing smoke to pour out of a poor overworked 3600: lots of events (10's of thousands in a run) PLUS lots of graphics. Just when things were beginning to look bleak (again) along will come relief Real Soon Now in the form of faster hardware. Symbolics announced a new product called the Ivory LISP-on-a-chip architecture with potential speed increases of 3-5X. Texas Instruments has announced a similar VLSI LISP-on-a-chip product. In another hardware arena the SUN4 RISC machine sounds promising. We already have plans for all of these new, faster machines! --Doug
wood@dg_rtp.UUCP (Tom Wood) (06/16/87)
The recent discussion on how to feed your poor starving CPU leaves me a bit in the dark. What architectural features help in dealing with these various applications (graphics, simulation, object oriented AI, CAD)? For sake of discussion, I propose we normalize the O/S to Unix. Are these applications implemented using multiple processes? If so, a multiprocessor could be of help. If not, it seems you'd have to try to build a "Cray": one fast monolithic processor. Personally, I believe the 90% solution to obtaining parallelism is to take advantage of multiple independent computations. (It's much easier to make 100 compiles go 100 times faster by using 100 machines than it is to make 1 machine go 100 times faster on each compile.) So how many fast CPUs can these applications benefit from? My guess is that most would have a hard time using more than 1, and that some may be able to use 2 or so. What would you do (short of rewriting the application) if you were given a machine with 10-15 fast CPUs to put on your desk? -- Tom Wood (919) 248-6067 Data General, Research Triangle Park, NC {the known world}!mcnc!rti-sel!dg_rtp!wood
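Wood's 100-compiles observation is process-level parallelism over fully independent jobs. A minimal sketch in modern terms (Python, purely illustrative; `compile_one` is a hypothetical stand-in for one independent computation, and threads suffice only because a real build driver would spawn compiler subprocesses):

```python
from concurrent.futures import ThreadPoolExecutor

def compile_one(job_id):
    # Hypothetical stand-in for one independent job; a real driver
    # would spawn a compiler subprocess here, which is why threads are
    # enough to dispatch the work in parallel.
    return sum(i * i for i in range(job_id * 100))

def run_serial(jobs):
    return [compile_one(j) for j in jobs]

def run_parallel(jobs, workers=8):
    # No job reads another's result, so no coordination is needed;
    # map preserves result order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compile_one, jobs))

jobs = list(range(1, 33))
assert run_serial(jobs) == run_parallel(jobs)
```

The point of the sketch is only the shape: independent work items, a pool of workers, and identical results in either order of execution.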
franka@mmintl.UUCP (Frank Adams) (06/16/87)
In article <6328@beta.UUCP> dzzr@beta.UUCP (Douglas J Roberts) writes: >Does somebody REALLY think we are going to run out of work before >we run out of MIPS? Fat chance. To be fair, the person who presented that view stated that there are plenty of applications which will use all the processing power available, and then some. The question is, can we make effective use of greatly increased processing power in common computing contexts -- the PCs on people's desks; the Apollo and Sun workstations; the departmental VAX? I still think the answer is yes. It is true that a 100-fold increase in power in the next 3 months would create a temporary overcapacity; one it might take a couple of years to make much use of. But I see no sign of such a quantum leap. As long as progress continues at its current (amazing) rate, we will be able to use what becomes available in relatively short order. -- Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108
herndon@umn-cs.UUCP (06/16/87)
In article <2120@dg_rtp.UUCP>, wood@dg_rtp.UUCP (Tom Wood) writes: > ... What architectural features help in dealing with > these various applications (graphics, simulation, object oriented AI, > CAD)? > ... If not, it seems you'd have to try > to build a "Cray": one fast monolithic processor. Cray has been selling multiprocessor systems for a few years now. Even the machines Seymour Cray designed while at CDC were multiprocessors (central processor and lots of little peripheral processors) and have come in dual central processor models for some time. > > So how many fast CPUs can these applications benefit from? My guess is > that most would have a hard time using more than 1, and that some may > be able to use 2 or so. > Tom Wood (919) 248-6067 Many applications would benefit significantly from parallelism. I'm informed that ray-tracing graphics is trivially parallelizable to many many machines. Many types of simulation are also parallelizable (weather prediction, thermal simulations, flow simulation, ...). Most physical simulations can be mapped onto a planar 3D mesh for significant speed gain. Many applications are not easily serialized, but a surprising number are parallelizable. Rethinking algorithms for massive parallelism often yields surprising speedups. See the December 1986 CACM article on the connection machine for some interesting parallel algorithms. -- Robert Herndon Dept. of Computer Science, ...!ihnp4!umn-cs!herndon Univ. of Minnesota, herndon@umn-cs.ARPA 136 Lind Hall, 207 Church St. SE herndon.umn-cs@csnet-relay.ARPA Minneapolis, MN 55455
eugene@pioneer.UUCP (06/18/87)
In article <1658@umn-cs.UUCP> herndon@umn-cs.UUCP (Robert Herndon) writes: > > Cray has been selling multiprocessor systems for a few years now. >Even the machines Seymour Cray designed while at CDC were multiprocessors >(central processor and lots of little peripheral processors) and have >come in dual central processor models for some time. Evans and Sutherland has also been selling high performance multiprocessor graphics systems for years as well. In some ways they outperform Crays in floating point multiplication (send inquiries to esvax). > Many applications would benefit significantly from parallelism. >I'm informed that ray-tracing graphics is trivially parallelizable Trivial eh? This is not a pursuit game. Trivial ray examples parallelize trivially. You only need look at the world around you. >to many many machines. Many types of simulation are also parallelizable >(weather prediction, thermal simulations, flow simulation, ...). >Most physical simulations can be mapped onto a planar 3D mesh for >significant speed gain. While you use the word "most", understand that many 3-D simulations and analyses use higher-order dimensions to hold other useful data: alpha channel for transparency in graphics systems, homogeneous coordinate systems, audio channel sync stuff, and so forth. At LANL and LLNL there are 7-D problems, which are limited by the FORTRAN. Don't restrict your view to 3-D only because you can conceptualize in 3-D. From the Rock of Ages Home for Retired Hackers: --eugene miya NASA Ames Research Center eugene@ames-aurora.ARPA "You trust the `reply' command with all those different mailers out there?" "Send mail, avoid follow-ups. If enough, I'll summarize." {hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!aurora!eugene
grunwald@uiucdcsm.cs.uiuc.edu (06/19/87)
I have to agree with Eugene Miya on the ``ray tracing is trivially parallel only if you do trivial things''. It's true that you can just sub-divide your view space. I moved a ray tracer to a hypercube and got almost linear speedup. However, the ``trivial'' part dies soon. Doing larger scenes (which is what's desired anyway) requires sub-dividing your data-space as well. Suddenly, your problem is no longer as simple as it used to be & you start to make trade-offs between replication of data & how much data & speed-up & messaging. But, an hour or two is faster than waiting a day for the vax to crank out a picture.
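The "trivial" view-space subdivision the post concedes can be sketched as below (Python, purely illustrative; `trace_pixel` is a toy stand-in for a real shader, and no data-space partitioning is attempted, which is exactly where the triviality ends):

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 64, 48

def trace_pixel(x, y):
    # Toy "shading": a deterministic function of the ray direction.
    return (x * 31 + y * 17) % 256

def trace_tile(tile):
    # Each pixel's ray is independent, so a tile needs no data from
    # any other tile -- the easy case, assuming the whole scene fits
    # in every worker's memory.
    x0, y0, x1, y1 = tile
    return {(x, y): trace_pixel(x, y)
            for y in range(y0, y1) for x in range(x0, x1)}

def render(tile_size=16, workers=4):
    tiles = [(x, y, min(x + tile_size, WIDTH), min(y + tile_size, HEIGHT))
             for y in range(0, HEIGHT, tile_size)
             for x in range(0, WIDTH, tile_size)]
    image = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(trace_tile, tiles):
            image.update(part)
    return image

image = render()
assert len(image) == WIDTH * HEIGHT
```

Once the scene no longer fits on each worker, `trace_tile` would need remote scene data, and the replication-versus-messaging trade-offs the post describes begin.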
fay@encore.UUCP (Peter Fay) (06/19/87)
In article <6240@steinmetz.steinmetz.UUCP> OCONNORDM@ge-crd.com@ARPA (Dennis Oconnor) writes: > >Enuff on how graphics will chew MIPS. It will, but not like SIMULATION ! >A revolution is occurring today in engineering, where more and more >computers, chemicals, parts and systems are thoroughly SIMULATED before >even a first prototype is built. On the horizon are even MORE complex >simulations, like the effect of drugs on biological systems, and >accurate faster-than-life atmospheric simulation. Even really simple >simulations ( like 100k-transistor CPUs ) eat MIPS. Simulation will >gobble up parallelism as well - what a bargain ! > >When it comes to producing new products in competitive arenas, simulation >has been shown to be faster and cheaper than traditional methods. The >more MIPS you have, the more complex a system people will be trying >to simulate. So bring on the MIPS ! ( MFLOPS too, of course. ) I was wondering how long it would be before someone said this. There is no question in my mind that this will become a huge market. Imagine simulating cache coherency of a 1000-MIP machine with 128 cpus and hierarchical caches. And that's only what we're doing today, not a few years from now. There's no question that this current work would be impossible without nice machines like our 35-MIPS multiprocessor. (Now we can run the Mach operating system - picture one Mach simulation task running with 2000 light-weight threads of control instead of one Unix job creating 2000 processes.) Of course it would be nice to have 30 times our current power, which is why we're designing it. peter fay research encore computer {allegra|decvax|ihnp4|linus|princeton|pur-ee|talcott}!encore!fay fay@multimax.arpa
bacon@alberta.UUCP (Dave Bacon) (06/23/87)
In article <2194@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes: > In article <6328@beta.UUCP> dzzr@beta.UUCP (Douglas J Roberts) writes: > >Does somebody REALLY think we are going to run out of work before > >we run out of MIPS? Fat chance. > .... It is true that a 100-fold increase in > power in the next 3 months would create a temporary overcapacity; one it > might take a couple of years to make much use of. But I see no sign of such If the production of such machines were limited (which could be expected with such a quantum leap), I think one could easily find the people who could use such machines, especially if they were priced in the same range as current machines and were capable of using current software with minimal change. We had one person here who used ~100 hours of Cray time creating a molecular animation sequence, and we would be QUITE happy to be able to do things like that in (almost) real time. Granted: not EVERYBODY could make reasonable use of such machines but, if you give me an immensely powerful machine, I'm sure I could find at least a dozen people who would be happy to use it within a week (assuming the software could be ported over that fast). Stephen Samuel {ihnp4,seismo!mnetor,vax135}!alberta!edm!steve
gwu@vlsi.cs.cmu.edu (George Wu) (06/23/87)
I doubt a 100-fold increase in computing power would create an over-capacity. If someone did come out with such a machine, he'd charge a bundle and clean up! Like today's supercomputers, or maybe more like the supercomputers a few years ago, it'd be very expensive to get time on such a machine, let alone purchase one. Access would be limited to a few customers, those who really need the power and can afford to pay for it. Of course, if enough different vendors came out with such machines, forcing the dollars per MIPS down, maybe there would be such excess capacity. George
eugene@pioneer.arpa (Eugene Miya N.) (06/23/87)
>In article <2194@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes: >> In article <6328@beta.UUCP> dzzr@beta.UUCP (Douglas J Roberts) writes: >> >Does somebody REALLY think we are going to run out of work before >> >we run out of MIPS? Fat chance. >> .... It is true that a 100-fold increase in >> power in the next 3 months would create a temporary overcapacity; one it >> might take a couple of years to make much use of. But I see no sign of such > Actually, I have given some thought to this temporary overcapacity as a useful measure. It took 3 weeks before one of our X-MPs (A 4-CPU upgrade) became saturated. I think this might give a useful measurement on the derivative of user usage. Other sites using other smaller CPUs reported at SIGMETRICS that their machine became saturated at the 3 month mark. This makes a lot of assumptions, but I feel that it makes a somewhat useful subjective measure. From the Rock of Ages Home for Retired Hackers: --eugene miya NASA Ames Research Center eugene@ames-aurora.ARPA "You trust the `reply' command with all those different mailers out there?" "Send mail, avoid follow-ups. If enough, I'll summarize." {hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!aurora!eugene
johnw@astroatc.UUCP (John F. Wardale) (06/24/87)
In article <2120@dg_rtp.UUCP> wood@dg_rtp.UUCP (Tom Wood) writes: >Personally, I believe the 90% solution to obtaining parallelism is to >take advantage of multiple independent computations. (It's much easier >to make 100 compiles go 100 times faster by using 100 machines than it >is to make 1 machine go 100 times faster on each compile.) This may be true, but for most "real" problems, some well-known person determined that the average code spends 90% of its time executing 10% of its code. This and other related studies show that a large fraction of problems have no, or very limited, parallelism. A couple months ago there was some discussion of somebody's challenge (with a moderate cash prize) .... As I recall, you had to speed up a general problem (not limited to HIS problem set, but he could reject anything that was "embarrassingly parallel" [like the 100 compiles example]) by a factor of 100, and you could use as many processors as you wanted. Did anyone save any of these? Has anyone won the prize yet? John W - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Name: John F. Wardale UUCP: ... {seismo | harvard | ihnp4} !uwvax!astroatc!johnw arpa: astroatc!johnw@rsch.wisc.edu snail: 5800 Cottage Gr. Rd. ;;; Madison WI 53716 audio: 608-221-9001 eXt 110 To err is human, to really foul up world news requires the net!
hwe@beta.UUCP (Skip Egdorf) (06/25/87)
In article <1873@ames.UUCP>, eugene@pioneer.arpa (Eugene Miya N.) writes: > Actually, I have given some thought to this temporary overcapacity as a > useful measure. It took 3 weeks before one of our X-MPs (A 4-CPU > upgrade) became saturated. I think this might give a useful measurement > on the derviative of user usage. Other sites using other smaller CPUs > reported at SIGMETRICS that their machine became saturated at the 3 > month mark. This makes a lot of assumptions, but I feel that it makes > a somewhat useful subjective measure. > > From the Rock of Ages Home for Retired Hackers: > > --eugene miya > NASA Ames Research Center > eugene@ames-aurora.ARPA When Los Alamos went from 6 to 7 Crays a few years back, the Cray folks said that we did something that they had never seen... Gone from 6 saturated Crays at 6 PM to 7 saturated Crays at 6:30 the next morning. Give me a hundred mips on my desk tomorrow, and I will use it up on BOTH graphics and simulation. Skip Egdorf hwe@lanl.gov
bradley@think.uucp (Bradley Kuszmaul) (06/25/87)
In article <337@astroatc.UUCP> johnw@astroatc.UUCP (John F. Wardale) writes: >This may be true, but for most "real" problems, some well-known >person determined that the average code spends 90% of its time >executing 10% of its code. > >This and other related studies show that a large fraction of >problems have no, or very limited, parallelism. Actually, I interpret the 90%-10% rule as indicating that there might be a lot of parallelism in problems. Since 90% of the time is spent executing a very small amount of code, it seems likely that there is a lot of data involved. It is also likely that the dependencies between data are something smaller than "completely connected", and so it is likely that different parts of the data can be processed in parallel. Now, I am the first to admit that everything I just said has a lot of "maybes" and "likelys" in it, and while I personally believe that in fact there is a lot of parallelism lying around, I would not expect you to believe it based on such a weak argument. My feeling is that the lesson to be learned here is not "There is X amount of parallelism lying around in current applications". The lesson is "Rules of thumb, such as Amdahl's Law, don't have much to say about parallel computing". >A couple months ago there was some discussion of somebody's >challenge (with a moderate cash prize) .... As I recall, you had >to speed up a general problem (not limited to HIS problem set, but >he could reject anything that was "embarrassingly parallel" [like >the 100 compiles example]) by a factor of 100, and you could use >as many processors as you wanted. Did anyone save any of these? >Has anyone won the prize yet? As far as I know, no one has seriously tried to solve the Karp problem. I think it is only a matter of time.
The problem is that in order to even have a chance at solving the Karp problem, you have to have at least 100 processors in your parallel machine (otherwise no way are you going to get a factor of 100 speedup). Most of the machines which have <1000 processors in them (e.g. BBN butterfly or Intel IPSC) have relatively slow communications networks, and thus find it difficult not to lose a factor of 10 due to that sort of inefficiency. If we assume (handwave handwave) that parallel computing carries some constant performance cost (say a factor of 10) then we need at least 1000 processors. There are very few machines with >1000 processors in them, and most of them are not quite general purpose enough to solve the problem. Company plug: The Connection Machine, which has 64000 processors in it, probably has the best chance of solving the Karp problem, but it seems a little unfair to compare 64000 one-bit processors to a single one-bit processor (for a performance improvement of 64000) or to compare a 64000 processor Connection Machine to a Cray (which is made out of much more expensive technology). My feeling is that the Karp problem should have been phrased as "compare your parallel machine with a single processor serial machine made out of about the same amount of hardware" (or perhaps the same cost?). This would pit a Connection Machine against about a third of a Cray 1, or perhaps a couple of Vax 8800's or a high end IBM mainframe. There are several applications for which the Connection Machine can beat those machines by an order of magnitude. One can imagine that there are some applications for which the Connection Machine is not well suited, but, aside from simulating a network of Cray computers, I can't think of any :-) -Bradley Bradley C. Kuszmaul, Thinking Machines Corporation, Cambridge, MA bradley@think.com bradley@think.uucp (i.e. seismo!think!bradley)
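The rule of thumb under dispute in these posts is Amdahl's Law: with serial fraction s, p processors give speedup 1/(s + (1 - s)/p). A small worked check (Python, illustrative) of why the Karp challenge's factor of 100 is so demanding:

```python
def amdahl_speedup(serial_fraction, processors):
    # Amdahl's law: the serial fraction s is untouched by parallelism,
    # so speedup = 1 / (s + (1 - s) / p).
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# 10% serial code caps the speedup below 10x on any number of processors.
assert amdahl_speedup(0.10, 1_000_000) < 10.0
# For 100x on 1000 processors, barely 0.9% of the work may be serial.
assert amdahl_speedup(0.009, 1000) > 100.0
assert amdahl_speedup(0.010, 1000) < 100.0
```

This is the arithmetic behind both positions: Wardale reads the 90%-10% rule as a hard serial bottleneck, while Kuszmaul's point is that the hot 10% of the code usually touches the bulk of the data, so its serial fraction can be far smaller than the rule of thumb suggests.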
mac@uvacs.CS.VIRGINIA.EDU (Alex Colvin) (06/25/87)
> > When Los Alamos went from 6 to 7 Crays a few years back, the Cray folks > said that we did something that they had never seen... > Gone from 6 saturated Crays at 6 PM to 7 saturated Crays at 6:30 the next > morning. Some load isn't spread out by adding processors. I hear that the Los Alamos Crays spend much of their time processing characters in the tty drivers and screen editors.
pase@ogcvax.UUCP (Douglas M. Pase) (06/26/87)
In article <2120@dg_rtp.UUCP> wood@dg_rtp.UUCP (Tom Wood) writes: Personally, I believe the 90% solution to obtaining parallelism is to take advantage of multiple independent computations. (It's much easier to make 100 compiles go 100 times faster by using 100 machines than it is to make 1 machine go 100 times faster on each compile.) I hope I'm not missing the point, but I think you're off by 10% -- that is, the only approach to parallelism is, by definition, taking advantage of multiple independent computations. There are lots of levels to choose from, not just one. What you have mentioned here (with the 100 compiles) is parallelism at the process level. This approach is the easiest, the best understood, and many vendors provide commercial products which successfully take advantage of this type of parallelism. Honeywell has been doing this for years with CP-6, and Sequent and Apollo are two newer entries. (Oh, I see. I bet this level is what you meant by "independent" -- correct me if I still misunderstand.) The advantage of this level is that the overhead required to parallelize the computation is relatively small, and controlled by the system (e.g. in system locks and resource scheduling) - not introduced into the computation itself. The next level has multiple co-operating tasks, as in a producer/consumer relationship and similar approaches. At this level the overhead is built directly into the application. Sometimes the overhead required to distribute sufficient information to run multiple parallel tasks cancels any benefits that might have accrued. Keller's Readyflow system operates at this level. The Cray X-MP, Sequent, Alliant, and some other shared memory machines can be operated at this level using some form of microtasking. All distributed memory machines (such as the Intel Hypercube and NCube's machine) are operated at this level. (By the way, "large grain dataflow" is at this level.) Another level is the instruction level.
At this level, instructions are scheduled independently and in parallel. The MIT dataflow machine is an example of this. Operands accumulate in a "waiting-matching store" until all operands required by an operator have accumulated. At that time the operator and its operands are placed in a queue, and executed as soon as a processor becomes available. The Manchester dataflow machine works very similarly to the MIT machine. The Cray machines also take advantage of this level of parallelism. Perhaps the bottom level is the microcode level. Any machine (such as the DEC 8600 series) which pipelines its microcode is executing in parallel. The Goodyear MPP is the machine which offers the most parallelism at this level (although it's not exactly pipelined). It weighs in at 16K 8-bit processors. In article <astroatc.337> johnw@astroatc.UUCP (John F. Wardale) writes: This may be true, but for most "real" problems, some well-known person determined that the average code spends 90% of its time executing 10% of its code. This and other related studies show that a large fraction of problems have no, or very limited, parallelism. As you have stated the study, it does *not* support your conclusion. Tight loops in FORTRAN (shudder) programs may often be parallelized, by pipelining instructions. Kuck's work on parallelizing compilers has shown amazing improvements can be gained by pipelining and vectorizing DO loops. (Yes, both are a form of parallelism.) It is the data dependencies which determine the available parallelism, not the size of the code. A couple months ago there was some discussion of somebody's challenge (with a moderate cash prize) .... As I recall, you had to speed up a general problem (not limited to HIS problem set, but he could reject anything that was "embarrassingly parallel" [like the 100 compiles example]) by a factor of 100, and you could use as many processors as you wanted. Did anyone save any of these? Has anyone won the prize yet?
This seems pretty easy to me. Any fluid modeling or simulation problem such as numerical weather forecasting, dynamic air-flow analysis, oceanographic simulation, planet/galaxy formation, gas dispersion, etc., would benefit a lot from just about any level of parallelism. If this problem set isn't sufficiently "real", how about finite-element analysis, or image processing? I would rather solve any of these problems on the MPP than any single 8-bit processor. -- Doug Pase -- ...ucbvax!tektronix!ogcvax!pase or pase@Oregon-Grad (CSNet)
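Pase's point that data dependencies, not code size, set the available parallelism can be shown with two toy loops (Python, purely illustrative):

```python
def independent(xs):
    # No iteration reads another's result: every element can be
    # computed in parallel (or vectorized, in Kuck's sense).
    return [x * x + 1 for x in xs]

def loop_carried(xs):
    # Each iteration needs the previous iteration's output: a serial
    # chain, no matter how short the loop body is.
    out = [xs[0]]
    for x in xs[1:]:
        out.append(out[-1] * 2 + x)
    return out

assert independent([1, 2, 3]) == [2, 5, 10]
assert loop_carried([1, 2, 3]) == [1, 4, 11]
```

Both loops are tiny, hot inner loops of the 90%-10% kind; only the dependence structure distinguishes the one that parallelizes from the one that cannot.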
fu@hc.DSPO.GOV (Castor L. Fu) (06/26/87)
In article <1613@uvacs.CS.VIRGINIA.EDU> mac@uvacs.CS.VIRGINIA.EDU (Alex Colvin) writes: >> >> When Los Alamos went from 6 to 7 Crays a few years back, the Cray folks >> said that we did something that they had never seen... >> Gone from 6 saturated Crays at 6 PM to 7 saturated Crays at 6:30 the next >> morning. > >Some load isn't spread out by adding processors. I hear that the Los Alamos >Crays spend much of their time processing characters in the tty drivers >and screen editors. Gosh, as far as I know, there aren't any screen editors on the crays. In fact, it is quite common for people to do most of their code preparation on other computers which have friendly editors etc.
bs@linus.UUCP (Robert D. Silverman) (06/26/87)
>The Goodyear MPP is the machine which offers the most parallelism at this >level (although it's not exactly pipelined). It weighs in at 16K 8-bit >processors. Slight correction here. The MPP consists of 16K 1-bit processors, not 8. Bob Silverman
baum@apple.UUCP (Allen J. Baum) (06/29/87)
-------- [] >Gosh, as far as I knew, there aren't any screen editors on the crays. >In fact, it is quite common for people to do most of their code preparation >on other computers which have friendly editors etc. Well, now you know differently. We are running VI on ours, and we may be getting EMACS in the near future -- {decwrl,hplabs,ihnp4}!nsc!apple!baum (408)973-3385
johnw@astroatc.UUCP (John F. Wardale) (06/29/87)
In article <1323@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes: > In article <2120@dg_rtp.UUCP> wood@dg_rtp.UUCP (Tom Wood) writes: > to make 100 compiles go 100 times faster by using 100 machines than it > In article <astroatc.337> (ME) johnw@astroatc.UUCP (John F. Wardale) writes: > > This and other related studies show that a large fraction of > problems have no, or very limited, parallelism. > >As you have stated the study, it does *not* support your conclusion. Tight >loops in FORTRAN (shudder) programs may often be parallelized, by pipelining >instructions. Kuck's work on parallelizing compilers has shown amazing >improvements can be gained by pipelining and vectorizing DO loops. (Yes, >both are a form of parallelism.) It is the data dependencies which determine >the available parallelism, not the size of the code. Right, and *MOST* programs have dependencies that limit the parallelism to one or two digits worth. Tom's original posting was looking for three digits worth. To get that, you need a problem that is "embarrassingly parallel" or an amazing system (hw+sw). >This seems pretty easy to me. Any <long list of ap's> > would benefit a lot from just about any level of parallelism. True, ***BUT*** Can you achieve >= 100x speed-up by parallelizing the examples you give? Use as many processors as you want! You can have a case of <beer,???> from me if you succeed! John W - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Name: John F. Wardale UUCP: ... {seismo | harvard | ihnp4} !uwvax!astroatc!johnw arpa: astroatc!johnw@rsch.wisc.edu snail: 5800 Cottage Gr. Rd. ;;; Madison WI 53716 audio: 608-221-9001 eXt 110 To err is human, to really foul up world news requires the net!
bs@linus.UUCP (Robert D. Silverman) (07/02/87)
In article <343@astroatc.UUCP> johnw@astroatc.UUCP (John F. Wardale) writes: >In article <1323@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes: >> to make 100 compiles go 100 times faster by using 100 machines than it > >> In article <astroatc.337> (ME) johnw@astroatc.UUCP (John F. Wardale) writes: >> >> This and other related studies show that a large fraction of >> problems have no, or very limited, parallelism. >> >>As you have stated the study, it does *not* support your conclusion. Tight >>loops in FORTRAN (shudder) programs may often be parallelized, by pipelining >>instructions. Kuck's work on parallelizing compilers has shown amazing >>improvements can be gained by pipelining and vectorizing DO loops. (Yes, >>both are a form of parallelism.) It is the data dependencies which determine >>the available parallelism, not the size of the code. > >Right, and *MOST* programs have dependencies that limit the parallelism to >one or two digits worth. Tom's original posting was looking for three >digits worth. To get that, you need a problem that is "embarrassingly >parallel" or an amazing system (hw+sw). Interestingly enough, I have an algorithm (actually two separate algorithms) which can take FULL advantage of ALL the parallelism available. These algorithms are the Quadratic Sieve and the Elliptic Curve Method for factoring integers. I have demonstrated that N MIMD processors yield an N-fold speed-up for both of these algorithms running on a distributed network of SUN-3's. I have a paper describing the Quadratic Sieve work that will appear this fall in the Journal of Supercomputing. My current set of hardware consists of about 22 SUN's and they quite definitely yield a 22-fold speedup. If I could get hold of 100 SUN-3's on a single network file system I could meet the conditions of the Karp challenge. The Elliptic Curve Method is a random algorithm that does some group (ring) computations on a random elliptic curve.
If the order of the group of that curve is divisible by ONLY small primes then the algorithm will succeed in factoring a very large number. If the group order isn't divisible by only small primes, one gives up and tries another random curve. By giving separate, random curves to each processor, N processors will run N curves at once rather than one after another. A consequence is that during the computations the processors need NOT communicate with one another. If one processor succeeds it simply halts the others. This sort of thing would also run very nicely on a SIMD machine. I suspect, however, that Karp would consider this sort of thing in the category of 'trivially parallelizable'. Comments Please? Bob Silverman
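The structure Silverman describes, independent trials with the first success taken, can be sketched as below (Python, purely illustrative; `try_curve` is a toy divisor probe, not real elliptic-curve arithmetic, and a deterministic candidate scatter stands in for random curve choice):

```python
from concurrent.futures import ThreadPoolExecutor

def try_curve(trial, n):
    # Toy stand-in for one random curve: each trial probes a different
    # candidate divisor (37 is coprime to 997, so trials 0..996 cover
    # every candidate in [2, 998] exactly once).
    d = 2 + (trial * 37) % 997
    return d if n % d == 0 else None

def parallel_factor(n, trials=997, workers=8):
    # Workers never communicate; the first success is taken and the
    # rest are ignored (a real distributed run would actively halt
    # the other processors, as the post describes).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(lambda t: try_curve(t, n), range(trials)):
            if result is not None:
                return result
    return None

factor = parallel_factor(91)  # 91 = 7 * 13
assert factor is not None and 91 % factor == 0
```

Because no trial depends on any other, N workers really do run N curves at once, which is exactly why this class of algorithm scales linearly and why Karp might rule it "trivially parallelizable".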