eugene@ames.UUCP (Eugene Miya) (05/20/85)
[My three cents worth.]

SUMMARY
1) multiprocessors and uniprocessors
2) parallelism semantics (two places) Sequin Whorf
3) RISC comment -- ACM ref.
4) programming multiprocessors
5) programming Crays [not brief]
6) portability of C and other applications to supercomputers
7) models of parallel and distributed computation

{ /* start */

> > my qualifications are: i have no experience with multiprocessors.

That's okay; we all have to start somewhere. In a survey I did for NASA,
I found a paper which pointed out that there are really two important
points of view:

    MULTIprocessors are a special case of UNIprocessors. [Just lots of them]
    UNIprocessors are a special case of MULTIprocessors [just simpler].

I can try to find that IEEE reference if you want it. Programming is going
to be a real challenge because programming as we know it is going to get
harder before it gets easier. Programming multiprocessors is a lot like
'systems' programming, where 'system' has an artificial distinction from
'applications' programming.

> my observation is that uniprocessors are not really uniprocessors at all
> (at least the fast ones). they are really made up of multiple functional
> units. these units might be rather specialized in function: multiplication,
> one of the goals

Is the PDP-11 a parallel processor? It has all these wide PARALLEL busses.
Carlo Sequin in Proc. IEEE had a table on the degrees of parallelism from
the BIT-level [Hardware] to the "Task [note software]"-level.
Parallel is where you see it.

> of risc architectures are that the compiler has an easier time to set up
> the pipelines so that these functional units are indeed kept busy with useful

This is one rationalization. RISCs have other values. For other
rationalizations for RISC, see Dave Patterson's paper on RISCs in the
January 85 CACM.

> we have all been taught to think sequentially to solve
> problems because of the programming languages that we were taught early on.
> i think that its more natural to solve some problems non-sequentially (albeit
> fuzzily).
> danny chen
> ihnp4!hou2b!dwc

Ah! Whorf's hypothesis about languages! This is a problem. Consider the
following: Suppose you only had tools [languages] which worked in parallel.
How would your thinking change? A common comment is: "Put APL on the Cray,"
but no one to my knowledge has done this, preferring to stick to
FORTRAN-like languages: CFT, CIVIC, etc., and you can include languages
like Pascal. A problem, of course, is that most supercomputers run batch
operating systems. More, see below.

You still have to do some things sequentially. Gordon Bell, speaking at
ACM 84, was distressed that many people were planning to build 'parallel'
5th gen. machines. He said they had little idea how hard it was, and that
this type of research should have been done in the 1970s instead of the
commercial over-emphasis on micros [last bit is my paraphrasing].
We are [somewhat] blindly seeking more performance. Too many people are
jumping into parallel processing without seeing some of the benefits they
gained from sequential processing that they might lose going parallel.
Simulation forced people to sit down and think sequentially.
This is a subtle point.

From Doug Pardee:
>> If you know
>> how to build a 100-MIP uniprocessor CPU, or 10 10-MIP processors for the
>> same instruction set, or 100 1-MIP processors, then it is always much better
>> to have the uniprocessor.

We thought about orbiting a Cray around Venus for radar image processing,
but it's not space qualified. Can you imagine orbiting all that freon?
No, not always.

>Which leaves me kinda confused...

You are not alone.

>I gather from the preceding discussion that the Cray is a SIMD (vector)

Yes.

>machine, and does quite nicely on achieving high performance working on
>a single job. So why then would anyone want to bog it down with a
>multi-user operating system?

I am certain Eugene Brooks and the people at BRL and other sites will speak.
There are two basic philosophies of using supercomputers: interactive and
batch execution. We have two Cray XMPs which mirror these two philosophies.
Both run the Cray Operating System [it's a lot like EXEC*1100].
The production XMP/22 is used strictly in a batch mode. I have to submit
a batch job to correct simple typographic errors. The second, an XMP/12,
is on an interactive front-end. I find myself much more productive on it.

It is easy to say interactive use is wasteful of cycles, but we do not
really understand how to hook up micro front-ends very well. Cray RI has
recently been looking at workstations, but distributed software is in short
supply. There is one highly touted version of a distributed emacs for
PC to Cray work. It's not emacs.

So I have two basic models. I can choose to use cycles efficiently via
batch. This means I have to do RJE, and may have to learn a second operating
system, a new set of commands, and so forth. There is a variant of this in
Remote Procedure Call. This is like programming one of those attached
processor boxes. grrrr!

The other model, a variant of direct interaction, is that of process
servers. Process servers do not have good models of distributed computation
for things like graphics [a new batch job submitted for each interaction].
Richard Watson at LLNL is supposed to be working on one: NLTSS/LINCS.
Problem: not ready yet.

It is difficult to convey what it is like to edit large data files [M bytes]
on something like a VAX [slow]. U*wizards have argued with me that I should
split files and cat them together. This is not always practical in the Cray
world. Consider a 4KB on a side image. [Suppose I only have pixels, not
objects] Do I have to store each row as a file in a directory with files
numbered after rows? It is much easier to deal with this as a single entity
from the applications standpoint.

A C compiler exists for the Cray.
There are an incredible number of applications useful to a Cray, but many
were not written portably. Most are written for VAX type [or 68K] machines.
The pcc is certainly 'portable', but all these people writing applications
in this supposedly portable language are taking sloppy short cuts.
The worst is the "int is equivalent to pointer" problem with things like
malloc. Unless greater care is taken, don't expect your programs to port
easily to a Convex, Cray, SX-2, or whatever.

>Wouldn't it make more sense to build a multi-micro system to run the
>operating system and for program development (one CPU for each user),
>thereby freeing up the vector CPU to actually *run* jobs (one after
>the other)?

YES. Convince Seymour Cray to do this. Come up with a good paradigm for
distributed {interactive} programming. RPC for this type of thing at
the applications level sucks. Do it fast, but do it good. The labs,
oil companies, plane makers, Lucasfilm [who knows?] will beat a path
to your door.

>--
>Doug Pardee -- Terak Corp. -- !{ihnp4,seismo,decvax}!noao!terak!doug
> ^^^^^--- soon to be CalComp

Brian Reid made some excellent comments in this group about multi-processor
flaming, and I think he had a pretty good discussion [correction: monologue]
about building 'balanced' systems in net.works.

} /* end */

--eugene miya
  NASA Ames Research Center
  {hplabs,ihnp4,dual,hao,decwrl,allegra}!ames!aurora!eugene
  @ames-vmsb.ARPA:emiya@jup.DECNET