eugene@ames.UUCP (07/17/86)
Two other followups from Tom DeBoni and Ian Kaplan came to me. I used to work with Tom and have respect for his ideas, and I have also talked a lot with Ian. My comments are indented.

From: deboni@pp.mcc.com (Tom DeBoni)
Subject: Comments on the latest flaming on parallelism

You notice I do not head this as "flaming on the latest comments..." There's a reason for that. Let's cut through the bafflegab and get at the heart of the matter. What do we really want of the ultimate parallel/concurrent/coordinate/cooperative/multiprocessor (choose one) computer systems we all like to dream about and (hopefully) work toward? We want one that's smart enough to know what we want it/them to do without our having to exhaust ourselves saying so. In short, we want ease of programming.

That is serious. Programming anything is hard - so hard that most of us aren't really sure what it means to do it "right". Programming a multiprocessor system can be as exponentially explosive in difficulty as it is in physical components, and possibly even worse (you will likely have to use all of those myriad components many times in the course of a program execution). Picture yourself a dance-hall caller: everybody obeys your calls and dances the steps you specify in unison.

    Ah! the classic dance-hall model!

If you have only a few dancers, this is analogous to programming a uniprocessor. Now imagine being each of the dancers simultaneously, do-se-do-ing like crazy, sometimes waiting, sometimes reaching, shuffling, twirling, etc., and with all the strange rhythms you can imagine. This is programming the contemporary multiprocessor system. Now try imagining this all going on asynchronously - without a controlling bass beat; this is what we're in store for in the near future (to an extent it's here already), when more difficult problems and more complex algorithms start forcing us out of lock-step mode.
Every instruction, wait, send, receive, resume, variable, message, file, OS call, and hardware glitch will be the responsibility of the already up-to-his-neck-in-mud programmer. Do you want the job? I sure don't. So then what? Well, how about establishing some discipline about the use of these complicated systems, in much the same way that modern languages and OS practice did for programming contemporary uniprocessors. We use block-structured languages with lexical scoping and rational control structures because they ease the pain of thinking about our programs, and seldom care about the inefficiencies they involve. We use virtual systems and virtual memory for the economic and temporal convenience they afford, and almost never care about the mechanical "man behind the curtain" that gives us the illusion but not the reality; because we find the illusion just dandy, and the reality impractical anyhow. (Sure we'd all love to have a 100 MIPS workstation with 1 GB of RAM, but who could afford one?)

Avoiding the inefficiencies is a problem we'd rather not deal with until we are ultimately backed into that corner. No self-respecting programmer today starts out by playing tricks with machine instruction sequences to save clocks (such as loading a small constant and shifting it, to save floating an integer, which may take slightly more time), just as no one would purposely spaghettify code to avoid the use of null else clauses and thus minimize useless branching. These pathological practices are saved for the really tough situations, and go in after the rest of the code is well structured for understanding, use, and maintenance - and then only when they're needed. Optimization is a chimera that most of us don't care to chase, and with good reason: life's too short, and such obsessive-compulsive behavior is seldom rewarding enough to justify itself.
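[The clock-saving trick DeBoni alludes to is what compiler folk call strength reduction: trading an arithmetic operation for a cheaper bit-level one. A minimal sketch, with the function names and the constant chosen purely for illustration, showing why the trick buys obscurity along with the clocks:]

```python
def scale_by_8_clear(x):
    """The readable version: the intent (multiply by 8) is obvious."""
    return x * 8

def scale_by_8_tricky(x):
    """The hand-optimized version: a left shift by 3 is the same as
    multiplying by 2**3, and on some hardware of the day was cheaper --
    but the intent is now buried in bit twiddling."""
    return x << 3
```

Both return the same values; the argument above is that the second form should go in only after the structured version exists, and only where measurement demands it.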
So why not try to smooth out some of the rough edges and fill in some of the crevices in the parallel/call-it-what-you-will processing world? What am I talking about? About languages with clean semantics, with which automatic tricks can be played, and which give predictable, if inefficient, results. This may sound strange coming from an architecture person, but I've been down there on the "killing floor", and I "ain't gonna study war no more". I became dissatisfied with advanced architectures when I realized how hard they were for humans to deal with. I thought the answer might be in new execution models, but had no idea what they might be; so I started thinking about better languages and programming environments, believing that once we've figured out the details of what we really want to do, the rest will be engineering and technology.

I'll take a language with a few good high-level mathematical constructs (vector, array, set, stream, etc.) and a mathematically water-tight interpretation (applicative semantics), and let the post-processor worry about optimization. Not fair to assume magic on the part of the compiler or its author, you say? Exactly. We won't need magic - the functionality will save us. And how do we get function definitions to map smoothly and efficiently onto finite state machines? We may not; but new execution models are what we're really hoping for. At best, they will take us the next steps; at worst, we may be stuck with running functions on state-based machinery. But even in the latter case, I'm willing to bet we can smooth out the interface between the functions and the registers - make it smaller, more compact, and better behaved - and do so much more cost-effectively than we can continue to try to use dance-hall architectures by being the entire dance-hall ourselves.

Humbly submitting the above, /* editing */

Thomas M.
DeBoni
Graduate of the Famous Flamers Institute, and Doctor of Sourcery
(deboni@mcc, deboni@sally, cs.deboni@utexas-20, fill in the domains later...)

From: nike!ll-xn!s3sun!sdcsvax!loral!ian (Ian Kaplan)

Eugene: Some notes on your article ("The Mythical MIPS (MegaFLOPS)"):

>One problem is that the world is not always parallel.

You make use of parallelism where you find it. In some problems there is only obvious pipeline parallelism; in others one can get lots of horizontal parallelism. Part of the problem is that computer science people have been writing about "order" speed-ups (e.g., the problem can be sped up by some factor that is a function of the number of processors). In real life (at least at the present time) the way things work is that you have a problem that does not run fast enough on the uniprocessor you can afford (there are only a handful of people who have problems that are too slow for Crays). You then look at putting the problem on a parallel processor. You can either figure out how to do it or you can't.

Part of the problem is that many people in the computer science community are looking for general solutions. I suspect that we do not know enough yet about parallel processing to build general purpose machines. To be general purpose, a parallel processor should not force the programmer to be aware of the physical architecture. To me this means small-grain dataflow machines running something like SISAL, Id, or Lucid.

    As you point out later, there are economics involved. There is great
    motivation for general purpose architectures and I get this
    high-level argument every week. I dare say that many special purpose
    parallel machines do not work `efficiently.'
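[Ian's caution about "order" speed-ups can be made concrete with the usual Amdahl-style arithmetic; the formula is textbook material rather than anything from either poster, and the 10% serial fraction below is an arbitrary illustration:]

```python
def speedup(serial_fraction, n_processors):
    """Amdahl's law: the achievable speedup when only part of a
    program can use the extra processors. The serial fraction
    bounds the benefit no matter how many processors you add."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)

# A perfectly parallel program scales linearly:
#   speedup(0.0, 8) -> 8.0
# But with a 10% serial fraction, even 1024 processors deliver
# less than a 10x speedup -- the "factor that is a function of
# the number of processors" saturates quickly.
```

This is why "you can either figure out how to do it or you can't" is often the practical truth: unless the serial fraction is driven close to zero, the order-of-growth result never shows up on real hardware.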
>I see two communities who are not communicating:
>physical scientists see "spatial" parallelism: all those difference
>equations over a given space, they see meshes, but the computer science people
>(typically the AI and compiler people) see "syntactic" parallelism,
>they tend to see syntax trees like data flow graphs, for instance.

    Comment: not a single letter came from the AI community. I suspect
    the AI people (generally) don't know what they will do with
    "parallelism." The vision people are an exception. I think we might
    call the AI bluff soon.

What do you mean by a mesh? Any network-connected system (like a binary n-cube) is, as I understand it, referred to as a "mesh connected" system. One form of a mesh is a rectangular grid. While grids are good for solving problems that need only nearest-neighbor communication, the message passing overhead is such that a grid is not very good for any general class of problems.

    Mesh: as you say, nearest spatial neighbor. Message passing is lower
    level, an implementation detail. What about the higher level? The
    general notation is still poor. We had some SISAL discussions about
    the annotation of boundary conditions (not PDE, but array
    boundaries). The concurrent constructs most recently surveyed by
    Andrews and Schneider in ACM Computing Surveys are OS-oriented and
    typically involve small numbers of processes, like forking.

>"The Concept"

It seems to me that you ignore the importance of pipeline parallelism for a large class of problems. Many biological systems are pipelined, although baby production is not one of them. Dependence does not always mean that a problem cannot benefit from a parallel machine.

    Name some (preferably at least 3). That was my point about using
    Brooks' example. We confuse work and effort.

>It would be argued by some that this is for more limited applications
>but again those spatially based problems tend to dominate. Why no
>68Ks or 32Ks in a mesh?

Again, I am not sure what you mean by mesh.
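[The "nearest spatial neighbor" pattern that makes rectangular grids attractive can be sketched as a one-dimensional relaxation sweep; a minimal hypothetical example, with the averaging rule chosen purely for illustration:]

```python
def stencil_step(u):
    """One Jacobi-style relaxation sweep: each interior point is
    replaced by the average of its two nearest neighbors, with the
    boundary values held fixed. Every update reads only adjacent
    cells, so on a mesh-connected machine each processor needs
    messages only from its immediate neighbors -- the "spatial"
    parallelism physical scientists see in difference equations."""
    interior = [(u[i - 1] + u[i + 1]) / 2.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]
```

All the interior updates in one sweep are independent of each other, which is exactly what a grid of processors exploits; the catch Ian raises is that problems needing anything beyond neighbor-to-neighbor traffic pay heavily in message passing overhead.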
The N-Cube system is the equivalent of a bunch of 68000's connected in a binary n-cube mesh. The Loral dataflow system uses a bunch of 32000's connected by segmented busses. Off-the-shelf microprocessors like these are not very good at doing high-speed message passing, and this is an important component of system performance. If you use a standard processor you must provide front-end hardware to support message passing. This is what we do.

>Is it all marketing hype? How could the money be
>better directed (for research purposes)?

Experimenting with computer architectures is a very expensive business. Historically most of it has been done in the commercial world. This is not to say that research does not play an important part, but only in the commercial world will people do the dirty grunt work that is needed to produce a system that is really usable.

>The "tightly coupled" vs. "loosely coupled" debate went on
>7-8 years ago before everyone got tired of it. It was sort of
>the analog of the RISC vs. CISC debates of today. The net result
>was sort of an agreement that there was a spectrum, not a dichotomy.

Well, there may be a spectrum, but I still do not understand why presumably intelligent researchers believe that shared memory machines (e.g., tightly coupled) like the NYU Ultracomputer or the RP3 are real large-scale parallel processors. (As you might guess, I work on a distributed memory, or loosely coupled, system.)

    You make a good point here. Our researchers (numerical analysts),
    like the guys at NYU, have been accused of just extending serial
    algorithms; few (but more coming) truly parallel algorithms exist.
    Dongarra was one of the people recently making this point. Are
    tightly coupled machines MISD?

>The latest thing you see in parallel processing is the "real"
>numerical analysts who are actually putting the problems on
>machines. Until very recently, with a few exceptions from the ILLIAC
>days, most parallel numerical analysis has been theoretical.
>
>Diane. .
There are those who point out that it took a long time for people to learn to use vector processors, and that it will take at least as long for people to learn to use parallel processors.

Ian Kaplan
Loral Dataflow Group
USENET: {ucbvax,decvax,ihnp4}!sdcsvax!sdcc6!loral!ian
ARPA:   sdcc6!loral!ian@UCSD

    Here is that distinction between parallel and vector again. It did
    not take long to learn. I would argue that vectorization in some
    ways is too easy. Everybody can think in terms of vectors, and
    that's why I think machines like Crays, Convexes, and Alliants are
    doing so well. That's why the Hypercubes are adding vector options,
    and you will see the older companies doing the same (oh, I forgot
    IBM with the 3090; I left out CDC/ETA intentionally, could change).
    Vectors are very regular; should not parallel be? Well yes, until
    you get to MIMD. Most of the big users would be happy with
    vectorizable loops with conditionals -- that's MIMD to them for now.
    You have convinced me more than ever that we have a terminology
    problem.

From the Rock of Ages Home for Retired Hackers:

--eugene miya
  NASA Ames Research Center
  com'on do you trust Reply commands with all these different mailers?
  {hplabs,ihnp4,dual,hao,decwrl,tektronix,allegra}!ames!aurora!eugene
  eugene@ames-aurora.ARPA