[net.arch] multiprocessors

eugene@ames.UUCP (Eugene Miya) (05/20/85)

[My three cents worth.]
			SUMMARY
	1) multiprocessors and uniprocessors
	2) parallelism semantics (two places)
		Sequin
		Whorf
	3) RISC comment -- ACM ref.
	4) programming multiprocessors
	5) programming Crays [not brief]
	6) portability of C and other applications to supercomputers
	7) models of parallel and distributed computation

{ /* start */
> 
> my qualifications are: i have no experience with multiprocessors.

That's okay; we all have to start somewhere.  In a survey I did for NASA,
I found a paper which pointed out that there are really two important
points of view:

MULTIprocessors are a special case of UNIprocessors [just lots of them].
UNIprocessors are a special case of MULTIprocessors [just simpler].
I can try to find that IEEE reference if you want it.

Programming is going to be a real challenge, because programming as we
know it is going to get harder before it gets easier.  Programming
multiprocessors is a lot like 'systems' programming, where the
distinction between 'systems' and 'applications' programming is an
artificial one.

> my observation is that uniprocessors are not really uniprocessors at all
> (at least the fast ones).  they are really made up of multiple functional
> units.  these units might be rather specialized in function: multiplication,
> one of the goals

Is the PDP-11 a parallel processor?  It has all those wide PARALLEL busses.
Carlo Sequin, in Proc. IEEE, had a table of the degrees of parallelism,
from the bit level [hardware] up to the task level [note: software].
Parallelism is where you see it.

> of risc architectures is that the compiler has an easier time setting up
> the pipelines so that these functional units are indeed kept busy with useful

This is one rationalization; RISCs have other values.  For other
rationalizations for RISC, see Dave Patterson's paper on RISCs in the
January 85 CACM.

> we have all been taught to think sequentially to solve
> problems because of the programming languages that we were taught early on.
> i think that it's more natural to solve some problems non-sequentially (albeit
> fuzzily).
> danny chen
> ihnp4!hou2b!dwc

Ah! Whorf's hypothesis about languages!  This is a problem.  Consider the
following:
	Suppose you only had tools [languages] which worked in parallel.
	How would your thinking change?
A common comment is "Put APL on the Cray," but no one to my knowledge has
done this, preferring to stick to FORTRAN-like languages: CFT, CIVIC, etc.
[and you can include languages like Pascal].  A problem, of course, is
that most supercomputers run batch operating systems; more on this below.
And you still have to do some things sequentially.
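
A minimal sketch of that last point [the loops are C for concreteness;
the vectorizing compilers of the day speak FORTRAN].  The first loop has
independent iterations, the kind a vectorizer can run in parallel; the
second is a recurrence and stays sequential no matter what language you
write it in:

	double a[1000], b[1000], c[1000];

	loops(n)
	int n;
	{
		int i;

		/* independent iterations: a vectorizer can do these
		   all at once */
		for (i = 0; i < n; i++)
			a[i] = b[i] + c[i];

		/* a recurrence: iteration i needs the result of
		   iteration i-1, so it stays sequential */
		for (i = 1; i < n; i++)
			a[i] = a[i-1] + b[i];
	}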

Gordon Bell, speaking at ACM 84, was distressed that many people were
planning to build 'parallel' 5th generation machines.  He said they had
little idea how hard it was, and that this type of research should have
been done in the 1970s instead of the commercial over-emphasis on micros
[the last bit is my paraphrasing].

We are [somewhat] blindly seeking more performance.  Too many people are
jumping into parallel processing without seeing which of the benefits
they gained from sequential processing they might lose by going parallel.
Simulation forced people to sit down and think sequentially.  This is
a subtle point.

From Doug Pardee:

>> If you know
>> how to build a 100-MIP uniprocessor CPU, or 10 10-MIP processors for the
>> same instruction set, or 100 1-MIP processors, then it is always much better
>> to have the uniprocessor.

No, not always.  We thought about orbiting a Cray around Venus for radar
image processing, but it's not space-qualified.  Can you imagine orbiting
all that Freon?

>Which leaves me kinda confused...

You are not alone.

>I gather from the preceding discussion that the Cray is a SIMD (vector)
Yes.
>machine, and does quite nicely on achieving high performance working on
>a single job.  So why then would anyone want to bog it down with a
>multi-user operating system?

I am certain Eugene Brooks and the people at BRL and other sites will
speak up.  There are two basic philosophies of supercomputer use:
interactive and batch execution.  We have two Cray XMPs which mirror
these two philosophies.  Both run the Cray Operating System [it's a lot
like EXEC*1100].  The production XMP/22 is used strictly in batch mode:
I have to submit a batch job just to correct simple typographic errors.
The second machine, an XMP/12, sits behind an interactive front end, and
I find myself much more productive on it.

It is easy to say interactive use is wasteful of cycles, but we do not
really understand how to hook up micro front ends very well.  Cray
Research has recently been looking at workstations, but distributed
software is in short supply.  There is one highly touted version of a
distributed emacs for PC-to-Cray work.  It's not emacs.

So I have two basic models.  I can choose efficient use of cycles via
batch: this means I have to do RJE [remote job entry], may have to learn
a second operating system and a new set of commands, and so forth.  A
variant of this is the Remote Procedure Call, which is like programming
one of those attached-processor boxes.  grrrr!
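
To illustrate [a hypothetical sketch of mine, not any real package]: in
the RPC model every "call" from the front end becomes marshal, ship,
block, unmarshal over a byte stream to the Cray, which is exactly the
attached-processor style of programming:

	struct request {
		int op, n;
		double x[1024], y[1024];	/* assume n <= 1024 */
	};

	double
	remote_dot(fd, x, y, n)		/* looks like a local subroutine */
	int fd, n;
	double *x, *y;
	{
		struct request req;
		double result;
		int i;

		req.op = 1;			/* hypothetical opcode */
		req.n = n;
		for (i = 0; i < n; i++) {
			req.x[i] = x[i];	/* marshal the arguments */
			req.y[i] = y[i];
		}
		write(fd, (char *)&req, sizeof req);	  /* ship it */
		read(fd, (char *)&result, sizeof result); /* block for reply */
		return result;
	}

Every interaction pays that full round trip, which is why it feels so
clumsy at the applications level.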

The other model, a variant of direct interaction, is that of process
servers.  Process servers do not have good models of distributed
computation for things like graphics [a new batch job submitted for each
interaction].  Richard Watson at LLNL is supposed to be working on one:
NLTSS/LINCS.  Problem: it's not ready yet.

It is difficult to convey what editing large data files [megabytes] is
like on something like a VAX [slow].  U*x wizards have argued with me
that I should split files and cat them together.  This is not always
practical in the Cray world.  Consider an image 4K pixels on a side
[suppose I only have pixels, not objects].  Do I have to store each row
as a file in a directory, with the files numbered by row?  It is much
easier to deal with this as a single entity from the applications
standpoint.
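
Do the arithmetic: 4096 x 4096 pixels is 16 million pixels, about 16
Mbytes at one byte per pixel [my assumption for this sketch].  Kept as
one flat file, fetching a row is a seek and a read:

	#define N 4096			/* pixels per side, 1 byte each */

	long lseek();

	get_row(fd, r, rowbuf)		/* read row r into rowbuf */
	int fd, r;
	char *rowbuf;
	{
		lseek(fd, (long)r * N, 0);	/* seek to start of row r */
		read(fd, rowbuf, N);		/* one read, one file */
	}

Reading the whole image the row-per-file way means 4096 opens and a
directory the size of a phone book.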

A C compiler exists for the Cray.  There are an incredible number of
applications useful on a Cray, but many were not written portably.  Most
are written for VAX-type [or 68K] machines.  The pcc is certainly
'portable,' but all these people writing applications in this supposedly
portable language are taking sloppy shortcuts.  The worst is the "int is
equivalent to pointer" problem with things like malloc.

Unless greater care is taken, don't expect your programs to easily
port to a Convex, Cray, SX-2, or whatever.

>Wouldn't it make more sense to build a multi-micro system to run the
>operating system and for program development (one CPU for each user),
>thereby freeing up the vector CPU to actually *run* jobs (one after
>the other)?

YES.
Convince Seymour Cray to do this.  Come up with a good paradigm for
distributed {interactive} programming; RPC for this type of thing at the
applications level sucks.  Do it fast, but do it good.  The labs, the oil
companies, the plane makers, Lucasfilm [who knows?] will beat a path to
your door.

>-- 
>Doug Pardee -- Terak Corp. -- !{ihnp4,seismo,decvax}!noao!terak!doug
>               ^^^^^--- soon to be CalComp

Brian Reid made some excellent comments in this group about multi-processor
flaming, and I think he had a pretty good discussion [correction:
monologue] about building 'balanced' systems in net.works.

} /* end */

--eugene miya
  NASA Ames Research Center
  {hplabs,ihnp4,dual,hao,decwrl,allegra}!ames!aurora!eugene
  @ames-vmsb.ARPA:emiya@jup.DECNET