retrac@rice.edu (John Carter) (06/09/89)
Several weeks back I posted a request for the specs of commercially
available distributed-memory multiprocessors (which I didn't define).
The responses that I received are included below.  I'd like to thank
everyone who responded for their time and effort!  I appreciate it!
I'm sorry that I didn't post the summary earlier, but I've been out
of town.

Note: Several of the responses point to where I can get more
information (such as corporate addresses, books, or tech reports).
I have not yet followed them up, but when I do I'll post an update.

Note II: Followups are redirected to comp.arch.

******************************************************************

Hey.  Since you used it as an example, I thought I'd let you know -
Ametek is out of business.

Greg Frazier
--
Greg Frazier        o       Internet: frazier@CS.UCLA.EDU
CS dept., UCLA     /\       UUCP: ...!{ucbvax,rutgers}!ucla-cs!frazier
               ----^/----
                   /

******************************************************************

Don't forget the Connection Machine 2 (CM2) from Thinking Machines
Corp., Cambridge, MA.

-- David Browning --
UUCP:   ...!{rutgers,ucbvax,sdcrdcf,{hao!cepu}}!ucla-se!math.ucla.edu!browning
ARPA:   browning@math.ucla.edu
BITNET: browning%math.ucla.edu@INTERBIT

******************************************************************

The book "Highly Parallel Computing" by Almasi and Gottlieb
(Benjamin/Cummings Publishing Co., Inc., c. 1989) has descriptions of
a fair number of machines in its later chapters.  Chapter 10, "MIMD
Parallel Architectures," would probably have what you are looking
for.  It doesn't go into great technical depth, though.

-- Jonathan Sweedler === National Semiconductor Israel
UUCP:    ...!{amdahl,hplabs,decwrl}!nsc!taux01!cjosta
Domain:  cjosta@taux01.nsc.com

******************************************************************

Transputers from INMOS: these are a family of processors from which
distributed-memory multiprocessors can be built.
There are dozens of articles describing them; I will mention only
one - my own (!).  This is:

    Solving problems with Transputers: background and experience
    John Wexler and Dominic Prior
    Microprocessors and Microsystems (Butterworth Scientific,
    Guildford, UK), vol. 13, no. 2, March 1989, pp. 67-78
    ISSN 0141-9331   MCRPD 13 (2) 65-144 (1989)

The whole of the March and April issues is devoted to Transputers.

Answers to your questions (these are only order-of-magnitude figures,
because I don't have the references with me and I don't have time to
look them up):

> speed of nodes

There is a family of processors at various levels of performance.
The top of the range in common use (not the latest thing) is the
T800-20, which runs at around 10 (RISCish) MIPS.  It has
ANSI-standard 64-bit floating point on the chip, which can sustain
1.5 Mflops.

> number of nodes

Any number.  The Transputer is the node: the purchaser, or an OEM,
can assemble Transputers into any kind of network, either by
hardwiring them together or by connecting them through some kind of
circuit-switched network.  Meiko is an example of a company that
builds large systems, sometimes using several hundred Transputers.

> speed and type of interconnect

Each processor has four "links" on the chip with it.  A link is an
autonomous DMA engine that can handle one inward and one outward
transfer at any one time.  Thus a Transputer can be running 8
transfers simultaneously with processing.  A link on one Transputer
can be connected directly to a link on another with no intermediate
logic, buffering, or electronics; only two wires are needed, because
the transfers are bit-serial.  A link can run at 20 Mbits/second, but
I'm not going to explain how simultaneous transfers in and out, or on
several links, can affect one another's speeds.  Connections are
direct point-to-point: one link on one processor can talk to one link
on one other processor.  These connections are usually static, at
least for the lifetime of a run of a program.
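Since each chip has only four links and connections are point-to-point,
any hardwired Transputer topology is limited to node degree four.  A
toy Python sketch (the class and function names are my own
illustration, not any INMOS interface) shows why a 2-D torus fits the
constraint exactly:

```python
# Toy model of wiring Transputers into a network.  Each chip has four
# autonomous DMA links, so a directly-wired topology cannot exceed
# node degree 4.  Illustrative only - not an INMOS API.

class Transputer:
    MAX_LINKS = 4  # T800: four bit-serial links on-chip

    def __init__(self, ident):
        self.ident = ident
        self.links = []  # point-to-point connections to other chips

def connect(a, b):
    """Hardwire one link on `a` to one link on `b` (two wires, no glue logic)."""
    if len(a.links) >= Transputer.MAX_LINKS or len(b.links) >= Transputer.MAX_LINKS:
        raise ValueError("no free link on one of the chips")
    a.links.append(b.ident)
    b.links.append(a.ident)

# A 4x4 torus uses exactly four links per node, so it can be built
# from bare Transputers with no switching hardware at all.
N = 4
nodes = {(x, y): Transputer((x, y)) for x in range(N) for y in range(N)}
for x in range(N):
    for y in range(N):
        connect(nodes[(x, y)], nodes[((x + 1) % N, y)])  # east neighbour
        connect(nodes[(x, y)], nodes[(x, (y + 1) % N)])  # north neighbour

# Every node has used all four of its links.
assert all(len(t.links) == Transputer.MAX_LINKS for t in nodes.values())
```

A hypercube of dimension five or more, by contrast, would need more
than four links per node and so requires either multiple Transputers
per node or external switching.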
> broadcast/multicast support

None in hardware, nor any support for message forwarding.  Those
things have to be done by software, or by non-Transputer hardware in
the interconnection network.

> memory size

4 Kbytes on the chip; 32-bit byte addressing for off-chip memory.
The user or OEM can supply as much off-chip memory as desired.

> memory speed

It uses a 32-bit architecture and has a clock speed of 20 MHz:
off-chip memory is accessed in 3 cycles, and there are 4 Kbytes of
single-cycle memory on-chip with the processor.

> caching strategy

None (or, to put it another way, determined by software).  The
processor's on-chip memory can be used just like any other part of
memory: a user, or a compiler, or whatever, may choose to use it to
hold frequently-needed code or frequently-needed data, or may not
bother to make any particular decision about how to use it.  If it
does make a choice, it may make a static decision ("I'll keep the
high-priority code on-chip") or it may have a dynamic strategy, which
is equivalent to caching.

Incidentally, just as there is no caching, so there is no large
battery of registers such as one commonly finds in processors that
are called RISC.  INMOS took the line that on-chip memory could serve
the same purpose as dozens of registers.  There is no memory
protection either, nor support for virtual memory.  These decisions
are a bit eccentric, but they are not without justification.  They
are not immutable, either; these features may appear in newer
Transputers.

> cost

Don't know.

I hope this is of some use,
John Wexler

******************************************************************

Hi!  I think your work is very useful, and I would like to get a copy
of your final collection.  I can think of the following items:

1. Inmos Ltd., "Transputer Reference Manual", Prentice-Hall.

2. Thinking Machines Corp., "Connection Machine Model CM-2 Technical
   Summary", Thinking Machines Tech. Rept. HA87-4.

3. Also, I suggest you send mail or email to Prof. K.
Hwang at Dept. of EE-Systems, USC, Los Angeles, CA 90089-0781,
"kaihwang@panda.usc.edu".  He has some data on super and minisuper
computers.

Bill Xu

******************************************************************

Commercial machines: NCube, iPSC, Ametek, Flex, {a zillion Transputer
things - ask Inmos}, etc.

Research machines: PASM, ZMob, etc. ... there are perhaps a hundred
that I know of.

But you have a real problem: define "distributed memory."  E.g., is
the BBN Butterfly a distributed-memory machine?

-hankd@ee.ecn.purdue.edu

PS: Are we doing your homework for you?
[NOTE: Homework?  No.  Legwork?  Yeah, somewhat.]

******************************************************************

You should get some information on the system from Cogent Research.
I work for them, so I'm biased, but I wouldn't work here unless I
thought this was the neatest thing going.

The base system has from 2 to 32 processors, but is expandable beyond
that.  Unlike most parallel computers, this one comes with a display,
keyboard, and mouse (sound, even); it runs a UNIX-like parallel
operating system and has a user interface based on NeWS.  The
operating system is based on the Linda parallel communication
paradigm developed at Yale, which makes parallel programming much
easier.

I can have the marketing people send you some more information if you
are interested, or you can contact them yourself:

    Cogent Research
    1100 NW Compton Drive
    Beaverton, OR 97006
    503/690-1450

******************************************************************

I do not have any specific data (yet; I hope to get some measurements
this summer), but the following machines are distributed-memory
parallel computers:

    HP Mayfly (formerly FAIM) - hexagonal interconnect
    Myrias SPS-2 (Edmonton, Alberta) - 64-1024 68020s, 64 per
        cage/same backplane

Hopefully, someone with details will supply more information.  If
not, let me know.  I have more information about these machines, but
I'm not sure how current/accurate it is.
Lynn Sutherland
suther@skorpio.usask.ca

******************************************************************

Here at Intel Scientific Computers, we build the iPSC/2.  I'll see if
I can get some of our basic info stuff sent out to you.  For better
response, though, talk to your local iSC rep.  His name is Sam Welsh,
and he can be contacted at:

    Sam Welsh
    Intel Scientific Computers
    7322 Southwest Freeway, Suite 1490
    Houston, Texas 77074
    713-988-8086

Happy Days...
--
============================================================================
Shannon Nelson              "Live, from the arctic reaches of ..."
iSC Evaluation
shannon@isc.intel.com       ...!uunet!littlei!intelisc!shannon
(503) 629-7607              Intel disclaims all knowledge of my existence.

******************************************************************

I saw your comp.arch posting.  Here's some info about the Cogent
Research XTM system (a distributed-memory system).

Here is a description of the overall Cogent machine hardware.  The
Cogent Workstation consists of two Inmos T800 Transputers, each with
4 Mbytes of RAM, a 90-Mbyte internal hard drive, and an 800K floppy
drive.  The monitor is 1024 by 808 with 256 colors (8 bits per
pixel).

A workstation can be connected to one or more computer servers, which
we call "Resource Servers."  A Resource Server consists of a
backplane with 16 slots.  One slot is dedicated to a communication
board that connects to other Resource Servers or workstations.  The
other 15 slots can hold processor boards.  Each processor board
contains two T800s, again each with 4 Mbytes of RAM.  In addition to
bus communication, there is a crossbar switch to connect links on any
two Transputers in the Resource Server.  The idea is that short
messages will go rapidly over the bus, and longer messages will go
over the serial links.

A Resource Server can have up to 30 processors with 120 Mbytes of
RAM.  This is in addition to a workstation's 2 processors and 8
Mbytes of RAM.
Our operating system is called XTMOS (our workstation is called an
"XTM Workstation", hence XTM-OS).  It is highly compatible with the
POSIX standard.  A host of utilities that are normally provided with
*IX systems are provided with our system (such as grep, awk, sed, vi,
emacs, diff, more, tar, compress, etc.).

The underlying communication system is a simplified and extended
version of Yale Linda called "Kernel Linda".  It is implemented at a
very low level of our OS (for speed).  A full Yale Linda
implementation is in the works; however, Kernel Linda also has some
extensions not present in Yale Linda (such as multiple tuple spaces,
necessary for our multitasking environment).  It is possible to write
parallel programs using Kernel Linda that use all the processors
available to any one user (for instance, a user with a workstation
and a resource server could have up to 32 processors available for
his program).

We also have a very nice windowing package based on Sun's NeWS called
"PIX".  PIX is our implementation of NeWS (if you're not familiar
with NeWS, it's sort of an extension of Display PostScript).

Languages currently available are C, C++, and Fortran.

John Roberts
robertsj@ogcadmin.ogc.edu
"hanging out at Cogent Research, Inc. and the Oregon Graduate Center"

******************************************************************

That's all.  Once again, a hearty thanks to those of you who replied!

John

John Carter                   Internet: retrac@rice.edu
Dept of Computer Science      UUCP: {internet node or backbone}!rice!retrac
Rice University
Houston, TX
Rootin' for the: Rockets, Reds, Dolphins & RedWings
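PS: For readers unfamiliar with the Linda model mentioned in the two
Cogent responses, here is a toy single-process sketch in Python of the
classic out/rd/in operations and the multiple-tuple-space idea.  The
names and semantics are illustrative only - this is not Cogent's or
Yale's actual interface, and a real Linda blocks on in/rd until a
match exists, which this toy version does not.

```python
# Toy tuple-space sketch of the Linda communication paradigm.
# Illustrative only; real Linda operations block until a tuple matches.

class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, *tup):
        """Deposit a tuple into the space."""
        self.tuples.append(tup)

    def _match(self, pattern):
        # None acts as a wildcard ("formal") field in the pattern.
        for tup in self.tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == f for p, f in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        """Read a matching tuple without removing it (None = wildcard)."""
        return self._match(pattern)

    def in_(self, *pattern):
        """Withdraw a matching tuple (named in_ since `in` is reserved)."""
        tup = self._match(pattern)
        if tup is not None:
            self.tuples.remove(tup)
        return tup

# Multiple tuple spaces, the Kernel Linda extension mentioned above:
work, results = TupleSpace(), TupleSpace()
work.out("task", 1)
work.out("task", 2)
t = work.in_("task", None)       # withdraw one task tuple
results.out("done", t[1] * 10)   # publish its result in another space
```

Workers communicate only through the shared spaces, never directly,
which is what makes the model attractive for machines like the XTM
where processes may sit on different Transputers.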