retrac@rice.edu (John Carter) (06/09/89)
Several weeks back I posted a request for the specs of commercially
available distributed-memory multiprocessors (which I didn't define).
The responses that I received are included below.  I'd like to thank
everyone who responded for their time and effort!  I appreciate it!
I'm sorry that I didn't post the summary earlier, but I've been out
of town.

Note: Several of the responses point to where I can get more
information (such as corporate addresses, books, or tech reports).
I have not yet followed them up, but when I do I'll post an update.

Note II: Followups are redirected to comp.arch.

******************************************************************

Hey.  Since you used it as an example, I thought I'd let you know -
Ametek is out of business.

Greg Frazier
--
Greg Frazier        o       Internet: frazier@CS.UCLA.EDU
CS dept., UCLA     /\       UUCP: ...!{ucbvax,rutgers}!ucla-cs!frazier
               ----^/----
                   /

******************************************************************

Don't forget the Connection Machine 2 (CM2) from Thinking Machines
Corp., Cambridge, MA.

-- David Browning --
UUCP:   ...!{rutgers,ucbvax,sdcrdcf,{hao!cepu}}!ucla-se!math.ucla.edu!browning
ARPA:   browning@math.ucla.edu
BITNET: browning%math.ucla.edu@INTERBIT

******************************************************************

The book "Highly Parallel Computing" by Almasi and Gottlieb
(Benjamin/Cummings Publishing Co., Inc., c. 1989) has descriptions of
a fair number of machines in its later chapters.  Chapter 10, "MIMD
Parallel Architectures," would probably have what you are looking
for.  It doesn't go into great technical depth, though.

-- Jonathan Sweedler === National Semiconductor Israel
UUCP:    ...!{amdahl,hplabs,decwrl}!nsc!taux01!cjosta
Domain:  cjosta@taux01.nsc.com

******************************************************************

Transputers from INMOS: these are a family of processors from which
distributed-memory multiprocessors can be built.
There are dozens of articles describing them; I will mention only
one - my own (!).  This is:

    Solving problems with Transputers: background and experience
    John Wexler and Dominic Prior
    Microprocessors and Microsystems (Butterworth Scientific,
    Guildford, UK), vol. 13, no. 2, March 1989, pp. 67-78
    ISSN 0141-9331   MCRPD 13 (2) 65-144 (1989)

The whole of the March and April issues is devoted to Transputers.

Answers to your questions (these are only order-of-magnitude figures,
because I don't have the references with me and I don't have time to
look them up):

> speed of nodes

There is a family of processors at various levels of performance.
The top of the range in common use (not the latest thing) is the
T800-20, which runs at around 10 (RISCish) MIPS.  It has
ANSI-standard 64-bit floating point on the chip, which can sustain
1.5 Mflops.

> number of nodes

Any number.  The Transputer is the node: the purchaser, or an OEM,
can assemble Transputers into any kind of network, either by
hardwiring them together or by connecting them through some kind of
circuit-switched network.  Meiko is an example of a company that
builds large systems, sometimes using several hundred Transputers.

> speed and type of interconnect

Each processor has four "links" on the chip with it.  A link is an
autonomous DMA engine that can handle one inward and one outward
transfer at any one time.  Thus a Transputer can be running 8
transfers simultaneously with processing.  A link on one Transputer
can be connected directly to a link on another with no intermediate
logic, buffering, or electronics; only two wires are needed, because
the transfers are bit-serial.  A link can run at 20 Mbits/second, but
I'm not going to explain how simultaneous transfers in and out, or on
several links, can affect one another's speeds.  Connections are
direct point-to-point: one link on one processor can talk to one link
on one other processor.  These connections are usually static, at
least for the lifetime of a run of a program.
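Since each chip has only four links and connections are point-to-point,
any hardwired Transputer topology is limited to node degree four.  A
toy Python sketch (the class and function names are my own
illustration, not any INMOS interface) shows why a 2-D torus fits the
constraint exactly:

```python
# Toy model of wiring Transputers into a network.  Each chip has four
# autonomous DMA links, so a directly-wired topology cannot exceed
# node degree 4.  Illustrative only - not an INMOS API.

class Transputer:
    MAX_LINKS = 4  # T800: four bit-serial links on-chip

    def __init__(self, ident):
        self.ident = ident
        self.links = []  # point-to-point connections to other chips

def connect(a, b):
    """Hardwire one link on `a` to one link on `b` (two wires, no glue logic)."""
    if len(a.links) >= Transputer.MAX_LINKS or len(b.links) >= Transputer.MAX_LINKS:
        raise ValueError("no free link on one of the chips")
    a.links.append(b.ident)
    b.links.append(a.ident)

# A 4x4 torus uses exactly four links per node, so it can be built
# from bare Transputers with no switching hardware at all.
N = 4
nodes = {(x, y): Transputer((x, y)) for x in range(N) for y in range(N)}
for x in range(N):
    for y in range(N):
        connect(nodes[(x, y)], nodes[((x + 1) % N, y)])  # east neighbour
        connect(nodes[(x, y)], nodes[(x, (y + 1) % N)])  # north neighbour

# Every node has used all four of its links.
assert all(len(t.links) == Transputer.MAX_LINKS for t in nodes.values())
```

A hypercube of dimension five or more, by contrast, would need more
than four links per node and so requires either multiple Transputers
per node or external switching.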
> broadcast/multicast support

None in hardware, nor any support for message forwarding.  Those
things have to be done by software, or by non-Transputer hardware in
the interconnection network.

> memory size

4 Kbytes on the chip; 32-bit byte addressing for off-chip memory.
The user or OEM can supply as much off-chip memory as desired.

> memory speed

It uses a 32-bit architecture and has a clock speed of 20 MHz:
off-chip memory is accessed in 3 cycles, and there are 4 Kbytes of
single-cycle memory on-chip with the processor.

> caching strategy

None (or, to put it another way, determined by software).  The
processor's on-chip memory can be used just like any other part of
memory: a user, or a compiler, or whatever, may choose to use it to
hold frequently-needed code or frequently-needed data, or may not
bother to make any particular decision about how to use it.  If it
does make a choice, it may make a static decision ("I'll keep the
high-priority code on-chip") or it may have a dynamic strategy, which
is equivalent to caching.

Incidentally, just as there is no caching, so there is no large
battery of registers such as one commonly finds in processors that
are called RISC.  INMOS took the line that on-chip memory could serve
the same purpose as dozens of registers.  There is no memory
protection either, nor support for virtual memory.  These decisions
are a bit eccentric, but they are not without justification.  They
are not immutable, either; these features may appear in newer
Transputers.

> cost

Don't know.

I hope this is of some use,
John Wexler

******************************************************************

Hi!  I think your work is very useful, and I would like to get a copy
of your final collection.  I can think of the following items:

1. Inmos Ltd., "Transputer Reference Manual", Prentice-Hall.

2. Thinking Machines Corp., "Connection Machine Model CM-2 Technical
   Summary", Thinking Machines Tech. Rept. HA87-4.

3. Also, I suggest you send mail or email to Prof. K.
Hwang at Dept. of EE-Systems, USC, Los Angeles, CA 90089-0781,
"kaihwang@panda.usc.edu".  He has some data on super and minisuper
computers.

Bill Xu

******************************************************************

Commercial machines: NCube, iPSC, Ametek, Flex, {a zillion Transputer
things - ask Inmos}, etc.

Research machines: PASM, ZMob, etc. ... there are perhaps a hundred
that I know of.

But you have a real problem: define "distributed memory."  E.g., is
the BBN Butterfly a distributed-memory machine?

-hankd@ee.ecn.purdue.edu

PS: Are we doing your homework for you?
[NOTE: Homework?  No.  Legwork?  Yeah, somewhat.]

******************************************************************

You should get some information on the system from Cogent Research.
I work for them, so I'm biased, but I wouldn't work here unless I
thought this was the neatest thing going.

The base system has from 2 to 32 processors, but is expandable beyond
that.  Unlike most parallel computers, this one comes with a display,
keyboard, and mouse (sound, even); it runs a UNIX-like parallel
operating system and has a user interface based on NeWS.  The
operating system is based on the Linda parallel communication
paradigm developed at Yale, which makes parallel programming much
easier.

I can have the marketing people send you some more information if you
are interested, or you can contact them yourself:

    Cogent Research
    1100 NW Compton Drive
    Beaverton, OR 97006
    503/690-1450

******************************************************************

I do not have any specific data (yet; I hope to get some measurements
this summer), but the following machines are distributed-memory
parallel computers:

    HP Mayfly (formerly FAIM) - hexagonal interconnect
    Myrias SPS-2 (Edmonton, Alberta) - 64-1024 68020s, 64 per
        cage/same backplane

Hopefully, someone with details will supply more information.  If
not, let me know.  I have more information about these machines, but
I'm not sure how current/accurate it is.
Lynn Sutherland
suther@skorpio.usask.ca

******************************************************************

Here at Intel Scientific Computers, we build the iPSC/2.  I'll see if
I can get some of our basic info stuff sent out to you.  For better
response, though, talk to your local iSC rep.  His name is Sam Welsh,
and he can be contacted at:

    Sam Welsh
    Intel Scientific Computers
    7322 Southwest Freeway, Suite 1490
    Houston, Texas 77074
    713-988-8086

Happy Days...
--
============================================================================
Shannon Nelson              "Live, from the arctic reaches of ..."
iSC Evaluation
shannon@isc.intel.com       ...!uunet!littlei!intelisc!shannon
(503) 629-7607              Intel disclaims all knowledge of my existence.

******************************************************************

I saw your comp.arch posting.  Here's some info about the Cogent
Research XTM system (a distributed-memory system).

Here is a description of the overall Cogent machine hardware.  The
Cogent Workstation consists of two Inmos T800 Transputers, each with
4 Mbytes of RAM, a 90-Mbyte internal hard drive, and an 800K floppy
drive.  The monitor is 1024 by 808 with 256 colors (8 bits per
pixel).

A workstation can be connected to one or more computer servers, which
we call "Resource Servers."  A Resource Server consists of a
backplane with 16 slots.  One slot is dedicated to a communication
board that connects to other Resource Servers or workstations.  The
other 15 slots can hold processor boards.  Each processor board
contains two T800s, again each with 4 Mbytes of RAM.  In addition to
bus communication, there is a crossbar switch to connect links on any
two Transputers in the Resource Server.  The idea is that short
messages will go rapidly over the bus, and longer messages will go
over the serial links.

A Resource Server can have up to 30 processors with 120 Mbytes of
RAM.  This is in addition to a workstation's 2 processors and 8
Mbytes of RAM.
Our operating system is called XTMOS (our workstation is called an
"XTM Workstation", hence XTM-OS).  It is highly compatible with the
POSIX standard.  A host of utilities that are normally provided with
*IX systems are provided with our system (such as grep, awk, sed, vi,
emacs, diff, more, tar, compress, etc.).

The underlying communication system is a simplified and extended
version of Yale Linda called "Kernel Linda".  It is implemented at a
very low level of our OS (for speed).  A full Yale Linda
implementation is in the works; however, Kernel Linda also has some
extensions not present in Yale Linda (such as multiple tuple spaces,
necessary for our multitasking environment).  It is possible to write
parallel programs using Kernel Linda that use all the processors
available to any one user (for instance, a user with a workstation
and a resource server could have up to 32 processors available for
his program).

We also have a very nice windowing package based on Sun's NeWS called
"PIX".  PIX is our implementation of NeWS (if you're not familiar
with NeWS, it's sort of an extension of Display PostScript).

Languages currently available are C, C++, and Fortran.

John Roberts
robertsj@ogcadmin.ogc.edu
"hanging out at Cogent Research, Inc. and the Oregon Graduate Center"

******************************************************************

That's all.  Once again, a hearty thanks to those of you who replied!

John

John Carter                   Internet: retrac@rice.edu
Dept of Computer Science      UUCP: {internet node or backbone}!rice!retrac
Rice University
Houston, TX
Rootin' for the: Rockets, Reds, Dolphins & RedWings
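PS: For readers unfamiliar with the Linda model mentioned in the two
Cogent responses, here is a toy single-process sketch in Python of the
classic out/rd/in operations and the multiple-tuple-space idea.  The
names and semantics are illustrative only - this is not Cogent's or
Yale's actual interface, and a real Linda blocks on in/rd until a
match exists, which this toy version does not.

```python
# Toy tuple-space sketch of the Linda communication paradigm.
# Illustrative only; real Linda operations block until a tuple matches.

class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, *tup):
        """Deposit a tuple into the space."""
        self.tuples.append(tup)

    def _match(self, pattern):
        # None acts as a wildcard ("formal") field in the pattern.
        for tup in self.tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == f for p, f in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        """Read a matching tuple without removing it (None = wildcard)."""
        return self._match(pattern)

    def in_(self, *pattern):
        """Withdraw a matching tuple (named in_ since `in` is reserved)."""
        tup = self._match(pattern)
        if tup is not None:
            self.tuples.remove(tup)
        return tup

# Multiple tuple spaces, the Kernel Linda extension mentioned above:
work, results = TupleSpace(), TupleSpace()
work.out("task", 1)
work.out("task", 2)
t = work.in_("task", None)       # withdraw one task tuple
results.out("done", t[1] * 10)   # publish its result in another space
```

Workers communicate only through the shared spaces, never directly,
which is what makes the model attractive for machines like the XTM
where processes may sit on different Transputers.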