baden@lbl-csam.arpa (Scott Baden [CSR/Math]) (10/04/89)
I collected quite a few responses to my query on how
to connect an ensemble of suns into a multiprocessing team.
Here is a summary of what I received.
(It took some time for the mail to percolate overseas
and back, which is the reason for the delay in my reply.)
Some entries are undoubtedly incomplete.
Corrections and additions are appreciated.
Thanks to all those who contributed!
Scott B. Baden
Lawrence Berkeley Laboratory
Berkeley, California
baden@csam.lbl.gov
...!ucbvax!csam.lbl.gov!baden
I heard about 9 different projects:
1. ISIS (Cornell)
2. Cosmic Environment (Caltech)
3. DOMINO (U. Maryland, College Park)
4. DPUP (U. of Colorado)
5. TORiS (Toronto)
6. LINDA (Yale, Scientific Computing Associates)
7. SR (U. Arizona)
8. MAITRD (U.C. Berkeley/U. Wash)
9. PARMACS (Argonne)
1. ISIS
ISIS is billed as "a toolkit for distributed and fault-tolerant
programming."
It runs on "UNIX on SUN, DEC, GOULD, and HP systems, although
ports to other UNIX-like systems are planned for the future."
The manual is 300 pages long.
If you want ISIS, send mail to
"croft@gvax.cs.cornell.edu," subject "I want ISIS".
2. COSMIC ENVIRONMENT
"The Cosmic Environment (CE) is a generic message-passing multicomputer
control environment ... The goal of CE is to provide a simple
and uniform interface for multicomputers and to allow for the writing
of truly portable application programs. CE is a distributed
environment directly accessible from any UNIX machine connected
to the same TCP/IP network."
"The CE currently supports the following machines:
1) iPSC/1
2) iPSC/2
3) Symult (formerly Ametek) 2010
4) The Cosmic Cube
5) A set of NFS connected work stations pretending to be a
real concurrent machine. Most people use SUN work stations.
People around here call such a cube a "ghost cube"."
You can obtain a programming guide by sending e-mail
to chuck@vlsi.caltech.edu, or postal mail to:
Charles L. Seitz
CS 256-80, Caltech
Pasadena, Ca 91125
3. DOMINO
DOMINO is a message passing environment for parallel computation.
See the Computer Science Dept. (U. Maryland) tech report # TR-1648
(April, 1986) by D. P. O'Leary, G. W. Stewart, and R. A. van de Geijn.
I quote:
"DOMINO is a set of C-language routines with a short assembly language
interface that allows multiple tasks to communicate and schedule
local tasks for execution. These tasks may be on a single processor
or spread among multiple processors connected by a message-passing
network."
You can get a copy of domino from netlib; to get instructions
send mail to one of:
na.netlib@na-net.stanford.edu
netlib@research.att.com
with subject "send index for domino" (or you can put this
in the message-body.)
The members of the DOMINO project can be reached through
ARPANET or NANET at the following addresses.
oleary@mimsy.umd.edu
na.oleary@su-score.arpa
stewart@mimsy.umd.edu
na.pstewart@su-score.arpa
rvdg@mimsy.umd.edu
4. DPUP
DPUP stands for Distributed Processing Utilities Package.
What follows is an abstract from a technical report
written at the Computer Science Dept. at the University of Colorado
by T. J. Garner et al.:
"DPUP is a library of utilities that support distributed concurrent
computing on a local area network of computers.
The library is built upon the interprocess communication
facilities in Berkeley Unix 4.2BSD."
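The interprocess communication facilities of 4.2BSD are the socket calls. As a rough illustration of what a message-passing layer on top of them has to do, here is a minimal sketch in Python. It is not DPUP's interface: the function names send_msg and recv_msg and the 4-byte length prefix are assumptions made for this example. A connected socket pair stands in for two processes on the network.

```python
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    """Send one message: a 4-byte big-endian length, then the bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return fewer per call."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock) -> bytes:
    """Receive one message framed by send_msg."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)


if __name__ == "__main__":
    # Two connected endpoints in one process, standing in for two machines.
    left, right = socket.socketpair()
    send_msg(left, b"hello from node 0")
    print(recv_msg(right))
    left.close()
    right.close()
```

The length prefix is there because a stream socket delivers bytes, not messages; without some framing the receiver cannot tell where one message ends and the next begins.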
5. TORiS
TORiS implements a shared memory communication model.
Contact Orran Krieger at the University of Toronto for more information:
UUCP: {decvax,ihnp4,linus,utzoo,uw-beaver}!utcsri!eecg!okrieg
ARPA: okrieg%eecg.toronto.edu@relay.cs.net
CSNET: okrieg@eecg.toronto.edu
CDNNET: okrieg@eecg.toronto.cdn
6. LINDA
Linda is a parallel programming language built around a shared
memory model (a shared "tuple space"). It is simple and has only
six operators. C-Linda has been implemented for a network of SUNs
in the internet domain.
With LAN-LINDA (also called TSnet) you can write parallel or
distributed programs in C and run them on a network of workstations.
TSnet has been tested on Sun and IBM RT workstations.
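Linda's operators act on a shared pool of tuples (the "tuple space"): out deposits a tuple, in withdraws a matching one, and rd reads one without removing it. To give a feel for the model, here is a toy sketch in Python. It is not C-Linda or TSnet code: the class name TupleSpace, the method name in_ ("in" is a Python keyword), and the use of None as a wildcard are all assumptions of this example. It runs inside a single process with threads, and covers only three of the six operators.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space (hypothetical; not the C-Linda API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *fields):
        """Deposit a tuple and wake any threads waiting for a match."""
        with self._cond:
            self._tuples.append(tuple(fields))
            self._cond.notify_all()

    def _find(self, pattern):
        # None in the pattern matches any field (an assumption of this sketch).
        for t in self._tuples:
            if len(t) == len(pattern) and all(
                p is None or p == f for p, f in zip(pattern, t)
            ):
                return t
        return None

    def _wait_for(self, pattern, remove):
        with self._cond:
            found = self._find(pattern)
            while found is None:
                self._cond.wait()
                found = self._find(pattern)
            if remove:
                self._tuples.remove(found)
            return found

    def rd(self, *pattern):
        """Block until a tuple matches; return it and leave it in place."""
        return self._wait_for(pattern, remove=False)

    def in_(self, *pattern):
        """Block until a tuple matches; remove it and return it."""
        return self._wait_for(pattern, remove=True)


if __name__ == "__main__":
    ts = TupleSpace()
    worker = threading.Thread(target=lambda: ts.out("result", 6 * 7))
    worker.start()
    print(ts.in_("result", None))  # blocks until the worker deposits a tuple
    worker.join()
```

The point of the model is that the producer and the consumer never name each other; they only agree on the shape of the tuple.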
Contact David Gelernter (project head) or Mauricio Arango at:
gelernter@cs.yale.edu
arango@cs.yale.edu
TSnet and other Linda systems are being distributed through
Scientific Computing Associates.
Contact
Dennis Philbin
Scientific Computing Associates
246 Church St., Suite 307
New Haven, CT 06510
203-777-7442
7. SR
I quote:
"SR (Synchronizing Resources) is designed for writing distributed
programs. The main language constructs are resources and operations.
Resources encapsulate processes and variables they share;
operations provide the primary mechanism for process interaction.
SR provides a novel integration of the mechanisms for invoking
and servicing operations. Consequently, all of local and remote
procedure call, rendezvous, message passing, dynamic process
creation, multicast, and semaphores are supported. An overview of the
language and implementation appeared in the January, 1988, issue of
TOPLAS (ACM Transactions on Programming Languages and Systems 10,1, 51-86).
SR runs on various machines including (among others): Vax, Sun, NeXT,
and Multimax Encore.
"An SR program runs on one or more networked machines of the same
architecture.
"SR is available by anonymous FTP from Arizona.EDU (128.196.128.118 or
192.12.69.1)."
[Copy over the README file for an explanation.]
You may reach the members of the SR project electronically at:
uunet!Arizona!sr-project
or by surface mail at:
SR Project
Department of Computer Science
University of Arizona
Tucson, AZ 85721
(602) 621-2018
8. MAITRD
"The maitr'd software is [a] remote process server that is designed to
farm out cpu expensive jobs to less loaded machines. It has a small
amount of built-in intelligence, in that it attempts to send jobs to
the least loaded machine of the set which is accepting off-site jobs."
`Maitrd' is available via anonymous ftp from
june.cs.washington.edu (128.95.1.4) as ~ftp/pub/Maitrd.tar.Z.
There is also a heterogeneous systems rpc package `hrpc.tar.Z'.
Contact Brian Bershad at U. Washington (brian@june.cs.washington.edu)
for more information.
A paper showed up in a Usenix newsletter in early 1986:
"Load Balancing With Maitrd"
(This is also a U.C. Berkeley C.S. Division Technical report)
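The placement rule quoted above (send the job to the least loaded machine among those accepting off-site jobs) can be sketched in a few lines of Python. This is not Maitrd code: the function name pick_host, the table layout, and the use of a single load figure per host are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def pick_host(hosts: Dict[str, Tuple[float, bool]]) -> Optional[str]:
    """Choose a host for a job.

    hosts maps a host name to (load, accepting_offsite_jobs).
    Returns the accepting host with the lowest load, or None if
    no host is currently accepting off-site jobs.
    """
    candidates = {
        name: load for name, (load, accepting) in hosts.items() if accepting
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical load table; the host names are made up.
    table = {
        "sun1": (2.5, True),
        "sun2": (0.3, False),  # idle, but not accepting off-site jobs
        "sun3": (1.1, True),
    }
    print(pick_host(table))
```

Note that the idlest machine is skipped when it is not accepting off-site jobs; a real server would also have to cope with load figures that are stale by the time the job arrives.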
9. PARMACS
David Levine at Argonne National Laboratory tells us about a
"generic package to do send/recv message passing" with
"different versions (c, c++, fortran) [that] work on different machines."
For an index of the package, send email to netlib@mcs.anl.gov with subject
(or body) "send index from parmacs".
For more information, send email to levine@mcs.anl.gov,
or use uucp: {alliant,sequent,rogue}!anlams!levine.
10. OTHER REFERENCES
========================================================================
Tom Slezak (slezak@lll-lcc.llnl.gov) wrote an article called
"Quick and dirty parallel processing on a network of workstations."
========================================================================
Bart Miller at U. Wisconsin, Madison (bart@cs.wisc.edu) has a package
written "several years ago for connecting processes together
to send messages.... The processes can be on the same machine,
or different machines (it doesn't matter)."
========================================================================
Try the book "Portable Programs for Parallel Processors" by
James Boyle, Ralph Butler, Terrence Disz, Barnett Glickfeld, Ewing
Lusk, Ross Overbeek, James Patterson, and Rick Stevens; ISBN 0-03-014153-2.
From the preface:
"This book describes a set of tools [written in C] that were developed
at Argonne National Laboratory to enable us to explore issues of
program performance and portability on a fairly broad range of parallel
machines." ... "The original set of tools was developed only for
shared-memory machines. Later, we added the tools to support message
passing among machines that do not have shared memory."
========================================================================
At New Mexico State University there is
"a TCP Remote Message Passing Service which runs on [a]
network of Suns." A daemon is used to control message passing
activity. This software has been in use for "a couple of years."
Contact:
Cari Soderlund
Computing Research Laboratory
New Mexico State University
Box 30001
Las Cruces, NM 88003
cari@nmsu.edu
========================================================================
R. Kannan in the Concurrent Engineering Research Center
at West Virginia University (kannan@cerc.wvu.wvnet.edu)
reports on the development of a tool to handle message passing
activity in a network transparent fashion.
He also mentions a shared memory system called
"User Level Shared variables," and gives us a contact:
Don Libes
Factory Automation Systems Division
NIST (previously NBS)
Gaithersburg, MD 20899
========================================================================
DPSK (Distributed Problem Solver Kernel):
contact Gregg Podnar (gwp@edrc.cmu.edu)
at the Engineering Design Research Center at CMU.
========================================================================
simon%castle.edinburgh.ac.uk@NSFnet-Relay.AC.UK
of Meiko Scientific Ltd in Edinburgh, Scotland replies:
"Meiko's CS-Tools package handles message passing in a heterogeneous network
of Suns and Meiko transputer boxes. It's generally used as a vehicle for
accessing transputer power, but can run simply on a sun network.
Meiko can be contacted in North America at:
Meiko Scientific
Reservoir Place, 1601 Trapelo Road
Waltham MA 02154
Phone: 617 890 7676
========================================================================
Kevin Hammond at the Univ. of Glasgow (kh%cs.glasgow.ac.uk@NSFnet-Relay.AC.UK)
has some examples of socket/RPC code that could help novices.
(Kevin reports some difficulties with the Sun documentation.)