baden@lbl-csam.arpa (Scott Baden [CSR/Math]) (09/13/89)
I'd like to tie together our local bevy of Suns into a multicomputer for solving scientific problems. I need simple facilities to fork off processes and to handle message passing. Are there any software packages out there in netland to do this?

	Scott B. Baden
	Lawrence Berkeley Laboratory
	Berkeley, California
	baden@csam.lbl.gov
	...!ucbvax!csam.lbl.gov!baden
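A minimal baseline for what "simple facilities to fork off processes and handle message passing" amounts to with nothing but stock BSD primitives already on the Suns: fork(2) plus a socketpair(2) between parent and child. This sketch is purely illustrative and is not part of any package mentioned later in the thread.

/* Minimal fork-and-message-pass sketch using stock BSD calls.
 * Illustrative only; not from any package discussed in this thread. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int  fd[2];
    char buf[64];

    /* A connected pair of sockets serves as a simple message channel. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) < 0) {
        perror("socketpair");
        return 1;
    }

    if (fork() == 0) {                          /* child: do the "work" */
        const char *msg = "result from child";
        write(fd[1], msg, strlen(msg) + 1);
        _exit(0);
    }

    /* parent: wait for the child's message */
    read(fd[0], buf, sizeof buf);
    printf("parent got: %s\n", buf);
    return 0;
}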
okrieg@eecg.toronto.edu (Orran Y. Krieger) (09/16/89)
>I'd like to tie together our local bevy of Suns into a multicomputer
>for solving scientific problems. I need simple facilities to
>fork off processes and to handle message passing.
>Are there any software packages out there in netland to do this?

Since you specified message passing, this is probably not of interest to you. However, for my MASc I developed an implementation of distributed shared data on a network of Suns. A brief abstract from a talk I am giving is included below (in LaTeX format). The code is free and (hopefully) mostly debugged.

\centerline{\Large{\bf Distributed Shared Data:}}
\centerline{\Large{\bf An Alternative Approach to Interprocess Communication}}
\centerline{\Large{\bf for Distributed Systems}}
\vspace*{.1 in}
\centerline{Orran Krieger and Michael Stumm}
\centerline{University of Toronto}
\vspace*{.1 in}

In this talk we will describe algorithms that implement the {\em shared data model} of interprocess communication for distributed systems. Traditionally, communication among processes in a distributed system is based on the {\em data passing} model, which includes message passing mechanisms. Although the data passing model is a logical extension of the underlying communication hardware, the {\em shared data} model has a number of advantages, which have made it the focus of recent study\footnote{For example, work by Li, Bisiani and Forin, Cheriton, and Gelernter.}.

The shared data model provides a simpler abstraction to the application programmer. Processes use shared data in the same way they use normal local memory; that is, shared data is accessed through {\tt read} and {\tt write} operations. Since the access protocol is consistent with the way sequential applications access data, there is a more natural transition from sequential to distributed applications. Another advantage of the shared data model is that it hides the remote communication mechanism from the processes, again substantially simplifying the programming of distributed applications. We have found programs\footnote{as measured by lines of source code} using the shared data model to be typically half the size of equivalent programs that use the data passing model. With a shared data facility, complex structures can be passed by reference. In contrast, with the data passing model complex data structures that are passed between processes must be explicitly packed and unpacked.

In this talk we describe how the shared data model can be realized and then focus our attention on a new algorithm we have developed called TORiS. The most original feature of TORiS is that it makes {\em optimistic} assumptions in order to reduce the amount of communication required between processes. All of the algorithms we describe, including TORiS, have been implemented on a network of Sun workstations. We compare performance results of the TORiS implementation with those of other implementations of the shared data model.
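To make the abstract's pack/unpack argument concrete, here is a small illustrative sketch (it is not TORiS, whose interface the abstract does not give): under data passing a linked structure has to be flattened into a buffer before it can cross a process boundary, while under a shared data facility the consumer simply walks the structure in place.

#include <stdio.h>

struct node { int value; struct node *next; };

/* Data passing: the list must be flattened into a buffer before it can
 * be sent to another process, and rebuilt on arrival. */
int pack_list(const struct node *n, int *buf, int max)
{
    int count = 0;
    for (; n != NULL && count < max; n = n->next)
        buf[count++] = n->value;
    return count;              /* caller ships buf[0..count-1] */
}

/* Shared data: the consumer just follows the pointers, exactly as a
 * sequential program would, with no packing or unpacking step. */
long sum_list(const struct node *n)
{
    long total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}

int main(void)
{
    struct node c = { 3, NULL }, b = { 2, &c }, a = { 1, &b };
    int buf[16];

    printf("packed %d values for sending\n", pack_list(&a, buf, 16));
    printf("sum read in place: %ld\n", sum_list(&a));
    return 0;
}

The second routine is exactly what a sequential program would write, which is the "natural transition" the abstract claims.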
childers@decwrl.dec.com (Richard Childers) (09/18/89)
baden@lbl-csam.arpa (Scott Baden [CSR/Math]) writes:

>I'd like to tie together our local bevy of Suns into a multicomputer
>for solving scientific problems. I need simple facilities to
>fork off processes and to handle message passing.
>Are there any software packages out there in netland to do this?

I recall reading about a remote execution daemon that had been developed experimentally a few years ago ... at UC Berkeley. I recall reading about it in the USENIX journal, oh, about three years ago. I'll bet a few hours skimming through old USENIX journals might benefit you.

As I recall, the daemon, when given a job, polled the network to see what machines weren't busy and passed the job out to appropriate servers, also running this daemon. When the job was done, the result was returned.

I think it didn't fly because the overhead, both on the network and on the machines that ran the daemon, was too high to make it cost-effective in the environment it was conceived in. But it would still seem like the ideal core for a loose approach to multiprocessing, provided you had a front end that was capable of dividing the task up into independent pieces that could be processed w/o dependencies ....

>	Scott B. Baden
>	Lawrence Berkeley Laboratory
>	Berkeley, California
>	baden@csam.lbl.gov
>	...!ucbvax!csam.lbl.gov!baden

-- richard

* * * Intelligence : the ability to create order out of chaos. * * *
*  ..{amdahl|decwrl|octopus|pyramid|ucbvax}!avsd.UUCP!childers@tycho *
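A rough sketch of the farm-out idea described above, using nothing fancier than rsh(1) driven through popen(3): each independent piece is shipped to a different host and its output collected. The host names and the "piecework" command are invented for illustration.

/* Sketch of farming independent pieces out to other Suns with rsh.
 * Host names and the "piecework" command are invented for illustration. */
#include <stdio.h>

int main(void)
{
    static const char *hosts[] = { "sun1", "sun2", "sun3" };  /* hypothetical */
    char  cmd[256], line[256];
    int   i;

    for (i = 0; i < 3; i++) {
        FILE *p;

        /* Run piece i remotely; any dependency-free command works here. */
        sprintf(cmd, "rsh %s piecework %d", hosts[i], i);
        p = popen(cmd, "r");
        if (p == NULL) {
            perror("popen");
            continue;
        }
        while (fgets(line, sizeof line, p) != NULL)
            printf("%s: %s", hosts[i], line);     /* collect the result */
        pclose(p);
    }
    return 0;
}

A real dispatcher would start the pieces concurrently (for example with fork) and check remote load first; this only shows the ship-and-collect skeleton.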
kyw@cs.purdue.edu (Ko-Yang Wang) (09/18/89)
In article <6489@hubcap.clemson.edu> avsd!childers@decwrl.dec.com (Richard Childers) writes:
>baden@lbl-csam.arpa (Scott Baden [CSR/Math]) writes:
>
>>I'd like to tie together our local bevy of Suns into a multicomputer
>>for solving scientific problems. I need simple facilities to
>>fork off processes and to handle message passing.
>>Are there any software packages out there in netland to do this?

You may want to look at ISIS from Cornell. ISIS provides some useful functions that allow you to implement distributed and fault-tolerant systems relatively easily. It contains utilities for lightweight tasks, message passing, broadcast, synchronization, etc. They have a beta version (V1.3.1) for an array of different architectures including Suns (SunOS), VAXen (4.3BSD), Gould, HP Spectrum 300 (HP-UX), PC/RT (AIX), Mach, NeXT, and Apollo. Better yet, this package is free and is actively supported by the ISIS group. Contact:

	ISIS Project (Attn: M. Schimizzi)
	Department of Computer Science
	Upson Hall
	Cornell University
	Ithaca, New York 14853
	Phone: 607-255-9198
	E-mail: schiz@cs.cornell.edu
	Contact person: Margaret Schimizzi

>I recall reading about a remote execution daemon that had been developed
>experimentally a few years ago ... at UC Berkeley.
>
>I recall reading about it in the USENIX journal, oh, about three years ago.
>
>I'll bet a few hours skimming through old USENIX journals might benefit
>you. As I recall, the daemon, when given a job, polled the network to see
>what machines weren't busy and passed the job out to appropriate servers,
>also running this daemon. When the job was done, the result was returned.

It is not very difficult to implement (callback) RPC to do this.

>I think it didn't fly because the overhead, both on the network and on the
>machines that ran the daemon, was too high to make it cost-effective in the
>environment it was conceived in. But it would still seem like the ideal core
>for a loose

This depends on your application. I designed a project for a graduate course on computer networks in which the students split a very long inner product into a set of independent vector operations and ran them on a set of Suns. If the vector is long enough the overhead is tolerable. This applies to matrix multiplication and the like as well. However, for more general problems, the data dependences make splitting the program into tasks very difficult.

>approach to multiprocessing, provided you had a front end that was capable
>of dividing the task up into independent pieces that could be processed w/o
>dependencies ....

Well, this front end is much, much more difficult to build than the distributed daemons. In general, the technology for generating code (such as compilers) for distributed machines is still in its infancy. The last requirement (w/o dependences) is too strong for most applications. You should be willing to handle a few dependences with message passing. I don't think there are any tools out there that can handle this yet.

Ko-Yang
-------------------------------------------------------------------------------
Ko-Yang Wang                      | kyw@cs.purdue.edu
Department of Computer Sciences   | Parallel/Distributed Processing, CAPO
Purdue University                 | (317) 494-0813
West Lafayette, IN 47907          |
-------------------------------------------------------------------------------
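The inner-product exercise mentioned above is easy to state in code: cut the vectors into independent chunks, compute each chunk's partial sum on a different machine, and ship only the partial sums back. A hedged sketch of the arithmetic follows; the distribution mechanism (rsh, ISIS, raw sockets) is deliberately left out.

/* Splitting a long inner product into independent chunks, as in the course
 * project described above.  Only the per-chunk arithmetic is shown; how the
 * chunks reach the remote Suns is left out. */
#include <stdio.h>

/* Each worker computes the partial sum for one chunk. */
double chunk_dot(const double *x, const double *y, int len)
{
    double s = 0.0;
    int i;
    for (i = 0; i < len; i++)
        s += x[i] * y[i];
    return s;
}

int main(void)
{
    enum { N = 8, CHUNKS = 2 };
    double x[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    double y[N] = { 8, 7, 6, 5, 4, 3, 2, 1 };
    double total = 0.0;
    int c, len = N / CHUNKS;

    /* In the real exercise each iteration would run on a different machine;
     * the partial sums are the only data that must be sent back. */
    for (c = 0; c < CHUNKS; c++)
        total += chunk_dot(x + c * len, y + c * len, len);

    printf("dot product = %g\n", total);      /* prints 120 */
    return 0;
}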
brian@june.cs.washington.edu (Brian Bershad) (09/19/89)
avsd!childers@decwrl.dec.com (Richard Childers) writes:

>I recall reading about a remote execution daemon that had been developed
>experimentally a few years ago ... at UC Berkeley.
>
>I recall reading about it in the USENIX journal, oh, about three years ago.
>
>I'll bet a few hours skimming through old USENIX journals might benefit
>you. As I recall, the daemon, when given a job, polled the network to see
>what machines weren't busy and passed the job out to appropriate servers,
>also running this daemon. When the job was done, the result was returned.
>
>I think it didn't fly because the overhead, both on the network and on the
>machines that ran the daemon, was too high to make it cost-effective in the
>environment it was conceived in. But it would still seem like the ideal core
>for a loose approach to multiprocessing, provided you had a front end that
>was capable of dividing the task up into independent pieces that could be
>processed w/o dependencies ....

The system was called Maitrd and it 'flew' quite well. I think it is still in use on a cluster of 750's and a 785 at Berkeley for balancing compiles and text formatting.

The majority of the overhead came from not having any kind of network file system. File preprocessing was done locally and the result shipped to whichever server was currently available. For C programs, this was easy. It was a bit hairier for Pascal code, which had intermodule type checking (with files being treated as types).

There was almost no overhead to running the daemons. Servers decided whether they were willing to accept work or not, and that information was broadcast only when it changed. Clients round-robined the jobs off to the available servers only when their own load was above threshold.

The biggest drawback was the effort required to build a new remote service, which had to act as a shell for the job on the remote end. But, since there were only a few tasks that had to be load-balanced, the one-time investment was worth it. RPC would not have made this any easier.

A paper showed up in a USENIX newsletter in early 1986: "Load Balancing With Maitrd".

People have been doing this kind of stuff for years.

		Brian
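Brian's description suggests a simple protocol: servers broadcast a willing/unwilling notice only when their state changes, and a loaded client hands jobs round-robin to whichever servers are currently willing. The following is a reconstruction of the client-side bookkeeping only, invented for illustration; it is not Maitrd's actual source, and the datagram plumbing and job shipping are omitted.

/* Client-side bookkeeping in the style described above: remember which
 * servers have announced themselves willing, and hand jobs out round-robin.
 * A reconstruction for illustration, not Maitrd's actual source. */
#include <stdio.h>
#include <string.h>

#define MAXSRV 16

static char servers[MAXSRV][32];   /* hosts currently willing to accept work */
static int  nservers = 0;
static int  next = 0;              /* round-robin cursor */

/* Called when a server broadcasts a change in its willingness. */
void note_server(const char *host, int willing)
{
    int i;

    for (i = 0; i < nservers; i++)
        if (strcmp(servers[i], host) == 0)
            break;

    if (willing && i == nservers && nservers < MAXSRV) {
        strcpy(servers[nservers++], host);        /* newly willing: add it */
    } else if (!willing && i < nservers) {        /* withdrew: drop it */
        nservers--;
        if (i != nservers)
            strcpy(servers[i], servers[nservers]);
    }
}

/* Pick the next available server, or NULL to run the job locally. */
const char *pick_server(double localload, double threshold)
{
    if (localload < threshold || nservers == 0)
        return NULL;               /* cheap enough to do it here */
    next = (next + 1) % nservers;
    return servers[next];
}

int main(void)
{
    note_server("sunA", 1);        /* hypothetical hosts */
    note_server("sunB", 1);
    printf("ship to: %s\n", pick_server(5.0, 1.5));
    printf("ship to: %s\n", pick_server(5.0, 1.5));
    note_server("sunA", 0);
    printf("ship to: %s\n", pick_server(5.0, 1.5));
    return 0;
}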
jevans@uunet.UU.NET (David Jevans) (09/25/89)
The JIPC system from the University of Calgary is an inter-Sun IPC system with remote process creation, etc. One of the features of JIPC is its transparent support of non-homogeneous networks (i.e., data type conversion where required). It is simple to get distributed programs running under JIPC, and a network traffic monitor/debugger is available. A number of sites use JIPC on small to large networks of Suns and VAXen, including several universities and the Naval Research Lab in DC. Contact the University of Calgary Department of Computer Science.

-dj

David Jevans
University of Calgary, Department of Computer Science
Calgary, Alberta, Canada T2N 1N4
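The heterogeneity point matters because Suns and VAXen disagree on byte order, so "data type conversion if required" is exactly what keeps integers intact across the wire. Below is a minimal sketch of the conventional BSD fix using htonl/ntohl; this shows the standard approach, not JIPC's own mechanism.

/* Why transparent type conversion matters on a mixed Sun/VAX network:
 * the two byte orders differ, so integers must cross the wire in one
 * agreed-upon order.  Standard BSD htonl/ntohl shown here; this is the
 * conventional approach, not JIPC's own mechanism. */
#include <stdio.h>
#include <arpa/inet.h>         /* htonl, ntohl */

#define NVALS 4

int main(void)
{
    unsigned int vals[NVALS] = { 1, 256, 65536, 16777216 };
    unsigned int wire[NVALS];
    int i;

    /* Sender: convert host order to network (big-endian) order. */
    for (i = 0; i < NVALS; i++)
        wire[i] = htonl(vals[i]);

    /* Receiver (possibly a different architecture): convert back. */
    for (i = 0; i < NVALS; i++)
        printf("%u\n", (unsigned)ntohl(wire[i]));

    return 0;
}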