[comp.unix.wizards] Load balancing rsh

tamir@ucla-cs.UUCP (06/06/87)

In the contrib directory of 4.3 BSD there is a program called dsh
that behaves like rsh except that it picks the least loaded
machine (out of a list of machines) on which to run the command.
The program starts out by getting "bids" from the list
of servers.  It picks the least loaded one and uses rsh
to run the commands on it.

I am looking for a "second generation" version of this program.
There seem to be many ways to improve functionality and efficiency.
I am interested in any information about an improved version
of dsh or a completely new program with similar functionality.

			   Yuval Tamir

Internet: tamir@cs.ucla.edu
    UUCP: ...!{ihnp4,ucbvax,sdcrdcf,trwspp,randvax,ism780}!ucla-cs!tamir

jbn@glacier.UUCP (06/07/87)

In article <6450@shemp.UCLA.EDU> tamir@CS.UCLA.EDU (Yuval Tamir) writes:
>In the contrib directory of 4.3 BSD there is a program called dsh
>that behaves like rsh except that it picks the least loaded
>machine (out of a list of machines) on which to run the command.
>...
>I am looking for a "second generation" version of this program.

      It's amusing to hear this from someone in the UCLA computer
science department.  See Popek and Walker, "The LOCUS Distributed
Operating System" (ISBN 0-262-16102-8), which describes a system that,
among many other things, offers exactly the facilities Tamir wants
and is, according to the book, running at the UCLA Computer Science
Department on a large scale (a figure of a half million connect hours
is mentioned, spread across a large number of machines of different types).
Locus supposedly supports reliable, distributed, redundant files; transparent
load-sharing; transparent inter-machine forking; and transparent execution of
programs on remote machines of a different CPU architecture.

      Whatever happened to LOCUS, anyway?

					John Nagle

tamir@CS.UCLA.EDU (06/07/87)

In article <17092@glacier.STANFORD.EDU> jbn@glacier.UUCP (John B. Nagle) writes:
#In article <6450@shemp.UCLA.EDU> tamir@CS.UCLA.EDU (Yuval Tamir) writes:
#>In the contrib directory of 4.3 BSD there is a program called dsh
#>that behaves like rsh except that it picks the least loaded
#>machine (out of a list of machines) on which to run the command.
#>...
#>I am looking for a "second generation" version of this program.
#
#      It's amusing to hear this from someone in the UCLA computer
#science department.  See Popek and Walker, "The LOCUS Distributed
#Operating System" (ISBN 0-262-16102-8),  . . .
#      Whatever happened to LOCUS, anyway?

To clear up any misunderstandings:
Locus is alive and well and living on a large number of machines at UCLA!
However, it has not been ported to Suns, IBM RTs, Apollos, or HP workstations.
What I am looking for is a generic 4.[23] BSD program that can be easily
ported to most of these machines.  The functionality I want is *very simple*:
poll some machines for their load average, pick the minimum,
and use rsh to run the program on the least loaded machine.
Input will be through standard input, output will go to standard output.
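
A minimal sketch of the behavior described above, under stated assumptions:
each host is polled by running "uptime" through rsh and parsing the first
load-average figure, which is only one possible way to collect bids and is
not necessarily how dsh does it.  The host names in HOSTS and the helper
names parse_load, poll_load, and least_loaded are hypothetical.

```python
import re
import subprocess
import sys

# Hypothetical host list; a real tool would read this from a file or option.
HOSTS = ["hosta", "hostb", "hostc"]


def parse_load(uptime_output):
    """Return the first load-average figure in uptime-style output, or None."""
    m = re.search(r"load averages?:\s*([0-9]+(?:\.[0-9]+)?)", uptime_output)
    return float(m.group(1)) if m else None


def poll_load(host, timeout=5):
    """Ask one host for its load via 'rsh host uptime' (assumed polling method)."""
    try:
        result = subprocess.run(
            ["rsh", host, "uptime"],
            stdin=subprocess.DEVNULL,  # keep the poll from eating our stdin
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # unreachable or slow hosts simply do not bid
    return parse_load(result.stdout)


def least_loaded(loads):
    """Pick the host with the minimum load; hosts with load None are skipped."""
    bids = {host: load for host, load in loads.items() if load is not None}
    if not bids:
        return None
    return min(bids, key=bids.get)


def main(command):
    if not command:
        print("usage: prog command [args...]", file=sys.stderr)
        return 2
    host = least_loaded({h: poll_load(h) for h in HOSTS})
    if host is None:
        print("no host answered", file=sys.stderr)
        return 1
    # stdin and stdout are inherited, so input and output pass straight through.
    return subprocess.call(["rsh", host] + command)


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Polling the hosts one after another is the simplest thing that works; polling
them in parallel would cut the startup delay when the list is long.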

			   Yuval Tamir

Internet: tamir@cs.ucla.edu
    UUCP: ...!{ihnp4,ucbvax,sdcrdcf,trwspp,randvax,ism780}!ucla-cs!tamir

djl@mips.UUCP (Dan Levin) (06/09/87)

Perhaps more interesting in the long term would be a better version of
"on", the RPC based remote execution tool.  If it knew how to use
the facilities of the stat daemon, it could apply a reasonable 
weighted formula to the load information based on CPU type, memory available,
disk type, size of executing image, OS, etc.  This would provide a
truly useful tool, which would run on any box supporting RPC.
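
One way such a weighted formula might look, as a rough sketch only: the
fields, the weights, and the formula itself are assumptions made up for
illustration, not anything "on" or the stat daemon defines.  It scales load
by a relative CPU speed and penalizes hosts where the executing image would
not fit in free memory; gathering the numbers is left out.

```python
from dataclasses import dataclass


@dataclass
class HostInfo:
    """Per-host figures; all fields are hypothetical inputs for the sketch."""
    name: str
    load: float        # load average reported by the host
    cpu_speed: float   # relative speed, 1.0 = some baseline machine
    free_mem_kb: int   # memory currently available


def score(host, image_kb, mem_weight=2.0):
    """Return a cost for running a job on host; lower is better.

    (load + 1) / cpu_speed estimates how slowly the job would run once it
    joins the existing load.  mem_weight is an arbitrary tuning knob.
    """
    cost = (host.load + 1.0) / host.cpu_speed
    if image_kb > host.free_mem_kb:
        # Fraction of the image that would not fit in free memory.
        shortfall = (image_kb - host.free_mem_kb) / image_kb
        cost += mem_weight * shortfall
    return cost


def best_host(hosts, image_kb):
    """Pick the host with the lowest cost for an image of the given size."""
    return min(hosts, key=lambda h: score(h, image_kb))
```

With these made-up weights a fast but busy machine can still beat a slow
idle one, unless the image is too large for its free memory.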

The next move, of course, is to teach the shell how to do this for the
general case.  Some guys at Princeton have done some interesting work
on this; they are giving a paper at USENIX on the topic.

-- 
			***dan

decwrl!mips!djl                  mips!djl@decwrl.dec.com