[comp.parallel] using network machines as compute servers

larsa@nada.kth.se (Lars Andersson) (12/01/89)

Dear NET!

I want to develop (or rather: I want some one to give me :-) )
a few SIMPLE tools to use for building "compute servers"
running on a set of machines connected in a network. I'm talking about
coarse grained parallelism here but I guess this is relevant also for shared
memory and hypercube type machines. I'm sure there are a number
of ways to approach this problem and even a number of solutions around, but 
faced with the rather bewildering fauna of "parallel operating systems", models
for parallel computations etc. (LINDA, ISIS, Cosmic Environment, MACH (?) ...)
one becomes reluctant to invest work in any one of these, without knowing 
which will have the broadest base in the near future.

There is a large class of problems that can require a huge number of similar,
SCALAR DOMINATED, calculations be carried out, with perhaps a periodic 
gathering and analysis of the solutions. Hence for the parallel subproblems,
vectorization is not interesting, nor is communication speed between the
processors critical. I believe this is called trivial parallelism.

In this situation, one doesn't really care what machine the thing runs on as
long as one is able to access as many cheap CPU cycles as possible, such as the
joint resources of the local net.

What I have in mind is therefore solutions which allow one to run the
subproblems on machines with different architecture, not necessarily 
with a common NFS server. 

Consider the following: One "master" puts  "tasks" in a batch, there to be
picked up by the "slaves" as each completes its current task. When a slave
completes a task it puts the solution in an appropriate batch, there to be
picked up and stored by the master (depending on the problem, one might want
to write directly to a file ...). 

Within my limited knowledge, the only system that has something like this 
"built in" is ISIS (the NEWS service). However, it's not clear to me that this 
is correct or suited for this kind of application, or if it's the only or best 
(most portable) solution. Any comments on the above would be appreciated. 
In particular, examples or code fragments pertaining to this situation would 
be gratefully recieved. Where lies the future?

Yours,

Lars Andersson

larsa@math.kth.se

disclaimer: I'm a mathematician, not a computer scientist.

wen-king@csvax.caltech.edu (Wen-King Su) (12/04/89)

>From: larsa@nada.kth.se (Lars Andersson)

I recommend Cosmic Environment because:

    1) Cosmic Environment runs without modification on all implementations
       of BSD, System V, and Xenix OS with socket support.  

    2) CE's message model is simple and intuitive -- just like malloc and free.

    3) CE message primitives are general and satisfies most programming needs.

    4) CE message primitives can be simply and efficiently emulated in almost
       all multicomputer programming systems.  Therefore, porting CE programs
       to other systems is quite simple.  This is a way to protect your
       investiment in your programs.

<Consider the following: One "master" puts  "tasks" in a batch, there to be
>picked up by the "slaves" as each completes its current task. When a slave
<completes a task it puts the solution in an appropriate batch, there to be
>picked up and stored by the master (depending on the problem, one might want
<to write directly to a file ...). 

This can't be that difficult.  You write a batch process (listed below).
The master put tasks into the batch by sending 'ADD' messages to the batch
process.  The slave process gets a task by sending a 'GET' message to the
batch process and waiting for a 'ADD' message in reply.  You can use another
batch process to collect the results, or add more message types to this
batch program to handle collection of results.

The 'body' of 'ADD' messages contains task specification.  In the simplest
case, it can be a string to be passed to shell.  And the slave process
simply do a "system(message->body)" when it gets the 'ADD' message it asked for.

----------------------------------------------------------------------------
#include <cube/cubedef.h>

typedef struct MHEAD MHEAD;

struct MHEAD {	MHEAD *next;		/* for making a queue */
		short type;		/* type of the message */
		short node, pid;	/* reference of the sender process */
		short count;		/* number of bytes in message body */
		char  body[1]; };	/* actually 'count' bytes long */

MHEAD *task_head, *task_tail;	/* queues of tasks to be work on */
MHEAD *wait_head, *wait_tail;	/* queues of processes waiting for work */

#define ADD 1
#define GET 2

main()
{
    MHEAD *mp, *tp, *wp;

    while(mp = (MHEAD *) xrecvb())	/* repeatedly get a message */
    {
	if(mp->type == ADD)		/* message to add a task.  */
	{
	    if(!wait_head)		/* task not waited for -> queue it */
	    {
		if(task_head) task_tail = task_tail->next = tp;
			 else task_tail = task_head       = tp;
		tp->next = 0;

	    } else			/* waited for -> dispatch task */
	    {
		wp = wait_head; wait_head = wp->next;
		xsend(mp,wp->node,wp->pid); xfree(wp);
	    }

	} else

	if(mp->type == GET)		/* message to get a task */
	{
	    if(!task_head)		/* no task available -> queue it */
	    {
		if(wait_head) wait_tail = wait_tail->next = wp;
			 else wait_tail = wait_head       = wp;
		wp->next = 0;

	    } else			/* available -> dispatch task */
	    {
		tp = task_head; task_head = tp->next;
		xsend(tp,mp->node,mp->pid); xfree(mp);
	    }
	}
    }
}

----------------------------------------------------------------------------

segall@caip.rutgers.edu (Ed Segall) (12/04/89)

> Consider the following: One "master" puts  "tasks" in a batch, there to be
> picked up by the "slaves" as each completes its current task. When a slave
> completes a task it puts the solution in an appropriate batch, there to be
> picked up and stored by the master (depending on the problem, one might want
> to write directly to a file ...). 

This sounds like it it perfectly suited for Linda.

--Ed Segall
--


uucp:   {...}!rutgers!caip.rutgers.edu!segall
arpa:   segall@caip.rutgers.edu

carriero-nicholas@YALE.EDU (Nicholas Carriero) (12/05/89)

In article <7300@hubcap.clemson.edu> wen-king@csvax.caltech.edu (Wen-King Su) writes:
>>From: larsa@nada.kth.se (Lars Andersson)
><Consider the following: One "master" puts  "tasks" in a batch, there to be
>>picked up by the "slaves" as each completes its current task. When a slave
><completes a task it puts the solution in an appropriate batch, there to be
>>picked up and stored by the master (depending on the problem, one might want
><to write directly to a file ...). 
>
>This can't be that difficult.  You write a batch process (listed below).
>The master put tasks into the batch by sending 'ADD' messages to the batch
>process.  The slave process gets a task by sending a 'GET' message to the
>batch process and waiting for a 'ADD' message in reply.  You can use another
>batch process to collect the results, or add more message types to this
>batch program to handle collection of results.
>
>The 'body' of 'ADD' messages contains task specification.  In the simplest
>case, it can be a string to be passed to shell.  And the slave process
>simply do a "system(message->body)" when it gets the 'ADD' message it asked for.
>
[54 Lines of code to implement a batch process deleted.]
>

You're right, it's not that difficult.  Why should the user have to
think about (let alone design and implement) a 'batch' process she
doesn't need?

Attached is a skeletal solution in C-Linda for the master and the
worker (i.e. the control structure for the whole problem), but this
isn't a Linda ad (really!)---solutions with a similar structure could
be given in any number of systems.

-Nick Carriero

#include "prog_defs.h"

master()
{
	int		i;
	struct result	r;
	struct task	t;
	int		tasks;

	/* Create workers. */
	for (i = 0; i < NW; ++i) eval("worker", worker());

	/* Generate tasks. */
	for (tasks = 0; new_task(&t); ++tasks) out("task", t);

	/* Consume results. */
	for (; tasks; --tasks) {
		in("result", ? r);
		update(&r);
	}
	
	/* Kill workers. */
	t.type = DIE;
	for (i = 0; i < NW; ++i) out("task", t);
}

worker()
{
	struct result	r;
	struct task	t;
       
	while (1) {
		in("task", ? t);
		if (t.type == DIE) return;
		compute(&t, &r);
		out("result", r);
	}
} 

wen-king@csvax.caltech.edu (Wen-King Su) (12/06/89)

>From: carriero-nicholas@YALE.EDU (Nicholas Carriero)
<In article <7300@hubcap.clemson.edu> wen-king@csvax.caltech.edu (Wen-King Su) writes:
>>>From: larsa@nada.kth.se (Lars Andersson)
<><Consider the following: One "master" puts  "tasks" in a batch, there to be
>>>picked up by the "slaves" as each completes its current task. When a slave
<><completes a task it puts the solution in an appropriate batch, there to be
>>>picked up and stored by the master (depending on the problem, one might want
<><to write directly to a file ...). 
>>
<>This can't be that difficult.  You write a batch process (listed below).
>
<[54 Lines of code to implement a batch process deleted.]
>
<You're right, it's not that difficult.  Why should the user have to
>think about (let alone design and implement) a 'batch' process she
<doesn't need?
>
<Attached is a skeletal solution in C-Linda for the master and the
>worker (i.e. the control structure for the whole problem), but this
<isn't a Linda ad (really!)---solutions with a similar structure could
>be given in any number of systems.

[a bunch of C-Linda lines deleted]

First of all, you are right; the same solution given in C-Linda can just as
easily be written in the generic C language used by Cosmic Environment.  Let
me make it clear that CE is not produced by somebody sitting around, thinking
about how one might program a multicomputer.  We do not have a person who
earns his degree by creating the Cosmic Environment.  We are all multicomputer
users and application programmers --- I am an author of CE and my project is
distributed discrete-event simulation.

Cosmic Environment is the result of people like me who wanted to protect our
investments.  All too frequently, we find ourselves wasting valuable time
supporting the same programs on several incompatible systems.  We find that
our programs got stuck with systems that becomes obsolete.  We also find
that, although many systems are full-featured and come with everything plus
a kictchen sink, they are almost always mis-focused, and we are unable to do
want we wanted without a lot of work.

We understand that the programmer knows exactly what he need, and the system
aught to give him all the tools he will need, no less but no more.  (We mean
it, and to illustrate that, the programming manual for CE is only 24 pages
long, typesetted in 11 points fonts.)  We have settled on a set of simple,
easy to understand, and easy to implement C primitives which we share with
the public in the form of the Cosmic Environment.  It makes our jobs simpler
and it protects the value of our work.  We use it daily for applications on
several multicomputers we own and for the set of SUN workstations we have.
This week, I am using a CE program that runs a batch which serves a set of
programs running on a dozen workstations.  Each program runs a copy of the
CMU COSMOS circuit simulator to evaulate the performance of one student's
circuit layout for the final grade in their VLSI design class.  Here, my
requirement is a distributed system that allows me to run a program that I
do not want to modify (or know how to modify).  A batch in CE comes handy.

Now, back to the problem at hand.  If the programmer finds it suits his need
to approach a problem using the master/worker model you described, then no
problems.  If you or I want to suggest an alternative solution, no problems
either.  But if the programmer needed a batch, I know I would want to have
simple ways to build my own d*mn batch.

/*------------------------------------------------------------------------*\
| Wen-King Su  wen-king@vlsi.caltech.edu  Caltech Corp of Cosmic Engineers |
\*------------------------------------------------------------------------*/

vnrao@cs.utexas.edu (Nageshwara Rao Vempaty) (12/08/89)

Here is my 2 cents worth.

  I had been using the COSMIC environment for about 8 months now.  I am 
quite happy with it and would recommend it to any one interested in 
implementing Parallel /  Dist.  programs.  The following are what I like most :

1.  The system is stable.  

2.  It can be ported to a wide variety of environments.  

3.  Very compact.  New users can learn fast.

Performance wise,  we were able to get good results - so I am happy with it.

VNRao