[comp.sys.next] Mach's support for distributed processes?

ree@uhccux.uhcc.hawaii.edu (Seung Hee Ree) (09/26/89)

	Hi!

	Here are questions concerning Mach's capability in a network
environment.  

	Scenario:

	Suppose there is a NeXT fileserver with a hard disk ('server'
hereafter) and several diskless NeXTs ('clients' hereafter), all
connected by ethernet.  Whenever a client requests disk I/O, the
server does the I/O and sends the result to the client over the
ethernet.

	Problem:

	Generally speaking, this is fine.  But suppose, just suppose
someone on one of those client NeXTs wants to do an "egrep" on the
pattern, say, "elvis" on very large files, like the Webster dictionary.
Then the SERVER has to read those files and transfer them to the client
over the ETHERNET, and only then can the cpu on the client do the search.
Even though the ethernet is quite fast, the transfer of data over the
ethernet does slow the job down and is unnecessary.

	Experiment:

	To investigate this problem, I did a simple experiment.  I
logged on to one of the diskless NeXTs and did an "egrep Ree *" on
about 4M of text files.  This took about 40 seconds.  But when I
did an "rlogin" to the server and repeated the same task, it took only
20 seconds.  So the data transfer over the ethernet slows the job
down by roughly a factor of 2.
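
	As a rough back-of-the-envelope check on these numbers, here is
a small sketch.  It assumes "4M" means about 4 megabytes, and the
function name is made up purely for illustration.

```python
def throughput_kb_per_s(size_mb, seconds):
    """Effective scan rate in KB/s for size_mb megabytes read in `seconds`."""
    return size_mb * 1024 / seconds

# Timings from the experiment above; the 4 MB size is an assumption.
on_server = throughput_kb_per_s(4, 20)   # egrep run on the server itself
on_client = throughput_kb_per_s(4, 40)   # egrep run on a diskless client

print("server: %.1f KB/s" % on_server)
print("client: %.1f KB/s" % on_client)
print("slowdown: %.1fx" % (on_server / on_client))
```

	Under that assumption the effective rate drops from roughly
200 KB/s on the server to roughly 100 KB/s on the client.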

	Solution:

	A job like "grep" requires heavy disk I/O, but the user is
interested only in the final result.  If this is the case, doing
"egrep" on a client is clearly a waste: the actual I/O and searching
should be done by the cpu on the server and only the result should be
transferred to the client.
	But there is a catch.  If all the clients send "egrep"
requests to the server simultaneously, the server will become a
bottleneck.  So some kind of decision making has to be involved as to
whether the server should do a job for the client or not, depending on
the status of the server.

	Question:

1)	Is Mach 'intelligent' enough to make this decision by itself?
( Much of the evidence is against it.  It is also questionable whether
this kind of 'intelligence' in an OS is feasible at all. )

2)	Is Mach 'aware' of this problem -- i.e., does it provide any
support for the programmers to solve this problem?

( There could be an object in Mach which 
	a) accepts a request from a process
	b) makes a decision about who (the server or the client)
	   should do the work, depending on how busy the server is
	c) distributes the work
	d) sends the results back to the requestor.
Is there anything like this?)
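
	To make steps a) through d) concrete, here is a minimal sketch
of such a dispatcher.  Everything in it is hypothetical: the names
choose_executor, grep_lines and dispatch, the load figure, and the
threshold of 2.0 are assumptions for illustration, not anything Mach
or the Application Kit is known to provide.

```python
import re

def choose_executor(server_load, threshold=2.0):
    """Step b): decide who does the work from how busy the server is."""
    return "server" if server_load < threshold else "client"

def grep_lines(pattern, lines):
    """Stand-in for egrep: return the lines that match the pattern."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

def dispatch(pattern, lines, server_load, threshold=2.0):
    """Steps a), c), d): accept a request, hand it to the chosen side,
    and return where it ran together with the result."""
    where = choose_executor(server_load, threshold)
    # In a real system the two choices would run on different machines;
    # here both use the same local function so the sketch stays runnable.
    matches = grep_lines(pattern, lines)
    return where, matches

if __name__ == "__main__":
    text = ["elvis lives", "nothing here", "elvis has left the building"]
    print(dispatch("elvis", text, server_load=0.5))  # idle server
    print(dispatch("elvis", text, server_load=3.0))  # busy server
```

	The only real decision is in choose_executor; how the server's
load would be measured and how the work would actually be shipped
between machines are exactly the open questions.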

3)	If none of the above is true, the decision making is left to
the application itself.  Is there any class/object in the Application
Kit for this purpose?
( Listener/Speaker look helpful.  But I haven't tried them out yet.  I
am still waiting for 1.0 to arrive. :-) )


	If you have answers to the above questions, you should definitely
reply.  If you have comments, you should absolutely reply.  Now, if
you have source code, .... tell me when you are coming to Hawaii, and I
will buy you a drink.


					Ree


------------------------------------------------------------------------
                              Seung Hee    Ree                
                    Software Engineering Research Lab.
         Information & Computer Science, University of Hawaii
ree@uhccux.uhcc.hawaii.edu                      ree@uhics.ics.hawaii.edu
------------------------------------------------------------------------

merlin@smu.uucp (David Hayes) (09/26/89)

In article <4927@uhccux.uhcc.hawaii.edu> ree@uhccux.UUCP (Seung Hee Ree) writes:
>	But there is a catch.  If all the clients request for "egrep"
>to the server simultaneously, the server will be bottlenecked.  So
>there has to be some kind of decision making involved as to whether the
>server should do a job for the client or not, depends on the status of
>the server.

	Actually, this is not a problem.  Even if some clients were to
run the grep themselves, and others request that the server do it for
them, you would still have the bottleneck.  The bottleneck is the fact
that the server must read the file for each grep, whether it is local
or run on a client.  Disk access is your problem here, not CPU power.
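
	A toy model of this argument, as a sketch only.  It assumes
every grep makes the server read the whole file once with no caching,
and the function name and the 20-second read time are illustrative
assumptions.

```python
def server_disk_seconds(greps_on_server, greps_on_clients, read_seconds):
    """Seconds the server's disk is busy, assuming each grep forces one
    full read of the file no matter which machine does the searching."""
    return (greps_on_server + greps_on_clients) * read_seconds

# Ten greps at an assumed 20 seconds of reading each:
# the split between server and clients does not change the disk load.
print(server_disk_seconds(10, 0, 20))
print(server_disk_seconds(5, 5, 20))
print(server_disk_seconds(0, 10, 20))
```

	All three splits give the same disk time, which is the point:
moving the CPU work around does not relieve the disk.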


David Hayes	School of Engineering	Southern Methodist University
merlin@smu.edu	uunet!smu!merlin
"Argue for your limitation, and, sure enough, they're yours." - Richard Bach