[comp.lang.prolog] Lin vs. specific sharing

kale@m.cs.uiuc.edu (05/05/89)

Cross posted from comp.parallel

Regarding  Carrierro and Gelernter's "Linda in Context" article in a
recent CACM issue:

I like various aspects of Linda for various reasons.
However, I had trouble swallowing their arguments showing SUPERIORITY
of Linda over ALL the rest of approaches.

Here, I want to talk about one particular aspect of Linda:
the tuple space provides a uniform abstraction:
One uses these same tuple-space operations to specify different kinds
of sharing of information. Is this good or bad?

To set up the contrast, let me briefly introduce Chare Kernel, our own
machine independent parallel programming system.
At run time, the system consists of (a large number of) processes
(called chares) which communicate with each other via messages AND
share data in specific ways.
Programmers write the chare definitions, and use calls for
creating a chare and sending a message to another chare whose instance ID
is known to the sending chare.
In addition, they specify different types of variables
("pseudo global variables")
that are used for specific types of sharing of information between chares.
For (the simplest) example,
if there is information that is fixed at the initialization time, and
many chare instances need access to it, it is declared as a read-only variable.
There are currently about 6-7 types of such pseudo-global variable types (**).
Among them is a type called dynamic table.
Variables of this type have functionality that subsumes the
functionality of tuple-spaces.
(Take my word for it, for now, for the sake of argument)

The questions are: Which one is a better programming model?
Which one is more efficient to implement?
The chare kernel provides all ways of sharing of information that
Linda can provide.
However, it requires programmers to specify the type of sharing.

It may be argued that tuple-space is conceptually easier because, say, the
programmers do not have to think about who the recipient of a message is.
However, in most (or at least some?) cases, the programmers ALREADY know that.
Linda simply forbids them from stating it!
For example, in a divide-and-conquer application, when a child
process produces a result, it can simply send it as a message meant
for its parent, (whose address it had received in the creation message, say).
I think this is CONCEPTUALLY simpler than just depositing the response
in the tuple space with some tag for identification.
In the situations when the programmers really would find it difficult
to think about whom they are communicating with, or when they are
creating genuinely globally shared data, sure, they could use the more
general primitives.
However, it seems pointless to force an abstraction when it does not fit.

From the efficiency stand-point, the tuple-space abstraction 
takes its toll on non-shared memory machines. The compiler has to
figure out information that the programmer was willing to provide,
and the compiler may not be able to figure it out in all cases.

So, compared to "specific sharing" in chare kernel, "generative
communication" of Linda is potentially a source of inefficiency,
and (more important) is not conceptually better either.

**: in the Chare Kernel design. The core language implementation
exists on shared memory machines as well as non shared memory ones (ipsc/2).
Most of the pseudo-global variables are implemented on shared memory
machines, and will be implemented on ipsc/2 soon.
The algorithms for all the pseudo-global variables have been designed.
From the point of view of the discussion above, all that matters is
that such primitives can be efficiently supported (without any compile
time analysis).