kale@kale.cs.uiuc.edu (L. V. Kale') (05/05/89)
Regarding Carriero and Gelernter's "Linda in Context" article in a recent CACM issue: I like various aspects of Linda for various reasons. However, I had trouble swallowing their arguments showing the SUPERIORITY of Linda over ALL the other approaches. Here I want to talk about one particular aspect of Linda: the tuple space provides a uniform abstraction, in that one uses the same tuple-space operations to specify different kinds of sharing of information. Is this good or bad?

To set up the contrast, let me briefly introduce the Chare Kernel, our own machine-independent parallel programming system. At run time, the system consists of (a large number of) processes (called chares) which communicate with each other via messages AND share data in specific ways. Programmers write the chare definitions, and use calls for creating a chare and for sending a message to another chare whose instance ID is known to the sending chare. In addition, they declare different types of variables ("pseudo-global variables") that are used for specific kinds of sharing of information between chares. For the simplest example, if there is information that is fixed at initialization time, and many chare instances need access to it, it is declared as a read-only variable. There are currently about 6-7 such pseudo-global variable types (**). Among them is a type called the dynamic table. Variables of this type have functionality that subsumes the functionality of tuple spaces. (Take my word for it, for now, for the sake of argument.)

The questions are: Which one is a better programming model? Which one is more efficient to implement? The Chare Kernel provides all the ways of sharing information that Linda can provide. However, it requires programmers to specify the type of sharing. It may be argued that the tuple space is conceptually easier because, say, the programmers do not have to think about who the recipient of a message is. However, in most (or at least some?)
cases, the programmers ALREADY know that. Linda simply forbids them from stating it! For example, in a divide-and-conquer application, when a child process produces a result, it can simply send it as a message meant for its parent (whose address it received in the creation message, say). I think this is CONCEPTUALLY simpler than depositing the response in the tuple space with some tag for identification. In situations where the programmers really would find it difficult to think about whom they are communicating with, or when they are creating genuinely globally shared data, sure, they could use the more general primitives. However, it seems pointless to force an abstraction where it does not fit.

From the efficiency standpoint, the tuple-space abstraction takes its toll on non-shared-memory machines. The compiler has to figure out information that the programmer was willing to provide, and the compiler may not be able to figure it out in all cases. So, compared with the "specific sharing" of the Chare Kernel, the "generative communication" of Linda is potentially a source of inefficiency, and (more important) is not conceptually better either.

**: in the Chare Kernel design. The core language implementation exists on shared-memory machines as well as non-shared-memory ones (iPSC/2). Most of the pseudo-global variables are implemented on shared-memory machines, and will be implemented on the iPSC/2 soon. The algorithms for all the pseudo-global variables have been designed. From the point of view of the discussion above, all that matters is that such primitives can be supported efficiently (without any compile-time analysis).
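[Editorial aside: Kale's read-only pseudo-global variable can be sketched in a few lines of plain Python. This is a toy illustration with invented names, not the Chare Kernel's actual API; the point is only that declaring the sharing type up front (fixed at initialization, read by many) is the promise that would let a distributed runtime replicate the value to every node with no run-time coordination.]

```python
# Toy sketch of a "read-only" pseudo-global variable (invented names,
# NOT the Chare Kernel's actual API). The declared sharing type promises
# the value is fixed at initialization time, so a distributed runtime
# could replicate it to each node once; later reads are purely local.
class ReadOnly:
    def __init__(self, value):
        self._value = value            # fixed once, at initialization

    @property
    def value(self):
        return self._value             # reads need no synchronization

    def __setattr__(self, name, val):
        # forbid reassignment after initialization
        if name == "_value" and "_value" in self.__dict__:
            raise AttributeError("read-only pseudo-global cannot change")
        super().__setattr__(name, val)

# Any number of chares may read it concurrently:
grid_size = ReadOnly(1024)
print(grid_size.value)                 # -> 1024
```

The restriction is the optimization: because no write can occur after setup, no coherence traffic is ever needed, which is exactly the kind of information a Linda compiler would have to reconstruct by analysis.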
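[Editorial aside: the divide-and-conquer contrast above can be made concrete with a small sketch. This is plain Python with hypothetical names, standing in for neither system's real primitives; it only shows the two shapes of the same child-to-parent hand-off.]

```python
from queue import Queue

# Message style: the child already knows its parent, so the result is
# addressed directly to the parent's mailbox (hypothetical names).
parent_mailbox = Queue()

def child_message_style(parent, work):
    parent.put(work * work)            # recipient known at creation time

child_message_style(parent_mailbox, 7)
msg_result = parent_mailbox.get()      # the parent simply receives

# Tuple-space style: the result is deposited in a common pool under a
# tag, and the parent must pattern-match on that tag to retrieve it.
tuple_space = []

def child_tuple_style(space, tag, work):
    space.append((tag, work * work))   # left for whoever matches the tag

child_tuple_style(tuple_space, "square-of-7", 7)
ts_result = next(v for (t, v) in tuple_space if t == "square-of-7")

print(msg_result, ts_result)           # -> 49 49
```

Both children compute the same thing; the difference is whether the destination is stated directly or encoded in a tag that the runtime must match against every deposit.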
braner@tcgould.tn.cornell.edu (Moshe Braner) (05/08/89)
[responding to the claim that Linda enforces excessive abstraction]

With chares: the parent process tells the child its ID, and the child is programmed to send the return result to that destination (the parent). With Linda: the parent process looks for a result with a specific tag, e.g., a specific string in the first element of the tuple, and the child is programmed to send the return result in a tuple with that tag. I don't see how Linda is restrictive there. In fact, it is less restrictive, yet easily allows what is needed here. The tag, suitably chosen, can make the program more readable. The fact that the tuple is put into a common space allows a debugger to examine it before the parent gets it. Etc.

As for efficiency, only actual benchmarks can tell. All the approaches to global data are easier to implement on shared-memory machines, of course. It's tough with distributed memory, and will always be less efficient than straight message passing. But the goal (for some of us, anyway) is an 80%-efficient application created and maintained with one-third the human work, as compared with the 90%-efficient hand-optimized implementation...

- Moshe

Exclaimer: Wow!
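[Editorial aside: Moshe's tagged-result scenario can be sketched with a toy in-memory tuple space. This is a non-blocking Python simulation, not a real Linda runtime; `out`/`in_`/`ANY` are stand-ins for Linda's out(), in(), and formal (wildcard) fields.]

```python
# Toy tuple space (a simulation, NOT a real Linda runtime). out()
# deposits a tuple; in_() removes and returns the first tuple matching
# a pattern, where ANY stands in for a formal (wildcard) field.
space = []
ANY = object()

def out(*fields):
    space.append(tuple(fields))

def in_(*pattern):
    for t in space:
        if len(t) == len(pattern) and all(
            p is ANY or p == f for p, f in zip(pattern, t)
        ):
            space.remove(t)
            return t
    raise LookupError("no match (a real in() would block here)")

out("fib-result", 10, 55)        # child deposits its result under a tag
# The tuple now sits in the common space, so a debugger could inspect
# it here, before the parent ever picks it up -- Moshe's point.
tag, n, value = in_("fib-result", ANY, ANY)
print(value)                     # -> 55
```

The parent matches on the tag alone and receives the other fields as bindings, which is the "less restrictive" retrieval Moshe describes; the cost question is how expensive that matching is on a machine with no shared memory.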