udi%wisdom.weizmann.ac.il@CUNYVM.CUNY.EDU (Udi Shapiro) (04/15/89)
In the recent Communications of the ACM (April 1989, pp. 444-458) Carriero and Gelernter compare Linda to "the Big Three" families of parallel languages: concurrent object-oriented languages, concurrent logic languages, and functional languages. While I accept many of their statements concerning concurrent object-oriented languages and functional languages, I find the comparison with concurrent logic languages misguided.

The dining philosophers
-----------------------

Carriero and Gelernter did not have to set up a straw man for the purpose of comparison --- Ringwood, in the January 1988 issue of Communications, has done a fine job for them. With its 70 lines of (incomplete) code, six procedures and four illustrative figures, Ringwood's Parlog86 program for solving the dining philosophers problem is indeed easy turf. This embarrassingly cumbersome program is not an appropriate basis for comparison with concurrent logic languages, since a dining philosopher can be specified by a much simpler concurrent logic program:

    phil(Id,[eating(LeftId,done)|Left],Right) :-    % Left is eating, wait till
        phil(Id,Left,Right).                        % he is done.
    phil(Id,Left,[eating(RightId,done)|Right]) :-   % Right is eating, wait till
        phil(Id,Left,Right).                        % he is done.
    phil(Id,[eating(Id,Done)|Left]^,[eating(Id,Done)|Right]^) :-
                                                    % Atomically grab both forks
        ... eat, when done, unify Done=done ...,
        phil(Id,Left,Right).

The program is independent of the number of philosophers dining. A dinner of n philosophers can be specified by the goal:

    phil(1,Fork1,Fork2), phil(2,Fork2,Fork3), ..., phil(n,Forkn,Fork1).

whose execution results in each of the Fork variables being incrementally instantiated to a stream of terms eating(Id,done), with the Id's on each Fork reflecting the order in which its two adjacent philosophers use it. The program is written in the language FCP(^) (a variant of Saraswat's FCP(\downarrow)).
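The crux is the third clause: the philosopher grabs both forks in a single atomic step, or changes nothing. For readers more at home in imperative code, here is a rough Python sketch of that all-or-nothing acquisition, using a single lock. This is a hypothetical illustration only (all names are mine), not the FCP(^) execution model, whose atomicity comes from test unification rather than locking:

```python
import threading

# Each fork is a slot that is either None (free) or holds the id of the
# philosopher currently using it.  One lock makes the "grab both forks"
# test-and-set atomic: a failed attempt leaves no trace, mirroring the
# atomic test unification of the third phil/3 clause.
class Table:
    def __init__(self, n):
        self.forks = [None] * n
        self.lock = threading.Lock()

    def try_grab(self, phil_id, left, right):
        # Atomically succeed on both forks, or change nothing.
        with self.lock:
            if self.forks[left] is None and self.forks[right] is None:
                self.forks[left] = phil_id
                self.forks[right] = phil_id
                return True
            return False

    def release(self, left, right):
        with self.lock:
            self.forks[left] = None
            self.forks[right] = None

table = Table(3)
assert table.try_grab(1, 0, 1)        # philosopher 1 takes forks 0 and 1
assert not table.try_grab(2, 1, 2)    # fork 1 is held, so 2 gets neither fork
assert table.forks[2] is None         # the failed attempt left no trace
table.release(0, 1)
assert table.try_grab(2, 1, 2)        # now both of 2's forks are free
```

In FCP(^) no such central lock exists; the language semantics guarantee that competing reductions on the shared Fork variables are serialized.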
The unannotated head arguments are matched against the corresponding terms in the goal atom, as in Parlog86 and GHC. In contrast, the ^-annotated terms are unified with the corresponding terms, as in Prolog. A goal atom can be reduced with a clause if both the input matching and the unification succeed. The reduction of a goal with a clause is an atomic action: either it suspends or fails without leaving a trace of the attempt, or it succeeds with some unifying substitution, which is applied atomically to the variables it shares with other processes.

The key to the simplicity of the program is indeed the ability of FCP(^) to specify atomic test unification: the philosopher atomically tries to grab both forks, excluding other philosophers from grabbing them. The mutual exclusion is obtained by unifying the head of the Fork stream with a term containing the unique Id of the philosopher. (Atomic test unification is incorporated in all the languages in the FCP family, but not in Parlog86 or GHC. Programs exploiting atomic test unification cannot be easily written in the latter two languages.)

The deadlock-freedom of the program is guaranteed by the language semantics; specifically, if two or more processes compete to unify the same variable(s), one succeeds and the others see the variable(s) instantiated. (Deadlock-free atomic unification can be implemented using a variant of the two-phase locking protocol; see Taylor et al., Chapter 38 in the Concurrent Prolog book.) However, like the Parlog86 and Linda programs, this program is not starvation-free. It can be further enhanced to achieve this as well.

The need for assigning unique identifiers to philosophers can be eliminated if the language is strengthened with read-only variables or with a test-and-set primitive. We consider the second extension: in FCP(^,!), a !-annotated term in the clause head denotes an output assignment of the term to a variable in the corresponding position in the goal.
The assignment fails if the corresponding position holds a non-variable term. It can be mixed (but not nested) with ^-annotations, and all assignments and unifications are performed atomically. In FCP(^,!), an anonymous dining philosopher can be specified by:

    phil([eating(done)|Left],Right) :-     % Left is eating, wait till
        phil(Left,Right).                  % he is done.
    phil(Left,[eating(done)|Right]) :-     % Right is eating, wait till
        phil(Left,Right).                  % he is done.
    phil([eating(Done)|Left]!,[eating(Done)|Right]!) :-
                                           % Atomically grab both forks
        ... eat, when done, unify Done=done ...,
        phil(Left,Right).

A dinner of n anonymous philosophers:

    phil(Fork1,Fork2), phil(Fork2,Fork3), ..., phil(Forkn,Fork1).

is of course not as interesting. Each Fork variable is bound to a stream of eating(done) terms.

Embedding Linda
---------------

Sure, the picture of Linda vs. concurrent logic languages looks different after rehabilitating the dining philosophers. But is this the whole story? Carriero and Gelernter like to think of Linda as very different from all three models they explore. I don't think this is so, and would like to claim that Linda incorporates fundamental aspects of the concurrent logic programming model, although perhaps in a degenerate form. One aspect is the matching that takes place in an 'in' or 'rd' operation: it is a degenerate form of unification. Another aspect is shared data structures. In both Linda and concurrent logic programs, processes communicate by creating a shared data structure. A stream of messages between two concurrent logic processes is just a shared data structure constructed incrementally from binary tuples.
However, a shared structure in Linda cannot contain the equivalent of logical variables (although the proposed extension of Linda with nested Tuple Spaces is a step in this direction); the ability to construct and communicate structures containing logical variables (incomplete terms, incomplete messages) is the source of the most powerful logic programming techniques. Another difference is that in a concurrent logic language a shared structure can only grow monotonically (by further instantiating variables in it), and can only be reclaimed by garbage collection when unreferenced. In Linda, tuples can be explicitly deleted from the shared Tuple Space.

The similarity between the cooperative incremental construction of logical terms and Linda's Tuple Space operations is perhaps best illustrated by showing a concurrent logic program implementation of Linda's primitives. For simplicity, the concurrent logic program represents the Tuple Space as an incomplete list of terms of the form tuple(OutId,InId,T). (The same ideas can be applied with a more efficient data structure, such as a search tree or a hash table.) The operations are defined, of course, so that several processes can perform them concurrently on the same Tuple Space. Each 'in' and 'out' operation must be given a unique identifier for the purpose of mutual exclusion.

An 'out' operation with tuple T and identifier OutId inserts the term tuple(OutId,_,T) at the tail of the Tuple Space list.

    % out(Id,T,Ts) :- Tuple T inserted into Tuple Space Ts by an 'out'
    %     operation with identifier Id.
    out(Id,T,[tuple(Id,_,T)|Ts]^).    % Found tail of Ts, insert T.
    out(Id,T,[_|Ts]) :-               % Slot taken,
        out(Id,T,Ts).                 % keep looking.

'out' can be enhanced to return the new tail of the Ts stream. A subsequent 'out' can then be applied to this tail, thus avoiding a search of the entire Ts for each 'out'.
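To make the intended behaviour of the primitives being embedded concrete, here is a hypothetical Python sketch of a Linda-like tuple space: 'out' appends a tuple, 'in' removes the first matching tuple, 'rd' reads one without removing it, and None plays the role of a formal (wildcard) parameter. It is a sequential illustration with names of my own choosing; it returns None where real Linda would suspend:

```python
import threading

# Minimal tuple-space sketch.  A template matches a tuple if they have
# the same length and every non-None template field equals the tuple's
# field -- a degenerate form of unification, as noted in the text.
class TupleSpace:
    def __init__(self):
        self.tuples = []
        self.lock = threading.Lock()

    @staticmethod
    def matches(template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def out(self, tup):
        with self.lock:
            self.tuples.append(tup)

    def in_(self, template):          # 'in' is a Python keyword
        with self.lock:
            for i, tup in enumerate(self.tuples):
                if self.matches(template, tup):
                    return self.tuples.pop(i)   # delete the matched tuple
        return None                   # real Linda would block here

    def rd(self, template):
        with self.lock:
            for tup in self.tuples:
                if self.matches(template, tup):
                    return tup        # read without deleting
        return None

ts = TupleSpace()
ts.out(("hi", "there"))
ts.out(("hi", "ho"))
assert ts.in_(("hi", None)) == ("hi", "there")   # deletes the first match
assert ts.rd(("hi", None)) == ("hi", "ho")       # reads without deleting
assert ts.in_(("hi", None)) == ("hi", "ho")
assert ts.rd(("hi", None)) is None               # space is now empty
```

The concurrent logic embedding below achieves the same effect without any central lock: mutual exclusion falls out of atomic unification on the shared Tuple Space list.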
An 'in' operation with tuple template T and identifier InId finds a term tuple(_,InId',T') in the Tuple Space such that T' unifies with T and InId' unifies with InId.

    % in(Id,T,Ts) :- Tuple T in Tuple Space Ts deleted by an 'in'
    %     operation with identifier Id.
    in(Id,T,[tuple(_,Id^,T^)|Ts]).    % Found unifiable tuple, delete it.
    in(Id,T,[tuple(_,Id',T')|Ts]) :-
        (Id,T)=\=(Id',T') |           % Tuple deleted or doesn't unify,
        in(Id,T,Ts).                  % keep looking.

A third argument, returning a Tuple Space from which deleted tuples have been removed, can be added as an optimization.

The precise interrelation between 'in' and 'rd' is not specified in the Linda papers. Assuming it is defined so that a tuple deleted from the Tuple Space by 'in' is not visible to any subsequent 'rd', the corresponding concurrent logic program is:

    rd(T,[tuple(_,InId,T^)|Ts]) :-    % Found a tuple that has
        var(InId) | true.             % not been deleted yet.
    rd(T,[tuple(_,InId,T')|Ts]) :-
        (foo,T)=\=(InId,T') |         % Tuple deleted or doesn't unify,
        rd(T,Ts).                     % keep looking.

The program assumes that 'foo' is not used as an identifier. It uses the 'var' guard construct to check that InId is still a variable, i.e. that the tuple has not been deleted by a concurrent 'in' process. If weaker synchronization is allowed between 'in' and 'rd', the corresponding logic program may use weaker synchronization as well.

These programs can be executed 'as is' under the Logix system (a concurrent logic programming environment developed at the Weizmann Institute). A Logix run of the following goal, consisting of several concurrent operations on an initially empty Tuple Space Ts:

    out(1,hi(there),Ts), in(2,hi(X),Ts), rd(hi(Y),Ts),
    in(3,hi(Z),Ts), out(4,hi(ho),Ts)

results in the following substitution:

    X = there
    Z = ho
    Ts = [tuple(1, 2, hi(there)), tuple(4, 3, hi(ho)) | _]

indicating that hi(there) was inserted into Ts by operation 1 and deleted by 2, and that hi(ho) was inserted by 4 and deleted by 3.
In this particular run, in(3,hi(Z),Ts) reached the second tuple before rd(hi(Y),Ts), so the 'rd' process remains suspended. If we execute an additional 'out' operation:

    out(5,hi(bye),Ts)

then both 'out' and the suspended 'rd' terminate, resulting in the substitution:

    Y = bye
    Ts = [tuple(1, 2, hi(there)), tuple(4, 3, hi(ho)), tuple(5, _, hi(bye)) | _]

Note that the last tuple inserted has not been deleted yet, so its InId is still uninstantiated.

Using test-and-set, 'in' and 'out' can be implemented even more simply in FCP(^,!), without unique identifiers. A Tuple Space is represented by an incomplete list of terms tuple(T,Gone), where Gone is instantiated to 'gone' if T has been deleted.

    out(T,[tuple(T,Gone)|Ts]!).       % Found tail of Ts, insert T.
    out(T,[_|Ts]) :-                  % Slot taken,
        out(T,Ts).                    % keep looking.

    in(T,[tuple(T^,gone!)|Ts]).       % Found unifiable tuple, delete it.
    in(T,[tuple(T',Gone)|Ts]) :-
        (T,foo)=\=(T',Gone) |         % Tuple deleted or doesn't unify,
        in(T,Ts).                     % keep looking.

Supporting 'eval' in a language that already supports dynamic process creation seems redundant, so we do not address it. Embeddings of concurrent object-oriented languages and of functional languages in concurrent logic languages have been reported elsewhere. The trivial embeddings of Linda in FCP(^) and in FCP(^,!) are further evidence of the generality and versatility of concurrent logic programming.

The DNA example
---------------

Another example in Carriero and Gelernter's paper is the DNA sequence analyzer program. They show a functional program and a Linda program for solving it, and argue that the functional program is only slightly preferable on readability and elegance grounds, and that the Linda program is better in giving explicit control of parallelism to the programmer and in not relying on a sophisticated compiler to decide on its parallel execution. We show below a concurrent logic program solving this problem that, we believe, is uniformly better than the other two.
It is shorter, clearer (to us at least), and specifies a more efficient algorithm than the other two. The program specifies a "soft-systolic" algorithm, along the lines of Carriero and Gelernter's verbal description of the problem (see Chapter 7 in the Concurrent Prolog book for the concept of systolic programs, and Chapter 8 for an analysis of their complexity). An array process network is spawned, one 'cell' process for each pair (i,j), with near-neighbour connections. Each 'cell' process receives the H and P values from its right neighbour and the H and Q values, as well as the diagonal (bottom-right) H value, from its bottom neighbour; it computes the local values of the functions p, q, and h, and sends them to its top and left neighbours. The top neighbour also receives the H value from the right.

    % h(Xv,Yv,Hm) :- Hm is the similarity matrix of vectors Xv and Yv.
    h(Xv,Yv,Hm) :-
        matrix(Xv,Yv,Hm,Ts).

    matrix([A|Xv],Yv,[Hv|Hm]^,Ts) :-
        row(A,Yv,Hv,Left,Ts,Bs),
        matrix(Xv,Yv,Hm,Bs).
    matrix([],Yv,[]^,Bs) :-
        initialize_bottom(Bs).

    row(X,[Y|Yv],[H|Hv]^,Left,[Top|Ts]^,[Bottom|Bs]^) :-
        cell(X,Y,H,Left,Right,Top,Bottom),
        row(X,Yv,Hv,Right,Ts,Bs).
    row(X,[],[]^,(0,0)^,[]^,[]^).

    initialize_bottom([(0,0,0)^|Ts]) :-
        initialize_bottom(Ts).
    initialize_bottom([]).

    cell(X,Y,H^,(H,P)^,(Hr,Pr),(Hr,H,Q)^,(Hbr,Hb,Qb)) :-
        compare(X,Y,D),
        P := max(Hr-1.333,Pr-0.333),
        Q := max(Hb-1.333,Qb-0.333),
        H := max(Hbr+D,max(P,max(Q,0))).

    compare(X,X,1^).
    compare(X,Y,-0.3333^) :- X=\=Y | true.

The program is more efficient than the other two in two respects. First, it avoids recomputation of the auxiliary functions p and q (although some future Crystal compiler may automatically transform the functional program into a more intricate functional program that avoids the recomputation). Second, it specifies a regular process structure with local communication only.
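For reference, the recurrences computed by the 'cell' processes can be restated as an ordinary sequential dynamic program. The following Python sketch is hypothetical (names and orientation are my own): it uses zero boundary values, as in initialize_bottom, and fills the matrix row by row, whereas the FCP network computes the same values concurrently, one process per (i,j):

```python
# h is the similarity matrix; p and q are the auxiliary gap functions
# from the 'cell' clause, with the same constants (1.333 and 0.333).
# Row 0 and column 0 are the zero boundary supplied by initialize_bottom.
def similarity(xs, ys):
    n, m = len(xs), len(ys)
    h = [[0.0] * (m + 1) for _ in range(n + 1)]
    p = [[0.0] * (m + 1) for _ in range(n + 1)]
    q = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 1.0 if xs[i - 1] == ys[j - 1] else -0.333
            p[i][j] = max(h[i - 1][j] - 1.333, p[i - 1][j] - 0.333)
            q[i][j] = max(h[i][j - 1] - 1.333, q[i][j - 1] - 0.333)
            h[i][j] = max(h[i - 1][j - 1] + d, p[i][j], q[i][j], 0.0)
    return h

hm = similarity("acgt", "acgt")
assert hm[4][4] == 4.0    # four matches along the diagonal
assert hm[0][0] == 0.0    # boundary value
```

Because each p and q value is stored and reused, nothing is recomputed; the FCP program obtains the same saving by passing these values along the process network.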
The regular process structure can be mapped --- manually, using Turtle programs, or automatically --- onto a fine-grain concurrent multicomputer in a way that places nearby processes on nearby processors. Efficiently mapping the other two programs, which are not specified as process networks and have non-local communication, is more difficult.

Conclusions
-----------

Carriero and Gelernter conclude that "the tools in concurrent logic languages are too policy-laden and inflexible to serve as a good basis for most parallel programs". Knowing that the policy of concurrent logic languages puts the "more elegant alternative" (i.e. Linda) just six clauses away, I wonder if Carriero and Gelernter still stand by this statement.

The major question is what price one has to pay for the generality of concurrent logic languages. Implementations of concurrent logic languages on sequential computers show that, although they cannot compete with conventional languages like C, they are certainly usable, and they score well when compared with efficient Prolog implementations. (This is in spite of the large number of processes created during a computation. In the short Logix session required to compile, test and debug the programs shown in this note, 145,611 processes were created and 6 minutes and 29 seconds of Sun/3 CPU time were consumed.) Parallel implementations show good speedups, but still low absolute speeds. A fair estimate is that more than 30,000 lines of FCP code have been developed using the Logix system, including the Logix system itself. So concurrent logic languages are certainly for real. However, the time at which the loss in performance from using concurrent logic languages is small enough to be easily compensated by the gain they offer in expressiveness has yet to come.
carriero@YALE.EDU (Nicholas Carriero) (04/17/89)
We've gotten a few email messages about the CACM article. I suspect we will be getting a few more, so expect a response to Udi's interesting comments in a forthcoming note that will attempt to address his and others' remarks. We will probably send it sometime towards the end of next week.

Udi, I wouldn't want to keep you in suspense:

    Carriero and Gelernter conclude that "the tools in concurrent logic
    languages are too policy-laden and inflexible to serve as a good
    basis for most parallel programs". ... I wonder if Carriero and
    Gelernter still stand by this statement.

Yes, foursquare.

Some time ago someone posted a request for a bibliography. I sent something (probably to the wrong address) which hasn't appeared, so I've appended it below.

-Nick Carriero

In response to a request for a bibliography on Linda, I've cut and pasted from four sources to produce this somewhat strange collection of citations. Sorry about the variety of formats, but I haven't had time to clean this up. While I believe I have eliminated duplicate citations, I have not tried to rationalize the citation labels used, so there might be some duplicates or inconsistencies in those.

Two refs with incomplete data:

1) U Tokyo paper in OOPSLA 88 on Objects and Tuple Space.
2) Lothar Borrmann and Martin Herdieckerhoff, Parallel Processing
   Performance in a Linda System. Siemens AG, Munich, Germany.

Three new Yale TRs:

1) The Implementation and Performance of Hypercube Linda
   Robert Bjornson, Nicholas Carriero and David Gelernter
   March, 1989

2) Multiple tuple spaces in Linda
   David Gelernter
   March, 1989

3) Information Management in Linda
   David Gelernter
   April, 1989

Now for TeX-like citations:

\bibitem{BoHeKl88} Borrmann, Herdieckerhoff, Klien, {\it Implementation of
the Linda Concept on a Hierarchical Multiprocessor}\/, Conpar Proceedings,
1988.
% To appear in CACM:
\bibitem{CaGele88} Carriero and Gelernter, {\it Linda in Context}\/, Yale,
April 1988.

% To appear in Computing Surveys:
\bibitem{CaGe88} Carriero and Gelernter, {\it How to Write Parallel
Programs}\/, Yale, May 1988.

\bibitem{Gel88} David Gelernter, {\it Parallelism to the People}\/, BYTE,
Jan. 1989.

\bibitem{KrAhCaGe88} Krishnaswamy, Ahuja, Carriero, Gelernter, {\it The
Architecture of a Linda Coprocessor}\/, Proceedings of the International
Architecture Conference, 1988.

\bibitem{Xu88} Xu, {\it A Fault-Tolerant Network Kernel for Linda}\/,
MIT/LCS/TR-424, MIT Lab. for Comp. Sci., Aug. 1988.

% BIBTEX file for the Linda publications.
%
% Entries are alphabetical with respect to the first author's
% last name.  An author's works are ordered chronologically
% with the most recent article first.  The key for an entry
% is the author's last name concatenated with the last two
% digits of the publication year.  A lower-case character is
% added if an author has more than one publication in a single year.
%
% DJ Berndt

@article{Ahuja88,
  author  = {Ahuja, Sudhir and Carriero, Nicholas J. and
             Gelernter, David H. and Krishnaswamy, Venkatesh},
  title   = {Matching Language and Hardware for Parallel Computation
             in the {L}inda Machine},
  journal = {IEEE Transactions on Computers},
  volume  = {37},
  number  = {8},
  pages   = {921--929},
  month   = {August},
  year    = {1988}
}

@article{Ahuja86a,
  author  = {Ahuja, Sudhir and Carriero, Nicholas J. and Gelernter, David H.},
  title   = {{L}inda and Friends},
  journal = {IEEE Computer},
  volume  = {19},
  number  = {8},
  pages   = {26--34},
  month   = {August},
  year    = {1986},
  annote  = {A good presentation of Linda stressing its support for an
             uncoupled programming style. MA}
}

@inproceedings{Ahuja86b,
  author    = {Ahuja, Sudhir and Carriero, Nicholas J. and Gelernter, David H.},
  title     = {Progress Towards a {L}inda Machine},
  booktitle = {Proceedings of the International Conference on Computer Design},
  month     = {October},
  year      = {1986},
  pages     = {97--101}
}

@manual{Berndt89,
  author       = {Berndt, Donald J.},
  title        = {{C}-{L}inda Reference Manual},
  organization = {Scientific Computing Associates},
  month        = {January},
  year         = {1989}
}

@techreport{Bjornson88,
  author      = {Bjornson, Robert and Carriero, Nicholas J. and
                 Gelernter, David H. and Leichter, Jerrold S.},
  title       = {{L}inda, the Portable Parallel},
  institution = {Yale University Department of Computer Science},
  month       = {January},
  year        = {1988},
  type        = {Research Report},
  number      = {520}
}

@inproceedings{Carriero88,
  author    = {Carriero, Nicholas J. and Gelernter, David H.},
  title     = {Applications Experience with {L}inda},
  booktitle = {Proceedings of the ACM Symposium on Parallel Programming},
  month     = {July},
  year      = {1988}
}

@phdthesis{Carriero87,
  author  = {Carriero, Nicholas J.},
  title   = {Implementing Tuple Space Machines},
  school  = {Yale University},
  address = {New Haven, Connecticut},
  year    = {1987},
  note    = {Department of Computer Science}
}

@inproceedings{Carriero86a,
  author    = {Carriero, Nicholas J. and Gelernter, David H. and
               Leichter, Jerrold S.},
  title     = {Distributed Data Structures in {L}inda},
  booktitle = {Proceedings of the ACM Symposium on Principles of
               Programming Languages},
  month     = {January},
  year      = {1986},
  address   = {St. Petersburg, Florida}
}

@article{Carriero86b,
  author  = {Carriero, Nicholas J. and Gelernter, David H.},
  title   = {The {S}/{N}et's {L}inda Kernel},
  journal = {ACM Transactions on Computer Systems},
  pages   = {110--129},
  month   = {May},
  year    = {1986}
}

@inproceedings{Gelernter87,
  author    = {Gelernter, David H. and Jagganathan, S. and London, T.},
  title     = {Environments as First Class Objects},
  booktitle = {Proceedings of the ACM Symposium on Principles of
               Programming Languages},
  month     = {January},
  year      = {1987}
}

@article{Gelernter86,
  author  = {Gelernter, David H.},
  title   = {Domesticating Parallelism},
  journal = {IEEE Computer},
  volume  = {19},
  number  = {8},
  month   = {August},
  year    = {1986},
  pages   = {12--16},
  note    = {Guest editor's introduction.}
}

@article{Gelernter85a,
  author  = {Gelernter, David H.},
  title   = {Generative Communication in {L}inda},
  journal = {ACM Transactions on Programming Languages and Systems},
  volume  = {7},
  number  = {1},
  month   = {January},
  year    = {1985},
  pages   = {80--112}
}

@inproceedings{Gelernter85b,
  author    = {Gelernter, David H. and Carriero, Nicholas J. and
               Chandran, S. and Chang, S.},
  title     = {Parallel Programming in {L}inda},
  booktitle = {Proceedings of the International Conference on Parallel
               Processing},
  month     = {August},
  year      = {1985}
}

@phdthesis{Gelernter82,
  author  = {Gelernter, David H.},
  title   = {An Integrated Microcomputer Network for Experiments in
             Distributed Processing},
  school  = {State University of New York at Stony Brook},
  address = {Stony Brook, New York},
  year    = {1982},
  note    = {Department of Computer Science}
}

@inproceedings{Lucco86,
  author    = {Lucco, S.},
  title     = {A Heuristic {L}inda Kernel for Hypercube Multiprocessors},
  booktitle = {Proceedings of the SIAM Conference on Hypercube
               Multiprocessors},
  month     = {September},
  year      = {1986}
}

@inproceedings{Whiteside88,
  author    = {Whiteside, Robert A. and Leichter, Jerrold S.},
  title     = {Using {L}inda for Supercomputing on a Local Area Network},
  booktitle = {Proceedings of Supercomputing 88},
  year      = {1988}
}