wg@tubopal.UUCP (Wolfgang Grieskamp) (07/11/90)
Hello, here comes the summary of my query "Applicative languages on transputer systems" from two weeks ago. The mails are quoted with only minor cuts, because I guess that the projects sketched in them are of interest in themselves and supplement the references that were given. I hope everyone quoted here agrees to being quoted in this way. (Some people offer technical reports. I did not cut this out, but please think twice whether you are really interested in the topic before mailing to them.)

My general impression is that quite a few people are working on, or are interested in, implementations of applicative languages on parallel architectures which are not classical dataflow (in the hardware sense) and which do not rely on an unbounded amount of shared memory.

Wolfgang Grieskamp

PS. Some people asked for more information about the OPAL project and for reports. I must apologize that currently no language report is available (that is not within my authority). Regarding parallelism, nothing is realized yet - I posted in order to learn. I append a brief overview of OPAL's conception and the current state of its implementation for sequential machines.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: skill@qucis.queensu.ca (David Skillicorn)

Have you heard of the language Lucid? It is a functional language all of whose data types are infinite streams. It has both a textual and a graphical form (operator nets), and it is naturally implemented on dataflow machines. Many of the issues you raise have been solved within the Lucid framework. Dynamic-to-static transformation can be managed by adding extra dimensions to streams (i.e. a one-dimensional stream can be regarded as encoding tail recursion, a two-dimensional stream as encoding nested recursion, etc.). An extensive set of semantic transformations exists and can be used to transform programs, both during program derivation and during compilation.

We have an implementation running on transputers. We also have a partitioning phase in the compiler that is capable of using information gained from abstract interpretation to partition code for multiple transputers, using a set of heuristics ranging from random choice to simulated annealing.

Some references:

  Wadge and Ashcroft, Lucid: the Dataflow Programming Language, Academic Press.
  Skillicorn, Techniques for Compiling and Executing Dataflow Graphs, IFIP World Congress, September 1989.

If you send me your postal address I will also send you a technical report. [...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: gkj@doc.ic.ac.uk (Guido K. Jouret)
Organization: Dept. of Computing, Imperial College, London, UK.

At Imperial we're currently one year into a project called Transformations for Parallelism, in which we are using Hope+ and extending it via a special annotation language called Caliban, with which you can describe static process networks. We are looking at four different kinds of parallel architectures: the first is the fully static, Meiko-type machine; the second is the static-process but fully interconnected type of machine (next-generation Transputers); the third is the dynamic process model, which represents machines like Alice or Flagship; and the fourth is SIMD machines. We are starting from a common functional specification of a program and, via transformations, aim to derive program forms suited to a specific kind of architecture. My part of this project is the SIMD component.
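[Editor's sketch: the mail goes on to speak of "higher-order functions which capture the functionality of existing SIMD machines". Such functions are typically data-parallel combinators like the following. This is a rough Haskell illustration with made-up names - these are not the Imperial primitives - and plain lists stand in for the machine's parallel vectors.]

    -- Illustrative data-parallel combinators; on a SIMD machine each
    -- would be a single machine-wide operation over a parallel vector.
    pmap   :: (a -> b) -> [a] -> [b]             -- elementwise map
    pmap   = map

    pzip   :: [a] -> [b] -> [(a, b)]             -- elementwise pairing
    pzip   = zip

    reduce :: (a -> a -> a) -> a -> [a] -> a     -- tree reduction
    reduce = foldr

    scan   :: (a -> a -> a) -> a -> [a] -> [a]   -- parallel prefix
    scan   = scanl

    -- A small data-parallel program built from the primitives:
    dotp :: Num a => [a] -> [a] -> a
    dotp xs ys = reduce (+) 0 (pmap (uncurry (*)) (pzip xs ys))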
The first phase of my work consisted of analyzing SIMD machines and trying to derive a set of higher-order functions which capture the functionality of existing SIMD machines. The second phase has been to derive a transformation algebra over these primitives. The third phase consists of deriving data-parallel programs from initial specifications using these primitives. I'm currently extending this work towards deriving systolic algorithms via the data-parallel operations. I've got an initial paper which describes the first and second phases; I'm working on the systolic material at the moment and hope to write it up soon. [...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: phjk@doc.ic.ac.uk (Paul Kelly)

Your work looks very similar to ideas being developed here at Imperial College, in two small related projects called T4P (Transformation for Parallelism) and FAST (Functional programming for Arrays of Transputers). We are very busy implementing right now, and the only proper explanation I know of is my PhD work, published a year ago as

  Functional Programming for Loosely-coupled Multiprocessors
  Paul H.J. Kelly
  Pitman/MIT Press (1989)

For machines like transputers we are planning to use the Caliban notation presented in the book to control static (and mostly-static) process networks. We will also try to build various "harnesses" that implement common parallel program structures (expressed as higher-order functions) on top of a finely-tuned run-time system providing load balancing etc. [...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: kh@cs.glasgow.ac.uk (Kevin Hammond)
Organization: Comp Sci, Glasgow Univ, Scotland

I have a paper describing an implementation of Dactl (generalised graph rewriting as an intermediate code) on a transputer net, though the implementation is more at the simulation stage than finished. I have used Dactl to implement a dialect of Standard ML. This is a "fine-grained" approach with dynamic scheduling based on ZAPP. Let me know if you'd like a copy of this paper (also to appear in my book -- Pitman).

I'm now working on GRIP with Simon Peyton Jones; this is arguably a transputer-like implementation (a cross between a shared-memory and a distributed-memory system).

Murray Cole (murray@cs.glasgow.ac.uk) has written a book on "algorithmic skeletons" for coarse-grained implementation of applicative languages (Pitman); he's currently working on a transputer implementation.

Paul Kelly (phjk@doc.ic.ac.uk) has also written a book in the Pitman series. He's working on a language called Caliban which expresses control notions dynamically at compile-time (with bounded computation). Again, this is a coarse-grain approach (though this could be disputed). There's a project at Southampton under the control of Hugh Glaser (hg@ecs.soton.ac.uk) whose intention is to implement this scheme on a transputer network.

Sorry about plugging the Pitman series (Parallel and Distributed Computing) like this, but they seem to have cornered the UK PhD publication market in this area!

Rita Loogen (rita@rwthinf.uucp) is working on an implementation of a functional logic language on a transputer system. Her thesis has been published by Springer-Verlag. [The German title is "Parallele Implementierung funktionaler Programmiersprachen", in translation "Parallel Implementation of Functional Programming Languages".]

David McBurney (dlm@sys.uea.ac.uk) has been working on an implementation of Concurrent Clean on the transputer.
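[Editor's sketch: Cole's "algorithmic skeletons" and the ZAPP model mentioned above both revolve around higher-order divide-and-conquer templates. Roughly, in Haskell - an illustration with made-up names, taken from none of the cited systems; a parallel implementation would evaluate the recursive calls on different processors:]

    -- A generic divide-and-conquer skeleton: a coarse-grained runtime
    -- system can map the call tree onto a tree of processes.
    dc :: (p -> Bool)        -- is the problem trivial?
       -> (p -> s)           -- solve a trivial problem directly
       -> (p -> [p])         -- divide into subproblems
       -> (p -> [s] -> s)    -- combine the subresults
       -> p -> s
    dc trivial solve divide combine = go
      where
        go p | trivial p = solve p
             | otherwise = combine p (map go (divide p))

    -- Example instance of the skeleton: mergesort.
    msort :: Ord a => [a] -> [a]
    msort = dc ((<= 1) . length) id halve (const mergeAll)
      where
        halve xs = let n = length xs `div` 2 in [take n xs, drop n xs]
        mergeAll [l, r] = merge l r
        mergeAll ss     = concat ss
        merge [] ys = ys
        merge xs [] = xs
        merge (x:xs) (y:ys)
          | x <= y    = x : merge xs (y:ys)
          | otherwise = y : merge (x:xs) ys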
Clean itself is described in a paper by Rinus Plasmeijer, Marko van Eekelen et al. (Proc. FPCA '87, LNCS 274, pp. 364-384). [...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: sjohnson@iuvax.os.indiana.edu (Steve Johnson)
Organization: Indiana University, Bloomington

[...] I am investigating topics in hardware synthesis using functional algebra as a transformation medium. One of the initial topics had to do with translating functional "behavioral specifications" into stream-based "structural descriptions". A synopsis of the results of my dissertation can be found in the 11th ACM Symposium on Principles of Programming Languages (POPL); the title of the article is "Applicative Programming and Digital Design".

Here at Indiana we have implemented a core transformation system for "Digital Design Derivation" (DDD), which is used to manipulate stream networks in order to obtain hardware descriptions. An early description of DDD appears in the book "VLSI Specification, Verification and Synthesis", G. Birtwistle and P.A. Subrahmanyam (eds.), Kluwer, 1987. A more recent development of the algebra will appear in the proceedings of a workshop, "Mathematical Aspects of Hardware Specification, Verification and Synthesis", M. Leeser and G. Brown (eds.), Springer, 1991?.

Of course, I believe that the work we are doing in hardware synthesis has significant applications in parallel program development, and conversely, but I am not currently working on that topic.

I am also interested in the use of streams in functional programming; I believe it is the key to realistic compilation for parallelism. I am involved in a project for MIMD parallelism; our research vehicle is a BBN Butterfly. We have implemented a parallel graph processing system called DSI, based on a "lazy cons". Its I/O discipline, based on streams, integrates device drivers directly into the graph space. We (me, David Wise, Daniel Friedman, John O'Donnell (now at Glasgow), and others) have for a number of years been looking at aspects of systems programming in functional languages. A paper published by O'Donnell in 1988 (I think) in the journal Lisp and Symbolic Computation, called "Dialogues", is a nice treatment, I think. [...]

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

END OF SUMMARY

Some Language and (sequential) Implementation Aspects of OPAL
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Let me quote from an internal paper:

" (..)

 o The overall appearance of OPAL programs exhibits a strong algebraic flavour.

 o OPAL programs are built up from so-called "structures" that are organized in a hierarchical usage relation and comprise import and export interfaces.

 o The orientation towards algebraic specification languages, as well as the interest in efficient implementations, leads to a strong typing discipline. This does not, however, exclude features like parameterization, overloading, and polymorphism.

 o The requirement of producing truly efficient code makes (given the current state of our knowledge) eager evaluation mandatory.

 o In order to study programming concepts for distributed systems, OPAL comprises streams and stream-processing functions. These are in particular also used for input and output. In addition, their presence allows us to experiment with lazy calling mechanisms.

(..) it should not come as a surprise that OPAL resembles other applicative languages such as ML or - even more closely - HOPE."
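[Editor's sketch: both Steve Johnson's remarks and the quoted OPAL overview lean on streams as the interface to input and output. For flavour, here is the stream style of interaction in Haskell - illustrative only, neither DSI nor OPAL code, and Haskell is lazy throughout where OPAL is eager with explicit streams:]

    import Data.Char (toUpper)

    -- A stream is a (possibly infinite) sequence of values; lazy
    -- lists model it directly.  A stream-processing function maps a
    -- whole input stream to a whole output stream.
    type Stream a = [a]

    -- A toy interactive program in the stream style: consume a stream
    -- of input lines, produce a stream of response lines.
    respond :: Stream String -> Stream String
    respond = map (map toUpper)

    -- Only at the outermost level is the stream function wired to the
    -- actual devices; the device driver lives outside the program.
    main :: IO ()
    main = interact (unlines . respond . lines)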
There is one further feature that is particularly important for optimizing compilers:

 o The programmer may formulate guarded rewriting rules, so-called laws, to supplement his (constructive) equations.

So much for the language design. A language report is currently not available; the language is still in flux, and no draft shall leave the laboratory, so the professor told us.

The (sequential) implementation of OPAL that I am contributing to tries to meet the following goals:

 o There is a number of theoretically well-known optimizing source transformations: recursion simplification via associativity and commutativity, fusion of producers and consumers of data objects, tupling to avoid multiple traversals of data structures, partial evaluation, and others. The compiler tries to implement a strategy which applies several of these transformations via a built-in term rewriting system (which supports pattern constructs for indexed variables and position-independent notation).

 o Further keys to efficient compilation are suspected at the imperative level (as seen from the language's applicative origin). The compiler employs a "lazy reference counting" garbage collection scheme, runtime-detected selective update, cell reuse, and removal of tail recursion "modulo constructor". The reference counting overhead shall be kept down by global sharing analysis (we hope). Globalization and stacking of single-threaded parameters - especially of closures - will also be applied.

We expect to finish a prototype which produces portable C code "tomorrow", but an official version not before the end of the year. People interested in an older report from August 1989 on some of these topics (from which the language design issues above are also quoted) may mail to wolfram@opal.cs.tu-berlin.de or tub!tubopal!wolfram, who is one of the authors. (If you have already asked for reports, please mail again; I am not sure whether you are interested in these more classical aspects.)

Regarding parallel implementations, I guess that the basic concepts of the sequential implementation can be taken over when moving to coarse-grained architectures. On the applicative level, I assume that parallelization by deriving stream networks from plain function systems has a lot to do with reverting fusion and tupling. The decision when to turn eager data structures into streams might be based on conditions similar to those for plain (non-reverted) fusion (i.e. objects are consumed in roughly the order in which they are produced). On the imperative level, attention has to be paid to the sharing problem with respect to partitioning. All this, however, is still a bit speculative, and I will study the problems in more depth in the near future.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Wolfgang Grieskamp          SERIOUSLY: "Alle Lust will Ewigkeit!"
                                       [All joy wants eternity!]
wg@opal.cs.tu-berlin.de   tub!tubopal!wg   wg%opal@DB0TUI11.BITNET