[comp.sys.transputer] Appl. Lang. on transputer systems

wg@tubopal.UUCP (Wolfgang Grieskamp) (07/11/90)

Hello,

Here is the summary of responses to my query "Applicative languages on
transputer systems", posted two weeks ago.

Mails are quoted with only minor cuts, because I assume that the
projects sketched in them are also of interest and may supplement the
references given. I hope everyone quoted here agreed to being quoted
this way when mailing to me. (Some people offer technical reports. I
have not cut these offers, but please think twice whether you are
really interested in the topic before mailing them.)

My general impression is that quite a few people are working on, or
are interested in, implementations of applicative languages on parallel
architectures which are not classical dataflow (in the hardware sense)
and which do not rely on an unbounded amount of shared memory.


Wolfgang Grieskamp


PS. Some people asked for more information about the OPAL project
and for reports. I must apologize that currently no language report
is available (that is not within my authority). Regarding parallelism,
nothing has been realized yet - I posted in order to learn. I append
a brief overview of OPAL's conception and the current state of its
implementation for sequential machines.


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: skill@qucis.queensu.ca (David Skillicorn)

Have you heard of the language Lucid? It is a functional language all
of whose data types are infinite streams. It has both a textual and
a graphical form (operator nets). It is naturally implemented on
dataflow machines. Many of the issues you raise have been solved within
the Lucid framework.

Dynamic-to-static transformation can be managed by adding extra
dimensions to streams (i.e. a one-dimensional stream can be regarded as
encoding tail recursion, a two-dimensional stream as encoding nested
recursion, etc.).
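As a hedged illustration (in Python rather than Lucid, and with names
of my own), the view of a one-dimensional stream as an encoding of
tail recursion can be sketched like this: the loop state of an
accumulating factorial is streamed out, and indexing into the stream
replaces running the recursion.

```python
# Sketch: "tail recursion as a 1-dimensional stream".
from itertools import islice

def fac_states():
    """Infinite stream of loop states (next index, accumulator)."""
    i, acc = 1, 1
    while True:
        yield i, acc
        i, acc = i + 1, acc * i

def factorial(n):
    """Read n! out of the n-th element of the state stream."""
    state = next(islice(fac_states(), n, None))
    return state[1]
```

A nested recursion would correspond to a two-dimensional stream of
such states, one stream per recursion level.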

An extensive set of semantic transformations exists and can be used to
transform programs, both during program derivation and during compilation.

We have an implementation running on transputers. We also have a partitioning
phase in the compiler that is capable of using information gained from
abstract interpretation to partition code for multiple transputers,
using a set of heuristics ranging from random to simulated annealing.

Some references:

Wadge and Ashcroft, Lucid, the Dataflow Programming Language, Academic Press, 1985

Skillicorn, Techniques for Compiling and Executing Dataflow Graphs,
IFIP World Congress, September 1989.

If you send me your postal address I will also send you a technical report.

[...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: gkj@doc.ic.ac.uk (Guido K. Jouret)
Organization: Dept. of Computing, Imperial College, London, UK.

At Imperial we're currently one year into a project called Transformations
for Parallelism where we are using Hope+ and extending it via a special
annotation language called Caliban with which you can describe static process
networks.  We are looking at 4 different kinds of parallel architectures.  The
first is the fully static, Meiko-type machine, the second is the static process
but fully interconnected type machine (next-generation Transputers), the third
is the dynamic process model which represents machines like Alice or Flagship,
and the fourth is SIMD machines.  We are starting from a common functional
specification of a program and via transformations aim to derive program forms
for a specific kind of architecture.  My part of this project is the SIMD
component.  The first phase of my work consisted of analyzing SIMD machines and
trying to derive a set of higher-order functions which capture the functionality
of existing SIMD machines.  The second phase has been to derive a transformation
algebra using these primitives.  The third phase consists of deriving data-
parallel programs using these primitives from initial specifications.  I'm
currently working on extending my work into deriving systolic algorithms via
these data-parallel operations.  I've got an initial paper which describes the
first and second phases.  I'm working on the systolic stuff at the moment and
trying to write that up soon.

[...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: phjk@doc.ic.ac.uk (Paul Kelly)

Your work looks very similar to ideas being developed here at Imperial
College, in two little related projects called T4P (Transformation for 
Parallelism) and FAST (Functional programming for Arrays of Transputers).

We are very busy implementing right now, and the only proper explanation
I know of is my PhD work, now published (a year ago) as

	Functional Programming for Loosely-coupled Multiprocessors
	Paul HJ Kelly
	Pitman/MIT Press (1989)

For machines like transputers we are planning to use the Caliban notation
presented in the book, to control static (and mostly-static) process 
networks.  We will also try to build various "harnesses" to implement
common parallel program structures (as expressed as higher-order functions)
using a finely-tuned load-balancing etc. run-time system.
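A minimal sketch of such a harness, written as ordinary higher-order
functions (in Python for illustration; `farm` and `pipeline` are my
names, not taken from the book). Run sequentially these are just map
and composition; a parallel harness would distribute the same
structure over worker transputers.

```python
def farm(worker, tasks):
    """Process farm: apply one worker function to each task."""
    return [worker(t) for t in tasks]

def pipeline(stages, item):
    """Pipeline: pass an item through the stages in order."""
    for stage in stages:
        item = stage(item)
    return item
```

For example, `pipeline([f, g], x)` applies f first, matching
left-to-right data flow through a chain of processors.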

[...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: kh@cs.glasgow.ac.uk (Kevin Hammond)
Organization: Comp Sci, Glasgow Univ, Scotland

I have a paper describing an implementation of Dactl (generalised
graph-rewriting as an intermediate code) on a transputer net. The
implementation is more at the simulation stage than finished, though.
I have used Dactl to implement a dialect of Standard ML. This is a
"fine-grained" approach with dynamic scheduling based on ZAPP. Let me
know if you'd like a copy of this paper (also to appear in my book --
Pitman). I'm now working on GRIP with Simon Peyton Jones; this is
arguably a transputer-like implementation (a cross between a shared
memory and a distributed memory system).

Murray Cole (murray@cs.glasgow.ac.uk) has written a book on
"algorithmic skeletons" for coarse-grained implementation of
applicative languages (Pitman); he's currently working on a transputer
implementation.

Paul Kelly (phjk@doc.ic.ac.uk) has also written a book in the Pitman
series. He's working on a language called Caliban which expresses
control notions dynamically at compile-time (with bounded computation).
Again, this is a coarse-grain approach (though this could be disputed).
There's a project at Southampton under the direction of Hugh Glaser
(hg@ecs.soton.ac.uk) whose intention is to implement this scheme on a
transputer network.

Sorry about plugging the Pitman series (Parallel and Distributed
Computing) like this, but they seem to have cornered the UK PhD
publication market in this area!

Rita Loogen (rita@rwthinf.uucp) is working on an implementation of a
functional logic language on a transputer system. Her thesis has been
published by Springer-Verlag. [The German title is "Parallele
Implementierung funktionaler Programmiersprachen", in translation
"Parallel Implementation of Functional Programming Languages".]


David McBurney (dlm@sys.uea.ac.uk) has been working on an
implementation of Concurrent Clean on the transputer. Clean is
described in a paper by Rinus Plasmeijer, Marko van Eekelen et al.
(Proc. FPCA '87, LNCS #274, pp. 364-384).

[...]

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From: sjohnson@iuvax.os.indiana.edu (Steve Johnson)
Organization: Indiana University, Bloomington

[...]

I am investigating topics in hardware synthesis using functional algebra
as a transformation medium.  One of the initial topics had to do with
translating functional "behavioral specifications" into stream-based
"structural descriptions".  A synopsis of the results of my dissertation
can be found in the 11th ACM Principles of Programming Languages Symposium
(POPL).  The title of the article is "Applicative Programming and
Digital Design".  Here at Indiana we have implemented a core transformation
system for "Digital Design Derivation" (DDD) which is used to manipulate
stream networks in order to obtain hardware descriptions.  An early
description of DDD appears in the book "VLSI Specification, Verification
and Synthesis" by G. Birtwistle and P.A. Subrahmanyam (eds., Kluwer 1987).
A more recent development of the algebra will appear in the proceedings of
a workshop, Mathematical Aspects of Hardware Specification, Verification
and Synthesis, by M. Leeser and G. Brown (eds., Springer, 1991?).

Of course, I believe that the work we are doing in hardware synthesis has
significant applications in parallel program development, and conversely,
but I am not currently working on that topic.

I am also interested in the use of streams in functional programming.  I
believe they are the key to realistic compilation for parallelism.  I am
involved in a project for MIMD parallelism.  Our research vehicle is
a BBN Butterfly.  We have implemented a parallel graph processing
system called DSI, based on a "lazy cons".  The I/O discipline, based
on streams, integrates device drivers directly into the graph space.
We (myself, David Wise, Daniel Friedman, John O'Donnell (now at Glasgow),
and others) have, for a number of years, been looking at aspects of
systems programming in functional languages.  A paper published in 1988
(I think) by O'Donnell in the journal Lisp and Symbolic Computation,
called "Dialogues", is a nice treatment, I think.
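A hedged sketch of the "lazy cons" idea (in Python, with an explicit
suspension to make the mechanism visible; DSI itself is a graph
processing system, and these names are my own):

```python
# A "lazy cons" cell: the tail is a suspension (a zero-argument
# function), forced only on demand.  An input stream built this way
# could stand in for a device driver: each force fetches the next datum.

def cons(head, tail_thunk):
    return (head, tail_thunk)

def nums_from(n):
    """Infinite stream of consecutive integers, built lazily."""
    return cons(n, lambda: nums_from(n + 1))

def take(k, stream):
    """Demand the first k elements, forcing only that much."""
    out = []
    while k > 0:
        head, tail_thunk = stream
        out.append(head)
        stream = tail_thunk()
        k -= 1
    return out
```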

[...]

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = 
END OF SUMMARIZATION

Some Language and (sequential) Implementation Aspects of OPAL 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Let me quote from an internal paper:

" (..)

o	The overall appearance of OPAL programs exhibits a strong
	algebraic flavour.

o	OPAL programs are built up from so-called "structures"
	that are organized in a hierarchical usage relation and comprise
	import and export interfaces.

o	The orientation towards algebraic specification languages, as
	well as the interest in efficient implementations, leads
	to a strong typing discipline. This does not, however, exclude
	features like parameterization, overloading, and polymorphism.

o	The requirement of producing truly efficient code makes (given
	the current state of our knowledge) eager evaluation mandatory.

o	In order to study programming concepts for distributed systems,
	OPAL comprises streams and stream processing functions. These
	are in particular also used for input and output. In addition,
	their presence allows us to experiment with lazy calling
	mechanisms.

(..) it should not come as a surprise that OPAL resembles other applicative 
languages such as ML or - even more closely - HOPE".

There is one further feature that is particularly important for optimizing
compilers:

o	The programmer may formulate guarded rewriting rules, so-called
	laws, to supplement his (constructive) equations.
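To illustrate (a sketch only, in Python; the term encoding and the
example law rev(rev(x)) = x are my own, not OPAL's): such a law can be
applied by a compiler as a rewrite rule over program terms.

```python
# A toy term representation and a single rewrite rule standing in for
# a "law": rev(rev(t)) = t.  Terms are ("app", name, args) or
# ("var", name); the rule is applied in one top-down pass.

def rewrite(term):
    if term[0] == "app" and term[1] == "rev":
        (arg,) = term[2]
        if arg[0] == "app" and arg[1] == "rev":
            (inner,) = arg[2]
            return rewrite(inner)      # the law fires: drop both revs
    if term[0] == "app":
        return ("app", term[1], [rewrite(t) for t in term[2]])
    return term                        # a variable: nothing to do
```

A real system would of course dispatch over many such rules and check
their guards before firing them.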

So much for the language design. A language report is currently not
available; the language is still in flux, and no draft shall leave
the laboratory, so the professor says.

The (sequential) implementation of OPAL I am contributing to tries to
meet the following goals:

o	There are a lot of theoretically well-known optimizing source
	transformations: recursion simplification via associativity
	and commutativity, fusion of producers and consumers of data
	objects, tupling to avoid multiple traversals of data structures,
	partial evaluation, and others. The compiler tries to implement
	a strategy which applies a number of these transformations via
	a builtin term rewriting system (which supports pattern constructs
	for indexed variables and position-independent notation).

o	The keys to efficient compilation are suspected to lie at the
	imperative level (relative to the applicative origin).
	The compiler employs a "lazy reference counting" garbage
	collection scheme, runtime-detected selective update, cell
	reuse, and tail-recursion "modulo constructor" removal. The
	reference counting overhead shall be kept down by global
	sharing analysis (we hope). Single-threaded parameter
	globalization and stacking - especially of closures - will be
	applied.
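Two of the source transformations named above, fusion and tupling,
sketched in Python (the function names are illustrative; OPAL's
rewriting system of course works on its own term representation):

```python
# Fusion: producer (map) and consumer (sum) of an intermediate list
# are fused into a single traversal with no intermediate structure.

def sum_squares(xs):
    """The specification: square each element, then sum the results."""
    squares = [x * x for x in xs]      # intermediate list
    return sum(squares)

def sum_squares_fused(xs):
    """After fusion: one traversal, no intermediate list."""
    acc = 0
    for x in xs:
        acc += x * x
    return acc

# Tupling: sum and length are computed together, so the list is
# traversed once instead of twice.

def average(xs):
    s, n = 0.0, 0
    for x in xs:
        s, n = s + x, n + 1
    return s / n
```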

We expect to finish a prototype which produces portable C code
"tomorrow", but an official version not before the end of the year.

People interested in an older report from August 1989 about some
of these topics (from which I also quoted the language design
issues above) might mail to wolfram@opal.cs.tu-berlin.de or
tub!tubopal!wolfram, who is one of the authors. (If you have already
asked for reports, please mail again; I'm not sure whether you are
interested in these more classical aspects.)

Regarding parallel implementations, I guess that the base concepts of
the sequential implementation can be taken over when moving to
coarse-grain architectures. At the applicative level I assume that
parallelization by deriving stream networks from plain function
systems has a lot to do with (reverting) fusion and tupling. The
decision when to turn eager data structures into streams might rest on
conditions similar to those for plain (non-reverted) fusion (i.e. objects
are consumed in an order comparable to that in which they are produced).
At the imperative level, attention has to be paid to the sharing problem
with respect to partitioning. But this is a bit speculative, and I will
study these problems in more depth in the near future.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Wolfgang Grieskamp  			SERIOUSLY: "Alle Lust will Ewigkeit!" 
wg@opal.cs.tu-berlin.de        tub!tubopal!wg         wg%opal@DB0TUI11.BITNET