[comp.parallel] Level of synchronization

klietz@uxe.cso.uiuc.edu (Alan Klietz) (08/25/89)

The kinds of algorithms and languages that are practical in
terms of performance on a parallel architecture seem to be
critically related to the synchronization level of the hardware.

For example, in a coarse grained system (e.g. Intel cube, shared memory,
transputer, etc.), some sort of message passing paradigm is often
used.  In a fine grained system (e.g. Connection Machine), a dataflow
approach is often used.

Some questions:

Can we identify a taxonomy of algorithms based on levels of
synchronization that are best suited for each type of architecture?
For example, it is pretty obvious that a real time application
involving numerous discrete inputs in an auto factory is more
appropriate to a message passing/queueing model, whereas solving
a Schroedinger wave equation over a vector field is well suited
to a dataflow approach.  Can we generalize this based on fundamental
characteristics of the problem domain and/or algorithm?
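
To make the contrast concrete, here is a rough sketch of the two styles
in Python.  Everything in it is an assumption made for illustration: the
function names are hypothetical, a thread-safe queue stands in for
message passing between coarse grained workers, and a lockstep
neighbour-averaging update on a 1-D ring stands in for the fine grained
(dataflow-like) case.  It is not a wave equation solver and says nothing
about real hardware performance.

```python
import queue
import threading


def message_passing_sum(events, n_workers=2):
    """Coarse grained: workers synchronize only when a message arrives."""
    inbox = queue.Queue()
    results = queue.Queue()

    def worker():
        total = 0
        while True:
            item = inbox.get()   # block until a message arrives
            if item is None:     # sentinel: no more work for this worker
                break
            total += item
        results.put(total)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for e in events:
        inbox.put(e)             # discrete inputs arrive in any order
    for _ in threads:
        inbox.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return sum(results.get() for _ in threads)


def lockstep_relax(grid, steps):
    """Fine grained: every cell is updated in the same step from its neighbours."""
    for _ in range(steps):
        n = len(grid)
        # The whole new grid is built from the old one, so each step acts
        # as a global synchronization point across all cells.
        grid = [(grid[(i - 1) % n] + grid[(i + 1) % n]) / 2.0 for i in range(n)]
    return grid
```

In the first function the workers run independently and meet only at the
queue.  In the second, every element advances together, which is closer
to how a SIMD machine would be driven.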

Is one level of synchronization inherently more general than the
other?   For example, under what conditions can fine grained
synchronization effectively simulate coarse grained synchronization?

Is the best approach a combined model, where PEs can roam
independently and then chain gang together when necessary?
That is, something that acts like, say, a Transputer when you want
to do coarse grained MIMD and then acts like a Connection Machine
when you want to do fine grained SIMD?

--
Alan E. Klietz
University of Illinois at Urbana-Champaign
National Center for Supercomputing Applications
152 Computing Applications Building
605 E. Springfield Avenue
Champaign, IL  61820
Ph: +1 217 244 8024	       ARPA:  aklietz@ncsa.uiuc.edu