baldwin@cs.rochester.edu (Douglas Baldwin) (06/02/88)
[ I picked this off comp.lang.misc -Steve ]

Having thought a fair bit about parallel programming, why it's hard to do, and how it ought to be done, here are my comments (for what they're worth).

Coordinating a multiprocessor to do something IS harder than getting a uniprocessor to do it - someone has to indicate how the overall application is broken up into processes, how and when those processes communicate, and how they have to be synchronized in order to avoid a whole host of problems (deadlock, starvation, violation of data dependencies, etc.). When programmers have to describe all of this explicitly, in addition to describing the basic function they want computed, it's no wonder that they have a hard time. This suggests that IMPLICIT parallelism (i.e., parallelism detected and managed by a compiler or run-time system) is preferable to explicit parallelism.

Conventional imperative languages essentially describe state machines - not necessarily machines with a finite number of states, but still machines whose current actions depend on the state they are currently in. This means that programs (i.e., state machines) written in these languages had better sequence through their states in just the right order if they are to produce the results that programmers expect. This dependence on the order of states makes all imperative languages (i.e., the whole Algol family and its close relatives, including object-oriented languages) inherently sequential, and any attempt to extract parallelism from these languages fights a losing battle against this sequentiality. I'd even argue that attempts to reason about the correctness of programs written in explicitly parallel imperative languages face the same battle.

Declarative languages (functional, dataflow, logic, etc.) do seem better suited to implicitly parallel programming, although they have problems of their own.
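To make the order-of-states point concrete, here is a minimal sketch (mine, not from the original post; the function names are made up for illustration): the same two state transitions yield different final states depending on the order in which they run, so a parallelizing compiler must first prove the steps independent before it can run them concurrently.

```python
# Two state updates on a single piece of state. Because each update
# reads the state the other one writes, their order matters - which
# is exactly the dependence on state order that makes imperative
# programs inherently sequential.

def run(order):
    """Apply the named transitions to state x = 0 in the given order."""
    x = 0
    steps = {
        "a": lambda x: x + 1,   # transition A: increment
        "b": lambda x: x * 2,   # transition B: double
    }
    for name in order:
        x = steps[name](x)
    return x

print(run("ab"))  # -> 2  (increment, then double)
print(run("ba"))  # -> 1  (double, then increment)
```

Only when an analysis can show two such steps touch disjoint state may they be scheduled in parallel; here they cannot.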
The main one is that, to varying degrees, all of the declarative languages I am aware of have trouble expressing general data-parallel computations. The cleanest way to describe a computation over all elements of a data structure is via some sort of recursive traversal of that structure, and the recursion introduces apparent data dependencies that serialize the computation. This problem could probably be fixed by proper language design, and in fact that's what my research is about - things called "constraint languages" that I think avoid a lot of the problems other languages have in supporting implicitly parallel programming.

Just to put in a plug for myself, more details on the above argument can be found in "Why We Can't Program Multiprocessors the Way We're Trying to Do It Now", Technical Report 224, University of Rochester Computer Science Department, by me. I'll be glad to have copies sent to anyone who asks.

One final comment on Mark's "would loops help GHC?" question. I suspect not. I haven't used GHC (Guarded Horn Clauses) myself, although I've dabbled in Prolog and followed some of the literature on parallel logic languages. To use any language well you really need to think in its idioms and paradigms, and if GHC is anything like other logic languages the right paradigms are recursion, backtracking, simultaneous search for all solutions to a goal, etc., not explicit loops. Maybe as you get more used to GHC you'll stop wishing for loops so much.
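The recursion-induced serialization described above can be sketched as follows (again my own illustration, not the author's): a linear recursive traversal chains every step on the result of the next, while a divide-and-conquer traversal of the same data splits it into independent halves that could, in principle, be evaluated in parallel.

```python
# Linear recursion: each call waits on the one below it, so the
# apparent data dependencies force O(n) sequential depth.
def linear_sum(xs):
    if not xs:
        return 0
    return xs[0] + linear_sum(xs[1:])

# Divide and conquer: the two halves share no data, so they are
# independent subcomputations with O(log n) depth - the shape an
# implicitly parallel implementation could exploit.
def tree_sum(xs):
    if len(xs) <= 1:
        return xs[0] if xs else 0
    mid = len(xs) // 2
    return tree_sum(xs[:mid]) + tree_sum(xs[mid:])

print(linear_sum([1, 2, 3, 4]))  # -> 10
print(tree_sum([1, 2, 3, 4]))    # -> 10
```

Both compute the same sum; only the second exposes the data parallelism, which is why the choice of traversal shape, and not just the language, determines what a compiler can extract.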