[comp.parallel] Determinacy Issues in Parallel Programs

adamb@ncar.UCAR.EDU (Adam 'Velvis' Beguelin) (08/11/89)

I would like to discuss the value of determinacy in parallel programming.

If you knew a parallel program was determinate or nondeterminate
before you ran it, then it would make writing and 
debugging the program easier.  If the program was determinate
you would know that errors produced by the
program were not an artifacts of race conditions in your program.
If you knew that the program may be nondeterminate, you could look
to those portions of the program which may cause nondeterminacy
to see if these sections of the code contribute to the incorrectness
of the computation.  If you expected the program to be determinate 
and were told that it was nondeterminate, this would also be 
useful information.

Does anyone have any comments on the above argument?

Also related to determinacy, some languages (i.e. functional languages)
enforce determinacy, while most parallel languages do not.   Enforcing
determinacy restricts the number of computations one may easily express.
A client server model is naturally nondeterministic, for instance.
Does the average Joe parallel programmer need the ability of writing
nondeterministic programs?  Would he be better off with a language
that encouraged or enforced determinacy, even though the set of computations
expressible in such a language may be smaller?

Finally, does anyone have references to papers that discuss such topics?
I know of a papers on the Blaze and Linda languages which touch on these
issues but I'm not aware of anything that explicitly discusses the pros
and cons of determinacy in parallel programming.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adam Beguelin 			Computer Science Department Box 430 
adamb@boulder.Colorado.Edu		     University of Colorado
303/492-7906				      Boulder, CO 80309-430

dinucci@cse.ogc.edu (David C. DiNucci) (08/18/89)

In article <6243@hubcap.clemson.edu> boulder!boulder!adamb@ncar.UCAR.EDU (Adam 'Velvis' Beguelin) writes:
>If you knew a parallel program was determinate or nondeterminate
>before you ran it, then it would make writing and 
>debugging the program easier.

(Hi Adam!)  I'm not sure what you mean by easier to write, since you
would do that before you ran it, but since most of your other comments
address debugging, so will I.  

>If the program was determinate you would know that errors produced by
>the program were not an artifacts of race conditions in your program.

You would also know that you could rerun the program within a debugger
and observe it closely without altering its behavior.

>If you knew that the program may be nondeterminate, you could look
>to those portions of the program which may cause nondeterminacy
>to see if these sections of the code contribute to the incorrectness
>of the computation.

...and if a run-time system knew which portions of the program could
allow nondeterminacy, just the results of those non-determinate choices
could be traced, yielding enough information to replay the entire
execution within a debugger while forcing the same behavior.

>If you expected the program to be determinate 
>and were told that it was nondeterminate, this would also be 
>useful information.

This is the real heart of the matter.  In most languages and models,
non-determinacy is an emergent phenomenon:  it's not specified
by the programmer in any one place, but rather creeps in based on how
everything works together, usually unnoticed by the programmer.  Yet,
the programmer *knows* whether the program *should* be determinate or
not, based on the problem specification.

>Also related to determinacy, some languages (i.e. functional languages)
>enforce determinacy, while most parallel languages do not.   Enforcing
>determinacy restricts the number of computations one may easily express.

A non-deterministic choice is one where the program does not specify
which choice is to be made.  (Thus, in my opinion, the concept of "fair
non-determinism" is confused.)  It does not mean that a choice will be
made in such a complex, pseudo-random way that the user will not be able
to guess how it was made, but this is the best that a functional language
can do when the problem statement includes non-determinacy.

There is an implicit assumption that the computer will make the choice
which can be acted on the soonest, even though this is not really part
of the definition of non-determinacy.)

>Does the average Joe parallel programmer need the ability of writing
>nondeterministic programs?  Would he be better off with a language
>that encouraged or enforced determinacy, even though the set of computations
>expressible in such a language may be smaller?

He/She needs nondeterminism if maximum performance is desired from a parallel
computer, and most people would agree that this is kind of important.

It comes down to the notion that there is often a set of acceptable
answers to a computation or sub-computation, each element of the set
representing a different function evaluated on the computation's inputs.
The elements may be simple permutations of each other, may have
different types of roundoff, may be different local minima in a state
space, etc., but they are all acceptable answers.  The goal is to
compute the one that can be computed the soonest, subject to the availability
of the input data, the temperature of the machine room, the activity on
the disk drive, etc.  Unless you wish to build all these parameters into
a function, you are going to need the ability to write non-deterministic
programs.

>Finally, does anyone have references to papers that discuss such topics?
>I know of a papers on the Blaze and Linda languages which touch on these
>issues but I'm not aware of anything that explicitly discusses the pros
>and cons of determinacy in parallel programming.

Try Chandy, Misra's book on Unity.

M. J. Manthey, "Hierarchy in Sequential and Concurrent Systems", in
"Characteristics of Parallel Algorithms", ed. Jamieson, Gannon,
Douglass, MIT Press, 1987.

T. J. LeBlanc, J Mellor-Crummey, "Debugging parallel programs
with instant replay", IEEE Trans. on Computers, April 87

You might also look at the LGDF2 (F-Nets) stuff we're working on,
especially in the HICSS-21 (1988), and COMPCON Spring '89 proceedings,
as well as tech reports 88-005 (Design of a Debugger based on LGDF2),
and 88-006 (A Formal model of computation for LGDF2).

In my soon-(I hope ;-)-to-be-finished dissertation, I provide a form
for expressing any determinate parallel program (shared memory or
message passing) so that it can be known determinate by inspection,
without restricting communication-control patterns between processes.

>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>Adam Beguelin 			Computer Science Department Box 430 
>adamb@boulder.Colorado.Edu		     University of Colorado
>303/492-7906				      Boulder, CO 80309-430
-- 
David C. DiNucci                UUCP:  ..ucbvax!tektronix!ogccse!dinucci
Oregon Graduate Center          CSNET: dinucci@cse.ogc.edu
Beaverton, Oregon

reiher@amethyst.Jpl.Nasa.Gov (Peter Reiher) (08/18/89)

In article <6280@hubcap.clemson.edu> dinucci@cse.ogc.edu (David C. DiNucci) writes:
>In article <6243@hubcap.clemson.edu> boulder!boulder!adamb@ncar.UCAR.EDU (Adam 'Velvis' Beguelin) writes:
>
>>If the program was determinate you would know that errors produced by
>>the program were not an artifacts of race conditions in your program.

. . . 

>He/She needs nondeterminism if maximum performance is desired from a parallel
>computer, and most people would agree that this is kind of important.

Our work on the Time Warp Operating System (TWOS) has demonstrated two levels of
determinism.  TWOS runs event-driven simulations on parallel processors, using
an optimistic model with rollback and message cancellation for synchronization.
We demand deterministic results from the execution of the simulation itself.
TWOS permits the use of no language constructs that would knowingly lead to
non-determinism, such as reading the system clock.  But we do not demand that 
TWOS proceed deterministically in getting there - in fact, our methods 
practically guarantee that it won't.  

A non-deterministic answer is only useful if you can guarantee how far off it
is, so that you can determine if it is in the reasonable range of correct
values.  If proper care is taken with round-off errors, etc., one can say
something about how far off the answer could be.  Being incautious with
synchronization of events is much harder to quantify.

The implications for debugging should be that the user can count on determinism
from his program, and can debug accordingly.  Of course, TWOS is an experimental
system, so it's still got its own bugs, some of which cause non-determinism.
Also, lacking the necessary hardware support, TWOS currently has no memory
protection, so a suitably incautious user can tromp on anything; one result
can be non-determinism.  Still, when we get non-deterministic results, we
usually find the problem in the system.  Determinism makes debugging much 
easier for the users.

I don't think that determinism is costly for our system's performance.  We're
getting pretty good performance, given the fundamentally difficult nature of
the problems.  Of course, the determinism is limited to the committed results
of the users' programs, so we do not pay the performance cost for making the
actions of TWOS itself deterministic.  That cost would be immense.

>The goal is to
>compute the one that can be computed the soonest, subject to the availability
>of the input data, the temperature of the machine room, the activity on
>the disk drive, etc.  Unless you wish to build all these parameters into
>a function, you are going to need the ability to write non-deterministic
>programs.

Or the operating system can mask these effects, as TWOS attempts to.  You may
pay some performance cost, but our experience suggests that, for at least one
important class of programs, the performance is still quite good.  Given that
deterministic programs are easier to deal with than non-deterministic programs,
you've got to balance what you pay against what you get.
			Peter Reiher
			reiher@amethyst.jpl.nasa.gov
			(DO NOT send to reiher@amethyst.uucp)
			. . . cit-vax!elroy!jato!jade!reiher