[comp.arch] Object atomicity for multiprocessor hardware

hankd@dynamo.ecn.purdue.edu (Hank Dietz) (09/04/90)

In article <41233@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>What sounds simple (have some instructions to access bitfields)
>can have all sorts of ramifications, and the only way to figure them out
>is to track down ALL the details.  Simple things can surprise you in
>cycle time or latency hits.  

Actually, my pet problem in dealing with bitfields is the atomicity issue.
For example, parallel languages such as PCF (Parallel Computing Forum)
Fortran specify that *ALL* operations on language data objects are to be
enacted as atomic operations -- but how, for instance, does one make
parallel byte accesses within a single word atomic?  As if that weren't
bad enough, now imagine arbitrary bit fields being accessed in
parallel....  Of course, you get a similar problem when a multi-word object
is accessed by several processors simultaneously: unless you insert some
synchronization, you might read an object in which each word came from a
different object value.
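
To make the byte case concrete, here is a rough C sketch -- the struct,
the 4-byte word, and the little-endian byte numbering are illustrative
assumptions, not any particular machine -- of what a word-oriented memory
system effectively does when two processors store into neighboring bytes
of the same word:

    /* Two logically independent byte "objects" that share one word. */
    struct twobytes { char a; char b; char pad[2]; };
    struct twobytes s;

    /* What "s.a = x" becomes when memory is only word-addressable:
       read the containing word, splice in the byte, write it back.
       (The pointer punning just mimics the hardware's view.)         */
    void store_a(char x)
    {
        unsigned w = *(unsigned *)&s;              /* read word       */
        w = (w & ~0xFFu) | (unsigned char)x;       /* splice byte 0   */
        *(unsigned *)&s = w;                       /* write word back */
    }

    void store_b(char y)
    {
        unsigned w = *(unsigned *)&s;
        w = (w & ~0xFF00u) | ((unsigned)(unsigned char)y << 8);
        *(unsigned *)&s = w;
    }

    /* If processor 1 runs store_a() while processor 2 runs store_b(),
       both may read the same old word before either writes it back;
       whichever store lands last silently undoes the other byte's
       update, even though the two "objects" are logically disjoint. */

The multi-word case is just the mirror image: a reader that picks up the
words of a larger object one at a time can come away with some words from
the old value and some from the new.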

I have a PhD student (George Ju) working on compiler optimization of data
layout, with the primary goal of helping to maintain multiprocessor cache
coherence.  The result is that the impact of these atomicity mismatches is
significantly reduced, but the basic hardware problem remains.
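
One obvious transformation of that general flavor -- not necessarily what
George's optimizer actually does -- is to pad sub-word objects that are
updated by different processors out to separate words.  A C sketch, where
NPROC and WORD_SIZE are made-up illustrative constants:

    #define NPROC     4       /* illustrative processor count         */
    #define WORD_SIZE 4       /* illustrative word size, in bytes     */

    /* Layout that invites trouble: one byte flag per processor, all
       packed into a single word, so every flag update is a word-wide
       read-modify-write race against the neighboring flags.          */
    char flag[NPROC];

    /* Padded layout: each flag gets a word of its own (a cache line
       of its own would be better still for coherence traffic), so an
       ordinary word store is already atomic with respect to every
       other processor's flag.                                        */
    struct padded_flag { char f; char pad[WORD_SIZE - 1]; };
    struct padded_flag flag_padded[NPROC];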

So, how does one address atomicity issues in (multiprocessor) hardware?  I
think we can all agree, at least, that the processor ought not to be doing
the cutting and pasting of bitfields, because, if it does, that simply
extends the time during which the word containing the field is unavailable
to other processors...  not a good thing.  The question, then, is how to
make the memory interface simulate atomicity for arbitrary-size data
objects as efficiently as possible....  Thus far, I think the answer has
been something along the lines of "well, scientific codes only work on
word-size objects anyway."

						-hankd@ecn.purdue.edu