hankd@dynamo.ecn.purdue.edu (Hank Dietz) (09/04/90)
In article <41233@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>What sounds simple (have some instructions to access bitfields)
>can have all sorts of ramifications, and the only way to figure them out
>is to track down ALL the details. Simple things can surprise you in
>cycle time or latency hits.

Actually, my pet problem in dealing with bitfields is the atomicity issue.
Parallel languages such as PCF (Parallel Computing Forum) Fortran specify
that *ALL* operations on language data objects are to be enacted as atomic
operations -- but how, for example, does one make parallel byte accesses
within a single word atomic?  If that doesn't sound bad enough, now imagine
arbitrary bit fields being accessed in parallel....  Of course, you get a
similar problem when a multi-word object is accessed by several processors
simultaneously: unless you insert some synchronization, you might end up
with an object in which each word came from a different object value.

I have a PhD student (George Ju) working on compiler optimization of data
layout, with the primary goal of helping to maintain multiprocessor cache
coherence.  That work significantly reduces the impact of these atomicity
mismatches, but the basic hardware problem remains.

So, how does one address atomicity issues in (multiprocessor) hardware?
I think we can all agree at least that the processor ought not be doing
the cutting and pasting of bitfields, because that simply extends the time
during which the word containing the field is unavailable to parallel
processors... not a good thing.  The question is hence how to make the
memory interface most efficiently simulate atomicity for arbitrary-size
data objects....  Thus far, I think the answer has been something along
the lines of "well, scientific codes only work on word-size objects anyway."
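To make the cutting-and-pasting cost concrete, here is a minimal sketch of
what software has to do to fake an atomic byte store inside a shared word
when the hardware only gives you word-wide atomic primitives.  It assumes a
32-bit word and a word-wide compare-and-swap, written here with C's
<stdatomic.h> operations purely so the sketch is complete and compilable;
the function name, word size, and byte numbering are illustrative
assumptions, not any particular machine's interface.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* One 32-bit word shared by several processors; each byte in it is a
   separate "language object" that must be updated atomically.          */
static _Atomic uint32_t shared_word = 0;

/* Hypothetical helper: atomically store new_byte into byte 'i' (0..3,
   counted from the low-order end of the word value) without disturbing
   its neighbors.  The whole word is read, the field is spliced in, and
   the result is written back only if no other processor has changed the
   word in the meantime.                                                 */
void store_byte_atomic(int i, uint8_t new_byte)
{
    uint32_t old  = atomic_load(&shared_word);
    uint32_t mask = (uint32_t)0xFF << (8 * i);
    uint32_t desired;
    do {
        desired = (old & ~mask) | ((uint32_t)new_byte << (8 * i));
    } while (!atomic_compare_exchange_weak(&shared_word, &old, desired));
}

int main(void)
{
    store_byte_atomic(0, 0x11);
    store_byte_atomic(2, 0x33);
    printf("word = 0x%08x\n", (unsigned)atomic_load(&shared_word));
    return 0;
}

Note that every sub-word store turns into a read-modify-write loop over
the containing word, so the word stays contended for the whole splice --
which is exactly why pushing this job out toward the memory interface,
rather than doing it in the processor, looks attractive.

-hankd@ecn.purdue.edu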