cik@l.cc.purdue.edu (Herman Rubin) (08/29/88)
There are situations in which it is likely that processors operating in parallel may have to address the same item in memory at the same time. What are the consequences in program time for conflicts of this type? What hardware features are there to possibly circumvent this problem if it is serious?

-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
fpst@hubcap.clemson.edu (Steve Stevenson-Moderator) (08/29/88)
> There are situations in which it is likely that processors operating in
> parallel may have to address the same item in memory at the same time.
> What are the consequences in program time for conflicts of this type?
> What hardware features are there to possibly circumvent this problem if
> it is serious?

This is a very old and serious problem. It has occupied a lot of pages in the performance literature. Hardware folks became aware of it early on. Given your ``residence'' you might try Coffman and Denning, "Operating Systems Theory" --- old, but it has all the queuing stuff in it. There are probably newer treatments, but the problem's the same.

Steve Stevenson                              fpst@hubcap.clemson.edu
(aka D. E. Stevenson),                       fpst@prism.clemson.csnet
Department of Computer Science,              comp.parallel
Clemson University, Clemson, SC 29634-1906   (803)656-5880.mabell
dfk@grad4.cs.duke.edu (David F. Kotz) (08/29/88)
In article <2886@hubcap.UUCP>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> There are situations in which it is likely that processors operating in
> parallel may have to address the same item in memory at the same time.
> What are the consequences in program time for conflicts of this type?
> What hardware features are there to possibly circumvent this problem if
> it is serious?

The consequences of memory conflicts are quite serious and can completely bog down a parallel program. The results depend strongly on the particular architecture and, of course, on the implementation of the algorithm.

Usually the parallel processes in a program need not access the same memory location to conflict; it is enough that they access the same memory *module*. A standard memory module can service only one memory access at a time, so all accesses to that module are serialized. Most parallel computers (and many uniprocessor computers) have several memory modules, so the best strategy is for the processors to spread their accesses among the memories as much as possible, and to avoid accessing memory when it is not necessary.

Further conflicts come from the communication network used to connect the processors to the memories. In a bus-based architecture, all memory accesses go through the single bus and are serialized at that point, an even worse bottleneck than serialization at the memory module. Other interconnection architectures include omega (Butterfly) networks, crossbar switches, and hypercubes.

I have personally seen extreme cases of memory conflicts on a BBN Butterfly parallel processor, not only from a program banging on one particular memory location (a shared counter or central pointer) but also due to unbalanced access to the memories of the processors.

One hardware alternative is the combining network, which was planned for the NYU Ultracomputer. Memory accesses passing through this hierarchical network are *combined* when they refer to the same memory location, so only one access reaches the memory module itself, and the result is re-expanded on the way back through the network. Multiple reads of a location thus produce a single read at the module, whose value is sent back to all requesting processors. For multiple writes, one of the values is arbitrarily chosen to be the "last" one written, and that value is written into the location. Fetch-and-add is another, more complex operation that atomically performs x = x + y and also exploits the combining feature.

Some other reading:

@book{stone:book,
  author    = "Harold S. Stone",
  title     = "High Performance Computer Architecture",
  publisher = "Addison-Wesley",
  year      = 1987
}

@article{gottlieb:nyu,
  author  = "Allan Gottlieb and Ralph Grishman and Clyde P. Kruskal and
             Kevin P. McAuliffe and Larry Rudolph and Marc Snir",
  title   = "The {NYU Ultracomputer} --- {Designing} an {MIMD} Shared
             Memory Parallel Computer",
  journal = "IEEE Transactions on Computers",
  year    = 1983,
  month   = feb,
  volume  = "C-32",
  number  = 2,
  pages   = "175--189"
}

David Kotz
Department of Computer Science, Duke University, Durham, NC 27706
ARPA: dfk@cs.duke.edu   CSNET: dfk@duke   UUCP: decvax!duke!dfk
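A concrete sketch of the module-serialization effect Kotz describes, assuming low-order interleaving where (word address mod number-of-modules) selects the module; the 8-module count and the array shape are assumptions for illustration, not details of any particular machine. Walking down a column of a two-dimensional array whose row length is a multiple of the module count sends every access to the same module, so the accesses are serialized; padding each row by one extra word makes the stride coprime to the module count and spreads the accesses across all modules.

/* Sketch under assumed 8-way low-order interleaving: the module that
 * serves a word is (word address mod NUM_MODULES). */
#include <stdio.h>

#define NUM_MODULES 8   /* assumed module count */
#define ROWS 8
#define COLS 64

static void count_column_walk(int row_len)
{
    long hits[NUM_MODULES] = {0};
    /* Walk down one column of a ROWS x row_len array: successive
     * accesses are row_len words apart. */
    for (int r = 0; r < ROWS; r++) {
        long word_addr = (long)r * row_len;   /* address of element (r, 0) */
        hits[word_addr % NUM_MODULES]++;      /* module that serves it */
    }
    printf("row length %3d:", row_len);
    for (int m = 0; m < NUM_MODULES; m++)
        printf(" %ld", hits[m]);
    printf("\n");
}

int main(void)
{
    /* Stride 64 (a multiple of 8): all 8 accesses land in module 0
     * and are fully serialized. */
    count_column_walk(COLS);
    /* Stride 65 (coprime to 8): one access per module, so they can
     * all proceed in parallel. */
    count_column_walk(COLS + 1);
    return 0;
}

Run as written, the first line of output shows all eight accesses in module 0 and the second shows one access per module, which is the whole point of padding.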
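The fetch-and-add primitive can also be sketched in software, here using C11 atomic_fetch_add with POSIX threads as a stand-in for the hardware operation; this is an illustrative assumption, since on a combining network like the Ultracomputer's the simultaneous fetch-and-adds to the same word would be combined in the switches rather than serialized at the module. Each thread draws distinct loop indices from a shared counter without any lock, one of the coordination idioms fetch-and-add was designed to support.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define N        1000
#define NTHREADS 4

static atomic_int  next_index = 0;  /* shared index counter */
static atomic_long sum        = 0;  /* stands in for real shared work */

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        /* fetch-and-add: atomically returns the old value of
         * next_index and performs next_index = next_index + 1,
         * so every thread receives a distinct index, lock-free. */
        int i = atomic_fetch_add(&next_index, 1);
        if (i >= N)
            break;
        atomic_fetch_add(&sum, (long)i);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    for (int p = 0; p < NTHREADS; p++)
        pthread_create(&t[p], NULL, worker, NULL);
    for (int p = 0; p < NTHREADS; p++)
        pthread_join(t[p], NULL);
    /* 0 + 1 + ... + 999 = 499500, regardless of thread interleaving */
    printf("sum = %ld\n", (long)atomic_load(&sum));
    return 0;
}

Compile with -pthread. Every index in 0..N-1 is claimed by exactly one thread, so the printed sum is always 499500 no matter how the threads interleave.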