aglew@ccvaxa.UUCP (10/31/87)
/* Written 1:28 pm Oct 26, 1987 by viggy@hpsal2.HP.COM in ccvaxa:comp.arch */
/* ---------- "H/W Write Buffers, S/W Synchronizat" ---------- */

Hardware Write Buffers and Software Synchronization

I have been looking into a problem involving software synchronization using shared variables in tightly coupled, private cache multiprocessor systems with hardware write buffers. I would like to hear other people's experiences with the problem: should an architecture allow the performance advantage of write buffers and restrict the way software synchronizes?

Hardware write (store) buffers provide queueing to smooth out the total instruction flow by allowing the execution unit to proceed in spite of unpredictable delays caused by the storage unit (cache miss). In a shared memory, private cache (write back) multiprocessor system, a write buffer can cause temporary staleness of data. If such data is being shared between processes that are executing on different processors, as in the example below, there can be serious problems with inconsistencies or deadlocks.

    Master                              Slave

    Create work;                        Consume work;
    Block;                              completed++;
    available++;                        if (completed < available)
    if ((available - completed) > 1)        wakeup(master);
        sleep;                          else sleep;
    else wakeup(slave);

In this example, synchronization is accomplished through modification of the shared variables 'available' and 'completed'. Changes to these variables are not instantaneously visible in the other processor modules. This causes caches to become temporarily stale, which causes the problem - each side tests the other's old value, so both master and slave go to sleep forever.

The question is not "how to synchronize with write buffers", but rather the following:

1. How much code already uses this?
2. Is it difficult to write software with such a restriction?
3. Would it be appropriate to force software writers to identify shared variables?

John Mashey, are you listening?

Viggy Mokkarala (hplabs!hpda!viggy) (408)447-5983
19420 Homestead Road, Cupertino, CA 95014.
/* End of text from ccvaxa:comp.arch */
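The deadlock in the Master/Slave example above can be reproduced without real hardware. What follows is a minimal single-threaded sketch in C, not code from the original post: it assumes one work item is outstanding (available = 1, completed = 0) and that both processors increment at the same moment. The names Memory, Outcome and run_step are made up for illustration, and the "write buffer" is modelled only as a flag that decides whether each side sees the other's new value or the old one.

```c
#include <stdbool.h>

/* The two shared counters from the post, as held in memory. */
typedef struct {
    int available;
    int completed;
} Memory;

/* Which side decided to sleep after one step. */
typedef struct {
    bool master_sleeps;
    bool slave_sleeps;
} Outcome;

/* Run one step of Master and Slave, starting from the assumed state
 * available = 1, completed = 0.  If buffered is true, each processor's
 * store is still in its write buffer when the other processor tests,
 * so each side sees the other's old value. */
Outcome run_step(bool buffered)
{
    Memory mem = { .available = 1, .completed = 0 };

    /* Both processors increment their own counter at the same time. */
    int master_available = mem.available + 1;   /* Master: available++ */
    int slave_completed  = mem.completed + 1;   /* Slave:  completed++ */

    /* What each side sees of the variable written by the other side. */
    int completed_seen_by_master = buffered ? mem.completed : slave_completed;
    int available_seen_by_slave  = buffered ? mem.available : master_available;

    Outcome out;
    /* Master: if ((available - completed) > 1) sleep; else wakeup(slave); */
    out.master_sleeps = (master_available - completed_seen_by_master) > 1;
    /* Slave: if (completed < available) wakeup(master); else sleep; */
    out.slave_sleeps = !(slave_completed < available_seen_by_slave);
    return out;
}
```

Under these assumptions run_step(true) has both sides choosing to sleep, which is the deadlock described in the post, while run_step(false) has each side waking the other.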
aglew@ccvaxa.UUCP (11/02/87)
...> Write buffering

There are already multiprocessor systems out there that have write buffering (even though the cache may be write through, it doesn't mean that the write immediately gets to memory).

It seems that there are a lot of algorithms that don't really need an immediately consistent view of the data - they just need _eventually_ consistent data. (I thought that the term "eventually consistent" was my own invention until I heard a guy from Xerox PARC give a talk on it, wrt networked databases. I've been using it wrt caches and memory systems. Same idea, different scale.)

Some other posters have talked about read-modify-writes in connection with write buffering. They are not quite the same issue. Consider:

    Processor 1             Processor 2

    TSET  L                 ...
    STORE 1,A           G:  TSET L
    STORE 2,A               BNZ  G
    TCLR  L                 LOAD R1 <- A

The example is contrived - what I want is Processor 1 doing a series of writes, and then clearing a lock; Processor 2 acquiring a lock, and then looking at the data structure.

You might conceivably let the test and set bypass the memory queues, in which case R1 might be loaded with 1 instead of 2. Or, you can require sequential semantics, so the TSET waits until all other processors' writes have gone through; equivalently, you might stylize and say that TCLR waits until all of this processor's writes have gone through.

Except that waiting until the write buffers have emptied might take a long time, especially on a system with several layers of cache, and it might require a lot of expensive interprocessor communication. Requiring memory synchronization on all lock activities penalizes a lot of algorithms where the semaphore is actually the communications channel, not protecting other data structures.

So, there should be locks that wait until memory is synchronized, and ones that don't.

Also, because both locking and memory synchronization may take a long time, these activities should be split up so that optimistic algorithms can be used. Eg.

    START-SYNCHRONIZING-MEMORY
        FROM-OTHER-PROCESSORS
        FROM-THIS-PROCESSOR
    WAIT-UNTIL-SYNCHRONIZED

So you can start the expensive operation as soon as you have written the stuff that needs to be synchronous, but keep doing other work while it proceeds.

Andy "Krazy" Glew. Gould CSD-Urbana.    USEnet:  ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801    ARPAnet: aglew@gswd-vms.arpa

I always felt that disclaimers were silly and affected, but there are people who let themselves be affected by silly things, so: my opinions are my own, and not the opinions of my employer, or any other organisation with which I am affiliated. I indicate my employer only so that other people may account for any possible bias I may have towards my employer's products or systems.
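For readers who want something compilable, the TSET/TCLR example above has a rough modern analogue in C11 atomics. This is a sketch under assumptions, not what the post proposes: it maps "TCLR waits until all of this processor's writes have gone through" onto a release store and TSET onto an acquire test-and-set. The function names processor1 and processor2 are made up for illustration, and Processor 2 also clears the lock here (the original listing does not) so the lock can be reused.

```c
#include <stdatomic.h>

static atomic_flag L = ATOMIC_FLAG_INIT;  /* the lock, as in TSET L / TCLR L */
static int A;                             /* plain data protected by L */

/* Processor 1: take the lock, do a series of writes, clear the lock. */
void processor1(void)
{
    while (atomic_flag_test_and_set_explicit(&L, memory_order_acquire))
        ;                                 /* TSET L, spin until acquired */
    A = 1;                                /* STORE 1,A */
    A = 2;                                /* STORE 2,A */
    /* TCLR L: release ordering makes the stores to A visible to
     * whoever acquires L after this clear. */
    atomic_flag_clear_explicit(&L, memory_order_release);
}

/* Processor 2: take the lock, then look at the data. */
int processor2(void)
{
    while (atomic_flag_test_and_set_explicit(&L, memory_order_acquire))
        ;                                 /* G: TSET L / BNZ G */
    int r1 = A;                           /* LOAD R1 <- A */
    atomic_flag_clear_explicit(&L, memory_order_release);
    return r1;
}
```

If processor2 acquires L after processor1 has cleared it, the acquire/release pair guarantees it reads 2, never the intermediate 1. The cheaper kind of lock the post asks for, one that does not wait for memory, would correspond loosely to using memory_order_relaxed on the flag, in which case nothing is promised about A. C11 has no direct equivalent of the split START-SYNCHRONIZING-MEMORY / WAIT-UNTIL-SYNCHRONIZED pair.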