gwyn@brl-smoke.ARPA (Doug Gwyn) (03/15/86)
In article <300@imagen.UUCP> geof@imagen.UUCP (Geoffrey Cooper) writes:
>Mutex locking among N processors can be implemented using N single bit
>variables.  The only restriction is that it be possible to write any
>one variable without affecting the others.  It is not necessary that
>reads and writes be atomic.
>
>The algorithm works by having one variable associated with each
>processor.  Each processor writes to its own variable and reads the
>variables of all others.  Fairness is assumed not to be at issue.

If I understand you correctly, this scheme does not work.  Consider:

	Start with all flags clear.

	Processor A wants to enter a critical region, so it starts to
	set the A-flag (after reading all the other processor flags
	and finding them all clear).

	Concurrently, processor B wants to enter the same critical
	region, so it starts to read the processor flags to see if
	some other processor is in the region.  In particular, it
	tries to read the A-flag.

	Depending on the outcome of a timing race, it is now possible
	for both processors to be executing in the critical region.

I have never heard of a mutual exclusion scheme that did not at root
depend on the existence of an interlocked test-and-set operation of
some kind.  I would be interested in hearing some details about other
schemes if they exist.
chris@umcp-cs.UUCP (Chris Torek) (03/15/86)
In article <1828@brl-smoke.ARPA> gwyn@brl-smoke.UUCP writes:
>I have never heard of a mutual exclusion scheme that did not
>at root depend on the existence of an interlocked test-and-set
>operation of some kind.  I would be interested in hearing some
>details about other schemes if they exist.

Here is one that, while probably completely useless in practice, does
not require a test-and-set scheme:  Assign a shared address as the
`owner' address.  It contains the index number of the CPU owning the
resource being locked.  If a CPU owns it but does not want it, it
increments it, modulo the number of CPUs.  If a CPU does not own the
resource, it continuously reads the location until its index appears.

Unfortunately, this requires that each CPU busy-wait for the resource,
and that each check the owner address very often so as to keep the
resource owner moving.

If you are willing to dedicate a CPU as a `lock server', the algorithm
can be changed to this:  In addition to the current owner, there is a
request address; and there is a distinguished `idle' index.  If a CPU
does not own the resource, but wants it, it puts its own index into
the request address (continuously, alternating with reading the owner
address until its index shows up).  When a CPU that owns the resource
is done with it, it writes the idle index into the owner address.  The
idle-index CPU's job is to hold on to the resource until someone else
wants it.  Again this requires a busy wait, but processors that
neither own nor want the resource are not required to look at the
owner address.  To cut down on memory contention, one can make the
idle-index CPU very slow, but this increases latency.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu
kre@munnari.OZ (Robert Elz) (03/16/86)
In article <1828@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
> I have never heard of a mutual exclusion scheme that did not
> at root depend on the existence of an interlocked test-and-set
> operation of some kind.  I would be interested in hearing some
> details about other schemes if they exist.

What about CSMA/CD - an ethernet is a shared resource, with no
interlocked test & set, yet it seems that it's occasionally possible
to get exclusive access to the thing...

The important properties are the ability to detect that a collision
has occurred, and the ability to go back and start again.

I have used this scheme in the more conventional case of two
processors with shared memory, and no test&set to do interlocking.
The algorithm goes like this ..

	look see if the other processor has the resource locked,
	if so, wait

	set my lock bit

	look see if the other processor has the resource locked,
	if so, reset my lock bit, and go back to the start

	I own it.

The "wait" can be a busy wait, or it can mean just try for the
resource sometime later, whichever is appropriate.  The "go back to
the start" step should have some random delay associated with it to
avoid continuous collisions (deadlock).

I'd be interested to know if anyone can discover any way that this
can be defeated (very interested, since this algorithm is actually in
use, right now, on some hardware I have here...).

I haven't actually thought this through for a case of N processors,
N > 2, but I can't immediately see any reason it wouldn't function;
after all, ethernets work with more than 2 processors...

Robert Elz		kre@munnari.oz		seismo!munnari!kre

ps: I am moving this discussion into net.sources.d where it belongs
jas@mulga.UUCP (03/16/86)
In article <1070@munnari.OZ> kre@munnari.OZ (Robert Elz) writes:
>...
>The important properties are the ability to detect that a collision
>has occurred, and the ability to go back and start again.
>...
>	look see if the other processor has the resource locked,
>	if so, wait
>
>	set my lock bit
>
>	look see if the other processor has the resource locked,
>	if so, reset my lock bit, and go back to the start
>
>	I own it.
>...

This algorithm looks like a simplified version of Dekker's algorithm,
the one they always give in undergraduate concurrent programming
courses.  In that context it is considered to have two problems, which
arise because you aren't allowed to assume anything about the relative
speeds of the processes:

(1) deadlock; maybe your scheme fixes this most of the time (I say
    "maybe" because while it's very unlikely that deadlock will
    occur, it is still possible in some degenerate circumstances)

(2) starvation; one process never gets a go, because the other one
    always manages to do its tests first.  Once again, in reality,
    while this is possible, it is *highly* unlikely.

>I haven't actually thought this through for a case of N processors,
>N > 2, but I can't immediately see any reason it wouldn't function,

Just use the generalised Dekker's algorithm, or one of the improved
versions published recently.

jas
geoff@desint.UUCP (Geoff Kuenning) (03/17/86)
In article <1828@brl-smoke.ARPA> gwyn@brl.ARPA writes:
> I have never heard of a mutual exclusion scheme that did not
> at root depend on the existence of an interlocked test-and-set
> operation of some kind.  I would be interested in hearing some
> details about other schemes if they exist.

Dijkstra invented one way back in 1965 [1] that depends only on atomic
loads and stores.  It has been extended by Knuth and later others [2]
to prevent starvation.

[1] Dijkstra, E.W.  Solution of a problem in concurrent programming
    control.  Comm. ACM 8, 9 (Sept. 1965), 569.

[2] Eisenberg, M.A. and McGuire, M.R.  Further comments on Dijkstra's
    concurrent programming control problem.  Comm. ACM 15, 11
    (Nov. 1972), 999.

The following algorithm is presented in [2]; it guarantees that each
requesting processor will receive the resource within N turns.
(NOTES:  I translated this from GOTO loops in Algol; I have not tested
it and it is not certified.  The code as written assumes
multiprocessors and uses spin loops for busy waiting.)

#define N 4				/* Number of processors */

static char control[N];			/* Control array, starts all zero */
static int whohasit = 0;		/* Current owner of the resource */

/* Acquire exclusive access to the resource for the i'th processor */
void acquire_resource (i)
int i;					/* Processor number */
{
	int j;				/* Loop index */

	while (1)
	{
		control[i] = 1;		/* Request access to the resource */
		j = whohasit;
		while (j != i)		/* Scan from the owner up to ourselves */
		{
			if (control[j] != 0)
				j = whohasit;	/* Somebody of higher priority */
						/* ..wants it, restart the scan */
			else
				j = (j + 1) % N;
		}
		control[i] = 2;		/* We think we have the resource now */
		for (j = 0; j < N; j++)
		{
			if (j != i && control[j] == 2)
				break;	/* Somebody else thinks THEY have it */
		}
		if (j != N)
			continue;	/* Somebody else thinks they have it, retry */
		if (control[whohasit] != 0 && whohasit != i)
			continue;	/* Somebody else really does have it */
		whohasit = i;		/* Now we really DO have it */
		return;
	}
}

/* Release exclusive access to the resource */
void release_resource (i)
int i;					/* Processor number */
{
	control[i] = 0;
}
-- 

	Geoff Kuenning
	{hplabs,ihnp4}!trwrb!desint!geoff
geof@imagen.UUCP (Geoffrey Cooper) (03/17/86)
> In article <1828@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
> > I have never heard of a mutual exclusion scheme that did not
> > at root depend on the existence of an interlocked test-and-set
> > operation of some kind.

You are correct, the scheme that you outlined doesn't work.  The hint
is that the correct scheme isn't fair.  Try this code:

#define N	<number of processors>
#define TRUE	1
#define FALSE	0

shared boolean locks[N];

lock(me)
int me;			/* my processor number */
{
	int i;

	do {
		locks[me] = TRUE;
		for ( i = 0; i < me && ! locks[i]; i++ )
			;
		if ( i < me )
			locks[me] = FALSE;
	} while ( ! locks[me] );

	do {
		for ( i = me + 1; i < N && ! locks[i]; i++ )
			;
	} while ( i < N );
}

unlock(me)
int me;			/* my processor number */
{
	locks[me] = FALSE;
}

Gold star to Robert Elz.  I agree that there is a great similarity
with Aloha contention (and CSMA/CD).  In those schemes, static
priorities are replaced by random backoff times.  This eliminates
static priority but gives only a probabilistic (though high) chance of
success.  The probabilistic schemes are also somewhat fairer under
expected network traffic than a static scheme would be.

Note that each processor writes to only one variable and reads from
all the others.  Thus test-and-set is not needed.  I developed this
algorithm one day while deciding whether read-modify-write cycles were
an absolute requirement for a bus architecture.  Interestingly, atomic
reads and writes are also unnecessary, since no processor writes the
lock belonging to any other processor.  However, glitching of a value
from true to false and back is not allowed.

I refer you to the 3rd PODC Conference Proceedings, ACM 1984
(reprinted in SIGOPS Operating Systems Review, Oct. 85): "Solved
Problems, Unsolved Problems, and Nonproblems in Concurrency" by Leslie
Lamport.  The 2-processor version of the above is outlined as an
oft-forgotten solved problem.  Actually, there is a bug in the
algorithm as presented in the paper, but it's probably a typo.
As Chris Torek pointed out, a token contention scheme can also be used. This is a "fairer" scheme, but does require active participation on the part of all parties, even when they are not interested in the state of the lock. - Geof Cooper
taylor@glasgow.glasgow.UUCP (Jem Taylor) (03/18/86)
In article <1828@brl-smoke.ARPA> gwyn@brl.ARPA writes:
>In article <300@imagen.UUCP> geof@imagen.UUCP (Geoffrey Cooper) writes:
>>Mutex locking among N processors can be implemented using N single bit
>>variables. [...]
>>The algorithm works by having one variable associated with each
>>processor.  Each processor writes to its own variable and reads the
>>variables of all others.  Fairness is assumed not to be at issue.
>
>If I understand you correctly, this scheme does not work.  Consider:
>	Start with all flags clear.
>	Processor A wants to enter a critical region,
>	so it starts to set the A-flag (after
>	reading all the other processor flags
>	and finding them all clear).

Alternatively:

	Start with all flags clear.

	A sets its flag and *then* examines the other flags, finds
	them clear, and enters.

	B concurrently sets its flag and then examines the others,
	finds A's set, and does not enter ...

What can B now do?  Busy-wait on the other flags ... but if C also
joins in, there will be an exclusive deadlock even when A leaves the
section and clears its flag, since the flags for B and C will still
be set.

Taking a tip from Ethernet, B (and C etc) could pause for some random
interval (after first clearing its flag) and try again from the
beginning.  As long as B's and C's delays are not (systematically)
the same, one of them will enter the section eventually.

Such a scheme has the provisos of good performance only under light
to moderate load, and of unfairness, that Ethernet is known for - but
(I'm told) Ethernet works.

-Jem Taylor

"Aerodynamically speaking, bees can't fly"
geoff@desint.UUCP (Geoff Kuenning) (03/18/86)
In article <1070@munnari.OZ> kre@munnari.OZ (Robert Elz) writes:
> What about CSMA/CD - an ethernet is a shared resource, with
> no interlocked test & set, yet it seems that it's occasionally
> possible to get exclusive access to the thing...

Ethernet uses special hardware that can detect another signal during
transmission.  In effect, the exclusive access is detected after the
fact:  you get a different return code from the hardware if a
transmission was garbled by a collision, and must retry later to get
your exclusive access.  A random-timeout scheme increases the
probability that the net will be free at that later time.  The CPU
equivalent would be an instruction that interrupted if somebody else
tried to write simultaneously; this is equivalent to a "test and
set".

> 	look see if the other processor has the resource locked,
> 	if so, wait
>
> 	set my lock bit
>
> 	look see if the other processor has the resource locked,
> 	if so, reset my lock bit, and go back to the start
>
> 	I own it.

This is basically Dijkstra's algorithm (see my earlier posting); it
has been proven to be correct (if B checks A's bit before it has been
set, B's lock bit must have already been set, and thus A is
guaranteed to see it in the test after A sets its own lock bit).  The
algorithm's only flaw is that it can starve a processor in theory
(though in practice it's rarely a problem).

A bigger problem in multiprocessors driven off the same clock is that
it is possible for the algorithm to get into lock step and stick both
processors in the wait-for-the-other loop.  Anything that differs
between processors (interrupts, clock, DMA I/O) has the potential for
breaking this loop; another way is to have each processor count its
processor number down to zero before returning to the main loop.
-- 

	Geoff Kuenning
	{hplabs,ihnp4}!trwrb!desint!geoff
jim@alberta.UUCP (Jim Easton) (03/22/86)
***

>I have never heard of a mutual exclusion scheme that did not
>at root depend on the existence of an interlocked test-and-set
>operation of some kind.  I would be interested in hearing some
>details about other schemes if they exist.

It is a sad commentary to note that someone has to go and "invent" a
special instruction for this when most of the old machines had one
that was ideally suited for the purpose and was useful for other
things as well.  The instruction was "exchange memory and A
(register)".  The only requirement from the hardware is that it
cannot be interrupted in mid-execution - unavoidable with core
memory.

I believe that the key to this problem is to have only one "item"
that cannot be duplicated.  Every process makes a grab for it, but
the one that gets it is the one that gets the resource.  I personally
call that "item" a "ticket".  It is like a hockey ticket - the person
who has it gets to sit in the corresponding seat at the game, and the
people that issue the tickets know that there cannot be collisions
because there is only one ticket (/seat/game).

Anyway, that's how I think of it, and if the machine has a way of
arranging for a "ticket" the algorithm will work - otherwise I
wouldn't bet on it.

	Jim Easton	(..!alberta!jim)