rp@osc.COM (Rich Patterson) (06/13/89)
Hi,

I need some help in finding information on coding atomic bit operations on a Sun-4 (SPARC). I wasn't able to find a reference to a "Test and Set" operation in the Assembly guide that comes with our Sun-4. Are there any other references on SPARC assembly, Sun published or otherwise? Any help or code would be appreciated. Please e-mail to the address below.

Thanks,
Rich P.
rp@osc.com
pacbell!osc!rp
dg@lakart.UUCP (David Goodenough) (06/16/89)
rp@osc.COM (Rich Patterson) sez:
> Hi,
> I need some help in finding information on coding atomic bit
> operations on a Sun-4 (SPARC). I wasn't able to find a reference to a
> "Test and Set" operation in the Assembly guide that comes with our Sun-4.
> Are there any other references on SPARC assembly, Sun published or
> otherwise ? Any help or code would be appreciated. Please e-mail to the
> address below.

I have never understood the need for a test and set instruction, when you can make do with adc (add with carry). Allow me to explain:

The point behind TAS is to allow a process to test if a flag is set or clear, and set it no matter what the result. But why does the test have to be in the same instruction? In fact all that is needed is the ability to capture the state of a bit, setting it as you do the capture, and test it later. If you think about the following:

    Bit clear (i.e. resource available)
    Task 1 grabs a copy of the bit and sets it, but does not test it.
    Bit is now set
    Task 1 gets swapped out, and Task 2 runs
    Task 2 grabs a copy of the bit, and sets it again.
    Task 2 tests the copy it captured - finds it was set, and assumes
        the resource is not available.
    Task 2 gets swapped out, and Task 1 comes back
    Task 1 tests its copy of the bit, finds it was clear, and proceeds
        to use the resource.

Note that Task 1 got interrupted between sampling the bit, and testing it, _BUT_IT_DIDN'T_MAKE_ANY_DIFFERENCE_ - the system still worked.

So the bottom line is all you need is the ability to capture the state of a bit, and set it no matter what, all in one atomic instruction. Add with carry works just nicely to do this: put the flag in memory, it is a whole byte, initialize it to 0x7f (i.e. all bits set, except the MS bit is clear) - the flag is now clear (resource is available). To get and set the bit do the following:

    set carry
    add with carry    flag, flag
    jump on carry clear    resource available

Now if you break this sequence anywhere, it is still secure. Note that it assumes you can adc memory,memory - if you can't, look for a rotate left instruction, which does about the same thing.

To release the resource, simply move 0x7f to the flag byte after you've finished with the resource - that is trivial.

--
dg@lakart.UUCP - David Goodenough
....... !harvard!xait!lakart!dg
AKA: dg%lakart.uucp@xait.xerox.com
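
The arithmetic behind the 0x7f trick can be checked with a small C model. This is only a sketch of the byte-level behaviour: the names `adc8`, `try_acquire`, `release` and `FLAG_FREE` are hypothetical, and plain C like this is NOT atomic, so it says nothing about whether a given machine's add-with-carry is safe to use as a lock.

```c
#include <assert.h>
#include <stdint.h>

/* Result of one simulated 8-bit add-with-carry. */
struct adc_result {
    uint8_t value;   /* low 8 bits of the sum */
    unsigned carry;  /* carry out of bit 7: 0 or 1 */
};

/* Model of "adc a, b" on an 8-bit operand: a + b + carry_in. */
struct adc_result adc8(uint8_t a, uint8_t b, unsigned carry_in)
{
    assert(carry_in <= 1);
    unsigned sum = (unsigned)a + (unsigned)b + carry_in;
    struct adc_result r = { (uint8_t)(sum & 0xFFu), (sum >> 8) & 1u };
    return r;
}

#define FLAG_FREE 0x7Fu  /* MS bit clear: resource available */

/* Model of: set carry; adc flag,flag; jump on carry clear.
 * Returns 1 if the caller got the resource, 0 if it was already taken.
 * Assumption: on real hardware the read-add-write is one instruction;
 * here it is ordinary, interruptible C. */
int try_acquire(uint8_t *flag)
{
    struct adc_result r = adc8(*flag, *flag, 1);
    *flag = r.value;
    return r.carry == 0;
}

/* Release: move 0x7f back into the flag byte. */
void release(uint8_t *flag)
{
    *flag = FLAG_FREE;
}
```

Starting from 0x7f, the first `try_acquire` computes 0x7f + 0x7f + 1 = 0xff with no carry out, so the caller wins. Every later attempt computes 0xff + 0xff + 1 = 0x1ff, which leaves 0xff in the byte with carry set, so the flag stays put however many times a waiter retries.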
m5@lynx.uucp (Mike McNally) (06/20/89)
In article <577@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>I have never understood the need for a test and set instruction, when
>you can make do with adc (add with carry). Allow me to explain:
>
>The point behind TAS is to allow a process to test if a flag is set or
>clear, and set it no matter what the result. But why does the test have
>to be in the same instruction?

The example given by Mr. Goodenough in fact incorporates the changing of the state of the flag in one instruction (the add-with-carry). It is thus true that the sequence is unbreakable *at the OS level*: a normal OS will not reschedule while a task is in the middle of an instruction, because most CPUs won't allow interrupts in the middle of an instruction. (Note that this is not necessarily the case on every CPU.) A real TAS instruction often comes with the proviso that the bus cycles used to fetch and store are not interruptible either. This guarantee is necessary in a multiprocessor environment.

I think that the x86 (x>0) series locks the bus on all XCHG instructions. The original chips required a LOCK prefix. I don't know whether or not the LOCK is honored with other read/write instructions.

--
Mike McNally                    Lynx Real-Time Systems
uucp: {voder,athsys}!lynx!m5    phone: 408 370 2233

            Where equal mind and contest equal, go.
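
For comparison, here is a minimal sketch of a test-and-set lock whose fetch and store are indivisible across processors as well as across reschedules. It assumes a C11 compiler with `<stdatomic.h>`, which postdates this thread; `atomic_flag_test_and_set_explicit` and `atomic_flag_clear_explicit` are standard C11 calls, while the `tas_lock` type and the `tas_*` function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* A one-bit lock built on the C11 atomic flag type. */
typedef struct {
    atomic_flag held;
} tas_lock;

#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Atomically set the flag and report whether it was clear before.
 * Returns true if the caller now owns the lock. */
bool tas_try_lock(tas_lock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin until the lock is obtained (suitable for short waits only). */
void tas_lock_acquire(tas_lock *l)
{
    while (!tas_try_lock(l)) {
        /* busy-wait */
    }
}

/* Clear the flag so another processor or task can take the lock. */
void tas_unlock(tas_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

How the compiler implements the flag is target-specific; the point of the sketch is only that the read and the write cannot be separated by another processor's access.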
rec@dg.dg.com (Robert Cousins) (06/20/89)
In article <5742@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:
>In article <577@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>>I have never understood the need for a test and set instruction, when
>>you can make do with adc (add with carry). Allow me to explain:
>>
>>The point behind TAS is to allow a process to test if a flag is set or
>>clear, and set it no matter what the result. But why does the test have
>>to be in the same instruction?
>
>The example given by Mr. Goodenough in fact incorporates the changing of
>the state of the flag in one instruction (the add-with-carry). It is
>thus true that the sequence is unbreakable *at the OS level*: a normal
>OS will not reschedule while a task is in the middle of an instruction,
>because most CPUs won't allow interrupts in the middle of an instruction.
>(Note that this is not necessarily the case.) A real TAS instruction
>often comes with the proviso that the bus cycles used to fetch and store
>are not interruptible either. This guarantee is necessary in a
>multiprocessor environment.
>
>I think that the x86 (x>0) series locks the bus on all XCHG instructions.
>The original chips required a LOCK prefix. I don't know whether or not
>the LOCK is honored with other read/write instructions.

Actually, the LOCK prefix was somewhat more powerful than originally intended in initial 8086 family products. One could use the LOCK prefix before the REP prefix to build a locked string operation! Since these could be up to 64K iterations long and since the 8086 isn't that fast, it was theoretically possible to lock other processors from the bus for extended periods of time.

There is another reason why atomic operations are useful: whenever there is some modicum of peripheral intelligence (as is commonly found with modern LAN controller chips), there arise cases in which memory descriptors need to be updated in a controlled fashion. For example, after building a packet in memory, the packet must be linked into the controller's outgoing packet list. Since the controller may be actively transmitting at that instant or, worse yet, may be traversing links in the list to find the next packet, an atomic operation makes possible a "seamless" insertion into the list. However, relatively few systems are designed to take advantage of this feature.

The interlocked exchange operation is perhaps the most common tool for multiprocessor operation. Using it, one can simulate the test-and-set operation and the test-and-clear operation, and through careful use of global values, sequenced locks and integer semaphores become practical.

Some CPU families go out of their way to add interlocked operations. The DG MV series and the NSC 32000 have a list of instructions which operate in this fashion.

BTW, the TAS instruction makes barrier synchronization much simpler. Without it, writing a ROM to handle 'n' processors coming out of reset at the same time and trampling over each other would not be as easy.

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for Myself alone.

>--
>Mike McNally                    Lynx Real-Time Systems
>uucp: {voder,athsys}!lynx!m5    phone: 408 370 2233
>
>            Where equal mind and contest equal, go.
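
The claim that an interlocked exchange can stand in for test-and-set and test-and-clear can be sketched as follows. This assumes C11 `<stdatomic.h>` (not available when the thread was written); `atomic_exchange` is a standard C11 call, and the two function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Test-and-set via exchange: store 1, return what was there before.
 * A return of 0 means the caller found the flag clear and now holds it. */
int xchg_test_and_set(atomic_int *flag)
{
    assert(flag != NULL);
    return atomic_exchange(flag, 1);
}

/* Test-and-clear via exchange: store 0, return what was there before.
 * A return of 1 means the caller is the one who consumed the set flag. */
int xchg_test_and_clear(atomic_int *flag)
{
    assert(flag != NULL);
    return atomic_exchange(flag, 0);
}
```

Because the old value comes back from the same indivisible operation that wrote the new one, exactly one of several contenders sees the transition, which is the property a lock or a "work pending" flag needs.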
davidsen@sungod.crd.ge.com (William Davidsen) (06/20/89)
In article <5742@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:
| I think that the x86 (x>0) series locks the bus on all XCHG instructions.
| The original chips required a LOCK prefix. I don't know whether or not
| the LOCK is honored with other read/write instructions.

The LOCK prefix is specified to lock the bus until the next instruction is complete. This is a reasonable way to allow multiple processors to use any appropriate interlock.

I don't really like the ADDC for flag testing, since some logic paths may require a loop until free (for short term resources) and something could overflow.

Why was this posted to wizards instead of arch????

bill davidsen (davidsen@crdos1.crd.GE.COM)
{uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/22/89)
In article <577@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>So the bottom line is all you need is the ability to capture the state
>of a bit, and set it no matter what, all in one atomic instruction.

The key is that it be atomic; not all "add with carry" instructions are.

On the PDP-11, we used to use something like TST and INCB as the two semaphore basic instructions; it was tricky due to the bus supporting both byte and word transfers.

When generalizing to multiprocessor architectures, many designers seem to have found "test and set" more suitable for the purposes of basic synchronization than an arithmetic operation would be.
jmm@ecijmm.UUCP (John Macdonald) (07/22/89)
In article <577@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
> [quoted material deleted]
>
>I have never understood the need for a test and set instruction, when
>you can make do with adc (add with carry). Allow me to explain:
>
>The point behind TAS is to allow a process to test if a flag is set or
>clear, and set it no matter what the result. But why does the test have
>to be in the same instruction? In fact all that is needed is the ability
>to capture the state of a bit, setting it as you do the capture, and
>test it later. If you think about the following:
>
> [example of multitasking use of ADC deleted]
>
>Note that Task 1 got interrupted between sampling the bit, and testing it,
>_BUT_IT_DIDN'T_MAKE_ANY_DIFFERENCE_ - the system still worked.
>
>So the bottom line is all you need is the ability to capture the state
>of a bit, and set it no matter what, all in one atomic instruction.

This is true for a single-processor multi-tasking situation. There is a stronger requirement for a multi-processor shared memory situation. In that case, there must be provision for the atomic instruction to:

    1. Read and check the status of the old value.
    2. Change to a (possibly) new value.
    3. Write back the new value.
       (the same as described above, plus:)
    4. Ensure that no other processor can access the old value
       between steps 1 and 3!

In many processors, most read-modify-write instructions release their access path to the memory during step 2 and then regain it for step 3. This allows another processor to use the memory path without waiting. In such processors, there is generally a small number of instructions which are guaranteed to not release the memory path.

For example, on the Motorola 68020, the TAS (test and set), CAS (compare and swap), and CAS2 (compare and swap twice) instructions all lock the memory bus for the duration of all of their accesses; while other instructions (e.g. add immediate to memory) which have a read-modify-write pattern do not. This type of design trades off increased speed for the non-locking operations against the requirement that the programmer use one of the locking instructions whenever there may be a multi-processor simultaneous access to the datum.

--
John Macdonald
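
The four steps above are what a compare-and-swap gives in a single call. The following is a minimal sketch using C11 `<stdatomic.h>` rather than 68020 assembly, which is an assumption about the reader's toolchain; `atomic_compare_exchange_strong` is a standard C11 call, while `NO_OWNER`, `cas_claim` and `cas_release` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER 0  /* value of the owner word when the resource is free */

/* Claim the resource for my_id only if nobody owns it.
 * The compare (step 1), the choice of new value (step 2) and the
 * write-back (step 3) happen as one indivisible operation (step 4). */
bool cas_claim(atomic_int *owner, int my_id)
{
    assert(my_id != NO_OWNER);
    int expected = NO_OWNER;
    return atomic_compare_exchange_strong(owner, &expected, my_id);
}

/* Release the resource, but only if my_id really is the owner. */
bool cas_release(atomic_int *owner, int my_id)
{
    assert(my_id != NO_OWNER);
    int expected = my_id;
    return atomic_compare_exchange_strong(owner, &expected, NO_OWNER);
}
```

Unlike a bare test-and-set, the owner word records who holds the resource, so a release by the wrong party fails instead of silently freeing it.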