david@marvin.jpl.oz (David Magnay) (01/22/91)
Can someone pls explain spinlocks. Are these a form of semaphore ?  Thanks.

..........................................................
"You learn a lot as you go thru life, but not usually what you expected"
..........................................................
David Magnay                             mail david@marvin.jpl.oz
Boral Elevators (was: Johns Perry Lifts)
45 Wangara Road, Cheltenham 3192         phone (03) 584-3311
Victoria, Australia                      O/seas +61 3 584 3311
ewoods@hemel.bull.co.uk (Eoin Woods) (01/22/91)
david@marvin.jpl.oz (David Magnay) writes:
>Can someone pls explain spinlocks. Are these a form of semaphore ?

Hmm, interesting question!  Basically yes.  The simplest spin-lock is a
piece of code which would look something like:

    while (check_lock(lockA))
        ;   /* loop ("spin") while check_lock reports lockA still held */

The essential thing about a spin-lock (if I can remember my real-time
systems courses) is that the process concerned loops until the lock it is
waiting for is released (and so it can be inefficient in a real-time
environment, as it consumes CPU resources while it is stuck waiting for
the lock to be released for it).

A good reference for this sort of information is:
"Real-Time Systems Design", 2nd ed, Allworth and Zobel, Macmillan.

Eoin.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  Eoin Woods, Software Development Group, Bull HN Information Systems,     ~
~  Maxted Road, Hemel Hempstead, Herts HP2 7DZ, UK.                         ~
~  Tel : +44 442 232222 x4823        Fax : +44 442 234084                   ~
~  < Eoin.Woods@hemel.bull.co.uk  or  ...!uunet!ukc!brno!ewoods >           ~
~  < When do we start news group comp.os.emacs ? :-) >                      ~
prent@interc.UUCP (01/24/91)
In the Concurrent multiprocessor implementations, as in a number of
others, the spinlock technique is almost perfectly efficient due to the
hardware architecture.  The presumption is only that the checking
processor really has nothing better to do than wait for the resource
being locked.

Given this, the locking processor locks the lock with a single
instruction, with no OS overhead, etc.  The checking processor fetches
the lock (once only) from shared memory; thereafter, while "spinning",
it is merely looping in its own cache, placing no load on the memory
hardware.  The eventual unlocking of the lock causes the cache-invalidate
signal to be generated and a single subsequent real-memory read.  Net
memory traffic is one read if the lock was originally unlocked, two
otherwise.  Cheap, and the requesting processor can go about its
business within two memory cycles of the resource becoming available,
or less than many single-instruction execution times.

Compute in Good Health (and that Quickly),
Prentiss Mooney
leong@interc.UUCP (01/25/91)
I don't know about you guys out there, but I consider CPU cycles
extremely precious.  First of all, spinlocks seem to be an acceptable
KLUGE for small applications, but I sure do not want to design my
Real-Time -- Flight Critical -- Life Saving -- System with the inherent
flaws of techniques such as spinlocks!

I guess scenarios such as shared-memory multisystem environments might
warrant a need for something like spinlocks, but I would guess that
there are other means of providing synchronization, whether through
additional software messaging or hardware signalling.

By today's real-time standards of microsecond context switching, I find
it hard to believe that designers would consider spinlocks.

Leong
dcousins@bbn.com (Dave Cousins) (01/30/91)
fox@rudolf.nscl.msu.edu (Ron Fox) writes:
>operation, and you need some external thing (either an event, included
>here is a pre-empting process), or another CPU to break the loop.

I use spin locks on a shared memory architecture, basically because the
interrupt processing on our system was not fast enough to use as a
semaphore.  Spin locks have low latency, especially when the system
releasing the lock (i.e. writing the lock bit) does so by cycle-stealing
from the polling CPU.

Dave Cousins
brucej@verdix.com (Bruce Jones) (01/31/91)
In article <3600010@interc> leong@interc.UUCP writes:
>I don't know about you guys out there, but I consider CPU cycles
>extremely precious. First of all, spinlocks seems to be an
>acceptable KLUGE for small applications, but I sure do not want
>to design my Real-Time -- Flight Criticle -- Life Saving -- System
>with the initial flaws of techniques such as spinlocks!

[paragraph deleted...]

>By today's real-time standards of microsecond context switching, I
>find it hard that designers would consider spinlocks.

Spin locks are used on shared memory multiprocessors _because_ CPU
cycles are precious, not because designers like kludges.  Used properly,
spin locks are extremely efficient and completely safe.  Used
improperly, they are extremely dangerous, as you have pointed out.

Consider a shared resource protected by a "sleepy" lock such as AT&T
UNIX semaphores.  When there is contention for the resource, the cost of
gaining it includes the cost of executing a system call, the time the
kernel spends putting the process to sleep and waking it up again, and
the cost of at least two context switches.

Then change the sleepy lock into a spin lock.  Now the cost of gaining
the lock is the time spent spinning.  This time depends on the spin lock
implementation, the amount of contention, and the length of time the
lock is held by a process once gained.  When the lock is held for only a
short length of time and contention is low, spin locks are a huge win
over sleepy locks.

Measure the cost of a system call and a few context switches on your
system.  Better yet, measure the cost of a SysV semaphore call! :)  Then
measure the time needed to execute one or two or six test-and-set
instructions.  On a 68030 the ratio is likely to be almost 500 to 1.

Anderson has published an in-depth study of spin lock implementation and
contention issues in IEEE Software Engineering.  There were some rather
subtle and surprising results arising from shared memory caches, bus
contention, and the like.  He was able to demonstrate effective spin
lock algorithms for a wide range of environments, and he clearly
explained subtle flaws in naive implementations that can cause problems
in high-contention SMMP machines.

There are several dangers inherent in spin locks.  The foremost is the
danger of uncontrolled spinning if the owner of a lock is aborted, or
exits, or is simply context-switched out.  This is why spin locks are
not used in user programs running under UNIX, but are used extensively
in MP implementations of UNIX.  The operating system can protect itself
by preventing context switches, masking interrupts, and the like, but in
general these capabilities are not available to user programs.

My friend Mark Heuser presented a paper at the last Usenix on his
implementation of safe, reliable and efficient spin locks for user
programs under UNIX.  He found that a user application and the OS can
cooperate to produce safe spin locks with performance far exceeding
typical context switch times -- without system calls or context
switches.  His work is awesome, and I'm sorry I don't work for that
company anymore...

I'm sending this off from home, and my references are at work.  I will
be glad to email you bibliography entries if you drop me some email.

--
brucej
--------------------------------
Bruce Jones
Verdix Corporation, Aloha OR
brucej@verdix.com
chris@yarra.oz.au (Chris Jankowski) (01/31/91)
In article <3600010@interc> leong@interc.UUCP writes:
>I don't know about you guys out there, but I consider CPU cycles
>extremely precious. First of all, spinlocks seems to be an
>acceptable KLUGE for small applications, but I sure do not want
>to design my Real-Time -- Flight Criticle -- Life Saving -- System
>with the initial flaws of techniques such as spinlocks!

In multiprocessor (shared memory on a bus) systems, CPU cycles are cheap
compared with, e.g., the cost of accessing memory.  That changes the
whole perspective.  Moreover, the cost of processors is falling and they
are becoming faster and faster, imposing a higher and higher load on
memory and on the *bus*.  I am talking here about *single* systems in
the sense that they run a single copy of an operating system.

>By today's real-time standards of microsecond context switching, I
>find it hard that designers would consider spinlocks.

For the systems mentioned above, spinlocks are the basic and cheapest
tool used within the OS to lock critical sections of OS code.  This
works fine.  The best implementations provide efficiency of about 97%
for, say, 8-processor systems on a typical mixture of tasks.

The same concept may be useful outside the OS area, e.g. for real-time
applications.  But again, you had better have lots of processors --
definitely more than one.  And of course it only makes sense to use
spinlocks to synchronise events with finite and low waiting times.

Again we have a tradeoff: spinlocks provide a low-cost, low-latency
synchronisation mechanism at the price of burning CPU cycles.

 -m-------  Chris Jankowski - Senior Systems Engineer  chris@yarra.oz.au
---mmm----- Pyramid Technology Corporation Pty. Ltd.   fax  +61 3 521 3799
-----mmmmm--- 1st Floor, 553 St. Kilda Road            tel. +61 3 525 1730
-------mmmmmmm- Melbourne, Victoria, 3004 AUSTRALIA         (03) 521 3799

``If you're crossing the nation in a covered wagon, it's better to have
four strong oxen than 100 chickens.  Chickens are OK but we can't make
them work together yet.''
        - Ross Bott, Pyramid U.S., on multiprocessors at AUUGM '89.
vestal@SRC.Honeywell.COM (Steve Vestal) (01/31/91)
In article <3600010@interc> leong@interc.UUCP writes:
>> acceptable KLUGE for small applications, but I sure do not want
>> to design my Real-Time -- Flight Criticle -- Life Saving -- System
>> with the initial flaws of techniques such as spinlocks!

It's not clear to me that spin-locks should be completely ruled out for
reliable real-time systems.  Essentially, multiple processes contending
for the same lock are queued using a random discipline.  I admit this is
not as conceptually nice as being queued using some priority discipline.

However, in many real-time systems the maximum number of processes that
might ever be queued on a semaphore can be bounded (by some small
number).  It would seem that the total blocking time for any of these
processes could be bounded by the sum of the critical section execution
times (plus some delta for the spin-lock timing).  This may well be a
larger blocking time for many of these processes than could be achieved
using, say, a priority inheritance protocol -- except, of course, where
the priority inheritance lock/unlock times are large relative to the sum
of the critical section and spin-lock times.

Steve Vestal
Mail:  Honeywell S&RC MN65-2100, 3660 Technology Drive, Minneapolis MN 55418
Phone: (612) 782-7049
Internet: vestal@src.honeywell.com
fox@rudolf.nscl.msu.edu (Ron Fox) (01/31/91)
In article <3600009@interc>, prent@interc.UUCP writes:
>In the Concurrent multiprocessor implementations, as in a number of
>others, the spinlock technique is almost perfectly efficient due to
>the hardware architecture. The presumption is only that the
>checking processor really has nothing better to do than wait for the
>resource being locked.

This is the key assumption, and depending on what you use spinlocks for
it may or may not be true.  To me, the place to use spin locks would be
in the OS, to lock shared resources such as the process queues or
distributed OS tables.  The place in this architecture *NOT* to use
spinlocks would be in process-level IPCs, because during the spin you
could get better system-wide throughput if you could schedule someone
else to run while waiting for the lock to complete.

Of course this should not be taken as a blanket statement.  If a
cooperating application *knows* things about its partners, or has
special timing requirements for the latency of detecting the release of
the lock (this is comp.realtime after all), then all bets are off and
the actual processor usage profile is less important than the timing
requirements of the application.

> [coherent caching discussion eliminated here.. ]

Ron

Ron Fox                     | FOX@MSUNSCL.BITNET      | Where the name
NSCL                        | FOX@RUDOLF.NSCL.MSU.EDU | goes on before
Michigan State University   | MSUHEP::RUDOLF::FOX     | the quality
East Lansing, MI 48824-1321 |                         | goes in.
USA
ikw@genrad.UUCP (Isaac K. Wong) (02/01/91)
In article <68639@yarra.oz.au> chris@yarra.oz.au writes:
>For systems mentioned above spinlocks are the basic and cheapest
>tool used within OS to lock critical sections of OS code.
>This works fine. Best implementations provide efficiency of about 97%
>for say 8 processor systems on typical mixture of tasks.
>
>The same concept may be useful outside the OS area eg for real-time
>applications.
>But again you better have lots of processors.
>Definitely more than one.

I agree.  Spin locks work.  They work especially well in multiprocessor
shared-memory systems running no OS.  I implemented such a scheme in one
of my projects here: a multi-channel real-time signal analysis system
using multiple processor boards and shared memory.  As some other
netters have pointed out, when one processor board is looping over a
spin lock, it has nothing better to do!

--
Isaac Wong
GenRad, Structural Products Div.
Milpitas, CA
...!sun!topo!wongi or wongi@topo.uucp
jat@xavax.com (John Tamplin) (02/03/91)
Spin locks are quite useful in multiprocessor systems, at least for
synchronization between OSs on different processors accessing common
data structures.  However, they are still not perfect.  If n processors
access a shared data structure simultaneously, one will get the lock and
the others will spin on local copies in their respective caches.  When
the lock is released, it gets ping-ponged between the caches, with one
processor getting it again.  This costs n-1 bus cycles moving ownership
of the cache line between the processors.  You do it again the next
time, and so on, until the last processor wins the lock.  Granted, this
is the worst case, but you can still spend O(n^2) bus cycles getting the
lock to all processors.  Additionally, there is only a probabilistic
guarantee of fairness.

If the resources are available, I think hardware semaphores are a much
neater solution.  With the gap between processor and memory speeds
increasing, it is reasonable to assume that large-scale multiprocessors
will use some sort of "request packet" structure (i.e., the processor
sends a message "read location 20 and send me the result" and blocks
until it gets a response).  With this structure, it would be easy to
implement hardware semaphores that queue requests and send a response to
a processor when it owns the semaphore.  With the depths available in
FIFOs, you won't have to worry about a limited queue, and the interface
could be as simple as:

    dummy=[sem0]    /* block until lock granted */
    ... critical section ...
    [sem0]=dummy    /* release lock */

This solution guarantees O(n) bus cycles (a cycle in this case being a
request packet and a response packet) to solve the same problem as
before, which is optimal.  Also, you can enforce whatever kind of
fairness you desire, even giving certain processors higher priority and
ordering the queue by that priority.  None of this requires any changes
to the processor itself, although the bus interface would already have
to support a disconnect-style bus.
--
John Tamplin                    Xavax
jat@xavax.COM                   2104 West Ferry Way
...!uunet!xavax!jat             Huntsville, AL 35801