daveb@rtech.ARPA (Dave Brower) (03/28/85)
>> The VMS locking protocol supposedly works on VAX/VMS systems...
                            ^^^^^^^^^^
Well actually, it DOES work, very well. I suspect the author's
orthodoxy has crept into this 'non-religious' discussion :-).

> Perhaps somebody from Relational Technology would be interested in
> discussing the problem of locking on UNIX, since they had to deal
> with it in porting Ingres to UNIX machines.
> Phil Kos

For a good introduction to the issues, you should examine C.J. Date's
"An Introduction to Database Systems, Volume II" (Addison-Wesley).
Chapter 3 is devoted to concurrency control. He discusses in detail
the tradeoffs that are made between maximizing concurrency,
repeatability, and data integrity.

The drive to increase concurrency is the major reason for the use of
many different locking levels, such as EXCLUSIVE, SHARED,
INTENDED_SHARED, SHARED_INTENDED_EXCLUSIVE, and INTENDED_EXCLUSIVE.
These are easily handled via the magic-cookie lock manager, but are
difficult with simpler schemes (like lockf()). This is not to say
that lockf() is inadequate for data integrity, only that it does not
allow the maximum possible concurrency. Flock(2) could be extended to
support more lock levels, but it locks whole files, which is too
coarse a granularity.

OK, presume you've decided you need a cookie manager. How do you get
one? An adequate magic-cookie lock manager comes with the VMS tape; on
Unix we need to do some work.

Locking can be seen as a specialized inter-process communications
problem. Since we know how well different Unices handle IPC :-),
building a fast lock manager gets ugly very fast. We can't presuppose
any locking primitives (such as lockf()) beyond the raw filesystem.
Uucp, tip, et al. do locking using files, which works, but is much too
slow to consider for a database. The options are shared memory
(System V only), semaphores (SV only), named pipes (SV only), sockets
and server processes (BSD only), and device drivers.
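To make the lock levels mentioned above a bit more concrete, here is a
rough sketch of the bookkeeping a cookie lock manager has to do. It is
written in Python purely for brevity, and it is NOT how the VMS lock
manager or INGRES does it. The names CookieLockTable, try_lock, and
unlock are made up for illustration. The compatibility table is the
standard one for intention locks found in the database literature. A
real manager would also need atomicity across processes, wait queues,
and deadlock handling, none of which appear here.

```python
# Toy, single-process lock table keyed by opaque "cookies".
# Hypothetical names throughout; this only illustrates mode compatibility.

IS = "INTENDED_SHARED"
IX = "INTENDED_EXCLUSIVE"
S = "SHARED"
SIX = "SHARED_INTENDED_EXCLUSIVE"
X = "EXCLUSIVE"

# For each requested mode, the modes that OTHER holders may already have.
COMPATIBLE = {
    IS: {IS, IX, S, SIX},
    IX: {IS, IX},
    S: {IS, S},
    SIX: {IS},
    X: set(),
}


class CookieLockTable:
    def __init__(self):
        # cookie -> {owner: mode}; a cookie is any hashable value.
        self.held = {}

    def try_lock(self, cookie, owner, mode):
        """Grant the lock if it is compatible with all other holders.

        Returns True on success, False if the caller would have to wait.
        If the owner already holds the cookie, its mode is simply replaced
        (a real manager would treat that as a lock conversion).
        """
        holders = self.held.get(cookie, {})
        for other, other_mode in holders.items():
            if other != owner and other_mode not in COMPATIBLE[mode]:
                return False
        self.held.setdefault(cookie, {})[owner] = mode
        return True

    def unlock(self, cookie, owner):
        """Release the owner's lock; forget the cookie once nobody holds it."""
        holders = self.held.get(cookie)
        if holders is None:
            return
        holders.pop(owner, None)
        if not holders:
            del self.held[cookie]
```

Note that nothing in the table cares what a cookie "means": it could
stand for a record, a page, or a whole relation. That is the point of
the scheme, and it is why adding lock levels is only a matter of
adding rows to the compatibility table.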

RTI currently does locking for INGRES under Unix using a pseudo-device
controlling 'magic-cookies.' Two of the advantages of this approach
are (relative) universality among the different Unixes, and
kernel-level atomicity of lock operations. The major disadvantage is
the need to install the pseudo-device driver in the kernel of any
system to support multi-user INGRES. This is not a simple operation
for many, many Unix sites, nor is it always easy to have an OEM
install a driver for you.

There has been some discussion of the desirability of putting locks in
the filename space rather than the 'magic-cookie jar.' I think this is
inappropriate for database locks for the following reasons:

1) Creating names in the file space is irrelevant to an application.
   It just wants the lock.

2) Creating and destroying a file-space name is always going to take
   some time, even if it's just copying the magic-cookie name into the
   directory file. Given the number of potential locks in the database
   space (one for each record), and the frequency of lock
   transactions, this overhead is unacceptable in a high-performance
   application.

3) In the network environment, there is not likely to be a common
   namespace for files on all the connected machines. This is why the
   BSD IPC Unix domain (implemented using the filesystem namespace) is
   only good on-machine. The Internet domain for communication with
   other machines is implemented using cookie-like network addresses.

My votes are disposed towards very general advisory schemes that go
VERY fast and can work in a networked environment.

-dB
----------------
These opinions are my own, and do not necessarily represent those of
my employer.

VMS is a trademark of Digital Equipment Corporation.
Unix and System V are trademarks of AT&T Bell Laboratories.
--
{ucbvax, decvax}!mtxinu \
ihnp4!amdahl / !rtech!daveb

"If it worked, we wouldn't call it High Tech"