sjs@u1100s.UUCP (Stan Switzer) (03/04/85)
[ does this really belong in net.arch? ] In article <538@rlgvax.UUCP> guy@rlgvax.UUCP (Guy Harris) writes: > > The problem lies in picking an underlying model sufficiently robust > > to do all of the things you want to do with record and file locking, ... > > Well, several UNIX versions *do* have record and file locking - System V > Release 2 Version 2, for one. They are based on John Bass' byte-stream > locking mechanism; you lock a given range of bytes in the file. > > VMS, on the other hand, has a locking mechanism that locks magic cookies; > cooperating applications agree on a name for a resource and they lock > that name when they want to lock that resource. ... > ... > This means you can lock things other than files; perhaps locking should > have nothing to do with the file system. On the contrary, all magic cookies should be tied to the file system in some way. It really burns me the way some message passing, locking, and semaphore systems use a globally administered pool of names (or worse yet, numbers!!) for the names of entities. The file system directory structure is the ONLY logical place for global names of ANY kind to reside. UNIX drew on this idea from MULTICS, and used it to good advantage. As in MULTICS, devices and executable programs are just a particular kind of enrty in the file system. In MULTICS it went further: memory, in fact everything, was named in the file system. Actually I agree about locking resources, but locking ranges of bytes in files is useful too. Anyway, for the purpose of locking resources, you just need a file (special or otherwise) to represent a resource, and you do your locking against that file. IF SOMETHING NEEDS A NAME, PUT IT IN THE FILE SYSTEM!!!!!!!!!!!!!!! -------------------------------------------------------------------------- Stan Switzer | "We're gonna have a TV party tonight. ihnp4!u1100s!sjs | Alright!" -- Black Flag
herbie@watdcsu.UUCP (Herb Chong [DCS]) (03/07/85)
In article <190@u1100s.UUCP> sjs@u1100s.UUCP (Stan Switzer) writes: >On the contrary, all magic cookies should be tied to the file system in >some way. It really burns me the way some message passing, locking, >and semaphore systems use a globally administered pool of names (or >worse yet, numbers!!) for the names of entities. The file system directory >structure is the ONLY logical place for global names of ANY kind to reside. and you pay for it too unless your entire file directory resides in memory all the time, in which case you have consistency problems, not to mention taking up real memory which could be allocated to other things besides file names. numbers are perfectly reasonable things to have as names of entities. are you going to read and make decisions based upon these names? no, a program is going to, and it is just as happy with a short name(word) as a long one, but a long one takes up so much more space. >UNIX drew on this idea from MULTICS, and used it to good advantage. As in >MULTICS, devices and executable programs are just a particular kind of >enrty in the file system. In MULTICS it went further: memory, in fact >everything, was named in the file system. and look at how much overhead MULTICS has. not that MULTICS is bad conceptually, but why run a system that permanently takes away more than half your CPU for overhead when you can run Unix, which is just about as flexible most of the time, and you get one quarter or more of your CPU back to boot. >Actually I agree about locking resources, but locking ranges of bytes in >files is useful too. Anyway, for the purpose of locking resources, you >just need a file (special or otherwise) to represent a resource, and you >do your locking against that file. locking bytes, or inability to do so, is as much a function of hardware as software. implementation that is both efficient in memory and time requires some help from both. >IF SOMETHING NEEDS A NAME, PUT IT IN THE FILE SYSTEM!!!!!!!!!!!!!!! > >-------------------------------------------------------------------------- >Stan Switzer | "We're gonna have a TV party tonight. >ihnp4!u1100s!sjs | Alright!" -- Black Flag there is something very fundamentally wrong here with stan's argument. how do i ensure no two people have locked on the same file at the same time? no, you can't do an i/o to find out, because the other process may be queued just ahead of you for the device so that a query finds it unlocked, but the locking command (i/o or whatever) encounters an already locked file. but this is a dumb disk, it just goes ahead and locks it anyway. so now you have two processes with exclusive access to the lock at the same time. you can imagine what will happen when both try to write to the file. you have to check before you actually lock in a manner that will guarantee exclusive access by one process so what do you do? you lock on a bit (word or whatever) in memory and THEN lock on the file. but why bother locking on the file when you couldn't care less about the file but some other shared resource. AH, but a SMART disk drive would stop this. big deal, the lock is still in memory, but it's the device controller's now. still not on disk. now, let's give the bit a NAME, so it's not fixed to any one location in storage. need i go on? for those of you who don't know, OS/360 (really MVS) has both internal named locks (ENQ/DEQ) and file locks. in a multiple processor environment without shared memory, file locks are used because there is really no other way. the controller has the smarts to acknowledge a RESERVE and prevent all I/O from any other CPU's to that device until the CPU issuing the RESERVE is finished with it. (i might mention at this point that a small MVS system will only have 30 or 40 disk devices attached.) when the processors share memory, the memory management units serialize access to shared storage so that a locking technique works with words (actually, double words) for all shared resources. speed is of main concern here. you do not want to tie up a shared resource more than you have to when a typical i/o instruction can take tens of thousands of CPU cycles and you need to do at least two to lock on a file by setting bits in the filesystem. why two? one to issue the lock, and another to see if you really got it. if you don't have to do the last one, then you're already got "magic cookies" that don't use the file system. there is a very pragmatic reason for using using named "magic cookies" in a system for interprocess synchronization rather than the file system. it takes orders of magitudes less time and the system can continue to schedule processes instead of waiting around for a lock to be satisfied. also, since the lock is in memory and never needs to be updated to disk, consistency of the locks is ensured with much less overhead. this really belongs in net.flame or net.religion.operating-systems, but i am putting it here since no-one else will pay attention otherwise. Herb Chong... I'm user-friendly -- I don't byte, I nybble.... (but i'm sick and tired of ignorance) UUCP: {decvax|utzoo|ihnp4|allegra|clyde}!watmath!water!watdcsu!herbie CSNET: herbie%watdcsu@waterloo.csnet ARPA: herbie%watdcsu%waterloo.csnet@csnet-relay.arpa NETNORTH, BITNET, EARN: herbie@watdcs, herbie@watdcsu
henry@utzoo.UUCP (Henry Spencer) (03/07/85)
Nobody ever said that locks should necessarily be *implemented* using the file system. The point is that they should *look* like part of the file system to the customer. We have a perfectly good name->resource mapping mechanism already in existence; why invent another? -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
bass@dmsd.UUCP (John Bass) (03/07/85)
As the author of locking(2) and lockf(2) I have a few comments: First the locking system I designed IS a simple semaphore system in which each byte of the file address space is a magic cookie to be locked, unlocked, and tested. It so happens that I included semantics to deal wwith ranges of magic cookies .. thus the byte stream nature. The code started out as a semaphore manager for a LUNDY display memory system to allow several processes to maintain the display lists with races. The soft blocking was added to minimize the number of system calls and improve performance of the graphics applications. In 1980 I had a stiff political argument with ONYX management and a contractor over how to do file locking. The suggested "mail box" semaphores with a global 64 bit cookie/resource or a P & V semaphore drive ala the UCLA implementation in the late 70's. I resurected the lundy code and proposed it be recoded to fit into the filesystem and after nearly being run out of town it ended up the ONYX locking(2) call. The soft blocking is generally known "as enforcement mode". In 1981 I published the work in the USENIX news letter login: and it has been in the public domain. After much hassle it went thru the /usr/group standard proposal with three minor changes. 1) the soft blocking was changed to be conditionally enabled based on the SGID bit of a file to remove distructive behavor in the hands of children. Locking /etc/passwd is generally unhealthy and kids in the educational area like such games although comercial systems are generally free of such bull. 2) lock/unlock actions were defined to deal with the previous record when given negative lengths saving two addititional lseek(2) calls. 3) A testonly call was added to supplement the test&lock call, thus removing race conditions in some applications. I did not implement shared locks of any kind for several reasons. I have had a number of requests to do so and flames as why necessary. But I have never been able to reach an agreement over the semantics or how to deal with certain race conditions. There are several DIFFERING shared locking semantics: 1) A shared lock to be used for an extended update process. The intent is that only one process can hold the update lock preventing other update process from gaining access. Any reader can access the data. 2) A shared lock to be used by reading processes to prevent updates. Any reader can request a duplicate lock, all writers are blocked. 3) locks that can be promoted from 1 to 2 and back again. Type 1 locks are the most common implementation. A payroll record is locked while the operator updates the name/address on a CRT. With my implementation this application can be implemented with a clean work around. The update process locks the record and sets a timestamp for the record (or subrecord) to be updated then frees the lock. Other updating processes note the record busy while the timestamp is active. The updating process checks the timestamp to insure that it is the SAME prior to updating the record, if it changed the operator took too long (IE went to lunch) and some other process updated the record. The process can either have the operator redo the transaction with the new data displayed, or the two records can be diff edited togather. Where money amounts are involved this is easy. Type 2 locks are the most needed but cause REAL design problems. A process traversing an N-Way tree data base needs to protect upper nodes from becoming split/compacted and thus commiting the traversing process into garbage land. This effectively prevents ANY insert/delete operations in a B-tree database since the trama can extend to the root node -- thus a transversal must lockout add/delete operations for the duration. These type locks can be implemented with lockf(2) semaphores indirectly or with the aid of a deamon process -- this is a tough design problem and semaphores of any type will not help. The solution is to do all database actions in a deamon process with fifo's to the user interface -- or to put the entire data base manager into the kernel where the critical regions can be handled properly. A unix based transaction filesystem is best solution for many problems of this class. The original 1981 code that was published has a race condition between the locked call and the inode lock for the enforcement hook in rdwri. The fix is minor but varies between the varities of unix because of changes in the filesystem code. The fix is to move the inode lock into the locked procedure -- systems implementors can write and I will provide the several line fix. I will attempt to repost the lockf.c file and some of the documents to net.source within a week or so ... It has went thru some minor changes in the last couple months -- with the help of some 4.2 sites it now runs under 4.2 but is undefined for the network filesystem. The 4.2 network filesystem is "stateless" to handle certain error problems. This design choice I think will make complex databases nearly impossible because of integrity problems after certain failures. Lockf can be added as RPC's to the host machine in distributed filesystems. -- John Bass DMS Design (System Performance and Arch Consultants) {dual,fortune,idi,hpda}!dmsd!bass (408) 996-0557
herbie@watdcsu.UUCP (Herb Chong [DCS]) (03/09/85)
>Nobody ever said that locks should necessarily be *implemented* using >the file system. The point is that they should *look* like part of >the file system to the customer. We have a perfectly good name->resource >mapping mechanism already in existence; why invent another? > Henry Spencer -------------------------------------- >However, I believe that in this case you are making an unnecessary >assumption that "file system" means "disk drives". The first is a >logical concept, while the second is a physical concept. Part of the >file system (the temporary part) can be kept in (virtual) memory at all >time, and the magic cookies can be in this part. The advantage of >keeping locks in the file system is that the single naming domain >allows any program to access any object, if necessary. >Ralph Johnson point taken, gentlemen. i alluded to this possibility when i spoke about keeping the file system in real memory. why not map to virtual memory instead? in fact, why not have an entire file system in virtual memory? (here i use file system to mean the unix definition of file system which corresponds to a logical disk pack rather than the ENTIRE file system.) let's have a look at the pluses and minuses. first the pluses: 1) consistent object naming convention (a pathname) 2) high access speed and ease of consistency checking 3) trivial generalization to virtual files that can be read/written, and otherwise manipulated using existing system calls 4) small changes in kernel code, if any, of existing unix systems now some minuses: 1) if the virtual files are being used for anything other than locking, yur system MUST not be memory bound and MUST have a well tuned pageing system. otherwise your CPU thrashes 2) relating file systems to a lock is not as intuitively obvious as you think. it's hard enough to explain interprocess synchronization sometimes without bringing in the complexity of using the file system for locking. mostly, i am quibbling over conceptualization. even if the lock is viewed as a file, what is actually happening is still that a named object in storage (usually a bit or word) is what the system itself will be testing for existence of a lock. you have just added a layer of abstraction to the problem to unify a logical description of the system. deep down inside, it's still bits in memory that are being massaged. (i wonder why AT&T or Berkeley don't implement a virtual file system for unix. i think it would be a wonderful thing, if done right.) as a side note, OS/360 (really MVS) implements a virtual file system if one desires to by refering to UNIT=VIO in DD statements. the file (which cannot be permanent) is mapped to virtual storage and is handled by paging devices if real I/O needs to be done. one must not be storage constrained, have an inefficient pageing system, nor be accessing large amounts of storage (large files) with it or the system will page itself to death. MVS never thrashes due to the design of the scheduler, but it will spend almost all it's time in page-wait if many large VIO files are open at once. a good MVS system administrator will keep a careful watch on usage of VIO. MVS never uses the VIO files for locking, to my knowledge, though there is nothing to stop it from doing so. the ENQ/DEQ facility used for locking files is generalized enough to lock on any resource. as a further digression, i keep mentioning MVS because i know a lot about it, and just about everything in operating system design has been tried on it. if nothing else, every algorithm that has even a reasonable chance of working was tried at one time or another on OS/360. Herb Chong... I'm user-friendly -- I don't byte, I nybble.... UUCP: {decvax|utzoo|ihnp4|allegra|clyde}!watmath!water!watdcsu!herbie CSNET: herbie%watdcsu@waterloo.csnet ARPA: herbie%watdcsu%waterloo.csnet@csnet-relay.arpa NETNORTH, BITNET, EARN: herbie@watdcs, herbie@watdcsu
jlg@lanl.ARPA (03/09/85)
> [ does this really belong in net.arch? ] Yes, it pertains to the archetecture of operating systems. There is no other network forum for this discussion and I don't know how to make one. > It really burns me the way some message passing, locking, > and semaphore systems use a globally administered pool of names (or > worse yet, numbers!!) for the names of entities. The file system directory > structure is the ONLY logical place for global names of ANY kind to reside. This assumes that the shared resources are global, that processes that don't share resources are willing to put up with the overhead of having this capability in the file interface, and that the designer agrees with your religious conviction that the file system should be the only global entity. SOME information about shared resources needs to in the system just as SOME I/O interface must be in the system. But the place for rarely used or very sophisticated features is at some level higher than the system kernal. > IF SOMETHING NEEDS A NAME, PUT IT IN THE FILE SYSTEM!!!!!!!!!!!!!!! And therby slow the file system down for ALL users whether they use the additional global naming capibility or not. J. Giles
avie@cmu-cs-spice.ARPA (Avadis Tevanian) (03/09/85)
> worse yet, numbers!!) for the names of entities. The file system directory > structure is the ONLY logical place for global names of ANY kind to reside. Would you care to say *why* the file system is the only logical place for global names to reside? When I read articles like this, I begin to worry. As far as I'm concerned, a file is just one of many resources in a system. In fact, THERE IS NO REASON that the kernel need know what a file is! Of course, many of the readers of this newsgroup (hopefully not all) are still living in the primitive world of "the Unix Way." We are currently developing a new kernel based on Accent (which was developed here at CMU). Our kernel, conceptually, only knows how to manipulate address maps of processes. The file system is implemented by user level processes, which understand the disk format of the file systems, whether they be on a Fujitsu Eagle, RX01 floppy, or over the ethernet. When raw disk I/O needs to be performed, drivers are called. Even these drivers need not exist in the kernel proper, except that it is more efficient to put them in the kernel. So, in my opinion, it is better to have a general name server, which can assign names to a variety of objects (files, semaphores, ports, whatever). Related to locking, it seems to me that doing locking in the file system is not such a great idea if you want any type of speed. I wrote up a 4.2 semaphore package (in the kernel) which runs on a VAX 784 (4 780 processors connected with shared memory). A P/V combination takes only about 500 microseconds (that includes 2 traps, verification of valid parameters, and all that ugly stuff). I'd like to see you do that with file system locking. Avie Tevanian
pdbain@wateng.UUCP (Peter Bain) (03/09/85)
> >point taken, gentlemen. i alluded to this possibility when i spoke >about keeping the file system in real memory. why not map to virtual >memory instead? in fact, why not have an entire file system in virtual >memory? (here i use file system to mean the unix definition of file >system which corresponds to a logical disk pack rather than the ENTIRE >file system.) Sorry, Herb - it's been tried. The system is Multics, (which was designed by MIT, Bell Labs, and GE) and had absolutely everything anyone had ever though about putting in an operating system. :-) ) The problem with files == segments in Multics was that everything was demand paged, and so if you were going through the file sequentially, your working set got cluttered with stale pages. I am not saying that this has to happen, but that it's not quite as simple as it appears. -- - peter bain ...!{allegra|decvax|clyde|ihnp4 }!watmath!wateng!pdbain hard mail: CCNG, CPH-2369A, University of Waterloo, Waterloo, Ont. Canada N2M 5G4 telephone: (519) 885-1211 x2810
tihor@acf4.UUCP (Stephen Tihor) (03/11/85)
ELXSI did that in Embos, their own operating system, philosphically Unix-based (much as Algol 68 was ALgol 60-based). There is one general name service that covers Files, Devices, even System calls. Thus there is one general protection mechanism that covers them all, etc. A very clean effect. [Of course their Ads are still filled with Vapourware and Marketing's idea of benchmarks, but they are a fairly new company so I can expect them to even behave up to DEC's standards.(;-))]
mat@amdahl.UUCP (Mike Taylor) (03/11/85)
> for those of you who don't know, OS/360 (really MVS) has both internal > named locks (ENQ/DEQ) and file locks. in a multiple processor > environment without shared memory, file locks are used because there is > really no other way. the controller has the smarts to acknowledge a > RESERVE and prevent all I/O from any other CPU's to that device until > the CPU issuing the RESERVE is finished with it. It's a small point, but the facts might as well be straight. Since RESERVE/RELEASE locks an entire volume, not a file, resulting in unwanted side-effects such as deadlocks, the internal named locks may be propagated through all sharing processors by a facility called Global Resource Serialization (GRS). GRS uses channel-to-channel connection to pass a token around all connected processors. Possession of the token allows tasks (processes) on that system to obtain/release global locks. The token is passed every n milliseconds (n = tunable parameter). -- Mike Taylor ...!{ihnp4,hplabs,amd,sun}!amdahl!mat [ This may not reflect my opinion, let alone anyone else's. ]
bass@dmsd.UUCP (John Bass) (03/12/85)
From Mr Tevanian: "Would you care to say *why* the file system is the only logical place for global names to reside? .. In fact, THERE IS NO REASON that the kernel need know what a file is! Of course, many of the readers of this newsgroup (hopefully not all) are still living in the primitive world of "the Unix Way." More Ivory Tower snobbery we can do with out ... every CSc department has its pet students and operating system research programs to produce mindless trash to be ignored by the rest of the sane world (how this for an attention getter living in the primative world). OS research is the fabric of learning what and what not do do in real life -- more the latter I feel. I will grant him that an OS NEED NO KNOW about a file as a fundamental resource for some applications -- on the other hand the file is the basis of most commercial applications environments. From the abstract point of view nami in UNIX provides the mapping from a character string name space to a resource ID (an inode). A resource ID in my model of lockf/locking is a semaphore point with a large number of magic cookies in it. Of course most resource ID's are files, but others are network interfaces, system services, drivers, etc. It is unfair to say that every unix file descriptor is a file. As for performance you quote for PV operations ... the lockf call operates on a 780 with similar timings -- thus there is nothing particularly sloooow about locking and semaphores done with inode handles. Path/file names are a good (if not best) way of describing resource IDs -- the magic number approach is trash --- -- John Bass DMS Design (System Performance and Arch Consultants) {dual,fortune,idi,hpda}!dmsd!bass (408) 996-0557
wapd@houxj.UUCP (Bill Dietrich) (03/14/85)
(Let's keep the discussion friendly and factual, okay ?) I agree with the side that says locks or resources or whatever should look like files if possible. The reason is that the user should have to deal with as few naming conventions or ideas as possible. If everything can be unified into one naming convention (tree-structured, path-names), the user will have a managable number (one) of things that must be understood on the way to solving the problem that is of interest. So I think it is easier to know how files are named and to remember that some files are really locks, some are devices and some are data, than to remember that there is one way to name locks, another to name devices, another for data. I really don't care whether you want to define the naming convention as names of locks (and some locks happen to have data associated with them) or as names of files (and some files happen to have locks associated with them). As a side comment, at least in Unix, using file names to represent "pure" locks shouldn't slow down access to regular files. The kernel already tests whether a file is regular, character special, block special, pipe, etc. "Lock" is just another case. Bill Dietrich houxj!wapd
jlg@lanl.ARPA (03/18/85)
> I agree with the side that says locks or resources or whatever > should look like files if possible. The reason is that the > user should have to deal with as few naming conventions or > ideas as possible. If everything can be unified into one > naming convention (tree-structured, path-names), the user will > have a managable number (one) of things that must be understood > on the way to solving the problem that is of interest. I agree with the above philosophy but not with the terminology. There should indeed be only one system pool for name/resource management. But this is NOT the file system. The main function of the file system is to provide a mechanism for data transfer between a process and certain types of resources. Locks and semaphors (as well as perhaps other resources) don't need this functionality of the file system and should not be associated with the file system. Note that the file system will use whatever resource naming convention is implemented by the appropriate part of the OS. The file system may also use locks or semaphors which are implemented within the OS (to prevent two processes from simultaneously writing to the same place, for example). Both of these things are functions of the operating system which are distinct from the file system. J. Giles