andrew@siesoft.co.uk (Andrew Sinclair) (08/24/90)
In the PCNFS users guide under the section "locking files" it says :
If you invoke locking services with /MS ..... any DOS system call affecting the drive will cause a fatal error :

Locking violation
Abort, Retry or Ignore:

It then says that to recover you must unmount and remount the drive. Why ?
Shouldn't it allow access once the lock has been removed (by the application/client that locked in the first place). Does this suggest that the client that gets in first and locks the file does not unlock it when it has finished ?

sounds like an odd spec of the lock manager to me !

Is there a way out of this ? I do not want to remount my drives every time I see a locked file (I'll wait). What if I have other files open on that drive that I don't want to close ?

Thanx in advance

Andrew S.
--
+---------------------+----------------------------------------------------+
|Andrew Sinclair      |andrew.......                                       |
|Siemens SDG          |                                                    |
|Nixdorf House        |                                                    |
geoff@hinode.East.Sun.COM (Geoff Arnold @ Sun BOS - R.H. coast near the top) (08/25/90)
Quoth andrew@siesoft.co.uk (Andrew Sinclair) (in <1990Aug24.090955.28756@siesoft.co.uk>):
#In the PCNFS users guide under the section "locking files" it says :
#If you invoke locking services with /MS ..... any DOS system call affecting the drive will cause a fatal error :
#
#Locking violation
#Abort, Retry or Ignore:
#
#It then says that to recover you must unmount and remount the drive. Why ?
#Shouldn't it allow access once the lock has been removed (by the
#application/client that locked in the first place). Does this suggest that the
#client that gets in first and locks the file does not unlock it when it
#has finished ?
#

We're reworking that section of the manual (faint :-)

The only time you need to unmount and remount the drive is when the remote lock manager has gone away (either due to network problems, server reboot, or genuine lock manager failure).

This is a (painful) tradeoff. Suppose the server is rebooted while you have locks established. How do you reestablish the locks? The lock manager/status monitor architecture, with a strong bias towards "real" operating systems like Unix (;-) uses the following model (grossly simplified):

(1) when a client requests a lock on a file, the Network lock manager (NLM) on the client sends a lock request to the NLM on the server

(2) the server NLM notifies the local status monitor (SM) that it needs to keep the client informed of any status changes

(3) if the server SM hasn't contacted this client before, it makes a call to the client SM to establish bidirectional notification

(4) eventually the client relinquishes the lock, and calls the server NLM

(5) if the server is holding no more locks for the client, the server SM calls the client SM to terminate their monitoring.

Now, if the server is rebooted between (3) and (4), the server NLM and SM start up and go into recovery mode:

(3a) The server NLM begins its "grace period". During the grace period, it will only accept requests to reclaim locks.
(3b) The server SM checks to see which clients it was responsible for (recorded in the "/etc/sm" directory) and calls the SM on each client to advise it that the server has rebooted.

(3c) The client SM advises the client NLM of the server state change.

(3d) The client NLM checks its database for any locks held by the server and issues "reclaim" requests for those locks.

(3e) After a suitable period, the server NLM ends the grace period and normal service resumes.

[The problem of a client reboot is less traumatic: the client SM simply calls up the server SM, which advises the server NLM to release all locks held for the client.]

Now, the problem on the PC is how to handle this SM and NLM traffic. To participate in this, you must implement an SM, which in turn means implementing a portmapper. Then you have to be sure that when an SM state change is notified you can issue all the reclaims in time, regardless of what the PC is doing. There is also a fair amount of "shadow state" to hold in order to cope with failures during the recovery process.

For PC-NFS, we decided that we couldn't justify adding all of this baggage into the product, and that we would have to adopt a simpler, if less complete, solution. We added the notion of "non-monitored locks" to the NLM, so that the PC could request a lock without provoking the server into bombarding it with SM RPC calls. And we adopted the strategy of requiring you to remount the drive (hence clearing all the state on both the server and client). Yes, it's inconvenient, and there are one or two things we can do in the future to make it less so.

I hope this has been useful. We know that we need to document all of this better, and in fact a number of publications are on the way from several different sources. (Dunno why the first one isn't out yet....)

Geoff
-- Geoff Arnold, PC-NFS architect, Sun Microsystems. (geoff@East.Sun.COM)
-- To receive a full copy of my .signature, please dial 1-900-GUE-ZORK.
Each call will cost you one zorkmid.
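
The recovery sequence Geoff describes, (1)-(5) and (3a)-(3e), can be followed in a small in-memory toy model. This is only a sketch under loud assumptions: the class and method names (ServerNLM, ClientNLM, on_server_reboot and so on) are hypothetical, there is no RPC, portmapper, or real status monitor, and the "monitored" set merely stands in for the server SM's on-disk record of clients. It is not the PC-NFS or SunOS implementation. It is meant to show why a monitored client gets its lock back after a server reboot while a non-monitored client is left holding stale state.

```python
from dataclasses import dataclass, field


@dataclass
class ServerNLM:
    """Toy server lock manager (hypothetical); in-memory locks vanish on reboot."""
    locks: dict = field(default_factory=dict)      # path -> client name
    monitored: set = field(default_factory=set)    # stand-in for the SM's on-disk client list
    in_grace: bool = False

    def lock(self, client, path, monitored=True, reclaim=False):
        # (3a) during the grace period only reclaim requests are accepted
        if self.in_grace and not reclaim:
            return False
        if self.locks.get(path, client) != client:
            return False                           # already held by another client
        self.locks[path] = client
        if monitored:
            self.monitored.add(client)             # (2)-(3) have the SM watch this client
        return True

    def unlock(self, client, path):
        if self.locks.get(path) == client:
            del self.locks[path]                   # (4) client relinquishes the lock
        if client not in self.locks.values():
            self.monitored.discard(client)         # (5) no locks left: stop monitoring

    def reboot(self):
        # Lock table is lost; (3b) the SM's record says whom to notify.
        self.locks.clear()
        self.in_grace = True
        return set(self.monitored)

    def end_grace(self):
        self.in_grace = False                      # (3e) normal service resumes


@dataclass
class ClientNLM:
    """Toy client lock manager (hypothetical)."""
    name: str
    held: dict = field(default_factory=dict)       # path -> monitored flag

    def lock(self, server, path, monitored=True):
        ok = server.lock(self.name, path, monitored=monitored)
        if ok:
            self.held[path] = monitored
        return ok

    def on_server_reboot(self, server):
        # (3c)-(3d) told of the reboot, reclaim every lock we believe we hold
        for path, monitored in self.held.items():
            server.lock(self.name, path, monitored=monitored, reclaim=True)


server = ServerNLM()
unix = ClientNLM("unix-client")
pc = ClientNLM("pc-client")
clients = {c.name: c for c in (unix, pc)}

unix.lock(server, "/export/a")                     # ordinary monitored lock
pc.lock(server, "/export/b", monitored=False)      # PC-NFS style non-monitored lock

notified = server.reboot()                         # only monitored clients are known
for name in notified:
    clients[name].on_server_reboot(server)
late_ok = pc.lock(server, "/export/c", monitored=False)  # refused: still in grace
server.end_grace()

print(sorted(notified))        # clients the SM called back
print(server.locks)            # locks the server knows about after recovery
print(pc.held)                 # what the PC still believes it holds
```

In this model the Unix client's lock is reclaimed during the grace period, while the PC still believes it holds /export/b although the server has no record of it. That mismatch is the state that the unmount and remount clears on both sides.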