wittle@dg-rtp.rtp.dg.com (Mark Wittle) (05/16/91)
the question
------------

How should NFS access to mandatory locked files be handled? How does your implementation handle it?

the issue
---------

[Read carefully - there will be a quiz later.]

Access to mandatory record locking (MRL) files across NFS introduces a new class of transient errors - mandatory record locks. NFS access to a remote file is provided by an agent process, NFSD, on the remote system. Each NFSD is time-shared among all remote NFS users. Thus, it is not acceptable for an NFSD to pend indefinitely (on behalf of a single remote user) waiting to gain access to an MRL file that is locked.

Therefore, read and write accesses to MRL files that fail due to mandatory record locks must either be retried later, or fail. If they are to be retried later, then the BIOD can continue to attempt the file access periodically, but in the event that the access fails repeatedly, it boils down to this - the issue must be resolved when the user closes the file. If the calls are to fail, then the user must be notified of the error asynchronously, which may result in the user's close(2) call returning a transient error (e.g., EAGAIN) caused by a write request encountering a mandatory record lock.

Not only is this error not defined in any unix standard, it is not a good semantic. Currently, the cases where close returns an error (e.g., timeout while writing to a soft mounted file system) are included for correctness only - the file data might not be what the user intended, but there isn't anything the user can do about it.

Here is the difficulty. Returning a transient error from close implies that the user can correct the situation. However, once the close call has returned (successfully or in error), the user can no longer access the file, because the file descriptor for the file is no longer valid, and the file is closed.
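
To make the descriptor problem concrete, here is a minimal sketch in Python (a thin wrapper over the same system calls). It runs against a local temporary file, so no mandatory lock or NFS mount is involved and close succeeds; the function name write_after_close is made up for illustration. It only shows that once close has returned, a retry through the same descriptor is impossible.

```python
import errno
import os
import tempfile


def write_after_close():
    """Write, close, then try to reuse the descriptor.

    Returns the errno of the second write attempt (0 if it succeeded).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"record 1\n")
        # On an NFS client this data may still be sitting in the client
        # cache; close is where a deferred write error would surface.
        try:
            os.close(fd)
        except OSError:
            # Hypothetical case: close reports a transient error such as
            # EAGAIN. As the post argues, the descriptor is gone anyway.
            pass
        try:
            # The "retry" an application might want to attempt.
            os.write(fd, b"record 1 again\n")
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        os.unlink(path)


print(errno.errorcode[write_after_close()])  # prints EBADF
```

An application that wants a chance to react can call fsync on the descriptor before close, so that deferred write errors are reported while the descriptor is still valid. Whether that would help with an EAGAIN caused by a mandatory lock depends on the client implementation.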
Before the close call has returned, the fate of the cached data must already be decided - the transient error on the server becomes permanent on the client, with possible loss of data.

This problem is a result of client-side buffering, and could be avoided if NFS were able to force all remote access to MRL files to be done synchronously. However, the client cannot ensure that a file does or does not have mandatory locking enabled (since the bit can be set or cleared at any time), nor can the server ensure that the user's access is synchronous.

the quiz
--------

1) If an NFSD tries to access an MRL file, but the NFSD's access DOESN'T CONFLICT with any existing locks (or no locks are set), the NFSD should -

   a) go ahead and access the file (why not, nobody is denying the access).
   b) return an error, like ENOTSUPPORTED (too bad it's not in the NFS protocol), that implies that access to MRL files isn't supported by NFS.

2) If an NFSD tries to access an MRL file and the file is CURRENTLY LOCKED, the NFSD should -

   a) pretend that the (client-side) user opened the file non-blocking and return EAGAIN.
   b) pend, waiting for the record lock to be removed (who cares if all of the NFSDs get hung).
   c) return an error, like ENOTSUPPORTED (too bad it isn't in the NFS protocol), that implies that access to MRL files isn't supported by NFS.

3) If a BIOD receives the error EAGAIN from the server while performing a read-ahead or write-behind operation, the BIOD should -

   a) treat it like a retryable error (like a timeout) and try it again later (or wait for the user process to try the operation directly).
   b) treat it like a non-retryable error, and mark the file inaccessible (or wait for the user process to try the operation directly).

4) Same as question 3, but with an ENOTSUPPORTED error instead.

5) A user process is performing NFS write operations during a close(2) call, and receives an EAGAIN error from the server. The O.S. should -

   a) return EAGAIN (who cares what the man page says).
   b) try the operation continually, until it succeeds.
   c) send the user process a SIGKILL, and pretend it didn't happen.

6) A user process is copying an MRL file across NFS to a remote file system. The O.S. should -

   a) notice that the mandatory locking bit is set before making the NFS setattr call (on the client), and return ENOTSUPPORTED.
   b) notice that the mandatory locking bit is set before creating the file (on the server), and return ENOTSUPPORTED.
   c) allow the create to succeed, and then when the user attempts to write data to the file, fall back to one of the cases presented above.

Post a reply, or email to wittle@dg-rtp.dg.com, and I'll summarize.