[comp.protocols.nfs] NFS & mandatory lock files

wittle@dg-rtp.rtp.dg.com (Mark Wittle) (05/16/91)
the question
------------
How should NFS access to mandatory locked files be handled?
How does your implementation handle it?


the issue
---------
[Read carefully - there will be a quiz later.]
Access to mandatory record locking (MRL) files across NFS
introduces a new class of transient errors - mandatory record locks.
NFS access to a remote file is provided by an agent process, NFSD,
on the remote system.  Each NFSD is time-shared among all remote
NFS users.  Thus, it is not acceptable for an NFSD to pend
indefinitely (on behalf of a single remote user) waiting to gain
access to an MRL file that is locked.

Therefore, read and write accesses to MRL files that fail due to
mandatory record locks must either be retried later, or fail.
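
As a rough illustration of the "fail" alternative, here is a minimal
sketch of a server-side read that refuses to pend.  It assumes the
host follows the System V behaviour in which a read on a descriptor
opened with O_NONBLOCK returns EAGAIN when it collides with a
mandatory record lock; many systems do not implement mandatory
locking at all.  The helper name nonblocking_read is hypothetical
and is not taken from any NFS implementation.

```python
import errno
import os


def nonblocking_read(path, length, offset=0):
    """Read `length` bytes at `offset` without pending on a lock.

    Returns (data, 0) on success, or (b"", errno.EAGAIN) when the host
    reports that the read would have blocked.  Other errors propagate.
    """
    # O_NONBLOCK is what asks the host not to pend on a conflicting
    # mandatory lock (assumption: System V style semantics).
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        return os.pread(fd, length, offset), 0
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            return b"", errno.EAGAIN
        raise
    finally:
        os.close(fd)
```

The caller gets the transient error back immediately and can decide
whether to retry or to pass the failure along to the client.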

If they are to be retried later, then the BIOD (the client-side
block I/O daemon) can keep attempting the file access periodically.
But if the access fails repeatedly, the issue must still be
resolved by the time the user closes the file.

If the calls are to fail, then the user must be notified of the
error asynchronously, which may result in the user's close(2) call
returning a transient error (e.g., EAGAIN) caused by a write
request encountering a mandatory record lock.  Not only is this
error not defined in any unix standard, it is not a good semantic.
Currently, the cases where close returns an error (e.g., timeout
while writing to a soft mounted file system) are included for
correctness only - the file data might not be what the user intended,
but there isn't anything the user can do about it.

Here is the difficulty.  Returning a transient error from close
implies that the user can correct the situation.  However, once
the close call has returned (successfully or in error), the user
can no longer access the file, because the file descriptor for
the file is no longer valid, and the file is closed.  Before the
close call has returned, the fate of the cached data must already
be decided - the transient error on the server becomes permanent
on the client, with possible loss of data. 
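
To make the point concrete, here is a minimal sketch of what an
application would have to do to keep the error correctable: flush
while the descriptor is still valid, and only then close.  It
assumes that fsync(2) pushes the client's write-behind data to the
server and reports EAGAIN if a mandatory lock is in the way, which
is an assumption about the client, not something NFS guarantees.
The name flush_with_retry and its parameters are hypothetical.

```python
import errno
import os
import time


def flush_with_retry(fd, sync=os.fsync, retries=5, delay=0.5,
                     sleep=time.sleep):
    """Call sync(fd) until it succeeds or the retries run out.

    Returns True once the data is flushed and False if every attempt
    hit EAGAIN.  Any other error is re-raised.  The descriptor is never
    closed here, so the caller can still act on a False result.
    """
    for attempt in range(retries):
        try:
            sync(fd)
            return True
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise
            if attempt + 1 < retries:
                sleep(delay)  # give the lock holder time to release
    return False
```

A caller would run flush_with_retry(fd) before os.close(fd).  On a
False result the descriptor is still open, so the data can be saved
elsewhere or the user told, which is exactly what is impossible once
close has returned.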

This problem is a result of client side buffering, and could be
avoided if NFS were able to force all remote access to MRL files
to be done synchronously.  However, the client cannot ensure
that a file does or does not have mandatory locking enabled (since
the bit can be set or cleared any time), nor can the server ensure
that the user's access is synchronous.
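
For reference, a minimal sketch of the check in question.  It
assumes the System V convention, in which mandatory locking is
enabled on a regular file by setting the set-group-ID bit while
leaving group execute clear.  The function name is hypothetical.

```python
import stat


def has_mandatory_locking(mode):
    """Return True if `mode` marks a regular file for mandatory locking.

    Assumes the System V convention: set-group-ID on, group-execute off.
    """
    return (stat.S_ISREG(mode)
            and bool(mode & stat.S_ISGID)
            and not (mode & stat.S_IXGRP))
```

A client could apply this to the mode it gets back with the file
attributes, but the answer is only as fresh as those attributes.
The mode can change on the server right after the check, so the
result is a hint and not a guarantee.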


the quiz
--------
1) If an NFSD tries to access an MRL file, but the NFSD's access
   DOESN'T CONFLICT with any existing locks (or no locks are set),
   the NFSD should -
   a) go ahead and access the file (why not, nobody is denying the access).
   b) return an error, like ENOTSUPPORTED (too bad it's not in the
      NFS protocol) that implies that access to MRL files isn't
      supported by NFS.

2) If an NFSD tries to access an MRL file and the file is CURRENTLY
   LOCKED, the NFSD should -
   a) pretend that the (client-side) user opened the file non-blocking
      and return EAGAIN.
   b) pend, waiting for the record lock to be removed (who cares if
      all of the NFSD's get hung).
   c) return an error, like ENOTSUPPORTED (too bad it isn't in the
      NFS protocol) that implies that access to MRL files isn't
      supported by NFS.

3) If a BIOD receives the error EAGAIN from the server while
   performing a read-ahead or write-behind operation, the BIOD should -
   a) treat it like a retryable error (like a timeout) and try
      it again later (or wait for the user process to try the
      operation directly).
   b) treat it like a non-retryable error, and mark the file
      inaccessible (or wait for the user process to try the
      operation directly).

4) Same as question 3, but with an ENOTSUPPORTED error instead.

5) If a user process is performing NFS write operations during a
   close(2) call and receives an EAGAIN error from the server, the
   O.S. should -
   a) return EAGAIN (who cares what the man page says).
   b) try the operation continually, until it succeeds.
   c) send the user process a SIGKILL, and pretend it didn't happen.

6) A user process is copying an MRL file across NFS to a remote file
   system.  The O.S. should -
   a) notice that the mandatory locking bit is set before making the
      NFS setattr call (on the client), and return ENOTSUPPORTED.
   b) notice that the mandatory locking bit is set before creating
      the file (on the server), and return ENOTSUPPORTED.
   c) allow the create to succeed, and then when the user attempts
      to write data to the file, fall back to one of the cases
      presented above.


Post a reply, or email to wittle@dg-rtp.dg.com, and I'll summarize.