[net.unix-wizards] Extended file system / File locking on networks

gnu@l5.uucp (John Gilmore) (12/30/85)

In article <1011@brl-tgr.ARPA>, gwyn@brl-tgr.ARPA (Doug Gwyn) writes:
> AT&T's RFS, I was told, treats a network link going down the same
> as it would a disk going off-line; there will be an error returned
> from any subsequent attempt to do I/O to the inaccessible file.
> The obvious alternative to I/O errors when a net link goes down is
> to block processes doing remote file I/O over the link until it
> comes back up; this is probably unwise for record locking systems.

The Sun NFS provides both options when a link or machine goes down.  If
you have mounted the file system "hard", then it blocks I/O ops until
it comes back.  If you mount "soft", it retries a few times and then
returns an error code.  I tended to mount non-critical stuff soft,
e.g. my net.sources archives, so that if I touched them while the server
was down, I wouldn't hang with unkillable processes.  For your root
partition you tend to want a hard mount...
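For concreteness, this choice shows up as mount options in /etc/fstab on an NFS client.  The server name and paths below are invented, and exact option spellings vary by release; this is just a sketch of the hard/soft distinction:

```
# hard mount: I/O blocks until the server comes back
fserver:/usr      /usr         nfs  rw,hard  0 0
# soft mount: I/O returns an error after a few retries
fserver:/netsrc   /usr/netsrc  nfs  rw,soft  0 0
```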

> Note that full support for UNIX file system semantics is a crucial
> issue for AT&T UNIX System V systems, which support record locking.

Note that 4.2BSD also has file locking support, and that it doesn't work
on NFS, and that so few programs break because of this that it's not
worth mentioning.  How many things really use Sys V file locking?
Certainly not all the Unix utilities that remain unchanged since V7.

Note also that a serious file locking mechanism on a network must provide
a way for a user program to be notified that the system has broken its lock.
This situation occurs when a process locks a file on another machine, 
and a comm link between the two machines goes down.  You clearly can't
keep your database down for hours while AT&T (grin) puts your long line
back in service, so the lock arbiter reluctantly breaks the lock.  (It
can't tell if your machine crashed or whether it was just a comm
line failure anyway.)  Now everybody can get at the file OK, but when the
comm link comes back up, the process will think it owns the lock and
will muck with the file.  So far nobody has designed a mechanism to tell
the process that this has happened, which means to be safe the system must
kill -9 any such process when this happens (e.g. it must make it *look*
like the system or process really did crash, even though it was just a
comm link failure).  I'm not sure how you even *detect* this situation
though.

This never happened on single machines with file or record locking because
when the kernel crashes, it takes all the user processes with it, so
when it comes back up, they won't be around to munge the file.

Sun (Jo-Mei Chang) is doing some research on how to have the lock
manager know within 30 seconds or so that your host has gone down (so
it can break the lock), but last time I heard, her scheme relied
heavily on broadcast or multicast packets and got very inefficient as
soon as you started doing serious traffic thru a gateway or a
non-broadcast network.  And even if they implemented the System V file
locking standard using such a lock manager, that wouldn't solve the
above problem.

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (12/30/85)

> Note that 4.2BSD also has file locking support, and that it doesn't work
> on NFS, and that so few programs break because of this that it's not
> worth mentioning.  How many things really use Sys V file locking?

If I had it, I'd sure use it.  (RECORD locking more than FILE locking.)
Living on a 4.2BSD kernel, I have no record locking at all (my System V
emulation obviously can't compensate adequately for this lack).  Since
record locking is a recent addition, it is not surprising that previously
existing utilities don't use it; that proves nothing about its possible
future importance.

> Note also that a serious file locking mechanism on a network must provide
> a way for a user program to be notified that the system has broken its lock.
> This situation occurs when a process locks a file on another machine, 
> and a comm link between the two machines goes down.  You clearly can't
> keep your database down for hours while AT&T (grin) puts your long line
> back in service, so the lock arbiter reluctantly breaks the lock.  (It
> can't tell if your machine crashed or whether it was just a comm
> line failure anyway.)  Now everybody can get at the file OK, but when the
> comm link comes back up, the process will think it owns the lock and
> will muck with the file.  So far nobody has designed a mechanism to tell
> the process that this has happened, which means to be safe the system must
> kill -9 any such process when this happens (e.g. it must make it *look*
> like the system or process really did crash, even though it was just a
> comm link failure).  I'm not sure how you even *detect* this situation
> though.

I don't see a big problem.  There are three possible cases of failure:
(1)  System owning the data crashes.  In this case, the remote process
will soon perform an I/O on the locked record/file (if it doesn't, you
have a problem even on a single system) which will fail (should return
EIO; could generate a signal instead, I suppose).  The regular failure
recovery should suffice (involves freeing locks, perhaps as a side-effect
of closing the file descriptor, perhaps automatically upon I/O error).
(2)  Communication link crashes.  (3)  Remote system crashes after
planting a lock.  Cases (2) and (3) are the interesting ones, but they
can be easily handled by simply pinging the locking system when a lock
conflict occurs.  (Various strategies could be used to reduce pinging
frequency, if desired, but I don't think it would be necessary.)  If the
locker denies knowledge of the lock, then void it locally and proceed.

The above approach probably doesn't work on stateless remote file
systems such as NFS, but this started out as a general RFS discussion.