[gnu.emacs.bug] Re^2: Lock files

jr@bbn.com (John Robinson) (09/15/89)

pinkas@cadev5.intel.com (Israel Pinkas ~) writes:

>> How about prepending the hostname to the lock filename, so that locks
>> created in an NFS'ed, shared lock directory are unique across machines.

>Most sites mount partitions so that they have the same name on all the
>partitions.  For example, my home directory is /usr/users/ccad/edend/pinkas
>on all machines that mount it.

>Besides, I have run into a few problems with NFS shared lock directories
>due to the write buffering.

I have always believed the best solution is to write the lock file right
into the directory of the file you are locking.  This avoids failing to
lock when things are mounted funny (we always have an odd mount point
like /mnt hanging around) or missing locks due to different symbolic link
paths.

The other problem we had here is that /usr/local/emacs/lock gets royally
pounded when a lot of diskless workstations share a server for emacs.  I
set lockdir to /tmp.  The lockfile names are strange enough that they
oughtn't cause trouble from collisions.

/jr, nee John Robinson     Life did not take over the globe by combat,
jr@bbn.com or bbn!jr          but by networking -- Lynn Margulis

alarson@SRC.Honeywell.COM (Aaron Larson) (09/16/89)

In article <45625@bbn.COM> jr@bbn.com (John Robinson) writes:

   I have always believed the best solution is to write the lock file right
   into the directory of the file you are locking.

I've actually done this, the following is the header from the new source,
if there's interest, I'll post it.

/* The following provides a file locking mechanism for GNU emacs in which
   lock files are created in the directory where the file being locked
   exists (see LOCKFILEPREFIX & SUPERLOCK_FILE_NAME in paths.h).  The
   previous file locking mechanism was to create lock files in a single
   directory that everybody had write access to, where the lock file name
   was the name of the file being locked, with "/" replaced by "!".
   The file contained the pid of the locking process, so that broken locks
   could be identified.

   The old file locking scheme could lose several ways:

     1. In a network environment, a single point of contact for lock files
        results in bottlenecks, and single point failures.
     2. In a network environment, pid numbers are not adequate to identify
        processes; machine names must be taken into account, as well.
	In point of fact, adding just the machine name is not enough in
	subnetted networks, but the probability that two machines with the
	same name will each be running an emacs with the same pid, for the
	same user id, both trying to modify the same file, is fairly low.
     3. If a file was accessible via different paths, (e.g., the path name
        contained symbolic links, or was accessible via a different mount
	point) the lock file name wouldn't be the same, and access violations
	would arise.
     4. Hard links give files different names, so lock file names don't
        match.

   The scheme implemented by the following code is to write lock files in the
   same directory as the file being locked (we still use a superlock file, and
   it too is in the same directory as the file being locked).  The lock file
   contains the user name, pid, the date the lock was acquired, and the
   machine name on which the locking process is executing.  Because the lock
   files are in the same dir as the file being locked, it gets around problems
   1 & 2 above, and minimizes the impact of 3.  You can, however, lose in
   different ways:

     A. If the file visited is a symbolic link, you still lose.  This is
        also a problem for backup files, so presumably users are somewhat
	aware of it.
     B. Still doesn't handle hard links.  We considered using the inode
        of the file, but you can lose in a large number of ways, and it
	complicates the locking strategy considerably.
     C. If you don't have write access to the directory where the file
        you are writing will go, you will be unable to lock the file.
	Once again, autosaves and backup files already have this problem,
	so presumably users are aware of it.
     D. If the name of the lock file coincides with the name of an existing
        file, AND if ask-user-about-lock returns T even though the user
	name is NIL, AND the user has write access to the file, then it is
	possible that data could be overwritten.  A judicious choice of
	LOCKFILEPREFIX should reduce this possibility to near zero.

   The following code implements the same interface as the previous
   file, and does not do any more file I/O than the old method.
 */
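As a rough modern sketch of the scheme described above (not the actual emacs source): create the lock file with O_CREAT|O_EXCL, so the existence test and the creation are a single atomic operation, and record the user, machine, pid, and date inside it.  The ".#lock-" name used in testing and the function names here are our own stand-ins for LOCKFILEPREFIX and the real routines:

```c
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Atomically create a lock file; returns 0 on success, -1 if the lock
   already exists or the directory isn't writable (loss C above). */
int try_lock(const char *lockpath, const char *user)
{
    int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    char host[256], buf[512];
    time_t now = time(NULL);
    gethostname(host, sizeof host);
    host[sizeof host - 1] = '\0';

    /* Record enough to identify (and break) a stale lock by hand. */
    int len = snprintf(buf, sizeof buf, "%s@%s pid %ld at %s",
                       user, host, (long)getpid(), ctime(&now));
    if (len > 0 && write(fd, buf, (size_t)len) != len) {
        close(fd);
        unlink(lockpath);
        return -1;
    }
    close(fd);
    return 0;
}

int unlock(const char *lockpath)
{
    return unlink(lockpath);
}
```

On a local file system, open() with O_CREAT|O_EXCL either creates the file or fails, in one step, so two processes can't both think they won; over NFS of this era O_EXCL was not reliably atomic, which feeds directly into the NFS concerns raised below.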

pinkas@cadev5.intel.com (Israel Pinkas ~) (09/19/89)

In article <31274@srcsip.UUCP> alarson@SRC.Honeywell.COM (Aaron Larson) writes:

> In article <45625@bbn.COM> jr@bbn.com (John Robinson) writes:
>
>    I have always believed the best solution is to write the lock file right
>    into the directory of the file you are locking.
>
> I've actually done this, the following is the header from the new source,
> if there's interest, I'll post it.

Your patch still does not address the problem introduced by the
statelessness of NFS.  In particular, NFS introduces a nasty race
condition: if machines client-a and client-b mount a directory from
server-1, a file created by a user on client-a might not be visible on
client-b for up to a minute.

You can get around the machine name problem on subnets by using the TCP
address instead of the machine name.  The harder problem is figuring out
whether the lock is still valid.  (i.e., did the other machine crash and
leave a bogus lock?)  You can't exactly use kill() or look at the process
table to determine if the other process is running.

You can almost eliminate the problem by using lockf() if you run the lock
daemon.  If both clients use this, you have network-transparent locking.
You do not need a lock directory or file.  You also don't have to worry
about multiple hard links or soft links.  (I say 'almost eliminate' because
I am not 100% sure that you won't run into the race conditions.)

I looked into it at one point, but since we still have Ultrix 2.2 file
servers (that don't have the lock daemon), I haven't been able to really
test it out.

-Israel Pinkas
--
--------------------------------------
Disclaimer: The above are my personal opinions, and in no way represent
the opinions of Intel Corporation.  In no way should the above be taken
to be a statement of Intel.

UUCP:	{amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!cadev4!pinkas
ARPA:	pinkas%cadev4.intel.com@relay.cs.net
CSNET:	pinkas@cadev4.intel.com

alarson@SRC.Honeywell.COM (Aaron Larson) (09/19/89)

   ... You can almost eliminate the problem by using lockf() if you run the
   lock daemon. ...

We thought of doing this, but decided against it.  The following line from
the man page on lockf highlights the problem:

     All locks associated with a file for  a  given  process  are
     removed  when  the file is closed or the process terminates.

Lots of our users typically have dozens of files loaded at one time.  It
would appear impractical to have open file descriptors for all of them.  I
guess you would only have to keep the modified buffers' files open;
however, in pre-4.0 SunOS the open-file limit was pretty low.  Even with
the 4.x limits it may not be practical, and I don't know what other
operating systems' open-file limits are like.