[comp.unix.questions] lockfile type locks

les@chinet.chi.il.us (Leslie Mikesell) (07/05/90)

Looking at the code for a mailer that uses lockfiles (/usr/mail/logname.lock)
to obtain exclusive access to the mail file I've become convinced that
there is *no* robust way to detect and delete lockfiles that have been
accidentally left in place.  The popular method of picking up the PID
from the lockfile and sending signal 0 to the locking process will often
falsely indicate that the file is stale since the process that owned the
previous lockfile may exit between the time the checking process reads
the PID and issues the signal.  Worse, regardless of the method of
determination, there is no way to ensure that the file removed by
unlink() is the same file that was checked.  Another similar process
may have just gone through the same procedure and created its lockfile
between this process's check and unlink().
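
A minimal sketch of the check-then-unlink sequence described above, assuming the lockfile holds the owner's PID as decimal text. The function names lock_looks_stale() and remove_if_stale() are hypothetical and not taken from any mailer's source; the comment marks the window in which another process can replace the file.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the lockfile looks stale, 0 if its owner seems alive,
 * -1 if the file cannot be read or parsed.
 * Assumes the file contains the owner's PID as decimal text. */
int lock_looks_stale(const char *path)
{
    FILE *fp = fopen(path, "r");
    long pid;
    int got;

    if (fp == NULL)
        return -1;
    got = fscanf(fp, "%ld", &pid);
    fclose(fp);
    if (got != 1 || pid <= 0)
        return -1;

    /* Signal 0 delivers nothing; it only tests whether the PID exists.
     * EPERM means the process exists but belongs to someone else. */
    if (kill((pid_t)pid, 0) == 0 || errno == EPERM)
        return 0;
    return errno == ESRCH ? 1 : -1;
}

/* The racy removal: nothing ties the unlink() to the file just checked.
 * Returns 1 if a file was removed, 0 if nothing was done, -1 on error. */
int remove_if_stale(const char *path)
{
    if (lock_looks_stale(path) != 1)
        return 0;
    /* RACE WINDOW: another process may already have removed the stale
     * file and created its own valid lock under the same name. */
    return unlink(path) == 0 ? 1 : -1;
}
```

The check and the unlink() both operate on a name, not on a particular file, which is why no amount of care inside lock_looks_stale() closes the window.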

Am I missing something?

Les Mikesell
  les@chinet.chi.il.us

chip@tct.uucp (Chip Salzenberg) (07/10/90)

According to les@chinet.chi.il.us (Leslie Mikesell):
>Looking at the code for a mailer that uses lockfiles (/usr/mail/logname.lock)
>to obtain exclusive access to the mail file I've become convinced that
>there is *no* robust way to detect and delete lockfiles that have been
>accidentally left in place.

That's about the size of it.  If you want robustness, kernel locking
is the only way to go.
-- 
Chip Salzenberg at ComDev/TCT     <chip@tct.uucp>, <uunet!ateng!tct!chip>

peter@ficc.ferranti.com (Peter da Silva) (07/11/90)

In practice, storing a PID in the file and doing a kill() on it may give
false positives, but they're usually harmless. It will never give a false
negative (well, except over a network... in which case kernel locking is
likely a lost cause as well).

The problem with kernel locking is all the incompatible "standards".
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.
<peter@ficc.ferranti.com>

les@chinet.chi.il.us (Leslie Mikesell) (07/12/90)

In article <CEL4GPE@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In practice, storing a PID in the file and doing a kill() on it may give
>false positives, but they're usually harmless. It will never give a false
>negative (well, except over a network... in which case kernel locking is
>likely a lost cause as well).

The non-harmless case of a false positive is where 2 or more processes
test an existing lockfile as the owning process exits.  Both decide
that the lock is stale, one creates its own lock, then the other process
completes its unlink() which now affects a valid file.

>The problem with kernel locking is all the incompatible "standards".

Right - in the case of a mail delivery agent, it has to agree with
the readers that may delete or write back the spool file, and the
source to all such programs running on the machine may not be available.
I've added a test to prevent unlinking lockfiles less than a couple of
minutes old which should greatly reduce the possibility of trouble, but
it still makes me a little queasy...

Les Mikesell
  les@chinet.chi.il.us

chip@tct.uucp (Chip Salzenberg) (07/13/90)

According to peter@ficc.ferranti.com (Peter da Silva):
>In practice, storing a PID in the file and doing a kill() on it may give
>false positives, but they're usually harmless.

Quite.  However, the spurious removal problem continues -- how do I
know that a correct negative means that I can remove the file *now*?
In general, I can't.

>The problem with kernel locking is all the incompatible "standards".

It's almost universally available in *some* form.  The point is to use
any locking method that automatically notices when a process dies, so
the kernel has to get involved at some point.  If you have to use
#ifdefs, well, that's life.
-- 
Chip, the new t.b answer man      <chip@tct.uucp>, <uunet!ateng!tct!chip>