les@chinet.chi.il.us (Leslie Mikesell) (07/05/90)
Looking at the code for a mailer that uses lockfiles (/usr/mail/logname.lock) to obtain exclusive access to the mail file, I've become convinced that there is *no* robust way to detect and delete lockfiles that have been accidentally left in place.

The popular method of picking up the PID from the lockfile and sending signal 0 to the locking process will often falsely indicate that the file is stale, since the process that owned the previous lockfile may exit between the time the checking process reads the PID and issues the signal. Worse, regardless of the method of determination, there is no way to ensure that the file removed by unlink() is the same file that was checked. Another similar process may have just gone through the same procedure and created its lockfile between this process's check and unlink().

Am I missing something?

Les Mikesell les@chinet.chi.il.us
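For readers who have not seen the PID-plus-signal-0 check described in this post, here is a minimal sketch of it in C. The helper name lock_looks_stale() and the assumption that the lockfile holds the PID as decimal text are illustrative only; they are not taken from the mailer under discussion. The comments mark where the check-then-unlink window opens.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not from any particular mailer.
 * Returns 1 if the PID stored in the lockfile appears to be dead,
 *         0 if a process with that PID appears to exist,
 *        -1 if the lockfile cannot be read or holds no valid PID.
 */
int lock_looks_stale(const char *lockpath)
{
    FILE *fp = fopen(lockpath, "r");
    long pid;
    int got;

    if (fp == NULL)
        return -1;
    got = fscanf(fp, "%ld", &pid);   /* assumes the PID is decimal text */
    fclose(fp);
    if (got != 1 || pid <= 0)
        return -1;

    /* Signal 0 sends nothing; only existence and permission are checked. */
    if (kill((pid_t)pid, 0) == 0)
        return 0;                    /* process exists */
    if (errno == EPERM)
        return 0;                    /* exists, but owned by another user */
    return 1;                        /* ESRCH: no such process */
}

/*
 * The race: a caller that sees 1 and then calls unlink(lockpath) cannot
 * know that lockpath still names the file it just read. Another process
 * may have removed the old lockfile and created its own in between.
 */
```

The answer is only a snapshot. Nothing ties a later unlink() to the file that was actually inspected.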
chip@tct.uucp (Chip Salzenberg) (07/10/90)
According to les@chinet.chi.il.us (Leslie Mikesell):
>Looking at the code for a mailer that uses lockfiles (/usr/mail/logname.lock)
>to obtain exclusive access to the mail file I've become convinced that
>there is *no* robust way to detect and delete lockfiles that have been
>accidentally left in place.

That's about the size of it. If you want robustness, kernel locking is the only way to go.
-- Chip Salzenberg at ComDev/TCT <chip@tct.uucp>, <uunet!ateng!tct!chip>
peter@ficc.ferranti.com (Peter da Silva) (07/11/90)
In practice, storing a PID in the file and doing a kill() on it may give false positives, but they're usually harmless. It will never give a false negative (well, except over a network... in which case kernel locking is likely a lost cause as well).

The problem with kernel locking is all the incompatible "standards".
-- Peter da Silva. `-_-' +1 713 274 5180. <peter@ficc.ferranti.com>
les@chinet.chi.il.us (Leslie Mikesell) (07/12/90)
In article <CEL4GPE@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In practice, storing a PID in the file and doing a kill() on it may give
>false positives, but they're usually harmless. It will never give a false
>negative (well, except over a network... in which case kernel locking is
>likely a lost cause as well).

The non-harmless case of a false positive is where two or more processes test an existing lockfile as the owning process exits. Both decide that the lock is stale, one unlinks it and creates its own lock, then the other process completes its unlink(), which now removes a valid lockfile.

>The problem with kernel locking is all the incompatible "standards".

Right - in the case of a mail delivery agent, it has to agree with the readers that may delete or write back the spool file, and the source to all such programs running on the machine may not be available. I've added a test to prevent unlinking lockfiles less than a couple of minutes old, which should greatly reduce the possibility of trouble, but it still makes me a little queasy...

Les Mikesell les@chinet.chi.il.us
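The age test mentioned above might look something like the following sketch. The name lock_old_enough() and the 120-second threshold are assumptions standing in for "a couple of minutes"; the post does not give the actual value or code. As the post itself says, this narrows the window rather than closing it, since the file can still be replaced between stat() and unlink().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <time.h>

/* Assumed threshold: "a couple of minutes" taken as 120 seconds. */
#define MIN_STALE_AGE 120.0

/*
 * Hypothetical helper.
 * Returns 1 if the lockfile was last modified at least MIN_STALE_AGE
 *           seconds before 'now' (old enough to consider removing),
 *         0 if it is younger than that and should be left alone,
 *        -1 if stat() fails (for example, the file is already gone).
 */
int lock_old_enough(const char *lockpath, time_t now)
{
    struct stat st;

    if (stat(lockpath, &st) != 0)
        return -1;
    return difftime(now, st.st_mtime) >= MIN_STALE_AGE ? 1 : 0;
}
```

A caller would pass time(NULL) as 'now' and attempt removal only when both this test and the PID test say the lock is stale.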
chip@tct.uucp (Chip Salzenberg) (07/13/90)
According to peter@ficc.ferranti.com (Peter da Silva):
>In practice, storing a PID in the file and doing a kill() on it may give
>false positives, but they're usually harmless.

Quite. However, the spurious removal problem continues -- how do I know that a correct negative means that I can remove the file *now*? In general, I can't.

>The problem with kernel locking is all the incompatible "standards".

It's almost universally available in *some* form. The point is to use any locking method that automatically notices when a process dies, so the kernel has to get involved at some point. If you have to use #ifdefs, well, that's life.
-- Chip, the new t.b answer man <chip@tct.uucp>, <uunet!ateng!tct!chip>