[comp.mail.elm] Problem With Network Disconnects And Locks

scs@vax3.iti.org (Steve Simmons) (04/21/89)

Maybe we should start comp.mail.elm.scs... :-)

Found an interesting little problem that I'm too ignorant to really
address.  Information on why things are done the way they are would
be really appreciated.

Basic data: 4.3BSD, elm 2.2, patchlevel 1.

I was telnet'ed in from a remote host to my mail host, reading mail
with elm, when the gateway crashed.  When things came back, on
starting elm I got a series of "Waiting to read mailbox while mail
is being delivered..." messages, followed by elm exiting.  There was
a file named scs.lock in /usr/spool/mail.  Rather than just blast it,
I went and read the BSD mail, rmail, Mail and sendmail code.  All
of it uses flock() to do mailbox locking.  So *then* I blasted the
lock, and all was OK.

I read the locking code for elm, and can see no reason whatsoever to
have this lock file under BSD.  If it's only for elm, flock() should
do the job.

Along the way I patched the elm locking code to not only create this
lock, but put the PID of the creator in the file.  My first thought was
to try to do a kill( pid, 0 ) in elm to see if the locker was still
around and blast the lock if not.  That proved ineffective (e.g., if it
was locked by root, one cannot send the kill signal and can't determine
just why the kill failed).  Yes, there is more I can do but we're
getting into some pretty non-portable areas.  Before doing that work
(I volunteer) can anyone explain to me why we bother with the lockfile
at all under BSD?
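
[Illustrative sketch, not elm's code.]  The PID-in-lockfile check
described above might look roughly like this.  It assumes a platform
where kill() with signal 0 reports "no such process" and "not
permitted" as distinct errors, which is exactly the portability
question raised here; the function names are hypothetical.

```python
import os


def lock_holder_state(pid):
    """Classify the process named in a lock file by probing it with signal 0."""
    if pid <= 0:
        return "unknown"     # not a single-process id; do not probe it
    try:
        os.kill(pid, 0)      # signal 0: existence/permission check only
    except ProcessLookupError:
        return "gone"        # ESRCH: no such process, so the lock is stale
    except PermissionError:
        return "alive"       # EPERM: it exists but belongs to another user
    return "alive"


def read_lock_pid(path):
    """Return the PID stored in a lock file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Only a "gone" result would justify removing the lock; anything else
is best treated as a live lock.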

   Steve Simmons         Just another midwestern boy
   scs@vax3.iti.org  -- or -- ...!sharkey!itivax!scs
         "Hey...you *can* get here from here!"

rob@PacBell.COM (Rob Bernardo) (04/21/89)

In article <996@itivax.iti.org> scs@vax3.iti.org (Steve Simmons) writes:
+I read the locking code for elm, and can see no reason whatsoever to
+have this lock file under BSD.  If it's only for elm, flock() should
+do the job.

Actually some BSD systems use the lock file to lock a mailbox and others
use flock() on the mailbox itself. There was no way for Configure
to determine which type of locking was needed by the particular system,
so what we decided was this:

	1. If the system can do flock(), elm will use it just in case
	it is the way mailboxes get locked on the system.
	2. All systems will lock with a lock file.

This way Configure can't go wrong and the extraneous locks don't hurt.
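
[Illustrative sketch, not elm's code.]  The two-lock scheme above can
be pictured as follows: create <mailbox>.lock exclusively, then also
flock() the mailbox itself.  The function names and the non-blocking
behaviour are assumptions made for the example.

```python
import fcntl
import os


def lock_mailbox(mailbox_path):
    """Take both locks; return (mailbox_fd, lockfile_path) or raise OSError."""
    lockfile = mailbox_path + ".lock"
    # O_EXCL makes creation fail if another program already holds the lock file.
    os.close(os.open(lockfile, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644))
    mbox_fd = None
    try:
        mbox_fd = os.open(mailbox_path, os.O_RDWR)
        fcntl.flock(mbox_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Back out completely so no stray lock file is left behind.
        if mbox_fd is not None:
            os.close(mbox_fd)
        os.unlink(lockfile)
        raise
    return mbox_fd, lockfile


def unlock_mailbox(mbox_fd, lockfile):
    """Release the flock() and remove the lock file."""
    fcntl.flock(mbox_fd, fcntl.LOCK_UN)
    os.close(mbox_fd)
    os.unlink(lockfile)
```

The kernel drops the flock() when the process dies, but the lock file
stays until something removes it.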
-- 
Rob Bernardo, Pacific Bell UNIX/C Reusable Code Library
Email:     ...![backbone]!pacbell!pbhyf!rob   OR  rob@pbhyf.PacBell.COM
Office:    (415) 823-2417  Room 4E850O San Ramon Valley Administrative Center
Residence: (415) 827-4301  R Bar JB, Concord, California

syd@dsinc.DSI.COM (Syd Weinstein) (04/21/89)

In article <996@itivax.iti.org> scs@vax3.iti.org (Steve Simmons) writes:
>Before doing that work can anyone explain to me why we bother with
> the lockfile at all under BSD?

Yes, I can.  It was for two reasons:
1. (and most important) No one could figure a 'fool resistant' way to
decide if a system used .lock type locking or flock type locking for
its mailboxes.  To be safe, we did both.

2. to allow for compatibility with old MUA's that only did .lock type
locking.

Why not just rush in and do pid type locking?  We might in the future,
but for now, remember that not all systems can do kill(pid, 0).
-- 
=====================================================================
Sydney S. Weinstein, CDP, CCP                   Elm Coordinator
Datacomp Systems, Inc.				Voice: (215) 947-9900
syd@DSI.COM or {bpa,vu-vlsi}!dsinc!syd	        FAX:   (215) 938-0235

les@chinet.chi.il.us (Leslie Mikesell) (04/23/89)

In article <112@dsinc.DSI.COM> syd@dsinc.UUCP (Syd Weinstein) writes:

>Why not just rush in and do pid type locking?  We might in the future,
>but for now, remember that not all systems can do kill(pid, 0).

Nor will this work with remote file systems.  Using RFS, mounting a
common /usr/mail might be a reasonable thing to do, but lockfiles containing
pids do not work (if you expect to be able to find out if the process
still exists), since the pid may belong to a process on another machine.
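
[Illustrative sketch, not elm's code.]  One way to picture the
problem: suppose the lock file held "hostname pid" (a hypothetical
format, not something elm writes).  The pid can only be probed when
the hostname is the local one; otherwise nothing can be concluded.

```python
import os
import socket


def describe_lock(contents, local_host=None):
    """Interpret 'hostname pid' lock contents (format assumed for this sketch)."""
    local_host = local_host or socket.gethostname()
    try:
        host, pid_text = contents.split()
        pid = int(pid_text)
    except ValueError:
        return "unknown"     # not in the assumed two-field format
    if host != local_host or pid <= 0:
        return "unknown"     # pid lives on another machine; cannot probe it
    try:
        os.kill(pid, 0)      # signal 0: existence check only
    except ProcessLookupError:
        return "stale"       # no such local process
    except PermissionError:
        pass                 # exists, owned by another user
    return "held"
```

A lock written from another host always comes back "unknown", which
is the limitation in a nutshell.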

Les Mikesell 

jgd@csd4.milw.wisc.edu (John G Dobnick,EMS E380,4142295727,) (04/25/89)

From article <5081@pbhyf.PacBell.COM>, by rob@PacBell.COM (Rob Bernardo):
> In article <996@itivax.iti.org> scs@vax3.iti.org (Steve Simmons) writes:
> 
   [Rob answers Steve Simmons's question about <login>.lock files in
    /usr/spool/mail]
> 
> 	1. If the system can do flock(), elm will use it just in case
> 	it is the way mailboxes get locked on the system.
> 	2. All systems will lock with a lock file.
> 
> This way Configure can't go wrong and the extraneous locks don't hurt.

Well... almost.  Unless the system crashes, or a network interface
crashes, or the user gets disconnected, or...

Then the <login>.lock file stays around, causing no end of confusion.
(Unless, of course, a reboot clears these files.  But then who puts
code into /etc/rc.local to do this?  We don't.  [Maybe we should?])
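
[Illustrative sketch.]  A boot-time cleanup of the kind mused about
above could be as small as the following.  It assumes it runs once at
startup, before any mail program can hold a legitimate lock; the
function name is hypothetical and the spool directory is passed in
rather than assumed.

```python
import glob
import os


def clear_stale_locks(spool_dir):
    """Remove every *.lock file in spool_dir; return the names removed."""
    removed = []
    for path in sorted(glob.glob(os.path.join(spool_dir, "*.lock"))):
        try:
            os.unlink(path)
            removed.append(os.path.basename(path))
        except OSError:
            pass  # already gone or not removable; leave it for the admin
    return removed
```

Run at any other time, this would happily delete locks that are still
in use.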

I would think the advantage of using flock() is that a reboot clears *all*
them ugly locks, and resets the mail system to a "known state".  (Yes, I
got bitten by the "Your mailbox is busy, please wait" problem.  Took a
while to figure out what was happening.)

I vote with the BSD-ites who claim flock() is sufficient, and that the lock
files can be dispensed with.  (Unless someone can convince me they really
*are* required -- in addition to flock().)
-- 
John G Dobnick
Computing Services Division @ University of Wisconsin - Milwaukee
INTERNET: jgd@csd4.milw.wisc.edu
UUCP: <backbone>!uwvax!uwmcsd1!jgd

"Knowing how things work is the basis for appreciation,
and is thus a source of civilized delight."  -- William Safire

rob@PacBell.COM (Rob Bernardo) (04/25/89)

In article <2222@csd4.milw.wisc.edu> jgd@csd4.milw.wisc.edu (John G 
Dobnick,EMS E380,4142295727,) points out that extraneous locking by 
.lock files is not so extraneous; it can have negative side effects on 
networked systems. 

+I vote with the BSD-ites who claim flock() is sufficient, and that the lock
+files can be dispensed with.  (Unless someone can convince me they really
+*are* required -- in addition to flock().)

It's not exactly a matter of *voting* :-). My understanding is that 
some older BSD systems use .lock files and some newer BSD systems use 
flock(). The problem is that Configure has no good way to tell which 
locking method is used.

Okay. Anyone out there who has a BSD UNIX whose mailers used .lock files?
(I'm almost sure there were some right among the elm developers.)

ewiles@netxcom.UUCP (Edwin Wiles) (04/26/89)

In article <2222@csd4.milw.wisc.edu> jgd@csd4.milw.wisc.edu
					(John G Dobnick) writes:
....[edited]....
>I vote with the BSD-ites who claim flock() is sufficient, and that the lock
>files can be dispensed with.  (Unless someone can convince me they really
>*are* required -- in addition to flock().)

I won't mind if BSD elm versions use only 'flock()', but don't mess up the
lock file mechanism for NON-BSD sites which do not have 'flock()'.  (I'm one
of them.)
					Enjoy!

P.S.  GOOD JOB on ELM 2.2!  The new interface takes some getting used to, but
it's MUCH better than the old one!
-- 
...!hadron\   "Who?... Me?... WHAT opinions?!?" | Edwin Wiles
  ...!sundc\   Schedule: (n.) An ever changing	| NetExpress Comm., Inc.
   ...!pyrdc\			  nightmare.	| 1953 Gallows Rd. Suite 300
    ...!uunet!netxcom!ewiles			| Vienna, VA 22180

jgd@csd4.milw.wisc.edu (John G Dobnick,EMS E380,4142295727,) (04/26/89)

> In article <2222@csd4.milw.wisc.edu> jgd@csd4.milw.wisc.edu
> 					(John G Dobnick) [me]  writes:
> .....[edited]....
>>I vote with the BSD-ites who claim flock() is sufficient, and that the lock
>>files can be dispensed with.  (Unless someone can convince me they really
>>*are* required -- in addition to flock().)
  
From article <1213@netxcom.UUCP>, by ewiles@netxcom.UUCP (Edwin Wiles):

> I won't mind if BSD elm versions use only 'flock()', but don't mess up the
> lock file mechanism for NON-BSD sites which do not have 'flock()'.  (I'm one
> of them.)

I was insufficiently clear in expressing myself.  What I *should* have
written was

	"I vote with...    in addition to flock().)  The locking method
	actually used should be configurable, given that the installer
	presumably knows what is used on his/her system.  Leaving the
	default action as it currently is (use both methods) is fine, as
	long as the installer can tailor Elm to his/her site's environment."

In our case, we would probably trash the lock file and use only flock().
Other sites may wish to do the opposite.  I had no intention of suggesting
that lock files be "dropped" from the Elm product -- I realize there are
multiple and *different* Unix implementations having different facilities
and requirements.  Giving the sites the facility to select which method is
used *is* desirable, though.
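
[Illustrative sketch, not an existing elm option.]  The site-selectable
behaviour proposed here could reduce to a small decision table.  The
setting names "both", "flock" and "dotlock" and the have_flock flag
are invented for the example.

```python
def locking_plan(method="both", have_flock=True):
    """Return the lock mechanisms to use, in the order they would be taken."""
    if method not in ("both", "flock", "dotlock"):
        raise ValueError("unknown locking method: %r" % (method,))
    if method == "flock" and not have_flock:
        raise ValueError("flock-only locking requested, but flock() is missing")
    plan = []
    if method in ("both", "dotlock"):
        plan.append("dotlock")
    if method in ("both", "flock") and have_flock:
        plan.append("flock")   # "both" quietly degrades where flock() is absent
    return tuple(plan)
```

With the default, a BSD site gets both locks and a site without
flock() gets the lock file alone, matching current behaviour.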

Sorry for any confusion that I created.

dudek@frapray.ksr.com (Glen Dudek) (05/04/89)

In article <5114@pbhyf.PacBell.COM> rob@PacBell.COM (Rob Bernardo) writes:
>It's not exactly a matter of *voting* :-). My understanding is that 
>some older BSD systems use .lock files and some newer BSD systems use 
>flock(). The problem is that Configure has no good way to tell which 
>locking method is used.
>
>Okay. Anyone out there who has a BSD UNIX whose mailers used .lock files?
>(I'm almost sure there were some right among the elm developers.)

As far as I know, SunOS (at least as of 3.5, probably 4.0 as well) still
uses .lock files since flock() is not supported across NFS.

You made the right choice :-)
--
Glen Dudek
dudek@ksr.com (ksr!dudek@harvard.harvard.edu)