[comp.mail.mush] dead.letter locked twice

fletcher@cs.utexas.edu (Fletcher Mattox) (07/31/90)

[ An earlier version of this escaped with a typo.  I've canceled it.
Apologies for duplicates on the mail list. --Fletcher ]

This is 7.1.2 with DOT_LOCK and HOMEMAIL defined.

  cs.utexas.edu% mush -n! fletcher
  To: fletcher
  
  yo
  ~q
  /v/ai/v0/fletcher/dead.letter already locked, waiting................

which will wait forever.  Same thing happens with ~E and with
generating two SIGINTs.  If I "set dead=~/somethingelse", then
the problem goes away.

schaefer@CSE.OGI.EDU (Barton E. Schaefer) (07/31/90)

On Jul 31, 12:39am, Fletcher Mattox wrote:
} Subject: dead.letter locked twice
}
} This is 7.1.2 with DOT_LOCK and HOMEMAIL defined.
} 
}   cs.utexas.edu% mush -n! fletcher
}   To: fletcher
}   
}   yo
}   ~q
}   /v/ai/v0/fletcher/dead.letter already locked, waiting................
} 
} which will wait forever.  Same thing happens with ~E and with
} generating two SIGINTs.  If I "set dead=~/somethingelse", then
} the problem goes away.

What this should mean is that you have a file called
    /v/ai/v0/fletcher/dead.letter.lock
which was for some reason left behind after a previous ~q or interrupt.
There may be some chance of a race condition on two rapid repetitions
of the interrupt character, but that would seem to depend on odd behavior
from the open() system call.

In any case, there is only one call to lock_fopen() in the dead-letter
sequence, and only one check for the file name, so if changing the file
name via $dead solved the problem, it isn't a bug in the dead-letter
part of the code.  You just need to remove that stray lock file.
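
For illustration, a minimal sketch of why a leftover lock file produces
an endless "already locked, waiting" loop.  This assumes dot locking
works by exclusively creating "<file>.lock"; try_dot_lock() is a
hypothetical helper written for this example and is not mush's
lock_fopen(), which may differ in detail.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the mush source): try to take a dot lock
 * for `path` by exclusively creating "<path>.lock".
 * Returns 0 if the lock was taken, 1 if the lock file already exists,
 * and -1 on any other error. */
int try_dot_lock(const char *path)
{
    char lockname[1024];
    int n, fd;

    n = snprintf(lockname, sizeof lockname, "%s.lock", path);
    if (n < 0 || (size_t)n >= sizeof lockname)
        return -1;                      /* name does not fit */

    /* O_EXCL makes the create fail with EEXIST if the file is there. */
    fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno == EEXIST ? 1 : -1;

    close(fd);
    return 0;
}
```

A caller that sleeps and retries whenever this returns 1 will print
dots forever if nothing ever removes the lock file, which matches the
symptom reported above.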

It has in the past been suggested that the waiting-to-lock cycle should
eventually time out, but no one has yet convinced us of an appropriate
interval.  NFS-mounted mail directories could lead to delays of up to 15
minutes in writing back a file, depending on system timeouts.  When does
mush decide that enough is enough and ignore the lock?  (Yes, if the date
on the lock file is several hours old, it would be safe to assume that it
could be ignored.  I'll put *something* on the TODO list.  But does that
help for non-DOT_LOCK locks ... hmmm ....)

-- 
Bart Schaefer						schaefer@cse.ogi.edu

fletcher@cs.utexas.edu (Fletcher Mattox) (07/31/90)

Barton E. Schaefer writes:

>What this should mean is that you have a file called
>    /v/ai/v0/fletcher/dead.letter.lock
>which was for some reason left behind after a previous ~q or interrupt.

Yep.  On closer examination, I see that the lock file from
the previous ~q isn't being removed because close_lock()
tries to unlink(~/dead.letter.lock) without first expanding
the ~.  The return value of unlink() is ignored so the
user never realises it failed.

schaefer@ogicse.ogc.edu (Barton E. Schaefer) (07/31/90)

In article <10703@cs.utexas.edu> fletcher@cs.utexas.edu (Fletcher Mattox) writes:
} Barton E. Schaefer writes:
} 
} >    /v/ai/v0/fletcher/dead.letter.lock
} >was for some reason left behind after a previous ~q or interrupt.
} 
} Yep.  On closer examination, I see that the lock file from
} the previous ~q isn't being removed because close_lock()
} tries to unlink(~/dead.letter.lock) without first expanding
} the ~.

Ugh.  Band-aid applied and will appear in patch #3.  I wish there was a
better way to handle it than having close_lock() do the tilde-expansion,
such as having open_file() return the name it used somehow, but that
would require changing too many other things.
-- 
Bart Schaefer						schaefer@cse.ogi.edu