[net.unix-wizards] More on 4.2 mail

jwp@sdchema.UUCP (John Pierce) (12/05/84)

While we're on this subject, there is another problem that can also be a bit
mystifying.

If you have a heavily loaded system and mail gets heavy use, it's possible for
a user's mailbox to get scrambled by two processes writing to it at once.
Both sendmail and /bin/mail try to prevent this by establishing a lock file
while they're rewriting the mail file.  Both (understandably) have code that
allows a process to break the lock after some number of seconds.  As we got
it, however, /bin/mail's time limit was 30 seconds.

If you've got a user who regularly keeps 20-30K+ bytes in his mailbox and gets
50-100 messages in an eight-hour day, on a system that regularly sees the
load average go over 10...  Well, you'll find out that whoever set that limit
at 30 seconds was a real optimist.  Setting it to 120 seconds (the same, I
believe, as sendmail's time limit) seems to have fixed the problem.
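
For anyone who hasn't looked at this kind of code, here is a minimal sketch of
a lock file with a break-after-timeout rule.  It is not the 4.2 /bin/mail or
sendmail source; the names acquire_lockfile, release_lockfile, and max_age are
hypothetical, and treating the lock file's modification time as its age is an
assumption made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: create `lockpath` exclusively.  If the lock already
 * exists and is at least `max_age` seconds old, assume its owner is gone,
 * remove it, and try again.  Returns 0 once we hold the lock, -1 on error. */
int acquire_lockfile(const char *lockpath, int max_age)
{
    for (;;) {
        /* O_EXCL makes the create fail with EEXIST if the lock exists. */
        int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd >= 0) {
            close(fd);
            return 0;                   /* we now own the lock */
        }
        if (errno != EEXIST)
            return -1;                  /* unexpected failure */

        struct stat st;
        if (stat(lockpath, &st) == 0 &&
            time(NULL) - st.st_mtime >= max_age) {
            unlink(lockpath);           /* break the (presumed) stale lock */
            continue;                   /* and retry the create at once */
        }
        sleep(1);                       /* lock is fresh: wait and retry */
    }
}

/* Hypothetical helper: give the lock up by removing the file. */
int release_lockfile(const char *lockpath)
{
    return unlink(lockpath);
}
```

With max_age at 30, a writer that needs longer than 30 seconds to rewrite a
big mailbox on a loaded machine has its lock removed out from under it, and
the second process then writes to the same file.  Calling it with 120 only
makes that less likely; it does not make it impossible.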

There may actually be some other problem here (I'd believe anything about the
mail system), beyond the obvious one that we need another machine, but the
problem really can occur (just like the lost mail problem occurs on some
machines), and resetting the time limit fixed it.  If anyone has a better fix
I would much appreciate hearing about it.

				John Pierce, Chemistry, UC San Diego
				{decvax,sdcsvax}!sdchema!jwp

ted@usceast.UUCP (Ted Nolan) (12/07/84)

In article <308@sdchema.UUCP> jwp@sdchema.UUCP (John Pierce) writes:
>While we're on this subject, there is another problem that can also be a bit
>mystifying.
>
>If you have a heavily loaded system and mail gets heavy use, it's possible for
>a user's mailbox to get scrambled by two processes writing to it at once.
>Both sendmail and /bin/mail try to prevent this by establishing a lock file
>while they're rewriting the mail file.  Both (understandably) have code that
>allows a process to break the lock after some number of seconds.  As we got
>it, however, /bin/mail's time limit was 30 seconds.
>
>.................................................. If anyone has a better fix
>I would much appreciate hearing about it.
>
>				John Pierce, Chemistry, UC San Diego
>				{decvax,sdcsvax}!sdchema!jwp

How about having binmail and sendmail use 4.2's flock(2) system call?
Sounds like exactly what's needed.
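
Roughly what I have in mind, as a sketch rather than a patch to either
program: append_locked is a hypothetical helper name, and the open flags and
file mode are assumptions.  Only flock(2) with LOCK_EX and LOCK_UN is the part
that matters.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: append `len` bytes of `msg` to `mailbox` while
 * holding an exclusive flock(2) on it.  Returns 0 on success, -1 on error. */
int append_locked(const char *mailbox, const char *msg, size_t len)
{
    int fd = open(mailbox, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Blocks until no other cooperating process holds the lock. */
    if (flock(fd, LOCK_EX) < 0) {
        close(fd);
        return -1;
    }

    int rc = 0;
    const char *p = msg;
    size_t left = len;
    while (left > 0) {                  /* handle short writes */
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            rc = -1;
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    flock(fd, LOCK_UN);                 /* explicit; close() would also do it */
    close(fd);
    return rc;
}
```

Note that flock locks are advisory, so this only helps if every program that
writes the mailbox takes the lock the same way.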

				Ted Nolan	..usceast!ted
-- 
-------------------------------------------------------------------------------
Ted Nolan                               ...decvax!mcnc!ncsu!ncrcae!usceast!ted
6536 Brookside Circle                   ...akgua!usceast!ted
Columbia, SC 29206
      ("Deep space is my dwelling place, the stars my destination")
-------------------------------------------------------------------------------

dae@psuvax1.UUCP (David Eckhardt) (12/10/84)

> If you have a heavily loaded system and mail gets heavy use, it's possible for
> a user's mailbox to get scrambled by two processes writing to it at once.
> Both sendmail and /bin/mail try to prevent this by establishing a lock file
> while they're rewriting the mail file.  Both (understandably) have code that
> allows a process to break the lock after some number of seconds.  As we got
> it, however, /bin/mail's time limit was 30 seconds.
> 
> If anyone has a better fix I would much appreciate hearing about it.
> 
> 				John Pierce, Chemistry, UC San Diego
> 				{decvax,sdcsvax}!sdchema!jwp

I've been considering using (on 4.2 systems, anyway) the flock(2)
system call--that way you shouldn't have to worry about breaking locks
prematurely.  Can anybody think of any reason not to do this?
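
The reason no lock-breaking should be needed is that the kernel drops an
flock when the holder's descriptor is closed, which includes the holder
exiting.  A small sketch that checks this, assuming a scratch file path;
lock_freed_after_holder_exits is a hypothetical name:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: a child takes an exclusive flock on `path` and exits
 * without unlocking.  Returns 1 if the parent can then take the lock
 * without blocking, 0 if it cannot, -1 on a setup error. */
int lock_freed_after_holder_exits(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: open its own descriptor, lock it, and die holding it. */
        int cfd = open(path, O_RDWR | O_CREAT, 0600);
        if (cfd < 0 || flock(cfd, LOCK_EX) < 0)
            _exit(1);
        _exit(0);                       /* no LOCK_UN on purpose */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: a fresh open, then a non-blocking attempt at the lock. */
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    int got = (flock(fd, LOCK_EX | LOCK_NB) == 0);
    close(fd);                          /* closing also releases our lock */
    return got;
}
```

So a crashed writer can't leave a stale lock behind the way a leftover lock
file can, and a slow but live writer keeps the lock for as long as it needs.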

-- 

Spoken:  Dave Eckhardt        Summoned:  Daemon
Net: dae @ { psuvax1. { bitnet, uucp } , penn-state.csnet}

-> "I will have no covenants but proximities" <- Emerson