[comp.mail.sendmail] IDA Sendmail on Sun 4 panics ... any ideas?

rws@cs.brown.edu (Richard W. Sabourin) (09/16/90)

I have recently installed Sendmail 5.64+ w/ IDA on our central mail machine,
a Sun 4/360 running 4.1. (I don't have the IDA revision # on hand, but it
was just last month ftp'ed from uunet.) It compiled fine, and uses MX
records fine. The .cf file is rather a mishmash, but I debugged it and
it does what we want.

Anyway, this machine has recently begun panicking a couple times a week,
always at night (figures), with the following:

--------------------------------------------------
panic: Data fault
BAD TRAP
pid 2680, `sendmail': Data fault
kernel read fault at addr=0xf8238ff8, pme=0x0
Bus Error Reg 80<INVALID>
pc=0xf80a4074, sp=0xf8178558, psr=0x118018c3, context=0x0
g1-g7: 11001ae6, e30002e7, ffffffff, 6, f824b000, f8121000, f8121000
Begin traceback...

[etc.]
--------------------------------------------------

Sendmail is always listed as the culprit, and always a Bad Trap.
After these panics, mail delivery enters a weird failure mode, in which
mail to all local users bounces with "User unknown". Re-freezing the .fc
file and restarting fixes this.

Anybody else seen either failure? Coincidence? Conspiracy? Don't worry, I'm
not one of those "Pls fix bug asap urg help thx" posters; just drawing a
blank.

Thanks in advance,
	Rick Sabourin
	Systems Slave
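
For anyone comparing several of these panics, here is a minimal sketch
(Python, not something from the original poster) that pulls the process
name, fault address and pc out of a message laid out like the one quoted
above. The function name parse_panic and the "write" alternative in the
fault pattern are assumptions for illustration; only the "read" form
appears in this thread.

```python
import re

# The panic message from the post above, up to the register dump.
PANIC_TEXT = """\
panic: Data fault
BAD TRAP
pid 2680, `sendmail': Data fault
kernel read fault at addr=0xf8238ff8, pme=0x0
Bus Error Reg 80<INVALID>
pc=0xf80a4074, sp=0xf8178558, psr=0x118018c3, context=0x0
"""


def parse_panic(text):
    """Extract process and fault details from one panic message.

    Hypothetical helper: it only knows the line layout shown in this
    thread and silently skips any field it cannot find.
    """
    info = {}

    # e.g.  pid 2680, `sendmail': Data fault
    m = re.search(r"pid (\d+), `([^']+)': (.+)", text)
    if m:
        info["pid"] = int(m.group(1))
        info["process"] = m.group(2)
        info["trap"] = m.group(3).strip()

    # e.g.  kernel read fault at addr=0xf8238ff8, pme=0x0
    # ("write" is an assumed variant, not seen in the thread)
    m = re.search(r"kernel (read|write) fault at addr=(0x[0-9a-fA-F]+)", text)
    if m:
        info["access"] = m.group(1)
        info["addr"] = int(m.group(2), 16)

    # e.g.  pc=0xf80a4074, sp=...
    m = re.search(r"\bpc=(0x[0-9a-fA-F]+)", text)
    if m:
        info["pc"] = int(m.group(1), 16)

    return info


if __name__ == "__main__":
    for key, value in parse_panic(PANIC_TEXT).items():
        # Print addresses in hex so they match the console output.
        if key in ("addr", "pc"):
            value = hex(value)
        print(key, "=", value)
```

If the pc comes out the same across panics, that would suggest the same
kernel code path is faulting each time, whichever process happens to be
running when it does.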

mills@ccu.umanitoba.ca (Gary Mills) (09/16/90)

rws@cs.brown.edu (Richard W. Sabourin) writes:


>I have recently installed Sendmail 5.64+ w/ IDA on our central mail machine,
>a Sun 4/360 running 4.1.
[...]
>Anyway, this machine has recently begun panicking a couple times a week,
[...]
>panic: Data fault
[...]
>Sendmail is always listed as the culprit, and always a Bad Trap.

I believe it's a kernel bug.  We had a very similar problem, but with a
user directory server written in perl.  I don't know what provokes it.
We stopped running the server.  Now, two or three times a week, the
machine goes to sleep, and has to be re-booted.  A ps listing from the
dump always shows a particular process getting virtually all the cpu.
-- 
-Gary Mills-             -University of Manitoba-             -Winnipeg-

mills@ccu.umanitoba.ca (Gary Mills) (09/16/90)

This _is_ a kernel bug.  The bug ID is 1029939, and Sun has a patch
for it.  
-- 
-Gary Mills-             -University of Manitoba-             -Winnipeg-

rws@cs.brown.edu (Richard W. Sabourin) (09/19/90)

In article <49991@brunix.UUCP> I wrote:
|>
|>I have recently installed Sendmail 5.64+ w/ IDA on our central mail machine,
|>a Sun 4/360 running 4.1. (I don't have the IDA revision # on hand, but it
|>was just last month ftp'ed from uunet.) It compiled fine, and uses MX
|>records fine. The .cf file is rather a mishmash, but I debugged it and
|>it does what we want.
|>
|>Anyway, this machine has recently begun panicking a couple times a week,
|>always at night (figures), with the following:
|>
|>--------------------------------------------------
|>panic: Data fault
|>BAD TRAP
|>pid 2680, `sendmail': Data fault
|>kernel read fault at addr=0xf8238ff8, pme=0x0
|>Bus Error Reg 80<INVALID>
|>[etc]
|>--------------------------------------------------
|>[...]

Here is a summary of the more useful email responses I've received.
(IDA revision is 1.3.4.)

From "Mark D. Baushke" <mdb@ESD.3Com.COM>:

>Did you remember to build with the -Bstatic switch turned on? I have
>only had trouble with 5.64+IDA when using dynamic libraries. Even then
>it never caused a panic, it just did not deliver the mail correctly.

... I didn't do that. I'll probably do that next.

From rickert@cs.niu.edu:

> Never overlook the possibility of a hardware failure.  If it happens at
>night, 'sendmail' may be the main program running, so the fact that it
>always takes the hit might not mean very much.

From Paul Pomes - UofIllinois CSO <paul@uxc.cso.uiuc.edu>:

>Does the error occur when the sendmail.fc file is removed?  Piet Beertema
>has found a bug with setdefuser() munging the malloc pointers when the
>frozen config file is used.

... I have removed the .fc file, and the mail problems have not recurred in
3 days. So I hope for the best.

Lastly, mills@ccu.umanitoba.ca (Gary Mills) posted:

>This _is_ a kernel bug.  The bug ID is 1029939, and Sun has a patch for it.


Thanks to all who replied. Now I turn my head to Sun's /bin/mail
which has also been acting odd, but that's for another group...

	Rick Sabourin
	<rws@cs.brown.edu>