[comp.sys.sun] How do I trace source of Bus Error?

poage@sunny.ucdavis.edu (Tom Poage) (11/21/90)

I've been plagued lately by bus errors that cause kernel panics.  Of
course, the tracebacks and messages are cryptic, so I'm wondering how I
can find the cause.  Here's what I know:

	Sun 3/150, SunOS 4.0.3 (more H/W info if you want).
	AC power appears to be clean after monitoring.
	Power supply seems good.
	Reseating boards and jumpers doesn't help (should I start
		reseating chips?).
	Panic seems to be independent of the process that was running;
		kernel stack traces are all different.
	Bus Error Reg 20<TIMEOUT> means that a device doesn't exist
		or didn't respond before timeout.
	Running sundiag and the SunDiagnostic Executive doesn't seem
		to turn up anything.  However, the SunDiagnostic
		Executive occasionally generates an unclaimed exception,
		apparently due to my (un)plugging serial loopback
		connectors.

Here's a representative panic message:

trap address 0x8, pid 0, pc = f06a254, sr = 2404, stkfmt b, context 0
Bus Error Reg 20<TIMEOUT>
data fault address f112500 faultc 0 faultb 0 dfault 1 rw 1 size 1 fcode 5
KERNEL MODE
page map ff000800 pmgrp ef
D0-D7  20 0 ffffffff 2f000000 1a0000 e09c 1 0
A0-A7  f0e10c2 f0c1bcc 200e0000 f112400 f112000 f0bc648 f08aea4 f08ae78
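
Since the stack traces differ from panic to panic, one way to compare
them is to pull the numeric fields out of each message and check whether
the fault address recurs or sits near a register value.  A minimal Python
sketch, assuming the message layout shown above; parse_panic,
offsets_from_address_regs and the dictionary keys are hypothetical names
for illustration, not part of SunOS:

```python
import re

# Sample text copied from the panic message above.
PANIC = """\
trap address 0x8, pid 0, pc = f06a254, sr = 2404, stkfmt b, context 0
Bus Error Reg 20<TIMEOUT>
data fault address f112500 faultc 0 faultb 0 dfault 1 rw 1 size 1 fcode 5
KERNEL MODE
page map ff000800 pmgrp ef
D0-D7  20 0 ffffffff 2f000000 1a0000 e09c 1 0
A0-A7  f0e10c2 f0c1bcc 200e0000 f112400 f112000 f0bc648 f08aea4 f08ae78
"""


def parse_panic(text):
    """Extract pc, bus error register, fault address and registers.

    All numbers in the message are treated as hexadecimal (an assumption
    based on the sample above).
    """
    info = {}
    m = re.search(r"pc = ([0-9a-f]+)", text)
    info["pc"] = int(m.group(1), 16)
    m = re.search(r"Bus Error Reg ([0-9a-f]+)<(\w+)>", text)
    info["berr"] = (int(m.group(1), 16), m.group(2))
    m = re.search(r"data fault address ([0-9a-f]+)", text)
    info["fault"] = int(m.group(1), 16)
    for label, key in (("D0-D7", "D"), ("A0-A7", "A")):
        m = re.search(r"^%s\s+(.*)$" % re.escape(label), text, re.M)
        info[key] = [int(word, 16) for word in m.group(1).split()]
    return info


def offsets_from_address_regs(info):
    """Return (register, fault address minus register) pairs, closest first."""
    pairs = [("A%d" % i, info["fault"] - a) for i, a in enumerate(info["A"])]
    return sorted(pairs, key=lambda p: abs(p[1]))


info = parse_panic(PANIC)
closest = offsets_from_address_regs(info)[0]
print("fault %x is %s + 0x%x" % (info["fault"], closest[0], closest[1]))
```

For the message above this reports that the fault address is A3 + 0x100.
Running the same comparison over several saved panics would show whether
the faulting addresses fall in one range even though the traces differ.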

Is it time to call Sun?

Tom Poage, Clinical Engineering
University of California, Davis, Medical Center, Sacramento, CA
poage@sunny.ucdavis.edu  {...,ucbvax,uunet}!ucdavis!sunny!poage