poage@sunny.ucdavis.edu (Tom Poage) (11/21/90)
I've been plagued lately by bus errors that cause kernel panics. Of course, the tracebacks and messages are cryptic, so I'm wondering how I can find the cause.

Here's what I know: Sun 3/150, SunOS 4.0.3 (more H/W info if you want). AC power appears to be clean after monitoring. Power supply seems good. Reseating boards and jumpers doesn't help (should I start reseating chips?). The panic seems to be independent of the process that was running; the kernel stack traces are all different. Bus Error Reg 20<TIMEOUT> means that a device doesn't exist or didn't respond before timeout.

Neither sundiag nor the SunDiagnostic Executive seems to turn up anything. However, the SunDiagnostic Executive occasionally generates an unclaimed exception, apparently due to my (un)plugging serial loopback connectors.

Here's a representative panic message:

trap address 0x8, pid 0, pc = f06a254, sr = 2404, stkfmt b, context 0
Bus Error Reg 20<TIMEOUT>
data fault address f112500 faultc 0 faultb 0 dfault 1 rw 1 size 1 fcode 5
KERNEL MODE
page map ff000800 pmgrp ef
D0-D7 20 0 ffffffff 2f000000 1a0000 e09c 1 0
A0-A7 f0e10c2 f0c1bcc 200e0000 f112400 f112000 f0bc648 f08aea4 f08ae78

Is it time to call Sun?

Tom Poage, Clinical Engineering
University of California, Davis, Medical Center, Sacramento, CA
poage@sunny.ucdavis.edu {...,ucbvax,uunet}!ucdavis!sunny!poage
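
Since the stack traces all differ, one way to look for a common thread is to pull the same fields out of every saved panic message and compare them. What follows is only a rough sketch in Python, assuming each panic uses the same layout as the sample quoted above. The names parse_panic and nearby_registers, the dictionary keys, and the 0x1000 window are made up for illustration and are not part of SunOS or any Sun tool.

```python
import re

# Panic text copied from the post; the layout is assumed from this one sample.
PANIC = """trap address 0x8, pid 0, pc = f06a254, sr = 2404, stkfmt b, context 0
Bus Error Reg 20<TIMEOUT>
data fault address f112500 faultc 0 faultb 0 dfault 1 rw 1 size 1 fcode 5
KERNEL MODE
page map ff000800 pmgrp ef
D0-D7 20 0 ffffffff 2f000000 1a0000 e09c 1 0
A0-A7 f0e10c2 f0c1bcc 200e0000 f112400 f112000 f0bc648 f08aea4 f08ae78
"""


def parse_panic(text):
    """Extract the hex fields from one panic message (hypothetical helper)."""
    info = {}

    m = re.search(r"pc = ([0-9a-f]+)", text)
    if m:
        info["pc"] = int(m.group(1), 16)

    # Register value plus the decoded flag names printed between < and >.
    m = re.search(r"Bus Error Reg ([0-9a-f]+)<([^>]*)>", text)
    if m:
        info["buserr"] = int(m.group(1), 16)
        info["buserr_flags"] = m.group(2).split(",")

    m = re.search(r"data fault address ([0-9a-f]+)", text)
    if m:
        info["fault_addr"] = int(m.group(1), 16)

    # Each register line is a label followed by eight hex values.
    for name in ("D0-D7", "A0-A7"):
        m = re.search(name + r"((?: [0-9a-f]+){8})", text)
        if m:
            info[name] = [int(x, 16) for x in m.group(1).split()]

    return info


def nearby_registers(info, window=0x1000):
    """List (register index, offset) for address registers just below the fault address.

    The window size is an arbitrary choice for this sketch.
    """
    hits = []
    for i, reg in enumerate(info.get("A0-A7", [])):
        offset = info["fault_addr"] - reg
        if 0 <= offset < window:
            hits.append((i, offset))
    return hits


if __name__ == "__main__":
    info = parse_panic(PANIC)
    print("pc         = %x" % info["pc"])
    print("fault addr = %x" % info["fault_addr"])
    for i, offset in nearby_registers(info):
        print("A%d + 0x%x" % (i, offset))
```

For the sample message, the fault address f112500 lies 0x100 above A3 and 0x500 above A4. Running the same comparison over several panics would show whether the faulting address or the pc clusters in one region, which is worth having in hand before calling Sun.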