[comp.unix.sysv386] panic: RAM problem or something else?

todd@pinhead.pegasus.com (Todd Ogasawara) (05/09/91)

One of my UNIX boxes crashed the other day with the following panic message
on the console.

	FATAL: Parity error on motherboard
	PANIC: Parity error at address 0x007D8BCA
	DEBUGGER ; ebp ; 0x0E0000DB4

The system is configured as follows:

	Everex Step 386/25, 8M RAM, 25MHz 80387
	Everex VGA board
	Everex ESDI hard disk controller
	Everex Serial/Parallel board
	Wangtek tape drive & controller
	Digiboard PC/16i
	Cleo Communications 3270 PC/XL U/X
	Interactive UNIX 2.2 (System V/386 R3)

When I rebooted the system, it passed the little POST RAM test for all
8M. UNIX seemed to come up clean after cleaning up the disk with fsck a
bit. The system seems stable so far. I am considering shutting down UNIX
and running some DOS-based RAM checkers. But, I thought I'd check
collective net wisdom before taking any action. I.e., is it really a RAM
problem or am I seeing some kind of interaction? I haven't added any new
hardware or made any new kernel modifications in about two months. The only
thing I did recently (about two weeks ago) was to load software onto the
Cleo 3270 board's onboard RAM.

BTW. The system has been up continuously for about three months without
any other problems noted.
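
As a rough sanity check on the reported address, here is a minimal sketch
that works out which megabyte of RAM 0x007D8BCA falls in. It assumes the
panic address is a physical address and that memory is mapped linearly from
zero with no holes, neither of which the panic message confirms. The
function name and the 8M default are illustrative only.

```python
MIB = 1024 * 1024  # bytes per megabyte (2**20)

def locate(addr, installed_mib=8):
    """Return (zero-based megabyte index, offset in that megabyte, in range?).

    Assumes a flat physical memory map starting at address 0.
    """
    index, offset = divmod(addr, MIB)
    return index, offset, addr < installed_mib * MIB

index, offset, in_range = locate(0x007D8BCA)
print(index, hex(offset), in_range)  # prints: 7 0xd8bca True
```

Under those assumptions the address sits in the last of the 8 megabytes.
Which physical chip or bank that corresponds to depends on the motherboard
layout, so this only narrows things down.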

-- 
Todd Ogasawara ::: Hawaii Medical Service Association
Internet       ::: todd@pinhead.pegasus.com
Telephone      ::: (808) 536-9162 ext. 7

doug@ohenry.UUCP (Richard H. Douglas) (05/10/91)

todd@pinhead.pegasus.com (Todd Ogasawara) writes:
> One of my UNIX boxes crashed the other day with the following panic message
> on the console.
> 
> 	FATAL: Parity error on motherboard
> 	PANIC: Parity error at address 0x007D8BCA
> 	DEBUGGER ; ebp ; 0x0E0000DB4
> 
> The system is configured as follows:
> ... 
> BTW. The system has been up continuously for about three months without
> any other problems noted.
> ...

Somewhere on the net, I read that radiation (alpha particles, I think) can
and will eventually get you if a machine is left on constantly. Make sure
your RAM is of good quality and that it is conservatively rated. It's too
bad that manufacturers do not make boards with ECC on them, or did I miss
something somewhere? An extra chip and a little extra cash would let me
sleep better at night.
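
For anyone wondering why parity can only halt the machine while ECC could
carry on, here is a minimal sketch of a one-bit-per-byte parity check. It
uses even parity purely for illustration (the same reasoning holds for odd
parity), and the function names are made up for this example.

```python
def parity_bit(byte):
    """Even-parity bit: 1 if the byte contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2

def check(byte, stored_parity):
    """True if the byte still matches the parity bit stored alongside it."""
    return parity_bit(byte) == stored_parity

data = 0b10110010
stored = parity_bit(data)        # parity computed when the byte was written

one_flip = data ^ 0b00000100     # a single bit flipped in memory
two_flips = data ^ 0b00000110    # two bits flipped in memory

print(check(data, stored))       # True: byte is intact
print(check(one_flip, stored))   # False: single-bit error detected
print(check(two_flips, stored))  # True: double-bit error goes unnoticed
```

The check says only that something in the byte changed, not which bit, so
there is nothing to correct. ECC stores several check bits per word so that
a single flipped bit can be located and repaired.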

One other item. Are you on a UPS? A good one, with a very fast transfer
time or no delay at all? On a 386/486, my guess is that a transfer time
better than 1 ms is required. In photography terms, 1 ms is 1/1000 of a
second, which is not fast at all.

Yes, this happened to me quite frequently until I got faster RAM.

One other thing: if this happens very frequently, give your machine's
power supply a good look. A supply doesn't have to act up to be marginal.

Hope this helps.

Rich Douglas    doug@ohenry