[comp.sys.sgi] Kernel Parity Errors & SGI

ktureski@alias.UUCP (Kevin Tureski) (12/09/87)

We've got a bunch of 2400T's and 3130's in house, and we were plagued by
kernel parity errors this summer. We haven't seen any for a couple of months,
and here's why:

The KPE's are caused by some subtle timing problems between the memory boards
and the processor, so subtle that only W3.5r2 detects them as KPE's. Previous
releases just up and died without any indication as to why. When I was at
Omnibus Toronto, we had one 3130 that did this about twice a week for months,
but I don't have to worry about that machine now :-)

Not all machines/boards exhibit the problems either -- at one point, I'd
swapped in 3 new sets of memory boards, each of which worked in other machines
but wouldn't in the problem machine(s).

The solution: SGI has a set of PALs, 7 for the IP2 and 1 for each memory
board in the system. Replacement takes less than half an hour. I don't
know how fast they are producing them now, but we had a long wait between
the discovery of the likely fix and actually testing it out ... I first
called in the problem May 4 and received the last set of PALs Oct 7.

Kevin Tureski
Alias Research Inc.
110 Richmond St E. 
Toronto Canada M5C 1P1
416 362-9181

UUCP:	{allegra,ihnp4,watmath!utai}!utcsri!alias!ktureski