[comp.sys.sgi] 4 Mb SIMMS

shoshana@pdi.UUCP (Shoshana Abrass) (06/26/91)

  We've started having a weird problem with our 4 Mb SIMMs. Background:
  We're using 100% Toshiba chips. Some of our SIMMs were built by
  Kingston, some of them were built by Toshiba - but it doesn't make
  a difference as far as our problem goes.  We've been using the 4 Mb
  SIMMs in our 4D/25s since at least September and currently have them
  in 18 machines.

  At first the SIMMs worked fine. But then a few times I noticed that
  after I powered down a machine, I would get an error on the next
  startup. This error occurs during power-on diagnostics and flashes on
  the screen *very* briefly (but can be stopped with <ctrl>-S). After the
  error, the machine won't boot and an hinv at the PROM shows 0 memory.
  The entire message is shown below.

  The weird part: The problem doesn't mean that the hardware is bad. At
  first we panicked and called the hotline; they replaced the e-mod and
  things did work better. But then we realized that by just wiggling the
  SIMMs, we could make the problem go away. I've now done this more than
  a dozen times on various machines. The problem always occurs at boot
  time - the machine runs without problems once it's up. Wiggling or, at
  worst, removing and re-installing the SIMMs always fixes the problem.
  I guess that the physical fit between some 4 Mb SIMMs and some IRIS
  memory slots is a bit loose or otherwise imperfect.

  I'm posting to the net because (1) others may have this problem some
  day, (2) I'm wondering if anyone has had this problem and found a
  permanent fix, and (3) I thought SGI gurus might have some insight
  into the error message.

-------------
EXCEPTION: <vector = NORMAL>
Exception pc: 0xbfc00e0c
Cause register: 0x1c <CE=0, EXC=DBE>
Status register: 0x88004 <CM, IM8, IPL=7, IEp>
Error Addr register: 0x400000
Local I/O interrupt register: 0xbf <GE, VME ACFAIL, VERTICAL>
Parity error register: 0xe4 <CPU>
Registers (in hex): 

[many registers listed]
------------

  -shoshana
  shoshana@pdi.com
  pdi!shoshana@sgi.com
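
  For anyone trying to read the dump above: here is a minimal sketch of how
  the Cause register value 0x1c decodes, assuming the standard MIPS
  R2000/R3000 Cause layout (ExcCode in bits 5..2, CE in bits 29..28). The
  function name and the lookup table are illustrative only; nothing here
  comes from the PROM itself.

```python
# Sketch: decode a MIPS R2000/R3000 Cause register value.
# Assumption: standard R3000 layout, ExcCode = bits 5..2, CE = bits 29..28.
EXC_NAMES = {
    0: "Int",   # external interrupt
    1: "Mod",   # TLB modification
    2: "TLBL",  # TLB miss on load or instruction fetch
    3: "TLBS",  # TLB miss on store
    4: "AdEL",  # address error on load or instruction fetch
    5: "AdES",  # address error on store
    6: "IBE",   # bus error on instruction fetch
    7: "DBE",   # bus error on data load or store
    8: "Sys",   # syscall
    9: "Bp",    # breakpoint
    10: "RI",   # reserved instruction
    11: "CpU",  # coprocessor unusable
    12: "Ov",   # arithmetic overflow
}


def decode_cause(cause):
    """Return the CE and ExcCode fields of a Cause register value."""
    exc_code = (cause >> 2) & 0xF   # bits 5..2
    ce = (cause >> 28) & 0x3        # bits 29..28
    return {
        "CE": ce,
        "ExcCode": exc_code,
        "name": EXC_NAMES.get(exc_code, "reserved"),
    }


if __name__ == "__main__":
    print(decode_cause(0x1C))
```

  With the value from the dump, the fields come out as CE=0 and ExcCode=7
  (DBE, a data bus error), which agrees with the <CE=0, EXC=DBE> annotation
  the PROM printed.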

--

dwong@yosemite.esd.sgi.com (David Wong) (06/27/91)

In article <9106252336.AA00574@koko.pdi.com>, shoshana@pdi.UUCP (Shoshana Abrass) writes:

next time you get the same problem, go to the prom monitor and issue the
following command:

	fill -v 0 0xa0400000

and then reset the machine by hitting the reset button.  this is a bug in the
prom: it tries to read some memory location before the memory is initialized.

					David
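
The address in the fill command and the Error Addr register in the dump
appear to refer to the same physical location. A minimal sketch of the
arithmetic, assuming the standard MIPS unmapped segments (kseg0 at
0x80000000, kseg1 at 0xa0000000, each mapping onto physical address 0 and
up); the helper names are made up for illustration:

```python
# Sketch: map a MIPS kseg0/kseg1 virtual address to its physical address.
# Assumption: standard MIPS memory map. kseg0 (cached) and kseg1 (uncached)
# are unmapped windows onto the low 512 MB of physical address space.
KSEG0_BASE = 0x80000000
KSEG1_BASE = 0xA0000000
KSEG2_BASE = 0xC0000000


def segment(vaddr):
    """Name the segment a 32-bit virtual address falls in."""
    if vaddr < KSEG0_BASE:
        return "kuseg"
    if vaddr < KSEG1_BASE:
        return "kseg0"
    if vaddr < KSEG2_BASE:
        return "kseg1"
    return "kseg2"


def unmapped_to_phys(vaddr):
    """Strip the segment bits from a kseg0/kseg1 address."""
    if segment(vaddr) not in ("kseg0", "kseg1"):
        raise ValueError("address is in a TLB-mapped segment")
    return vaddr & 0x1FFFFFFF


if __name__ == "__main__":
    for addr in (0xA0400000, 0xBFC00E0C):
        print(segment(addr), hex(addr), "->", hex(unmapped_to_phys(addr)))
```

Under that assumption, 0xa0400000 is an uncached (kseg1) view of physical
0x400000, the address reported in the Error Addr register, and the exception
pc 0xbfc00e0c sits just above the MIPS reset vector at 0xbfc00000, which is
consistent with the fault being raised from PROM code.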