shoshana@pdi.UUCP (Shoshana Abrass) (06/26/91)
We've started having a weird problem with our 4 Mb SIMMS. Background:
We're using 100% Toshiba chips. Some of our simms were built by
Kingston, some of them were built by Toshiba - but it doesn't make
a difference as far as our problem goes. We've been using the 4 Mb
SIMMS in our 4D 25's since at least september and currently have them
in 18 machines.

At first the simms worked fine. But then a few times I noticed that
when I powered down a machine, I would get an error on startup. This
error occurs during power-on diagnostics and flashes on the screen
*very* briefly (but can be stopped with <ctrl>-S). After the error,
the machine won't boot and an hinv at the PROM shows 0 memory. The
entire message is shown below.

The weird part: The problem doesn't mean that the hardware is bad. At
first we panicked and called the hotline; they replaced the e-mod and
things did work better. But then we realized that by just wiggling the
simms, we could make the problem go away. I've now done this more than
a dozen times on various machines. The problem always occurs at boot
time - the machine runs without problems once it's up. Wiggling or, at
worst, removing and re-installing the simms always fixes the problem.
I guess that the physical fit between some 4 Mb simms and some iris
memory slots is a bit loose or otherwise imperfect.

I'm posting to the net because (1) others may have this problem some
day (2) I'm wondering if anyone has had this problem and found a
permanent fix? (3) I thought SGI gurus might have some insight into
the error message.

-------------
EXCEPTION: <vector = NORMAL>
Exception pc: 0xbfc00e0c
Cause register: 0x1c <CE=0, EXC=DBE>
Status register: 0x88004 <CM, IM8, IPL=7, IEp>
Error Addr register: 0x400000
Local I/O interrupt register: 0xbf <GE, VME ACFAIL, VERTICAL>
Parity error register: 0xe4 <CPU>
Registers (in hex):

[many registers listed]
------------

-shoshana
shoshana@pdi.com
pdi!shoshana@sgi.com
--
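For readers who want to see where the "<CE=0, EXC=DBE>" tag in the dump comes from, here is a minimal sketch of decoding the Cause register value. It assumes the standard MIPS R3000 Cause register layout (exception code starting at bit 2, coprocessor-error field at bits 29..28); the function name `decode_cause` is made up for illustration and is not part of any SGI tool.

```python
# Exception code mnemonics as defined for the MIPS R3000 Cause register.
EXC_CODES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU", 12: "Ov",
}

def decode_cause(cause):
    """Split a Cause register value into its CE and ExcCode fields.

    Hypothetical helper: assumes the R3000 layout, where ExcCode is a
    4-bit field starting at bit 2 and CE occupies bits 29..28.
    """
    exc = (cause >> 2) & 0xF
    ce = (cause >> 28) & 0x3
    return {"CE": ce, "ExcCode": exc, "name": EXC_CODES.get(exc, "reserved")}

if __name__ == "__main__":
    # Value taken from the dump in the post.
    print(decode_cause(0x1C))
```

With the value from the post, 0x1c, the exception code comes out as 7, which is DBE: a bus error on a data load or store. That fits a boot-time read from memory that is not responding.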
dwong@yosemite.esd.sgi.com (David Wong) (06/27/91)
In article <9106252336.AA00574@koko.pdi.com>, shoshana@pdi.UUCP (Shoshana Abrass) writes:
>
> We've started having a weird problem with our 4 Mb SIMMS. Background:
> We're using 100% Toshiba chips. Some of our simms were built by
> Kingston, some of them were built by Toshiba - but it doesn't make
> a difference as far as our problem goes. We've been using the 4 Mb
> SIMMS in our 4D 25's since at least september and currently have them
> in 18 machines.
>
> At first the simms worked fine. But then a few times I noticed that
> when I powered down a machine, I would get an error on startup. This
> error occurs during power-on diagnostics and flashes on the screen
> *very* briefly (but can be stopped with <ctrl>-S). After the error,
> the machine won't boot and an hinv at the PROM shows 0 memory. The
> entire message is shown below.
>
> The weird part: The problem doesn't mean that the hardware is bad. At
> first we panicked and called the hotline; they replaced the e-mod and
> things did work better. But then we realized that by just wiggling the
> simms, we could make the problem go away. I've now done this more than
> a dozen times on various machines. The problem always occurs at boot
> time - the machine runs without problems once it's up. Wiggling or, at
> worst, removing and re-installing the simms always fixes the problem.
> I guess that the physical fit between some 4 Mb simms and some iris
> memory slots is a bit loose or otherwise imperfect.
>
> I'm posting to the net because (1) others may have this problem some
> day (2) I'm wondering if anyone has had this problem and found a
> permanent fix? (3) I thought SGI gurus might have some insight into
> the error message.
>
> -------------
> EXCEPTION: <vector = NORMAL>
> Exception pc: 0xbfc00e0c
> Cause register: 0x1c <CE=0, EXC=DBE>
> Status register: 0x88004 <CM, IM8, IPL=7, IEp>
> Error Addr register: 0x400000
> Local I/O interrupt register: 0xbf <GE, VME ACFAIL, VERTICAL>
> Parity error register: 0xe4 <CPU>
> Registers (in hex):
>
> [many registers listed]
> ------------
>
> -shoshana
> shoshana@pdi.com
> pdi!shoshana@sgi.com
>
> --

Next time you get the same problem, go to the PROM monitor and issue
the following command:

    fill -v 0 0xa0400000

and then reset the machine by hitting the reset button.

This is a bug in the PROM: it tries to read some memory location
before the memory is initialized.

David
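A side note on the addresses in this thread: the address given to the fill command and the "Error Addr register" value in the dump appear to name the same physical location. The sketch below shows the arithmetic, assuming the machine follows the standard MIPS R3000 memory map, in which kseg0 and kseg1 are unmapped windows onto physical memory and the physical address is the low 29 bits of the virtual one. The helper names are hypothetical and exist only for this illustration.

```python
# Segment boundaries of the standard 32-bit MIPS kernel address map
# (assumed here to apply to the machine discussed in the thread).
KSEG0_BASE = 0x80000000  # unmapped, cached
KSEG1_BASE = 0xA0000000  # unmapped, uncached
KSEG2_BASE = 0xC0000000  # mapped through the TLB

def segment(vaddr):
    """Name the address-space segment a 32-bit virtual address falls in."""
    if vaddr < KSEG0_BASE:
        return "kuseg"
    if vaddr < KSEG1_BASE:
        return "kseg0"
    if vaddr < KSEG2_BASE:
        return "kseg1"
    return "kseg2"

def unmapped_to_phys(vaddr):
    """Translate a kseg0/kseg1 address to its physical address."""
    if segment(vaddr) not in ("kseg0", "kseg1"):
        raise ValueError("address is translated by the TLB, not a fixed mask")
    return vaddr & 0x1FFFFFFF  # drop the top three bits

if __name__ == "__main__":
    for label, addr in (("fill address", 0xA0400000),
                        ("exception pc", 0xBFC00E0C)):
        print(label, segment(addr), hex(unmapped_to_phys(addr)))
```

Under that assumption, 0xa0400000 is an uncached kseg1 address for physical 0x400000, the same value the dump reports as the failing address, and the exception pc 0xbfc00e0c lies in the region starting at the MIPS reset vector 0xbfc00000, which is consistent with the fault being raised by PROM code.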