blackman@hodgkin.med.upenn.edu (David Blackman) (01/14/89)
I am looking for advice on how to track down a memory problem on a Sun
4/260. The following is recorded in our /usr/adm/messages file about
every two minutes:

axon vmunix: mem0: soft ecc addr f8248 syn 4f<S16,S2,S1,S0,SX> 47 U1647

Thanks,
David Blackman@hodgkin.med.upenn.edu
eap@bu-it.bu.edu (Eric A. Pearce) (01/25/89)
blackman@hodgkin.med.upenn.edu (David Blackman) says:
>I am looking for advice on how to track down a memory problem on a Sun
>4/260. The following is recorded in our /usr/adm/messages file about
>every two minutes:
>
>axon vmunix: mem0: soft ecc addr f8248 syn 4f<S16,S2,S1,S0,SX> 47 U1647

The "mem0" is the number of the board (they start at 0), so this is your
first memory board, which should be in slot 6. The "U1647" refers to the
chip position on the board. If you look at the back of the board, you
will see the number under each chip. The chips are soldered in, so you
don't have much choice but to replace the entire card.

You should be able to find the bad board by booting in diag mode (change
the little toggle switch on the CPU board to "diag"). A "correctable
error" will light the "CE" LED on the bad board and an "uncorrectable
error" will light up "UE". It would be a good idea to boot in diag again
after replacement to make sure you've fixed it. If your system crashes
due to a memory error while in normal use, go look at the back of it to
check the LEDs.

If you have 8 meg boards, you can pull out the bad one and move the
jumpers on the other memory cards if needed. If you have the newer
32 meg card, I guess you are stuck until you can get it replaced.

If you are doing the work yourself, check to make sure you have the
resistor terminator block only on the first memory board. It is located
near the center backplane connector.

If you are getting errors often (more than one a week), call Sun and
have the board replaced. This may take some effort, as Sun field service
told me not to worry about it until I was getting several hundred per
day. It has been my experience that the "soft" errors usually lead to
"hard" ones that crash your system.

-e

Eric Pearce                                ARPANET eap@bu-it.bu.edu
Boston University Information Technology   CSNET   eap%bu-it@bu-cs
111 Cummington Street                      JNET    jnet%"ep@buenga"
Boston MA 02215                            UUCP    !harvard!bu-cs!bu-it!eap
617-353-2780 voice  617-353-6260 fax       BITNET  ep@buenga
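
For anyone who wants to keep an eye on how often these messages show up, the fields described above (board number from "memN", chip position from "Unnnn") can be pulled out of /usr/adm/messages with a short script. The following is only a rough sketch in Python. The pattern is inferred from the single sample line in this thread, so other SunOS releases or memory boards may print the message differently. The names ECC_RE, parse_soft_ecc and count_by_chip are made up for illustration, and the "47" field is skipped because nothing in this thread says what it means.

```python
import re
from collections import Counter

# Pattern inferred from the one sample line quoted in this thread.
# This is an assumption, not a documented message format.
ECC_RE = re.compile(
    r"mem(?P<board>\d+): soft ecc addr (?P<addr>[0-9a-fA-F]+) "
    r"syn (?P<syn>[0-9a-fA-F]+)<(?P<bits>[^>]*)>.*?(?P<chip>U\d+)"
)


def parse_soft_ecc(line):
    """Return the fields of one soft ECC message, or None if no match."""
    m = ECC_RE.search(line)
    if m is None:
        return None
    return {
        "board": int(m.group("board")),        # 0 is the first memory board
        "addr": int(m.group("addr"), 16),      # address as printed, read as hex
        "syndrome": int(m.group("syn"), 16),   # syndrome value, read as hex
        "syndrome_bits": m.group("bits").split(","),
        "chip": m.group("chip"),               # chip position on the board
    }


def count_by_chip(lines):
    """Count soft ECC messages per (board, chip) pair."""
    counts = Counter()
    for line in lines:
        rec = parse_soft_ecc(line)
        if rec is not None:
            counts[(rec["board"], rec["chip"])] += 1
    return counts


sample = "axon vmunix: mem0: soft ecc addr f8248 syn 4f<S16,S2,S1,S0,SX> 47 U1647"
print(parse_soft_ecc(sample))
```

Feeding the lines of the messages file to count_by_chip() gives a tally per board and chip, which is one way to tell whether the errors always land on the same chip and how the count compares with the "more than one a week" rule of thumb above.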