rivero@kovacs.UUCP (Michael Foster Rivero) (07/09/85)
Anybody out there know what the soft error messages on the system console really mean? So far, nobody can tell us anything more than, "It's some kinda soft memory error message from the error correction system." Big help! We can't figure out where the flaky memory is supposed to be. A typical message on the console looks like:

    Msoft ecc 19af sym (58) corrected

We're starting to get more of them. Whatever bug is in the system is BREEDING!

Thanks in advance,
Mike Rivero
chris@umcp-cs.UUCP (Chris Torek) (07/13/85)
"soft ecc" messages come from all sorts of places in all sorts of Unix kernels. However, the one you mentioned sounds like a 780 or 750 memory controller (mcr) ecc error. The 4BSD kernels print something like this:

    mcr%d: soft ecc addr %x syn %x

You can then use your manufacturer's tables to look up the "address" and "syndrome number"; this will point you to the bad chip. The tables vary for different memory boards and systems. 4.2/4.3 BSD kernels will even (optionally) point you to the chip, IF you have all Trendata boards....

(Maybe someday I'll put in the National Semi board tables. Too bad there's no way to pull a board identifier out of the controller. Also too bad DEC doesn't seem to give out ECC tables; then again, DEC solders the chips in directly (better for stability, worse for repairs).)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs ARPA: chris@maryland
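
For anyone collecting these messages before reaching for the manufacturer's tables, here is a minimal sketch of pulling the controller number, address, and syndrome out of saved console lines and counting repeats. It assumes the lines follow the 4BSD shape quoted above (`mcr%d: soft ecc addr %x syn %x`); the post only says the kernel prints "something like" that, so the regular expression is an assumption, and the function names and sample values are hypothetical. It does not map a syndrome to a chip, since that needs the board-specific tables.

```python
import re
from collections import Counter

# Assumed message shape, taken from the format string quoted in the post:
#   mcr%d: soft ecc addr %x syn %x
# Real kernels may word this differently; adjust the pattern to match yours.
MCR_RE = re.compile(r"mcr(\d+): soft ecc addr ([0-9a-fA-F]+) syn ([0-9a-fA-F]+)")


def parse_soft_ecc(line):
    """Return (controller, address, syndrome) as ints, or None if no match."""
    m = MCR_RE.search(line)
    if m is None:
        return None
    # %d is decimal; %x fields are hexadecimal without a 0x prefix.
    return int(m.group(1)), int(m.group(2), 16), int(m.group(3), 16)


def tally(lines):
    """Count how often each (controller, address, syndrome) triple appears."""
    counts = Counter()
    for line in lines:
        parsed = parse_soft_ecc(line)
        if parsed is not None:
            counts[parsed] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = [
        "mcr0: soft ecc addr 19af syn 58",
        "mcr0: soft ecc addr 19af syn 58",
        "mcr1: soft ecc addr 2c00 syn 13",
    ]
    for (ctlr, addr, syn), n in tally(sample).most_common():
        print(f"mcr{ctlr} addr {addr:x} syn {syn:x}: {n} time(s)")
```

A triple that keeps recurring is the one worth looking up first in the tables for your particular board.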