ghelmer@dsuvax.uucp (Guy Helmer) (08/16/90)
I've recently discovered that my main MINIX machine is suffering memory lossage. I suspected it for some time, but didn't have proof since I saw no panics while MINIX was running and never received the dreaded PARITY ERROR while running MS-DOS junk. Finally, MINIX became extremely unstable, so I dusted off the diagnostic software for this machine and located the faulty bank of RAM.

My beef is: I thought MINIX would have panicked as fast as possible on receipt of NMI, which at least on XT-class machines means parity trouble. There is no way an operating system should continue if this condition occurs.

Assuming MINIX doesn't reset the D-type flip-flop at I/O address 0xa0 to stop NMIs from reaching the processor, I've been researching the kernel sources to find out what happens when an NMI occurs. So far, I think I see code in mpx.x accepting the NMI and calling exception() in exception.c; exception() is supposed to either call cause_sig() in system.c or panic. I lose track of SIGBUS at that point, and it doesn't seem to matter anyway since I haven't found any code to deal with a SIGBUS (and signal.h indicates that SIGBUS is "obsolete", which I have trouble believing right now).

Can the kernel gurus please help? I'll keep searching and try to find a place to stick a handler, but I think this kind of problem should panic ASAP to lessen the chance of multiple NMIs and peripheral corruption.

Thanks in advance.
--
Guy Helmer
work: DSU Computing Services, Business & Education Institute (605) 256-5315
play: MidIX System Support Services (605) 256-2788
dsuvax!ghelmer@cs.utexas.edu, ...!bigtex!loft386!dsuvax!ghelmer
brucee@runxtsa.runx.oz.au (Bruce Evans) (08/18/90)
In article <1990Aug16.021356.24004@dsuvax.uucp> ghelmer@dsuvax.uucp (Guy Helmer) writes:
>My beef is: I thought MINIX would have panicked as fast as possible on
>receipt of NMI, which at least on XT-class machines means parity trouble.
>There is no way an operating system should continue if this condition
>occurs.

An NMI in kernel+mm+fs causes a panic. An NMI in a user program just causes a core dump. Unless it is INIT or a shell that gets killed, the shell normally prints a message and you have the system alive to help find the error.
--
Bruce Evans
Internet: brucee@runxtsa.runx.oz.au
UUCP: uunet!runxtsa.runx.oz.au!brucee
(My other address (evans@ditsyda.oz.au) no longer works)
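
The policy described in the reply above (panic if the NMI hits kernel, mm, or fs; otherwise kill only the user program) can be sketched in a few lines of C. This is an illustration only, not the code in exception.c: the enum names and nmi_policy() are hypothetical and do not appear in the MINIX sources.

```c
#include <assert.h>

/* Hypothetical classification of whatever was running when the NMI hit.
 * These names are illustrative, not MINIX identifiers. */
enum proc_class { PROC_KERNEL_TASK, PROC_MM, PROC_FS, PROC_USER };

/* Hypothetical outcomes: stop the whole system, or signal one process. */
enum nmi_action { NMI_PANIC, NMI_SIGNAL_USER };

/* Decide what to do when an NMI interrupts a process of the given class.
 * Only an ordinary user process is signalled (and typically dumps core);
 * an NMI in the kernel, mm, or fs is treated as fatal for the system. */
enum nmi_action nmi_policy(enum proc_class interrupted)
{
    assert(interrupted >= PROC_KERNEL_TASK && interrupted <= PROC_USER);
    return interrupted == PROC_USER ? NMI_SIGNAL_USER : NMI_PANIC;
}
```

Under this sketch, nmi_policy(PROC_USER) yields NMI_SIGNAL_USER and every other class yields NMI_PANIC, which is why a parity error that happens to land in a user program leaves the system running.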