david@ms.uky.edu (David Herron -- One of the vertebrae) (04/08/88)
I'm getting this message sometimes when I boot my machine. Probably the machine is deciding that part of my memory is bad and not letting it be used. The message appears after the kernel has been read into memory from the boot device, while it's printing out the memory size, BETWEEN the two memory sizes (i.e. after it says 3.5 megs of memory but before it says 3.3 megs available). The message doesn't always print -- I haven't verified it, but I think it won't print if I've done a cold boot and does print on warm boots.

What isn't clear is WHERE frames 80..92 live. I was reading through the fancy hardware manual this morning to see if I could figure out where the problem is, and came across the description of the memory mapping scheme. Among other things it tells me that the pages are 2K bytes wide. If a "frame" is the same as a "page" then frame 80 is at 0x50000, which would place it somewhere on my motherboard. Also, 10-15 missing pages would be 20-30K of memory missing -- no big deal, especially if the fix involves taking chips off the motherboard. Am I correct on everything so far? The tech manual doesn't describe this message, nor does it use "frame" in its terminology, so I'm guessing here.

I've run memory diagnostics at various times over the last 4 months trying to catch the error. Finally, this morning, a run caught an error, but it's not really clear exactly where the error is. I was down in the "s4test" mode and the error was at address "0x5Axxx" (I forget the exact address, it's written down at home; the important thing is that it's within the 80-92 range above). BUT the error was caught in the first section of the Random Write test on my expansion memory. Expansion memory is addressed from 0x200000 on up, not in the 0x50000 range. Yet the test showed an error at 0x5Axxx! So what's going on? Again, the manual isn't explicit on the details of every test and what it would be doing.
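As a quick check of the frame arithmetic above, here is a minimal Python sketch. It assumes a "frame" is the same as a page and that frames are numbered in decimal starting at physical address 0; both are guesses, just as in the post. The helper names and the 4K comparison case are hypothetical and do not come from the manual. The exact error address was not recorded, so the sketch covers the whole 0x5A000..0x5AFFF range.

```python
def frame_base(frame, page_size):
    """Physical address of the first byte of a frame (assumes frame == page)."""
    return frame * page_size


def frame_of(addr, page_size):
    """Number of the frame that contains a physical address."""
    return addr // page_size


FIRST_FRAME, LAST_FRAME = 80, 92     # frame range quoted from the boot message
ERR_LO, ERR_HI = 0x5A000, 0x5AFFF    # "0x5Axxx": the low 12 bits are unknown


def summarize(page_size):
    """Address span, size, and error-frame numbers for one candidate page size."""
    count = LAST_FRAME - FIRST_FRAME + 1
    return {
        "start": frame_base(FIRST_FRAME, page_size),
        "end": frame_base(LAST_FRAME + 1, page_size) - 1,
        "bytes": count * page_size,
        "err_frames": (frame_of(ERR_LO, page_size), frame_of(ERR_HI, page_size)),
    }


if __name__ == "__main__":
    # 2048 is the page size read from the manual; 4096 is only a comparison case.
    for size in (2048, 4096):
        s = summarize(size)
        print(f"page size {size}: frames {FIRST_FRAME}..{LAST_FRAME} span "
              f"{s['start']:#x}..{s['end']:#x} ({s['bytes'] // 1024}K), "
              f"0x5Axxx falls in frame(s) {s['err_frames']}")
```

With 2K pages, frame 80 would start at 0x28000, the 13 frames would cover 26K, and 0x5Axxx would land in frame 180 or 181. A start of 0x50000 for frame 80, and an error at 0x5Axxx falling inside frames 80..92, both correspond to 4K pages (52K for the 13 frames). Which page size this hardware really uses would need to be confirmed against the manual.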
I suppose whoever wrote that software messed up a little (or was lazy) and didn't take into account where the test was being made, and merely printed the offset into the area being tested.

The last feature of this is that my machine crashes somewhat frequently with "panic: NMI (kernal parity error)". I have a bunch of numbers written down on various pieces of paper if anybody is interested. I'll be doing some testing tonight to see if the problem is repeatable.

One common feature of these panics is that they happen in conjunction with some activity on the modem. Before upgrading to v3.51a it would panic at the end of a uucico or ATE call. That is, if it would panic at all. Now, in the last two panics at least, it does so at the beginning...

Oh, and my uudemon.hr script does the equivalent of:

	phtoggle
	uucico -r1
	phtoggle

Which means that there WILL be some activity with the modem at least once an hour. The only problem is that if I'm using the ATE, every hour I get two windows popping up saying that there was a problem with opening the modem and would I please hit ENTER :-).

My current theory is that some memory is bad, and usually when the machine boots the kernel detects the bad memory, but sometimes it doesn't. (The first test is to see whether ALL cold boots fail to map out those memory frames.) When it doesn't detect the bad memory, the kernel will eventually assign that memory to some process, which will eventually cause a problem and blooie. The next test is a usealot ... :-)

--
<---- David Herron -- The E-Mail guy <david@ms.uky.edu>
<---- or: {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<----
<---- I don't have a Blue bone in my body!