tes@whutt.ATT.COM (STERKEL) (11/16/88)
<*>
I see that Brown Bag's Shareware Ramtest 3.0 is being
distributed. In response to an urgent request to the
network last month, I got two copies. (BTW, thanks!)

I threw away one after comp-ing and diff-ing to verify
that they were the same, and ran the test on memory for
1 iteration. No error. This is impossible, as I frequently
get memory parity errors. I then ran the test 100
iterations--no error. Finally, I ran the test for *10,000*
iterations. (My wife thought I was crazy--testing for over
12 hours for an error I can force in less than 2 hours of
normal use). Again, RAMTEST FLUNKED; finding NO-ERROR.

Does this program *really* test memory?

terry
dhesi@bsu-cs.UUCP (Rahul Dhesi) (11/17/88)
In article <4022@whutt.ATT.COM> tes@whutt.ATT.COM (STERKEL) writes:
>I ran the test for *10,000* iterations....
>Again, RAMTEST FLUNKED; finding NO-ERROR.

Dynamic RAMs of marginal quality can have defects that show up
only when a certain combination of bits is stored in adjacent
cells. To test memory thoroughly one would have to store all
possible bit patterns. This could take forever. A second-best
solution is to store enough random patterns that there is a good
(but not 100%) chance of finding defects.

Perhaps in your case the program didn't test with enough random
patterns. It might be repeating the same limited set of patterns
again and again instead of generating new ones.

I personally would be suspicious of a memory testing program that
does not document what bit patterns it uses. Unfortunately, none
of them do.

Another possibility is that you had to disable parity checking to
run the test successfully, and so did not detect an error in the
parity bit. (Or the program somehow disabled parity checking. It
seems to know how to do so, since it lets you do it from the screen
without having to flip a switch on the memory board.)

Aside: There is a rumor that the reason IBM created a BCD
character set with holes was because not all combinations of on
and off states in the old transistor-based logic circuits were
equally error-free.
--
Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
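
The point about a tester that keeps repeating a limited pattern set
can be illustrated with a small simulation. This is only a sketch in
Python, not a description of what Ramtest actually does: the FlakyRam
class, its single pattern-sensitive fault (a "victim" cell that drops
bit 0 when its neighbour holds one particular "trigger" byte), and the
function names are all hypothetical and chosen for illustration.

```python
import random


class FlakyRam:
    """Simulated RAM with one hypothetical pattern-sensitive fault:
    cell `victim` reads back with bit 0 cleared whenever the cell
    just below it (victim - 1) currently holds the byte `trigger`."""

    def __init__(self, size, victim, trigger):
        self.cells = [0] * size
        self.victim = victim
        self.trigger = trigger

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells[addr]
        if addr == self.victim and self.cells[addr - 1] == self.trigger:
            value &= 0xFE  # simulated fault: bit 0 reads back as 0
        return value


def run_pass(ram, patterns):
    """Write one byte per cell, then read every cell back.
    Returns the list of addresses whose contents did not match."""
    for addr, value in enumerate(patterns):
        ram.write(addr, value)
    return [addr for addr, value in enumerate(patterns)
            if ram.read(addr) != value]


def fixed_pattern_test(ram, size, passes):
    """Repeats the same four fill bytes on every pass."""
    failures = set()
    for _ in range(passes):
        for fill in (0x55, 0xAA, 0x00, 0xFF):
            failures.update(run_pass(ram, [fill] * size))
    return failures


def random_pattern_test(ram, size, passes, seed=1988):
    """Generates a fresh pseudo-random pattern for every pass."""
    rng = random.Random(seed)
    failures = set()
    for _ in range(passes):
        patterns = [rng.randrange(256) for _ in range(size)]
        failures.update(run_pass(ram, patterns))
    return failures
```

In this model the fixed test can run for any number of passes without
ever placing the trigger byte next to the victim cell, so more
iterations buy nothing. The random test has roughly a 1-in-512 chance
per pass of hitting the combination, so it becomes very likely (though
never certain) to flag the cell as the pass count grows.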
tneff@dasys1.UUCP (Tom Neff) (11/18/88)
In article <4022@whutt.ATT.COM> tes@whutt.ATT.COM (STERKEL)
wonders whether Brown Bag's Memory Test 3.0 isn't worthless,
since it found his RAMs error free after 10,000 iterations
even though he "frequently" gets parity errors in "2 hours"
of use or less.
I wonder if he is aware that there's a difference between
"Parity Error 1" and "Parity Error 2". Error 1 pertains to
motherboard RAM and virtually always means a bad chip. Error
2 signals bad parity over the XT/AT bus, and can emanate from
a faulty peripheral card (and often does) rather than from
expansion memory itself. When I get "Parity Error 2" I usually
start pulling interface cards until I isolate the problem.
Memory tests like the aforementioned BagWare :-) won't detect
these conditions.
--
Tom Neff UUCP: ...!cmcl2!phri!dasys1!tneff
"None of your toys CIS: 76556,2536 MCI: TNEFF
will function..." GEnie: TOMNEFF BIX: t.neff (no kidding)
bill@bilver.UUCP (Bill Vermillion) (11/19/88)
In article <4022@whutt.ATT.COM> tes@whutt.ATT.COM (STERKEL) writes:
><*>
>I see that Brown Bag's Shareware Ramtest 3.0 is being
>distributed. In response to an urgent request to the
>network last month, I got two copies. (BTW, thanks!)

..... deleted stuff ....

>
>Does this program *really* test memory?

I am looking for a good memory test routine also. In one of the
groups I am in, one member has volunteered to write one. If it gets
done, it will be posted here.

Re: memory testing. I got a call about a memory problem. Program
dumped core ('286 running Xenix). Dump showed text portions jumbled.
A close look showed that only every other character was bad, and it
was off by one bit. L was M, etc.

The board in question had no schematics, but since it was a recent
problem, I checked the last bank of memory, and the 4 corner chips
in that bank, since it appeared that a low bit was stuck. I found
ONE 64k chip where a 256 should be.

System memory test and the board mfr's memory test both passed.

Talking with a local repair tech, he mentioned the memory test
supplied for his machines by the machine mfr would pass memory when
one of the memory chip's legs was bent out from the socket.

It appears no one has a good test program. Or at least the people I
have checked with, and some of them are very knowledgeable, don't
know of any either.

Years ago no one minded if it took 10 minutes to check memory in a
64k machine. So if it takes 24 hours to thoroughly check a multi-meg
machine, why not do it, and do it right?

Can anyone recommend a reliable memory test? (Talking '286 and '386
machines here.)
--
Bill Vermillion - UUCP: {uiucuxc,hoptoad,petsd}!peora!rtmvax!bilver!bill
                : bill@bilver.UUCP
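
The "L was M" observation can be worked through by XOR-ing the
expected and observed bytes: 'L' is 0x4C and 'M' is 0x4D, so they
differ only in bit 0. The Python sketch below is illustrative only.
The sample text is made up, and the helper names stuck_high and
diff_bits are hypothetical. It assumes that "every other character"
corresponds to one byte lane of a 16-bit data path with a bit stuck
high, which is one plausible reading of the dump rather than anything
stated in the post.

```python
def stuck_high(data: bytes, lane: int, bit: int) -> bytes:
    """Simulate a bit stuck at 1 on one byte lane (0 = even offsets,
    1 = odd offsets) of a 16-bit-wide memory."""
    return bytes(b | (1 << bit) if i % 2 == lane else b
                 for i, b in enumerate(data))


def diff_bits(expected: bytes, observed: bytes):
    """Return (offset, xor_mask) for every byte that differs.
    A set bit in xor_mask marks a bit position that changed."""
    return [(i, e ^ o)
            for i, (e, o) in enumerate(zip(expected, observed))
            if e != o]


expected = b"HELLO WORLD"                      # made-up sample text
observed = stuck_high(expected, lane=1, bit=0)
print(observed)                                # b'HELMO!WORMD'
print(diff_bits(expected, observed))           # [(3, 1), (5, 1), (9, 1)]
```

Every mismatch sits at an odd offset and every mask is 0x01, which
points at bit 0 of one byte lane. Bytes that already had bit 0 set
(such as 'E' and 'O') pass through unchanged, which is why a stuck
bit can go unnoticed for patterns that happen to agree with it.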
dold@mitisft.Convergent.COM (Clarence Dold) (11/22/88)
in article <301@bilver.UUCP>, bill@bilver.UUCP (Bill Vermillion) says:
> appeared that a low bit was stuck. I found ONE 64k chip where a 256 should be.
> System memory test and the board mfrs memory test both passed
>
> Talking with a local repair tech he mentioned the memory test supplied for his
> machines by the machine mfr would pass memory when one of the memory chips
> legs was bent out from the socket.
>

A memory test that allows a 64k chip to pass, in lieu of a 256k chip,
is missing the same test that the 'bent leg' diagnostic is missing.
The 64k chip looks like a 256k with one pin missing.

If a chip has a floating address bit, it will respond to both the
proper address, and an address at the inverse of the floating bit.
The important point is that it succeeds in read/write at both
addresses. It doesn't fail either of them. Some other chip probably
is also reading and writing the same data, but it works.

A proper memory test will write a unique pattern to each location in
RAM, and then read those locations. If two locations have the same
data, then somebody isn't addressing properly! Of course if you have
>64k you can't uniquely test each word, so patterns of floating bit
possibilities are tested.

On a separate thought:
A Memory diagnostic that checks refresh will write a pattern to all
of memory, and then wait some arbitrary length of time (14 seconds),
before reading. It then waits another 14 seconds, and reads again.
If refresh isn't working, or if a read causes a bit toggle due to bad
parity circuits, this will catch it.

A memory test that only takes 5 seconds to run isn't testing a whole
lot.
--
---
Clarence A Dold - cdold@starfish.Convergent.COM (408) 435-5274
...pyramid!ctnews!mitisft!professo!dold
P.O.Box 6685, San Jose, CA 95150-6685
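
The difference between a write-then-read-back test and the "unique
pattern per location" test described above can be sketched with a
simulated memory. This is a minimal Python illustration, not any
vendor's diagnostic. AliasedRam, naive_test, and address_test are
hypothetical names, and the floating address line is modelled simply
as a bit that the memory ignores.

```python
class AliasedRam:
    """Simulated RAM in which one address line may be disconnected,
    so two addresses differing only in that bit select the same cell."""

    def __init__(self, size, dead_bit=None):
        self.cells = [0] * size
        # With no dead bit the mask is all ones and addresses pass through.
        self.mask = ~(1 << dead_bit) if dead_bit is not None else -1

    def write(self, addr, value):
        self.cells[addr & self.mask] = value

    def read(self, addr):
        return self.cells[addr & self.mask]


def naive_test(ram, size):
    """Write a fill byte and read it straight back, one location at a
    time. Aliasing is invisible because each read follows its own write."""
    for pattern in (0x55, 0xAA):
        for addr in range(size):
            ram.write(addr, pattern)
            if ram.read(addr) != pattern:
                return False
    return True


def address_test(ram, size):
    """Write each location's own address into it, and only then read
    everything back. Returns the addresses that read back wrong."""
    for addr in range(size):
        ram.write(addr, addr)
    return [addr for addr in range(size) if ram.read(addr) != addr]


bad = AliasedRam(16, dead_bit=3)
print(naive_test(bad, 16))      # True: the fault goes unnoticed
print(address_test(bad, 16))    # [0, 1, 2, 3, 4, 5, 6, 7]
```

With address bit 3 dead, locations 0-7 and 8-15 share cells, so the
later writes overwrite the earlier ones and the low half reads back
the wrong values. On real hardware with more locations than distinct
word values, the stored value would have to be some function of the
address bits under test rather than the full address.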
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (11/24/88)
In article <512@mitisft.Convergent.COM> dold@mitisft.Convergent.COM (Clarence Dold) writes:

| On a separate thought:
| A Memory diagnostic that checks refresh will write a pattern to all of
| memory, and then wait some arbitrary length of time (14 seconds), before
| reading. It then waits another 14 seconds, and reads again.
| If refresh isn't working, or if a read causes a bit toggle due to bad
| parity circuits, this will catch it.
|
| A memory test that only takes 5 seconds to run isn't testing a whole lot.

The refresh rate on a PC or AT is settable from software. There are
programs which allow you to play with the value. Making it longer
gives fewer wait states (up to 3% CPU speedup!) and making it shorter
may allow you to run bad memory chips without getting parity errors.

If I were writing another memory test program, I'd definitely check
the margins on the refresh, but I'm not sure that my program would
stay valid while running, so it would report refresh errors by
crashing the system.
--
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
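
The write / wait / read / wait / read sequence quoted above can be
sketched against a simulated DRAM with a simulated clock, so nothing
actually sleeps. This is a rough Python model under loud assumptions:
SimulatedDram, its 2-second retention figure, and the choice that
decayed cells read as zero are all invented for illustration and do
not describe real chips or any existing test program.

```python
class SimulatedDram:
    """Simulated DRAM. If refresh is not working, a cell loses its
    contents once more than `retention` seconds pass without that
    cell being written or read (both recharge it in this model)."""

    def __init__(self, size, refresh_working=True, retention=2.0):
        self.cells = [0] * size
        self.age = [0.0] * size      # seconds since last access
        self.refresh_working = refresh_working
        self.retention = retention   # hypothetical figure

    def write(self, addr, value):
        self.cells[addr] = value
        self.age[addr] = 0.0

    def read(self, addr):
        self.age[addr] = 0.0
        return self.cells[addr]

    def wait(self, seconds):
        """Advance the simulated clock; unrefreshed cells decay to 0."""
        for addr in range(len(self.cells)):
            self.age[addr] += seconds
            if not self.refresh_working and self.age[addr] > self.retention:
                self.cells[addr] = 0


def retention_test(ram, size, pattern=0xFF, delay=14.0):
    """Fill memory, then twice wait `delay` seconds and read it all
    back. Returns the sorted addresses that lost the pattern."""
    for addr in range(size):
        ram.write(addr, pattern)
    bad = set()
    for _ in range(2):
        ram.wait(delay)
        bad.update(addr for addr in range(size)
                   if ram.read(addr) != pattern)
    return sorted(bad)
```

In this model a test with delay=0.0 passes even when refresh is
broken, while the 14-second version reports every cell, which is the
sense in which a test that finishes in a few seconds cannot say much
about refresh. Checking refresh margins on real hardware is a
different matter, since the test program itself lives in the memory
being starved.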