GEO%LOYVAX.BITNET@cunyvm.cuny.edu (03/24/89)
This may be of interest to those caught between hardware and software vendors. The context is: HW -- IBM PS/2 mod 80, 70MB drive, ESDI controller, 4MB memory; SW -- SCO Xenix 2.2.6.

On start-up one morning, the memory check halted at 1662 with error messages for configuration change and unset clock/calendar. The reference disk showed 4096k available, 1662 usable, but 0 ESDI controllers available. I ran the autoconfigure and reset the clock/calendar. On restart, the Xenix hard disk boot proceeded up to the point at which the hard disk configuration is established. At this point the machine hung with the hard disk activity light on and the message:

    panic: memory failure - parity error

Xenix would boot fine from a floppy, but attempts to mount the hard disk gave the same lock-up, the same error message, and, as a little bonus, would also trash the boot floppy.

After a few days of weeping and teeth-gnashing, IBM replaced just about everything in the machine, including memory and system board. The disk was given a low-level format. DOS booted and ran fine. Looping through all diagnostics overnight for over 14 hours produced no errors. The IBM CE support desk in Atlanta said they had heard of "6 or 8" such situations. I was assured it was a software problem.

SCO assured me that my error message is "one of the more straight-forward ones." The tech said that if my machine were AT-class, he'd be certain it was a memory failure; since the machine's a PS/2-80, he was less certain -- he said this machine gives the message for other, unknown reasons. In any event, it was definitely a hardware problem, he said.

So there I was, caught between IBM & SCO with no leverage at all!

To shorten the story, a day or so later, a more knowledgeable SCO tech (Tracy by name) said the problem was probably due to DMA contention.
She suggested 3 possibilities: (1) the low-level format of the hard disk had somehow gone bad, (2) DMA arbitration conflict between the hard disk and something else, and (3) improper DMA arbitration set by the autoconfigure routine. Of the three, the third was the most likely one.

Tracy didn't know what the proper DMA arb level was, so I cycled through them all (0-7). The autoconfigure had established level 6. Turns out my configuration works ONLY on level 5. Once I reset DMA arb to 5, reloaded from backup, re-created device drivers, etc., etc., why, it was as if nothing had ever happened! :-)

The thing is, neither SCO nor IBM knew to rule out DMA arb problems until quite late in the game.

The moral: keep backups, certainly. It also convinced me that a service contract -- at some $260/year -- is preferable to paying for a new system board at $190/hour for labor and $2160 for parts. Didn't have to do that this time, but it was just too close a call.

George Wright, GEO@LOYVAX.BITNET, Loyola College, Baltimore MD