jbn@wdl1.UUCP (John B. Nagle) (08/15/84)
Machines without parity checking must be considered only slightly above the toy level. Intermittent errors are a continual nagging problem in such machines. The IBM PC has parity checking; the PC Jr does not. The TI Professional does not, and suffers from intermittent problems because of it. Any machine costing over $1000 should unquestionably have parity checking; below that level, there is some argument for economy, but personally I would go for parity all the way down to the appliance control processor level.

In a small computer general-purpose operating system, parity errors in user space should kill the job involved and display a message, not crash the machine. Parity errors in system space should crash the machine with a message. More elaborate strategies are possible; this is a minimum.

Power supplies should be designed such that if the output voltage deviates from the rated value, the machine goes down. A zener in the right place will accomplish this. It is better to crash fully than have an undetected error. Again, more elaborate strategies are possible, such as power fail interrupts, but just plowing on is a bad idea.

If you build an unreliable machine, it will not sell. Remember the Coleco Adam?

John Nagle
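For readers who have not looked at what the parity check actually does, here is a minimal sketch of one parity bit stored alongside each byte. The even-parity convention and the names parity_bit, store, check and flip are assumptions made for illustration; they do not describe any particular machine's memory hardware.

```python
def parity_bit(byte):
    """Even-parity bit for an 8-bit value: 1 when the count of 1 bits is odd."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Model a 9-bit memory cell as (data, parity), with parity computed on write."""
    data = byte & 0xFF
    return (data, parity_bit(data))


def check(cell):
    """Recompute parity on read and compare it with the stored parity bit."""
    data, stored_parity = cell
    return parity_bit(data) == stored_parity


def flip(cell, *bit_positions):
    """Simulate a memory fault by inverting the given data bits (0 = lowest bit)."""
    data, stored_parity = cell
    for position in bit_positions:
        data ^= 1 << position
    return (data, stored_parity)


cell = store(0x5A)
print(check(cell))              # intact cell passes the check
print(check(flip(cell, 3)))     # one flipped bit fails the check
print(check(flip(cell, 3, 6)))  # two flipped bits pass the check unnoticed
```

A single parity bit catches any odd number of flipped bits in the byte it covers. An even number of flips leaves the parity unchanged, so that kind of error goes through silently.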
robison@eosp1.UUCP (Tobias D. Robison) (08/17/84)
Parity checking is a flawed feature of limited use unless you and your system together can decide what an error means, and what to do about it. Here's a case in point:

- You are editing a file, and it's been a while since your last backup, when suddenly the dreaded parity error occurs. Your data in memory is lost, and you must reboot. And yet, the only thing that was wrong was that part of the operating system needed for CRT output was bombed. You should have been able to store your document. In fact, a system without parity checking WOULD have let you store your document.

Good parity checking requires the following:

- The system will try to continue running after a parity error. It's your choice whether it should:
  + just continue what you were doing
  + execute routines in ROM designed to store data on disk before rebooting
  + try to reload the erroneous data from disk and continue
  + etc...
- You and the system must be able to tell the difference between kinds of parity errors:
  + to OSYS code
  + to applications code
  + to OSYS data
  + to applications data

You want to try different things in these cases, and a good system will check ALL memory, tell you what problem(s) you have, and prompt with appropriate recoveries. In general, you need a system that keeps data and instructions separate, so you can distinguish the above cases.

Without the choices of recovery, you know that a parity error will ruin your current work. You don't know, however, that your current work is accurate in the absence of a parity error; there may have been a memory error not detected by parity checking; there may have been a bug; you still have to do sanity checks to make sure your work is OK.

- Toby Robison (not Robinson!)
allegra!eosp1!robison
decvax!ittvax!eosp1!robison
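The four-way distinction above could be sketched roughly as follows. Everything specific here is hypothetical: the memory map addresses, the names Region, MEMORY_MAP, RECOVERIES, classify and report, and the pairing of each region with particular recoveries are all assumptions for illustration. The post lists the kinds of error and the kinds of recovery but does not say which goes with which.

```python
from enum import Enum


class Region(Enum):
    OS_CODE = "operating system code"
    APP_CODE = "application code"
    OS_DATA = "operating system data"
    APP_DATA = "application data"


# Hypothetical memory map: (start, end_exclusive, region). It assumes code and
# data live in separate areas, which is what makes the classification possible.
MEMORY_MAP = [
    (0x0000, 0x4000, Region.OS_CODE),
    (0x4000, 0x6000, Region.OS_DATA),
    (0x6000, 0xA000, Region.APP_CODE),
    (0xA000, 0x10000, Region.APP_DATA),
]

# Assumed pairing of regions with recoveries to offer the user.
RECOVERIES = {
    Region.OS_CODE: ["reload the damaged code from disk and continue",
                     "save user data with ROM routines, then reboot"],
    Region.APP_CODE: ["reload the damaged code from disk and continue"],
    Region.OS_DATA: ["save user data with ROM routines, then reboot"],
    Region.APP_DATA: ["just continue, treating the data as suspect",
                      "reload the data from disk and continue"],
}


def classify(address):
    """Return the region containing the failing address."""
    for start, end, region in MEMORY_MAP:
        if start <= address < end:
            return region
    raise ValueError(f"address {address:#x} is outside the memory map")


def report(address):
    """Build the message and the numbered recovery choices for one parity error."""
    region = classify(address)
    lines = [f"Parity error at {address:#06x} in {region.value}"]
    lines += [f"  {n}. {choice}" for n, choice in enumerate(RECOVERIES[region], 1)]
    return "\n".join(lines)


print(report(0x4100))
```

The sketch returns a list of choices instead of acting on its own, since the post leaves the final decision to the user.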
jones@fortune.UUCP (08/22/84)
#R:wdl1:-38400:fortune:28000048:000:549
fortune!jones Aug 21 14:59:00 1984

I believe that all the actions Toby refers to come under the heading of error recovery after the parity hit. It is certainly true that the way an error is handled makes a big difference in user satisfaction. However, the fact that an error is handled poorly does not negate the general importance of detecting the error.

Dan Jones
(Remember! In volleyball you can only score when you serve!)
UUCP: {ihnp4,ucbvax!amd,hpda,sri-unix,harpo}!fortune!jones
DDD: (415)594-2440
USPS: Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065