[comp.arch] Parity checking ALU's

mark@mips.COM (Mark G. Johnson) (01/18/90)

In article <48540@sgi.sgi.com> rpw3@rigden.UUCP (Robert P. Warnock) writes:
   >
   > ... [gives an example of a PDP-10 CPU failure]
   >
   >ALU parity adds some confidence, but not certainty. And no help in
   >this case.

I'll go farther and get flamed warmer:  ALU PARITY IS A RED HERRING.  Since
the ALU is typically less than 20% of the hardware of a CPU, parity-izing
it (and nothing else) has a near-negligible effect on overall system reliability.
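
To put a rough number on that, here is a back-of-the-envelope sketch.  It
assumes faults land uniformly across the CPU's hardware, which is my
simplification for illustration; the 20% figure is the one quoted above, and
the coverage values and the name detected_fraction are hypothetical.

```python
def detected_fraction(area_share: float, coverage: float) -> float:
    """Fraction of all CPU faults caught, assuming faults are spread
    uniformly over the hardware (an assumption, not a measurement)."""
    return area_share * coverage

ALU_SHARE = 0.20  # upper bound quoted in the post

# Hypothetical detection coverages for faults that do land in the ALU.
for coverage in (0.5, 0.9, 1.0):
    caught = detected_fraction(ALU_SHARE, coverage)
    print(f"coverage {coverage:.0%} -> at most {caught:.0%} of CPU faults caught")
```

Even with perfect coverage inside the ALU, at least 80% of the CPU's faults
go unchecked under this assumption.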

Why don't people add parity to the program counter?  It's an adder just like
the ALU.  How about the register scoreboard hardware?  It's *more* gates than
an adder.  One doesn't often hear of parity on the floating-point execution
units: why is that?  Or the branch target cache/buffer?  It consumes more area
on single-chip CPUs than does the ALU.  Why not parity on the TLB?  Or, worst
of all, why not put parity on the logic that detects an interrupt/exception?

I'll propose an answer: those who put parity on the ALU do so *because it's
easy*.  It's hard to put parity on a TLB; consider the '386's 4-way set
associative TLB and think about the elegance (and on-chip bussing delights)
of hanging parity on that.  Interrupt logic: forget it.
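
For anyone wondering why the adder is the easy case: each sum bit is
a XOR b XOR carry-in, so the parity of the sum can be predicted from the
parities of the operands and of the carries.  The sketch below shows that
textbook identity in software.  It is not any particular machine's
implementation, and the names (checked_add, inject_fault_mask) are made up
for illustration.

```python
def parity(x: int) -> int:
    """XOR of all bits of x: 0 for an even number of ones, 1 for odd."""
    return bin(x).count("1") & 1

def carry_vector(a: int, b: int, width: int, carry_in: int = 0) -> int:
    """Carries *into* each bit position, packed into a width-bit integer."""
    carries = 0
    c = carry_in
    for i in range(width):
        carries |= c << i
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        c = (ai & bi) | (ai & c) | (bi & c)  # carry out of bit i
    return carries

def predicted_sum_parity(a: int, b: int, width: int, carry_in: int = 0) -> int:
    """parity(sum) = parity(a) ^ parity(b) ^ parity(carries into each bit)."""
    return parity(a) ^ parity(b) ^ parity(carry_vector(a, b, width, carry_in))

def checked_add(a: int, b: int, width: int = 32, inject_fault_mask: int = 0):
    """Add modulo 2**width and compare actual vs. predicted result parity.
    inject_fault_mask is a hypothetical knob that flips result bits."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    result = ((a + b) & mask) ^ inject_fault_mask
    ok = parity(result) == predicted_sum_parity(a, b, width)
    return result, ok

print(checked_add(5, 7))                           # fault-free add
print(checked_add(5, 7, inject_fault_mask=0b100))  # one flipped bit: flagged
print(checked_add(5, 7, inject_fault_mask=0b110))  # two flipped bits: missed
```

Note the last case: a single parity bit misses any even number of flipped
bits, which is the "some confidence, but not certainty" point from the
quoted article.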

Summary and lightning rod: Internal logic of CPUs is incredibly difficult
to compute parity on, except for a couple of special cases (ALU, registers).
Putting parity on these makes a nice bullet for your marketing slides
but doesn't protect a very large fraction of the potential hw faults.
Other elements of a system, like memories and busses, are more amenable
to protection via parity or ECC, so most hw faults in these regions of the
machine can be detected that way.
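
As a contrast with the CPU-internal cases, parity on a memory word is about
as simple as checking gets: store one extra bit with the data and recompute
it on the way out.  A minimal sketch follows, assuming a 32-bit word and
even parity; store and load are hypothetical names, not any real memory
controller's interface.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of one bits."""
    return bin(word).count("1") & 1

def store(word: int, width: int = 32):
    """Return (data, parity) as it would be written to a memory cell."""
    data = word & ((1 << width) - 1)
    return data, parity_bit(data)

def load(cell):
    """Return (data, ok); ok is False if the stored parity does not match."""
    data, stored_parity = cell
    return data, parity_bit(data) == stored_parity

cell = store(0xDEADBEEF)
print(load(cell))                        # clean read
print(load((cell[0] ^ 0x10, cell[1])))   # one flipped data bit is detected
```

The data path does nothing but carry the bits along, so the check never has
to follow a computation.  That is what makes memories and busses so much
friendlier to parity than control logic.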
-- 
 -- Mark Johnson	
 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
	(408) 991-0208    mark@mips.com  {or ...!decwrl!mips!mark}