[comp.lsi] Intel quote on 386 Multiply bug

mark@mips.UUCP (Mark G. Johnson) (08/17/87)

In article <376@astroatc.UUCP>, johnw@astroatc.UUCP (John F. Wardale) writes:

>	A couple weeks ago, I asked about the infamous Intel 386 multiply
>	bug...  Does anyone REALLY KNOW, or is the world in as much
>	darkness as I am?  Should I request the "test/demo" program and
>	try to grok it?  (I don't speak Intel assembler, but this may be
>	the only way!)
>	     Can anyone shed any more light on this?  We're involved in
>	designing a large computer thru extensive use of simulation, as
>	I'm sure Intel did.  I'm interested in the following:
>	  * i386 multiply bug details
>		-- do all the failing parts ALWAYS fail the same way, or
>		   is it a transient failure
>	  * why the simulations missed it (tricky 2nd order effect???)
>	  * pitfalls of simulations (in general)
>

The July 1987 issue of COMPUTER DESIGN magazine has an article on page 22,
	"80386 Multiplier Problem Spotlights VLSI Testability Issues"
which you might find illuminating.  The following excerpt is taken
without permission:

{beginning of excerpt}
`Contrary to industry speculation, the 80386 multiplier errors result
from a layout problem, not from an error in logic design.  "We didn't
allow enough margin to catch the worst-case pattern in the multiplier
at the corners of our process," explains Dana Krelle, Intel's 80386
marketing manager.  "As a result, some chips, at some temperature/
voltage/frequency points, will produce errors from particular combinations
of 32-bit operands."  The error, apparently due to unintentional
coupling between adjacent cells in the multiplier, escaped Intel's
simulation and chip verification process until it was spotted in a
subsequent stress-testing program.'
{end of excerpt}


Another interesting facet of the problem hasn't been mentioned
yet --- the 80386 doesn't have a multiplier (!!).  The 386 uses
its general-purpose ALU, plus a shift-and-add microcode routine,
to perform multiplication operations.  Similarly, divide operations
use the ALU plus a shift-and-subtract microcode routine.
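
For readers who haven't seen these loops spelled out, here is a minimal
sketch of textbook shift-and-add multiplication and shift-and-subtract
(restoring) division on unsigned 32-bit operands.  It is an illustration
only: the function names and the Python rendering are my own, and it makes
no claim to match the 386's actual microcode.  The point is that every
step reduces to an ordinary add or subtract plus a shift.

```python
MASK32 = 0xFFFFFFFF  # operands are treated as unsigned 32-bit values


def shift_add_multiply(a, b):
    """Unsigned 32x32 -> 64-bit product using only shifts and adds."""
    a &= MASK32
    b &= MASK32
    product = 0
    for i in range(32):
        # For each set bit of the multiplier, add the multiplicand
        # shifted left by that bit's position.
        if (b >> i) & 1:
            product += a << i
    return product


def shift_subtract_divide(n, d):
    """Unsigned 32-bit restoring division; returns (quotient, remainder)."""
    n &= MASK32
    d &= MASK32
    if d == 0:
        raise ZeroDivisionError("divide by zero")
    quotient = 0
    remainder = 0
    for i in range(31, -1, -1):
        # Shift the next dividend bit (most significant first) into
        # the running remainder.
        remainder = (remainder << 1) | ((n >> i) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= d:
            remainder -= d
            quotient |= 1 << i
    return quotient, remainder
```

For example, shift_subtract_divide(100, 7) walks 32 shift/compare/subtract
steps and yields quotient 14, remainder 2.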

The confusing thing is, why do multiplies fail while divides and
adds (apparently) work?  Shouldn't it be possible to supply the
same "bad" ALU operands to a regular add, or to create
them during some step of a divide?
-- 
-Mark Johnson	*** DISCLAIMER: The opinions above are personal. ***	
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mark   TEL: 408-720-1700 x208
US mail: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086