[comp.sys.intel] POPF still broken in 286?

don@gitpyr.gatech.EDU (Don Deal) (11/17/86)

   IBM put a warning in the AT Technical Reference, and I can remember 
several people complaining about a problem with the POPF instruction that
allowed interrupts to be processed even when interrupts had been disabled.
Was this problem fixed in subsequent masks for the 286, or does it still
exist?

   That a problem like this could occur (and has with vendors other than
Intel) makes me wonder what kind of testing goes on before products are
released to market.   Given that increasingly complicated architectures
are showing up in most microprocessor families, it would seem that additional
testing is in order.  Is anyone familiar with the testing cycles that go
on for microprocessors? 

-- 
D.L. Deal, Office of Computing Services, Georgia Tech, Atlanta GA, 30332-0275
Phone: (404) 894-4660   ARPA: don@pyr.ocs.gatech.edu  BITNET: cc100dd@gitvm1
uucp: ...!{akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!don

dan@prairie.UUCP (Daniel M. Frank) (11/18/86)

In article <2646@gitpyr.gatech.EDU> don@gitpyr.gatech.EDU (Don Deal) writes:
>   IBM put a warning in the AT Technical Reference, and I can remember 
>several people complaining about a problem with the POPF instruction ...
>
>   That a problem like this could occur (and has with vendors other than
>Intel) makes me wonder what kind of testing goes on before products are
>released to market.

   Recently, a friend of mine had some problems running a protected mode
operating system on an IBM AT.  He traced it to an early and buggy lot
of 286 chips, a few of which he was unlucky enough to receive via IBM.
On contacting IBM, he was told, "It's an Intel problem".  When he 
called Intel, he was told that IBM had been aware of the problem with
the early runs, and Intel had been unwilling to ship without a letter
from IBM acknowledging the problem and absolving Intel of liability.
IBM duly provided the letter, and Intel shipped the chips.

   I should note that this is hearsay.  Perhaps one of the folks from
Intel could be kind enough to confirm or deny it.  In any case, it 
takes a long time to prepare a Tech Ref manual, and even longer to
write the BIOS, which includes workaround code for many of these bugs.
It is almost inconceivable that IBM didn't know about the problems
long before the introduction of the AT.

   The AT is no miracle of engineering anyway.  The BIOS is filled
with funny delay loops and useless instructions all designed to pass
the time until the contents of device registers become valid.  Too
cheap to build the hardware right, I guess.  It's no surprise to me
that IBM also accepted bad chips and worked around the problems.
And protected mode?  No protected mode operating systems around
anyway, for years maybe.  I can't wait until p.m. DOS comes out :-).

-- 
    Dan Frank
    uucp: ... uwvax!prairie!dan
    arpa: dan%caseus@spool.wisc.edu

tomk@intsc.UUCP (Tom Kohrs) (11/18/86)

>    IBM put a warning in the AT Technical Reference, and I can remember 
> several people complaining about a problem with the POPF instruction that
> allowed interrupts to be processed even when interrupts had been disabled.
> Was this problem fixed in subsequent masks for the 286, or does it still
> exist?
> 
The problem with the POPF instruction (it would always enable interrupts)
was only in the B-step parts (identifiable by markings of (c) Intel'82 or
(c) Intel '83).  The C-step and E-step have this problem fixed.  Almost all
of the B-step parts that were shipped went to IBM.
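
A minimal sketch of why this matters, assuming the simplified description
above (POPF leaves interrupts enabled no matter what was popped).  It is a toy
model in Python, not 286 code, and the function names are hypothetical.  The
one hardware fact it relies on is that IF is bit 9 (0x0200) of the FLAGS word.

```python
IF_MASK = 0x0200  # bit 9 of the x86 FLAGS word is the interrupt-enable flag (IF)


def popf_intended(popped_word):
    """Intended behaviour: FLAGS takes the popped value, IF included."""
    return popped_word & 0xFFFF


def popf_b_step_model(popped_word):
    """ASSUMPTION: simplified model of the fault as described in the post,
    i.e. IF ends up set regardless of the popped value."""
    return (popped_word & 0xFFFF) | IF_MASK


def restore_after_critical_section(popf):
    """Model the PUSHF / CLI / ... / POPF idiom, entered with interrupts
    already disabled.  Returns True if interrupts are enabled afterwards."""
    flags = 0x0002           # IF clear on entry (bit 1 always reads as 1)
    saved = flags            # PUSHF: save the caller's flags
    flags &= ~IF_MASK        # CLI: disable interrupts (already off here)
    flags = popf(saved)      # POPF: restore the caller's flags
    return bool(flags & IF_MASK)


print(restore_after_critical_section(popf_intended))      # False
print(restore_after_critical_section(popf_b_step_model))  # True
```

Under this model, code that was entered with interrupts off comes back from
its own save/restore sequence with interrupts on, which is the kind of failure
the complaints describe.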

>    That a problem like this could occur (and has with vendors other than
> Intel) makes me wonder what kind of testing goes on before products are
> released to market.   Given that increasingly complicated architectures
						^^^^^^^^^^^
> are showing up in most microprocessor families, it would seem that additional
> testing is in order.  Is anyone familiar with the testing cycles that go
> on for microprocessors? 

Complicated is the key word. As architectures become more and more complicated
it takes longer to generate all of the test vectors necessary to prove the 
design.  Initial part testing takes two forms: running software from a previous
part (the 8086 in the case of the 286) and running specific vectors designed to
stress the part.  How much gets found and fixed before the parts ship in volume has
more to do with marketing considerations than the technical correctness of
the chip.  Long term testing is done by trying to put the chip through every
conceivable sequence of events (both hardware and software) and by following
up on problem reports from the field. You would be amazed at what some people 
will try to do to a chip.
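
The idea of driving a part through many event sequences and comparing the
result against known-good behavior can be sketched as a differential test.
This is an illustration only: the four-operation toy machine, `run`, and
`find_mismatches` are hypothetical names, the faulty model assumes the
simplified POPF description given earlier in this post, and real vector
generation is far more involved.

```python
from itertools import product

OPS = ("CLI", "STI", "PUSHF", "POPF")


def run(sequence, popf_forces_if):
    """Run a sequence on a toy machine that tracks only the interrupt flag
    and a stack of saved flags.  Returns the flag value after each
    operation, or None if the sequence pops from an empty stack."""
    interrupt_flag = True
    stack = []
    trace = []
    for op in sequence:
        if op == "CLI":
            interrupt_flag = False
        elif op == "STI":
            interrupt_flag = True
        elif op == "PUSHF":
            stack.append(interrupt_flag)
        elif op == "POPF":
            if not stack:
                return None  # not a meaningful test sequence; skip it
            popped = stack.pop()
            # ASSUMPTION: the faulty model always leaves interrupts enabled.
            interrupt_flag = True if popf_forces_if else popped
        trace.append(interrupt_flag)
    return trace


def find_mismatches(max_length):
    """Enumerate every sequence up to max_length and collect those where
    the faulty model diverges from the reference model."""
    mismatches = []
    for length in range(1, max_length + 1):
        for sequence in product(OPS, repeat=length):
            expected = run(sequence, popf_forces_if=False)
            if expected is None:
                continue
            if run(sequence, popf_forces_if=True) != expected:
                mismatches.append(sequence)
    return mismatches


print(find_mismatches(3))  # [('CLI', 'PUSHF', 'POPF')]
```

Even in this toy, no sequence shorter than three operations exposes the fault,
and the number of sequences grows as 4^n with length n, which hints at why
exhaustive coverage of a real part takes so long.
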
-- 
------
"Ever notice how your mental image of someone you've 
known only by phone turns out to be wrong?  
And on a computer net you don't even have a voice..."

  tomk@intsc.UUCP  			Tom Kohrs
					Regional Architecture Specialist
		   			Intel - Santa Clara

james@reality1.uucp (james) (11/21/86)

In article <405@intsc.UUCP>, tomk@intsc.UUCP (Tom Kohrs) wrote:
> The problem with the POPF instruction (it would always enable interrupts)
> was only in the B-step parts (identifiable by markings of (c) Intel'82 or
> (c) Intel '83).  The C-step and E-step have this problem fixed.  Almost all
> of the B-step parts that were shipped went to IBM.

Well, what's the most current rev. level for the 80286, i.e., how recent
should mine be to avoid all known bugs?  Chip bugs are normally top secret,
but I assume the current chip rev. level isn't sensitive.

> How much gets found and fixed before the parts ship in volume has
> more to do with marketing considerations than the technical correctness of
> the chip.  Long term testing is done by trying to put the chip through every
> conceivable sequence of events (both hardware and software) and by following
> up on problem reports from the field. You would be amazed at what some people 
> will try to do to a chip.

Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
although the argument might be that they haven't been found yet, or that
Motorola has had better luck hiding them than Intel has had.  Of course,
the 68000 did have the bug with the status register in which you could read
the privilege level directly from user mode (although this was later
documented as a feature :-).
-- 
James R. Van Artsdalen    ...!ut-ngp!utastro!osi3b2!james    "Live Free or Die"

campbell@sauron.UUCP (Mark Campbell) (11/29/86)

In article <83@reality1.uucp> james@reality1.UUCP (james) writes:
>In article <405@intsc.UUCP>, tomk@intsc.UUCP (Tom Kohrs) wrote:
>> How much gets found and fixed before the parts ship in volume has
>> more to do with marketing considerations than the technical correctness of
>> the chip.  Long term testing is done by trying to put the chip through every
>> conceivable sequence of events (both hardware and software) and by following
>> up on problem reports from the field. You would be amazed at what some people 
>> will try to do to a chip.
>
>Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
>although the argument might be that they haven't been found yet, or that
>Motorola has had better luck hiding them than Intel has had. [...]

Would it have made any difference if Intel had called the I80286 the
XI80286 before a certain release of the chip?  All of the so-called
X parts I have are labelled "MC68020".  The argument you're putting forth
is syntactic; the semantics are identical.

What really irritates me is releasing different revisions of these
parts with no way for the S/W to detect the different
revisions.  It's terrible that Motorola went to the trouble of dumping
a revision number of the MC68020 in the microstate during certain
exceptions but has never updated that revision number.  This means
that the bug fixes for the X parts must be retained in current releases
of software because there is no way in S/W to tell which machines
in the field have XC68020s in them.
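
The cost of an unreliable revision number can be sketched as follows.  This
is a hypothetical illustration in Python: the errata names, the revision
numbers, and `workarounds_needed` are invented for the example and do not
describe any real part.

```python
# Hypothetical errata table: workaround name -> first revision with the fix.
FIXED_IN = {"erratum_a": 2, "erratum_b": 3}


def workarounds_needed(reported_revision, trustworthy):
    """Return the workarounds software must keep enabled.

    If the reported revision cannot be trusted (for example because it is
    never updated between revisions), every workaround has to stay in."""
    if not trustworthy:
        return sorted(FIXED_IN)
    return sorted(name for name, fixed in FIXED_IN.items()
                  if reported_revision < fixed)


print(workarounds_needed(3, trustworthy=True))   # []
print(workarounds_needed(2, trustworthy=True))   # ['erratum_b']
print(workarounds_needed(3, trustworthy=False))  # ['erratum_a', 'erratum_b']
```

With a trustworthy number the list shrinks as parts improve; without one, the
software carries every workaround indefinitely.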

Around here we have a joke that the first update of the revision
number for the MC68020 will be in the MC68030.

I'm not as familiar with the Intel parts so I don't know if they have
the same problems -- I would assume so.  In any case, why
don't you microprocessor developers out there take pity on the
rest of us and update your revision numbers once in a while?
-- 

Mark Campbell    Phone: (803)-791-6697     E-Mail: !ncsu!ncrcae!sauron!campbell

cmcmanis@sun.uucp (Chuck McManis) (12/01/86)

In article <83@reality1.uucp>, james@reality1.uucp (james) writes:
> Well, what's the most current rev. level for the 80286, i.e., how recent
> should mine be to avoid all known bugs?  Chip bugs are normally top secret,
> but I assume the current chip rev. level isn't sensitive.
James, I don't think chip bugs are "top secret"; ask your Intel sales rep
for an errata sheet. They can also generally tell you the current rev
level, or 'stepping' as the semiconductor trade likes to call it.

> 
> Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
> although the argument might be that they haven't been found yet, or that
> Motorola has had better luck hiding them than Intel has had.  Of course,
> the 68000 did have the bug with the status register in which you could read
> the privilege level directly from user mode (although this was later
> documented as a feature :-).

Well gee, I bet you couldn't think of any '286 bugs offhand if you hadn't
actually been bitten by one. CPUs in general are getting so complicated
that the time to test every transistor in a 32-bit CPU can often be
measured in hours; one of the biggest challenges facing a production
engineer is to get that testing time down as low as possible. And yes,
there were bugs in the 68020. Consider too that the '286 is both a CPU and
an MMU in one package, and compare it against the '020 and a '451 or '851.
It is all quite silly, 99% of the 'bugs' are so obscure that they
are *never* seen by 99% of the users. 

For a sobering look at the problems of testing VLSI look at some of the
recent Digital Design, EDN, and other technical mags. 

-- 
--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that, didn't you?