[net.micro.68k] Warty Intel co-processor interface

gnu@sun.uucp (John Gilmore) (07/19/84)

	I do SO love it when people spout off about things they seem to know
	little about.  I do suppose, however, that all of this really matters
	in how you define "coprocessor."  WRT the 8086, ...
 
	Ken Shoemaker, Intel, Santa Clara, Ca.
	{pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!kds

Actually I was talking about the 80286/80287.  Intel refuses to
document the coprocessor interface (that should give you a clue right
there), so what follows is from guesses and experience rather than
specs.

The 286 decodes the floating point instructions and turns them into
references to I/O addresses 0xF8, 0xFA, and 0xFC.  The 287 is wired
into the I/O decoding such that these addresses cause chip selects.
Typical instructions cause a read to 0xF8 (status register) to check
for busy, followed by writing the opcode to 0xF8.  The 286 then writes
the PC and EA to 0xFC, in case a fault occurs.  It then waits for the
287 to issue PEREQ; this indicates that an operand is wanted.  The 286
fetches the operand and writes it to 0xFA, while asserting PEACK.  (The
286 -- the CPU, not the float chip -- knows where the operand is, and
how many bytes it occupies.)  When the 286 has transferred all the
operand bytes, it continues with the next instruction.  The 286 and 287
microcode implements seven different variants of this protocol,
depending on which float instruction is involved.
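
The sequence described above can be sketched as a small simulation.  This
is only an illustration of the post's account, which is itself based on
guesses rather than Intel documentation.  The names Fake287 and
issue_float_instruction, the busy-bit encoding, and the example values
are assumptions made for the sketch, and it models just one of the seven
variants (an instruction that loads an operand).

```python
# Illustrative model of the 286-side sequence described in the post.
# Port roles follow the post's account (guesses, not Intel documentation);
# class names, function names, and the busy-bit encoding are hypothetical.

STATUS_OPCODE_PORT = 0xF8   # read: status/busy; write: opcode
OPERAND_PORT = 0xFA         # operand words
FAULT_INFO_PORT = 0xFC      # PC and EA, saved in case a fault occurs


class Fake287:
    """Toy stand-in for the float chip: records writes, raises PEREQ."""

    def __init__(self):
        self.busy = False
        self.pereq = False   # models the PEREQ pin
        self.log = []        # (port, value) writes, in arrival order

    def read(self, port):
        # Assumed encoding: nonzero status means busy.
        return 1 if (port == STATUS_OPCODE_PORT and self.busy) else 0

    def write(self, port, value):
        self.log.append((port, value))
        if port == STATUS_OPCODE_PORT:
            # Once it has an opcode, the toy chip asks for its operand.
            self.pereq = True


def issue_float_instruction(fpu, opcode, pc, ea, operand_words):
    """Run one operand-loading instruction through the described steps."""
    while fpu.read(STATUS_OPCODE_PORT):      # 1. poll status for busy
        pass
    fpu.write(STATUS_OPCODE_PORT, opcode)    # 2. hand over the opcode
    fpu.write(FAULT_INFO_PORT, pc)           # 3. save PC and EA
    fpu.write(FAULT_INFO_PORT, ea)
    while not fpu.pereq:                     # 4. wait for PEREQ
        pass
    for word in operand_words:               # 5. the CPU, which knows the
        fpu.write(OPERAND_PORT, word)        #    operand size, sends it all
    return fpu.log
```

Note that the CPU side drives every step and decides how many operand
words to send; the float chip only answers port accesses and raises
PEREQ.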

This is not a terrible way to attach a float chip.  However, it does
not let you add arbitrary instructions to the instruction set, which is
the whole point of a general coprocessor interface.  Nor does it let
you have two coprocessors even if the wired-in EA and operand size
decoding could be worked around -- so if you build a graphics
coprocessor you'll have to give up floating point.  The 68020 provides
all these capabilities.

	---the above views are personal.  They may not represent those of Intel.

henry@utzoo.UUCP (Henry Spencer) (07/24/84)

John Gilmore comments, in part:

> This is not a terrible way to attach a float chip.  However, it does
> not let you add arbitrary instructions to the instruction set, which is
> the whole point of a general coprocessor interface.  Nor does it let
> you have two coprocessors even if the wired-in EA and operand size
> decoding could be worked around -- so if you build a graphics
> coprocessor you'll have to give up floating point.  The 68020 provides
> all these capabilities.

Then Motorola may have goofed, and they'll find out the hard way.
What I was told, by a fellow who consulted for Intel, was that Intel
now regrets the "general coprocessor" interface on the 8087, and
thinks it a serious mistake.  The trouble is that it adds a great deal
of complexity on the coprocessor chip for very little gain, compared to 
the slave-processor approach of having the cpu do most of the work.
Intel's abandoning of the coprocessor interface for the 8028[67] tends
to confirm this report.  It sounds like Motorola was taken in by Intel's
early coprocessor rhetoric, even as Intel was (internally) backpedalling
furiously.

There's no denying that the slave-processor approach makes it difficult
to add arbitrary customer-designed auxiliaries.  But it would seem to be
the right way to add a floating-point chip, which is (I would speculate)
a much more important consideration from the manufacturer's viewpoint.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

brucec@orca.UUCP (Master of the Belvedere) (08/06/84)

<go away bug, don't bother me>

The whole subject of coprocessor interfaces is more complicated than it looks
at first.  The reason Intel backed away from the 8087-style interface was
the complexity caused by the tight coupling of the cpu to the fpu.  The fpu had
to duplicate the cpu's instruction prefetch pipeline and operate in parallel
in order to know when the fpu instruction was actually being decoded by the
cpu so it could trap the address of the data which the cpu generated (since
the fpu didn't have all the segment registers, etc., it had to get the
physical address off the bus as the cpu generated it).
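
A toy model may make the bookkeeping concrete.  It is a sketch under
heavy assumptions: the event methods (on_fetch, on_decode,
on_data_access) and the class name are invented here, and it treats
every byte taken from the queue as the first byte of an instruction,
whereas a real fpu has to track instruction boundaries.  The only
hardware fact relied on is that the 8086 ESC opcodes occupy 0xD8
through 0xDF.

```python
# Toy model of an fpu that shadows the cpu's prefetch queue and latches
# the operand address from the bus.  The event interface is hypothetical.

ESC_MIN, ESC_MAX = 0xD8, 0xDF   # 8086 ESC (coprocessor escape) opcodes


class SnoopingFpu:
    def __init__(self):
        self.queue = []           # mirror of the cpu's prefetch queue
        self.pending_esc = None   # ESC opcode seen, awaiting its address
        self.captured = []        # (opcode, physical address) pairs

    def on_fetch(self, byte):
        """The cpu prefetched a code byte; mirror it."""
        self.queue.append(byte)

    def on_decode(self):
        """The cpu took an opcode byte from its queue; do the same.

        Simplification: every dequeued byte is assumed to be an opcode.
        """
        byte = self.queue.pop(0)
        if ESC_MIN <= byte <= ESC_MAX:
            self.pending_esc = byte

    def on_data_access(self, physical_address):
        """The cpu drove a data address onto the bus."""
        if self.pending_esc is not None:
            # The fpu has no segment registers, so this is the only
            # place it can learn where the operand lives.
            self.captured.append((self.pending_esc, physical_address))
            self.pending_esc = None
```

Even in this stripped-down form the fpu must see every fetch and every
decode to stay in step with the cpu, which is the complexity being
described.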

The 286/287 interface is nasty too, since it requires the cpu to have all the
microcode required for loading the data into the fpu, running the fpu, then
storing the data.  The more microcode you use for things like this, the less
there is for the main cpu functions.  Also, the fpu loads and stores take
longer, since the cpu has to do them.

There's another way, which, oddly enough, Intel uses also. That is to loosely
couple the cpu and coprocessor, and allow them to operate in parallel (anyone
who has really tried to program the 8086 & 8087 to run in parallel will tell
you just how useless the ability is, given their architecture).  This implies
a lot more capability on the part of the coprocessor, though, since it has to
be able to decode its instructions out of memory, just like a cpu.  The 8089
would be an excellent example of this technique if the instruction set
weren't so screwed up.  The 82730 text display chip shows how nicely the
technique can work.  The nice thing about this "channel processor" technique
of operation is that the cpu doesn't need any special hardware to use it;
all the special stuff is on the coprocessor (like a pin to wake the
coprocessor up and have it read its channel control block).
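
The channel-processor arrangement can be sketched roughly as follows.
Everything specific in it is invented for illustration: the three-word
control block layout, the tiny opcode set, and the names
ChannelProcessor and run_on_coprocessor do not describe the 8089 or the
82730.  The point is only that the cpu side uses ordinary memory writes
plus one wake-up signal, while the coprocessor fetches and decodes its
own program from memory.

```python
# Sketch of a loosely coupled "channel processor".  The control-block
# layout and opcodes are hypothetical, not those of any real chip.

OP_HALT, OP_ADD, OP_MUL = 0, 1, 2


class ChannelProcessor:
    def __init__(self, memory, control_block_addr):
        self.memory = memory
        self.cb = control_block_addr   # fixed address read on wake-up

    def attention(self):
        """Models the wake-up pin: read the control block, run the program."""
        pc = self.memory[self.cb]        # word 0: program address
        acc = self.memory[self.cb + 1]   # word 1: initial accumulator
        while True:
            op = self.memory[pc]         # the coprocessor decodes its own
            if op == OP_HALT:            # instructions out of memory
                break
            operand = self.memory[pc + 1]
            if op == OP_ADD:
                acc += operand
            elif op == OP_MUL:
                acc *= operand
            else:
                raise ValueError("unknown opcode %d" % op)
            pc += 2
        self.memory[self.cb + 2] = acc   # word 2: result


def run_on_coprocessor(memory, coproc, program_addr, program, initial):
    """The cpu side: plain memory writes, then one poke at the pin."""
    memory[program_addr:program_addr + len(program)] = program
    memory[coproc.cb] = program_addr
    memory[coproc.cb + 1] = initial
    coproc.attention()
    return memory[coproc.cb + 2]
```

For example, with a 64-word list as memory and the control block at
address 0, running the program [OP_ADD, 3, OP_MUL, 4, OP_HALT] with an
initial value of 2 leaves 20 in the result word.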

Of course, this approach doesn't work well when you only want to perform a
single floating point operation on two operands, but typically you want to do
a sequence of operations on many sets of two or more operands (graphics
transforms or FFTs are good examples).

				Bruce Cohen
				UUCP:	...!tektronix!orca!brucec
				CSNET:	orca!brucec@tektronix
				ARPA:	orca!brucec.tektronix@rand-relay
				USMail: M/S 61-183
					Tektronix, Inc.
					P.O. Box 1000
					Wilsonville, OR 97070