[comp.sys.ibm.pc] Serious math-coprocessor on the 80386

mic@lapis.berkeley.edu (Michel Bruneau) (05/06/87)

It is a well-known fact that the 8086-8087 combination easily beats the
80286-80287 setup in terms of number-crunching speed (at least for the
Fortran programs I use!). Now that the 80386 is out, is there a
math coprocessor that will come with it, and can we expect a real performance
gain from it (i.e. speeding up 20 times or more, like the 8087 did with the 8086)?

apn@nonvon.UUCP (root) (05/08/87)

in article <3477@jade.BERKELEY.EDU>, mic@lapis.berkeley.edu (Michel Bruneau) says:
> 
> It is a well-known fact that the 8086-8087 combination easily beats the
> 80286-80287 setup in terms of number-crunching speed (at least for the
> Fortran programs I use!). Now that the 80386 is out, is there a
> math coprocessor that will come with it, and can we expect a real performance
> gain from it (i.e. speeding up 20 times or more, like the 8087 did with the 8086)?


	As for myself... I would first worry about getting a "real" 
processor.  Wouldn't you agree that this is only reasonable ?

hjg@bunker.UUCP (Harry J. Gross) (05/08/87)

In article <3477@jade.BERKELEY.EDU> mic@lapis.berkeley.edu(Michel Bruneau) writes:
>It is a well-known fact that the 8086-8087 combination easily beats the
                                                        ^^^^^^^^^^^^

	Is this true?  Really?  Even using C (Microsoft)?  Why?  Can someone
explain this please?

>80286-80287 setup in terms of number-crunching speed (at least for the
>Fortran programs I use!). Now that the 80386 is out, is there a
>math coprocessor that will come with it, and can we expect a real performance
>gain from it (i.e. speeding up 20 times or more, like the 8087 did with the 8086)?

				Thanks
-- 
..!bunker\			|	This space reserved for a
..!phri\   \			|	particularly funny quotation
 ..!nyit!gor!hjg (Harry Gross)	|
..!helm/			|	All donations cheerfully examined

wtm@neoucom.UUCP (Bill Mayhew) (05/09/87)

In article <3477@jade.BERKELEY.EDU>, mic@lapis.berkeley.edu (Michel Bruneau) writes:
> It is a well-known fact that the 8086-8087 combination easily beats the
> 80286-80287 setup in terms of number-crunching speed

I was surprised when I ran the same floating point benchmark on an
AT&T 6300 with a NEC V-30 / 8087 and on an 8 MHz 1-wait-state Epson
Equity III, and discovered that the AT&T, with the V-30 replacing
its original 8086, performed at about 1.4 times the level
of the '286 machine!

Both tests were run from memory for supposed fairness.  In most
real life situations, the 286 machine beats the 6300 because it has
a faster disk and the 286 performs non-FPU instructions much more
efficiently because of its wider internal architecture, etc.

What is most disappointing is the number of programs that don't
offer FPU support options.


apn@nonvon.UUCP (root) (05/10/87)

in article <579@neoucom.UUCP>, wtm@neoucom.UUCP (Bill Mayhew) says:
> In article <3477@jade.BERKELEY.EDU>, mic@lapis.berkeley.edu (Michel Bruneau) writes:
> 
	I'm not going to go into the details of why the 8087 is
much faster than a 287, BUT... the rumour is that V70 will come WITH
an FPU on chip. Price... a mere 675.- in 100's avail 2Q87

farren@hoptoad.uucp (Mike Farren) (05/10/87)

In article <307@nonvon.UUCP> apn@nonvon.UUCP (root) writes:
>... the rumour is that V70 will come WITH
>an FPU on chip. Price... a mere 675.- in 100's avail 2Q87

The rumor is true, but it is necessary to add that the V70 is not
compatible with the '286 or the '386 at all.  Not much help if you're
looking to improve the performance of your AT or 386 machine.

-- 
----------------
                 "... if the church put in half the time on covetousness
Mike Farren      that it does on lust, this would be a better world ..."
hoptoad!farren       Garrison Keillor, "Lake Wobegon Days"

coffee@aero.UUCP (05/11/87)

In article <579@neoucom.UUCP> ...(Bill Mayhew) writes:
>In article <3477@jade.BERKELEY.EDU>, ..(Michel Bruneau) writes:
>> ..the 8086-8087 combination beats easily the 80286-80287 setup...
>
>I was surprised when I...discovered that the AT&T with the V-30...
>performed at about 1.4 times the level of the '286 machine!

There have been a lot of messages on this one, but so far as I know
no one has mentioned an awfully important fact. The 80286 takes the main
clock signal and divides its frequency by two, so that a 16 MHz crystal
drives the 286 at 8 MHz. The 80287, on the other hand, does a divide by
three, and so normally runs at 2/3 the speed of the co-processing 286.
The 8086 and 8087, on the other hand, both use the incoming clock signal
directly. According to Intel documents, there is no other functional
difference between an 8087 and an 80287: they are merely different packages
for the same basic stack machine. I believe this, because way back
when the first AT came out we ran a CPU-intensive benchmark and discovered
that a standard PC ran faster than a "6 MHz" AT by a ratio of almost
exactly 4.77 to 4 (i.e., the effective clock rates of the _numeric_ chips).
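
The clock arithmetic above can be sketched in a few lines. This is only an
illustration of the divide-by-two and divide-by-three figures given in this
post; the function name is made up, and the 12 MHz crystal for the "6 MHz" AT
is an assumption (it is simply what a divide-by-two 6 MHz CPU clock implies).

```python
# Sketch of the clock-divider arithmetic described above.
# Divider values (80286 /2, 80287 /3) are taken from the post; the function
# name and the 12 MHz crystal for a "6 MHz" AT are illustrative assumptions.

def at_clocks(crystal_mhz):
    """Return (cpu_mhz, fpu_mhz) for an AT-style 80286/80287 pair."""
    cpu_mhz = crystal_mhz / 2   # 80286 divides the main clock by two
    fpu_mhz = crystal_mhz / 3   # 80287 divides the same clock by three
    return cpu_mhz, fpu_mhz

cpu8, fpu8 = at_clocks(16.0)    # the 16 MHz crystal example from the post
cpu6, fpu6 = at_clocks(12.0)    # assumed crystal for a "6 MHz" AT
pc_fpu_mhz = 4.77               # 8087 in a standard PC runs at the CPU clock

print(f"8 MHz AT: CPU {cpu8:.2f} MHz, FPU {fpu8:.2f} MHz")
print(f"6 MHz AT: CPU {cpu6:.2f} MHz, FPU {fpu6:.2f} MHz")
print(f"PC 8087 vs 6 MHz AT 80287: {pc_fpu_mhz / fpu6:.2f}x")
```

Under these assumptions the 80287 in a "6 MHz" AT is clocked at 4 MHz, which
gives the 4.77 to 4 ratio (about 1.19) quoted above.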

The reason that this is important is that the 8087 and 80287 are internally
divided into an execution unit and, if I remember correctly, a control
unit: the first does the work, the second is an appointments secretary that
handles bus interface and such. This means that the FP work can take
place asynchronously; clever assembly programmers can decide for
themselves when they want to synchronize the two chips instead of accepting
the automatic FWAIT instructions generated by (I believe) most FP-supporting
compilers and assemblers. I believe there's at least one tiny add-on
board that carries an 8087 with its own crystal and plugs into
the FP socket to speed up FP work without affecting other aspects of the
machine's behavior.
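
A toy model of the overlap described above, offered as a sketch rather than a
measurement. The cycle counts are made up, the function names are
hypothetical, and the model assumes the CPU's own work does not need the FP
result and ignores any cost of starting the FP operation.

```python
# Toy model of CPU/FPU overlap. All cycle counts are hypothetical and chosen
# only to show the effect of where the synchronizing wait is placed.

def elapsed_wait_immediately(fpu_cycles, cpu_cycles):
    """CPU waits for the FPU right away, then does its own work."""
    return fpu_cycles + cpu_cycles

def elapsed_wait_late(fpu_cycles, cpu_cycles):
    """CPU does independent work first and synchronizes only afterwards."""
    return max(fpu_cycles, cpu_cycles)

fpu_cycles = 200   # hypothetical cost of one long FP operation
cpu_cycles = 120   # hypothetical independent integer work

print("wait immediately:", elapsed_wait_immediately(fpu_cycles, cpu_cycles))
print("wait late:       ", elapsed_wait_late(fpu_cycles, cpu_cycles))
```

With these invented numbers the delayed wait hides all of the integer work
behind the FP operation; how much is hidden in practice depends on how much
independent work the programmer can find.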

After normalizing for clock rate (_real_ clock rate) differences, I have
found the 8087 and 80287 to be effectively identical.

							- Peter C.

zu@ethz.UUCP (05/14/87)

In article <11502@aero.ARPA> coffee@aero.UUCP (Peter C. Coffee) writes:
>... The 80286 takes the main
>clock signal and divides its frequency by two, so that a 16 MHz crystal
>drives the 286 at 8 MHz. The 80287, on the other hand, does a divide by
>three, and so normally runs at 2/3 the speed of the co-processing 286.
>The 8086 and 8087, on the other hand, both use the incoming clock signal
>directly. According to Intel documents, there is no other functional
>difference between an 8087 and an 80287: they are merely different packages
>for the same basic stack machine. ...
>After normalizing for clock rate (_real_ clock rate) differences, I have
>found the 8087 and 80287 to be effectively identical.

While it may be true that an 80287 divides its incoming clock signal by three,
it's not true that the 80287 is a repackaged 8087.
Their speed difference comes from the way they access RAM (and therefore
their programs' opcodes). The 8087 directly monitors the data bus it shares
with the CPU. When the CPU fetches a floating point instruction, the 8087
recognizes the opcode at the same moment as the CPU; both decode the
incoming opcodes simultaneously. The 8086 does nothing with this opcode
except work out how many data bytes will follow. The next action the CPU takes
is to fetch the next byte. This goes on until all data for the last (floating
point) opcode has been fetched.
In the meantime, the 8087 takes those data bytes from the bus into its internal
parameter memory and then processes the command.

In contrast to that method, the 80287 isn't directly hooked to the data bus
anymore but gets its data (opcodes and data) via a special interface from the
CPU. This isn't true concurrent processing anymore. The CPU now also has the
task of sending the floating point instructions and their parameters to the
80287 AFTER fetching them from memory. While this extra work for the
80286 isn't that big, it lessens the performance gain relative to an 8086/8087
pair.
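
As a rough illustration of how a hand-off cost and the clock rate could
combine, here is a toy per-instruction model. Every cycle count in it is
hypothetical, and the function name is made up; only the clock rates (4.77 MHz
for a PC's 8087, 4 MHz for the 80287 in a "6 MHz" AT) are figures mentioned
earlier in this thread.

```python
# Toy per-instruction timing model. Every cycle count below is hypothetical;
# only the clock rates come from figures mentioned earlier in the thread.

def fp_time_us(exec_cycles, fpu_mhz, transfer_cycles=0, cpu_mhz=1.0):
    """Microseconds for one FP instruction: execution on the FPU plus any
    CPU cycles spent handing the opcode and operands to the coprocessor."""
    return exec_cycles / fpu_mhz + transfer_cycles / cpu_mhz

exec_cycles = 100      # hypothetical FPU execution cost
transfer_cycles = 12   # hypothetical 80286 hand-off cost

# 8087: watches the bus itself, so no hand-off term in this model.
t_8087 = fp_time_us(exec_cycles, fpu_mhz=4.77)
# 80287: slower clock plus the hand-off performed by the 6 MHz 80286.
t_287 = fp_time_us(exec_cycles, fpu_mhz=4.0,
                   transfer_cycles=transfer_cycles, cpu_mhz=6.0)

print(f"8087 : {t_8087:.2f} us")
print(f"80287: {t_287:.2f} us")
```

The model is additive on purpose: it treats the hand-off as time that cannot
overlap with FP execution, which is the pessimistic reading of the description
above.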

I don't know why Intel implemented that scheme but it may be because of the
prefetch queue of the 80286.
I guess (!) that the 80387 works like the 80287 in that respect.

	...urs zurbuchen

UUCP:    ...seismo!mcvax!cernvax!ethz!zu
BITNET:  K261819 @ CZHRZU1A

ben@catnip.UUCP (Bennett Broder) (05/19/87)

In article <88@bernina.UUCP> zu@bernina.UUCP (Urs Zurbuchen) writes:
>In contrast to that method, the 80287 isn't directly hooked to the data bus
>anymore but gets its data (opcodes and data) via a special interface from the
>CPU. This isn't true concurrent processing anymore. The CPU now also has the
>task of sending the floating point instructions and their parameters to the
>80287 AFTER fetching them from memory. While this extra work for the
>80286 isn't that big, it lessens the performance gain relative to an 8086/8087
>pair.
>
>I don't know why Intel implemented that scheme but it may be because of the
>prefetch queue of the 80286.

I think it's more likely that they didn't want to duplicate all the memory
management hardware the 80286 needs for protected mode operation.

-- 

Ben Broder
{ihnp4,decvax} !hjuxa!catnip!ben
{houxm,clyde}/