[comp.sys.intel] Concurrent Processing with the 387 DX ??

chris@alderan.uucp (Christoph Splittgerber) (03/11/91)

In Intel's "387 DX User's Manual" (Programmer's Reference), I found the
following in Chapter 5.2:

Because the 386 DX CPU and the 387 DX NPX have separate execution
units, it is possible for the NPX to execute numeric instructions in
parallel with instructions executed by the CPU. [] No special
programming techniques are required to gain advantages of concurrent
execution; ... etc.

So I wrote two very small test functions to prove this. Something like:

1)
        .
	.data
constant:
	.double 1.3456        / what ever
result:
	.double 0
	.text
	.align 4
	fldl constant
	/ followed by about 300 clocks of 80386 instructions
	fcos
	fstl result
	.
	.

2)
        .
	.data
constant:
	.double 1.3456        / what ever
result:
	.double 0
	.text
	.align 4
	fldl constant
	fcos
	/ followed by about 300 clocks of 80386 instructions
	fstl result
	.
	.

In the first function, the 300 clocks of 80386 instructions come after the
"fld", which requires 25 FPU clocks. That means about 270 of those 386 clocks
are executed while the FPU does nothing, right?
In the second function, the 300 clocks of 80386 instructions come after the
"fcos", which takes between 200 and 800 FPU clocks. That means those 300
80386 clocks should be executed while the cosine is being computed, no?
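
The expectation above can be put into numbers with a small idealized model.
This is only a sketch under stated assumptions, not a description of how the
386 DX/387 DX pair actually behaves: it assumes that the CPU is free again
immediately after issuing an NPX instruction, that the CPU stalls only when it
issues another NPX instruction while the NPX is still busy, and that the final
"fst" cannot start before "fcos" has finished. The function names are
hypothetical; the clock counts passed in are the ones quoted in this post.

```c
/* Idealized overlap model (an assumption, not measured behaviour). */

long max_clocks(long a, long b)
{
    return a > b ? a : b;
}

/* Function 1: fld, CPU work, fcos, fst.
   Returns the clock at which fst can start. */
long variant1_clocks(long fld, long cpu_work, long fcos)
{
    long npx_free = fld;            /* fld issued at t = 0              */
    long t = cpu_work;              /* CPU work overlaps the fld        */
    t = max_clocks(t, npx_free);    /* fcos waits until the NPX is free */
    return t + fcos;                /* fst waits for the fcos result    */
}

/* Function 2: fld, fcos, CPU work, fst.
   Returns the clock at which fst can start. */
long variant2_clocks(long fld, long cpu_work, long fcos)
{
    long t = fld;                   /* fcos cannot start before fld ends */
    long npx_free = t + fcos;       /* NPX busy with the cosine          */
    t += cpu_work;                  /* CPU work overlaps the fcos        */
    return max_clocks(t, npx_free); /* fst waits for whichever is later  */
}
```

With fld = 25 and cpu_work = 300, this model gives 500 versus 325 clocks for
fcos = 200, and 1100 versus 825 clocks for fcos = 800. So under ideal overlap
the second function should finish roughly 175 to 275 clocks earlier, which is
the difference I expected to see.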

The thing is: I could *NOT* measure any difference in execution speed.
NOT EVEN THE SLIGHTEST DIFFERENCE.

So, what am I doing wrong? Any ideas?

              Chris
-- 
************************ Brain fault (core dumped) *************************
Replies-To:  chris@alderan.uucp        UUCP: uunet!mcsun!unido!alderan!chris 
Phone:       +49 711 344375            Fax:  +49 711 3460684