[comp.arch] Need information about 386 performance

paf@unixprt.UUCP (Paul Fronberg) (05/10/89)

I am doing analysis of the i386 for use in a controller and am having
problems reconciling calculated timings with measured timings. Does anyone
know of any documentation, ap-notes, or such that might be available
from Intel that describes the inner workings of the i386, especially how
the various phases of the pipeline interact?

The measured time and calculated time are very different for several
test code fragments. (The measurements were made on a 386 Unix box with 0 wait
state memory and a 32 bit bus.) I suspect I am seeing collisions between instruction
prefetch and instruction memory accesses. Things seem to get very complicated
when the MMU is activated and the memory is not 0 wait state, and I am not sure
that the pipeline timing diagrams I am generating are anywhere near correct.
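
To make the suspected prefetch/memory contention concrete, here is a minimal
sketch of a bus-occupancy bracket. It is not Intel's model. It assumes a
2-clock minimum bus cycle, 4 bytes per aligned code fetch, and no MMU or
refresh effects; the function names and the example numbers (20 execution
clocks, 24 code bytes, 3 data accesses) are hypothetical.

```python
import math


def bus_clocks(code_bytes, data_accesses, wait_states,
               bytes_per_fetch=4, base_bus_clocks=2):
    """Clocks the bus is busy for one pass through a code fragment.

    Assumptions (not from the thread): each bus cycle takes
    base_bus_clocks plus wait_states clocks, and every code fetch
    brings in bytes_per_fetch aligned bytes.
    """
    fetches = math.ceil(code_bytes / bytes_per_fetch)
    return (fetches + data_accesses) * (base_bus_clocks + wait_states)


def timing_bounds(exec_clocks, code_bytes, data_accesses, wait_states):
    """Return (best, worst) clock estimates under this crude model.

    best:  prefetch overlaps execution perfectly, so the slower of the
           execution unit and the bus sets the pace.
    worst: no overlap at all. This is deliberately pessimistic, since the
           manual's clock counts may already include zero-wait data cycles.
    """
    bus = bus_clocks(code_bytes, data_accesses, wait_states)
    return max(exec_clocks, bus), exec_clocks + bus


if __name__ == "__main__":
    # Hypothetical fragment: 20 execution clocks, 24 code bytes, 3 data accesses.
    for ws in (0, 1, 2):
        best, worst = timing_bounds(20, 24, 3, ws)
        print(f"{ws} wait states: between {best} and {worst} clocks")
```

A measurement well outside such a bracket would point at effects the sketch
ignores (MMU activity, misaligned fetches, refresh) rather than at simple
prefetch contention.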

The hardware and software reference manuals are very skimpy when it comes
to information of this type. The timing information given in the programmer's
reference manual gives only execution cycles, assuming that the instruction
has already been prefetched and decoded. There seems to be no information concerning
the effects of the MMU, addressing, etc.
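
For comparing the two kinds of figures, a minimal sketch of the arithmetic
follows: convert a measured wall-clock time into clocks per iteration and
divide by the sum of the manual's execution-cycle counts. The function names,
the 16 MHz clock, and all sample numbers are illustrative assumptions, not
measurements from this post.

```python
def measured_clocks_per_iteration(elapsed_seconds, overhead_seconds,
                                  iterations, clock_hz):
    """Convert a timed run into processor clocks per iteration.

    overhead_seconds is the time of the same loop with an empty body,
    so that loop control is not charged to the fragment under test.
    """
    return (elapsed_seconds - overhead_seconds) * clock_hz / iterations


def discrepancy(measured_clocks, calculated_clocks):
    """Ratio of measured to calculated clocks (1.0 means they agree)."""
    return measured_clocks / calculated_clocks


if __name__ == "__main__":
    # Hypothetical run: 1,000,000 iterations on an assumed 16 MHz part.
    measured = measured_clocks_per_iteration(
        elapsed_seconds=1.75, overhead_seconds=0.25,
        iterations=1_000_000, clock_hz=16_000_000)
    calculated = 16  # hypothetical sum of per-instruction execution cycles
    print(f"measured {measured:.1f} clocks, calculated {calculated}, "
          f"ratio {discrepancy(measured, calculated):.2f}")
```

A ratio well above 1.0 would be consistent with costs the execution-cycle
figures leave out, such as prefetch, decode, and wait states.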

Any help or information would be most appreciated. 

Paul Fronberg

henry@utzoo.uucp (Henry Spencer) (05/12/89)

In article <390@unixprt.UUCP> paf@unixprt.UUCP (Paul Fronberg) writes:
>I am doing analysis of the i386 for use in a controller and am having
>problems reconciling calculated timings with measured timings. Does anyone
>know of any documentation, ap-notes, or such that might be available
>from Intel that describes the inner workings of the i386, especially how
>the various phases of the pipeline interact?

Do remember that any such documentation has a good chance of being specific
to a particular release of the chip, which may not be the one you've got
(or the one that will be available when your design goes into production).
This is the sort of thing that manufacturers will often fiddle with as
time goes by, especially to fix bugs.

>The measured time and calculated time are very different for several
>test code fragments...
>... I suspect I am seeing collisions between instruction
>prefetch and instruction memory accesses. Things seem to be very complicated...

Things are sufficiently complicated that what you're trying to do may well
be impossible in a practical sense.  It's very hard to compute accurate
and precise timings for modern CISC machines, and even RISC designs make
it only somewhat easier.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu