chip@dartvax.UUCP (Brig ) (04/28/84)
I sent out a note a while ago about Intel's inaccurate timings for the 8088. Chuck McManis of Intel replied that I was simply misreading the data sheets. Not so. Let me be a little more specific about the inaccuracies.

Let's consider a specific example: how long should it take to shift a register left one bit (SHL AX,1)? According to the book, this takes 2 clock cycles. NOT 2 bus cycles, as Mr. McManis alleges. If you think for a bit, you'll see that bus cycles are pretty irrelevant to any instruction which doesn't reference memory. And the instruction CAN take as little as 2 clocks, under ideal conditions. The only way I've found to get it to execute as fast as advertised is to put it between two multiply instructions; that way it's guaranteed to be pre-fetched, and not to cause the instruction queue to run dry later on.

The problem is that fetching the instructions takes time, despite the 4-byte lookahead. It takes one bus cycle (4 clocks) to fetch a single byte. The SHL is a 2-byte instruction, and so takes 8 clocks to fetch. If you have a long series of SHL instructions, they'll each take 8 clock cycles, no matter that they supposedly execute in 2.

Intel's documentation recognizes this effect. The 8088 book says "A series of fast executing (fewer than two clocks per opcode byte) instructions can drain the queue and increase execution time", and "With typical instruction mixes, the time actually required to execute a sequence of instructions will typically be within 5-10% of the sum of the individual timings given in the instruction set sequence". Both these statements seem to be simply false. The first should say "four clocks per opcode byte", which means almost all instructions are "fast". The two-clock figure may be true for the 8086, which fetches 2 bytes at a time, but this statement comes from the book about the 8088 specifically.
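The fetch arithmetic above can be sketched as a small model. This is only a rough illustration in Python, not anything from Intel's documentation. It assumes a long run of identical register-only instructions, no wait states, and a bus that does nothing but instruction fetches. The function names are made up for the example, and the 118-clock figure for a 2-byte multiply is an assumed round number for "a slow instruction", not a quoted timing.

```python
# Rough steady-state model of a fetch-limited instruction stream.
# Assumptions (not from the Intel books): identical instructions back to
# back, no memory operands, no wait states, bus idle except for fetches.

CLOCKS_PER_BUS_CYCLE = 4  # one bus cycle is 4 clocks, as stated in the post


def fetch_clocks(length_bytes, bytes_per_bus_cycle=1):
    """Clocks needed to fetch one instruction of the given length.

    bytes_per_bus_cycle is 1 for the 8088 and 2 for a part that fetches
    2 bytes at a time (the 8086, per the post).
    """
    return length_bytes * CLOCKS_PER_BUS_CYCLE / bytes_per_bus_cycle


def effective_clocks(exec_clocks, length_bytes, bytes_per_bus_cycle=1):
    """Per-instruction cost in a long run: the slower of execute and fetch."""
    return max(exec_clocks, fetch_clocks(length_bytes, bytes_per_bus_cycle))


def is_fetch_bound(exec_clocks, length_bytes, bytes_per_bus_cycle=1):
    """True when the queue drains, i.e. fetching is slower than executing."""
    return fetch_clocks(length_bytes, bytes_per_bus_cycle) > exec_clocks


def overhead_percent(exec_clocks, length_bytes, bytes_per_bus_cycle=1):
    """How far the modelled cost exceeds the listed execution time, in %."""
    eff = effective_clocks(exec_clocks, length_bytes, bytes_per_bus_cycle)
    return 100 * (eff - exec_clocks) / exec_clocks


if __name__ == "__main__":
    # SHL AX,1: listed at 2 clocks, 2 bytes long.
    print("SHL on 8088:", effective_clocks(2, 2), "clocks")
    print("SHL with 2-byte fetches:", effective_clocks(2, 2, 2), "clocks")
    # Hypothetical slow 2-byte instruction (assumed 118 clocks).
    print("slow instruction fetch bound?", is_fetch_bound(118, 2))
```

Under these assumptions a run of SHL AX,1 costs 8 clocks each on the 8088 and 4 clocks each with 2-byte fetches, while the slow instruction is never fetch bound. In this model an instruction drains the queue whenever it executes in fewer than 4 clocks per byte on the 8088, which is the threshold argued for above. Real instruction mixes are messier, so the model says nothing about what overhead is typical.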
As to the 5-10% number, I've found 80% to be typical, and have not managed to construct ANY practical sequence where the discrepancy is as little as 10%. (Yes, a series of multiply instructions WILL perform as advertised, i.e. very slowly.)

Source: "iAPX 88 Book", Intel, July 81. The newest edition, just seen in a bookstore, does not correct these 'mistakes'.

Please don't insult your customers, Mr. McManis. They already feel insulted enough that they have to write code for the 808x series.