m5@lynx.UUCP (Mike McNally) (06/10/88)
According to the 386 programmer's manual, a "mov" instruction which sets a register to a 32-bit value takes 2 clocks. It bothers some part of me to think that it takes five bytes to set EAX to 1. It then dawned on me that I could clear EAX and increment it to 1 with the sequence

    xor EAX, EAX
    inc EAX

Three bytes. But the problem is, each of these two guys is 2 cycles, according to the book. But but, doesn't the "mov" kinda sorta take extra time too, like in terms of extra instruction fetch overhead? I know it's hard to figure out the costs, but one way or another the CPU has to fetch five bytes. So which is better?

--
Mike McNally of Lynx Real-Time Systems
uucp: lynx!m5 (maybe pyramid!voder!lynx!m5 if lynx is unknown)
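The byte counts in the question can be checked against the instruction encodings. The sketch below is only an illustration: it assumes 32-bit operand size (a 32-bit code segment, so no operand-size prefix is needed), and the names MOV_EAX_1, XOR_EAX_EAX, INC_EAX and sequence_size are made up for this example.

```python
# Machine-code bytes for each instruction, assuming 32-bit operand size
# (32-bit code segment, so no 0x66 operand-size prefix is required).
MOV_EAX_1 = bytes([0xB8, 0x01, 0x00, 0x00, 0x00])  # mov eax, 1   (B8 + 32-bit immediate)
XOR_EAX_EAX = bytes([0x31, 0xC0])                  # xor eax, eax (opcode 31, ModRM C0)
INC_EAX = bytes([0x40])                            # inc eax      (one-byte opcode 40)


def sequence_size(*instructions):
    """Total number of code bytes in a sequence of encoded instructions."""
    return sum(len(encoding) for encoding in instructions)


mov_size = sequence_size(MOV_EAX_1)
xor_inc_size = sequence_size(XOR_EAX_EAX, INC_EAX)

print("mov eax, 1            :", mov_size, "bytes")
print("xor eax, eax / inc eax:", xor_inc_size, "bytes")
```

Four of the five bytes in the mov form are the immediate operand, which is where the size difference comes from.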
toma@tekgvs.TEK.COM (Tom Almy) (06/13/88)
In article <3889@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:
>According to the 386 programmer's manual, a "mov" instruction which
>sets a register to a 32-bit value takes 2 clocks. It bothers some part
>of me to think that it takes five bytes to set EAX to 1. It then
>dawned on me that I could clear EAX and increment it to 1 with the
>sequence
>
> xor EAX, EAX
> inc EAX
>
>Three bytes. But the problem is, each of these two guys is 2 cycles,
>according to the book. But but, doesn't the "mov" kinda sorta take
>extra time too, like in terms of extra instruction fetch overhead? I
>know it's hard to figure out the costs, but one way or another the CPU
>has to fetch five bytes. So which is better?

Well, I tried it out by executing a long sequence of these instructions. Basically everything Mike said is correct! On a 20 MHz 386 system with 0 wait states (Everex 386/20):

    The MOV instruction method took 3 clocks because the pipeline is drained.
    The xor/inc sequence took 4 clocks.

By the way, I have a dummy program just for this sort of testing. All I have to do is change the instruction in the loop.

Now as to the question "which is better?" If you want the smallest program (after all, RAM prices are pretty high!) then the xor/inc sequence is better since it takes fewer bytes. If you want the fastest program (which is why you bought the 386, isn't it?) then mov is better. In fact, mov could take two clocks if, in the context where it is used, the pipeline does not empty.

Let's look at this issue from a practical perspective too. Is it worth your time to care which is better? Will one clock (50 nsec) per execution matter? And you can bet the rules will change for the 486.

Generally, you can just use whatever feels best!

Tom Almy
toma@tekgvs.tek.com
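To put the one-clock difference in perspective, the arithmetic can be sketched as follows. The clock counts (3 and 4) and the 20 MHz clock rate are the figures measured above; the function names and the one-million-execution figure are hypothetical, chosen only for illustration.

```python
def clock_period_ns(clock_mhz):
    """Duration of one clock cycle in nanoseconds for a given clock rate in MHz."""
    return 1000.0 / clock_mhz


def extra_time_ns(slow_clocks, fast_clocks, clock_mhz):
    """Extra time per execution of the slower sequence, in nanoseconds."""
    return (slow_clocks - fast_clocks) * clock_period_ns(clock_mhz)


MOV_CLOCKS = 3        # measured above: mov with the pipeline drained
XOR_INC_CLOCKS = 4    # measured above: xor/inc sequence
CLOCK_MHZ = 20        # measured above: 20 MHz 386, 0 wait states

per_execution_ns = extra_time_ns(XOR_INC_CLOCKS, MOV_CLOCKS, CLOCK_MHZ)

# Hypothetical workload: one million executions of the sequence.
executions = 1_000_000
total_extra_ms = executions * per_execution_ns / 1e6

print("extra time per execution:", per_execution_ns, "ns")
print("extra time over", executions, "executions:", total_extra_ms, "ms")
```

Even at a million executions, the slower sequence costs about 50 ms in total, so the choice only matters inside a very hot loop.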