[comp.sys.intel] mov reg,imm32

m5@lynx.UUCP (Mike McNally) (06/10/88)

According to the 386 programmer's manual, a "mov" instruction which
sets a register to a 32-bit value takes 2 clocks.  It bothers some part
of me to think that it takes five bytes to set EAX to 1.  It then
dawned on me that I could clear EAX and increment it to 1 with the
sequence

	xor		EAX, EAX
	inc		EAX

Three bytes.  But the problem is, each of these two guys is 2 cycles,
according to the book.  But but, doesn't the "mov" kinda sorta take
extra time too, like in terms of extra instruction fetch overhead?  I
know it's hard to figure out the costs, but one way or another the CPU
has to fetch five bytes.  So which is better?

-- 
Mike McNally of Lynx Real-Time Systems

uucp: lynx!m5 (maybe pyramid!voder!lynx!m5 if lynx is unknown)

toma@tekgvs.TEK.COM (Tom Almy) (06/13/88)

In article <3889@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:
>According to the 386 programmer's manual, a "mov" instruction which
>sets a register to a 32-bit value takes 2 clocks.  It bothers some part
>of me to think that it takes five bytes to set EAX to 1.  It then
>dawned on me that I could clear EAX and increment it to 1 with the
>sequence
>
>	xor		EAX, EAX
>	inc		EAX
>
>Three bytes.  But the problem is, each of these two guys is 2 cycles,
>according to the book.  But but, doesn't the "mov" kinda sorta take
>extra time too, like in terms of extra instruction fetch overhead?  I
>know it's hard to figure out the costs, but one way or another the CPU
>has to fetch five bytes.  So which is better?


Well, I tried it out by executing a long sequence of these instructions.
Basically everything Mike said is correct!  On a 20 MHz 386 system with 0 wait
states (Everex 386/20), the MOV method took 3 clocks rather than the book's 2,
because a long run of 5-byte instructions drains the pipeline (instruction
fetch cannot keep up).  The xor/inc sequence took 4 clocks.

By the way, I have a dummy program just for this sort of testing.  All I have
to do is change the instruction in the loop.

Now, as to the question "which is better?":

If you want the smallest program (after all, RAM prices are pretty high!) then
the xor/inc sequence is better since it takes three bytes instead of five.

If you want the fastest program (which is why you bought the 386, isn't it?)
then mov is better.  In fact, mov could take just two clocks if, in the context
where it is used, the pipeline does not empty.
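
For the size side of the trade-off, here is a small sketch of the byte counts
in C.  The encodings assume a 32-bit code segment (so no operand-size prefix
is needed), and the names mov_eax_1, xor_inc_eax and bytes_saved are made up
for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Encodings assume a 32-bit code segment, so no operand-size prefix. */
static const unsigned char mov_eax_1[] = {
    0xB8, 0x01, 0x00, 0x00, 0x00    /* mov eax, 1: opcode B8 + 32-bit immediate */
};
static const unsigned char xor_inc_eax[] = {
    0x31, 0xC0,                     /* xor eax, eax */
    0x40                            /* inc eax */
};

/* Bytes saved by the xor/inc sequence relative to the mov. */
static size_t bytes_saved(void)
{
    /* Guard against unsigned wrap-around if the tables are ever edited. */
    assert(sizeof mov_eax_1 >= sizeof xor_inc_eax);
    return sizeof mov_eax_1 - sizeof xor_inc_eax;
}
```

Calling bytes_saved() gives the two-byte difference between the five-byte mov
and the three-byte xor/inc pair.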

Let's look at this issue from a practical perspective too.  Is it worth your
time to care which is better?  Will one clock (50 ns) per execution matter?
And you can bet the rules will change for the 486.  Generally, you can just
use whatever feels best!
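
To put the one-clock difference in perspective, here is a rough sketch of the
arithmetic in C.  It assumes the 20 MHz clock and the measured 3 and 4 clock
figures above; the helper name clocks_to_ns is made up for illustration:

```c
#include <assert.h>

/* Convert a clock count to nanoseconds for a CPU running at `mhz` MHz.
   Integer division truncates; the result is exact at 20 MHz (50 ns/clock). */
static unsigned long clocks_to_ns(unsigned long clocks, unsigned long mhz)
{
    assert(mhz > 0);
    return clocks * 1000UL / mhz;
}
```

With these assumptions, clocks_to_ns(3, 20) is 150 ns for the mov and
clocks_to_ns(4, 20) is 200 ns for xor/inc.  Even a million executions of the
slower sequence cost only clocks_to_ns(1000000UL, 20) = 50,000,000 ns, or
50 ms, extra.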


Tom Almy
toma@tekgvs.tek.com