kds@mipos2.intel.com (Ken Shoemaker ~) (08/30/88)
I played around with the Duff block move thingy on the Sun 386i box and got some pretty predictable results. If you take the program as written, the Duff block move takes ~1/2 as long as the normal block move. However, if you go in and do a little assembly language hack, taking advantage of the repeat move string instruction, you speed up the move considerably, such that the Duff code takes ~1.8 times as long as the assembly string move. In other words:

    C string move:              2.0
    Duff string move:           1.0
    Assy string move:           0.55

where the numbers are normalized times. The assembly language hack amounted to adding 3 lines of assembly code and removing the code the compiler generated for the block move.

Another thing that I noticed when looking at the assembly language for the string move part of the Duff move was that the block of code generated for each of the "cases" does almost exactly what the string move instruction does, since the array pointer variables are declared as register type. The normalized time for this version of the Duff string move, with the string move instruction in place of that code, is:

    Modified Duff string move:  0.69

Yet another thing that can be done is to let the compiler generate the iteration variable, but to use the string move instruction to do the work of the loop. Note that in this case, the compiler puts the iteration variable in memory, not in a register. The normalized time for this is:

    Modified C string move:     1.6

For the curious, the assembly language generated by the compiler to do the string move is:

    movl  %edi,%eax
    addl  $4,%edi
    movl  %esi,%edx
    addl  $4,%esi
    movl  (%edx),%ecx
    movl  %ecx,(%eax)

which, as far as the business end is concerned, is the same as a single

    movsl

Have fun...

-------------------
If you break a law to prove a law, you're on pretty shaky moral grounds
	-- Ian Shoales

Ken Shoemaker, Microprocessor Design, Intel Corp., Santa Clara, California
uucp: ...{hplabs|decwrl|amdcad|qantel|pur-ee|scgvaxd|oliveb}!intelca!mipos3!kds
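
For anyone without the test program at hand, the two C variants being compared might look roughly like the sketch below. This is an assumption, not the code that was timed: the names plain_move and duff_move, the 32-bit element type, and the 8-way unrolling are illustrative choices, and unlike Duff's original (which wrote to a single output register) both pointers advance here, as in the block move discussed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; not taken from the posted benchmark. */

/* Plain block move: one 32-bit word copied per loop iteration. */
void plain_move(int32_t *to, const int32_t *from, size_t count)
{
    while (count-- > 0)
        *to++ = *from++;
}

/* Duff-style block move: up to 8 words copied per loop iteration.
 * The switch jumps into the middle of the unrolled loop so that the
 * first pass handles the count % 8 leftover words. */
void duff_move(int32_t *to, const int32_t *from, size_t count)
{
    size_t n;

    if (count == 0)
        return;                 /* the do-while below assumes count > 0 */

    n = (count + 7) / 8;        /* number of passes through the loop */
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Each `*to++ = *from++` statement is the part that corresponds to a single movsl, and the whole of plain_move is the part the repeat move string instruction can replace.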