[comp.arch] Fancy indexing

lgy@pupthy.PRINCETON.EDU (Larry Yaffe) (07/15/88)

In article <1384@claude.oakhill.UUCP> wca@oakhill.UUCP (william anderson) writes:
	[comments on E. Killian's coding of FFT inner loop]

+One problem with this code is that it assumes the "stride" of the loop
+(the variable "n1" in the C code segment above) is unity!
+What about the code for the inner loop in the general case?  What
+effect does the assumption of non-unity stride have on the MIPS loop
+timing?   [...]

    This FFT code is a typical example where it is worthwhile to
move data explicitly, outside the inner loop, in order to keep the
indexing in the inner loop as simple as possible.  The overhead of
moving the data is small compared to the work in the inner loop (for
any calculation of substantial size).  In my experience, even on
machines whose hardware supports scaled indexing, the simplest
addressing modes are significantly faster, enough so that using the
fancy modes is non-optimal.  Matrix multiplication is the classic
example of this.
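
    For the matrix multiplication case, a minimal sketch of what I
mean might look like the following.  This is only an illustration, not
the FFT code under discussion: the function name "matmul_gather", the
row-major layout, and the caller-supplied scratch buffer "col" are all
assumptions made for the example.  Each column of B (stride n) is
copied once into contiguous storage, so the inner loop walks both
operands with unit stride.

```c
#include <stddef.h>

/* Hypothetical helper: C = A * B for n x n row-major matrices.
 * col is caller-supplied scratch space holding n doubles. */
void matmul_gather(size_t n, const double *a, const double *b,
                   double *c, double *col)
{
    for (size_t j = 0; j < n; j++) {
        /* Gather column j of B, which has stride n, into
         * contiguous storage.  This costs n moves per column. */
        for (size_t k = 0; k < n; k++)
            col[k] = b[k * n + j];

        for (size_t i = 0; i < n; i++) {
            const double *row = a + i * n;
            double sum = 0.0;

            /* Inner loop: both operands are read with unit stride,
             * so only the simplest addressing mode is needed. */
            for (size_t k = 0; k < n; k++)
                sum += row[k] * col[k];

            c[i * n + j] = sum;
        }
    }
}
```

The gather costs n moves per column, n*n in total, against n*n*n
multiply-adds in the inner loop, which is why the overhead stops
mattering once n is reasonably large.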

+       /\        /\ 		William C. Anderson
+      //\\      //\\		Member of the M88000 Design Group
+     ///\\\    ///\\\		Motorola Microprocessor Division
+    //    \\  //    \\		Oak Hill, TX.