rang@cs.wisc.edu (Anton Rang) (09/06/90)
In article <1589@redsox.bsw.com> campbell@redsox.bsw.com (Larry Campbell) writes: > /* Fragment 1 */ > for (p = array; p < &array[ARRAY_SIZE]; p++) > *p++ = '\0'; ^^^^ ^^^ presumably only one of these increments is wanted... > /* Fragment 2 */ > for (i = 0; i < ARRAY_SIZE; i++) > array[i] = '\0'; The MIPS compiler generates identical code for these (and two similar cases I checked). GNU C on an MC68020, which usually generates better code than any other 68000-series compiler I've seen, uses array operations for the second loop (at least 1.37.1 does). >If the answer to my question is "yes", then I think we should flame the >culpable compiler vendors to a crisp. Yep...it's time for compiler authors to stop putting so much of the burden on programs, IMHO. (Then again, I like Pascal too... :-) Anton +---------------------------+------------------+-------------+ | Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison | +---------------------------+------------------+-------------+
rang@cs.wisc.edu (Anton Rang) (09/07/90)
In article <1990Sep6.203031.29254@laguna.ccsf.caltech.edu>, jimmy@tybalt.caltech.edu (Jimmy Hu) writes: >I would expect the two fragments to generate different codes. [ ... ] >I don't think that the compiler will "convert" or "optimize" the >second to the first or vice versa. It should, if the first will run faster--and on most modern machines, it will. On machines with autoincrement, the pointer version becomes something like: loop: clr.b (r0)+ ; Clear a byte, autoincrement R0 cmp.l r0,r1 ; Are we at the end of the array yet? bleq loop ; No, keep going The second loop, if compiled naively, generates code more like loop: clr.b (r0)[d0.l] ; Clear a byte using indexed addressing inc.l d0 ; Increment i cmp.l d0,d1 ; Done yet? bleq loop ; No, not yet This is likely to run slower on nearly all machines. (The above code is intended to reflect a typical CISC machine, say a 68020 or VAX.) >where is it going to get a pointer from? Create one? Yes, it should. >How will i become ARRAYSIZE after exiting the loop? If i is used after the loop, there should be an instruction *after* the loop which moves ARRAYSIZE into i. If i is used inside the loop for something, the compiler has to decide between using indexed addressing for the array, or maintaining an address register synchronized to i (if i isn't assigned to within the loop, and there are extra registers available, this is fairly easy). >I think that the first fragment will run faster on most machines, >since the address (array+i) doesn't have to be computed at each step. Exactly; this is why we want the compiler to convert the array version into the pointer version. That way, programmers can write it more clearly (or at least, perhaps more clearly to them), and it will still run at the same speed. (On some machines, though I don't know of any, it could even be that the pointer version would be slower, and the compiler should then convert case (1) into (2).) This particular optimization is well-known, and has been used in FORTRAN compilers for some time. It's called "induction variable elimination"; check out pp. 466-471 of the original Dragon book for an explanation and (simplistic) algorithms. (It's easier in FORTRAN since aliasing is rarer.) Anton +---------------------------+------------------+-------------+ | Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison | +---------------------------+------------------+-------------+