decot@hpisod2.HP.COM (Dave Decot) (05/23/89)
> int a[MAX]; int a[MAX]; > int i; int *p; > for (i=0; i<MAX; ++i) for (p=&a[0]; p<&a[MAX]; ++p) > a[i] = 0; *p=0; > > ... > > ... On some compiler/machine combinations, this will run faster, > because the scaling operation and base/offset addition have been > eliminated; on others it may run slower, because a specific addressing > mode cannot be used. Note that the scaling operation has not necessarily been completely eliminated; it may have become hidden in "++p" because increments by 1 may be much faster than increments by sizeof(*p). Anyway, if one is interested in permitting high performance, one might as well give the compiler complete latitude to do this efficiently by rewriting this as: static struct { int i[MAX]; } a_, zero; int *a = &a_.i[0]; ... a_ = zero; :-) :-) :-) :-) :-) :-) :-) :-) Dave "Equal rights for arrays, NOW!" Decot
jcbst3@unix.cis.pittsburgh.edu (James C. Benz) (06/01/89)
In article <2550091@hpisod2.HP.COM] decot@hpisod2.HP.COM (Dave Decot) writes:
]] int a[MAX]; int a[MAX];
]] int i; int *p;
]] for (i=0; i<MAX; ++i) for (p=&a[0]; p<&a[MAX]; ++p)
]] a[i] = 0; *p=0;
]]
]] ...
]]
]] ... On some compiler/machine combinations, this will run faster,
]] because the scaling operation and base/offset addition have been
]] eliminated; on others it may run slower, because a specific addressing
]] mode cannot be used.
]
]Note that the scaling operation has not necessarily been completely
]eliminated; it may have become hidden in "++p" because increments by 1
]may be much faster than increments by sizeof(*p).
]
Also, the scaling operation has simply been moved to the "for" line, as
far as I can tell. The machine still has to dereference &a[MAX] on each
iteration of the loop, doesn't it? It has to perform the test on each
iteration to see if p is < &a[MAX], and unless the optimizer is really
on top of things, it has no way of generating the comparison address
as a constant. It really has no way of knowing that &a[MAX] won't
change during the course of the program, so it must be re-computed on
each iteration of the loop. Seems to me that computing &a[MAX] will
take just as long and just as many arithmetic operations as computing
the address of a[i], with the minor difference between using a constant
as index as opposed to a variable. In any situation I hope to encounter
in the kind of work I do, (database admin) I will trade off the readability
of the first example against the run time efficiency of the second, and
choose readability every time.
--
Jim Benz jcbst3@unix.cis.pittsburgh.edu If a modem
University of Pittsburgh answers,
UCIR (412) 648-5930 hang up!
tim@crackle.amd.com (Tim Olson) (06/06/89)
In article <18229@unix.cis.pittsburgh.edu> jcbst3@unix.cis.pittsburgh.edu (James C. Benz) writes: | In article <2550091@hpisod2.HP.COM] decot@hpisod2.HP.COM (Dave Decot) writes: | ]] int a[MAX]; int a[MAX]; | ]] int i; int *p; | ]] for (i=0; i<MAX; ++i) for (p=&a[0]; p<&a[MAX]; ++p) | ]] a[i] = 0; *p=0; | ]] | ]] ... | ]] | ]] ... On some compiler/machine combinations, this will run faster, | ]] because the scaling operation and base/offset addition have been | ]] eliminated; on others it may run slower, because a specific addressing | ]] mode cannot be used. | ] | ]Note that the scaling operation has not necessarily been completely | ]eliminated; it may have become hidden in "++p" because increments by 1 | ]may be much faster than increments by sizeof(*p). | ] | Also, the scaling operation has simply been moved to the "for" line, as | far as I can tell. Actually, the "scaling operation" has been strength-reduced from a shift or multiply in the inner-loop to a constant add (which was already required to increment the variable "i"). | The machine still has to dereference &a[MAX] on each | iteration of the loop, doesn't it? No. The address of a[MAX] is compared with the current pointer without referencing the value at a[MAX]. | It has to perform the test on each | iteration to see if p is < &a[MAX], and unless the optimizer is really | on top of things, it has no way of generating the comparison address | as a constant. It really has no way of knowing that &a[MAX] won't | change during the course of the program, so it must be re-computed on | each iteration of the loop. Not so in this example -- the array "a" is declared as "int a[MAX]", so the compiler knows that "a" is a constant lvalue which can never change. However, if "a" were actually a pointer, you are correct -- the value of "&a[MAX]" could potentially change if the value of "a" is changed in the loop. -- Tim Olson Advanced Micro Devices (tim@amd.com)
guy@auspex.auspex.com (Guy Harris) (06/07/89)
>Also, the scaling operation has simply been moved to the "for" line, as >far as I can tell. The machine still has to dereference &a[MAX] on each >iteration of the loop, doesn't it? I presume you mean "compute" rather than "dereference". (I should *hope* it doesn't dereference the pointer value "&a[MAX]"; since "a" has "MAX" elements, the last element is "&a[MAX-1]", so dereferencing "&a[MAX]" could cause all sorts of problems....) >It really has no way of knowing that &a[MAX] won't change during the >course of the program, Say what? It knows that "MAX" is a constant (assuming this is vanilla C, and not, say, C as extended by compilers such as GCC that permit you to declare automatic arrays of variable size), and it knows that "a" isn't going to move, so it knows that "&a[MAX]" will always be the same address.
karl@haddock.ima.isc.com (Karl Heuer) (06/07/89)
In article <18229@unix.cis.pittsburgh.edu> jcbst3@unix.cis.pittsburgh.edu (James C. Benz) writes: >[someone writes about the loop: for (p=&a[0]; p<&a[MAX]; ++p) ...] >The machine still has to dereference &a[MAX] on each iteration of the loop, >doesn't it? There's no dereference involved. It does have to *evaluate* &a[MAX]. >unless the optimizer is really on top of things, it has no way of generating >the comparison address as a constant. It really has no way of knowing that >&a[MAX] won't change during the course of the program, so it must be >re-computed on each iteration of the loop. The value "&a[MAX]" cannot change during the scope of the identifier "a". On a VAX-like architecture, the array a (which has block scope and auto storage duration) is commonly implemented as a constant offset from a frame pointer; the additional offset MAX*sizeof(int) will be absorbed by constant folding. The only non-obvious optimization is to keep the value &a[MAX] in a register. >In any situation I hope to encounter in the kind of work I do, (database >admin) I will trade off the readability of the first example against the run >time efficiency of the second, and choose readability every time. I don't necessarily agree that pointers are less understandable than indexing, but I agree that readability/maintainability should be the first concern. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
chris@mimsy.UUCP (Chris Torek) (06/07/89)
[context: int a[MAX], *p; for (p = &a[0]; p < &a[MAX]; p++) ...] In article <18229@unix.cis.pittsburgh.edu> jcbst3@unix.cis.pittsburgh.edu (James C. Benz) writes: >The machine still has to dereference &a[MAX] on each iteration of the >loop, doesn't it? It has to perform the test on each iteration to see >if p is < &a[MAX], and unless the optimizer is really on top of things, >it has no way of generating the comparison address as a constant. No: on a typical machine, with a stack pointer (and without alloca(), or with a frame pointer and optionally with alloca()), the `for' loop generates code of the form: lea sp@(-8),a2 | compute &a[MAX] lea sp@(-100),a3 | p = &a[0]; loop: cmp a3,a2 | p < [constant] jcc out | no, exit loop ... jbr loop out: `a' is not a modifiable lvalue (although it is not `constant', since the array is local to some function), and MAX is a constant, so &a[MAX] cannot change during any iteration of the loop. Actually, many machines have a `count down to zero' loop form which may be still more efficient, and an optimising compiler (such as `gcc -O -fstrength-reduce') may change the loop to for (p = &a[0], temp = sizeof a/sizeof *a; --temp >= 0; p++) ... use *p ... which contains both a strength reduction and a variable substitution over the original for (i = 0; i < MAX; i++) ... use a[i] ... but does require that `i' not be used for anything except the subscript calculation. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris