[comp.lang.c] array indexing styles. was: Absolute addressing in Turbo-C

hunt@spar.SPAR.SLB.COM (Neil Hunt) (09/03/87)

In article <9140@brl-adm.ARPA> BREEBAAR%HLERUL5.BITNET@wiscvm.wisc.EDU writes:
>[...]
>It's this: Just WHY is it in C 'better' to write: *(a+n) instead of: a[n],
>where 'a' is an array of something? And how much 'better' is it?
>I remember reading on this list some time back that this does not extend to
>arrays of more than one dimension. Why? Just because the expressions will
>get too complex?

Neither form is better than the other in every case. In fact the notations
are 'completely equivalent' as far as the compiler is concerned:

K&R page 94:

	... Rather more surprising, at least at first sight, is the fact
	that a reference to a[i] can also be written as *(a+i). In
	evaluating a[i], C converts it to *(a+i) immediately; the two
	forms are completely equivalent.
		  ^^^^^^^^^^^^^^^^^^^^^
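
The equivalence quoted above can be checked mechanically. The following is
only a sketch; the function name same_element is made up for illustration
and is not from K&R or the original question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a[i] and *(a+i) refer to the
 * same object (same address, same value) for every i below n. */
int same_element(const int *a, size_t n)
{
    size_t i;

    assert(a != NULL);
    for (i = 0; i < n; i++) {
        if (&a[i] != a + i)       /* the two spellings give one address */
            return 0;
        if (a[i] != *(a + i))     /* and therefore one value */
            return 0;
    }
    return 1;
}
```

Calling same_element on any int array with its length should return 1.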

To the human reader, however, the two forms are of course different,
and whichever form is clearest in the context is the correct form
to use.

If you are addressing a data structure which is clearly and simply
expressed as an array, and you want the i'th element, then the
array subscripting form is most probably the one to use.

However, the confusion probably arises from the statement:

K&R page 93

	Any operation which can be achieved by array subscripting can also
	be done with pointers. The pointer version will in general be faster,
				   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	but at least to the uninitiated, somewhat harder to grasp immediately.

When indexing an array as a[i] (or the equivalent form *(a+i)), the
compiler generates code to add to the address of the start of the
array the value i multiplied by the size of the element to which a points,
or of which a is an array. It is this multiply which takes time,
and if it can be eliminated, the code will be faster. Pointers
provide a means to eliminate the multiply, and access the elements
of an array in a natural fashion. Consider the loop:

	for(i = 0; i < 10; i++)				-- 1
		a[i] = 0;

This is equivalent to:

	for(i = 0; i < 10; i++)				-- 2
		*(a+i) = 0;
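
The scaling hidden inside a[i] and (a+i) can be made visible by measuring
the distance in bytes rather than in elements. This is a sketch only; the
name byte_offset_of and the choice of long as the element type are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte distance from the start of a to element i,
 * measured through char pointers so that no scaling is applied. */
size_t byte_offset_of(const long *a, size_t i)
{
    assert(a != NULL);
    return (size_t)((const char *)&a[i] - (const char *)a);
}
```

For an array of long, byte_offset_of(a, 3) should equal 3 * sizeof(long),
which is the multiply the compiler has to arrange for each subscript.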

We can factor the loop by applying a standard optimisation technique,
converting the repeated multiplication (here hidden in the syntax of (a+i))
to an initialisation and repeated addition, where p is defined as a pointer
to the elements of which a is an array:

	p = a;						-- 3
	for(i = 0; i < 10; i++)
		*p++ = 0;

A good optimiser would perform this optimisation automatically, but the
real speed up now comes from observing that the index variable i is now
only used to count the iterations, which could be done just as easily
with the pointer variable, as:

	for(p = a; p < a+10; )				-- 4
		*p++ = 0;
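
As a rough check that the indexed loop and the pointer-only loop do the same
work, both can be wrapped in functions and run on identical data. The names
zero_indexed and zero_pointer are invented for this sketch, and int is
assumed as the element type.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: a subscript on every iteration. */
void zero_indexed(int *a, size_t n)
{
    size_t i;

    assert(a != NULL);
    for (i = 0; i < n; i++)
        a[i] = 0;
}

/* Pointer form: p serves as both cursor and loop counter. */
void zero_pointer(int *a, size_t n)
{
    int *p;

    assert(a != NULL);
    for (p = a; p < a + n; )
        *p++ = 0;
}
```

Whether the second is actually faster depends on the compiler; the sketch
only shows that the results agree.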

My preferred form makes the fact that a is an array explicit, although
of course it is identical:

	for(p = &a[0]; p < &a[10]; )			-- 5
		*p++ = 0;

If the array is of variable size (n instead of 10), then the
loop could be written:

	for(p = &a[0]; p < &a[n]; )			-- 6
		*p++ = 0;

However, some (most?) optimisers are not smart enough to realise
that since n is not changed in the loop, &a[n] does not change,
and they reevaluate a+n every time round the loop. To avoid this,
I prefer one of the forms:

	p = &a[0];					-- 7
	e = &a[n];		/* one past the end of the array */
	while(p < e)
		*p++ = 0;

or better:

	for(p = &a[0], e = &a[n]; p < e; )		-- 8
		*p++ = 0;

which is more idiomatic. I use it so much that it is quite clear to
people who use my code.
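
The same start-pointer and end-pointer idiom works for reading as well as
writing. Here is a minimal sketch, assuming an array of int; the function
name sum_ints is hypothetical. Note that &a[n] points one past the last
element: it may be computed and compared against, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum n ints with the end pointer computed once. */
long sum_ints(const int *a, size_t n)
{
    const int *p, *e;
    long total = 0;

    assert(a != NULL);
    /* e is evaluated once, before the loop, and only compared against. */
    for (p = &a[0], e = &a[n]; p < e; )
        total += *p++;
    return total;
}
```

With n equal to 0 the loop body never runs and the result is 0.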

The matter is a question of style. Pick the form with which you
are most comfortable, and don't worry too much about what anyone else
tells you. The most important thing is that the code is as clear
and explicit as possible. The advantage of C is that it does allow
you such variation in style, so that you can choose the form that is
clearest in each case.

I trust that you can extrapolate this to decide whether you prefer
a[i][j] or *(a[i] + j) or *(*(a+i) + j), which are all of course
identical as far as the compiler is concerned.
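
For the two-dimensional case, a small sketch can confirm that the three
spellings reach the same element. The dimensions ROWS and COLS and the
function name forms_agree are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* Hypothetical helper: returns 1 if a[i][j], *(a[i] + j) and
 * *(*(a+i) + j) all name the same address for every i and j. */
int forms_agree(int a[ROWS][COLS])
{
    int i, j;

    assert(a != NULL);
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            if (&a[i][j] != a[i] + j || &a[i][j] != *(a + i) + j)
                return 0;
    return 1;
}
```

Here a + i steps by whole rows of COLS ints, while the inner + j steps by
single ints, which is why the nested form reads less easily than a[i][j].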

Just as with written prose, you can recognise the author of a piece
of C by his mannerisms and styles!

Neil/.

peter@sugar.UUCP (Peter da Silva) (09/06/87)

> [ optimisers which generate better code for *(a+n) than a[n] ]

Any compiler that generates different code for these two cases certainly needs
to be carefully looked at. You should assume that they will be treated
identically.
-- 
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter
--                 'U`  <-- Public domain wolf.