[comp.lang.c] Defining a pointer to an array

a1082@mindlink.UUCP (Terry Bartsch) (09/22/89)

I have found it convenient to use a large 4-dimensional array in an interactive
program and must malloc the space to reduce object module size. This caused me
to discover the following peculiarity:

The construct "int y[a][b][c][d]"
allocates a*b*c*d integers in the form of an array.

-----

"ANSI" K&R and the Microsoft Systems Journal each show a one-dimensional
example of "a pointer to an array".

The construct "int (*x) [a][b][c][d]"
should theoretically allocate a pointer to the same sort of array.

-----

However, when the expression is resolved, the [d] position is multiplied by
"d*sizeof(int)" rather than by "sizeof(int)" and the other indexes also are
multiplied by oversized figures, causing the pointer to exceed actual allocated
memory with very small subscripts and preventing usage of a construct such as
"for (n = 0; n < a*b*c*d; n++) x[0][0][0][n] = 0;"

The addresses are "miscalculated" right down to the trivial case of a
one-dimensional array.

I empirically determined that instead of "int (*x) [a][b][c][d]"
one must specify "int (*x) [b][c][d][1]". (!!?)

When defined as the latter, *x[][][][] acts exactly like a "normal" array
y[][][][] of the same dimensions whenever referenced in the program.

This occurs on both Turbo C (808x) and a public-domain C for the Amiga (680xx),
despite (supposedly) differing developers and integer size.

Does this make sense in terms of the definition of the language? Is there a
method of defining the pointer which is more intuitive?

chris@mimsy.UUCP (Chris Torek) (09/24/89)

In article <526@mindlink.UUCP> a1082@mindlink.UUCP (Terry Bartsch) writes:
>I have found it convenient to use a large 4-dimensional array ...

>The construct "int y[a][b][c][d]"
>allocates a*b*c*d integers in the form of an array.

(note that a,b,c,d must be constants)

So far, so good.  More precisely, it declares one object, of type
`array a of array b of array c of array d of int'; there are thus
a*b*c*d total `int's in the array, arranged in row-major order.

>The construct "int (*x) [a][b][c][d]"
>should theoretically allocate a pointer to the same sort of array.

Actually, it declares one object, of type `pointer to array a of array
b of array c of array d of int'.  This pointer does not point to anything
useful, but it can be made to point to one *or more* contiguous objects
of type `array a of array b of array c of array d of int'.

One almost NEVER has several contiguous arrays.  The case above, with the
`four-dimensional' array, happens to have `a' contiguous arrays, each
consisting of `b' contiguous arrays, each consisting of `c' contiguous
arrays, each consisting of `d' contiguous integers.

What we probably want, then, is an object of type `pointer to array b
of array c of array d of int', which we will make point to the first of
several contiguous objects of type `array b of array c of array d of int':

	int (*x)[b][c][d];

	x = (int (*)[b][c][d])malloc(a * sizeof(int [b][c][d]));

We can then get at the i,j,k,l'th element with

	x[i][j][k][l] = value;

This is completely analogous to the equivalence of

	int A[10];

and

	int *p; p = A;

in actual usage (A[i] and p[i] name the same integer).
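
As a minimal sketch of the allocation above, assuming small made-up
sizes (DIM_A through DIM_D stand in for a, b, c and d, and the function
name check_row_major is hypothetical), the following numbers every
element through x[i][j][k][l] and then reads the block back through a
flat int pointer. If the layout is row-major as described, the n'th int
in the block should hold n.

```c
#include <stdlib.h>

/* Hypothetical sizes standing in for the article's a, b, c and d. */
enum { DIM_A = 2, DIM_B = 3, DIM_C = 4, DIM_D = 5 };

/*
 * Allocate DIM_A contiguous `array DIM_B of array DIM_C of array DIM_D
 * of int' objects, number the elements through x[i][j][k][l], then read
 * them back through a flat int pointer.  Returns the number of elements
 * found out of row-major order (0 if all is well), or -1 if malloc fails.
 */
int check_row_major(void)
{
	int (*x)[DIM_B][DIM_C][DIM_D];
	int *flat;
	int i, j, k, l, n, bad = 0;

	x = (int (*)[DIM_B][DIM_C][DIM_D])malloc(DIM_A * sizeof(int [DIM_B][DIM_C][DIM_D]));
	if (x == NULL)
		return -1;

	for (i = 0; i < DIM_A; i++)
		for (j = 0; j < DIM_B; j++)
			for (k = 0; k < DIM_C; k++)
				for (l = 0; l < DIM_D; l++)
					x[i][j][k][l] = ((i * DIM_B + j) * DIM_C + k) * DIM_D + l;

	/* The malloc'ed block is one run of DIM_A*DIM_B*DIM_C*DIM_D ints. */
	flat = (int *)x;
	for (n = 0; n < DIM_A * DIM_B * DIM_C * DIM_D; n++)
		if (flat[n] != n)
			bad++;

	free(x);
	return bad;
}
```

The flat pointer is also one way to write the "clear everything" loop
from the original question without a fifth subscript.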

>However, when the expression is resolved, the [d] position is multiplied by
>"d*sizeof(int)" rather than by "sizeof(int)"

Not if used correctly: since (given your declaration above) x points to
one or more (array a of array b of array c of array d of int)s, when you write

	x[i][j][k][l]

you have named the i'th contiguous such array, and asked for its j'th
contiguous subarray (int [b][c][d]), then its k'th subarray (int [c][d]),
and then that object's l'th subarray (int [d]).  This names an object
of type `array d of int', which in any rvalue context immediately
`devolves' into an object of type `pointer to int', whose value is
the address of the first element of the l'th subarray.

Note that, since x[i][j][k][l] names an array, a compiler should give
an error for

	x[i][j][k][l] = value;

since an array is not a modifiable lvalue.  If the type of <value> above
(taken in rvalue context, i.e., after arrays devolve to pointers to their
first elements) is not `pointer to integer', you also have a type mismatch
(and thus get two diagnostics from many compilers).
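
The "oversized" multiplier can be checked with sizeof alone. The sketch
below assumes the same made-up sizes as before (DIM_A through DIM_D and
both function names are hypothetical) and reports how many bytes the
expression x[i][j][k][l] names under each of the two declarations.

```c
#include <stddef.h>

/* Hypothetical sizes standing in for the article's a, b, c and d. */
enum { DIM_A = 2, DIM_B = 3, DIM_C = 4, DIM_D = 5 };

/* Bytes named by x[i][j][k][l] when x points to the whole 4-D array. */
size_t step_pointer_to_whole(void)
{
	int (*x)[DIM_A][DIM_B][DIM_C][DIM_D] = 0;

	/* sizeof does not evaluate its operand, so the null x is never followed. */
	return sizeof(x[0][0][0][0]);	/* an `array DIM_D of int' */
}

/* Bytes named by x[i][j][k][l] when x points to the first [b][c][d] object. */
size_t step_pointer_to_first(void)
{
	int (*x)[DIM_B][DIM_C][DIM_D] = 0;

	return sizeof(x[0][0][0][0]);	/* a single int */
}
```

The first function should return DIM_D * sizeof(int) and the second
sizeof(int), which matches the scaling reported in the original article.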

Given the same declaration for x, if one wants the l'th element of the
k'th element of the j'th element of the i'th element of the very first
(i.e., 0th) contiguous array to which x points, one must write

	(*x)[i][j][k][l]

or, equivalently,

	x[0][i][j][k][l]

since x is a pointer to (hence has to be dereferenced at least once to
obtain) a `four dimensional' array.
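
A short sketch of the two spellings, assuming an ordinary array y for x
to point at (y, the sizes, and the function name both_forms_agree are
all made up for illustration):

```c
/* Hypothetical sizes standing in for the article's a, b, c and d. */
enum { DIM_A = 2, DIM_B = 3, DIM_C = 4, DIM_D = 5 };

static int y[DIM_A][DIM_B][DIM_C][DIM_D];

/*
 * Returns 1 if (*x)[i][j][k][l] and x[0][i][j][k][l] both name
 * y[i][j][k][l], 0 otherwise.  The indices must be in range.
 */
int both_forms_agree(int i, int j, int k, int l)
{
	int (*x)[DIM_A][DIM_B][DIM_C][DIM_D] = &y;	/* pointer to the whole array */

	return &(*x)[i][j][k][l] == &y[i][j][k][l]
	    && &x[0][i][j][k][l] == &y[i][j][k][l];
}
```

Note the &y in the initializer: the address of the entire array is
needed here, not the pointer to its first element that a bare y becomes.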

C's entire system of pointers follows from a few simple rules:

(//)	In an rvalue context, an object of type `array N of T'
	`devolves' into one of type `pointer to T'; the value of this
	object is the address of the 0th element of that array.

	(Objects of other types simply change from `object, type T'
	to `value, type T'.)

(**)	The result of *p, where p is a value of type `pointer to T',
	is the object (of type T) to which p points.

(++)	The result of p+i (or i+p), where p is a value of type `pointer
	to T' and i is a value of type `int' (or `long' or `unsigned',
	etc.) is a value of type `pointer to T', but which points to
	the i'th object beyond wherever p pointed.

These three rules combine to manufacture all the arrays one can write
in C.
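
Each rule can be tested in isolation. The following is only a sketch;
the three function names are invented, and each returns 1 if the rule
holds on the compiler at hand.

```c
#include <stddef.h>

static int A[4][3];

/* Rule (//): in an rvalue context A becomes a pointer to its 0th element. */
int rule_decay(void)
{
	int (*p)[3] = A;	/* no & needed: A has already become &A[0] */

	return p == &A[0];
}

/* Rule (**): *p is the whole object p points to, here an `array 3 of int'. */
int rule_indirect(void)
{
	int (*p)[3] = A;

	return sizeof(*p) == sizeof(int [3]);
}

/* Rule (++): p+i moves by i whole objects, not by i bytes or i ints. */
int rule_add(void)
{
	int (*p)[3] = A;

	return (char *)(p + 2) - (char *)p == (ptrdiff_t)(2 * sizeof(int [3]));
}
```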

Given

	int A[4][3], i, j;

then

	A[i][j]

is really

	(*(A + i))[j]

which is really

	*( *(<object: array A> + <object: i>)  +  <object: j>)

which takes A in an rvalue context, so we apply rule (//): N is 4 and
T is `array 3 of int', and A changes from `object, array 4 of array 3
of int' to `value, pointer to array 3 of int'.  Objects i and j change
to their values as well, and we have:

	*( *(<value: pointer to array 3 of int: &A[0]> + <value: int: i>)  +
	  <value: int: j>)

The pointer addition (<ptr>+i) is done by moving up to the i'th sub-object
of whatever <ptr> points to (rule (++)).  Here <ptr> points to several
(4 actually) `array 3 of int's; we move to the i'th `array 3 of int':

	*( *(<value: pointer to i'th array 3 of int: &A[i]>)  +
	  <value: int: j>)

Applying rule (**), we get the i'th array 3 of int (an object):

	*( <object: i'th array 3 of int: A[i]>  +  <value: int: j>)

Applying rule (//), N is 3 and T is int, so we have

	*(<value: pointer to int: &A[i][0]> + <value: int: j>)

We find the j'th subobject, here one of several (3 actually) `int's:

	*(<value: pointer to j'th int: &A[i][j]>)

Finally, we apply the second *:

	<object: j'th int: A[i][j]>

Doing the above analysis with `object A' replaced by `object p', where
p is declared as `int (*p)[3]', is left as an exercise to the reader.
Hint: since p is not an array, to go from object to value, simply rewrite
it as `value'.
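
For readers who would rather check the result of that exercise than
derive it, here is a small sketch (the function name exercise_holds is
made up). It does not replace the analysis; it only confirms that
p[i][j], its expansion *(*(p + i) + j), and A[i][j] all name the same
int once p has been set from A.

```c
static int A[4][3];

/*
 * Returns 1 if p[i][j], the address *(p + i) + j, and A[i][j] agree
 * for every valid i and j, 0 otherwise.
 */
int exercise_holds(void)
{
	int (*p)[3] = A;	/* A devolves to &A[0], which is what p holds */
	int i, j;

	for (i = 0; i < 4; i++)
		for (j = 0; j < 3; j++)
			if (&p[i][j] != &A[i][j] || *(p + i) + j != &A[i][j])
				return 0;
	return 1;
}
```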
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

scjones@sdrc.UUCP (Larry Jones) (09/24/89)

In article <526@mindlink.UUCP>, a1082@mindlink.UUCP (Terry Bartsch) writes:
> The construct "int y[a][b][c][d]"
> allocates a*b*c*d integers in the form of an array.
> 
> The construct "int (*x) [a][b][c][d]"
> should theoretically allocate a pointer to the same sort of array.
> 
> [ but then when you use x[a][b][c][d] (or *x[a][b][c][d], the
> article contains both), it doesn't work right. ]

Let us repeat once again, in unison, "In C, the declaration of a
variable and the use of that variable should look the same."  If
you declare "int (*x)[a][b][c][d]", then you should reference it
as "(*x)[a][b][c][d]", which works just fine, not by using either
of the methods you mentioned in your article.

However, what you probably want to do is declare x to have the
same type as y does after conversion to a pointer.  When you use
the name of an array (like "y") without a subscript, it is
converted to a pointer to the first element of the array (not a
pointer to the entire array).  Thus, "y" becomes a pointer to an array
of b arrays of c arrays of d integers.  If you declare x as
"int (*x)[b][c][d]", you can then use "x[a][b][c][d]" just like
you use "y[a][b][c][d]".
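
A sketch of that last point, with made-up sizes and a hypothetical
function name (x_aliases_y): x is given the type y takes on after
conversion, so plain assignment from y works and the two can be
subscripted identically.

```c
/* Hypothetical sizes standing in for the article's a, b, c and d. */
enum { DIM_A = 2, DIM_B = 3, DIM_C = 4, DIM_D = 5 };

static int y[DIM_A][DIM_B][DIM_C][DIM_D];

/* Writes through x, reads back through y; returns 1 if they agree. */
int x_aliases_y(void)
{
	int (*x)[DIM_B][DIM_C][DIM_D] = y;	/* y converts to a pointer to its first element */

	x[1][2][3][4] = 1989;
	return y[1][2][3][4] == 1989;
}
```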
----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@SDRC.UU.NET
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150-2789             AT&T: (513) 576-2070
"I have plenty of good sense.  I just choose to ignore it."
-Calvin