[net.lang.c] Passing Multidimensional Arrays

jeff@isi-vaxa.arpa (Jeffery A. Cavallaro) (11/28/85)

I have an application where I want to pass a 2D array to a subroutine.
The data does not easily avail itself to being converted to a 1D array
of structures.  How can one access the array in the subroutine???

For example:

Calling routine:

	struct x a[M][N];
	
	subrout(M, N, a);

C does not have "adjustable arrays" like FORTRAN.  So what is one to do?

The only way I can see to do it is:

	subrout(m, n, a)
	struct x *a;

and do the address arithmetic yourself.  I believe that this is OK, because
the innermost (leftmost) index is guaranteed to increase fastest, and
C compilers are required to be consistent about packing when doing
address arithmetic or aggregate allocation (I think).

Any hints???

					Thanx,
					Jeff

gwyn@BRL.ARPA (VLD/VMB) (11/29/85)

Yes, if a C function is designed to work with a variety
of sizes for its array parameters, it will have to do
the index computation itself.  You can write macros to
make this easier, but it's still a crock.

jsdy@hadron.UUCP (Joseph S. D. Yao) (11/30/85)

In article <138@brl-tgr.ARPA> jeff@isi-vaxa.arpa (Jeffery A. Cavallaro) writes:
>I have an application where I want to pass a 2D array to a subroutine.
>The data does not easily avail itself to being converted to a 1D array
>of structures.  How can one access the array in the subroutine???
>Calling routine:
>	struct x a[M][N];
>	subrout(M, N, a);
>The only way I can see to do it is:
>	subrout(m, n, a)
>	struct x *a;
>and do the address arithmetic yourself.  ...

Yes, you could do it the second way; and unfortunately this is the
only way to implement a routine that has several different sized
matrices being passed to it.  However, it seems to me that you are
implying a known fixed matrix size in your application.  In that
case, you can call:
	struct x a[M][N];
	subrout(a)
and just declare
	subrout(arg)
	  struct x arg[M][N];
	{  ...
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

throopw@dg_rtp.UUCP (Wayne Throop) (11/30/85)

> I have an application where I want to pass a 2D array to a subroutine.
> [...I plan to...]
>     do the address arithmetic [myself].  I believe that this is OK, because
> the innermost (leftmost) index is guaranteed to increase fastest, and
> C compilers are required to be consistent about packing when doing
> address arithmetic or aggregate allocation (I think).

You've got it right, except for a couple of crucial details.  First, and
perhaps least important, the C language doesn't guarantee very much
about aggregate allocation.  However, assuming that multidimensional
arrays are layed out as a single dimensional array with fancy address
calculations is one of the safer bets.

Second, and more important, you seem to have the address calculations
backwards.  Assuming that "index x increases faster than index y" means
that a change in index "x" changes the address of the indexed array
element less than the same change in index "y" (which is the
conventional meaning of the phrase), then where you say "the innermost
(leftmost) index is guaranteed to increase fastest", you are exactly
backwards.  First, a practical demonstration.  This routine:

        int printf();
        int a[2][2];
        void main(){
            printf( "%o %o %o %o\n",
                &a[0][0],
                &a[0][1],
                &a[1][0],
                &a[1][1] );
        }

produces this output:

        16000001030 16000001032 16000001034 16000001036

on my machine, and similar output on others.  To see why this is so,
just consider what the subscript operations are doing.  To parenthesize,
we have

        a[x][y]  is equivalent to   ((a[x])[y])

Now then, what is "a"?  A is an array of array of int.  Thus, when you
subscript "a", you get an array of int.  Subscripting that, gives
(surprise) an int.  Thus, the *rightmost* index "increases fastest".
Note that (as has been pointed out many times before) this is the
opposite of the case in FORTRAN.

This is often captured by the rule that "C has odometer array order,
while FORTRAN has anti-odometer array order", by analogy to the way
numbers click over on an odometer.

>					Jeff
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!dg_rtp!throopw

mikel@codas.UUCP (Mikel Manitius) (12/05/85)

> I have an application where I want to pass a 2D array to a subroutine.
> The data does not easily avail itself to being converted to a 1D array
> of structures.  How can one access the array in the subroutine???

How about this:

#define		X	100
#define		Y	100

struct	x a[Y][X];

subrout(i, j, a)
int	i;
int	j;
struct x *a[];
{
	.
	.
	...a[j][i]...
	.
	.
}

And use i and j as boundary limits.
-- 
			Mikel Manitius @ AT&T-IS Altamonte Springs, FL
			...{ihnp4|akguc|attmail|indra!koura}!codas!mikel

ken@turtlevax.UUCP (Ken Turkowski) (12/09/85)

In article <365@codas.UUCP> mikel@codas.UUCP (Mikel Manitius) writes:
>> I have an application where I want to pass a 2D array to a subroutine.
>> The data does not easily avail itself to being converted to a 1D array
>> of structures.  How can one access the array in the subroutine???
>
>How about this:
>#define		X	100
>#define		Y	100
>struct	x a[Y][X];
>subrout(i, j, a)
>int	i; int	j; struct x *a[];
>{
>	...a[j][i]...
>}
>And use i and j as boundary limits.

Unfortunately, this simplistic approach doesn't work because the offset
of a[3][4] from &a[0][0] is not the same when a is declared as a[5][5]
versus a[50][50].
-- 
Ken Turkowski @ CIMLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.DEC.COM

throopw@dg_rtp.UUCP (Wayne Throop) (12/15/85)

> > I have an application where I want to pass a 2D array to a subroutine.
> > The data does not easily avail itself to being converted to a 1D array
> > of structures.  How can one access the array in the subroutine???

> How about this:
>   #define  X 100
>   #define  Y 100
>   struct x a[Y][X];
>   subrout(i, j, a)
>     int i;
>     int j;
>     struct x *a[];
>   {   ...a[j][i]... }
> And use i and j as boundary limits.
> Mikel Manitius ...{ihnp4|akguc|attmail|indra!koura}!codas!mikel

The main problem with this is that (although it is a legal C fragment),
it shows neither the definition of x, nor an invocation of subrout.
Thus, there is no clear notion as to what is supposed to be passed to
the formal arguments i, j, and a.  Also, the formal "a" hides the actual
"a", and this is fairly confusing.


Let me take this opportunity to recommend a practice that I try to
follow (and regret whenever I backslide).  Before posting an example,
lint, compile, and (if possible) run it.  For example, making
reasonable guesses as to what was intended in the above example, we get
a complete program like so:

    1   #define  X 100
    2   #define  Y 100
    3   struct x { int f1, f2; };
    4   struct x a[Y][X];
    5   void subrout(i, j, a)
    6     int i;
    7     int j;
    8     struct x *a[];
    9   {   a[j][i]; }
   10   
   11   void main(){
   12           subrout( X, Y, a );
   13   }

which, of course, fails typecheck like so:

  line 12  inconsistent types discovered
      (:IDENTIFIER a :EXTERN ... )
    Types are:
      (:ARRAY_OF
        (:POINTER_TO
          (:STRUCT x
            (
              (:IDENTIFIER f1 :STRUCT_FIELD ... )
              (:IDENTIFIER f2 :STRUCT_FIELD ... ))))
        () ())
    and:
      (:POINTER_TO
        (:ARRAY_OF
          (:STRUCT x
            (
              (:IDENTIFIER f1 :STRUCT_FIELD ... )
              (:IDENTIFIER f2 :STRUCT_FIELD ... )))
          100 ()))

Probing further, let's see what the compiler thought of the situation,
by compiling it and fishing around with a debugger:

    1> breakpoint 12
    1> proceed
    Note: BREAKPOINT: LINE 12 OF main
    1> describe a
       ARRAY [ 0..99 BOUND 32 ] OF ARRAY [ 0..99 BOUND 32 ] OF RECORD
           f1:                 INTEGER
           f2:                 INTEGER
       END RECORD BOUND 64
    1> breakpoint 9
    1> proceed
    Note: BREAKPOINT: LINE 9 OF subrout
    1> describe a
       POINTER TO POINTER TO RECORD
           f1:                 INTEGER
           f2:                 INTEGER
       END RECORD BOUND 64

[   As an aside, I note that the debugger/compiler reports that the
    formal "a" is pointer-to-pointer, while the typechecker reports it
    as array-of-pointer.  This is (sort of) correct for C, where formal
    arrays are "actually" pointers.  The difference comes about in an
    understandable way, since the typechecker is trying to be purist,
    and the debugger has to know how the bits should "really" be
    treated.  ]


Now, I may have guessed wrongly about what Mikel intended.  But I think
it is clear that the example he posted is dangerously incomplete.  I
think it is also clear that the best available tools should be used to
check out code fragments that are posted as examples for others to use.
Without actually trying it out, I couldn't (without a fair amount of
thought, vulnerable to error) have put my finger on exactly what was
wrong with the fragment Mikel posted.
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!dg_rtp!throopw

mikel@codas.UUCP (Mikel Manitius) (12/18/85)

> > > I have an application where I want to pass a 2D array to a subroutine.
> > > The data does not easily avail itself to being converted to a 1D array
> > > of structures.  How can one access the array in the subroutine???
> 
> > How about this:
> >   #define  X 100
> >   #define  Y 100
> >   struct x a[Y][X];
> >   subrout(i, j, a)
> >     int i;
> >     int j;
> >     struct x *a[];
> >   {   ...a[j][i]... }
> > And use i and j as boundary limits.
> > Mikel Manitius ...{ihnp4|akguc|attmail|indra!koura}!codas!mikel
> 
> The main problem with this is that (although it is a legal C fragment),
> it shows neither the definition of x, nor an invocation of subrout.
> Thus, there is no clear notion as to what is supposed to be passed to
> the formal arguments i, j, and a.  Also, the formal "a" hides the actual
> 
> Now, I may have guessed wrongly about what Mikel intended.  But I think
> it is clear that the example he posted is dangerously incomplete.  I
> think it is also clear that the best available tools should be used to
> check out code fragments that are posted as examples for others to use.
> Without actually trying it out, I couldn't (without a fair amount of
> thought, vulnerable to error) have put my finger on exactly what was
> wrong with the fragment Mikel posted.
> -- 
> Wayne Throop at Data General, RTP, NC
> <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

Alas, your guesses were correct. Then explain, if you don't mind, how
the compiler knows the dimensions of argv in the following construct:

--------
#include <stdio.h>

main(argc, argv)
int	argc;
char	*argv[];
{
	int	i, j;

	for(j=0;argv[j][0] != '\0';j++) {
		for(i=0;argv[j][i] != '\0';i++)
			putc(argv[j][i], stdout);
		putc('\n', stdout);
		}
}
-------

The preceeding program should print it's arguments, one per line,
it is done in this manner to show that the compiler does not need
to know the dimension of argv at compile time, in order to access
the elements of the (2D) array, given i and j.

The result is:

-------
$ a.out one two twee
a.out
one
two
twee
$ 
-------

panic: data allignment error
-- 
			Mikel Manitius @ AT&T-IS Altamonte Springs, FL
			...{ihnp4|akgua|bellcore|clyde|koura}!codas!mikel

jsdy@hadron.UUCP (Joseph S. D. Yao) (12/18/85)

Several pretty good responses to this have already come across
(where can I get the lint with that fabulous output??), but they
managed to miss a fairly important point.

In article <365@codas.UUCP> mikel@codas.UUCP (Mikel Manitius) writes:
>[MISSING ATTRIBUTION]
>> I have an application where I want to pass a 2D array to a subroutine.
>How about this:
>#define		X	100
>#define		Y	100
>struct	x a[Y][X];
>subrout(i, j, a)
>int	i;
>int	j;
>struct x *a[];
>{
>	...a[j][i]...
>}
>And use i and j as boundary limits.

The major thing wrong with this is that, in this case, pointers
and arrays are  n o t  the same thing!  The declaration a[Y][X]
declares a series of data objects:
	a[0][0], a[0][1], ... , a[0][X], a[1][0], a[1][1], ... a[Y][X]
WHEREAS the declaration *a[] declares a series of pointers to
data objects!  as:
	ptr(a[0][0]), ptr(a[1][0]), ...
This latter object DOES NOT EXIST.  Using the above declaration,
trying to find a[0][0] results in the following sequence of
operations:
	first find a[0].  Oh, since a is an array of pointers,
	the first element is the pointer a[0].  (But in the outside
	world, this is the data object a[0][0].)

	now find (a[0])[0].  but a[0] was a pointer.  therefore,
	find whatever that pointer was pointing to.  (Which, to
	the outside world means that you are indirecting off the
	data object a[0][0] ... which was a structure ... I'm
	sure you are getting sick now thinking of what this must
	be doing ...)

Of course, the only reason I know this so thoroughly is that early
in my UNIX career, about a dozen years ago, I did exactly this and
blew a machine totally away.

The correct declaration is either a[Y][X] or even a[][X].  An
array may be declared with the first size left blank or arbitrary,
but all the rest must be filled in.

For those of you who aren't tired of this yet, here's a picture
of the two different structures in memory:

a[Y][X]:
	a:  _  _  _  _  _  _  ...  _  _  _  _  _  _
	    _  _  _  _  _  _  ...  _  _  _  _  _  _
	    ...
	    _  _  _  _  _  _  ...  _  _  _  _  _  _
	a+(X*Y*sizeof(a[0][0]):

versus *a[]:
	a:  _ -> b
	    _ -> c
	    ...
	    _ -> omega
	a+(Y*sizeof(ptr to a)):
	...
	b:  _  _  _  _  _  _  ...  _  _  _  _  _  _
	b+(X*sizeof(a[0][0]):
	...
	c:  _  _  _  _  _  _  ...  _  _  _  _  _  _
	c+(X*sizeof(a[0][0]):
	...

Note that **argv and *argv[] may be used identically, but that
argv[][] has never been used and is incorrect, e.g. .  The reasons
are left to the student as an exercise.		;-)
[If you take that seriously, post answers to me,  n o t  the net!]
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

jsdy@hadron.UUCP (Joseph S. D. Yao) (12/19/85)

(First off, the original code fragment, while syntactically legal C,
was most definitely semantically incorrect.)

In article <397@codas.UUCP> mikel@codas.UUCP (Mikel Manitius) writes:
>                             ... Then explain, if you don't mind, how
>the compiler knows the dimensions of argv in the following construct:
>#include <stdio.h>
>main(argc, argv)
>int	argc;
>char	*argv[];
>{
>	int	i, j;
>	for(j=0;argv[j][0] != '\0';j++) {
>		for(i=0;argv[j][i] != '\0';i++)
>			putc(argv[j][i], stdout);
>		putc('\n', stdout);
>		}
>}

Easy.  Argv is  n o t  a 2-D array.  It is an array of pointers.
If my yesterday's article on the subject hasn't arrived at your
machine yet, wait a couple days and then  m a i l  to me to see
whether I can retrieve a copy for you.

If anyone kept a copy of my 100+ line diatribe on arrays and
pointers, and why people confuse them so, from a year ago, wouldst
please mail a copy to me?  I'm afraid I didn't keep a copy, and
if this line of discussion keeps up, I'm going to need it.	;-}
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

throopw@dg_rtp.UUCP (Wayne Throop) (12/23/85)

> (First off, the original code fragment, while syntactically legal C,
> was most definitely semantically incorrect.)

(Small nit: the "original" fragment was both syntactically and
semantically legal (mainly because it didn't include an invocation of
"subr", and thus didn't have the type mismatch that was implied).  It
was, however, misleading because of this incompleteness, and
semantically illegal when completed.)

> In article <397@codas.UUCP> mikel@codas.UUCP (Mikel Manitius) writes:
> >                             ... Then explain, if you don't mind, how
> >the compiler knows the dimensions of argv in the following construct:

> Easy.  Argv is  n o t  a 2-D array.  It is an array of pointers.

Exactly.

> [...] If anyone kept a copy of my 100+ line diatribe on arrays and
> pointers, and why people confuse them so, from a year ago, wouldst
> please mail a copy to me?  I'm afraid I didn't keep a copy, and
> if this line of discussion keeps up, I'm going to need it.	;-}
>       Joe Yao hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

Sadly, I didn't keep it.  But I'll take a stab at clarifying things in
your stead.  Let me preface this by saying that all of what I say here
and more is in "The C Programming Language" (K&R) and "A C Reference
Manual" (H&S).  *Please* read them.

In C, addresses can be subscripted and arrays can be indirected.  This
is often called "pointer-array equivalence", and is the root of this
confusion.  Nevertheless, it is not a complex concept.  It is a
*unification* of subscription and indirection (along with pointer
arithmetic).

It is based on the following equivalence, which defines subscripting in
terms of address calculation and indirection:

                e1[e2]   is-equivalent-to    *((e1)+(e2))

I'll take it for granted that you already know how address calculations
happen (the scaling of integers when added to addresses).  This leads to
the surprising result that integers can be subscripted (by addresses).

Note that e1 or e2 (but not both) in the above must be an address
expression.  This leads to the C-ism that array names are addresses (in
particular, they evaluate to the address of the first element in the
array (you know, the one with subscript "0")).

Thus, when you declare          int a[N];
you can use "a" like so:        a[M]
or like so                      *(a+M)
or even like so                 M[a]

and all of these are equivalent.

Now then, if you declare        int *p;
again, you can use "p" like     p[M]
or                              *(p+M)
or even                         M[p]

because "p" in this case evaluates to an address expression also.  The
crucial difference is that "a" is an "rvalue", while "p" is an "lvalue".
That is, "p" is a *variable*, and *contains* an address, while "a" is an
*expression* and *is* an address.

Thus when you declare (as is often done)

        void main(argc, argv) int argc; char *argv[]; { ... }

"argv" is *not* a two dimensional array, despite the fact that the
character pointed to by the first element of argv can be accessed by

                argv[0][0]

This expression is *exactly* equivalent to

                **argv

How can you tell what the declaration of argv above means?  Decompose
it.  First, fully parenthesize according to the rules in the manual:

                char *(argv[]);

Now, this means that argv is an array of pointers (first you subscript,
then you indirect).  Now subscripting and indirection are equivalent,
*but* *only* in *reference*, *not* in *declaration*.  Declaration of an
array allocates space for some elements.  Declaration of a pointer
allocates space for an *address* *only*.

Does all this make it clear?  If not, *read* *the* *manuals*.

[ Purists will note that I skipped over formal-actual differences,
  declaration of "two-dimensional" arrays, and other esoteric but
  often-encountered cases.  Ah well.... only so many hours in a day.]