[comp.lang.c] Conformant arrays-- how to?

kenny@m.cs.uiuc.edu (05/02/89)

Some time ago, I saw in this newsgroup a discussion for a proposal to
implement conformant arrays of more than one dimension in C.  If I
remember correctly, the proposed scheme was to allow a function to be
expressed as:

	return_type f (int m,		/* Number of rows */
	               int n,		/* Number of columns */
	               double x [] [n]);/* x is a m x n array */
	{
	    .... code for f ....
	}

The generated code would use standard rules for pointer arithmetic,
except that the variable n would enter into the computation:

	sizeof (x [0] [0]) == sizeof (double)
	sizeof (x [0]) == n * sizeof (double)
	sizeof x == sizeof (double (*)[n]) [ == sizeof (double *) ?? ]

and when these rules are applied, we get C's usual row-major storage
layout.

Has anyone tried implementing such a scheme?  (I recall that X3J11
rejected it, quite correctly, because of `insufficient prior art.')
What are the pitfalls to avoid?  At first glance, it seems trivial to
implement, but I suspect otherwise, since dmr didn't do it that way (I
have generally discovered that dmr's less-than-obvious decisions were
right -- with one or two insignificant exceptions).

In the absence of such a scheme, what do people consider to be the
most robust way of handling conformant 2-D arrays?  Two obvious
schemes come to mind:

	- Represent each array by an edge vector that points to the
rows; x [i] [j] is applied to an object of type (double **) and the
pointer chain gives the correct value.

	- Pass to the function (in an argument called Px) the address
&(x [0] [0]), and then replace all references to x [i] [j] with 
Px [n * i + j].  How portable is this latter scheme?  I can imagine
some brain-damaged compiler inserting a pad of unused storage between
rows of the array and defeating this attempt to replicate C's address
calculation.

Of course, with the second scheme, we *could* make the caller use the
same representation, and declare x to be
	double x [m * n + 1] /* replace m and n with the appropriate
				constants, of course */
which should be completely portable, but rather awkward.


Kevin Kenny			UUCP: {uunet,pur-ee,convex}!uiucdcs!kenny
Department of Computer Science	ARPA Internet or CSNet: kenny@CS.UIUC.EDU
University of Illinois
1304 W. Springfield Ave.
Urbana, Illinois, 61801		Voice: (217) 333-5821

schmidt@zola.ics.uci.edu (Doug Schmidt) (05/02/89)

In article <4700036@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
++ 
++ Some time ago, I saw in this newsgroup a discussion for a proposal to
++ implement conformant arrays of more than one dimension in C.  If I
++ remember correctly, the proposed scheme was to allow a function to be
++ expressed as:
++ 
++ 	return_type f (int m,		/* Number of rows */
++ 	               int n,		/* Number of columns */
++ 	               double x [] [n]);/* x is a m x n array */
++ 	{
++ 	    .... code for f ....
++ 	}
++ 
++ The generated code would use standard rules for pointer arithmetic,
++ except that the variable n would enter into the computation:
++ 
++ 	sizeof (x [0] [0]) == sizeof (double)
++ 	sizeof (x [0]) == n * sizeof (double)
++ 	sizeof x == sizeof (double (*)[n]) [ == sizeof (double *) ?? ]
++ 
++ and when these rules are applied, we get C's usual row-major storage
++ layout.
++ 
++ Has anyone tried implementing such a scheme?  (I recall that X3J11

Check out GNU GCC 1.35.  I believe it implements what you are
referring to:

----------------------------------------
#include <stdio.h>
/* Try using other types for TYPE, i.e., double, int, float, char, etc. */
typedef long TYPE;

void
tester (int i, int j, TYPE buf[i][j])
{
  int k, l;

  for (k = 0; k < i; k++)
    {
      for (l = 0; l < j; l++)
        printf ("buf[%d][%d] = %4d ", k, l, buf[k][l]);
      putchar ('\n');
    }
}

int
main (int argc, char *argv[]) 
{
  if (argc == 3)
    {
      int i = atoi (argv[1]);
      int j = atoi (argv[2]);
      int k, l;
      TYPE c[i][j];

      for (k = 0; k < i; k++)
        for (l = 0; l < j; l++)
          c[k][l] = k * l;

      tester (i, j, c);
      return 0;
    }
  else
    fprintf (stderr, "usage: %s <row integer> <column integer>\n", argv[0]);
  return 1;
}
----------------------------------------

This seems pretty straight-forward to understand and implement, but
might it present problems on some architectures?

Doug
--
On a clear day, under blue skies, there is no need to seek.
And asking about Buddha                +------------------------+
Is like proclaiming innocence,         | schmidt@ics.uci.edu    |
With loot in your pocket.              | office: (714) 856-4043 |

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/02/89)

In article <4700036@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>What are the pitfalls to avoid?  At first glance, it seems trivial to
>implement, but I suspect otherwise, since dmr didn't do it that way ...

I asked Dennis about this many years ago.  We seemed to agree that it
was "doable", but not quite trivial.  Unless you restricted the
facility to pointer parameters, which would seem a pity, the auto
storage allocation scheme would have to accommodate object sizes not
known at compiler time.  That isn't always simple to achieve.

There have been several library packages developed to support arrays
with dimensions determined at run time.  I think a recent C User's
Journal described the details of one such package.  It's not hard to
develop such a package, just a bit awkward to use one.  Similar
things could be said for complex numbers, group representations,
multiple-precision arithmetic, and so on.  C gives you enough hooks
to implement solutions for these needs, but it's often not as nice
as a built-in facility would be.  On the other hand, in contexts
where such applications are irrelevant, the C implementation is not
burdened by the necessity of supporting all the extra baggage.

chris@mimsy.UUCP (Chris Torek) (05/03/89)

>In article <4700036@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>>... proposal to implement conformant arrays of more than one
>>dimension in C. ...
>>	return_type f (int m,		/* Number of rows */
>>	               int n,		/* Number of columns */
>>	               double x [] [n])	/* x is a m x n array */

In article <13101@paris.ics.uci.edu> schmidt@zola.ics.uci.edu (Doug Schmidt)
writes:
>... GCC ... implements what you are referring to ....

It does.

>This seems pretty straight-forward to understand and implement, but
>might it present problems on some architectures?

No.  Fortran does the same thing (although it needs all but the
*last* dimension, rather than all but the first).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

mouse@mcgill-vision.UUCP (der Mouse) (05/10/89)

In article <4700036@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
> [ways of handling conformant 2-d arrays, one possible way being]

> - Pass to the function (in an argument called Px) the address
>   &(x [0] [0]), and then replace all references to x [i] [j] with
>   Px [n * i + j].  How portable is this latter scheme?  I can imagine
>   some brain-damaged compiler inserting a pad of unused storage
>   between rows of the array and defeating this attempt [...].

From the Revised Standard Bible (:-), page 204 (section A7.4.8):

	...the size of an array of n elements is n times the size of
	one element.

So, suppose we have

sometype x[m][n];

then sizeof(x) must be m*sizeof(x[0]).  Since x[0] is itself an array,
sizeof(x[0]) must be n*sizeof(x[0][0]).  We therefore can perform a
masterful feat of substitution and deduce that sizeof(x) is indeed
m*n*sizeof(x[0][0]), and that compilers are therefore not allowed to
insert (per-row) padding between one row and the next of a
multi-dimensional array.  The padding between x[0][n-1] and x[1][0]
must be exactly the same as the padding between x[0][0] and x[0][1].

The King James version (:-) contains no similar statement in the
analogous place; I didn't search the whole book.  Therefore, by
Murphy's law, there are compilers which do insert such padding, though
it's possible that restrictions elsewhere combine to similar effect.

However, consider a machine with a 36-bit addressing unit using 8-bit
chars, with the declaration "char foo[n][4]".  Is it permissible to put
each row in a separate word, wasting four bits per row and decreeing
that sizeof(36-bit-unit) is 4, with sizeof(char) being, of course, 1?
This appears superficially legal, but feels dubious - are 9-bit chars
the only way out?  (Sure, they're the only *sane* way out, I'm not
asking about that!)

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu