[comp.lang.c] Dynamic multidimensional arrays

karl@haddock.ISC.COM (Karl Heuer) (06/15/88)

In article <59@cubsun.BIO.COLUMBIA.EDU> shenkin@cubsun.UUCP (Peter Shenkin) writes:
>[What about]
>	float **matrix( int nrl, int nrh, int ncl, int nch )
>[and its N-dimensional generalization?]
>	float **array( int Ndim, 
>	  int n1l, int n1h, int n2l, int n2h, ..., int nNl, int nNh )

Since normal arrays in C are zero-based, I don't see any reason to specify the
lower bounds.

If you allocate the entire array (including space for the pointers as well as
the contents) in a single chunk, then you don't need all those free_array()
routines -- the standard free() will work.  I've written it this way.

The N-dimensional generalization is incorrectly typed.  Since the number of
levels of indirection is not known at compile-time, it should be "void *".

There's a problem with the lack of generality.  You suggest implementing these
in triplicate: arrays of integer, float, or double are allowed.  What if some
other type is desired, e.g. long int, or some random struct?  I'd prefer a
syntax that expects the type-name as an argument to the macro:
	float **mtrx = array2d(float, nr, nc);
Unfortunately, on some architectures the only way to implement this correctly
requires a (minor) hook in the compiler.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c only.

chpf127@ut-emx.UUCP (J. Eaton) (06/15/88)

In article <4556@haddock.ISC.COM> (Karl Heuer) writes:
> In article <59@cubsun.BIO.COLUMBIA.EDU> (Peter Shenkin) writes:
> >[What about]
> >	float **matrix( int nrl, int nrh, int ncl, int nch )
> >[and its N-dimensional generalization?]
> >	float **array( int Ndim, 
> >	  int n1l, int n1h, int n2l, int n2h, ..., int nNl, int nNh )
> 
> Since normal arrays in C are zero-based, I don't see any reason to
> specify the lower bounds.

   What if I decide I would like to have negative as well as positive
   (or zero) indices for arrays?

   Standard Fortran allows this.  Doesn't C?

> Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

   J. Eaton
   UT Dept of Chemical Engineering

   Walking Fortran 77 Lint (well, as close as there is around here anyway :-).

chris@mimsy.UUCP (Chris Torek) (06/15/88)

In article <3342@ut-emx.UUCP> chpf127@ut-emx.UUCP (J. Eaton) writes:
>What if I decide I would like to have negative as well as positive
>(or zero) indices for arrays?
>
>Standard Fortran allows this.  Doesn't C?

Well, yes and no.  You can portably use nonzero array origins if and
only if the zero point is contained within the array.  If not, you get
into some shady areas, where the code will work on most machines and in
most cases, but is not guaranteed.

To get an array of type `ty' that ranges from LOW to HIGH, where
LOW <= 0 and HIGH > 0, write

	ty actual_array[HIGH-LOW];
	#define array (&actual_array[-LOW])

	... work with array[k], where LOW <= k < HIGH ...

or

	ty *actual_ptr, *ptr;

	actual_pointer = (ty *)malloc((HIGH-LOW) * sizeof(ty));
	if (actual_pointer == NULL) ... no memory;

	ptr = actual_ptr - LOW;	/* nb. LOW nonpositive, so ptr>=actual_ptr */
	... work with ptr[k], where LOW <= k < HIGH ...
	free(actual_ptr);

	/*
	 * N.B.: this can be done with a single pointer.
	 * If cheating with a positive LOW, the result
	 * might happen to == NULL.
	 */

This code generally works even with LOW positive, but the address
computed for `ptr' above is outside of the address space returned by
malloc(); whether this causes a fault or other misbehaviour is
implementation-dependent.  If it happens to be equal to (ty *)NULL,
the runtime system might `helpfully' catch it.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

hg@mas1.UUCP (Heinrich Gantenbein) (06/16/88)

In article <4556@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <59@cubsun.BIO.COLUMBIA.EDU> shenkin@cubsun.UUCP (Peter Shenkin) writes:
>>[What about]
>>	float **matrix( int nrl, int nrh, int ncl, int nch )
>>[and its N-dimensional generalization?]
>>	float **array( int Ndim, 
>>	  int n1l, int n1h, int n2l, int n2h, ..., int nNl, int nNh )
>
>
>[stuff deleted]

There is an excellent article about dynamic allocated arrays in C
puplished in the June issue of Computer Language on page 83.
(Volume 5, Number 6)

Heinrich Gantenbein
{...}pyramid!voder!mas1!hg
{...}codas!mas1!hg

jjr@ut-emx.UUCP (Jeff Rodriguez) (06/16/88)

In article <4556@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>If you allocate the entire array (including space for the pointers as well as
>the contents) in a single chunk, then you don't need all those free_array()
>routines -- the standard free() will work.  I've written it this way.

Quite true.  Another benefit is that the one can use one call to fread()
to read a binary image from a file directly into one of these arrays.
If each row is allocated with a separate call to malloc(), then the
resulting array must be filled one row at a time.

			Jeff Rodriguez
			jjr@emx.utexas.edu

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/16/88)

In article <3342@ut-emx.UUCP> chpf127@ut-emx.UUCP (J. Eaton) writes:
>   What if I decide I would like to have negative as well as positive
>   (or zero) indices for arrays?
>   Standard Fortran allows this.  Doesn't C?

Sure it does, so long as there is a valid object at the referenced
location.  In fact I'm just about to change some application code's
dynamic array allocation so that [-1] will be a valid offset.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/16/88)

In article <3365@ut-emx.UUCP> jjr@ut-emx.UUCP (Jeff Rodriguez) writes:
>In article <4556@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>If you allocate the entire array (including space for the pointers as well as
>>the contents) in a single chunk, then you don't need all those free_array()
>>routines -- the standard free() will work.  I've written it this way.
>
>Quite true.  Another benefit is that the one can use one call to fread()
>to read a binary image from a file directly into one of these arrays.
>If each row is allocated with a separate call to malloc(), then the
>resulting array must be filled one row at a time.

Karl's point is that the "row vector" can be contiguous with the MxN data
area.  That is indeed a nice idea, and I think I'll change some of our
applications to do this.

fread() can fill the data area, but then people like me who've been
allocating separate row vector and MxN data areas could have done that
anyway.

Separate allocations for each row really would be pretty dumb, unless
several rows might be unneeded (I have an application like that, too).

chris@mimsy.UUCP (Chris Torek) (06/16/88)

>>In article <4556@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer)
>>writes:
>>>If you allocate the entire array (including space for the pointers as well as
>>>the contents) in a single chunk, then you don't need all those free_array()
>>>routines -- the standard free() will work.  I've written it this way.

The problem with this is alignment.

	size_t i, n_rows, n_cols;
	void *p;
	void *malloc(size_t size);
	ty *data_area, **row_area;

	p = malloc(n_rows * sizeof(ty *) + n_rows * n_cols * sizeof (ty));
	if (p == NULL) ...
	row_area = p;
	data_area = (ty *)(row_area + n_rows);
	for (i = 0; i < n_rows; i++)
		row_area[i] = data_area + i * n_cols;

This allocates the row vectors and the array, and sets up the row
vectors to point to the rows.  But if `ty' is `double', there
is no guarantee that *data_area will not trap---for instance,
if n_rows is odd and sizeof(double *)==4 and sizeof(double)==8
and doubles must be aligned on an eight-byte boundary.

In article <8099@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
>Karl's point is that the "row vector" can be contiguous with the MxN data
>area.  That is indeed a nice idea, and I think I'll change some of our
>applications to do this.

To do it you need to be able to align `data_area' above, which is
why we need some sort of standard <align.h> header.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

karl@haddock.ISC.COM (Karl Heuer) (06/18/88)

In article <11989@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>>In article <4556@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer)
>>>writes [that by using one big malloc for both the row vectors and the data
>>>area makes it simpler to deallocate the thing]
>
>The problem with this is alignment.

This is correct; if there are objects with stricter alignment than pointers,
you may need to insert a shim.  On any individual machine this is easy, but
without an <align.h> there's no absolute guarantee.  (But I'd bet aligning to
sizeof(double) would work on all current implementations.  This can be done
without mucking around with the pointer format, since the base pointer
returned by malloc() is already aligned.)

In any case, the use is portable even if the implementation details are not.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/18/88)

In article <11989@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>To do it you need to be able to align `data_area' above,

True.

>which is why we need some sort of standard <align.h> header.

Nah, the application can do this itself (it might be overly
conservative, but the amount of wasted space would be small).
(Hint: use a union.)