[mod.sources] run-time memory allocation for multi-dimensional arrays

sources-request@genrad.UUCP (08/13/85)

Mod.sources:  Volume 2, Issue 36
Submitted by: decvax!allegra!phri!roy (Roy Smith

	A while ago, there was a discussion (in net.unix-wizards?) about
how to get dynamic memory allocation (using malloc() and friends).  If I
remember correctly the answer was that you couldn't, for an assortment of
reasons.  Well, I didn't follow those discussions too closely because they
didn't seem interesting at the time.

	However, I recently ran accross a need for just such a thing so I
sat down and wrote what I think is a nifty solution to the problem which
goes back to the old Fortran days -- vectored arrays.  It turns out it
isn't hard at all (the documentation is several times longer than the code,
which perhaps is the way it should be with everything), and as far as I can
tell, is portable.

	I apologize to those of you Sys5 types who don't have the -me
macros to properly print out the documentation.  The man page uses the
regular -man macros, so that should work.  As for the accompanying paper,
do the best you can.  I suppose I could have submitted the already
formatted version, but that seemed rather tacky.  If enough people ask,
however, I'll do so when I post the (inevitable) bug fixes.

[ moderator's note:  I added the a formatted copy of the paper to
  the distribution to avoid the problem  - John Nelson, moderator ]

	Anyway, here it is.  Just read this into your favorite editor, cut
on the dotted line and ... Oh hell, you know what to do.  Have fun, write
if you get the chance, and remember to say hello to everybody for me.

Roy Smith <allegra!phri!roy>
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

---------there be monsters below this line-----------

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	Makefile
#	Read_me
#	arrayalloc.3
#	arrayalloc.c
#	arrayalloc.me
#	arrayalloc.txt
#	test.c
# This archive created: Tue Aug 13 07:55:16 1985
export PATH; PATH=/bin:$PATH
echo shar: extracting "'Makefile'" '(675 characters)'
if test -f 'Makefile'
then
	echo shar: will not over-write existing file "'Makefile'"
else
sed 's/^X//' << \SHAR_EOF > 'Makefile'
#
# Makefile for array allocation routines
# $Header: Makefile,v 1.3 85/08/12 17:27:17 roy Rel $
#
# $Log:	Makefile,v $
# Revision 1.3  85/08/12  17:27:17  roy
# Added "Makefile" to ${FILES}.
# 
# Revision 1.2  85/08/12  17:07:40  roy
# Fixed "make shar"
# 
# Revision 1.1  85/08/12  16:56:48  roy
# Initial revision
# 
#

XSRCS =	test.c arrayalloc.c
DOCS =	arrayalloc.me arrayalloc.3 Read_me
XFILES =	${SRCS} ${DOCS} Makefile

CFLAGS = -g

test:		test.o arrayalloc.o
		cc -o test test.o arrayalloc.o

shar:		${FILES}
		shar -c -v ${FILES} > arrayalloc.shar

lint:		${SRCS}
		lint ${SRCS}

clean:		.
		rm core *.o test *~*

rcs:		RCS
		for file in ${FILES}; do co $$file; done
SHAR_EOF
if test 675 -ne "`wc -c < 'Makefile'`"
then
	echo shar: error transmitting "'Makefile'" '(should have been 675 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'Read_me'" '(1856 characters)'
if test -f 'Read_me'
then
	echo shar: will not over-write existing file "'Read_me'"
else
sed 's/^X//' << \SHAR_EOF > 'Read_me'
Read_me: $Header: Read_me,v 1.3 85/08/12 17:11:36 roy Rel $

	This little package allows you to use run-time memory allocation
for 2-dimensional arrays in a C program.  It works under 4.2bsd, but has
not been tested under other systems, so I can't guarantee that it works
under Sys5 or whatever.  Come to think of it, I can't guarantee that it
even works under 4.2bsd, but as far as I can tell, there aren't any
problems.

	Don't worry about the RCS stuff in the Makefile, I'm not
distributing the RCS files, just the (so called) working files.  All you
should have to do to get this running is un-shar it and utter "make test".
Once you have satisfied yourself that it works, add arrayalloc.o to the
appropriate library.  Probably /lib/libc.a is the right place, but you may
prefer to put it in a local library; in fact, this may indeed be the wise
thing to do.  Hey, if some random gave me some funny looking code, I
wouldn't jump to stick it in *my* system library until I was sure I trusted
it.  When you run the test program, you should get output which looks like
this:

	0	1	2	3	4
	10	11	12	13	14
	20	21	22	23	24
	30	31	32	33	34
	40	41	42	43	44

	Of course, if you find any bugs, please let me know so I can fix
things up.  This passes lint without any problems other than complaining
about the rscid header lines; if you can figure out how to make lint shut
up about that, tell me.  Actually, if you run lint with some of the more
stringent checking turned on (-hc for example), it *does* get upset about
most of the type casting, but I don't think it's anything to worry about.
If you find that the type casting doesn't work on your machine, tell me;
I'm not really a maven at these sorts of things.  In the meantime, enjoy.

Roy Smith <allegra!phri!roy>
XSystem Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
SHAR_EOF
if test 1856 -ne "`wc -c < 'Read_me'`"
then
	echo shar: error transmitting "'Read_me'" '(should have been 1856 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'arrayalloc.3'" '(1667 characters)'
if test -f 'arrayalloc.3'
then
	echo shar: will not over-write existing file "'arrayalloc.3'"
else
sed 's/^X//' << \SHAR_EOF > 'arrayalloc.3'
X.\"
X.\" $Header: arrayalloc.3,v 1.1 85/08/12 16:56:20 roy Rel $
X.\"
X.\" $Log:	arrayalloc.3,v $
Revision 1.1  85/08/12  16:56:20  roy
Initial revision

X.\"
X.TH ARRAYALLOC 3 "12 August 1985 (PHRI)"
X.SH NAME
arrayalloc, arrayfree \- 2-d array run-time memory allocation 
X.SH SYNOPSIS
X.B v = arrayalloc (imax, jmax, size)
X.br
X.B char **v, **arrayalloc();
X.br
X.B unsigned imax, jmax, size;
X.sp
X.B (void) arrayfree (v)
X.br
X.B char **v;
X.SH DESCRIPTION
Arrayalloc (imax, jmax, size) returns a pointer which can be thought of as
pointing to a 2-dimensional array with imax rows and jmax columns, each
element of which is size bytes long.  In reality, the pointer points to the
address vector of a vectored array, but after being cast to the proper
type, you can access the array with the familiar syntax v[i][j], just as if
it were a true multi-dimensional array.
X.SH SEE ALSO
Run Time Memory Allocation for Multi-Dimensional Arrays in C.
X.SH DIAGNOSTICS
If sufficient memory is not available for either the main array or the
intermediate address vector, NULL is returned.
X.SH BUGS
I haven't found any yet, but I wouldn't be too surprised if you do.
In particular, this has only been tested on a Vax-11/750 running 4.2bsd Unix.
That does not mean it's going to work on your Widget-9000 running Barfix, but
I don't see any reason why it shouldn't.
X.SH AUTHOR
Roy Smith, Public Health Research Institue <allegra!phri!roy>.
X.SH ACKNOWLEDGEMENTS
This software was developed with funding from various federal
research grants, notably AI 09049 from the National Institutes of Health
and PCM 83-13516 from the National Science Foundation.  Their support is
gratefully acknowledged.
SHAR_EOF
if test 1667 -ne "`wc -c < 'arrayalloc.3'`"
then
	echo shar: error transmitting "'arrayalloc.3'" '(should have been 1667 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'arrayalloc.c'" '(2312 characters)'
if test -f 'arrayalloc.c'
then
	echo shar: will not over-write existing file "'arrayalloc.c'"
else
sed 's/^X//' << \SHAR_EOF > 'arrayalloc.c'
/*
 * Arrayalloc.c -- routines to provide dynamic memory allocation for 2
 * dimensional arrays.  The idea is to implement vectored arrays in such a
 * way that they are compatible with normal multi-dimension arrays as far
 * as syntax goes.
 *
 * $Log:	arrayalloc.c,v $
 * Revision 1.4  85/08/12  12:36:27  roy
 * Random commenting and de-lintifying.
 * 
 * Revision 1.3  85/08/12  12:12:51  roy
 * Added random comments in preperation for distribution.
 * 
 * Revision 1.2  85/08/06  18:30:27  roy
 * Added debugging statments in arrayalloc().
 * 
 * Revision 1.1  85/08/05  18:45:20  roy
 * Initial revision
 * 
 */

# include <stdio.h>

static char *rcsid = "$Header: arrayalloc.c,v 1.4 85/08/12 12:36:27 roy Rel $";

/*
 * Arrayalloc () -- allocate an imax by jmax vectored array of "size" byte
 * elements.  If memory can't be allocated, either for the main array or for
 * the row address vector, we return NULL.  See accompanying documentation
 * for more details.
 */
char **arrayalloc (imax, jmax, size)
unsigned imax, jmax, size;
{
	char *malloc();
	register char **vector, *array;
	register int k, stride;

	/*
	 * Get memory for main array.
	 */
	if ((array = malloc (imax * jmax * size)) == NULL)
		return (NULL);

# ifdef DEBUG
	printf ("array = %x\n", array);
# endif

	/*
	 * Get memory for intermediate row address vector.
	 */
	if ((vector = (char **) malloc (imax * sizeof (char *))) == NULL)
		return (NULL);

	/*
	 * Initialize the address vector so each element points to the
	 * first element in the corresponding row in the main array.
	 */
	for (k = 0; k < imax; k++)
	{
		stride = jmax * size;
		vector [k] = &array [k*stride];
# ifdef DEBUG
		printf ("vector [%d] = %x\n", k, vector[k]);
# endif
	}

	return (vector);
}

/*
 * Arrayfree () -- free the memory acquired from arrayalloc ().  No checks
 * are made to make sure things are as they should be, so it is the user's
 * responsibility to make sure that you don't arrayfree() anything that you
 * didn't arrayalloc() in the first place.  Eventually, checks will be added
 * to make sure the user hasn't screwed things up.  We have to first free the
 * real array memory, and then free the intermediate vector.  This sounds
 * more complicated than it really is.
 */
arrayfree (v)
char **v;
{
	free (v[0]);
	free ((char *) v);
}
SHAR_EOF
if test 2312 -ne "`wc -c < 'arrayalloc.c'`"
then
	echo shar: error transmitting "'arrayalloc.c'" '(should have been 2312 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'arrayalloc.me'" '(5356 characters)'
if test -f 'arrayalloc.me'
then
	echo shar: will not over-write existing file "'arrayalloc.me'"
else
sed 's/^X//' << \SHAR_EOF > 'arrayalloc.me'
X.\" arrayalloc.me -- arrayalloc and arrayfree documentation
X.\" $Header: arrayalloc.me,v 1.5 85/08/12 19:38:55 roy Rel $
X.\"
X.\" $Log:	arrayalloc.me,v $
\" Revision 1.5  85/08/12  19:38:55  roy
\" fixed problems with formatting of abstract.  Why do I have to do this
\" stuff right before I'm ready to send it out???
\" 
\" Revision 1.4  85/08/12  16:57:11  roy
\" Added abstract
\" 
\" Revision 1.3  85/08/12  11:58:34  roy
\" Changed name of file from vectors.me to arrayalloc.me
\" 
\" Revision 1.2  85/08/06  13:57:18  roy
\" Diddled with formatting of Figure 1.
\" 
\" Revision 1.1  85/08/05  18:47:46  roy
\" Initial revision
\" 
X.\"
X.ll 6.5i
X.ce
X.ul
Run Time Memory Allocation for Multi-Dimensional Arrays in C.
X.sp
X.(q
X.ce
X.ul
Abstract
X.sp
Using the standard facilities available, there is no easy way to have
2-dimensional arrays in C with storage allocated at run time.
This paper describes a simple to use, higher level interface to the standard
library memory allocation routines which makes run time allocation of
2-dimensional arrays straight forward.
The syntax for accessing these arrays is identical to that used for
"regular" 2-dimensional arrays using compile time memory allocation.
X.)q
X.pp
Due to the way that C treats arrays and pointers, the following two
fragments are interchangeable in many applications.
X.(b L
X.(c
X.sp
int i, *v;				int i, v[10];
v = malloc (10 * sizeof (int));		i = v[0];
i = v[0];
X.sp
X.)c
X.)b
Once the memory is allocated (either at run time as in the first example,
or at compile time as in the second), the syntax used to refer to an
element in the array is the same.
X.pp
Unfortunately, this dynamic memory allocation scheme does not extend easily
to multi-dimensional arrays.  To resolve a reference of the form
X.rb a[i][j] ,
where
X.rb a
has been declared as
X.rb "int a[Imax][Jmax]"
you effectively pretend the declaration was
X.rb "int a[Imax*Jmax]"
and turn the reference into
X.rb a[Imax*i+j] .
Given the way C deals with multi-dimensional arrays, this implies that Imax
must be known at compile time.  Thus, you cannot directly use the standard
dynamic memory allocators for run-time sizing of multi-dimensional arrays.
X.pp
The alternative is to use vectored arrays, where instead of performing the
subscript multiplication, you pre-compute the addresses of all the rows in
the array, store them, and look them up as needed (see figure 1).  Now,
instead of
X.rb a[imax*i+j] ,
you have
X.rb a[x[i]+j] ,
where
X.rb x
is the intermediate row address lookup table.  Fortunately, the C syntax
for dealing with arrays and pointers makes this type of data structure
relatively painless to use once the initial address vector is constructed.
As in the example above for one-dimensional arrays, the written expression
for a true two-dimensional array is identical for that for the vectored
array version.  Of course, the scheme can be extended to handle
multi-dimensional array of order higher than 2; the details are left as an
exercise for the reader.
X.(z
X.(c
X.sp 2
   +----------+             +----------+             +-----------+
   |    a     | ----------> |   a[0]   | ----------> |  a[0][0]  |
   +----------+             +----------+             +-----------+
                            |   a[1]   | ----+       |  a[0][1]  |
                            +----------+     |       +-----------+
                                             |       |  a[0][2]  |
                                             |       +-----------+
                                             +-----> |  a[1][0]  |
                                                     +-----------+
                                                     |  a[1][1]  |
                                                     +-----------+
                                                     |  a[1][2]  |
                                                     +-----------+
X.sp
XFigure 1.  A vectored array, \fBa[2][3]\fP.
X.sp 2
X.)c
X.)z
X.pp
All that remains is a way to set up the proper intermediate address vector.
The routine
X.rb arrayalloc()
does just this (for arrays of order 2).  You give it the the number of rows
and columns and the size of each element; it allocates the needed memory
and returns a pointer to the properly initialized address vector.  After
casting to the proper type, this pointer can be used to access the array in
the normal fashion.  Note that the while the syntax is similar to the
familiar
X.rb char **argv
used to transmit program arguments as an array of strings, the intermediate
address vector
X.ul
is not
a null terminated list.  The above examples have assumed
X.rb int 's
but there is no reason why this must be so.  Array of other types, even
derived types such as structures, will work just as well.
X.pp
As with everything else in life, there are advantages and disadvantages to
vectored arrays.  On the plus side, it is usually faster to perform a
pointer indirection than a multiplication.  I'm sure there is a machine out
there somewhere which can do integer multiplication (perhaps by a power of
2) faster than it can do an indirect memory reference, but I've never seen
one.  On the minus side, you need more memory because you have to store the
intermediate address vector somewhere.  You can reduce this overhead
somewhat for non-square arrays (i.e. where Imax != Jmax) by making Imax the
smaller dimension.
SHAR_EOF
if test 5356 -ne "`wc -c < 'arrayalloc.me'`"
then
	echo shar: error transmitting "'arrayalloc.me'" '(should have been 5356 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'arrayalloc.txt'" '(4852 characters)'
if test -f 'arrayalloc.txt'
then
	echo shar: will not over-write existing file "'arrayalloc.txt'"
else
sed 's/^X//' << \SHAR_EOF > 'arrayalloc.txt'







  Run Time Memory Allocation for Multi-Dimensional Arrays in C.


                            Abstract

    Using the standard facilities available, there is no easy
    way  to have 2-dimensional arrays in C with storage allo-
    cated at run time.  This paper describes a simple to use,
    higher level interface to the standard library memory al-
    location routines which makes run time allocation  of  2-
    dimensional  arrays straight forward.  The syntax for ac-
    cessing these arrays is identical to that used for "regu-
    lar" 2-dimensional arrays using compile time memory allo-
    cation.


     Due to the way that C treats arrays and pointers,  the  fol-
lowing two fragments are interchangeable in many applications.


      int i, *v;                              int i, v[10];
      v = malloc (10 * sizeof (int));         i = v[0];
      i = v[0];


Once the memory is allocated (either at run time as in the  first
example, or at compile time as in the second), the syntax used to
refer to an element in the array is the same.

     Unfortunately, this dynamic memory  allocation  scheme  does
not  extend  easily  to  multi-dimensional  arrays.  To resolve a
reference of the form a[i][j], where a has been declared  as  int
a[Imax][Jmax]  you  effectively  pretend  the declaration was int
a[Imax*Jmax] and turn the reference into a[Imax*i+j].  Given  the
way C deals with multi-dimensional arrays, this implies that Imax
must be known at compile time.  Thus, you cannot directly use the
standard  dynamic memory allocators for run-time sizing of multi-
dimensional arrays.

     The alternative is to use vectored arrays, where instead  of
performing  the  subscript  multiplication,  you  pre-compute the
addresses of all the rows in the array, store them, and look them
up  as  needed  (see figure 1).  Now, instead of a[imax*i+j], you
have a[x[i]+j], where x is the intermediate  row  address  lookup
table.   Fortunately,  the  C  syntax for dealing with arrays and
pointers makes this type of data structure relatively painless to
use  once  the  initial address vector is constructed.  As in the
example above for one-dimensional arrays, the written  expression
for  a  true  two-dimensional array is identical for that for the
vectored array version.  Of course, the scheme can be extended to
handle  multi-dimensional  array  of  order  higher  than  2; the
details are left as an exercise for the reader.
















   +----------+             +----------+             +-----------+
   |    a     | ----------> |   a[0]   | ----------> |  a[0][0]  |
   +----------+             +----------+             +-----------+
                            |   a[1]   | ----+       |  a[0][1]  |
                            +----------+     |       +-----------+
                                             |       |  a[0][2]  |
                                             |       +-----------+
                                             +-----> |  a[1][0]  |
                                                     +-----------+
                                                     |  a[1][1]  |
                                                     +-----------+
                                                     |  a[1][2]  |
                                                     +-----------+

XFigure 1.  A vectored array, a[2][3].



     All that remains is a way to set up the proper  intermediate
address  vector.   The  routine  arrayalloc() does just this (for
arrays of order 2).  You give it  the  the  number  of  rows  and
columns  and  the  size  of each element; it allocates the needed
memory and returns a pointer to the properly initialized  address
vector.   After  casting  to the proper type, this pointer can be
used to access the array in the normal fashion.   Note  that  the
while  the  syntax  is similar to the familiar char**argv used to
transmit program arguments as an array of strings, the intermedi-
ate  address  vector  is  not  a null terminated list.  The above
examples have assumed int's but there is no reason why this  must
be  so.   Array of other types, even derived types such as struc-
tures, will work just as well.

     As with everything else in life, there  are  advantages  and
disadvantages  to  vectored arrays.  On the plus side, it is usu-
ally faster to perform a pointer indirection than  a  multiplica-
tion.   I'm sure there is a machine out there somewhere which can
do integer multiplication (perhaps by a power of 2)  faster  than
it  can do an indirect memory reference, but I've never seen one.
On the minus side, you need more memory because you have to store
the  intermediate  address vector somewhere.  You can reduce this
overhead somewhat for non-square arrays (i.e. where Imax != Jmax)
by making Imax the smaller dimension.













SHAR_EOF
if test 4852 -ne "`wc -c < 'arrayalloc.txt'`"
then
	echo shar: error transmitting "'arrayalloc.txt'" '(should have been 4852 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'test.c'" '(1221 characters)'
if test -f 'test.c'
then
	echo shar: will not over-write existing file "'test.c'"
else
sed 's/^X//' << \SHAR_EOF > 'test.c'
/*
 * Arraytest.c -- driver program to exercise array allocation routine.
 * All this does is get an array, fill it with numbers, print it all
 * again, and then free the array.  This is as much a demo of how to use
 * the routines as a test of how well they work.
 *
 * $Log:	test.c,v $
 * Revision 1.3  85/08/12  19:21:06  roy
 * Fixed trivial little typo in comment.  Of course, I only noticed the typo
 * *after* I had already checked out the last version, marked it for release,
 * built the sharchive, and was all ready to mail it off.  Grrrrrrrr.....
 * 
 * Revision 1.2  85/08/12  17:23:58  roy
 * fixed botch in the " $ H e a d e r $ " line.
 * 
 * Revision 1.1  85/08/12  12:37:16  roy
 * Initial revision
 * 
 */

static char *rcsid = "$Header: test.c,v 1.3 85/08/12 19:21:06 roy Rel $";

# include <stdio.h>
# define MAX	5
main ()
{
	char **arrayalloc();
	int **x, i, j;

	if ((x = (int **) arrayalloc (MAX, MAX, sizeof (**x))) == NULL)
	{
		perror ("error in makearray: ");
		exit (1);
	}

	for (i = 0; i < MAX; i++)
		for (j = 0; j < MAX; j++)
			x[i][j] = 10*i + j;

	for (i = 0; i < MAX; i++)
		for (j = 0; j < MAX; j++)
			printf ("%d%c", x[i][j], j == MAX-1 ? '\n' : '\t');

	arrayfree ((char **) x);
}
SHAR_EOF
if test 1221 -ne "`wc -c < 'test.c'`"
then
	echo shar: error transmitting "'test.c'" '(should have been 1221 characters)'
fi
fi # end of overwriting check
#	End of shell archive
exit 0