LAVITSKY@RU-BLUE.ARPA (08/09/84)
From: Eric <LAVITSKY@RU-BLUE.ARPA> Hi, I'm quite a novice to C, and would like to know if anyone could help me with the following: I need to allocate memory space for some buffers in the following manner: In the middle of my program I read in some parameters from a file which will determine the size of my buffer space. The parameters are two integers - and the buffer itself will contain integers. So, if I read in a value of 8192 from my file header - I need to then allocate my buffers to be exactly 8192 bytes of type integer. I will also be reading data into the buffers from this file. How do I use malloc() to allocate my buffer space ? Everything I read so far is very vague as to how this is used. I would like to read data into the array and allocate each byte of the array dynamically as it is needed. Something like: get buffer size; loop: allocate buffer byte; read data into buffer; untill buffer size; Thanx, Eric -------
bbanerje@sjuvax.UUCP (B. Banerjee) (08/18/84)
Hello, I have a similar question which I would appreciate any solutions to. I have a large (working) program involving a two dimensional array that I would like to allocate storage for, dynamically (calloc). I really don't want to go through and change all the array references to do pointer arithmetic. What I would like to do is something like this - datum **myarray; /* datum is typedef'ed */ myarray = calloc(num_rows * num_cols, sizeof(datum)); The trouble is, what should I cast calloc to in order that I may address myarray using indices? I thought of doing something like this : myarray = (datum **) calloc ( ....) However, the datum that "myarray[i][j]" would address is not clear to me. The situation is even murkier when using higher dimensioned arrays. I would be very grateful for any ( and all ) answers. Please reply by mail, as I doubt greatly that this is of general interest. Regards, -- Binayak Banerjee {allegra | astrovax | bpa | burdvax}!sjuvax!bbanerje P.S. Send Flames, I love mail.
gwyn@BRL-VLD.ARPA (08/19/84)
From: Doug Gwyn (VLD/VMB) <gwyn@BRL-VLD.ARPA> malloc() is most efficient if you allocate the needed space all at once, rather than a "byte" (actually, you seem to mean "integer datum") at a time. The simplest code to do what I think you want is: #include <stdio.h> ... typedef char *generic; /* generic pointer type, (void *) after ANSI C committee is done */ extern generic calloc(); /* layer around malloc() */ extern void free(); void Fatal(); /* prints error message & aborts */ typedef int datum; /* more often, a struct */ int ndata; /* # data records to allocate */ register datum *dp; /* -> allocated array */ register int i; /* index data array */ register FILE *fp; /* data file stream pointer */ /* <get buffer size into `ndata' somehow> */ if ( (dp = (datum *)calloc( ndata, sizeof(datum) )) == NULL ) Fatal( "ran out of memory" ); /* always check! */ /* <get data file open on stream `fp' somehow, e.g. fopen()> */ for ( i = 0; i < ndata; ++i ) if ( fread( (char *)&dp[i], (int)sizeof(datum), 1, fp ) != 1 ) Fatal( "error reading data" ); /* this is just an example; in this case you could have had: if ( fread( (char *)dp, (int)sizeof(datum), ndata, fp ) != ndata ) Fatal( "error reading data" ); ... and avoided the loop altogether */ /* <compute with data now in buffer, using dp[i] to access the i-th element of the array> */ free( (generic)dp ); /* "waste not, want not" */ calloc() is just a simple layer around malloc() to take care of multiplying the number of items by the size of each; it also fills the allocated storage full of 0s which makes debugging less messy.
chris@umcp-cs.UUCP (08/20/84)
(Warning: what follows is a long explanation of various subscripting phenomena.) From: bbanerje@sjuvax.UUCP (B. Banerjee) What I would like to do is something like this - datum **myarray; /* datum is typedef'ed */ myarray = calloc(num_rows * num_cols, sizeof(datum)); The trouble is, what should I cast calloc to in order that I may address myarray using indices? I thought of doing something like this: myarray = (datum **) calloc ( ....) However, the datum that "myarray[i][j]" would address is not clear to me. The situation is even murkier when using higher dimensioned arrays. (Tap tap tap) Quiet in the back row! There, that's better. Can everyone hear me? What? Is that better? Ok, good. Today we're going to talk about multi-dimensional arrays. We all understand single dimensional arrays, right? One simply has a large area of storage that is accessed by adding an offset (the subscript) to the start of the memory area and poking around in there. No problem. Well, in many compiled languages, like C for example [of course!], multi-dimensional arrays are really the same thing, only fancier. Think of it this way: suppose the array is declared as int a[20][50]; Then we can just multiply the two constants together and make a big block of integers (in this case 1000 of them). This is our storage area. To access one of these, we have two choices: we can use what's called ``row major'' order or ``column major order''. Suppose we're asked to fetch a[6][4]. (Let's assume we have zero-based arrays, so we go from a[0][0] to a[19][49].) We can either multiply the first subscript by the second subscript size and add the second subscript. This gives 6 times 50 plus 4 which is 304. So we add 304 to the storage area for ``a'' and grab an integer from there. This is the one called ``row major'' (I think). The other way we could do it is multiply the 4 times the 50 and add 6; that is, start with the second subscript and multiply it by the second subscript size, then add the first subscript. From here we do the same thing as before. So add 206 to ``a'' and fetch from there. If I haven't got them reversed this is ``column major'' order. Going back to the first, let's take a look at a three dimensional array. Given ``int a[4][6][5];'' as the declaration and ``a[2][3][4]'' as the reference, we take 2, multiply by six, add three, multiply the result by five, and add four, giving 79. We can generalize the first as follows: offset = 0; for (i = 0, j = num_dimens - 1; i < num_dimensions; i++, j--) { offset = offset + array_index[i]; offset = offset * array_sub[j]; } where num_dimens is the number of dimensions in the array, array_index[] contains the constants used in the declaration of the array, array_sub[] contains the values used in the reference, and offset computes the offset from the base of the storage area. Comment? What? Yes, I know it's not optimized C code. This is just an illustration. Also keep in mind that for a compiler the subscripts (array_sub[] elements) might well be variables or expressions, and the offset will have to be computed at RUN time. Everyone got that? Too bad, we're moving on anyway. All those multiplications done at run time can really slow things down. Wouldn't it be nice if there were some other way to get to the various array elements? Also, wouldn't it be nice not to have to have 60 columns for every row, when only one or two of them really need all of those? I see some smiles. You, you look like you want to say something. Did everyone hear that? No? He's right; he said ``pointers''. We can use pointers to weasel out of multiplications. Not only that but we can make different ``subscripts'' have different numbers of ``sub-subscripts''. Of course there's a price. We have to pay for this with storage and much head-scratching. Let's look at a ``two-dimensional'' ``array'' using pointers. Think of it this way: instead of having a big rectangular area of storage like this: a[0][0] a[0][1] a[0][2] a[1][0] a[1][1] a[1][2] we set up one vector like this: a[0] a[1] and make each point to another vector. Now we have something like this: a[0]===>[0] [1] [2] a[1]===>[0] [1] [2] That is, one two-element vector of pointers, and two three-element storage areas. We're using more space now, because we have, in addition to the six storage areas for values, two more for pointers. But, we can now access ``a'' without multiplies. If we need to generate code for ``a[1][0]'', we just add 1 to the vector of pointers and fetch that, then add 0 to that and fetch from there. In other words, follow the pointer for a[1] and take [0] from there. Also, we can set up the vectors like this instead: a[0]===>[0] [1] [2] a[1]===>[0] [1] [2] [3] [4] [5] [6] Since we aren't committed to ``rectangular'' storage areas, and in fact can put the data in a[0][] ``far away'' from the data in a[1][], we aren't limited to fixed sizes. Of course, it gets confusing if a[0] can access only [0] through [2] but a[1] can access [0] through [6]. We can make ``higher-dimensional'' ``arrays'' the same way. If ``a'' is supposed to be a five by ten by four array, we can set up five vectors pointing to ten vectors pointing to storage, like this: a[0]===>[0] [1] [2] ... [9] a[1]\ v v v v a[2]\\ [0] [0] [0] ... [0] a[3]\\\ [1] [1] [1] ... [1] a[4] |||[2] [2] [2] ... [2] | |||[3] [3] [3] ... [3] | ||| | ||\[0] [1] [2] ... [9] | || v v v v | || [0] [0] [0] ... [0] | || [1] [1] [1] ... [1] | || [2] [2] [2] ... [2] | || [3] [3] [3] ... [3] | || | |`=>. . . ... . | `==>. . . ... . | `====>[0] [1] [2] ... [9] v v v v [0] [0] [0] ... [0] [1] [1] [1] ... [1] [2] [2] [2] ... [2] [3] [3] [3] ... [3] (Ok, so I'm no artist.) This time, to access a[2][7][3], we add 2 to the a[] area, follow that pointer, add 7 to that area, follow that pointer, and add 3 to that area. Ok, so how to we go about building such a thing, in C? Suppose that you want to read two numbers and set up an array of that size. The array is to hold things of type ``datum'' (a typedef, for what we don't care). Here's such a routine: datum **a; int a_rows, a_cols; build_a () { int i; if (scanf ("%d %d", &a_rows, &a_cols) < 2) return (-1); a = (datum **) calloc (a_rows, sizeof (datum *)); if (a == 0) return (-1); for (i = 0; i < a_rows; i++) { a[i] = (datum *) alloc (a_cols, sizeof (datum)); if (a[i] == 0) return (-1); } return (0); } (Note the tests to make sure that ``calloc'' worked.) Rewriting this for three, four, or more dimensions is left as an exercise to the student. (In other words, I sure don't want to work that hard.) Ok, that's all for today. Quiz Thursday. Good night. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690 UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@maryland
jack@vu44.UUCP (Jack Jansen) (08/28/84)
Chris Torek: > Think of it this way: suppose the array is declared as > > int a[20][50]; > > Suppose we're asked to fetch a[6][4]. .... > > The other way we could do it is multiply the 4 times the 50 and add > 6; that is, start with the second subscript and multiply it by the > second subscript size, then add the first subscript. From here we > do the same thing as before. So add 206 to ``a'' and fetch from there. > If I haven't got them reversed this is ``column major'' order. Sorry Chris, but I think this is wrong. What you should do is multiply the second subscript by the *first* subscript size, and add the first subscript, or, in this example use 4*20+6, or 86. Jack Jansen, {philabs|decvax}!mcvax!vu44!jack