phipps@fortune.UUCP (Clay Phipps) (08/22/84)
Lines beginning "!? ": Chris Torek (does he really want to be named ?) Lines beginning " ": Clay [editing indicated by -- CP] !? let's take a look at a three dimensional [row major -- CP] array. !? Given ``int a[4][6][5];'' as the declaration !? and ``a[2][3][4]'' as the reference, !? we take 2, multiply by six, add three, !? multiply the result by five, and add four, giving 79. The above calculations are correct. !? We can generalize the first as follows: !? !? offset = 0; !? for (i = 0, j = num_dimens - 1; i < num_dimensions; i++, j--) { !? offset = offset + array_index[i]; !? offset = offset * array_sub[j]; !? } !? ... I know it's not optimized C code. This is just an illustration. !? ... I see some smiles. You, you look like you want to say something. Yes, teacher, your program fragment is just plain WRONG. It gives an incorrect answer, 142, instead of the correct answer, 79, that was worked out above. !? where !? num_dimens is the number of dimensions in the array, !? array_index[] contains the constants !? used in the declaration of the array, The text above does not describe array "indexes" (or "indices"), but it does describe the "size" (K&R p. 93) of the dimensions. To most people, apparently including K&R, an "index" is synonymous with a "subscript". This counter-mnemonic name was a continual source of confusion while I was trying to work through some examples. Perhaps this even confused the teacher. !? array_sub[] contains the values used in the reference, and !? offset computes the offset from the base of the storage area. Assuming that you have followed C style, and that the above compiler arrays subscripts 0, 1, 2 correspond to the compiled source code subscripts written left to right, the following is correct for row major arrays in C: offset = array_sub[0]; for (i = 1; i < num_dimens; i++) { offset = offset * array_size[i]; offset = offset + array_sub[i]; } where array_size[] is what Chris unfortunately named "array_index"; it contains the constants used in the declaration of the array. A better program fragment would be the following: offset = array_sub[0]; for (i = 1; i < num_dimens; i++) { offset = offset * array_size[i] + array_sub[i]; } Note that the code above takes advantage of the fact that C array declarations, like old-style FORTRAN (IV, 66), specify a size, instead of lower and upper bounds. With other languages like PL/I, Pascal, and Full FORTRAN 77, the dimension size t be calculated, usually at compilation time. !? Ok, that's all for today. Quiz Thursday. Good night. I think you had better review Gries, p. 178 (Pratt isn't much help; don't know about the Dragon Book) before you spring it on us netlanders. -- Clay Phipps -- { amd hplabs!hpda sri-unix ucbvax!amd } !fortune!phipps { ihnp4 cbosgd decvax!decwrl!amd harpo allegra}
chris@umcp-cs.UUCP (Chris Torek) (08/24/84)
Oops. Clay Phipps' corrections are definitely in order. For some reason when I wrote that I thought of the indicies being stored in the sub[] guys in backwards order. I never did specify that (obviously I wasn't thinking clearly there). Oh well, at least I got the pointers part right. (Well, that is, I *assume* I got it right, or someone would have pointed out any errors there, too, right? :-) ) ``array_index'' *was* a lousy name, wasn't it? By the way, (just to get revenge (-: ) this wasn't quite precise: > Note that the code above takes advantage of the fact > that C array declarations, like old-style FORTRAN (IV, 66), > specify a size, instead of lower and upper bounds. > With other languages like PL/I, Pascal, and Full FORTRAN 77, > the dimension size t be calculated, usually at compilation time. Actually, in the case of FORTRAN IV and FORTRAN 66, you'd also have to subtract one from the given indicies, since the legal index values are 1 to ``size'' instead of C's 0 to ``size''-1. In languages with compiler support for arbitrary subscript ranges (in F77, stuff like ``INTEGER I(-10:10)'') the lower bounds have to be saved, too, along with the sizes (or the upper bounds). If the compiler is to do bounds checking, then the upper bounds are required (and it's up to the compiler writer to decide whether to compute the sizes once and save them, or once for each array reference). A complete row-major order implementation (that worked only with constants for subscripts) might look like this: /* a single array dimension, i.e., the -10:10 part of an F77 array */ struct arydim { int ad_low; /* lower bound */ int ad_high; /* upper bound */ int ad_size; /* high-low+1 */ }; #define MAXDIM 7 /* good enough for FORTRAN anyway */ /* or have they changed it again? */ /* an array */ struct ary_descriptor { char *a_name; /* name of the array, for the assembler */ int a_ndims; /* number of dimensions */ struct arydim a_dim[MAXDIM];/* the dimensions themselves */ }; /* Compute the offset in the array described by `d' given the indicies in `sub'. The number of subscripts has already been verified. The subscripts are stored such that sub[0] == leftmost, sub[MAXDIM-1] == rightmost possible. */ int compute_offset (d, sub) struct ary_descriptor *d; int sub[MAXDIM]; { int i, off, t; off = 0; for (i = 0; i < d -> a_ndims; i++) { off *= d -> a_dim[i].ad_size; t = sub[i]; if (t < d->a_dim[i].ad_low || t > d->a_dim[i].ad_high) uerror("array index out of range"); off += t - d -> a_dim[i].ad_low; } return off; } One can generalize even further, to arrays whose sizes are computed at run time, to extensible arrays, and so forth. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690 UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@maryland