ajayshah@almaak.usc.edu (Ajay Shah) (04/16/91)
Consider this fragment of C code (from Numerical Recipes): 1 double *dvector(nl,nh) 2 int nl,nh; 3 { 4 double *v; 5 6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); 7 if (!v) nrerror("allocation failure in dvector()"); 8 return v-nl; 9 } It's supposed to be a function which allocates a vector of doubles. My interpretation of nl and nh is: they're array indexes. If you want to allocate an array going from 5 to 10, you would say p = dvector(5, 10). Question: what is happening on line 8? Why is he not just returning v (a pointer)? What is the meaning of subtracting nl (an int) from v without any casting? Thanks! -ans. -- _______________________________________________________________________________ Ajay Shah, (213)734-3930, ajayshah@usc.edu The more things change, the more they stay insane. _______________________________________________________________________________
neufeld@aurora.physics.utoronto.ca (Christopher Neufeld) (04/16/91)
In article <31969@usc> ajayshah@almaak.usc.edu (Ajay Shah) writes: > >1 double *dvector(nl,nh) >2 int nl,nh; >3 { >4 double *v; >5 >6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); >7 if (!v) nrerror("allocation failure in dvector()"); >8 return v-nl; >9 } > >It's supposed to be a function which allocates a vector of >doubles. My interpretation of nl and nh is: they're array >indexes. If you want to allocate an array going from 5 to 10, >you would say p = dvector(5, 10). > Yes, that's right. Memory is allocated for p[5] to p[10]. Memory outside these points is not yours, and will usually crash the machine if overwritten. >Question: what is happening on line 8? Why is he not just >returning v (a pointer)? What is the meaning of subtracting nl >(an int) from v without any casting? > Subtracting or adding an integer to a pointer has a well-defined meaning. Consider: my_typedef *p1, *p2; where 'my_typedef' is any arbitrary data structure. If one defines: p2 = p1 - n; /* 'n' is an integer */ Then the pointer 'p2' points in memory to a location earlier than 'p1' in memory by an amount n * sizeof(my_typedef) OK, this looks pointless so far, since that memory might not be reserved, or might be reserved for something completely different, like the integer variable 'n'. Here's the fun part. When the compiler sees p1[j] it evaluates it as *(p1+j). The addition of the integer 'j' means the same thing it meant above, it advances in memory by an amount j * sizeof(*p1) So, an example. If you have a conventionally defined array, and a pointer, such as: float arr[3], *myvec; and I assign: myvec = arr - 1; then the following are true: myvec[0] is undefined and dangerous to use, as you step on memory. myvec[1] == arr[0] myvec[2] == arr[1] myvec[3] == arr[2] arr[3] is undefined and dangerous to use. To answer your question, the code fragment you asked about creates a pointer 'v' to the beginning of an array whose indices lie between '0' and 'nh-nl' inclusive. The 'return v-nl;' instruction returns a pointer whose [nl] entry is the [0] entry of the 'v' just defined. I should mention that there is one exception that I know of, and probably a few other people can provide. The rule is less simple for multi-dimensional arrays defined in the conventional manner: float arr[N1][N2][N3]; Now, when the compiler sees arr[i][j][k] it evaluates: *(arr + ((i * N2) + j) * N3 +k) However, if I have: float ***myarr; then when the compiler sees myarr[i][j][k] it evaluates: *(*(*(myarr+i)+j)+k) Note that this definition takes less CPU usually, and eats a bit more memory. It's also a bit easier to use in passing to functions because the function doesn't have to be told the dimensions 'N2' and 'N3', which it obviously needs for the first definition. -- Christopher Neufeld....Just a graduate student | Flash: morning star seen neufeld@aurora.physics.utoronto.ca Ad astra! | in evening! Baffled cneufeld@{pnet91,pro-cco}.cts.com | astronomers: "could mean "Don't edit reality for the sake of simplicity" | second coming of Elvis!"
heinhuis@dri.nl (Gustaaf-Jan Heinhuis) (04/16/91)
In article <31969@usc> ajayshah@almaak.usc.edu (Ajay Shah) writes: >Consider this fragment of C code (from Numerical Recipes): > >1 double *dvector(nl,nh) >2 int nl,nh; >3 { >4 double *v; >5 >6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); >7 if (!v) nrerror("allocation failure in dvector()"); >8 return v-nl; >9 } > >It's supposed to be a function which allocates a vector of >doubles. My interpretation of nl and nh is: they're array >indexes. If you want to allocate an array going from 5 to 10, >you would say p = dvector(5, 10). > >Question: what is happening on line 8? Why is he not just >returning v (a pointer)? What is the meaning of subtracting nl >(an int) from v without any casting? > >Thanks! > Well, I know why he substracts nl. The fact that you have an array from 5 to 10 implies you also have indexes 0 to 4. The lowest index your going to use is 5 which is an offset to a base adress. To make this work the base adress has to 5 units (= nl, unit depends on type of reserved memory; here it's double) lower as the adress of the allocated block of memory. You can, however, not use p[0] to p[4] because you didn't alloc any memory for this part of the array. Hope this answers your question, Gustaaf-Jan. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++++++++++++---===To all virgins, thanks for nothing.===---++++++++++++++ ___ GJ Heinhuis student comp. sc. HIO Enschede (Holland) / _) Final Project at: DataCount, Wierden, Holland | ____ ___ Mail: heinhuis@dri.nl \__/| | |___| Everybody is entitled to his own | | | opinion, as long as it's mine! \_/ | Not necessarely DataCount's opinion!!
mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (04/16/91)
In article <31969@usc> ajayshah@almaak.usc.edu (Ajay Shah) writes: >Consider this fragment of C code (from Numerical Recipes): > >1 double *dvector(nl,nh) >2 int nl,nh; >3 { >4 double *v; >5 >6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); >7 if (!v) nrerror("allocation failure in dvector()"); >8 return v-nl; >9 } > >It's supposed to be a function which allocates a vector of >doubles. My interpretation of nl and nh is: they're array >indexes. If you want to allocate an array going from 5 to 10, >you would say p = dvector(5, 10). > >Question: what is happening on line 8? Why is he not just >returning v (a pointer)? What is the meaning of subtracting nl >(an int) from v without any casting? The intention is to return a pointer to a phantom array location nl spaces to the "left" of the leftmost space actually allocated, so that after calling, for example, d = dvector(2,4) d[2],d[3], and d[4] would be within the space actually allocated. Although this will work with many C compilers, the ANSI standard does not give its blessing. A compiler is entitled to produce code that issues an error message or dumps core when it evaluates the expression "v-nl" for positive nl. Michael Coffin mhcoffin@watmsg.waterloo.edu Dept. of Computer Science office: (519) 885-1211 University of Waterloo home: (519) 725-5516 Waterloo, Ontario, Canada N2L 3G1
gwyn@smoke.brl.mil (Doug Gwyn) (04/17/91)
In article <31969@usc> ajayshah@almaak.usc.edu (Ajay Shah) writes: >Consider this fragment of C code (from Numerical Recipes): Ugh, but okay. >1 double *dvector(nl,nh) >2 int nl,nh; >3 { >4 double *v; >6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); >7 if (!v) nrerror("allocation failure in dvector()"); >8 return v-nl; >9 } >It's supposed to be a function which allocates a vector of doubles. >My interpretation of nl and nh is: they're array indexes. They're supposed to be the minimum and maximum indices that will be used to access the vector after it is allocated. >Question: what is happening on line 8? Why is he not just >returning v (a pointer)? What is the meaning of subtracting nl >(an int) from v without any casting? The intention is to be able to use dvector() like this: double *v = dvector(5,10); v[5] = 42.0; /* ... */ v[10] = 3.1416; The returned pointer is supposed to be offset so that v[5] will access the first location in the object allocated by malloc(). Pointer_to_thing plus or minus some integer is an expression that has type pointer_to_thing and a value that points to that integer number of "things" away from the original pointer. Unfortunately, v-nl does not necessarily point to anything valid (who knows what lurks below the address returned by malloc()?), so this function may fail (most likely catastrophically) on many C implementations. The Numerical Recipes C code is terrible; this is just one example.
torek@elf.ee.lbl.gov (Chris Torek) (04/17/91)
>In article <31969@usc> ajayshah@almaak.usc.edu (Ajay Shah) writes: [quoting from Numerical Recipes in C] >>1 double *dvector(nl,nh) >>2 int nl,nh; >>3 { >>4 double *v; >>5 >>6 v=(double *)malloc((unsigned) (nh-nl+1)*sizeof(double)); >>7 if (!v) nrerror("allocation failure in dvector()"); >>8 return v-nl; >>9 } >> >>It's supposed to be a function which allocates a vector of doubles. (It *is* such a function.) >>My interpretation of nl and nh is: they're array indexes. A better word here is `bounds': >>If you want to allocate an array going from 5 to 10, you would say >>p = dvector(5, 10). The result of this call is not portable (dvector is portable only when nl is nonpositive, nh is nonnegative, and nl is less than or equal to nh). One way to think about this is to imagine that the malloc() call allocates a `true' array (as opposed to reality, in which malloc merely obtains a suitably malleable `blob of memory'): extern double array[nh - nl + 1]; v = &array[0]; This partially-equivalent (but not legal C) assignment makes `a' point to the first of nh-nl+1 doubles. The C language then allows you to point to and use any of those doubles; it also allows you to point to the (single) nonexistent double at the end (&v[nh-nl+1+1]). Thus, v[0]..v[nh-nl+1] are all legal `double's, and v[nh-nl+1+1] is not a double but is nonetheless addressible. NO OTHER ELEMENTS ARE ADDRESSIBLE. You may not compute &v[-1] in portable code. Thus, in the actual assignment, the same thing happens. v points to the first of nh-nl+1 `double's, and v[0]..v[nh-nl+1] are all legal `double's, and in addition you may compute &v[nh-nl+1+1]. So what does the `return' expression `v - nl' compute? If nl is 0, this is just `v'. If nl is negative, this adds -nl `double's to v (skips forward -nl doubles---e.g., if nl is -3, this computes &v[3]). But if nl is positive, this tries to compute &v[<some negative number>]. This is illegal, and the result is undefined (perhaps the computer turns into a frog). On many machines the result is what you would expect (v simply points somewhere `below' the first double, such that v[nl] is the first one), which is why Numerical Recipes in C gets away with it. Portable code should not count on this. In article <1991Apr16.033331.5408@helios.physics.utoronto.ca> neufeld@aurora.physics.utoronto.ca (Christopher Neufeld) writes: >... When the compiler sees p1[j] it evaluates it as *(p1+j). >The addition of the integer 'j' means the same thing it meant >above, it advances in memory by an amount j * sizeof(*p1) More simply, it advances by `j' objects, each of which is whatever *p1 is. On a typical byte-addressed machine, `j objects' and `j * sizeof *p1 bytes' give the same machine-internal-offset-value, but not all machines are byte-addressed. > I should mention that there is one exception that I know of, and >probably a few other people can provide. The rule is less simple for >multi-dimensional arrays defined in the conventional manner: >float arr[N1][N2][N3]; >Now, when the compiler sees arr[i][j][k] it evaluates: >*(arr + ((i * N2) + j) * N3 +k) >However, if I have: >float ***myarr; >then when the compiler sees myarr[i][j][k] it evaluates: >*(*(*(myarr+i)+j)+k) Actually, in the `virtual machine', the operation of both is the same: arr[i][j][k] means `add i to arr, indirect, add j to that, indirect, add k to that, indirect'. This works because `arr' is a collection% of `array N2 of array N3 of float's; adding i moves forward by i such objects. If you have a byte-addressed machine---you probably do, and you probably know about it if you do not---this is indeed the same as moving forward i*N2*N3*sizeof(float) bytes. If not, it is something else. ----- % In fact, `arr' is a collection of exactly N1 objects, and i must be in the half-open interval [0..N1). If not, all bets are off (better find someone who knows who to turn frogs back into computers). ----- This repeats for j and k, but the objects differ: the result of `arr[i]' is a collection of `array N3 of float's, and the result of `arr[i][j]' is a collection of `float's. Indeed, arr[i] is a collection of N2 such arrays, and arr[i][j] is a collection of N3 floats. myarr[i][j][k] means the same thing: `add i to myarr, indirect, add j to that, indirect, add k to that, indirect'. This works because `myarr' points to the first of a collection of `pointer to pointer to float's; adding i moves forward by i such objects. If you have a byte-addressed machine, this is the same as moving forward i*sizeof(float **) bytes. If not, it is something else. This repeats for j and k. In each case the number of objects is not specified (myarr may point to the first of zero or more `float **'s; myarr[i] may point to the first of zero or more `float *'s; and myarr[i][j] may point to the first of zero or more `float's), but whatever each actual number is, i, j, and k had better be less than each. >Note that this definition takes less CPU usually, and eats a bit more >memory. It's also a bit easier to use in passing to functions because >the function doesn't have to be told the dimensions 'N2' and 'N3', which >it obviously needs for the first definition. Right, although the `less CPU' is hard to define; `usually' may be overstating the case. A great deal depends on both the compiler and the machine. -- In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427) Berkeley, CA Domain: torek@ee.lbl.gov