cs411s03@uhccux.uhcc.hawaii.edu (Cs411s03) (04/05/89)
I am having difficulty writing a program which dynamically allocates
an array of structs via malloc() and casting. The program seems to
compile OK, but at run time I am addressing the SAME struct over and
over, I want to step thru them with an index. The program goes like
so:

/* malloc struct test */

#include <stdio.h>
#include <malloc.h>

#define NUMRECS 60

struct ttst {
    int num;
};

struct ttst (*tptr)[];

main()
{
    int i, j;

    if ((tptr = (struct ttst (*)[]) \
        malloc(sizeof(struct ttst) * NUMRECS)) == (struct ttst (*)[]) 0) {
        perror("malloc");
        exit(1);
    }

    printf("sizeof = %d\n", sizeof(struct ttst) * NUMRECS);

    for (i = j = 0; i < NUMRECS; i++) {
        tptr[i]->num = ++j;
        printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);
    }

    printf("---------------\n");

    for (i = 0; i < NUMRECS; i++)
        printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);

    free(tptr);
    exit(0);
}

The program will do much more than this once completed, this program
was just to test the basic concept of dynamically allocating the
structures rather than just declaring a fixed length array ...

The first loop looks as if it is working, but the second loop proves it
is not. It seems that the same structure is being written to over and
over. I must be missing something fundamental here ...

cs411s03!uhccux
chris@mimsy.UUCP (Chris Torek) (04/06/89)
In article <3658@uhccux.uhcc.hawaii.edu> cs411s03@uhccux.uhcc.hawaii.edu (Cs411s03) writes:
>I am having difficulty writing a program which dynamically allocates
>an array of structs via malloc() and casting.
...
>struct ttst (*tptr)[];

    % cdecl
    explain struct ttst (*tptr)[]
    declare tptr as pointer to array of struct ttst
    declare tptr as pointer to array of struct ttst
    Warning: Unsupported in C -- Pointer to array of unspecified dimension
    struct ttst (*tptr)[]
    %

C arrays *must* have a size. What you really want, given the rest
of your example, is `struct ttst *tptr'. Remember that a pointer to
an object that is part of an array can be used to access the entire
array.

Time for some replays.

From: chris@mimsy.UUCP (Chris Torek)
Subject: Re: pointers to arrays
Date: 18 Feb 89 04:32:47 GMT

If you think you want a pointer to an array allocated with malloc(),
you are probably wrong. You really want a pointer that points *at*
(not `to') a block of memory (`array') containing a series of `char *'
objects each pointing at a block of memory containing a series of
`char's. The type of such a pointer is `char **'.

You might ask, `what is the difference between a pointer that points
``at'' a block of memory and one that points ``to'' an array?' The
distinction is somewhat artificial (and I made up the words for some
netnews posting in the past). Given a pointer to array pa:

    int a[5];
    int (*pa)[5] = &a;    /* pANS C semantics for &a */

I can get a pointer that points `at' the array instead:

    int *p = &a[0];

The latter is the more `natural' C version of the former: typically
a pointer points at the first element of a group (here 5). The rest
of the group can be reached via pointer arithmetic: *(p+3), aka p[3],
refers to the same location as a[3]. The pointer need not point at
the first element, as long as it points somewhere into the object:

    p = &a[2];

Now p[1] refers to a[3]; p[-2] refers to a[0]. To use pa to get at
a[3] one must write (*pa)[3] (or, equivalently, pa[0][3]).

The thing that is most especially confusing, but that really makes the
difference, is that *pa, aka pa[0], refers to the entire array `a'.
*p refers only to one element of the array. This can be seen in the
result produced by `sizeof': (sizeof *p)==(sizeof(int)), but
(sizeof *pa)==(sizeof(int[5]))==(5 * sizeof(int)).

Pointers to entire arrays are not particularly useful unless there
are several arrays:

    int twodim[3][5];

Now we can use pa to point to (not at) any of the three array-5-of-int
elements of twodim:

    pa = &twodim[1];    /* or pa = twodim + 1, in Classic C */

and now (*pa)[3] (or pa[0][3]) is an alias for twodim[1][3]. Note
especially that since pa[0] names the *entire* array-5-of-int at
twodim[1], pa[-1] names the entire array-5-of-int at twodim[0].
Pointer arithmetic moves by whole elements, even if those elements
are aggregates. Thus pa[-1][2] is an alias for twodim[0][2].

This is merely a convenience, for we can do the same with p:

    p = &twodim[1][0];

Now p points to the 0'th element of the 1'th element of twodim---the
same place that pa[0][0] names. p[3] is an alias for twodim[1][3].
To get at twodim[0][2], take p[(-1 * 5) + 2], or p[-3]. Arrays are
stored in row-major order with the columns concatenated without
gaps; they can be `flattened' (viewed as linear, one-dimensional) with
impunity. (The flattening concept extends to arbitrarily deep
matrices, so that a six-dimensional array can be viewed as a string of
five-D arrays, each of which can be viewed as a string of four-D
arrays, and so forth, all the way down to a string of simple values.%)

Once you understand this, and see why C guarantees that p[-3],
pa[-1][2], and twodim[0][2] are all the same, you are well on your way
to understanding C's memory model (not `paradigm': that means
`example'). You will also see why pa can only point to objects of
type `array 5 of int', not `array 17 of int', and why the size of
the array is required.
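
A minimal sketch of the layout described above, using the same twodim,
pa and p shapes. The function names are made up for illustration and are
not from the post. Later C standards list subscripting past the bound of
an inner array as undefined behaviour, so rather than evaluating p[-3]
directly, this sketch measures byte offsets through char pointers, which
stays inside the one enclosing object.

```c
#include <stddef.h>

static int twodim[3][5];

/* Byte offset of twodim[row][col] from the start of the whole array.
 * Row-major storage means this is (row * 5 + col) * sizeof(int). */
size_t cell_offset(int row, int col)
{
    return (size_t)((char *)&twodim[row][col] - (char *)twodim);
}

/* Size seen through a pointer "to" a whole row (array 5 of int). */
size_t size_via_pa(void)
{
    int (*pa)[5] = &twodim[1];
    return sizeof *pa;
}

/* Size seen through a pointer "at" the first int of that row. */
size_t size_via_p(void)
{
    int *p = &twodim[1][0];
    return sizeof *p;
}

/* Returns 1 if pa[-1][2] names the same object as twodim[0][2].
 * pa - 1 steps back one whole row, which is still inside twodim. */
int pa_alias_ok(void)
{
    int (*pa)[5] = &twodim[1];
    return &pa[-1][2] == &twodim[0][2];
}
```

Calling cell_offset(1, 0) and cell_offset(0, 2) and subtracting gives
three ints' worth of bytes, which is the arithmetic behind the p[-3]
remark.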
-----
% For fun: the six-D array `char big[2][3][5][4][6][10]' occupies 7200
bytes (assuming one byte is one char). If the first byte is at byte
address 0xc400, find the byte address of big[1][0][3][1][5][5]. I hid
my answer as a message-ID in the references line.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris

From: chris@mimsy.UUCP (Chris Torek)
Subject: Re: char ***pointer;
Keywords: allocating space
Date: 18 Nov 88 07:40:26 GMT

    char *p;

declares an object p which has type `pointer to char' and no specific
value. (If p is static or external, it is initialised to (char *)NULL;
if it is automatic, it is full of garbage.) Similarly,

    char **p;

declares an object p which has type `pointer to pointer to char' and
no specific value. We can keep this up for days :-) and write

    char *******p;

which declares an object p which has type `pointer to pointer ... to
char' and no specific value. But we will stop with

    char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char',
and leaves its value unspecified. None of these pointers point *to*
anything, but if I say, e.g.,

    char c = '!';
    char *pc = &c;
    char **ppc = &pc;
    char ***pppc = &ppc;

then I have each pointer pointing to something. pppc points to ppc;
ppc points to pc; pc points to c; and hence, ***pppc is the character
'!'.

Now, there is a peculiar status for pointers in C: they point not only
to the object immediately at *ptr, but also to any other objects in an
array named by *(ptr+offset). (The latter can also be written as
ptr[offset].) So I could say:

    int i, j, k;
    char c[NPPC][NPC][NC];
    char *pc[NPPC][NPC];
    char **ppc[NPPC];
    char ***pppc;

    pppc = ppc;
    for (i = 0; i < NPPC; i++) {
        ppc[i] = pc[i];
        for (j = 0; j < NPC; j++) {
            pc[i][j] = c[i][j];
            for (k = 0; k < NC; k++)
                c[i][j][k] = '!';
        }
    }

What this means is perhaps not immediately clear%.
There is a two-dimensional array of pointers to characters pc[i][j],
each of which points to a number of characters, namely those in
c[i][j][0] through c[i][j][NC-1]. A one-dimensional array ppc[i]
contains pointers to pointers to characters; each ppc[i] points to a
number of pointers to characters, namely those in pc[i][0] through
pc[i][NPC-1]. Finally, pppc points to a number of pointers to pointers
to characters, namely those in ppc[0] through ppc[NPPC-1].

-----
% :-)
-----

The important thing to note is that each variable points to one or
more objects whose type is the type derived from removing one `*' from
the declaration of that variable. (Clear? :-) Maybe we should try it
this way:) Since pppc is `char ***pppc', what pppc points to (*pppc)
is of type `char **'---one fewer `*'. pppc points to zero or more
objects of this type; here, it points to the first of NPPC objects.

As to malloc: malloc obtains a blob of memory of unspecified shape.
The cast you put in front of malloc determines the shape of the blob.
The argument to malloc determines its size. These should agree, or
you will get into trouble later. So the first thing we need to do is
this:

    pointer = (char ***)malloc(N * sizeof(char **));
    if (pointer == NULL)
        quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'.
None of those `char **'s will have any particular value (i.e., they do
not point anywhere at all; they are garbage). If we make them point
somewhere---to some object(s) of type `char **'---and make those
objects point somewhere, then we will have something useful.

Suppose we have done the one malloc above. Then if we use:

    pointer[0] = (char **)malloc(N1 * sizeof(char *));
    if (pointer[0] == NULL)
        quit("out of memory");

we will have a value to which pointer[0] points, which can point to N1
objects, each of type `char *'.
So we can then say, e.g.,

    i = 0;
    while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
        pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy of
its string argument, and then copies the string to that space and
returns the new pointer. If malloc fails, strdup() returns NULL.)

We could write instead

    i = 0;
    while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
        *(*pointer)++ = strdup(buf);

Note that

    **pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the value
in `pointer', not that in pointer[0]. But using *(*pointer)++ means
that we will later have to write

    pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and
strdup()ed, or else use negative subscripts to locate the strings.

Probably all of this will be somewhat clearer with a more realistic
example. The following code creates an array of arrays of lines.

/* begin code (untested) */
/* this assumes prototypes are available */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static char nomem[] = "out of memory, exiting";

quit(char *msg) {
    (void) fprintf(stderr, "%s\n", msg);
    exit(1);
    /* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *readstr(FILE *f) {
    register char *s = NULL, *p;
    int more = 1, curlen = 0, l;
    char inbuf[BUFSIZ];

    /*
     * The following loop is not terribly efficient if you have
     * many long input lines.
     */
    while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
        p = strchr(inbuf, '\n');
        if (p != NULL) {    /* got it all */
            *p = 0;
            l = p - inbuf;
            more = 0;       /* signal stop */
        } else
            l = strlen(inbuf);
        /*
         * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
         * if your realloc does not work that way, you will
         * have to fix this.
         */
        s = realloc(s, curlen + l + 1);
        if (s == NULL)
            quit(nomem);
        strcpy(s + curlen, inbuf);
        if (more == 0)      /* done; stop */
            break;
        curlen += l;
    }
    /* should check for input error, actually */
    return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * There are n+1 vectors, the last one being NULL.
 */
char **readfile(FILE *f) {
    register char **vec, *s;
    register int veclen;

    /*
     * This is terribly inefficient, but it should be correct.
     *
     * malloc below is implicitly cast to (char **), but this
     * depends on it returning (void *); old compilers need the
     * cast, since malloc() returns (char *). The same applies
     * to realloc() below.
     */
    vec = malloc(sizeof(char *));
    if (vec == NULL)
        quit(nomem);
    veclen = 0;
    while ((s = readstr(f)) != NULL) {
        vec = realloc(vec, (veclen + 2) * sizeof(char *));
        if (vec == NULL)
            quit(nomem);
        vec[veclen++] = s;
    }
    vec[veclen] = NULL;
    return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***readlots(register char **names) {
    register char ***p;
    register int nread;
    register FILE *f;
    char **vp;
    extern int errno;

    p = malloc(sizeof(char **));
    if (p == NULL)
        quit(nomem);
    for (nread = 0; *names != NULL; names++) {
        if ((f = fopen(*names, "r")) == NULL) {
            (void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
                *names, strerror(errno));
            continue;
        }
        vp = readfile(f);
        (void) fclose(f);
        p = realloc(p, (nread + 2) * sizeof(char **));
        if (p == NULL)
            quit(nomem);
        p[nread++] = vp;
    }
    p[nread] = NULL;
    return (p);
}

/* e.g., instead:
struct file_data {
    char *fd_name;
    char **fd_text;
};
struct file_data *readlots(register char **names) {
    register struct file_data *p;
    register int nread;
    register FILE *f;
    char **vp;
    extern int errno;

    p = malloc(sizeof(*p));
    if (p == NULL)
        quit(nomem);
    for (nread = 0; *names != NULL; names++) {
        <...same file-reading code as above...>
        p = realloc(p, (nread + 2) * sizeof(*p));
        if (p == NULL)
            quit(nomem);
        p[nread].fd_name = *names;
        p[nread].fd_text = vp;
        nread++;
    }
    p[nread].fd_name = NULL;
    p[nread].fd_text = NULL;
    return (p);
}
*/
/* end of code */
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
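
A short sketch of how a caller might walk the three-level result that
readlots() is described as returning: a NULL-terminated list of files,
each a NULL-terminated vector of lines. The names count_lines and
demo_count are hypothetical, and the demo data is built from static
arrays rather than read from files, so this only illustrates the
"remove one `*' per level" rule, not the posted (untested) reader code.

```c
#include <stddef.h>

/* files has type char ***: files[i] is a char ** (one file's lines),
 * files[i][j] is a char * (one line). Both levels end with NULL. */
size_t count_lines(char ***files)
{
    size_t total = 0;
    size_t i, j;

    for (i = 0; files[i] != NULL; i++)
        for (j = 0; files[i][j] != NULL; j++)
            total++;
    return total;
}

/* Hypothetical demo data: two "files" holding two lines and one line. */
size_t demo_count(void)
{
    static char a0[] = "first", a1[] = "second", b0[] = "third";
    static char *file_a[] = { a0, a1, NULL };
    static char *file_b[] = { b0, NULL };
    static char **files[] = { file_a, file_b, NULL };

    return count_lines(files);
}
```

The array `files` decays to a char *** when passed, which is the same
pointer-at-first-element idea the post describes for pppc and ppc.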
wlp@calmasd.Prime.COM (Walter L. Peterson, Jr.) (04/06/89)
In article <3658@uhccux.uhcc.hawaii.edu>, cs411s03@uhccux.uhcc.hawaii.edu (Cs411s03) writes:
>
> I am having difficulty writing a program which dynamically allocates
> an array of structs via malloc() and casting. The program seems to
> compile OK, but at run time I am addressing the SAME struct over and
> over, I want to step thru them with an index. The program goes like
> so:
>
> /* malloc struct test */
>
> #include <stdio.h>
> #include <malloc.h>
>
> #define NUMRECS 60
>
> struct ttst {
>     int num;
> };
>
> struct ttst (*tptr)[];
>
> main()
> {
>     int i, j;
>
>     if ((tptr = (struct ttst (*)[]) \
>         malloc(sizeof(struct ttst) * NUMRECS)) == (struct ttst (*)[]) 0) {
>         perror("malloc");
>         exit(1);
>     }
>
>     printf("sizeof = %d\n", sizeof(struct ttst) * NUMRECS);
>
>     for (i = j = 0; i < NUMRECS; i++) {
>         tptr[i]->num = ++j;
>         printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);
>     }
>
>     printf("---------------\n");
>
>     for (i = 0; i < NUMRECS; i++)
>         printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);
>
>     free(tptr);
>     exit(0);
> }
>
> The program will do much more than this once completed, this program
> was just to test the basic concept of dynamically allocating the
> structures rather than just declaring a fixed length array ...
>
> The first loop looks as if it is working, but the second loop proves it
> is not. It seems that the same structure is being written to over and
> over. I must be missing something fundamental here ...
>
> cs411s03!uhccux

Yes, you are missing something, but don't feel bad, this DOES seem the
intuitive way to do it.

I don't know what system you are on, but I have tried your original
code on Pyramid (BSD & SYSV), SUN (SunOS (mostly SYSV)) and TURBO-C
(Messy-DOS). None of these have malloc.h, so that had to be changed to
start. Sun gave the compile warning, zero-length array element on each
line that referenced tptr[i], Pyramid compiled OK and TURBO-C, which is
pretty much ANSI, had fatal errors and failed to compile at all.
It did not like the tptr[i]->num. On both SUN and Pyramid it behaved
as you said; the first loop looked OK, but the second didn't. Which is
just what I would expect.

The solution to this problem is simple once you know a few things about
the way C treats arrays and subscripts. The resulting code is also,
IMHO, "cleaner" and very easy to read.

First: the declaration of the pointer is simpler; just declare a
pointer to the structure:

    struct ttst *tptr;

no blank subscripts or other jazz.

Second: I would suggest using calloc rather than malloc for several
reasons. Reason one is that calloc's arguments make what you are
trying to do clearer. The first arg is the number of objects for which
you are allocating space and the second arg is the size of one of those
objects. The second reason is that calloc will "zero-out" the memory
space that it allocates; malloc doesn't (typically). Calloc returns the
address of the start of the block of memory allocated, which must be
cast to a pointer to the type of object being pointed to; that is:

    tptr = (struct ttst *)calloc(NUMRECS, sizeof(struct ttst));

This gives you a pointer, tptr, to a block of initialized memory that
is NUMRECS * sizeof(struct ttst) bytes long and ensures that C now
knows that the pointer tptr points to things that are
sizeof(struct ttst) long.

It is at this point that an understanding of the way C handles arrays
and their subscripts comes in. In C the UNSUBSCRIPTED name of an array
is in fact a POINTER TO THE BEGINNING OF THE ARRAY. For example, if I
declare foo to be an array of 20 integers, then the unsubscripted name
foo points to the start of the array. C then uses the subscripts as
offsets from the array's starting address, each offset being as long as
the type of the array elements, to find each array element.
Since in this case we are dealing with a structure we will also need an
additional offset to get to each element of the structure at each array
element; that is done using the standard structure dot notation, NOT
the structure pointer -> notation. The reason for this is (C
super-wizards no flames please if this explanation is not *STRICTLY*
"by-the-book"; I'm recalling this off the top of my head.) that, since
you are using the subscript notation, tptr[i], C is interpreting tptr
as the name (and thus address) of an array, each ELEMENT of which is of
type struct ttst and not as a pointer to a struct ttst. This means
that the notation:

    tptr[i]->num

is wrong (TURBO-C refused to compile using this notation) and should
be:

    tptr[i].num

The following code compiles and runs on SUN, Pyramid and TURBO-C and
provides the expected output:

/* malloc struct test */

#include <stdio.h>

#define NUMRECS 60

struct ttst {
    int num;
};

main()
{
    int i;
    struct ttst *tptr;

    if ((tptr = (struct ttst *)calloc(NUMRECS, sizeof(struct ttst))) == NULL){
        perror("malloc");
        exit(1);
    }

    printf("sizeof = %d\n", sizeof(struct ttst) * NUMRECS);

    for (i = 0; i < NUMRECS; i++) {
        tptr[i].num = i;
        printf("Rec: %x %d %d\n", tptr[i], i, tptr[i].num);
    }

    printf("---------------\n");

    for (i = 0; i < NUMRECS; i++)
        printf("Rec: %x %d %d\n", tptr[i], i, tptr[i].num);

    free(tptr);
    exit(0);
}

The only complaint that any of the compilers give is a "warning:
structure passed by value" for the printing of tptr[i], using the %x
format. I'm not too certain what you were expecting here; you DON'T get
the address. What you DO get in this example is the hex value of
tptr[i].num.

Also, notice the change in the first loop. There is no need for your
variable "j".

I hope all this helps.

---------------------------------------
Walt Peterson wlp@calmasd.Prime.COM
--
Walt Peterson.
Prime - Calma San Diego R&D (Object and Data Management Group) "The opinions expressed here are my own and do not necessarily reflect those Prime, Calma nor anyone else. ...{ucbvax|decvax}!sdcsvax!calmasd!wlp
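
A hedged sketch of the same calloc-based approach for readers who do
want the record addresses printed. It assumes a C99 compiler (for %zu
and <stdlib.h> prototypes), and the function name fill_and_sum is made
up for illustration. Taking &tptr[i] and casting it to (void *) for %p
prints an address, where passing tptr[i] itself hands printf the whole
struct by value.

```c
#include <stdio.h>
#include <stdlib.h>

struct ttst {
    int num;
};

/* Allocate n zeroed records, number them 1..n, print each record's
 * address and value, and return the sum of the values.
 * Returns -1 if the allocation fails. */
long fill_and_sum(size_t n)
{
    struct ttst *tptr = calloc(n, sizeof *tptr);
    long sum = 0;
    size_t i;

    if (tptr == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        tptr[i].num = (int)(i + 1);
        /* &tptr[i] is the address of record i; tptr[i] is the record. */
        printf("Rec: %p %zu %d\n", (void *)&tptr[i], i, tptr[i].num);
    }

    for (i = 0; i < n; i++)
        sum += tptr[i].num;

    free(tptr);
    return sum;
}
```

Successive addresses printed by the loop should differ by
sizeof(struct ttst), which is the stepping the original poster was
looking for.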
dmg@ssc-vax.UUCP (David Geary) (04/08/89)
In article <3658@uhccux.uhcc.hawaii.edu>, (Cs411s03) writes:
|| I am having difficulty writing a program which dynamically allocates
|| an array of structs via malloc() and casting. The program seems to
|| compile OK, but at run time I am addressing the SAME struct over and
|| over, I want to step thru them with an index. The program goes like
|| so:

|| /* malloc struct test */
|| #include <stdio.h>
|| #include <malloc.h>
|| #define NUMRECS 60
|| struct ttst {
||     int num;
|| };
||
|| struct ttst (*tptr)[];

No - what you want here is:

    struct ttst *tptr;

You simply need a pointer to struct ttst. Below we will allocate an
array of struct ttst's, and tptr will point to the first struct in the
array.

||
|| main()
|| {
||     int i, j;
||
||     if ((tptr = (struct ttst (*)[]) \
||         malloc(sizeof(struct ttst) * NUMRECS)) == (struct ttst (*)[]) 0) {
||         perror("malloc");
||         exit(1);
||     }

Should be:

    if( (tptr = (struct ttst *)malloc(sizeof(struct ttst) * NUMRECS)) == NULL)
    {
        perror("malloc");
        exit(1);
    }

(BTW the '\' character in your code above should not be accepted by the
compiler. This is used to extend lines for the preprocessor, not the
compiler)

So now we have:

     --------                       -------------------------
    | tptr  |--------------------->|                         |
     --------                      |       struct ttst       |
                                    -------------------------
                                   |                         |
                                   |       struct ttst       |
                                    -------------------------
                                   |                         |
                                   |       struct ttst       |
                                    -------------------------
                                               .
                                               .
                                               .

||     printf("sizeof = %d\n", sizeof(struct ttst) * NUMRECS);
||

From here on down, all tptr[i]->num should be tptr[i].num

Realize that tptr[0], tptr[1], tptr[2], etc. are all struct ttst's, NOT
pointers to struct ttst's. Therefore, we have tptr[i].num, NOT
tptr[i]->num.

||     for (i = j = 0; i < NUMRECS; i++) {
||         tptr[i]->num = ++j;
||         printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);
||     }
||
||     printf("---------------\n");
||
||     for (i = 0; i < NUMRECS; i++)
||         printf("Rec: %x %d %d\n", tptr[i], i, tptr[i]->num);
||
||     free(tptr);
||     exit(0);
|| }
||
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ David Geary, Boeing Aerospace, Seattle ~
~ "I wish I lived where it *only* rains 364 days a year" ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
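
A tiny sketch of the `.' versus `->' point, using a fixed local array
in place of malloc'ed storage to keep it short. The function name
same_member is hypothetical. With tptr declared as a plain
`struct ttst *', tptr[i] is a struct, so it takes `.', while tptr + i
is a pointer, so it takes `->'.

```c
struct ttst {
    int num;
};

/* Returns 1 if all three spellings read the same member of record 1. */
int same_member(void)
{
    struct ttst recs[3] = { {10}, {20}, {30} };
    struct ttst *tptr = recs;      /* points at the first record */

    return tptr[1].num == 20       /* subscript yields a struct      */
        && (*(tptr + 1)).num == 20 /* what the subscript expands to  */
        && (tptr + 1)->num == 20;  /* pointer form, hence the arrow  */
}
```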
throopw@agarn.dg.com (Wayne A. Throop) (04/11/89)
> cs411s03@uhccux.uhcc.hawaii.edu (Cs411s03)
>I am having difficulty writing a program which dynamically allocates
>an array of structs via malloc() and casting. The program seems to
>compile OK, but at run time I am addressing the SAME struct over and
>over, I want to step thru them with an index.

I'll tackle this problem in a different way than Chris does. Running
the normal typechecking tools over it gives notice that:

    35 type mismatch in call to function free
       Argument 1 is: tptr
       Its type is: pointer to array of struct ttst
       The expected type is: pointer to char

But it is pretty clear that that isn't the problem. So let's examine
the program with some debugging tools. First, let's see what sort of
thing a tptr is:

    (debug) describe tptr
    struct { int num; } (*tptr)[1];
    (debug) global-options,lang pascal
    (debug) describe tptr
    ^ARRAY [0..0] OF RECORD num: LONG_INTEGER; END

OK, so we see that it is pointer to an array of records containing
32-bit integers. Further, the compiler seems to have filled in an
array bound of 1, in place of the missing one in the declaration.

Now, as the program runs, it produces an induction variable "i", and
evaluates the expression (tptr[i]->num). Let's ask the debugger what
all this means:

    (debug) global-options, lang pascal
    (debug) describe tptr[0]
    Error: the reference to be subscripted is not an array or string.

Right away, we see that we can't do this in pascal, so as Daffy Duck
says "Hmmmmmm. Sump'n amiss here." Let's return to C and run through
that again. "Ho. Ha. Guard. Turn. Parry. THRUST!"

    (debug) global-options, lang c
    (debug) describe tptr[0]
    struct { int num; } [1];

Kapwong! The light should be beginning to dawn. We have subscripted a
pointer, and are thus about to use a pointer-indirect operator, "->",
to indirect an array name.

Now, the absolutely bizarre thing about this is that on our local
machine using our local compiler, this program works as the original
poster apparently expected, because the structs are 32 bits long, as
are pointers, as are ints, as are array[1] of these structs. Thus,
under our local compiler, the subscript doesn't always refer to the
zeroth array element. On most pcc-derived compilers, the compiler will
default the array size to zero instead of one, and produce the behavior
the original poster saw.

The reason this program works on our local machine with our local
compiler is a tremendous comedy of errors and coincidences, and I am
leaving the details as an exercise for the reader, because when you
work it out (as I have), you just have to laugh and/or cry. And I
think it makes a good catharsis either way when thinking about C traps
and pitfalls.

Now, the moral of this story is USE YOUR TOOLS! Use them in novel and
innovative ways. Lint failed to find the bug (because of
pointer-arithmetic/array-subscript equivalence), but that doesn't
exhaust the list of tools for poking around in programs. Read the
assembly code. The subscript calculation produced for the various
stages of the offending expression was highly enlightening (and
amusing). Poke around with a debugger. When you do the subscript and
end up with an array (when the code clearly expected a pointer
instead), alarm bells should ring, and either the accessing expression
should be changed to

    (*tptr)[i].num

or the declaration AND accessing expression changed to

    struct ttst *tptr;
    ...
    tptr[i].num

So again: USE YOUR TOOLS!

--
Shakespeare was against assembly language coding:
"Bloody instructions which being learned, return to plague the inventor."
--- M. E. Hopkins quoting Macbeth.
--
Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw
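
A minimal sketch of the first alternative, keeping a pointer to an
array and accessing it as (*tptr)[i].num. One assumption is added that
is not in the post: the array bound NUMRECS is written into the
declaration, so that sizeof *tptr is the size of the whole block. The
function name last_via_array_pointer is made up for illustration.

```c
#include <stdlib.h>

#define NUMRECS 60

struct ttst {
    int num;
};

/* Allocate one array of NUMRECS records through a pointer to the whole
 * array, number the records 1..NUMRECS, and return the last value.
 * Returns -1 if the allocation fails. */
int last_via_array_pointer(void)
{
    struct ttst (*tptr)[NUMRECS] = malloc(sizeof *tptr);
    int i, last;

    if (tptr == NULL)
        return -1;

    /* *tptr is the array itself; subscript it, then select the member. */
    for (i = 0; i < NUMRECS; i++)
        (*tptr)[i].num = i + 1;

    last = (*tptr)[NUMRECS - 1].num;
    free(tptr);
    return last;
}
```

Here tptr[1] would name a second whole array of NUMRECS records past
the allocated block, which is the same stepping-by-whole-arrays effect
the debugger session exposes.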