root@mjbtn.MFEE.TN.US (Mark J. Bailey) (11/14/88)
Hello,

I have a question for the C gurus out there. It has to do with a pointer declaration and its subsequent use.

I have declared:

    char ***pointer;

Now I know that *ch is generally pointing to the start of a string, ie, a list (array) of characters. **ch would be equal to, say, a list of lists (or an array of strings, ie, **argv, for example). ***ch would support a list of lists of lists of characters. I suppose this could be associated with a two-dimensional array of strings.

My biggest question here is, when I declare a *ch and then call malloc to allocate a space for it to point to, I have a string. I can ch++ to move up the string. But when I have **ch, and subsequently, ***ch, do I have to call malloc to allocate space for more pointers? For example (using pointer above),

    pointer = (char ***) malloc((x)*sizeof(char *));

to allocate space for the *LIST* of pointers I am wanting to use?

Let me put it another way. I intend to maintain an array of arrays of strings. Because of the flexibility of pointers and the fact that I don't know ahead of time how many strings and arrays of strings I am going to maintain in my top level array, I use ***pointer instead of pointer[x][y][z].

Now, when I have a string, I call malloc to allocate space for that string. I store the string in the new space and then attach it to my pointer structure. Can I say **pointer = (char *) newspace? Then can I say **pointer++ to move to the next space? Do I have to create spaces to **pointer++ to? A pointer *ch and **ch both occupy 4 bytes on my machine under MSDOS. How does the system view *ch, **ch, and ***ch differently such that I theoretically could do this?

I apologize for my uncertainty in presenting this, yet I am quite uncertain about it. I don't know exactly how one would view multi-dimension pointers and I was taking a stab at it here hoping someone better versed in the concept might shed some light on it for me. I also know that *ch can be referenced with ch[] and **ch with ch[][]. I assume ***ch can use ch[][][], but is this legal? And it just seems to me that somehow, one would have to allocate space ahead of time for pointers that you might move to, as in **pointer++ above. I guess you could view that as pointer[0][0] going to pointer[0][1].

I feel like I am way off base completely. Please tell me so if I am. Again, sorry for such a long posting. I would be most appreciative of any help (and light - PLEASE!!!) anyone might provide. I know I don't know what I am talking about! :-)

Mark.

P.S. Email responses would be fine as I am sure this is highly specialized and should not occupy any more bandwidth than is necessary. I will summarize and post if warranted.

--
Mark J. Bailey                                     "Y'all com bak naw, ya hear!"
USMAIL: 511 Memorial Blvd., Murfreesboro, TN 37129 ___________________________
VOICE:  +1 615 893 4450 / +1 615 896 4153          |         JobSoft
UUCP:   ...!{ames,mit-eddie}!killer!mjbtn!mjb      | Design & Development Co.
DOMAIN: mjb@mjbtn.MFEE.US.TN                       |  Murfreesboro, TN  USA
throopw@xyzzy.UUCP (Wayne A. Throop) (11/18/88)
> root@mjbtn.MFEE.TN.US (Mark J. Bailey)
> I have declared:
>     char ***pointer;
> Now I know that *ch is generally pointing to the start of a string, ie,
> a list (array) of characters.

Mark has already departed into nonstandard (for C) terminology, and so I imagine that many (like myself) will be having to guess at what he really means by his questions. For example, does he mean that the expression (*ch) points to the start of a string when the variable ch is declared as in the example just above? I suspect not, because it just isn't true. Now if something were declared as char *ch, then the expression (ch) (note: not (*ch)) would point to a character which could be one of an array of characters.

(Further, in C, when somebody talks about a "list", they are usually talking about independently allocated nodes connected by pointers. Arrays are called arrays, or sometimes vectors. Much, *much* more rarely lists.)

So, to start off, when Mark says "D is a mumble", I presume he means "The identifier in D, if declared as in D, is a mumble". That's how the following makes the most sense.

> **ch would be equal to say a list of lists
> (or an array of strings, ie, **argv, for example). ***ch would support
> a list of lists of lists of characters. I suppose this could be associated
> with a two dimensional array of strings.

Well, no. If declared as char ***ch, ch is not a two-dimensional array of strings, other than in the most abstract, conceptual way. The object named by ch is a pointer. The pointer denotes a member of a one-dimensional array of pointers. Members of this array of pointers each denote a member of another (potentially unique) one-dimensional array of pointers. Finally, members of this last array each denote a member of a (potentially unique) one-dimensional array of characters.

The important thing to note is that the only object for which space is allocated by the declaration char ***ch is that for the first pointer. The array of pointers-to-pointers and the arrays of pointers-to-char and the arrays of characters are all left up to the user to allocate. Therefore:

> My biggest question here is, when I declare a *ch and then call malloc
> to allocate a space for it to point to, I have a string. I can ch++ to
> move up the string. But when I have **ch, and subsequently, ***ch, do
> I have to call malloc to allocate space for more pointers?

Yes, one must allocate all pointers and characters except for ch itself.

> For example (using pointer above),
> pointer = (char ***) malloc((x)*sizeof(char *));
> to allocate space for the *LIST* of pointers I am wanting to use?

No. Space is being allocated for items of type (char *), and then that space is treated as if it were for items of type (char **). The malloc should, instead, look like this:

    pointer = (char ***) malloc((x)*sizeof(char **));

And, of course, one would then have to allocate space for the second level of pointers as well as for the character arrays. The loop limits must match the counts that were allocated, x at the top level and y at the second:

    for( i=0; i<x; i++ ){
        pointer[i] = (char **) malloc((y)*sizeof(char *));
        for( j=0; j<y; j++ ){
            pointer[i][j] = (char *) malloc((z)*sizeof(char));
        }
    }

> Can I say **pointer = (char *) newspace? Then can
> I say **pointer++ to move to the next space?

Depends on what is meant by "the next space". In the example above, one can indeed say

    **pointer = (char *)malloc(mumble);

and one can indeed say **pointer++. Of course, if p was a pointer typed (char ***) and (pointer==p) before the expression (**pointer++) is evaluated, then (pointer==(&p[1])) afterwards. In other words, one would be starting to step through the allocated arrays in Fortran, or anti-odometer, order instead of C or odometer order. (Conceptually speaking, of course, since these aren't multi-dimensional arrays, but rather collections of uni-dimensional arrays connected by pointers.)

> I also know that *ch can be referenced with ch[] and **ch with
> ch[][]. I assume ***ch can use ch[][][], but is this legal?

Yes, as long as the data structures to which the declaration char ***ch; refers have been allocated as outlined above. (And, of course, assuming that the brackets are filled in with some subscript values.)

> [...] as in
> **pointer++ above. I guess you could view that as pointer[0][0] going
> to pointer[0][1].

No, that's p[0][0] to p[1][0]. Postfix (++) has greater precedence than (*).

This IS one of those topics that comes by once every few months, so there is clearly a lot of confusion about how arrays and pointers relate to each other in C. The way I keep it straight is to remember that the declaration

    sometype *p;

is talking about two chunks of storage, one for a pointer, which is allocated by the declaration, and one for an array of objects of sometype, which the user must allocate. On the other hand, the declaration

    sometype a[SOMESIZE];

is talking about only one chunk of storage, that for an array of objects of sometype, which the compiler and/or runtime system allocates. Then apply the distinction recursively for multiple dimensions or indirections.

The confusion enters because when (a) is mentioned in an expression, it usually evaluates to the address of the first element in the chunk of storage it represents. Since (p) evaluates to the address of the first element in the chunk of storage p has been set to point to, this means that (a) and (p) are often used in exactly the same way in practice, the difference being that in the (p) case the user must allocate and free the chunk of memory that is occupied by the array of sometype objects, and in the (a) case, the compiler takes care of such things.

Hope this posting is of some help.

--
A is for Atom; they are all so small,
That we have not really seen any at all.
B is for Bomb.  They are much bigger.
So, mister you better keep off of the trigger.
                                --- Edward Teller
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
chris@mimsy.UUCP (Chris Torek) (11/18/88)
In article <360@mjbtn.MFEE.TN.US> root@mjbtn.MFEE.TN.US (Mark J. Bailey) ends with:
>P.S. Email responses would be fine as I am sure this is highly specialized
>     and should not occupy any more bandwidth than is necessary. I will summarize
>     and post if warranted.

Here I go, breaking the rules again :-) ...

>I have declared:
>
>    char ***pointer;
>
>Now I know that *ch is generally pointing to the start of a string, ie,
>a list (array) of characters. **ch would be equal to say a list of lists
>(or an array of strings, ie, **argv, for example). ***ch would support
>a list of lists of lists of characters. I suppose this could be associated
>with a two dimensional array of strings.

The first thing to do is establish communication. I think that the text above is the wrong way to start.

    char *p;

declares an object p which has type `pointer to char' and no specific value. (If p is static or external, it is initialised to (char *)NULL; if it is automatic, it is full of garbage.) Similarly,

    char **p;

declares an object p which has type `pointer to pointer to char' and no specific value. We can keep this up for days :-) and write

    char *******p;

which declares an object p which has type `pointer to pointer ... to char' and no specific value. But we will stop with

    char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char', and leaves its value unspecified. Now:

>My biggest question here is, when I declare a *ch and then call malloc
>to allocate a space for it to point to, I have a string. I can ch++ to
>move up the string. But when I have **ch, and subsequently, ***ch, do
>I have to call malloc to allocate space for more pointers?

Malloc has little to do with it. None of these pointers points *to* anything. But if I say, e.g.,

    char c = '!';
    char *pc = &c;
    char **ppc = &pc;
    char ***pppc = &ppc;

then I have each pointer pointing to something. pppc points to ppc; ppc points to pc; pc points to c; and hence, ***pppc is the character '!'.
Now, there is a peculiar status for pointers in C: they point not only to the object immediately at *ptr, but also to any other objects in an array named by *(ptr+offset). (The latter can also be written as ptr[offset].) So I could say:

    int i, j, k;
    char c[NPPC][NPC][NC];
    char *pc[NPPC][NPC];
    char **ppc[NPPC];
    char ***pppc;

    pppc = ppc;
    for (i = 0; i < NPPC; i++) {
        ppc[i] = pc[i];
        for (j = 0; j < NPC; j++) {
            pc[i][j] = c[i][j];
            for (k = 0; k < NC; k++)
                c[i][j][k] = '!';
        }
    }

What this means is perhaps not immediately clear%. There is a two-dimensional array of pointers to characters pc[i][j], each of which points to a number of characters, namely those in c[i][j][0] through c[i][j][NC-1]. A one-dimensional array ppc[i] contains pointers to pointers to characters; each ppc[i] points to a number of pointers to characters, namely those in pc[i][0] through pc[i][NPC-1]. Finally, pppc points to a number of pointers to pointers to characters, namely those in ppc[0] through ppc[NPPC-1].

-----
% :-)
-----

The important thing to note is that each variable points to one or more objects whose type is the type derived from removing one `*' from the declaration of that variable. (Clear? :-) Maybe we should try it this way:) Since pppc is `char ***pppc', what pppc points to (*pppc) is of type `char **'---one fewer `*'s. pppc points to zero or more objects of this type; here, it points to the first of NPPC objects.

>For example (using pointer above),
>
>pointer = (char ***) malloc((x)*sizeof(char *));
>
>to allocate space for the *LIST* of pointers I am wanting to use?

Back to malloc: malloc obtains a blob of memory of unspecified shape. The cast you put in front of malloc determines the shape of the blob. The argument to malloc determines its size. These should agree, or you will get into trouble later. So the first thing we need to do is this:

    pointer = (char ***)malloc(N * sizeof(char **));
    if (pointer == NULL)
        quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'. None of those `char **'s will have any particular value (i.e., they do not point anywhere at all). If we make them point somewhere---to some object(s) of type `char *'---and make those objects point somewhere, then we will have something useful.

>Now, when I have a string. I call malloc to allocate space for that
>string. I store the string in the new space and then attach it to my
>pointer structure. Can I say **pointer = (char *) newspace? Then can
>I say **pointer++ to move to the next space? Do I have to create spaces
>to **pointer++ to?

Suppose we have done the one malloc above. Then if we use:

    pointer[0] = (char **)malloc(N1 * sizeof(char *));
    if (pointer[0] == NULL)
        quit("out of memory");

pointer[0] will have a value, and it will point to N1 objects, each of type `char *'. So we can then say, e.g.,

    i = 0;
    while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
        pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy of its string argument, and then copies the string to that space and returns the new pointer. If malloc fails, strdup() returns NULL.) We could write instead

    i = 0;
    while (i < N1 && fgets(buf, sizeof(buf), input) != NULL) {
        *(*pointer)++ = strdup(buf);
        i++;    /* count the strings so that pointer[0] can be restored */
    }

Note that

    **pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the value in `pointer', not that in pointer[0]. But using *(*pointer)++ means that we will later have to write

    pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and strdup()ed, or else use negative subscripts to locate the strings.

Probably all of this will be somewhat clearer with a more realistic example. The following code creates an array of arrays of lines.
/* begin code (untested) */
/* this assumes prototypes are available */
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char nomem[] = "out of memory, exiting";

/* Print a message on stderr and exit with a failure status. */
void quit(char *msg)
{

    (void) fprintf(stderr, "%s\n", msg);
    exit(1);
    /* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *readstr(FILE *f)
{
    register char *s = NULL, *p;
    int more = 1, curlen = 0, l;
    char inbuf[BUFSIZ];

    /*
     * The following loop is not terribly efficient if you have
     * many long input lines.
     */
    while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
        p = strchr(inbuf, '\n');
        if (p != NULL) {        /* got it all */
            *p = 0;
            l = p - inbuf;
            more = 0;           /* signal stop */
        } else
            l = strlen(inbuf);

        /*
         * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
         * if your realloc does not work that way, you will
         * have to fix this.
         */
        s = realloc(s, curlen + l + 1);
        if (s == NULL)
            quit(nomem);
        strcpy(s + curlen, inbuf);
        if (more == 0)          /* done; stop */
            break;
        curlen += l;
    }
    /* should check for input error, actually */
    return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * The vector has n+1 elements, the last one being NULL.
 */
char **readfile(FILE *f)
{
    register char **vec, *s;
    register int veclen;

    /*
     * This is terribly inefficient, but it should be correct.
     *
     * malloc below is implicitly cast to (char **), but this
     * depends on it returning (void *); old compilers need the
     * cast, since malloc() returns (char *). The same applies
     * to realloc() below.
     */
    vec = malloc(sizeof(char *));
    if (vec == NULL)
        quit(nomem);
    veclen = 0;
    while ((s = readstr(f)) != NULL) {
        vec = realloc(vec, (veclen + 2) * sizeof(char *));
        if (vec == NULL)
            quit(nomem);
        vec[veclen++] = s;
    }
    vec[veclen] = NULL;
    return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***readlots(register char **names)
{
    register char ***p;
    register int nread;
    register FILE *f;
    char **vp;

    p = malloc(sizeof(char **));
    if (p == NULL)
        quit(nomem);
    for (nread = 0; *names != NULL; names++) {
        if ((f = fopen(*names, "r")) == NULL) {
            (void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
                *names, strerror(errno));
            continue;
        }
        vp = readfile(f);
        (void) fclose(f);
        p = realloc(p, (nread + 2) * sizeof(char **));
        if (p == NULL)
            quit(nomem);
        p[nread++] = vp;
    }
    p[nread] = NULL;
    return (p);
}

/* e.g., instead:

struct file_data {
    char *fd_name;
    char **fd_text;
};

struct file_data *readlots(register char **names)
{
    register struct file_data *p;
    register int nread;
    register FILE *f;
    char **vp;

    p = malloc(sizeof(*p));
    if (p == NULL)
        quit(nomem);
    for (nread = 0; *names != NULL; names++) {
        <...same file-reading code as above...>
        p = realloc(p, (nread + 2) * sizeof(*p));
        if (p == NULL)
            quit(nomem);
        p[nread].fd_name = *names;
        p[nread].fd_text = vp;
        nread++;
    }
    p[nread].fd_name = NULL;
    p[nread].fd_text = NULL;
    return (p);
}
*/
/* end of code */

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu    Path: uunet!mimsy!chris
rbutterworth@watmath.waterloo.edu (Ray Butterworth) (11/19/88)
In article <1854@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) writes:
> This IS one of those topics that comes by once every few months, so
> there is clearly a lot of confusion about how arrays and pointers
> relate to each other in C.
> Hope this posting is of some help.

One additional suggestion for people who are trying to understand the difference between pointers and arrays is to write some small test programs that print the results of sizeof() on the various variables and expressions. Some of the values will be surprising, and until one understands why the values are what they are, one will never properly understand how to use arrays and pointers. Also, play around with assigning various expressions to variables and try to understand why something is or is not legal.

e.g.

    #include <stdio.h>

    int main(void)
    {
        /* using different prime dimensions helps when looking at sizeof() */
        auto char arr[3][5][7];
        auto char ***p;
        auto char (*q)[7];
        auto char *r;

        /* p = arr; */          /* this is illegal; uncomment it to see why */
        q = &arr[0][0];         /* this is legal, but old compilers ignore the & */
        r = &arr[0][0][0];

        /* sizeof yields an unsigned type, so cast to int to match %d */
        printf("%d %d %d %d\n", (int)sizeof(arr), (int)sizeof(arr[0]),
            (int)sizeof(arr[0][0]), (int)sizeof(arr[0][0][0]));
        printf("%d %d %d %d\n", (int)sizeof(p), (int)sizeof(*p),
            (int)sizeof(**p), (int)sizeof(***p));
        printf("%d %d %d\n", (int)sizeof(q), (int)sizeof(*q), (int)sizeof(*q[0]));
        printf("%d %d %d\n", (int)sizeof(r), (int)sizeof(*r), (int)sizeof(r[0]));
        return 0;
    }

Note that sizeof(arr), sizeof(p), etc. indicate the total number of bytes that are actually allocated by the auto declarations. So, for instance, "auto char ***p" only allocates something like 4 bytes, even though p may eventually end up pointing at memory containing thousands of bytes. The 4 should indicate that the compiler has not allocated that other memory and it is still up to the programmer to find it somewhere.