[comp.lang.c] char ***pointer;

root@mjbtn.MFEE.TN.US (Mark J. Bailey) (11/14/88)

Hello,

I have a question for the C gurus out there.  It has to do with a pointer
declaration and its subsequent use.

I have declared:

	char	***pointer;

Now I know that *ch is generally pointing to the start of a string, ie,
a list (array) of characters.  **ch would be equal to say a list of lists
(or an array of strings, ie, **argv, for example).  ***ch would support
a list of lists of lists of characters.  I suppose this could be associated
with a two dimensional array of strings.

My biggest question here is, when I declare a *ch and then call malloc
to allocate a space for it to point to, I have a string.  I can ch++ to
move up the string.  But when I have **ch, and subsequently, ***ch, do
I have to call malloc to allocate space for more pointers?

For example (using pointer above),

pointer = (char ***) malloc((x)*sizeof(char *));

to allocate space for the *LIST* of pointers I am wanting to use?    

Let me put it another way, I intend to maintain an array of arrays of
strings.  Because of the flexibility of pointers and the fact that I 
don't know ahead of time how many strings and arrays of strings that I  
am going to maintain in my top level array, I use ***pointer instead
of pointer[x][y][z];

Now, when I have a string, I call malloc to allocate space for that
string.  I store the string in the new space and then attach it to my
pointer structure.  Can I say **pointer = (char *) newspace?  Then can
I say **pointer++ to move to the next space?  Do I have to create spaces
to **pointer++ to?  A pointer *ch and **ch both occupy 4 bytes on my
machine under MSDOS.  How does the system view *ch, **ch, and ***ch
differently such that I theoretically could do this?

I apologize for my uncertainty in presenting this, yet I am quite 
uncertain about it.  I don't know exactly how one would view multi-
dimension pointers and I was taking a stab at it here hoping someone
better versed in the concept might shed some light on it for me.
I also know that *ch can be referenced with ch[] and **ch with
ch[][].  I assume ***ch can use ch[][][], but is this legal?  And
it just seems to me that somehow, one would have to ahead of
time allocate space for pointers that you might move to as in
**pointer++ above.  I guess you could view that as pointer[0][0] going
to pointer[0][1].

I feel like I am way off base completely.  Please tell me so if I 
am.  Again sorry for such a long posting.  I would be most appreciative
to any help (and light - PLEASE!!!) anyone might provide.  I know
I don't know what I am talking about! :-)

Mark.

P.S. Email responses would be fine as I am sure this is highly specialized
     and should not occupy any more bandwidth than is necessary.  I will
     summarize and post if warranted.

-- 
Mark J. Bailey                                    "Y'all com bak naw, ya hear!"
USMAIL: 511 Memorial Blvd., Murfreesboro, TN 37129 ___________________________
VOICE:  +1 615 893 4450 / +1 615 896 4153          |         JobSoft
UUCP:   ...!{ames,mit-eddie}!killer!mjbtn!mjb      | Design & Development Co.
DOMAIN: mjb@mjbtn.MFEE.US.TN                       |  Murfreesboro, TN  USA

throopw@xyzzy.UUCP (Wayne A. Throop) (11/18/88)

> root@mjbtn.MFEE.TN.US (Mark J. Bailey)
> I have declared:
>	char	***pointer;
> Now I know that *ch is generally pointing to the start of a string, ie,
> a list (array) of characters.

Mark has already departed into nonstandard (for C) terminology, and so
I imagine that many (like myself) will be having to guess at what he
really means by his questions.  For example, does he mean that the
expression (*ch) points to the start of a string when the variable ch
is declared as in the example just above?  I suspect not, because it
just isn't true.  Now if something were declared as char *ch, then the
expression (ch) (note: not (*ch)) would point to a character which
could be one of an array of characters.  (Further, in C, when somebody
talks about a "list", they are usually talking about independently
allocated nodes connected by pointers.  Arrays are called arrays, or
sometimes vectors.  Much, *much* more rarely lists.)

So, to start off, I presume that when Mark says "D is a mumble", he
means "The identifier in D, if declared as in D, is a mumble".  That's
how the following makes the most sense.

> **ch would be equal to say a list of lists
> (or an array of strings, ie, **argv, for example).  ***ch would support
> a list of lists of lists of characters.  I suppose this could be associated
> with a two dimensional array of strings.

Well, no.  If declared as char ***ch, ch is not a two dimensional
array of strings, other than in the most abstract, conceptual way.
The object named by ch is a pointer.  The pointer denotes a member of
a one-dimensional array of pointers.  Members of this array of
pointers each denote a member of another (potentially unique)
one-dimensional array of pointers.  Finally, members of this last
array each denote a member of a (potentially unique) one-dimensional
array of characters.

The important thing to note is that the only object for which space is
allocated by the declaration char ***ch is that for the first pointer.
The array of pointers-to-pointers and the arrays of pointers-to-char
and the arrays of characters are all left up to the user to allocate.
Therefore:

> My biggest question here is, when I declare a *ch and then call malloc
> to allocate a space for it to point to, I have a string.  I can ch++ to
> move up the string.  But when I have **ch, and subsequently, ***ch, do
> I have to call malloc to allocate space for more pointers?

Yes, one must allocate all pointers and characters except for ch itself.

> For example (using pointer above),
>    pointer = (char ***) malloc((x)*sizeof(char *));
> to allocate space for the *LIST* of pointers I am wanting to use?

No.  Space is being allocated for items of type (char *), and then
that space is treated as if it were for items of type (char **).
The malloc should, instead, look like this:

    pointer = (char ***) malloc((x)*sizeof(char **));

And, of course, one would then have to allocate space for the second
level of pointers as well as for the character arrays:

    for( i=0; i<x; i++ ){       /* x: the number of (char **) allocated above */
        pointer[i] = (char **) malloc((y)*sizeof(char *));
        for( j=0; j<y; j++ ){   /* y: the number of (char *) in each pointer[i] */
            pointer[i][j] = (char *) malloc((z)*sizeof(char));
        }
    }
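
A minimal sketch of the same three-level allocation with failure checks
and a matching cleanup routine.  The names alloc3 and free3 are
hypothetical (they are not from the post above), and the sketch assumes
an ANSI-style malloc that returns (void *), so no casts are needed:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free an x-by-y table of strings built by alloc3().
 * Works on a partially built table because unused slots are set to NULL. */
void free3(char ***p, size_t x, size_t y)
{
    size_t i, j;

    if (p == NULL)
        return;
    for (i = 0; i < x; i++) {
        if (p[i] == NULL)
            continue;
        for (j = 0; j < y; j++)
            free(p[i][j]);              /* free(NULL) is a no-op */
        free(p[i]);
    }
    free(p);
}

/* Hypothetical helper: x arrays of y strings, each with room for z chars
 * (including the terminating NUL).  Returns NULL if any malloc fails. */
char ***alloc3(size_t x, size_t y, size_t z)
{
    char ***p;
    size_t i, j;

    assert(x > 0 && y > 0 && z > 0);
    p = malloc(x * sizeof *p);          /* x objects of type char ** */
    if (p == NULL)
        return NULL;
    for (i = 0; i < x; i++)
        p[i] = NULL;                    /* so free3() can run at any time */
    for (i = 0; i < x; i++) {
        p[i] = malloc(y * sizeof *p[i]);    /* y objects of type char * */
        if (p[i] == NULL)
            goto fail;
        for (j = 0; j < y; j++)
            p[i][j] = NULL;
        for (j = 0; j < y; j++) {
            p[i][j] = malloc(z);        /* sizeof(char) is 1 by definition */
            if (p[i][j] == NULL)
                goto fail;
            p[i][j][0] = '\0';          /* start out as an empty string */
        }
    }
    return p;
fail:
    free3(p, x, y);
    return NULL;
}
```

After a successful call such as alloc3(2, 3, 8), the expression
p[1][2][0] names the first character of the last string, and
free3(p, 2, 3) releases every level in the reverse order of allocation.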

> Can I say **pointer = (char *) newspace?  Then can
> I say **pointer++ to move to the next space?

Depends on what is meant by "the next space".  In the example above,
one can indeed say **pointer = (char *)malloc(mumble); and one can
indeed say **pointer++.  Of course, if p was a pointer typed (char
***) and (pointer==p) before the expression (**pointer++) is
evaluated, then (pointer==(&p[1])) afterwards.  In other words, one
would be starting to step through the allocated arrays in fortran, or
anti-odometer, order instead of C or odometer order.  (Conceptually
speaking, of course, since these aren't multi-dimensional arrays, but
rather collections of uni-dimensional arrays connected by pointers.)

> I also know that *ch can be referenced with ch[] and **ch with
> ch[][].  I assume ***ch can use ch[][][], but is this legal?

Yes, as long as the data structures to which the declaration char ***ch;
refers have been allocated as outlined above.  (And, of course, assuming
that the brackets are filled in with some subscript values.)

> [...] as in
> **pointer++ above.  I guess you could view that as pointer[0][0] going
> to pointer[0][1].

No, that's p[0][0] to p[1][0].  Postfix (++) has greater precedence
than unary (*).
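
A small sketch that checks this parse.  The function name bump is
hypothetical; it evaluates **pointer++ on its own copy of the pointer
and hands back both the value of the expression and the new position:

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate **pointer++ on a local copy of the pointer.
 * *first receives the value of the expression (the old pointer[0][0]);
 * the return value is where the pointer ends up afterwards. */
char ***bump(char ***pointer, char **first)
{
    assert(pointer != NULL && *pointer != NULL && first != NULL);
    *first = **pointer++;   /* parsed as *(*(pointer++)) */
    return pointer;         /* one element further along the top-level array */
}
```

Called with p, the returned pointer compares equal to &p[1], so the
next **pointer names p[1][0] rather than p[0][1].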

This IS one of those topics that comes by once every few months, so
there is clearly a lot of confusion about how arrays and pointers
relate to each other in C.  The way I keep it straight is to remember
that the declaration

     sometype *p;

is talking about two chunks of storage, one for a pointer, which is
allocated by the declaration, and one for an array of objects of
sometype, which the user must allocate.  On the other hand, the
declaration

     sometype a[SOMESIZE];

is talking about only one chunk of storage, that for an array of
objects of sometype, which the compiler and/or runtime system
allocates.  Then apply the distinction recursively for multiple
dimensions or indirections.

The confusion enters because when (a) is mentioned in an expression,
it usually evaluates to the address of the first element in the chunk
of storage it represents.  Since (p) evaluates to the address of the
first element in the chunk of storage p has been set to point to, this
means that (a) and (p) are often used in exactly the same way in
practice, the difference being that in the (p) case the user must
allocate and free the chunk of memory that is occupied by the array of
sometype objects, and in the (a) case, the compiler takes care of such
things.

Hope this posting is of some help.

--
A is for Atom; they are all so small,
That we have not really seen any at all.

B is for Bomb.  They are much bigger.
So, mister you better keep off of the trigger.
					--- Edward Teller
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

chris@mimsy.UUCP (Chris Torek) (11/18/88)

In article <360@mjbtn.MFEE.TN.US> root@mjbtn.MFEE.TN.US (Mark J. Bailey)
ends with:
>P.S. Email responses would be fine as I am sure this is highly specialized
>     and should not occupy any more bandwidth than is necessary.  I will
>     summarize and post if warranted.

Here I go, breaking the rules again :-) ...

>I have declared:
>
>	char	***pointer;
>
>Now I know that *ch is generally pointing to the start of a string, ie,
>a list (array) of characters.  **ch would be equal to say a list of lists
>(or an array of strings, ie, **argv, for example).  ***ch would support
>a list of lists of lists of characters.  I suppose this could be associated
>with a two dimensional array of strings.

The first thing to do is establish communication.  I think that the
text above is the wrong way to start.

	char *p;

declares an object p which has type `pointer to char' and no specific
value.  (If p is static or external, it is initialised to (char *)NULL;
if it is automatic, it is full of garbage.)  Similarly,

	char **p;

declares an object p which has type `pointer to pointer to char' and
no specific value.  We can keep this up for days :-) and write

	char *******p;

which declares an object p which has type `pointer to pointer ... to char'
and no specific value.  But we will stop with

	char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char',
and leaves its value unspecified.  Now:

>My biggest question here is, when I declare a *ch and then call malloc
>to allocate a space for it to point to, I have a string.  I can ch++ to
>move up the string.  But when I have **ch, and subsequently, ***ch, do
>I have to call malloc to allocate space for more pointers?

Malloc has little to do with it.  None of these pointers points *to*
anything.  But if I say, e.g.,

	char c = '!';
	char *pc = &c;
	char **ppc = &pc;
	char ***pppc = &ppc;

then I have each pointer pointing to something.  pppc points to ppc;
ppc points to pc; pc points to c; and hence, ***pppc is the character
'!'.

Now, there is a peculiar status for pointers in C: they point not only
to the object immediately at *ptr, but also to any other objects in
an array named by *(ptr+offset).  (The latter can also be written as
ptr[offset].)  So I could say:

	int i, j, k;
	char c[NPPC][NPC][NC];
	char *pc[NPPC][NPC];
	char **ppc[NPPC];
	char ***pppc;

	pppc = ppc;
	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];
		for (j = 0; j < NPC; j++) {
			pc[i][j] = c[i][j];
			for (k = 0; k < NC; k++)
				c[i][j][k] = '!';
		}
	}

What this means is perhaps not immediately clear%.  There is a two-
dimensional array of pointers to characters pc[i][j], each of which
points to a number of characters, namely those in c[i][j][0] through
c[i][j][NC-1].  A one-dimensional array ppc[i] contains pointers to
pointers to characters; each ppc[i] points to a number of pointers to
characters, namely those in pc[i][0] through pc[i][NPC-1].  Finally,
pppc points to a number of pointers to pointers to characters, namely
those in ppc[0] through ppc[NPPC-1].
-----
% :-)
-----

The important thing to note is that each variable points to one or
more objects whose type is the type derived from removing one `*'
from the declaration of that variable.  (Clear? :-)  Maybe we should
try it this way:)  Since pppc is `char ***pppc', what pppc points to
(*pppc) is of type `char **'---one fewer `*'s.  pppc points to zero
or more objects of this type; here, it points to the first of NPPC
objects.
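
A compilable sketch of that wiring, for anyone who wants to poke at it.
The sizes chosen for NPPC, NPC and NC and the function name wire are
assumptions made only for this illustration:

```c
#include <assert.h>

#define NPPC 2      /* arbitrary sizes, chosen only for this sketch */
#define NPC  3
#define NC   4

static char   c[NPPC][NPC][NC];
static char  *pc[NPPC][NPC];
static char **ppc[NPPC];

/* Wire the three levels together and return the top-level pointer. */
char ***wire(void)
{
    int i, j, k;

    for (i = 0; i < NPPC; i++) {
        ppc[i] = pc[i];             /* pc[i] decays to &pc[i][0], a char ** */
        for (j = 0; j < NPC; j++) {
            pc[i][j] = c[i][j];     /* c[i][j] decays to &c[i][j][0], a char * */
            for (k = 0; k < NC; k++)
                c[i][j][k] = '!';
        }
    }
    /* sanity check: the last pointer leads to the last row of characters */
    assert(ppc[NPPC - 1][NPC - 1] == c[NPPC - 1][NPC - 1]);
    return ppc;                     /* ppc decays to &ppc[0], a char *** */
}
```

After pppc = wire(), the expressions pppc[i][j][k] and c[i][j][k] name
the same character, although the first is reached by following two
stored pointers and the second by plain address arithmetic.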

>For example (using pointer above),
>
>pointer = (char ***) malloc((x)*sizeof(char *));
>
>to allocate space for the *LIST* of pointers I am wanting to use?    

Back to malloc: malloc obtains a blob of memory of unspecified
shape.  The cast you put in front of malloc determines the shape
of the blob.  The argument to malloc determines its size.  These
should agree, or you will get into trouble later.  So the first thing
we need to do is this:

	pointer = (char ***)malloc(N * sizeof(char **));
	if (pointer == NULL) quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'.
None of those `char **'s will have any particular value (i.e., they
do not point anywhere at all).  If we make them point somewhere---
to some object(s) of type `char *'---and make those objects point
somewhere, then we will have something useful.

>Now, when I have a string, I call malloc to allocate space for that
>string.  I store the string in the new space and then attach it to my
>pointer structure.  Can I say **pointer = (char *) newspace?  Then can
>I say **pointer++ to move to the next space?  Do I have to create spaces
>to **pointer++ to?

Suppose we have done the one malloc above.  Then if we use:

	pointer[0] = (char **)malloc(N1 * sizeof(char *));
	if (pointer[0] == NULL) quit("out of memory");

we will have a value to which pointer[0] points, which can point to
N1 objects, each of type `char *'.  So we can then say, e.g.,

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
		pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy
of its string argument, and then copies the string to that space and
returns the new pointer.  If malloc fails, strdup() returns NULL.)
We could write instead

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL) {
		*(*pointer)++ = strdup(buf);
		i++;		/* still count the strings stored */
	}

Note that

		**pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the
value in `pointer', not that in pointer[0].  But using *(*pointer)++
means that we will later have to write

	pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and
strdup()ed, or else use negative subscripts to locate the strings.
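
A minimal sketch of the store-then-rewind idiom, using strings that
already exist so that strdup (not part of ANSI C) is not needed.  The
function name fill_and_rewind and its parameters are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Store up to n strings from the NULL-terminated list src through
 * *(*pointer)++, then rewind pointer[0] so that ordinary subscripts
 * work again.  pointer[0] must have room for n entries.
 * Returns the number of strings stored. */
size_t fill_and_rewind(char ***pointer, char **src, size_t n)
{
    size_t i = 0;

    assert(pointer != NULL && pointer[0] != NULL && src != NULL);
    while (i < n && src[i] != NULL) {
        *(*pointer)++ = src[i];     /* store, then advance pointer[0] */
        i++;
    }
    pointer[0] -= i;                /* undo the i increments */
    return i;
}
```

Without the final subtraction, pointer[0] would be left pointing just
past the last string stored, and pointer[0][0] would no longer be the
first one.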

Probably all of this will be somewhat clearer with a more realistic
example.  The following code creates an array of arrays of lines.

/* begin code (untested) */
/* this assumes prototypes are available */

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>	/* malloc, realloc, exit */
#include <string.h>

static char nomem[] = "out of memory, exiting";

void quit(char *msg) {
	(void) fprintf(stderr, "%s\n", msg);
	exit(1);
	/* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *readstr(FILE *f) {
	register char *s = NULL, *p;
	int more = 1, curlen = 0, l;
	char inbuf[BUFSIZ];

	/*
	 * The following loop is not terribly efficient if you have
	 * many long input lines.
	 */
	while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
		p = strchr(inbuf, '\n');
		if (p != NULL) {	/* got it all */
			*p = 0;
			l = p - inbuf;
			more = 0;	/* signal stop */
		} else
			l = strlen(inbuf);

		/*
		 * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
		 * if your realloc does not work that way, you will
		 * have to fix this.
		 */
		s = realloc(s, curlen + l + 1);
		if (s == NULL)
			quit(nomem);
		strcpy(s + curlen, inbuf);
		if (more == 0)		/* done; stop */
			break;
		curlen += l;
	}
	/* should check for input error, actually */
	return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * The vector has n+1 entries, the last one being NULL.
 */
char **readfile(FILE *f) {
	register char **vec, *s;
	register int veclen;

	/*
	 * This is terribly inefficient, but it should be correct.
	 *
	 * malloc below is implicitly cast to (char **), but this
	 * depends on it returning (void *); old compilers need the
	 * cast, since malloc() returns (char *).  The same applies
	 * to realloc() below.
	 */
	vec = malloc(sizeof(char *));
	if (vec == NULL)
		quit(nomem);
	veclen = 0;
	while ((s = readstr(f)) != NULL) {
		vec = realloc(vec, (veclen + 2) * sizeof(char *));
		if (vec == NULL)
			quit(nomem);
		vec[veclen++] = s;
	}
	vec[veclen] = NULL;
	return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***readlots(register char **names) {
	register char ***p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(char **));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		if ((f = fopen(*names, "r")) == NULL) {
			(void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
				*names, strerror(errno));
			continue;
		}
		vp = readfile(f);
		(void) fclose(f);
		p = realloc(p, (nread + 2) * sizeof(char **));
		if (p == NULL)
			quit(nomem);
		p[nread++] = vp;
	}
	p[nread] = NULL;
	return (p);
}

/* e.g., instead:
struct file_data {
	char	*fd_name;
	char	**fd_text;
};
struct file_data *readlots(register char **names) {
	register struct file_data *p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(*p));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		<...same file-reading code as above...>
		p = realloc(p, (nread + 2) * sizeof(*p));
		if (p == NULL)
			quit(nomem);
		p[nread].fd_name = *names;
		p[nread].fd_text = vp;
		nread++;
	}
	p[nread].fd_name = NULL;
	p[nread].fd_text = NULL;
	return (p);
}
*/
/* end of code */
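
A sketch of how a caller might walk and release a structure of the
shape described above: a NULL-terminated array of NULL-terminated
arrays of strings.  The names count_lines and free_lots are
hypothetical, and free_lots assumes that every level came from
malloc or realloc:

```c
#include <assert.h>
#include <stdlib.h>

/* Count every string in a NULL-terminated array of NULL-terminated
 * arrays of strings. */
size_t count_lines(char ***files)
{
    size_t n = 0;
    char ***fp, **lp;

    assert(files != NULL);
    for (fp = files; *fp != NULL; fp++)         /* each file's vector */
        for (lp = *fp; *lp != NULL; lp++)       /* each line in that vector */
            n++;
    return n;
}

/* Release all three levels: the strings, each vector, then the top array. */
void free_lots(char ***files)
{
    char ***fp, **lp;

    if (files == NULL)
        return;
    for (fp = files; *fp != NULL; fp++) {
        for (lp = *fp; *lp != NULL; lp++)
            free(*lp);
        free(*fp);
    }
    free(files);
}
```

The NULL terminators are what make this possible with no separate
counts: each loop stops at the sentinel stored at the end of its level.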
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (11/19/88)

In article <1854@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) writes:
> This IS one of those topics that comes by once every few months, so
> there is clearly a lot of confusion about how arrays and pointers
> relate to each other in C.
> Hope this posting is of some help.

One additional suggestion for people that are trying to understand
the difference between pointers and arrays is to write some small
test programs that print the results of sizeof() on the various
variables and expressions.  Some of the values will be surprising
and until one understands why the values are what they are, one
will never properly understand how to use arrays and pointers.
Also, play around with assigning various expressions to variables
and try to understand why something is or is not legal.
e.g.

#include <stdio.h>

int main(){
    /* using different prime dimensions helps when looking at sizeof() */
    auto char arr[3][5][7];
    auto char ***p;
    auto char (*q)[7];
    auto char *r;

    p = arr;           /* this is illegal */
    q = &arr[0][0];    /* this is legal, but old compilers ignore the & */
    r = &arr[0][0][0];

    printf("%d %d %d %d\n",
        sizeof(arr), sizeof(arr[0]), sizeof(arr[0][0]), sizeof(arr[0][0][0]));

    printf("%d %d %d %d\n",
        sizeof(p), sizeof(*p), sizeof(**p), sizeof(***p));

    printf("%d %d %d\n",
        sizeof(q), sizeof(*q), sizeof(*q[0]));

    printf("%d %d %d\n",
        sizeof(r), sizeof(*r), sizeof(r[0]));

    /* %d should be %ld on some systems */
}

Note that the sizeof(arr), sizeof(p), etc. indicate the total number
of bytes that are actually allocated by the auto declarations.
So, for instance "auto char ***p" only allocates something like
4 bytes, even though p may eventually end up pointing at memory
containing thousands of bytes.  The 4 should indicate that the
compiler has not allocated that other memory and it is still up
to the programmer to find it somewhere.
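
A variant of the same experiment for a C99 compiler, offered as a
sketch: it uses %zu, which matches the type that sizeof yields, and it
leaves out the deliberately illegal assignment.  The function name
show_sizes is hypothetical.  Only the array sizes are fixed by the
language; the pointer sizes depend on the machine:

```c
#include <assert.h>
#include <stdio.h>

void show_sizes(void)
{
    char arr[3][5][7];
    char ***p = NULL;
    char (*q)[7] = arr[0];      /* arr[0] decays to &arr[0][0] */
    char *r = &arr[0][0][0];

    /* These follow from the declarations: 3*5*7, 5*7 and 7 chars. */
    assert(sizeof arr == 105 && sizeof arr[0] == 35);
    assert(sizeof arr[0][0] == 7 && sizeof *q == 7);

    printf("%zu %zu %zu %zu\n",
        sizeof arr, sizeof arr[0], sizeof arr[0][0], sizeof arr[0][0][0]);

    /* sizeof does not evaluate its operand, so the null p is never followed */
    printf("%zu %zu %zu %zu\n",
        sizeof p, sizeof *p, sizeof **p, sizeof ***p);

    printf("%zu %zu %zu\n",
        sizeof q, sizeof *q, sizeof *q[0]);

    printf("%zu %zu %zu\n",
        sizeof r, sizeof *r, sizeof r[0]);
}
```

The first line of output is the same everywhere; the first three
numbers on the second line are pointer sizes and vary from system to
system.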