[comp.lang.c] arrays of pointers - NOVICE QUESTION!

flatmas@ladder.cs.orst.edu (Scott Timothy Flatman) (06/04/89)

Recently someone posted a remark that these two declarations are the same:
char *array[some size - you choose];
and
char **array;
My understanding is that the first declaration is for an array of pointers to
char. The second one is confusing me. How is it interpreted?

I tried looking this up in Kernighan and Ritchie. I did not get a
satisfactory explanation. So I tried to make up a simple example along these
lines:

ex#1:
static char *array[2] = {"Hello","World"};
printf("%s %s",*array,*(array+1));

This just prints out "Hello World". I would like to do the same thing using a
declaration such as:  char **array;
and then dynamically allocating the storage for the two strings "Hello","World".
Could anybody send me a simple example? I am trying to figure out how these
declarations are similar. Also, any other bits of code along these lines that
illustrate the idea of "arrays of pointers" would be helpful to me.

Thanks for your support!

----------------------------------------------------
Scott Flatman
INTERNET: flatmas@ladder.cs.orst.edu
UUCP: hplabs!hp-pcd!orstcs!ladder.cs.orst.edu!flatmas
---------------------------------

d88-jwa@nada.kth.se (Jon W{tte) (06/04/89)

In article <> flatmas@ladder.CS.ORST.EDU (Scott Flatman) writes:
>Recently someone posted a remark that these two declarations are the same:
>char *array[some size - you choose];
>and
>char **array;
>My understanding is that the first declaration is for an array of pointers to
>char. The second one is confusing me. How is it interpreted?

The first and the second declarations are identical for all normal uses
as long as you leave the size of the array out i.e.
char *foo[]    ==     char **foo
The ** stands for "pointer to pointer", and since the first pointer
can be temporarily incremented using an "index", this pointer could
as well be a pointer to an array of pointers. The difference is
syntactical -- char *foo[] may be used in formal declarations/parameters
only! Also, the char *argv[] works, the char **argv doesn't, on my machine
(Speaking of main(argc, argv), that is...) This might be a bug in the
* ancient * compiler I use.

Be warned ! IF you specify a size within the brackets, the compiler
reserves memory for your array, but if you use **, you have to do
the memory allocation yourself using malloc() (Or, on certain systems,
mreserve(), NewHandle(), getmem() ...)
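
To make the allocation step concrete, here is a minimal sketch of a
dynamically allocated version of ex#1. The helper names make_vector and
free_vector are invented for illustration and appear nowhere else in this
thread; an ANSI compiler is assumed, and error handling is kept minimal.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy n strings into freshly allocated storage
 * and return a char ** pointing at the first of n char * slots.
 * Assumes n > 0.  Returns NULL if any allocation fails.
 */
char **make_vector(const char *const *src, size_t n)
{
	char **array = malloc(n * sizeof *array);	/* room for n pointers */
	size_t i;

	if (array == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		array[i] = malloc(strlen(src[i]) + 1);	/* +1 for the '\0' */
		if (array[i] == NULL) {
			while (i > 0)		/* undo the partial work */
				free(array[--i]);
			free(array);
			return NULL;
		}
		strcpy(array[i], src[i]);
	}
	return array;
}

/* Release everything make_vector() allocated. */
void free_vector(char **array, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		free(array[i]);
	free(array);
}
```

Given `const char *const words[] = { "Hello", "World" };` and
`char **array = make_vector(words, 2);`, the printf from ex#1 can be used
unchanged with *array and *(array+1), since indexing through a `char **`
is written the same way as indexing an array of `char *`.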

I hope this clarifies at least some of your questions. Flames may
be sent to /dev/null or your local fortune cookie administrator.
-- 
 __       Jon W{tte (The dread Smiley Shark) email:h+@nada.kth.se
/  \      (+46 (0) 8 258 268)
   /---   (c) 1989 Yessbox Allright Professional Products Inc. - Y.A.P.P.I.
  /       -- No More --

chris@mimsy.UUCP (Chris Torek) (06/05/89)

In article <10971@orstcs.CS.ORST.EDU> flatmas@ladder.cs.orst.edu
(Scott Timothy Flatman) writes:
>Recently someone posted a remark that these two declarations are the same:
>char *array[some size - you choose];
>and
>char **array;

They are not the same, nor do they have the same meaning, except in one
special case: as a declaration for a formal parameter.  If you write

	int
	main(argc, argv)
		int argc;
		char *argv[];
	{
		...

the compiler sees the declaration for `argv' as one saying `this is
an array of unknown size, each element of which is a pointer to zero
or more characters'---in pseudo-English,

	declare argv as array ? of pointer to char

(`?' here means `unknown size').  Since the C language definition does
not allow one to call a function with an array parameter---if you try,
e.g., with

	f() {
		char *myargv[10];
		... set up myargv[] ...
		(void) main(9, myargv);
	}

the array in that (`rvalue') position is converted to a pointer to the
array's first element---the compiler says, `Oh, you REALLY meant

	declare argv as pointer to pointer to char

or

		char **argv;

so I shall silently pretend you wrote that.'
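
A small sketch of that special case, assuming an ANSI compiler. The names
count_args and count_fn are made up for this illustration.

```c
#include <stddef.h>

/*
 * Declared with array syntax, but the parameter is adjusted to char **.
 * Incrementing v is legal only because v is a pointer, not an array.
 */
int count_args(char *v[])
{
	int n = 0;

	while (*v++ != NULL)
		n++;
	return n;
}

/* Same function type spelled with char **; the initialisation needs no cast. */
int (*count_fn)(char **) = count_args;
```

Outside a parameter list the two spellings part ways: for a real array
such as `char *myargv[4]`, `sizeof myargv` is the size of four pointers,
and `myargv++` is rejected by the compiler.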

>My understanding is that the first declaration is for an array of pointers to
>char.

Correct.

>The second one is confusing me. How is it interpreted?

It declares a single pointer, which can be set to nil (or NULL) or to
point to a pointer to a character.  Hence if we have a character:

	char c;

and a pointer to it:

	char *p = &c;

we can set a pointer to point to that pointer:

	char **q = &p;

This is not very interesting, because each pointer points to one object
only---we can use q[0] (which is an alias for p) or p[0] (which is an
alias for c) or q[0][0] (another alias for c) but not p[1] nor q[3][17].
To be more interesting, make p point at a whole slew of characters:

	char c[23];
	char *p = &c[0];
	char **q = &p;

Now we can talk about p[0] (an alias for c[0]) through p[22] (an alias
for c[22]) or q[0][0] (c[0] again) through q[0][22] (c[22]), but still
not q[1][?] or q[2][?].  To make q more interesting, make it point at
a slew of pointers:

	char c[23], d[5], e[17];
	char *p[3] = { &c[0], &d[0], &e[0] };
	char **q = &p[0];

Now we can use p[0][0] (an alias for c[0]) or p[1][0] (d[0]) or p[2][0]
(e[0]), and, similarly, q[0][?] through q[2][?].
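
The last set of declarations can be tried out roughly as follows. The
string contents and the accessor pick are assumptions added for the sake
of the example.

```c
char c[23] = "cat", d[5] = "dog", e[17] = "emu";
char *p[3] = { &c[0], &d[0], &e[0] };	/* three pointers, storage reserved */
char **q = &p[0];			/* one pointer, aimed at p[0] */

/* Hypothetical accessor: works the same whether handed p or q. */
char pick(char **rows, int i, int j)
{
	return rows[i][j];
}
```

Both pick(p, 1, 0) and pick(q, 1, 0) yield d[0]. The difference shows up
in the objects themselves: `sizeof p` is the size of three pointers,
while `sizeof q` is the size of one.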

There is still not much reason to use the pointer `q' instead of `p'.
Since I am getting tired of typing, I will just segue into a previous
posting (or perhaps slam into one :-) ).

From: chris@mimsy.UUCP (Chris Torek)
Subject: Re: char ***pointer;
Keywords: allocating space
Message-ID: <14617@mimsy.UUCP>
Date: 18 Nov 88 07:40:26 GMT

	char *p;

declares an object p which has type `pointer to char' and no specific
value.  (If p is static or external, it is initialised to (char *)NULL;
if it is automatic, it is full of garbage.)  Similarly,

	char **p;

declares an object p which has type `pointer to pointer to char' and
no specific value.  We can keep this up for days :-) and write

	char *******p;

which declares an object p which has type `pointer to pointer ... to char'
and no specific value.  But we will stop with

	char ***pppc;

which declares `pppc' as type `pointer to pointer to pointer to char',
and leaves its value unspecified.  None of these pointers point *to*
anything, but if I say, e.g.,

	char c = '!';
	char *pc = &c;
	char **ppc = &pc;
	char ***pppc = &ppc;

then I have each pointer pointing to something.  pppc points to ppc;
ppc points to pc; pc points to c; and hence, ***pppc is the character
'!'.
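
As a quick check of that chain, a sketch with one invented helper,
set_through, that stores a character through all three levels:

```c
char c = '!';
char *pc = &c;
char **ppc = &pc;
char ***pppc = &ppc;

/* Hypothetical helper: store a character through three levels of indirection. */
void set_through(char ***ppp, char value)
{
	***ppp = value;
}
```

After set_through(pppc, '?'), the variable c itself holds '?', since
***pppc and c name the same object.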

Now, there is a peculiar status for pointers in C: they point not only
to the object immediately at *ptr, but also to any other objects in
an array named by *(ptr+offset).  (The latter can also be written as
ptr[offset].)  So I could say:

	int i, j, k;
	char c[NPPC][NPC][NC];
	char *pc[NPPC][NPC];
	char **ppc[NPPC];
	char ***pppc;

	pppc = ppc;
	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];
		for (j = 0; j < NPC; j++) {
			pc[i][j] = c[i][j];
			for (k = 0; k < NC; k++)
				c[i][j][k] = '!';
		}
	}

What this means is perhaps not immediately clear%.  There is a two-
dimensional array of pointers to characters pc[i][j], each of which
points to a number of characters, namely those in c[i][j][0] through
c[i][j][NC-1].  A one-dimensional array ppc[i] contains pointers to
pointers to characters; each ppc[i] points to a number of pointers to
characters, namely those in pc[i][0] through pc[i][NPC-1].  Finally,
pppc points to a number of pointers to pointers to characters, namely
those in ppc[0] through ppc[NPPC-1].
-----
% :-)
-----
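
The same arrangement can be written as a compilable sketch. The sizes
given to NPPC, NPC and NC are arbitrary, and the function name
link_levels is an invention for this example.

```c
enum { NPPC = 2, NPC = 3, NC = 4 };	/* sizes chosen arbitrarily */

char c[NPPC][NPC][NC];
char *pc[NPPC][NPC];
char **ppc[NPPC];
char ***pppc;

/* Wire each level of pointers to the level below it. */
void link_levels(void)
{
	int i, j;

	pppc = ppc;			/* points at ppc[0] */
	for (i = 0; i < NPPC; i++) {
		ppc[i] = pc[i];		/* points at pc[i][0] */
		for (j = 0; j < NPC; j++)
			pc[i][j] = c[i][j];	/* points at c[i][j][0] */
	}
}
```

Once link_levels() has run, pppc[i][j][k] and c[i][j][k] refer to the
same character, although the first expression follows three stored
pointers and the second is plain address arithmetic on one array.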

The important thing to note is that each variable points to one or
more objects whose type is the type derived from removing one `*'
from the declaration of that variable.  (Clear? :-)  Maybe we should
try it this way:)  Since pppc is `char ***pppc', what pppc points to
(*pppc) is of type `char **'---one fewer `*'s.  pppc points to zero
or more objects of this type; here, it points to the first of NPPC
objects.

As to malloc: malloc obtains a blob of memory of unspecified shape.
The cast you put in front of malloc determines the shape of the blob.
The argument to malloc determines its size.  These should agree, or you
will get into trouble later.  So the first thing we need to do is
this:

	pointer = (char ***)malloc(N * sizeof(char **));
	if (pointer == NULL) quit("out of memory... goodbye");

Pointer will then point to N objects, each of which is a `char **'.
None of those `char **'s will have any particular value (i.e., they
do not point anywhere at all; they are garbage).  If we make them
point somewhere---to some object(s) of type `char **'---and make
those objects point somewhere, then we will have something useful.

Suppose we have done the one malloc above.  Then if we use:

	pointer[0] = (char **)malloc(N1 * sizeof(char *));
	if (pointer[0] == NULL) quit("out of memory");

we will have a value to which pointer[0] points, which can point to
N1 objects, each of type `char *'.  So we can then say, e.g.,

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL)
		pointer[0][i++] = strdup(buf);

(strdup is a function that calls malloc to allocate space for a copy
of its string argument, and then copies the string to that space and
returns the new pointer.  If malloc fails, strdup() returns NULL.)
We could write instead

	i = 0;
	while (i < N1 && fgets(buf, sizeof(buf), input) != NULL) {
		*(*pointer)++ = strdup(buf);
		i++;	/* count the strings so pointer[0] can be rewound */
	}

Note that

		**pointer++ = strdup(buf);

sets **pointer (equivalently, pointer[0][0]), then increments the
value in `pointer', not that in pointer[0].  But using *(*pointer)++
means that we will later have to write

	pointer[0] -= i;

to adjust pointer[0] backwards by the number of strings read in and
strdup()ed, or else use negative subscripts to locate the strings.
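
A sketch of the increment-then-rewind pattern, reading from an in-memory
list rather than a file. The names my_strdup and fill_and_rewind are
assumptions; my_strdup is written out because strdup is not part of
ANSI C.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for strdup(): malloc a copy of s, or return NULL. */
char *my_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy != NULL)
		memcpy(copy, s, n);
	return copy;
}

/*
 * Hypothetical helper: store n copies through *(*pointer)++, then move
 * pointer[0] back to where it started.  Returns the number stored.
 */
int fill_and_rewind(char ***pointer, const char *const *src, int n)
{
	int i = 0;

	while (i < n) {
		*(*pointer)++ = my_strdup(src[i]);	/* advances pointer[0] */
		i++;
	}
	pointer[0] -= i;	/* undo the i increments */
	return i;
}
```

The caller is assumed to have allocated pointer and pointer[0] already,
with room for at least n entries in pointer[0].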

Probably all of this will be somewhat clearer with a more realistic
example.  The following code creates an array of arrays of lines.

/* begin code (untested) */
/* this assumes prototypes are available */

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>	/* malloc, realloc, exit */
#include <string.h>

static char nomem[] = "out of memory, exiting";

void quit(char *msg) {
	(void) fprintf(stderr, "%s\n", msg);
	exit(1);
	/* NOTREACHED */
}

/*
 * Read an input string from a file.
 * Return a pointer to dynamically allocated space.
 */
char *readstr(FILE *f) {
	register char *s = NULL, *p;
	int more = 1, curlen = 0, l;
	char inbuf[BUFSIZ];

	/*
	 * The following loop is not terribly efficient if you have
	 * many long input lines.
	 */
	while (fgets(inbuf, sizeof(inbuf), f) != NULL) {
		p = strchr(inbuf, '\n');
		if (p != NULL) {	/* got it all */
			*p = 0;
			l = p - inbuf;
			more = 0;	/* signal stop */
		} else
			l = strlen(inbuf);

		/*
		 * N.B. dpANS says realloc((void *)NULL, n) => malloc(n);
		 * if your realloc does not work that way, you will
		 * have to fix this.
		 */
		s = realloc(s, curlen + l + 1);
		if (s == NULL)
			quit(nomem);
		strcpy(s + curlen, inbuf);
		if (more == 0)		/* done; stop */
			break;
		curlen += l;
	}
	/* should check for input error, actually */
	return (s);
}

/*
 * Read an array of strings into a vector.
 * Return a pointer to dynamically allocated space.
 * The vector has n+1 elements, the last one being NULL.
 */
char **readfile(FILE *f) {
	register char **vec, *s;
	register int veclen;

	/*
	 * This is terribly inefficient, but it should be correct.
	 *
	 * malloc below is implicitly cast to (char **), but this
	 * depends on it returning (void *); old compilers need the
	 * cast, since malloc() returns (char *).  The same applies
	 * to realloc() below.
	 */
	vec = malloc(sizeof(char *));
	if (vec == NULL)
		quit(nomem);
	veclen = 0;
	while ((s = readstr(f)) != NULL) {
		vec = realloc(vec, (veclen + 2) * sizeof(char *));
		if (vec == NULL)
			quit(nomem);
		vec[veclen++] = s;
	}
	vec[veclen] = NULL;
	return (vec);
}

/*
 * Read a list of files specified in an argv.
 * Each file's list of lines is stored as a vector at p[i].
 * The end of the list of files is indicated by p[i] being NULL.
 *
 * It would probably be more useful, if less appropriate
 * for this example, to return a list of (filename, contents) pairs.
 */
char ***readlots(register char **names) {
	register char ***p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;	/* pre-ANSI style; under ANSI C, #include <errno.h> instead */

	p = malloc(sizeof(char **));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		if ((f = fopen(*names, "r")) == NULL) {
			(void) fprintf(stderr, "ThisProg: cannot read %s: %s\n",
				*names, strerror(errno));
			continue;
		}
		vp = readfile(f);
		(void) fclose(f);
		p = realloc(p, (nread + 2) * sizeof(char **));
		if (p == NULL)
			quit(nomem);
		p[nread++] = vp;
	}
	p[nread] = NULL;
	return (p);
}

/* e.g., instead:
struct file_data {
	char	*fd_name;
	char	**fd_text;
};
struct file_data *readlots(register char **names) {
	register struct file_data *p;
	register int nread;
	register FILE *f;
	char **vp;
	extern int errno;

	p = malloc(sizeof(*p));
	if (p == NULL)
		quit(nomem);
	for (nread = 0; *names != NULL; names++) {
		<...same file-reading code as above...>
		p = realloc(p, (nread + 2) * sizeof(*p));
		if (p == NULL)
			quit(nomem);
		p[nread].fd_name = *names;
		p[nread].fd_text = vp;
		nread++;
	}
	p[nread].fd_name = NULL;
	p[nread].fd_text = NULL;
	return (p);
}
*/
/* end of code */
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

d88-jwa@nada.kth.se (Jon W{tte) (06/05/89)

>	 * malloc below is implicitly cast to (char **), but this
>	 * depends on it returning (void *); old compilers need the
>	 * cast, since malloc() returns (char *).  The same applies

Hmmm. I thought that malloc() returned an unsigned long. At least
it does on the Motorola compiler from 1982 I use, under Unix sys V
for M68k-based machines. Can anyone who has more knowledge than me
clarify this point ?

-- 
 __       Jon W{tte (The dread Smiley Shark) email:h+@nada.kth.se
/  \      (+46 (0) 8 258 268)
   /---   (c) 1989 Yessbox Allright Professional Products Inc. - Y.A.P.P.I.
  /       -- No More --

guy@auspex.auspex.com (Guy Harris) (06/06/89)

>The first and the second declarations are identical for all normal uses
>as long as you leave the size of the array out i.e.
>char *foo[]    ==     char **foo

Wrong!  The first and the second declarations are identical only when
used to declare a formal argument to a procedure.

>The ** stands for "pointer to pointer", and since the first pointer
>can be temporarily incremented using an "index", this pointer could
>as well be a pointer to an array of pointers. The difference is
>syntactical -- char *foo[] may be used in formal declarations/parameters
>only!

Wrong.

	extern char *foo[];

is perfectly legitimate, for example, and is a declaration of an array
of pointers to "char" that is defined elsewhere.  The *definition* has
to specify a size for the array, but the *declaration* doesn't.
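
For instance, the following sketch puts the declaration and the
definition in one file for brevity; normally they would sit in separate
source files. The function first_length and the strings are made up.

```c
#include <string.h>

extern char *foo[];	/* declaration only: array of unknown size */

/* Uses foo through the incomplete declaration; indexing works fine. */
size_t first_length(void)
{
	return strlen(foo[0]);
}

/* The definition, which would normally live in another source file. */
char *foo[3] = { "alpha", "beta", "gamma" };
```

Writing `extern char **foo;` in place of the first line would not match
this definition: it would describe a single pointer object, not an array
of three.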

>Also, the char *argv[] works, the char **argv doesn't, on my machine
>(Speaking of main(argc, argv), that is...) This might be a bug in the
>* ancient * compiler I use.

"Might be a bug"?  It definitely *is* a bug!  The "char **argv" is a
perfectly legitimate declaration, and if the compiler you use can't handle
it, it's broken (and wouldn't be able to cope with a *lot* of code out
there).

gwyn@smoke.BRL.MIL (Doug Gwyn) (06/06/89)

In article <1170@draken.nada.kth.se> d88-jwa@nada.kth.se (Jon W{tte) writes:
>Hmmm. I thought that malloc() returned an unsigned long.

Maybe somebody's version had an (unsigned long) size argument, but no
malloc() implementation should return other than a "generic pointer"
(char* or void*, depending on the age of the compilation environment).

karzes@mfci.UUCP (Tom Karzes) (06/06/89)

I once wrote a simple test to exercise our C compiler.  Actually I wrote
a program which will create a program that tests all forms of simple
declarators that can be constructed from indirection "*", function
calls "()", and arrays "[N]" (with grouping parentheses "(...)" thrown
in as needed to obtain the desired precedence), up to a desired level
of declarator composition.  Of course, it weeded out the illegal
combinations (e.g., f()(), g[10](), h()[12], etc.).  The underlying
type is int, and the program declares and statically initializes a
variable or function (determined by the declarator) of each form, then
dynamically loads and prints their values.  Most forms require auxiliary
variables and/or functions.  By convention, these auxiliary objects were
given names which are the name of the primary object followed by an
underscore and a number (the larger the number, the "further away" from
the primary object).

I've included a tiny excerpt from the 7-ply portion of the test.  This
excerpt defines and initializes x361, x362, x363, x364, x365, and x366,
each of which uses auxiliary variables and/or functions.  Also shown is
the section of code from the main program which fully dereferences and
prints the values of these same objects.

Here is what the types are.  Notice that each pointer is treated as
pointing to a single object of the given type, although of course
any pointer that points to a non-function can really point into an
array of those objects.  E.g., int (*x)[7] is treated as a pointer
to an array of 7 ints (which is really what it is), although it could
also be treated as a pointer into an array of arrays of 7 ints (e.g.,
into int y[123][7]).  Note that this latter style is more common in C,
particularly in the single subscript case (e.g., int *x pointing into
int y[123], rather than int (*x)[123] pointing at int y[123]).
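
The two styles can be put side by side in a short sketch. The variable
names z, into and whole, and the helper row_step, are assumptions added
for illustration.

```c
#include <stddef.h>

int y[123][7];
int (*x)[7] = y;	/* pointer into an array of arrays of 7 ints */

int z[123];
int *into = z;		/* the common style: pointer into z */
int (*whole)[123] = &z;	/* the rarer style: pointer at all of z */

/* Hypothetical helper: number of bytes covered by one step of x. */
size_t row_step(void)
{
	return (size_t)((char *)(x + 1) - (char *)x);
}
```

Here x + 1 advances by a whole row of 7 ints, so x[5][3] is y[5][3],
while into[10] and (*whole)[10] both name z[10].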

    ...

    int *(**(*x361())[4])[3]

        A function which returns a pointer to an array of 4 pointers to
        pointers to arrays of 3 pointers to ints.

    int *(**(*x362)[3][4])[3]

        A pointer to an array of 3 arrays of 4 pointers to pointers to
        arrays of 3 pointers to ints.

    int *(**x363[2][3][4])[3]

        An array of 2 arrays of 3 arrays of 4 pointers to pointers to arrays
        of 3 pointers to ints.

    int *(**(**x364)())[3]

        A pointer to a pointer to a function which returns a pointer to a
        pointer to an array of 3 pointers to ints.

    int *(**(*x365[2])())[3]

        An array of 2 pointers to functions which return pointers to pointers
        to arrays of 3 pointers to ints.

    int *(**(*x366())())[3]

        A function which returns a pointer to a function which returns a
        pointer to a pointer to an array of 3 pointers to ints.

    ...


Here are the actual excerpts from the program:

...

int x361_4 = 361;
int *x361_3[3] =
    {0,
     &x361_4};
int *(*x361_2)[3] = (int *(*)[3]) /* & */ x361_3;
int *(**x361_1[4])[3] =
    {0, 0,
     &x361_2};
int *(**(*x361())[4])[3] { return (int *(**(*)[4])[3]) /* & */ x361_1; }

int x362_4 = 362;
int *x362_3[3] =
    {0,
     &x362_4};
int *(*x362_2)[3] = (int *(*)[3]) /* & */ x362_3;
int *(**x362_1[3][4])[3] =
    {{0},
     {0, 0,
      &x362_2}};
int *(**(*x362)[3][4])[3] = (int *(**(*)[3][4])[3]) /* & */ x362_1;

int x363_3 = 363;
int *x363_2[3] =
    {0,
     &x363_3};
int *(*x363_1)[3] = (int *(*)[3]) /* & */ x363_2;
int *(**x363[2][3][4])[3] =
    {{{0}},
     {{0},
      {0, 0,
       &x363_1}}};

int x364_5 = 364;
int *x364_4[3] =
    {0,
     &x364_5};
int *(*x364_3)[3] = (int *(*)[3]) /* & */ x364_4;
int *(**x364_2())[3] { return &x364_3; }
int *(**(*x364_1)())[3] = /* & */ x364_2;
int *(**(**x364)())[3] = &x364_1;

int x365_4 = 365;
int *x365_3[3] =
    {0,
     &x365_4};
int *(*x365_2)[3] = (int *(*)[3]) /* & */ x365_3;
int *(**x365_1())[3] { return &x365_2; }
int *(**(*x365[2])())[3] =
    {0,
     /* & */ x365_1};

int x366_4 = 366;
int *x366_3[3] =
    {0,
     &x366_4};
int *(*x366_2)[3] = (int *(*)[3]) /* & */ x366_3;
int *(**x366_1())[3] { return &x366_2; }
int *(**(*x366())())[3] { return /* & */ x366_1; }

...

int c1 = 1;
int c2 = 2;

main()
{
    ...

    printf(" %3d",   *(**(*x361())[c2])[c1]);
    printf(" %3d",   *(**(*x362)[c1][c2])[c1]);
    printf(" %3d",   *(**x363[c1][c1][c2])[c1]);
    printf(" %3d",   *(**(**x364)())[c1]);
    printf(" %3d",   *(**(*x365[c1])())[c1]);
    printf(" %3d",   *(**(*x366())())[c1]);

    ...
}
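
One way to read a declarator such as x364 is to build it up from
typedefs. The sketch below is an illustration only, not part of the test
program; the typedef names (row3, getter, getter_ptr) and the auxiliary
names are invented, and the auxiliary objects only loosely follow the
numbering convention above.

```c
typedef int *row3[3];		/* array of 3 pointers to int */
typedef row3 **getter(void);	/* function returning pointer to pointer to row3 */
typedef getter *getter_ptr;	/* pointer to such a function */

int v364 = 364;
int *arr364[3] = { 0, &v364 };
row3 *p_arr364 = &arr364;

row3 **get364(void) { return &p_arr364; }

getter_ptr fp364 = get364;
getter_ptr *x364 = &fp364;

/* Redeclaring with the raw declarator compiles only if the types agree. */
extern int *(**(**x364)())[3];

/* Fully dereference x364, as the printf in the excerpt does. */
int fetch364(void)
{
	return *(**(**x364)())[1];
}
```

Reading the typedefs from the bottom up gives the English description:
a pointer to a pointer to a function which returns a pointer to a
pointer to an array of 3 pointers to ints.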

ken@laidbak.UUCP (Ken Eglaston) (06/06/89)

In article <1170@draken.nada.kth.se> d88-jwa@nada.kth.se (Jon W{tte) writes:
>>	 * malloc below is implicitly cast to (char **), but this
>>	 * depends on it returning (void *); old compilers need the
>>	 * cast, since malloc() returns (char *).  The same applies
>Hmmm. I thought that malloc() returned an unsigned long. At least
>it does on the Motorola compiler from 1982 I use, under Unix sys V
>for M68k-based machines. Can anyone who has more knowledge than me
>clarify this point ?
> __       Jon W{tte (The dread Smiley Shark) email:h+@nada.kth.se
   According to the manual pages on most UNIX systems, malloc() is defined as:
	char *malloc(size)
	      unsigned size;
so a (char *) pointer to the chunk of memory asked for (if enough memory
exists) is returned, and you cast the pointer to whatever your need is.
    You might have mistaken the unsigned size for an unsigned long.....with
the constant revisions going on, it's not hard to get turned around! :>
-- 
Ken Eglaston   /!iexist!ken <- preferred -> FIDO: 115/777
UUCP:       att|!laidbak!ken                      115/108
               \!laidbak!laidy!ken

guy@auspex.auspex.com (Guy Harris) (06/07/89)

>Hmmm. I thought that malloc() returned an unsigned long. At least
>it does on the Motorola compiler from 1982 I use, under Unix sys V
>for M68k-based machines. Can anyone who has more knowledge than me
>clarify this point ?

"malloc" returns a "char *" on any valid C implementation.  It probably
happens that, on the implementation to which you refer, "unsigned long"
and "char *" have the same number of bits; however, this does not, of
course, make them equivalent.

The only ways in which I could see "malloc" returning an "unsigned long"
on the implementation of which you speak are if:

	1) they botched the documentation (*and* the "lint" library!)
	   horribly, and claim therein that it returns an "unsigned
	   long";

	2) the compiler in question returns integral values in, say, D0,
	   and pointer values in, say, A0, and "malloc" returns its
	   value in D0 - which means that they'd have had to tamper with
	   the vanilla S5 source, since said source defines it as
	   returning a "char *" (yes, I looked, and the source
	   *clearly* starts with "char *malloc(<argument>)", for both
	   versions of "malloc"), as well as botching the documentation
	   and the "lint" library.

I doubt that either one occurred, but if either did, the person
responsible should be shot. 

gorpong@telxon.uucp (Gordon C. Galligher) (06/08/89)

In article <1163@draken.nada.kth.se> h+@nada.kth.se (Jon W{tte) writes:
>Also, the char *argv[] works, the char **argv doesn't, on my machine
>(Speaking of main(argc, argv), that is...) This might be a bug in the
>* ancient * compiler I use.

Yes, it is a bug.  It's ancient, but I'd still complain if it can't handle
a simple thing like that!

		-- Gordon.


Gordon C. Galligher  <|> ...!uunet!telxon!gorpong <|> gorpong@telxon.uucp.uu.net
Telxon Corporation   <|> 
Akron, Ohio, 44313   <|> "Life is NOT a dream." -Spock "Go to sleep Spock" - Jim
(216) 867-3700 (3512)<|>  	Star Trek V:  The Final Frontier