[net.lang.c] C declarations

Fred <jfn%vanderbilt.csnet@csnet-relay.arpa> (01/25/85)

   I have a question about C declarations.  The [] notation is equivalent
to the * notation, right?  We have

	int ptr[]   <=>   int *ptr

and 

	int *ptr[]  <=>   int **ptr


   The question concerns the [] syntax, which takes on a different meaning if 
data initialization occurs.  For example:

	int ptr[];	declares one pointer

but

	int ptr[] = { 1, 2, 3 };	declares a three element int array.


  Is this a desirable characteristic of C?  Could someone please comment on
the precise meaning of [] in declarations.

  Thanks, my address is
	CS-Net:  jfn@vanderbilt

keesan@bbncca.ARPA (Morris M. Keesan) (01/29/85)

-------------------------------
>From: Fred <jfn%vanderbilt.csnet@csnet-relay.arpa>
>Subject: C declarations
>   I have a question about C declarations.  The [] notation is equivalent
>to the * notation, right?  We have
>	 int ptr[]   <=>   int *ptr
>and 
>	 int *ptr[]  <=>   int **ptr
>   The question concerns the [] syntax, which takes on a different meaning if 
>data initialization occurs.  For example:
>	 int ptr[];	 declares one pointer
>but
>	 int ptr[] = { 1, 2, 3 };	 declares a three element int array.
>  Is this a desirable characteristic of C?  Could someone please comment on
>the precise meaning of [] in declarations.

Sigh.  This is something we cover in this newsgroup/list at least twice a year.
[] and * are NOT the same.  A few citations from the C Reference Manual (CRM),
as printed in "The C Programming Language", by Kernighan & Ritchie (K&R):

    CRM section 8.4, K&R pp. 194-5: 
        Now imagine a declaration T D1 where T is a type-specifier (like int,
    etc.) and D1 is a declarator. . . . If D1 has the form
    D[constant-expression] or D[] then the contained identifier has type "...
    array of T". [Ed. note:  NOT "pointer to T".] . . . When several "array of"
    specifications are adjacent . . .  the constant expressions . . . may be
    missing only for the first member of the sequence. This elision is useful
    when the array is external and the actual definition, which allocates
    storage, is given elsewhere.  The first constant-expression may also be
    omitted when the declarator is followed by initialization.  In this case the
    size is calculated from the number of initial elements supplied. 
    CRM section 10.1, K&R p. 205, "External function definitions":
        . . . since a reference to an array . . . is taken to mean a pointer to
    the first element of the array, declarations of formal parameters declared
    "array of ..." are adjusted to read "pointer to ...".

So in the examples above, the declarations declare:

	int *ptr[];            /* ptr is a pointer to an array of int */
	int ptr[];             /* ptr is an array of int, of unspecified size */
	int ptr[] = { 1, 2, 3 }; /* ptr is an array of three ints */

The ONLY time (repeat ONLY) when [] is equivalent to * is in the declaration of
formal parameters to a function.  E.g.

    f(ptr) int ptr[]; { . . . }
	is indeed equivalent to
    f(ptr) int *ptr; { . . . }

In retrospect, considering the confusion this has caused through the years, I
think it was probably a mistake to allow this equivalence.
-- 
			    Morris M. Keesan
			    {decvax,linus,ihnp4,wivax,wjh12,ima}!bbncca!keesan
			    keesan @ BBN-UNIX.ARPA

ag4@pucc-h (Angus Greiswald the fourth) (01/29/85)

> 	int ptr[];	declares one pointer
> but
> 	int ptr[] = { 1, 2, 3 };	declares a three element int array.
>
>   Is this a desirable characteristic of C?  Could someone please comment on
> the precise meaning of [] in declarations.

Well, I look at it this way: foo[] is an array whose location and/or size is
variable and thus needs to be declared as a pointer, and biff[4] is a fixed
size array whose location is constant, and thus biff can be declared as
a constant.  When there is an initializer, you can explicitly declare the
size of an array, leave the compiler to count for itself, or specify
it to be non-fixed in size and position with an initial value.

    int foobung[3] = {13, 42, 93}, Ack[] = {7, 6}, ichabod[] = foobung;

Of course, int *foo is a different matter.  Hope I covered what you
were interested in.

--
Jeff Lewis                                         vvvvvvvvvvvv
{decvax|ucbvax|allegra|seismo|harpo|teklabs|ihnp4}!pur-ee!lewie
                                                   ^^^^^^^^^^^^

rwl@uvacs.UUCP (Ray Lubinsky) (01/30/85)

>    I have a question about C declarations.  The [] notation is equivalent
> to the * notation, right?

   Well, not exactly.  To declare something as ptr[] is to say that you want
an array of objects of the type that you specify and that the identifier 'ptr'
is to point to the zeroth element.  Declaring *ptr only reserves 'ptr' to mean
a pointer to that type.  For example here are the errors I got on two test
programs:

% cat > test1.c <<EOF			|	% cat > test2.c <<EOF
main() {				|	char	x[];
	char	y[];			|	main()
	y = "abc";			|	{
}					|	}
EOF					|	EOF
% cc test1.c				|	% cc test2.c
"test1.c", line 3: illegal lhs of	|	Undefined:
assignment operator			|	_x

   In test1, the compiler tells us that you can't change the value of the
identifier which indicates the start of an array.  No matter that the array
has no elements -- it just won't permit it.  Otherwise, a programmer could
lose track of his array.  In test2, the compiler assumes that the (evidently)
null array 'x' must be declared in some other load module; when it's not found,
the loader complains.

   When you declare an array, you must define how much storage you want to
allocate for it.  There are three possibilities:

int	x[2];			/* x points to a block of 2 elements */
int	y[] = { 0 , 1 };	/* y has implicitly 2 elements */
extern	int	z[];		/* z has dimensions declared elsewhere */

   If you have a pointer to integer called 'ptr', it can be assigned to point to
an element of any of these arrays.  The statement  ptr = z;  just points 'ptr'
to the zeroth element of array z.

>   Is this a desirable characteristic of C?

   What can I say?  C will let you do all sorts of crazy things that you had no
intention of doing (like accessing the 11th element of a ten-element array) but
it won't let you risk losing all references to a block of allocated memory.
Seems like a good idea to me.

------------------------------------------------------------------------------

Ray Lubinsky		     University of Virginia, Dept. of Computer Science
			     uucp: decvax!mcnc!ncsu!uvacs!rwl

arnold@gatech.UUCP (Arnold Robbins) (01/30/85)

Morris M. Keesan {decvax,linus,ihnp4,wivax,wjh12,ima}!bbncca!keesan writes:
> 
>	[.....]
> 	int *ptr[];            /* ptr is a pointer to an array of int */
>	[.....]
>

Sorry, but this declaration means ptr is an array of pointers to ints (similar
to the char *argv[] declaration of argv).

A pointer to an array of ints would be

	int array[] = { 1, 2, 3 };
	int *ptr = & array[0];	/* just use a simple pointer */
	/* or int *ptr = array; but that is what started this whole mess */

since there is no difference between pointing to a single int, or the first
element in an array.

I heartily agree that pointers and array are probably the most confusing
aspect of C.
-- 
Arnold Robbins
CSNET:	arnold@gatech	ARPA:	arnold%gatech.csnet@csnet-relay.arpa
UUCP:	{ akgua, allegra, hplabs, ihnp4, seismo, ut-sally }!gatech!arnold

Help advance the state of Computer Science: Nuke a PR1ME today!

jss@sjuvax.UUCP (J. Shapiro) (02/05/85)

[Aren't you hungry...?]

	My recollection from K&R is that in practice, strings and arrays of
characters are supposed to behave the same way. Yet we all know this isn't
true, and that some functions (e.g. strcpy) don't work right on one but
work fine on the other.

	Is there a real reason for this, or did it just happen that way?

karen@vaxwaller.UUCP (02/06/85)

> >    I have a question about C declarations.  The [] notation is equivalent
> > to the * notation, right?
>
>% cat > test1.c <<EOF			|	% cat > test2.c <<EOF
>main() {				|	char	x[];
>	char	y[];			|	main()
>	y = "abc";			|	{
>}					|	}
>EOF					|	EOF
>% cc test1.c				|	% cc test2.c
>"test1.c", line 3: illegal lhs of	|	Undefined:
>assignment operator			|	_x
>
>   In test1, the compiler tells us that you can't change the value of the
>identifier which indicates the start of an array.  No matter that the array
>has no elements -- it just won't permit it.  Otherwise, a programmer could
>lose track of his array.  In test2, the compiler assumes that the (evidently)
>null array 'x' must be declared in some other load module; when it's not found,
>the loader complains.
>...
>it won't let you risk losing all references to a block of allocated memory.
>Seems like a good idea to me.
>
>Ray Lubinsky		     University of Virginia, Dept. of Computer Science

it is true that you can't change the value of the identifier which indicates
the start of the array, but i disagree as to why.  if you compile into 
assembly language ("cc" with "-S" on unix) the assembly explains things well.
the following example shows that:  

	1.  the compiler doesn't care if i lose all reference to "ppp",

and	2.  "a" has no contents; it is only an address, whereas "p" has
	    assignable space in addition to the data it points to.

	/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
c code:
	char	*p = "ppp";
	char	a[] = "aaa";
	
	main ()
	{
		p = a;
	}
	/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
assembly:	(...)
		.globl	_p
	_p:
		.data	2		; p gets allocated space
	L18:				;    for a pointer
		.ascii	"ppp\0"		; plus what it points to
		.data
		.long	L18
		.data
		.globl	_a
	_a:				; a is merely an address
		.long	0x616161	;    pointing to its data
		(...)
	/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

			- karen maleski
			{ucbvax!zehntel, amd}!varian!karen

jim@ISM780B.UUCP (02/07/85)

>Well, I look at it this way: foo[] is an array whose location and/or size is
>variable and thus needs to be declared as a pointer,

Except in the case of a parameter declaration, this is not correct.
The location and size of foo are *unknown*, but not variable.
foo is not a pointer; it is a *reference* to a fixed sized and located
array defined somewhere else.

-- Jim Balter, INTERACTIVE Systems (ima!jim)

guy@rlgvax.UUCP (Guy Harris) (02/09/85)

> 	My recollection from K&R is that in practice, strings and arrays of
> characters are supposed to behave the same way. Yet we all know this isn't
> true, and that some functions (e.g. strcpy) don't work right on one but
> work fine on the other.

Huh?  A "string" is a *null-terminated* array of characters, so not all
arrays of characters behave like strings (one thing the "strn..." routines
are useful for is for dealing with arrays of characters which may not
be null-terminated, i.e. a pseudo-string in a table which is either terminated
by a null character or by the Nth character).  "strcpy" won't work unless
the source string is null-terminated, so, indeed, it won't work on non-null-
terminated arrays of characters.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

Doug Gwyn (VLD/VMB) <gwyn@Brl-Vld.ARPA> (02/10/85)

There is an established C and UNIX convention that character strings
are NUL-terminated (the alternative is to keep a length count with
every string).  General char[] arrays do not necessarily have to
follow this convention.  There are str*() routines for manipulating
NUL-terminated strings and mem*() routines for handling general
char[] arrays.  This is not really an accident, since what is nice
for one usage is not so nice for the other and vice-versa.

howard@cyb-eng.UUCP (Howard Johnson) (02/11/85)

>> 	int ptr[];	declares one pointer
>> but
>> 	int ptr[] = { 1, 2, 3 };	declares a three element int array.
>>
>>   Is this a desirable characteristic of C?  Could someone please comment on
>> the precise meaning of [] in declarations.

I wonder who came up with the idea that
	int foo[];
declares a pointer VARIABLE!?  True, foo[n] is EVALUATED as *(foo+n), but
I've never understood that this behavior should be extended to declarations.
Now I can live with foo[] being called a pointer CONSTANT, but not a
pointer variable--at least that's how array declarations are treated by
the C compilers I use.

>    int foobung[3] = {13, 42, 93}, Ack[] = {7, 6}, ichabod[] = foobung;

I hope the declaration
	int ichabod[] = foobung;
produces an error on your compiler, since this can be described as:
	int *ichabod = foobung;
-- 
	Howard Johnson		Cyb Systems, Austin, TX
..!{gatech,harvard,ihnp4,nbires,noao,seismo}!ut-sally!cyb-eng!howard

jsdy@SEISMO.ARPA (02/13/85)

Because of all the verbiage, I had mailed this only to the original
poster of the message.  It now seems that confusion is more rampant
than I had believed.  I'm therefore going to post this publicly.
(*sigh*)

>                                       ...   The [] notation is equivalent
> to the * notation, right?

Wrong.

A pointer [int *ip;] is a unit of memory whose contents will be the
address of that thing to which it points.  It initially points to --
well, nothing in particular.  If you use a pointer, you must put the
address of an existing (or allocated) object into it, first.  That's
why it's an error to take all the UNIX Section 2 function declarations
literally -- they all use pointer notation, but some of them (read,
write, stat, e.g.) really need a real object to act on.

An array (int ia[N];) is actually N real objects of the type of which
you have the array.  (Did I say that right?)  So, in this case, if
N == 4, then I have just reserved space for 4 int's.  The real objects
that exist, here, are ia[0], ia[1], ia[2], and ia[3].  The symbol "ia"
here refers to no single existing object.

So, now comes the confusing part.  "Ia" doesn't refer to any existing
object -- so, for instance, an attempt to say:
	ia = new_value;
gets an error from C.  But if we use "ia" as a pure value, it appears
to have a value which is the address of the first element of the array
(often put, "the address of the array").  We can then do pointer arith-
metic with this value!  In this way, it  a p p e a r s  to be (but is
not) a pointer.  Just to be reciprocal, if we now set
	ip = ia;
and try to access ip[0], we find to our delight that it works the other
way around:  the pointer can  a p p e a r  to be the pure address of
the array (first element of ...).  In fact, the pointer still is a
memory unit containing said address.

Just to confuse things a little bit more, there is one case in which
all distinctions are lost.  A little history:  in early C compilers,
there was no way in which anything larger than an int (or maybe a long)
could be passed as arguments to functions.  [Pointers at that time were
considered to be about the size of one or another int.  On a really
abstract machine, this may or may not be true.  It is not true on one
poorly designed family of microprocessor: the 80*86.]  However, people
wanted to pass arrays.  "No problem," says the lone language designer,
"we'll just pass the pure value which is the address of the array."

			PRESTO.

Whether you pass an array name or a pointer to a function, the argument
will be a pointer.  Many people first learn this with main():
	main(argc, argv, envp)
	 int argc;
	 char **argv;
	 char **envp;
	{
	}
is exactly identical to:
	main(argc, argv, envp)
	 int argc;
	 char *argv[];
	 char *envp[];
	{
	}
Within the functions, they are really pointers, no matter how you
declare them; and they behave entirely like them.  They are separate
words of storage containing the address of the arrays of (char *)'s
or strings that are, respectively, the arguments and environment
variables.  However, it is still true that for any other automatic
declarations and for all static and external declarations, the
distinction between array and pointer remains as I have said.  (As far
as I know, no implementation of C allows register arrays -- but register
pointers are the greatest thing since sliced bits.)

> 	int ptr[]   <=>   int *ptr
> 	int *ptr[]  <=>   int **ptr

Hopefully, you now understand that the upper left and upper right items
are not equivalent, unless they are function arguments.  The UL item
declares that somewhere there is a set of objects (int's), while the
UR object is a single unit of memory pointing off to one or a sequence
of said objects.  Note that the pointer can point to the first element
of an array ("to the array"), and then be incremented by 1 to point to
the next element of the array, no matter how big the object in the array
"actually" is.  Thus the popular notion that the pointer to an object
may also be a pointer to an array of said objects.

The LL item, of course, is an array of pointers to int's.  The '[]'
operator binds more closely than the '*' operator.  (The only case I can
think of offhand where a binop is tighter than a unop.)  The LR item is
a pointer to a pointer to an int.  Of course, the pointer that it points
to may be the first in an array of pointers!  Thus:
	ptr ->  _____  -> (int)
		_____  -> (int)
		...
And it must never be confused with either int iaa[M][N]; or int (*iap)[];
the former of which is an actual 2-D array, and the latter of which is a
pointer to a single array of int's!  Think about it:  in the first, each
iaa[i] is an array of N int's: so you get:
	int 0, int 1, ..., int N-1,	(0 row)
		...
	int 0, int 1, ..., int N-1,	(M-1 row)
or M * N int's closely packed together!  In the second, you have a unit
of memory which is the address of a series of int's in a row.  Gee, you
might as well have said int *iap; for all the good that does you.  (I am
hedging the truth here -- that notation is sometimes useful.)

> 	int ptr[];	declares one pointer

I'm sure you see the problem with this, now.  In fact, with no subscript
and no initialiser, this is a zero-length array!

> 	int ptr[] = { 1, 2, 3 };	declares a three element int array.
Yup!

Mnmmm ... it's getting late.  Any further questions will have to be
deferred until the next class.  [;-)]  [Go ahead & write if you want.]

Joe Yao		hadron!jsdy@seismo.{ARPA,UUCP}

MLY.G.SHADES%MIT-OZ@MIT-MC.ARPA (02/13/85)

	the declaration:

	funct()
	{
		type var[];
	...
	}

is declaring a null array (probably one element actually allocated).
using the name var will produce a constant(!) address/ptr to this
mythological location.

	the declaration:

	funct()
	{
		type *var;
		...
	}

allocates a location var the contents of which is a ptr to type.
using var returns the contents of the location which can then be used
as the ptr to the variable pointed to.

	does this help define the usage of var[] and *var more
clearly?  i hope so.

                      shades%mit-oz@mit-mc.arpa

friesen@psivax.UUCP (Stanley Friesen) (02/14/85)

I think the article by jsdy@SEISMO is generally good, and clarifies
a major confusion, but I disagree on one minor point.

In article <8302@brl-tgr.ARPA> jsdy@SEISMO.ARPA writes:
>
>An array (int ia[N];) is actually N real objects of the type of which
>you have the array.  (Did I say that right?)  So, in this case, if
>N == 4, then I have just reserved space for 4 int's.  The real objects
>that exist, here, are ia[0], ia[1], ia[2], and ia[3].  The symbol "ia"
>here refers to no single existing object.
>
>So, now comes the confusing part.  "Ia" doesn't refer to any existing
>object -- so, for instance, an attempt to say:
>	ia = new_value;
>gets an error from C.  But if we use "ia" as a pure value, it appears
>to have a value which is the address of the first element of the array
>(often put, "the address of the array").  We can then do pointer arith-
>metic with this value!  In this way, it  a p p e a r s  to be (but is
>not) a pointer.

	I say "ia" *is* a pointer, a pointer *constant*, with the
same relation to a pointer variable as an integer constant has to
an integer variable. This way the basis for both the similarities
*and* differences between "int ia[n]" and "int *ip" are explained.
-- 

				Sarima (Stanley Friesen)

{trwrb|allegra|cbosgd|hplabs|ihnp4|aero!uscvax!akgua}!sdcrdcf!psivax!friesen
 or
quad1!psivax!friesen