Fred <jfn%vanderbilt.csnet@csnet-relay.arpa> (01/25/85)
I have a question about C declarations. The [] notation is equivalent to the * notation, right? We have

    int ptr[]   <=>  int *ptr
and
    int *ptr[]  <=>  int **ptr

The question concerns the [] syntax, which takes on a different meaning if data initialization occurs. For example:

    int ptr[];                  declares one pointer
but
    int ptr[] = { 1, 2, 3 };    declares a three element int array.

Is this a desirable characteristic of C? Could someone please comment on the precise meaning of [] in declarations.

Thanks, my address is CS-Net: jfn@vanderbilt
keesan@bbncca.ARPA (Morris M. Keesan) (01/29/85)
-------------------------------
>From: Fred <jfn%vanderbilt.csnet@csnet-relay.arpa>
>Subject: C declarations
>    I have a question about C declarations. The [] notation is equivalent
>to the * notation, right? We have
>        int ptr[] <=> int *ptr
>and
>        int *ptr[] <=> int **ptr
>    The question concerns the [] syntax, which takes on a different meaning if
>data initialization occurs. For example:
>        int ptr[];                 declares one pointer
>but
>        int ptr[] = { 1, 2, 3 };   declares a three element int array.
>    Is this a desirable characteristic of C? Could someone please comment on
>the precise meaning of [] in declarations.

Sigh. This is something we cover in this newsgroup/list at least twice a year. [] and * are NOT the same. A few citations from the C Reference Manual (CRM), as printed in "The C Programming Language", by Kernighan & Ritchie (K&R):

CRM section 8.4, K&R pp. 194-5:

    Now imagine a declaration
        T D1
    where T is a type-specifier (like int, etc.) and D1 is a declarator.
    . . .
    If D1 has the form
        D[constant-expression]
    or
        D[]
    then the contained identifier has type "... array of T". [Ed. note: NOT "pointer to T".]
    . . .
    When several "array of" specifications are adjacent . . . the constant expressions . . . may be missing only for the first member of the sequence. This elision is useful when the array is external and the actual definition, which allocates storage, is given elsewhere. The first constant-expression may also be omitted when the declarator is followed by initialization. In this case the size is calculated from the number of initial elements supplied.

CRM section 10.1, K&R p. 205, "External function definitions":

    . . . since a reference to an array . . . is taken to mean a pointer to the first element of the array, declarations of formal parameters declared "array of ..." are adjusted to read "pointer to ...".
So in the examples above, the declarations declare:

    int *ptr[];                 /* ptr is a pointer to an array of int */
    int ptr[];                  /* ptr is an array of int, of unspecified size */
    int ptr[] = { 1, 2, 3 };    /* ptr is an array of three ints */

The ONLY time (repeat ONLY) when [] is equivalent to * is in the declaration of formal parameters to a function. E.g.

    f(ptr)
    int ptr[];
    { . . . }

is indeed equivalent to

    f(ptr)
    int *ptr;
    { . . . }

In retrospect, considering the confusion this has caused through the years, I think it was probably a mistake to allow this equivalence.

--
Morris M. Keesan
{decvax,linus,ihnp4,wivax,wjh12,ima}!bbncca!keesan
keesan @ BBN-UNIX.ARPA
ag4@pucc-h (Angus Greiswald the fourth) (01/29/85)
> int ptr[]; declares one pointer
> but
> int ptr[] = { 1, 2, 3 }; declares a three element int array.
>
> Is this a desirable characteristic of C? Could someone please comment on
> the precise meaning of [] in declarations.

Well, I look at it this way: foo[] is an array whose location and/or size is variable and thus needs to be declared as a pointer, and biff[4] is a fixed size array whose location is constant, and thus biff can be declared as a constant. When there is an initializer, you can explicitly declare the size of an array, leave the compiler to count for itself, or specify it to be non-fixed in size and position with an initial value.

    int foobung[3] = {13, 42, 93}, Ack[] = {7, 6}, ichabod[] = foobung;

Of course, int *foo is a different matter. Hope I covered what you were interested in.

--
Jeff Lewis vvvvvvvvvvvv {decvax|ucbvax|allegra|seismo|harpo|teklabs|ihnp4}!pur-ee!lewie ^^^^^^^^^^^^
rwl@uvacs.UUCP (Ray Lubinsky) (01/30/85)
> I have a question about C declarations. The [] notation is equivalent
> to the * notation, right?

Well, not exactly. To declare something as ptr[] is to say that you want an array of objects of the type that you specify and that the identifier 'ptr' is to point to the zeroth element. Declaring *ptr only reserves 'ptr' to mean a pointer to that type. For example, here are the errors I got on two test programs:

% cat > test1.c <<EOF               | % cat > test2.c <<EOF
main() {                            | char x[];
    char y[];                       | main()
    y = "abc";                      | {
}                                   | }
EOF                                 | EOF
% cc test1.c                        | % cc test2.c
"test1.c", line 3: illegal lhs of   | Undefined:
assignment operator                 | _x

In test1, the compiler tells us that you can't change the value of the identifier which indicates the start of an array. No matter that the array has no elements -- it just won't permit it. Otherwise, a programmer could lose track of his array. In test2, the compiler assumes that the (evidently) null array 'x' must be declared in some other load module; when it's not found, the loader complains.

When you declare an array, you must define how much storage you want to allocate for it. There are three possibilities:

    int x[2];              /* x points to a block of 2 elements */
    int y[] = { 0 , 1 };   /* y has implicitly 2 elements */
    extern int z[];        /* z has dimensions declared elsewhere */

If you have a pointer to integer called 'ptr', it can be assigned to point to an element of any of these arrays. The statement

    ptr = z;

just points 'ptr' to the zeroth element of array z.

> Is this a desirable characteristic of C?

What can I say? C will let you do all sorts of crazy things that you had no intention of doing (like accessing the 11th element of a ten-element array) but it won't let you risk losing all references to a block of allocated memory. Seems like a good idea to me.

------------------------------------------------------------------------------
Ray Lubinsky
University of Virginia, Dept. of Computer Science
uucp: decvax!mcnc!ncsu!uvacs!rwl
arnold@gatech.UUCP (Arnold Robbins) (01/30/85)
Morris M. Keesan {decvax,linus,ihnp4,wivax,wjh12,ima}!bbncca!keesan writes:
>
> [.....]
> int *ptr[]; /* ptr is a pointer to an array of int */
> [.....]
>

Sorry, but this declaration means ptr is an array of pointers to ints (similar to the char *argv[] declaration of argv). A pointer to an array of ints would be

    int array[] = { 1, 2, 3 };
    int *ptr = & array[0];    /* just use a simple pointer */
    /* or int *ptr = array; but that is what started this whole mess */

since there is no difference between pointing to a single int, or the first element in an array.

I heartily agree that pointers and arrays are probably the most confusing aspect of C.

--
Arnold Robbins
CSNET: arnold@gatech
ARPA: arnold%gatech.csnet@csnet-relay.arpa
UUCP: { akgua, allegra, hplabs, ihnp4, seismo, ut-sally }!gatech!arnold
Help advance the state of Computer Science: Nuke a PR1ME today!
jss@sjuvax.UUCP (J. Shapiro) (02/05/85)
[Aren't you hungry...?] My recollection from K&R is that in practice, strings and arrays of characters are supposed to behave the same way. Yet we all know this isn't true, and that some functions (e.g. strcpy) don't work right on one but work fine on the other. Is there a real reason for this, or did it just happen that way?
karen@vaxwaller.UUCP (02/06/85)
> > I have a question about C declarations. The [] notation is equivalent
> > to the * notation, right?
>
>% cat > test1.c <<EOF               | % cat > test2.c <<EOF
>main() {                            | char x[];
>    char y[];                       | main()
>    y = "abc";                      | {
>}                                   | }
>EOF                                 | EOF
>% cc test1.c                        | % cc test2.c
>"test1.c", line 3: illegal lhs of   | Undefined:
>assignment operator                 | _x
>
> In test1, the compiler tells us that you can't change the value of the
>identifier which indicates the start of an array. No matter that the array
>has no elements -- it just won't permit it. Otherwise, a programmer could
>lose track of his array. In test2, the compiler assumes that the (evidently)
>null array 'x' must be declared in some other load module; when it's not found,
>the loader complains.
>...
>it won't let you risk losing all references to a block of allocated memory.
>Seems like a good idea to me.
>
>Ray Lubinsky University of Virginia, Dept. of Computer Science

it is true that you can't change the value of the identifier which indicates the start of the array, but i disagree as to why. if you compile into assembly language ("cc" with "-S" on unix) the assembly explains things well. the following example shows that:

1. the compiler doesn't care if i lose all reference to "ppp", and
2. "a" has no contents; it is only an address, whereas "p" has assignable space in addition to the data it points to.

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
c code:

    char *p = "ppp";
    char a[] = "aaa";
    main () { p = a; }

/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
assembly:

    (...)
            .globl  _p
    _p:
            .data   2            ; p gets allocated space
    L18:                         ; for a pointer
            .ascii  "ppp\0"      ; plus what it points to
            .data
            .long   L18
            .data
            .globl  _a
    _a:                          ; a is merely an address
            .long   0x616161     ; pointing to its data
    (...)
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

- karen maleski
{ucbvax!zehntel, amd}!varian!karen
jim@ISM780B.UUCP (02/07/85)
>Well, I look at it this way: foo[] is an array whose location and/or size is >variable and thus needs to be declared as a pointer, Except in the case of a parameter declaration, this is not correct. The location and size of foo are *unknown*, but not variable. foo is not a pointer; it is a *reference* to a fixed sized and located array defined somewhere else. -- Jim Balter, INTERACTIVE Systems (ima!jim)
guy@rlgvax.UUCP (Guy Harris) (02/09/85)
> My recollection from K&R is that in practice, strings and arrays of
> characters are supposed to behave the same way. Yet we all know this isn't
> true, and that some functions (e.g. strcpy) don't work right on one but
> work fine on the other.

Huh? A "string" is a *null-terminated* array of characters, so not all arrays of characters behave like strings (one thing the "strn..." routines are useful for is for dealing with arrays of characters which may not be null-terminated, i.e. a pseudo-string in a table which is either terminated by a null character or by the Nth character). "strcpy" won't work unless the source string is null-terminated, so, indeed, it won't work on non-null-terminated arrays of characters.

Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
Doug Gwyn (VLD/VMB) <gwyn@Brl-Vld.ARPA> (02/10/85)
There is an established C and UNIX convention that character strings are NUL-terminated (the alternative is to keep a length count with every string). General char[] arrays do not necessarily have to follow this convention. There are str*() routines for manipulating NUL-terminated strings and mem*() routines for handling general char[] arrays. This is not really an accident, since what is nice for one usage is not so nice for the other and vice-versa.
howard@cyb-eng.UUCP (Howard Johnson) (02/11/85)
>> int ptr[]; declares one pointer
>> but
>> int ptr[] = { 1, 2, 3 }; declares a three element int array.
>>
>> Is this a desirable characteristic of C? Could someone please comment on
>> the precise meaning of [] in declarations.

I wonder who came up with the idea that

    int foo[];

declares a pointer VARIABLE!? True, foo[n] is EVALUATED as *(foo+n), but I've never understood that this behavior should be extended to declarations. Now I can live with foo[] being called a pointer CONSTANT, but not a pointer variable -- at least that's how array declarations are treated by the C compilers I use.

> int foobung[3] = {13, 42, 93}, Ack[] = {7, 6}, ichabod[] = foobung;

I hope the declaration

    int ichabod[] = foobung;

produces an error on your compiler, since this can be described as:

    int *ichabod = foobung;

--
Howard Johnson
Cyb Systems, Austin, TX
..!{gatech,harvard,ihnp4,nbires,noao,seismo}!ut-sally!cyb-eng!howard
jsdy@SEISMO.ARPA (02/13/85)
Because of all the verbiage, I had mailed this only to the original poster of the message. It now seems that confusion is more rampant than I had believed. I'm therefore going to post this publicly. (*sigh*)

> ... The [] notation is equivalent
> to the * notation, right?

Wrong.

A pointer [int *ip;] is a unit of memory whose contents will be the address of that thing to which it points. It initially points to -- well, nothing in particular. If you use a pointer, you must put the address of an existing (or allocated) object into it, first. That's why it's an error to take all the UNIX Section 2 function declarations literally -- they all use pointer notation, but some of them (read, write, stat, e.g.) really need a real object to act on.

An array (int ia[N];) is actually N real objects of the type of which you have the array. (Did I say that right?) So, in this case, if N == 4, then I have just reserved space for 4 int's. The real objects that exist, here, are ia[0], ia[1], ia[2], and ia[3]. The symbol "ia" here refers to no single existing object.

So, now comes the confusing part. "Ia" doesn't refer to any existing object -- so, for instance, an attempt to say:

    ia = new_value;

gets an error from C. But if we use "ia" as a pure value, it appears to have a value which is the address of the first element of the array (often put, "the address of the array"). We can then do pointer arithmetic with this value! In this way, it a p p e a r s to be (but is not) a pointer.

Just to be reciprocal, if we now set

    ip = ia;

and try to access ip[0], we find to our delight that it works the other way around: the pointer can a p p e a r to be the pure address of the array (first element of ...). In fact, the pointer still is a memory unit containing said address.

Just to confuse things a little bit more, there is one case in which all distinctions are lost.
A little history: in early C compilers, there was no way in which anything larger than an int (or maybe a long) could be passed as arguments to functions. [Pointers at that time were considered to be about the size of one or another int. On a really abstract machine, this may or may not be true. It is not true on one poorly designed family of microprocessor: the 80*86.] However, people wanted to pass arrays. "No problem," says the lone language designer, "we'll just pass the pure value which is the address of the array." PRESTO. Whether you pass an array name or a pointer to a function, the argument will be a pointer. Many people first learn this with main():

    main(argc, argv, envp)
    int argc;
    char **argv;
    char **envp;
    {
    }

is exactly identical to:

    main(argc, argv, envp)
    int argc;
    char *argv[];
    char *envp[];
    {
    }

Within the functions, they are really pointers, no matter how you declare them; and they behave entirely like them. They are separate words of storage containing the address of the arrays of (char *)'s or strings that are, respectively, the arguments and environment variables.

However, it is still true that for any other automatic declarations and for all static and external declarations, the distinction between array and pointer remains as I have said. (As far as I know, no implementation of C allows register arrays -- but register pointers are the greatest thing since sliced bits.)

> int ptr[] <=> int *ptr
> int *ptr[] <=> int **ptr

Hopefully, you now understand that the upper left and upper right items are not equivalent, unless they are function arguments. The UL item declares that somewhere there is a set of objects (int's), while the UR object is a single unit of memory pointing off to one or a sequence of said objects. Note that the pointer can point to the first element of an array ("to the array"), and then be incremented by 1 to point to the next element of the array, no matter how big the object in the array "actually" is.
Thus the popular notion that the pointer to an object may also be a pointer to an array of said objects.

The LL item, of course, is an array of pointers to strings. The '[]' operator binds more closely than the '*' operator. (The only case I can think of offhand where a binop is tighter than a unop.) The LR item is a pointer to a pointer to an int. Of course, the pointer that it points to may be the first in an array of pointers! Thus:

    ptr ->  _____ -> (int)
            _____ -> (int)
            ...

And it must never be confused with either

    int iaa[M][N];
or
    int *(iap[]);

the former of which is an actual 2-D array, and the latter of which is a pointer to a single array of int's! Think about it: in the first, iaa[M] is an array of N int's: so you get:

    int 0, int 1, ..., int N-1,    (0 row)
    ...
    int 0, int 1, ..., int N-1,    (M-1 row)

or M * N int's closely packed together! In the second, you have a unit of memory which is the address of a series of int's in a row. Gee, you might as well have said

    int *iap;

for all the good that does you. (I am hedging the truth here -- that notation is sometimes useful.)

> int ptr[]; declares one pointer

I'm sure you see the problem with this, now. In fact, with no subscript and no initialiser, this is a zero-length array!

> int ptr[] = { 1, 2, 3 }; declares a three element int array.

Yup!

Mnmmm ... it's getting late. Any further questions will have to be deferred until the next class. [;-)] [Go ahead & write if you want.]

Joe Yao
hadron!jsdy@seismo.{ARPA,UUCP}
MLY.G.SHADES%MIT-OZ@MIT-MC.ARPA (02/13/85)
the declaration:

    funct() { type var[]; ... }

is declaring a null array (probably one element actually allocated). using the name var will produce a constant(!) address/ptr to this mythological location.

the declaration:

    funct() { type *var; ... }

allocates a location var the contents of which is a ptr to type. using var returns the contents of the location which can then be used as the ptr to the variable pointed to.

does this help define the usage of var[] and *var more clearly? i hope so.

shades%mit-oz@mit-mc.arpa
friesen@psivax.UUCP (Stanley Friesen) (02/14/85)
I think the article by jsdy@SEISMO is generally good, and clarifies a major confusion, but I disagree on one minor point.

In article <8302@brl-tgr.ARPA> jsdy@SEISMO.ARPA writes:
>
>An array (int ia[N];) is actually N real objects of the type of which
>you have the array. (Did I say that right?) So, in this case, if
>N == 4, then I have just reserved space for 4 int's. The real objects
>that exist, here, are ia[0], ia[1], ia[2], and ia[3]. The symbol "ia"
>here refers to no single existing object.
>
>So, now comes the confusing part. "Ia" doesn't refer to any existing
>object -- so, for instance, an attempt to say:
>    ia = new_value;
>gets an error from C. But if we use "ia" as a pure value, it appears
>to have a value which is the address of the first element of the array
>(often put, "the address of the array"). We can then do pointer
>arithmetic with this value! In this way, it a p p e a r s to be (but is
>not) a pointer.

I say "ia" *is* a pointer, a pointer *constant*, with the same relation to a pointer variable as an integer constant has to an integer variable. This way the basis for both the similarities *and* differences between "int ia[n]" and "int *ip" is explained.

--
Sarima (Stanley Friesen)
{trwrb|allegra|cbosgd|hplabs|ihnp4|aero!uscvax!akgua}!sdcrdcf!psivax!friesen
or quad1!psivax!friesen