chris@umcp-cs.UUCP (Chris Torek) (06/29/86)
In article <2201@umcp-cs.UUCP> chris@maryland.UUCP (Chris Torek) writes:
>Perhaps I just have an odd mind, but all this pointer/array stuff
>never really bothered me.

Or perhaps I simply read K&R, chapter 5, Pointers and Arrays. I needed
to refer to K&R recently (see article <2204@umcp-cs.UUCP>), and while I
was looking at it, I just happened to stumble across some text in this
chapter that seems to me quite clear. Let me give some excerpts, with
commentary. (Suggestion: while reading this, imagine me grinning
teasingly at points. I hope the tone comes across properly, but I have
spent enough time revising this now---great grief, an hour and a half
now!)

    It is also necessary to declare the variables that participate in
    all of this:

        int x, y;
        int *px;

    The declaration of x and y is what we've seen all along. The
    declaration of the pointer px is new.

        int *px;

    is intended as a mnemonic; it says that the combination *px is an
    int, that is, if px occurs in the context *px, it is equivalent to
    a variable of type int. In effect, the syntax of the declaration
    for a variable mimics the syntax of expressions in which the
    variable might appear. This reasoning is useful in all cases
    involving complicated declarations. For example

        double atof(), *dp;

    says that in an expression atof() and *dp have values of type
    double.

So much for understanding declarations. K&R said it all, eight years
ago.

    ... Any operation which can be achieved by array subscripting can
    also be done with pointers. The pointer version will in general be
    faster but, at least to the uninitiated, somewhat harder to grasp
    immediately.

K&R seem to have a gift for understatement.

    The correspondence between indexing and pointer arithmetic is
    evidently very close. In fact, a reference to an array is converted
    by the compiler to a pointer to the beginning of the array. The
    effect is that an array name *is* a pointer expression. ...

(Note `expression', not `variable'. The above does not apply to sizeof.)
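[Editorial sketch, not part of the original posting.] The two claims in the excerpt above, that subscripting and pointer arithmetic correspond and that sizeof is the exception to the array-to-pointer conversion, can be illustrated roughly as follows. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[i] and *(pa + i) name the same object for every i. */
int subscript_matches_pointer(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int *pa = a;            /* `a' is converted to &a[0] here */
    int i;

    for (i = 0; i < 4; i++)
        if (a[i] != *(pa + i) || &a[i] != pa + i)
            return 0;
    return 1;
}

/* The operand of sizeof is not converted to a pointer, so this divides
   the size of the whole array by the size of one element. */
size_t element_count(void)
{
    int a[4] = { 10, 20, 30, 40 };

    return sizeof a / sizeof a[0];
}

/* Hypothetical self-check tying the two observations together. */
void check_decay(void)
{
    assert(subscript_matches_pointer());
    assert(element_count() == 4);
}
```

Had `a' been a function parameter declared as `int a[]', the same sizeof expression would measure a pointer instead, which is the oddity the commentary goes on to mention.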
    There is one difference between an array name and a pointer that
    must be kept in mind. A pointer is a variable, so pa=a and pa++ are
    sensible operations. But an array name is a *constant*, not a
    variable: constructions like a=pa or a++ or p=&a are illegal.

`p = &a' is much like `p = &3': illegal by fiat, not because it cannot
be done. If it were legal, `&a' would have type `pointer to <type of
a>' (compare with `a', which has type `pointer to <type of a[0]>').

    When an array name is passed to a function, what is passed is the
    location of the beginning of the array. Within the called function,
    this argument is a variable, just like any other variable, and so
    an array name argument is truly a pointer, that is, a variable
    containing an address. ...

    As formal parameters in a function definition,

        char s[];

    and

        char *s;

    are exactly equivalent; ...

This is all in the context of singly-dimensioned arrays, but with the
proper mindset applies to multi-dimensional arrays without trouble.
(With the wrong mindset it leads to much confusion.) K&R will have more
to say about this later. Note that this is where sizeof starts acting
odd: A compiler treats the following as equivalent:

    array                   pointer
    -----                   -------
    f(arr)                  f(ap)
    int arr[];              int *ap;
    {                       {
        ...                     ...

    f(a2)                   f(a2p)
    int a2[][5];            int (*a2p)[5];
    {                       {
        ...                     ...

The second equivalent pointer version is neither `int **a2p' nor `int
*a2p'; nor for that matter is it `int *a2p[5]'. This is consistent, if
(painfully apparently, given recent net.lang.c articles) confusing.

    5.7 Multi-Dimensional Arrays

    C provides for rectangular multi-dimensional arrays, though in
    practice they tend to be much less used than arrays of pointers.
    ...

    ... In C, by definition a two-dimensional array is really a
    one-dimensional array, each of whose elements is an array. Hence
    subscripts are written as

        day_tab[i][j]

    rather than

        day_tab[i, j]

    as in most languages. ...

What they do *not* mention is that day_tab[i,j] is a valid expression,
and tends to surprise people.
Lint does not, unfortunately, warn about these.

    If a two-dimensional array is to be passed to a function, the
    argument declaration in the function *must* include the column
    dimension; the row dimension is irrelevant, since what is passed
    is, as before, a pointer.

What did I tell you? Note that this *is* consistent. One cannot pass an
array as an argument to a function. Pointers, however, are fine,
*including pointers to arrays*. Given a two or more dimensional array,
the array `constant' is converted to a pointer to an array of one fewer
dimensions. This is now a *pointer*, and remains a pointer until
dereferenced. For example, in

    int day_tab[2][13] = { ... };

the following are type-correct calls:

    f2d(p) int (*p)[13]; { ... }

    f1d(p) int *p; { ... }

    proc()
    {
                                /* argument types: */
        f2d(day_tab);           /* pointer to array 13 of int */
        f2d(&day_tab[0]);       /* pointer to array 13 of int */

        f1d(day_tab[0]);        /* pointer to int */
        f1d(&day_tab[0][0]);    /* pointer to int */
    }

Calling f2d(&day_tab[0][0]) passes the right *value* but the wrong
*type*. That it happens to work is not an excuse to do it. If C were
different, it would be different, but it is not, so it is not.

To return to K&R:

    5.10 Pointers vs. Multi-dimensional [sic] Arrays

(So they are not consistent with capitalisation in section names.)

    Newcomers to C are sometimes confused about the difference between
    a two-dimensional array and an array of pointers, ...

Ah, a gift indeed.

    Given the declarations

        int a[10][10];
        int *b[10];

    the usage of a and b may be similar, in that a[5][5] and b[5][5]
    are both legal references to a single int. But a is a true array:
    all 100 storage cells have been allocated, and the conventional
    rectangular subscript calculation is done to find any given
    element. For b, however, the declaration only allocates 10
    pointers; each must be set to point to an array of integers.
    Assuming that each does point to a ten-element array, then there
    will be 100 storage cells set aside, plus the ten cells for the
    pointers. Thus the array of pointers uses slightly more space, and
    may require an explicit initialization step. But it has two
    advantages: accessing an element is done by indirection through a
    pointer rather than by a multiplication and addition, and the rows
    of the array may be of different lengths. That is, each element of
    b need not point to a ten-element vector; some may point to two
    elements, some to twenty, and some to none at all.

Now for some even more horrid examples of my own, all type-correct:

    /* declare st as array 1 of array 5 of pointer to char */
    char *st[1][5] = { { "fee", "fie", "foo", "fum", "foobar" } };

    /* declare x as pointer to array 5 of pointer to char */
    char *(*x)[5] = st;

    /* declare y as array 1 of array 3 of array 4 of pointer to
       array 5 of pointer to char */
    char *(*y[1][3][4])[5] = {
        { { st, 0, 0, st }, { 0, st, st, 0 }, { 0, 0, st, st } }
    };

    /* declare p as array 2 of pointer to array 3 of array 4 of
       pointer to array 5 of pointer to char */
    char *(*(*p[2])[3][4])[5] = { y, 0 };

It does take some trickery to do this. Given the declaration

    char *strings[5] = { ... };

the type of `strings' is `array 5 of pointer to char', which, when used
in an expression, becomes `pointer to pointer to char' (by changing the
first `array of' to `pointer to'), but for `x' and `y' I wanted a type
of `pointer to array 5 of pointer to char'. It might be nice if I could
write `&strings' to get this, but I cannot; however, I can use the
declaration above for `st' to get `array 1 of array 5 of pointer to
char'. Changing the first `array of' yields `pointer to array 5 of
pointer to char', which was what I wanted. Likewise, for `p' I wanted
`y' to evaluate to `pointer to array 3 of array 4 of pointer to array 5
of pointer to char'; in order to get that, I again used a `fake' [1] in
the declaration.
`You can hack anything you want, with pointers and funny C . . .' -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu
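[Editorial sketch, not part of the original posting.] The f2d/f1d distinction in the article above can be restated with prototype-style definitions. The names sum_row, sum_ints, rows_agree and the array tab are hypothetical stand-ins (tab plays the role of day_tab, with made-up contents since the article elides the initializer).

```c
#include <assert.h>

static int tab[2][13];      /* stand-in for day_tab; contents are made up */

/* Like f2d: the parameter is "pointer to array 13 of int". */
int sum_row(int (*p)[13])
{
    int j, sum = 0;

    for (j = 0; j < 13; j++)
        sum += (*p)[j];
    return sum;
}

/* Like f1d: the parameter is "pointer to int", so a count is needed. */
int sum_ints(int *p, int n)
{
    int j, sum = 0;

    for (j = 0; j < n; j++)
        sum += p[j];
    return sum;
}

/* Fills tab, then checks that both views of each row give the same sum. */
int rows_agree(void)
{
    int i, j;

    for (i = 0; i < 2; i++)
        for (j = 0; j < 13; j++)
            tab[i][j] = i * 100 + j;

    assert(sum_row(tab) == 78);         /* 0 + 1 + ... + 12 */
    return sum_row(tab) == sum_ints(tab[0], 13)
        && sum_row(tab + 1) == sum_ints(&tab[1][0], 13);
}
```

With prototypes in scope, a compiler is expected to diagnose sum_row(&tab[0][0]), which is the "right value, wrong type" call the article warns against.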
throopw@dg_rtp.UUCP (Wayne Throop) (07/07/86)
> chris@umcp-cs.UUCP (Chris Torek) Good article overall. However, I have a nit to pick with one of the examples: > int day_tab[2][13] = { ... }; > > the following are type-correct calls: > > f2d(p) int (*p)[13]; { ... } > > f1d(p) int *p; { ... } > > proc() > { > /* argument types: */ > f2d(day_tab); /* pointer to array 13 of int */ > f2d(&day_tab[0]); /* pointer to array 13 of int */ > > f1d(day_tab[0]); /* pointer to int */ > f1d(&day_tab[0][0]); /* pointer to int */ > } There is one problem here. The expression (&day_tab[0]) is illegal. Given this simplified example: 1 int aa[2][3]; 2 3 void f1(pa) int (*pa)[3]; { } 4 5 void f2(){ 6 f1(aa); 7 f1(&aa[0]); 8 } lint has this to say (among other things): (7) warning: & before array or function: ignored Granted, things are very strange here, since the [] operator is always supposed to yield an lvalue. However, as Chris pointed out, values of type ([]) are always coerced to expressions of type (*) in all contexts except sizeof. Thus, the subscription yields a (potentially) lvalued expression of array type, and it is coerced to a non-lvalued pointer type, and thus the address-of is illegal. (I think.) Note well: The problem isn't with the type of aa. Lint is *not* complaining about the fact that aa follows the "&". This peculiarity arises because (aa[0]) has type (int [3]), and is immediately coerced to an expression of type (int *) in all contexts except sizeof. Thus, while (&aa[0]) is illegal, (&aa[0][0]) is quite legal indeed. Of course, the latter operation doesn't yield a pointer of the correct type to be passed to function f1 above. -- > `You can hack anything you want, > with pointers and funny C . . .' Sung to "Alice's Restaurant", I presume? (We'll wait 'til it comes around, then join in...) (That's what we're doing now... waiting for it to come around...) (Here it comes...) You can hack anything you want, With pointers and funny C. You can hack anything you want, With pointers and funny C. 
Dive right in, if you feel the need,
It's a great language but it's gone to seed.
You can hack anything you want,
With pointers and funny C
(Excepting Ritchie...)

(That was pitiful.)

--
"I wanna *HACK*! I wanna *HACK*!!! I wanna feel *BITS* between my teeth!"
--
Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
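[Editorial sketch, not part of the original posting.] The lint complaint above reflects the 1986 rules. In the C standard as later published (ANSI C, 1989), the operand of unary & is exempt from the array-to-pointer conversion, so &aa[0] is accepted and has type "pointer to array 3 of int". A minimal illustration under that assumption, with hypothetical function names and made-up array contents:

```c
#include <assert.h>

static int aa[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

/* Same shape as f1 in the post, written with a prototype. */
int first_of_row(int (*pa)[3])
{
    return (*pa)[0];
}

/* Returns 1 if the row pointers and element pointers line up as expected. */
int check_row_pointers(void)
{
    int (*p)[3] = &aa[0];   /* pointer to array 3 of int */
    int *q = &aa[0][0];     /* pointer to int: a different type */

    assert(first_of_row(p) == 1);
    return p == aa                  /* both sides are int (*)[3] */
        && q == aa[0]               /* both sides are int * */
        && first_of_row(&aa[1]) == 4;
}
```

As the post notes, passing &aa[0][0] to such a function would still be a type mismatch.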
davidsen@steinmetz.UUCP (07/17/86)
After thinking about the discussions about using the address operator
on an array by name, I come to the reluctant conclusion that it SHOULD
be allowed in the new ANSI standard. I have read and understood the
arguments against it, I have spent hours teaching the language and
convincing students that they should not do it, but now I have to
reluctantly say that there are some fairly good reasons why it should
be allowed.

Reason 1: "codification of existing practice"

If three ports of SysV, 2 of 4.2BSD and three of the largest selling C
compiler for PCDOS represent current practice, then it is legal. I
admit that a few compilers BITCHED about it, but they compiled it.

Reason 2: "modularity and information hiding"

If I am writing a modular program in which I have the typedefs in an
include file used by the programmers writing the modules, there is no
way to allow them to take the address of an item which is defined by
typedef unless they know that the item is (or isn't) an array. Example:

    typedef int PARTA[10];
    typedef struct { int x,y; float t[40]; } PARTB;

in a module...

    PARTA source, dest, *whead, work[2];
    PARTB *head, workb;

If PARTA is an array, I must say:

    whead = &dest[0];

while if it's not, I say:

    whead = &dest;

This means that the beauty of having the content of types changeable at
some time is no longer present, and every programmer who works with
them, even if s/he never uses the contents (passes addresses, etc, like
FILES) must know the type.

Reason 3: "common sense"

After five years of teaching C, I have to agree with my students that
it makes no sense to forbid this construct. To take the address of
something use the address operator. I have seen this mistake made by
students from major universities, and graduates of courses taught by
high priced consultants, so it's not just my students.

Moreover, there is already a major peculiarity in the way array names
are handled, as compared to the way pointers work.
This is in the operation of the sizeof operator, which gives the size
of a pointer or the size of an entire array.

Conclusion: I don't find this very desirable, I just think that it
makes more sense to allow it than not allow it. Hopefully the next
language will do away with arrays, and eliminate the whole problem :>
--
-bill davidsen ihnp4!seismo!rochester!steinmetz!--\ \ unirot ------------->---> crdos1!davidsen chinet ------/ sixhub ---------------------/ (davidsen@ge-crd.ARPA)
"Stupidity, like virtue, is its own reward"
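[Editorial sketch, not part of the original posting.] Under the rule later adopted in ANSI C (1989), where & applied to an array yields a pointer to the whole array, the information-hiding case in Reason 2 can be written the same way for both typedefs. The typedefs are taken from the post; the helper functions are hypothetical.

```c
#include <assert.h>

typedef int PARTA[10];                              /* array type, as in the post */
typedef struct { int x, y; float t[40]; } PARTB;    /* struct type, as in the post */

/* Hypothetical helpers that only handle addresses, never the contents. */
int same_parta(PARTA *p, PARTA *q) { return p == q; }
int same_partb(PARTB *p, PARTB *q) { return p == q; }

/* Takes the address of an array-typed and a struct-typed object
   with the same syntax, without knowing which is which. */
int address_of_either(void)
{
    PARTA dest, *whead;
    PARTB workb, *head;

    whead = &dest;      /* PARTA * is "pointer to array 10 of int" */
    head = &workb;

    assert(sizeof *whead == sizeof(PARTA));
    return same_parta(whead, &dest) && same_partb(head, &workb);
}
```

Note that with whead declared as PARTA *, it is &dest rather than &dest[0] that matches the declared type when PARTA is an array.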
bs@alice.UucP (Bjarne Stroustrup) (07/19/86)
for what it is worth: In C++ it is allowed (and encouraged practice) to use the address-of operator & explicitly for array and function names.
ark@alice.UucP (Andrew Koenig) (07/19/86)
> After five years of teaching C, I have to agree with my students that > it makes no sense to forbid this construct. To take the address of > something use the address operator. I have seen this mistake made by > students from major universities, and graduates of courses taught by > high priced consultants, so it's not just my students. All right, tell me: What is the type of the address of an array? That is, suppose I write: int a[10]; What type is &a? Don't tell me it's "pointer to integer" because that is the type of &a[0], and a and a[0] are different things.
rgenter@BBN-LABS-B.ARPA (Rick Genter) (07/20/86)
If a is declared as: int a[10]; then &a is of type "pointer to array [10] of int", or (int (*) [10]) in cast terminology. (Or at least it should be) (Cdecl is wonderful). -------- Rick Genter BBN Laboratories Inc. (617) 497-3848 10 Moulton St. 6/512 rgenter@labs-b.bbn.COM (Internet new) Cambridge, MA 02238 rgenter@bbn-labs-b.ARPA (Internet old) linus!rgenter%BBN-LABS-B.ARPA (UUCP)
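[Editorial sketch, not part of the original posting.] Assuming a compiler that follows the later ANSI C rule and accepts &a, the answer above can be checked by looking at the size of what each pointer points to. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the object designated by *(&a): the whole array. */
size_t pointee_size_of_array_address(void)
{
    int a[10];
    int (*pa)[10] = &a;     /* "pointer to array 10 of int" */

    return sizeof *pa;
}

/* Size of the object designated by *(&a[0]): a single int. */
size_t pointee_size_of_element_address(void)
{
    int a[10];
    int *pe = &a[0];        /* "pointer to int" */

    return sizeof *pe;
}

/* Hypothetical self-check: the two pointer types differ in pointee size. */
void check_pointee_sizes(void)
{
    assert(pointee_size_of_array_address() == 10 * sizeof(int));
    assert(pointee_size_of_element_address() == sizeof(int));
}
```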
guy@sun.uucp (Guy Harris) (07/21/86)
> All right, tell me: What is the type of the address of an array? > > That is, suppose I write: > > int a[10]; > > What type is &a? Don't tell me it's "pointer to integer" because > that is the type of &a[0], and a and a[0] are different things. No, it's "pointer to array of 10 'int's", a type that is already in C; consider int a[10][10]; and ask what the type of "&a[5]" is. PCC, at least, even handles this sort of type correctly; you can declare such a pointer, assign a value to it (even though it's only possible to construct such a value by taking the address of a subarray, at least in C as she currently is spoke), and select an element from the array that it points to by the obvious method. This brings up an interesting problem. The ANSI C draft I have (Aug 11, 1985) says C.2.2.1 Arrays, functions, and pointers Except when used as the operand of the "sizeof" operator or when a string literal is used to initialize an array of "char"s, an expression that has type "array of 'type'" is converted to an expression that has type "pointer to 'type'" and that points to the initial member of the array object. This is a generalization of what "The C Reference Manual" says, which is: 7.1 Primary expressions ... An identifier is a primary expression, provided that it has been suitably declared as discussed below. Its type is specified by its declaration. If the type of the identifier is "array of ...", however, then the value of the identifier-expression is a pointer to the first object in the array, and the type of the expression is "pointer to ...". This is silent about "array-valued expressions", except that it implies that the name of an array is not an array-valued expression. It later (in 8.7 Type names) acknowledges the existence of the type "pointer to array of ...", but doesn't indicate what happens if it encounters an expression of that type. The ANSI C statement seems to be the obvious way of correcting this omission. 
However, it now makes it harder to construct a value of this type.
Neither K&R C nor ANSI C allows you to construct a pointer to an array
that is not a member of another array (if you declare "int a[10]", "&a"
is illegal and "a" is a pointer to the first member of the array, not
to the array itself). However, K&R C does not explicitly *forbid*
putting an "&" in front of an expression that is a member of an array
of arrays. E.g., if you declare "int a[10][10]", it doesn't forbid
"&a[3]". (Our PCC, and probably most, if not all PCCs, *will* complain
about this; I don't know if this is plugging a loophole in the rules,
or just an accident of the implementation.)

ANSI C, however, says that *any* expression of type "array of 'type'"
is converted to a pointer to the first element of that array (hence of
type "pointer to 'type'"). This means that the expression "&a[3]" is
invalid, since "a[3]" is an array-valued expression referring to the
fourth member of "a", and this is converted to a pointer to the first
member of the fourth member of "a"; this expression cannot have its
address taken. You can get a pointer to the *first* member of "a"; the
expression "a" is converted to such a pointer. You can then get a
pointer to other members with pointer arithmetic; i.e., "a + 1" is a
pointer to the second member of "a" (which is another 10-element array
of "int").

Unfortunately, this means something that works for arrays of types that
are not arrays won't work for arrays of types that are. If you have
"int a[10]", "&a[5]" is a pointer to the sixth element of "a"; if you
have "int a[10][10]", however, "&a[5]" is illegal.

This is a bit of a rough spot in C's type system. It would be
preferable if the operand of the "&" operator, like the operand of the
"sizeof" operator, were not converted from "array of 'type'" to
"pointer to first element of array of 'type'".
If this were the case, "&a" would be legal, regardless of the type of
"a" (except, possibly, if "a" were of type "function returning 'type'",
and perhaps even that could be allowed). This would make pointers to
arrays more useful, and would permit a routine that took a pointer to
an array to be written as such, instead of using the subterfuge of
declaring the argument in question to be a pointer to an element of
such an array.

One would presumably be allowed to declare a pointer of type "pointer
to array of 'type' of unspecified size", thus permitting a function to
take arrays of arbitrary size as arguments. The Aug 11, 1985 ANSI C
draft seems to forbid this; an array size specifier "must be present,
except that the first size may be omitted when an array is being
declared as a formal parameter of a function, or when the array
declaration has storage-class specifier 'extern' and the definition
that actually allocates storage is given elsewhere." One is currently
allowed to do so by PCC, at least; K&R doesn't forbid it, although the
only contexts in which it discusses such array specifiers are the two
mentioned by the ANSI C draft. If "&" is to be changed to work like
"sizeof", the rules for type specifiers should also be changed in this
fashion.

Yes, there will be a problem with pointer arithmetic involving pointers
to arrays of unspecified size; this will have to be forbidden. However,
ANSI C already has objects like this; consider a pointer to a structure
of unspecified shape. One can declare such pointers - this is needed to
deal with mutually-recursive structures, where an object of type
"struct a" contains a pointer to an object of type "struct b", and
*vice versa* - and the language must somehow forbid pointer arithmetic
on such pointers, at least until the structure's shape is declared.
If anyone wonders why the type "pointer to array of 'type'" would be useful, and is not swayed by arguments involving the completeness of type systems or the relative merits of using "pointer to array of 'type'" to point to something of type "array of 'type'" rather than using "pointer to 'type'", consider a program stepping through an array of "vectors", defined as arrays of three "double"s, computing the norm of each one. It *could* do so by stepping an index of integral type, but one reason why C pointers work the way they do is so you can step through an array by stepping a pointer into that array! (These arguments sound somewhat similar to the arguments about taking the address of a "jmp_buf" using "&", since forbidding "&" to be applied to an array forces a programmer to know whether "jmp_buf" is implemented as an array or a structure. In both cases, one is forced to treat arrays differently from other sorts of objects, and it seems unnecessary to require this.) -- Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com (or guy@sun.arpa)
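[Editorial sketch, not part of the original posting.] The "array of vectors" case described above might look roughly like this. To keep the example free of the math library it accumulates squared norms rather than norms; that substitution, the function names, and the sample data are all assumptions made for illustration.

```c
#include <assert.h>

#define NVEC 3

/* Squared Euclidean norm of one vector, passed as
   "pointer to array 3 of double". */
double norm_squared(double (*v)[3])
{
    return (*v)[0] * (*v)[0] + (*v)[1] * (*v)[1] + (*v)[2] * (*v)[2];
}

/* Steps a pointer-to-array through vecs; each increment of p
   advances by one whole three-element vector. */
double sum_of_norms_squared(double (*vecs)[3], int n)
{
    double (*p)[3];
    double total = 0.0;

    for (p = vecs; p < vecs + n; p++)
        total += norm_squared(p);
    return total;
}

/* Made-up sample data: squared norms are 1, 25 and 9. */
double demo_total(void)
{
    static double vecs[NVEC][3] = {
        { 1.0, 0.0, 0.0 },
        { 0.0, 3.0, 4.0 },
        { 1.0, 2.0, 2.0 }
    };

    assert(norm_squared(&vecs[1]) == 25.0);
    return sum_of_norms_squared(vecs, NVEC);
}
```

The loop never needs an integer index, which is the point being made about stepping a pointer through the array.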
throopw@dg_rtp.UUCP (07/21/86)
> davidsen@steinmetz.UUCP (Davidsen) > After thinking about the discussions about using the address operator > on an array by name, I come to the reluctant conclusiont that it SHOULD > be allowed in the new ANSI standard. [...] > > Reason 1: "codification of existing practice" Well, maybe. However, this argument means that the standard should say that the compiler ought to warn about it, yet compile it. An odd thing for a standard to say. > Reason 2: "modularity and information hiding" Unfortunately this argument doesn't hold up, for two reasons. First, &array (when it is allowed) currently most often evaluates to the address of the first element of the array, not to the address of the whole array. Thus, you can't hide the array-ness anyhow, since this is different than other applications of the & operator. Second, the inability to hide information (in particular, the allowable operators for a type) is not unique to arrays in any event. Integers can be "+"ed, but not structures, structures can be "."ed, but not pointers, etc etc etc, and none of this can be hidden. And even if "&" were allowed, the assignment would not be. I'll admit that "&" is a peculiar operator to not be mostly universal, but I'm still not convinced that C's peculiar treatment of arrays makes taking their address sensible. In effect, it adds yet-another-special-case, rather than regularizing things. > Reason 3: "common sense" > After five years of teaching C, I have to agree with my students that > it makes no sense to forbid this construct. To take the address of > something use the address operator. I have a great deal of sympathy for this view. But NOTE WELL, that it should yield the address of the WHOLE ARRAY, and NOT the address of the first element of the array. This is DIFFERENT than current usage. Note that it would make int actual[10]; void f(formal) int formal[10];{} void g(){ f(&actual); } type-incorrect, since the formal is expecting type (int *), and gets type (int (*)[]) instead. 
Also note that "to take the address of something, use the address operator" is overly simplistic, even if arrays could be "&"ed. There are many "somethings" that cannot be "&"ed, such as register variables, bit fields, expressions, and so on. Arrays happen currently to be one of these. To sum up, I wouldn't be absolutely aghast if ANSI legislated that &array should work. But NOTE WELL that it would constitute YASC, and it would be a crime against reason to make it work as it does in some compilers now, such that &array gives a conceptually different type than &non_array. And, on balance, I'd say it isn't really that good an idea. -- The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. --- Alan J. Perlis -- Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
rbj@icst-cmr (Root Boy Jim) (07/23/86)
> davidsen@steinmetz.UUCP (Davidsen) > Reason 2: "modularity and information hiding" Unfortunately this argument doesn't hold up, for two reasons. First, &array (when it is allowed) currently most often evaluates to the address of the first element of the array, not to the address of the whole array. Thus, you can't hide the array-ness anyhow, since this is different than other applications of the & operator. Hopefully, the first element is at the same address as the entire array. Second, the inability to hide information ... I pretty much agree, but then we're talking data type. I don't care what I get back from localtime; I just want to pass it to ctime. > Reason 3: "common sense" > After five years of teaching C, I have to agree with my students that > it makes no sense to forbid this construct. To take the address of > something use the address operator. I have a great deal of sympathy for this view. But NOTE WELL, that it should yield the address of the WHOLE ARRAY, and NOT the address of the first element of the array. This is DIFFERENT than current usage. Note that it would make int actual[10]; void f(formal) int formal[10];{} void g(){ f(&actual); } type-incorrect, since the formal is expecting type (int *), and gets type (int (*)[]) instead. If you think about it, a pointer to an int can be used (and is) as a pointer to an array of ints. Unless you apply ++ to it, they are the same thing. (I can already feel the flames approaching). Also note that "to take the address of something, use the address operator" is overly simplistic, even if arrays could be "&"ed. There are many "somethings" that cannot be "&"ed, such as register variables, bit fields, expressions, and so on. Arrays happen currently to be one of these. Taking a larger view, while I think that it's relatively harmless, except for macros, there's no compelling reason to clamor for this either. 
On the other hand, compilers should warn about but ignore the `&' as that's what the guy probably meant anyhow, and he shouldn't have to recompile just for that. To sum up, I wouldn't be absolutely aghast if ANSI legislated that &array should work. But NOTE WELL that it would constitute YASC, and it would be a crime against reason to make it work as it does in some compilers now, such that &array gives a conceptually different type than &non_array. And, on balance, I'd say it isn't really that good an idea. Yeah. Who cares? Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw (Root Boy) Jim Cottrell <rbj@icst-cmr.arpa> I hope I bought the right relish...zzzzzzzzz..."
tainter@ihlpg.UUCP (Tainter) (07/24/86)
> > After five years of teaching C, I have to agree with my students that > > it makes no sense to forbid this construct. To take the address of > > something use the address operator. I have seen this mistake made by > > students from major universities, and graduates of courses taught by > > high priced consultants, so it's not just my students. > > All right, tell me: What is the type of the address of an array? > > That is, suppose I write: > > int a[10]; > > What type is &a? Don't tell me it's "pointer to integer" because > that is the type of &a[0], and a and a[0] are different things. The answer is : It doesn't have one. That isn't valid C. Compilers will give you warnings about this and interpret it as &a[0], or will give you an error message (or are broken!). --j.a.tainter
throopw@dg_rtp.UUCP (Wayne Throop) (07/26/86)
> rbj%icst-cmr@smoke.UUCP ((Root Boy) Jim Cottrell) >> throopw@dg_rtp.UUCP (Wayne Throop) >>> davidsen@steinmetz.UUCP (Davidsen) >>> [arguments for allowing (&array)] >>> Reason 2: "modularity and information hiding" >> [Still have problem, since current practice is type-anomalous] > Hopefully, the first element is at the same address as the entire array. Yes this is often true, and is necessary in C. But only if you are looking at "addresses" as typeless entities. Which they are not, at least not in C. >>> Reason 3: "common sense" >> [Agreed, but make sure that the type is (int (*)[]), not (int *)] > If you think about it, a pointer to an int can be used (and is) as a pointer > to an array of ints. Unless you apply ++ to it, they are the same thing. > (I can already feel the flames approaching). I assume Jim really means "unless you apply ++, --, [], *, +, -, +=, or -=" (unless I'm overlooking one). That is, unless you use it in arithmetic, subscripting, or indirection. Sort of covers what you can do with a pointer, doesn't it? Remember, a type is not just an interpretation of a pattern of bits. It also has to do with what operations are legal on those bits, and what their effects are. Thus, just because a pointer to the first int in an array has the same bit pattern as a pointer to the whole array does NOT indicate that they are "the same thing", any more than the fact that an integer zero and a floating point zero often have the same bit pattern indicates that these are "the same thing". -- C types require, when pointers pair, Conversions which are never there. They aren't there again today, Please, Dennis, make them go away. -- Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
jsdy@hadron.UUCP (Joseph S. D. Yao) (08/05/86)
I have seen several references to the address of an array vs. the address of the first element of the array. Would someone care to address what they think this difference is, aside from data type? I.e., it is clear that the types *int and *(int[]) should be different. But the values should be the same: int countdown[] = { 10, 9, 8, ... }; gives something like _countdown: => .word 10 .word 9 .word 8 ... The values of both addresses should be the address of the word '10'. Well, yes, in some theoretical architectures I've heard tell of pointers include arbitrary information on e.g. the size of the object. Any of these actually implemented? -- Joe Yao hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP} jsdy@hadron.COM (not yet domainised)
jon@amdahl.UUCP (Jonathan Leech) (08/07/86)
In article <513@hadron.UUCP>, jsdy@hadron.UUCP (Joseph S. D. Yao) writes: > I have seen several references to the address of an array vs. > the address of the first element of the array. Would someone > care to address what they think this difference is, aside from > data type? I.e., it is clear that the types *int and *(int[]) > should be different. But the values should be the same: > int countdown[] = { 10, 9, 8, ... }; > gives something like > _countdown: > => .word 10 > .word 9 > .word 8 > ... > The values of both addresses should be the address of the word > '10'. > > Well, yes, in some theoretical architectures I've heard tell of > pointers include arbitrary information on e.g. the size of the > object. Any of these actually implemented? You could implement pointers as a triple: (low address, length, offset of current member) for range checking. Doesn't the Symbolics machine do something like this? I recall a reference in a C compiler manual for the Symbolics but have never actually used the machine or compiler. -- Jon Leech (...seismo!amdahl!jon) UTS Products / Amdahl Corporation __@/
mouse@mcgill-vision.UUCP (08/09/86)
[ > through >>>> re &array ] >> If you think about it, a pointer to an int can be used (and is) as a >> pointer to an array of ints. Unless you apply ++ to it, they are the >> same thing. (I can already feel the flames approaching). > I assume Jim really means "unless you apply ++, --, [], *, +, -, +=, or > -=" (unless I'm overlooking one). That is, unless you use it in > arithmetic, subscripting, or indirection. Sort of covers what you can > do with a pointer, doesn't it? You are overlooking something important. They are the same thing UNTIL the size of what the pointer points to becomes important. These situations are: ) ++ ) -- ) [] with a non-zero subscript ) + ) - ) += ) -= As for what they SHOULD be....a pointer to an array should be just that; indirecting off it should result in an array. There are good reasons this isn't done; I have yet to hear an implementation suggested that doesn't have worse flaws than the flaw currently under discussion. -- der Mouse USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse think!mosart!mcgill-vision!mouse Europe: mcvax!decvax!utcsri!mcgill-vision!mouse ARPAnet: utcsri!mcgill-vision!mouse@uw-beaver.arpa "Come with me a few minutes, mortal, and we shall talk." - Thanatos (Piers Anthony's Bearing an Hourglass)
throopw@dg_rtp.UUCP (Wayne Throop) (08/11/86)
Several interesting points are raised about just what is the difference between a pointer to an array, and a pointer to the first element of an array.

> jsdy@hadron.UUCP (Joseph S. D. Yao)
> I have seen several references to the address of an array vs.
> the address of the first element of the array. Would someone
> care to address what they think this difference is, aside from
> data type? I.e., it is clear that the types *int and *(int[])
> should be different. But the values should be the same:

What is really the case is that the first addressable element of the array is the same as the first addressable element of the first element of the array. (This is not true in all languages, but must be so in C, unless I'm overlooking something quite tricky.)

But one *MUST* keep in mind that this is *NOT* the same thing as saying that the bit patterns of these pointers must be the same. Nor is it meaningful to say that "they have the same value". The most one should say is that they have the same bit-pattern on most common architectures.

Consider an analogous argument. "I have seen several references to floating point zero vs integer zero. It is clear that the types (float) and (int) should be different. But the values should be the same." (And, just as the two pointers above are "really" indicating the same storage element, the two values are "really" indicating the same point on the "number line".)

Most people see that this is bogus (I hope). And it is bogus for *precisely* the same reason that the same statement about pointers is bogus. The fact that the bit patterns of these objects are the same on most common architectures is *irrelevant*. Because the same operations performed on these objects give different results, and because of this the bit patterns are best thought of as having a *different* *interpretation*. Given that the interpretation is different, it is at best meaningless and at worst dangerously misleading to say that they "have the same value".
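[A minimal sketch of the point just made, assuming a compiler that follows the ANSI draft rule that &a yields a pointer to the whole array; older compilers may reject &a or treat it as &a[0]. The function names are hypothetical. The two pointers compare equal once cast to a common type, yet adding 1 moves each by a different number of bytes.]

```c
#include <stddef.h>

/* Bytes skipped when 1 is added to a pointer to the first element. */
size_t step_of_element_pointer(void)
{
    int a[10];
    int *pe = &a[0];                 /* type: pointer to int */
    return (size_t)((char *)(pe + 1) - (char *)pe);
}

/* Bytes skipped when 1 is added to a pointer to the whole array. */
size_t step_of_array_pointer(void)
{
    int a[10];
    int (*pa)[10] = &a;              /* type: pointer to array of 10 int */
    return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* Nonzero if the two addresses compare equal after a cast to void *. */
int same_address_after_cast(void)
{
    int a[10];
    return (void *)&a == (void *)&a[0];
}
```

[Here step_of_element_pointer() is sizeof(int) and step_of_array_pointer() is 10 * sizeof(int), even though same_address_after_cast() is nonzero.]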
(If this is what Joe meant by "only different in datatype", then I agree with him. But I disagree that this is a good phrase to use to describe this difference.)

(I agree that there are interesting distinctions between the (float) vs (int) case and the (int *) vs (int (*)[]) case. But these distinctions are (I believe) not relevant to this point.)

(And lastly, if I thought "a and &a[0] have the same value" meant "((void *)a)==((void *)&a[0]) is true", I'd agree. But I think the meaning of "same value" in common use means more than this, so I don't agree.)

> jon@amdahl.UUCP (Jonathan Leech)
>> jsdy@hadron.UUCP (Joseph S. D. Yao)
>> Well, yes, in some theoretical architectures I've heard tell of
>> pointers include arbitrary information on e.g. the size of the
>> object. Any of these actually implemented?
> You could implement pointers as a triple:
>	(low address, length, offset of current member)
> for range checking.

The interesting thing here is that the address of an array and the address of the first element, under the above scheme, would *still* have exactly the same bitwise value! This odd result depends on my interpretation of "length", and is derived from the fact that the length is used for range checking.

The point about range checking leads to the conclusion that the length is used to regulate pointer arithmetic, and is thus *not* the length of the item the pointer denotes, but rather the length of valid addressing arithmetic from the "low address". Now, in the case of an array declared at top level, this range is the length of the array. But, in C, the valid addressing range for the first element of an array declared at top level is *still* *the* *length* *of* *the* *array*!

Thus, in terms of the above triple, the address of the array

	char a[10];

ought to be (whatever,10,0), and the address of the first element of this array ought also to be (whatever,10,0).
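[A rough sketch of the triple discussed above, written as an ordinary C struct. This is not how any particular machine or compiler represents pointers; the struct, its field names, and every function below are hypothetical. The "length" field follows the interpretation given above: the extent of valid addressing from the low address.]

```c
#include <stddef.h>

/* Hypothetical range-checked pointer: (low address, length, offset). */
struct checked_ptr {
    char  *low;     /* lowest address that may be referenced */
    size_t length;  /* bytes of valid addressing starting at low */
    size_t offset;  /* current position, in bytes, relative to low */
};

/* Triple for the address of the whole array char a[10]. */
struct checked_ptr address_of_array(char (*pa)[10])
{
    struct checked_ptr p;
    p.low = (char *)pa;
    p.length = sizeof *pa;
    p.offset = 0;
    return p;
}

/* Triple for one element (index assumed to be below 10);
   the valid range is still the enclosing array. */
struct checked_ptr address_of_element(char (*pa)[10], size_t index)
{
    struct checked_ptr p = address_of_array(pa);
    p.offset = index;
    return p;
}

/* Move forward n bytes; 0 on success, -1 if the result leaves the range. */
int checked_advance(struct checked_ptr *p, size_t n)
{
    if (n > p->length - p->offset)
        return -1;              /* would go beyond one past the end */
    p->offset += n;
    return 0;
}

/* Nonzero if two triples agree field by field. */
int same_triple(struct checked_ptr x, struct checked_ptr y)
{
    return x.low == y.low && x.length == y.length && x.offset == y.offset;
}

/* The array and its first element get the same triple. */
int array_and_first_element_match(void)
{
    char a[10];
    return same_triple(address_of_array(&a), address_of_element(&a, 0));
}

/* Stepping to one past the end is allowed; one further is refused. */
int advance_past_end_is_rejected(void)
{
    char a[10];
    struct checked_ptr p = address_of_array(&a);
    return checked_advance(&p, 10) == 0 && checked_advance(&p, 1) == -1;
}
```

[Under these assumptions both triples come out as (a, 10, 0), so any difference between the two pointers has to be carried by the type, not by the representation.]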
> mouse@mcgill-vision.UUCP (der Mouse)
> As for what they SHOULD be....a pointer to an array should be just
> that; indirecting off it should result in an array. There are good
> reasons this isn't done;

I'm a little confused by this. On most compilers I've used indirecting a pointer to an array yields an array, so I'm not sure what is meant by saying that it "isn't done". Perhaps it means that this isn't common usage? I'll agree with that. But nevertheless, most implementations get this particular fine point right.

> I have yet to hear an implementation suggested
> that doesn't have worse flaws than the flaw currently under discussion.

I presume this means that most implementations of C have worse bugs than that of allowing (&array) and returning the address of the first element instead. I won't argue with that. But what I was objecting to was elevating this common bug to a "standard feature". I still think it would be wrong to do so. If it means anything (and currently it does *NOT*), (&array) should indicate the address of the whole array, not the address of its first element.

--
God made integers, all else is the work of man.
	--- Leopold Kronecker {1823-1891}
-- 
Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
guy@sun.uucp (Guy Harris) (08/11/86)
> As for what they SHOULD be....a pointer to an array should be just
> that; indirecting off it should result in an array. There are good
> reasons this isn't done; I have yet to hear an implementation suggested
> that doesn't have worse flaws than the flaw currently under discussion.

I suspect the implementors of PCC would be interested in hearing those "good reasons", since PCC *does* implement the type "pointer to array", and if you dereference something of that type it does yield an array (which gets converted to a pointer into its first element, along the lines mentioned in the ANSI C draft).

In fact, they had no choice *but* to implement that type, since K&R clearly indicates that "pointer to array" is a valid type (an example is given of a pointer of that type in the C Reference Manual). It is a nuisance to generate a *value* of that type to assign to a variable of that type, but that's another matter.

-- 
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
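[A small sketch of the "pointer to array" type in use. One way to get a value of that type without writing &array is to let a two-dimensional array convert to a pointer to its first row; the array contents and function names here are made up for illustration.]

```c
/* Hypothetical data: three rows of four ints. */
static int table[3][4] = {
    { 10,  9,  8,  7 },
    {  6,  5,  4,  3 },
    {  2,  1,  0, -1 }
};

/* Read one cell through a pointer to an array of 4 int. */
int cell_via_row_pointer(int r, int c)
{
    int (*row)[4] = table;   /* table converts to a pointer to its first row */
    row += r;                /* steps by one whole row, i.e. 4 ints */
    return (*row)[c];        /* *row is an array; it is then subscripted */
}

/* The same read spelled with plain subscripts, for comparison. */
int cell_via_subscripts(int r, int c)
{
    return table[r][c];
}
```

[Both functions return the same cell. The first makes the intermediate array-typed expression *row visible before it is converted to a pointer to its first element.]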
karl@haddock (08/13/86)
hadron!jsdy writes:
>I have seen several references to the address of an array vs.
>the address of the first element of the array. Would someone
>care to address what they think this difference is, aside from
>data type?

On most machines, as you imply, &a and &a[0] do indeed have the same bit-pattern, and will compare equal if you cast them to a common type. If, however, you want to *do* something with the pointer (*, [], +, -, ++, --, etc.) you'd better have the correct type as well as value. In particular, the effect of ++p on an int(*)[] is not the same as on an int*.

Btw, someone suggested earlier that ANSI C doesn't interpret &a as pointer to array. I think it does: "The operand of the unary & operator shall be a function locator or an lvalue [other than bit-field or register]. ... The result ... is a pointer to the object [and has type] `pointer to type'." (3.3.3.2, 01-May-1986 draft.) Arrays are not mentioned as a special case. And yes, arrays *are* (non-modifiable) lvalues in X3J11.

Karl W. Z. Heuer (ihnp4!ima!haddock!karl), The Walking Lint
karl@haddock (08/16/86)
dg_rtp!throopw (Wayne Throop) writes:
>What I was objecting to was elevating this common bug [interpreting &a as
>&a[0]] to a "standard feature". I still think it would be wrong to do so.
>If it means anything (and currently it does *NOT*), (&array) should indicate
>the address of the whole array, not the address of its first element.

As I mentioned before, X3J11 seems to have accepted the "correct" meaning.

In the case of the other "optional ampersand", if f is a function locator, then "&f" and "f" are equivalent. I personally think the first is more meaningful (language purity and all that; "f" should denote the function as a whole, even if you can't do anything with it other than "&" or "()"), but I don't use it because lint prefers the second.

I also detest the usage of "pf()" for "(*pf)()", but X3J11 has blessed this as well. (In fact, they defined "()" to *always* operate on a function *pointer* (possibly obtained from the implied "&" on a function locator), so now the first form is more "correct"!)

Karl W. Z. Heuer (ihnp4!ima!haddock!karl), The Walking Lint
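[A short sketch of the two "optional" spellings mentioned above, assuming a compiler that accepts the X3J11 rules. The function twice and the wrapper are hypothetical names used only for illustration.]

```c
/* Hypothetical function to point at. */
static int twice(int x)
{
    return 2 * x;
}

/* Returns twice(x) if every spelling agrees, -1 otherwise. */
int call_all_spellings(int x)
{
    int (*pf1)(int) = twice;    /* implied & on the function name */
    int (*pf2)(int) = &twice;   /* explicit & */

    if (pf1 != pf2)
        return -1;              /* the two initializers should agree */
    if (pf1(x) != (*pf1)(x))
        return -1;              /* pf() and (*pf)() should agree */
    return pf2(x);
}
```

[Which spelling reads better is a matter of taste, as the post says; the sketch only shows that the compiler treats them alike.]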
throopw@dg_rtp.UUCP (Wayne Throop) (08/16/86)
> karl@haddock (Karl W. Z. Heuer)

Ghak!! It's a good thing I said I wouldn't be aghast if ANSI C made &array legal, since they already *have* made it legal! Karl raised a point:

> Btw, someone suggested earlier that ANSI C doesn't interpret &a as pointer
> to array. I think it does: "The operand of the unary & operator shall be
> a function locator or an lvalue [other than bit-field or register]. ...
> The result ... is a pointer to the object [and has type] `pointer to type'."
> (3.3.3.2, 01-May-1986 draft.) Arrays are not mentioned as a special case.
> And yes, arrays *are* (non-modifiable) lvalues in X3J11.

And, looking up the references, I found, in addition to the above points, another that I don't know how I missed before:

	C.2.2.1
	Except when used as an operand that may or shall be an lvalue, [...] an expression that has type "array of *type*" is converted to an expression that has type "pointer to *type*" and that points to the initial member of the array object.

Interestingly, the "may or shall be an lvalue" is a little strange. I assume that what is meant here is that the conversion is not done iff an lvalue is required. (After all, an lvalue *may* be used anywhere a value may be used, but maybe I'm missing some subtle point.)

Note that this is *different* than what K&R say. K&R say that arrays are not lvalues, and the only anomaly is for "sizeof" (and this anomaly is listed along with sizeof, not with arrays). H&S agree with K&R, (on page 97, for example):

	[array of T is converted to pointer to T]. This rule is one of the usual unary conversions. The only exception to the conversion rule is when an array identifier is used as an operand of the sizeof operator,

So, when ANSI-compliant compilers hit the streets, arrays will be lvalues, &array will be legal, and it will even behave reasonably. They even made the rule for array promotion reasonably simple, though the "sizeof" is still a separate special case, and is still listed separately.
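[A minimal sketch of the conversion rule and its exceptions, assuming a compiler that follows the draft wording quoted above. The function names are hypothetical. An array operand of sizeof or & is not converted; in an ordinary expression it becomes a pointer to its first member.]

```c
#include <stddef.h>

/* sizeof applied to the array itself: no conversion, so the whole array. */
size_t size_of_array_operand(void)
{
    char a[10];
    return sizeof a;
}

/* In an arithmetic expression the array is converted to char * first. */
size_t size_of_converted_array(void)
{
    char a[10];
    return sizeof (a + 0);
}

/* &a points at the whole array, so *&a is again an array of 10 char. */
size_t size_of_pointee_of_address(void)
{
    char a[10];
    return sizeof *&a;
}
```

[The first and third functions give 10; the second gives sizeof(char *), whatever that is on the machine at hand.]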
And it's still a little difficult to get an array-typed rvalue, so assignment still doesn't work, even aside from the fact that ANSI doesn't make array-typed lvalues modifiable.

--
"They couldn't hit an elephant at this dist......"
	--- The last words of General John Sedgwick, at the battle of Spotsylvania, 1864
-- 
Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
karl@haddock (08/20/86)
dg_rtp!throopw (Wayne Throop) writes:
>And [in ANSI C] it's still a little difficult to get an array-typed
>rvalue, so assignment still doesn't work, even aside from the fact that
>ANSI doesn't make array-typed lvalues modifiable.

I've got some ideas about that, but the first step is to deprecate the "feature" that allows you to write "f(int a[])" for "f(int *a)". (I refer here to the declaration of the function, not its call.)

In my mind, since arrays may not currently be passed as arguments, the declaration is an error, and the compiler is "politely" figuring out what you must have *meant*. As has already been pointed out, "sizeof(a)" gives you "sizeof(int *)" in this context, so the apparent acceptance of the declaration tends to be confusing.

Karl W. Z. Heuer (ihnp4!ima!haddock!karl), The Walking Lint
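[A brief sketch of the confusion described above. The function names are hypothetical. Many current compilers warn about the sizeof in the callee, which is rather the point: the parameter written as "int a[]" is really an "int *".]

```c
#include <stddef.h>

/* Declared with array syntax, but a is adjusted to type int * here. */
size_t size_seen_by_callee(int a[])
{
    return sizeof a;          /* sizeof(int *), not the caller's array size */
}

/* In the caller, a is still a real array of 10 int. */
size_t size_seen_by_caller(void)
{
    int a[10] = { 0 };
    return sizeof a;
}

/* Nonzero if the callee sees a pointer-sized object. */
int callee_sees_pointer_size(void)
{
    int a[10] = { 0 };
    return size_seen_by_callee(a) == sizeof(int *);
}
```

[The caller gets 10 * sizeof(int); the callee gets sizeof(int *), regardless of how large the argument array was.]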