std-unix@longway.TIC.COM (Moderator, John S. Quarterman) (12/16/89)
From: Mark Brader <uunet!sq.sq.com!msb>

Well, I've just seen the same topic being discussed independently in three
different newsgroups, with three different Subject lines (four, now...).
I've cross-posted this article to all three groups, and directed followups
to comp.std.c; I suggest that further followups on the topic be made from
this article (to keep the same Subject line), and in that group unless
they refer specifically to existing C implementations or to POSIX.

The issue is the legality of:

    struct foo_struct {
        int bar;
        char baz[1];
    } *foo;

    foo = (struct foo_struct *) malloc(sizeof(struct foo_struct)+1);
    foo->baz[1] = 1; /* error? */

[Note that it is not disputed that, if this IS done, an assignment of
*foo to another struct foo_struct won't copy the entire contents of the
"extended" baz member; for this reason if no other, the construct may
be undesirable.]

Both Doug Gwyn and Dennis Ritchie have recently stated without proof,
unless I misunderstood them, that this is not safe.  I believe Doug has
stated that there are implementations where it doesn't work, but hasn't
named any.  Can someone do so (in comp.lang.c)?

A second issue is whether the usage is in conformance with the proposed
ANSI Standard (pANS) for C.  I claim that it is.

The article from which the above code was taken continues:

> Note that it is provable that the char pointer (foo->baz + 1) points
> within the object returned by malloc.

(The + here is of course the one derived from replacing x[y] with
*(x+(y)).)

To this another poster replied (in an article that was for some reason
posted with Distribution usa, but which made it here anyway):

| Unfortunately, it is not provable that the char pointer(foo->baz + 1)
| points within the sub-object baz.  Hence, the behavior is undefined
| (X3J11/88-158, 3.3.6, page 48, lines 24-27).

But this, I say, is irrelevant.
I'll quote the actual words:

# Unless both the pointer operand and the result point to elements of
# the same array, or the pointer operand points one past the last element
# of an array object and the result points to an element of the same array
# object, the behavior is undefined if the result is used as an operand
# of the unary * operator.

There is NO REQUIREMENT here that the "array" spoken of, and the array
whose name was mentioned in the pointer operand, be the same.  In this
case the pointer operand (char pointer value foo->baz), and the result
(foo->baz + 1), both point into the space returned by malloc() which,
it is guaranteed, may be treated as an array of sizeof(struct foo_struct)+1
chars.  So they do point into the same array.

Section 4.10.3, page 155, lines 13-15 (gee, this sounds familiar):

# The pointer returned ... may be assigned to a pointer to any type of
# object and then used to access such an object or an array of such
# objects in the space allocated ...

Well, to be fair, we didn't literally do that.  To do it literally, we
would have had to do:

    char *fooc = (char *) malloc(sizeof(struct foo_struct)+1);
    fooc += offsetof (struct foo_struct, baz); /* sets fooc to foo->baz */
    fooc[1] = 1; /* error? */

Is anyone claiming that fooc in the last line of this code could have
a different value from foo->baz in the original?  If not, can anyone
cite another reason why this code is not conforming?

Offsetof is a macro defined in section 4.1.5, page 99, lines 24-30,
of which the key part is:

# offsetof(type, member-designator) ... expands to an integral constant
# expression ... the value of which is the offset in bytes, to the structure
# member ..., from the beginning of the structure ...

-- 
Mark Brader     "Either the universe works in a predictable, analyzable way
Toronto          or it works spasmodically, with miracles, action at a distance
utzoo!sq!msb     and wishful thinking as the three fundamental forces.  People
msb@sq.com       tend to take one view or the other."
                                        -- Frank D. Kirschner

This article is in the public domain.

Volume-Number: Volume 17, Number 104