norvell@csri.toronto.edu (Theo Norvell) (12/09/89)
In article <809@prles2.UUCP> meulenbr@cstw68.prl.philips.nl (Frans Meulenbroeks) writes:
>(by the way, does ANSI allow index out of
>bound checks? Are they forbidden? Is it left to the implementor? I could
>not find anything in the draft)
>

The drafts were not very explicit on this point, but when I was writing a
compiler that did bounds checks, I read the then current draft and came to
the following conclusion.

Loading or storing out of bounds results in undefined behaviour. The
standard does not say this directly, but it does say:
(1) Adding or subtracting from a pointer such that it points outside of
    the array it is pointing into results in an invalid pointer (I think
    that is the term used).
(2) Loading or storing through an invalid pointer is undefined.

Note that forming an invalid pointer is not always undefined. In the
special case of a pointer value that points just past the end of an array
you can still compare with it (consider
	int A[N] ; for(p=A; p < A+N; ++p) ...
) and even dereference it to form a (invalid) lvalue (consider
	for(p=A; p < &A[N]; ++p) ...
recalling that A[N] is the same as *(A+N)) but you can not load or store
at that lvalue.

Thus the implementor is free to check bounds so long as she is careful
about the one past the end case. The programmer must not form pointer
values that point out of bounds except for the one past the end case, and
in any case must not load or store via such a pointer.

Theo Norvell
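To make the "careful about the one past the end case" remark concrete, here is a minimal sketch of what a bounds-checking implementation might track for each pointer. It is only an illustration under stated assumptions: the names struct checked_ptr, cp_advance, cp_load and cp_store are hypothetical and come neither from the draft nor from any compiler mentioned in this thread, and array lengths are assumed small enough to fit in a long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked pointer: illustrative names only. */
struct checked_ptr {
    int   *base;  /* first element of the underlying array */
    size_t len;   /* number of elements in the array */
    size_t pos;   /* current index; pos == len means one past the end */
};

/* Move the pointer by delta elements.  Any position in [0, len] is
   accepted, so the one-past-the-end value can be formed.  Returns 0 on
   success, -1 (pointer unchanged) if the result would be out of range. */
int cp_advance(struct checked_ptr *p, long delta)
{
    long target;

    assert(p != NULL);
    target = (long)p->pos + delta;
    if (target < 0 || (size_t)target > p->len)
        return -1;
    p->pos = (size_t)target;
    return 0;
}

/* Load through the pointer.  Only positions in [0, len) are accepted,
   so the one-past-the-end value is rejected here.  Returns 0 or -1. */
int cp_load(const struct checked_ptr *p, int *out)
{
    assert(p != NULL && out != NULL);
    if (p->pos >= p->len)
        return -1;
    *out = p->base[p->pos];
    return 0;
}

/* Store through the pointer, with the same range rule as cp_load. */
int cp_store(struct checked_ptr *p, int value)
{
    assert(p != NULL);
    if (p->pos >= p->len)
        return -1;
    p->base[p->pos] = value;
    return 0;
}
```

The design point is that the two checks use different ranges: arithmetic accepts the closed range [0, len], while loads and stores accept only [0, len).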
norvell@csri.toronto.edu (Theo Norvell) (12/12/89)
In article <1989Dec8.161820.24804@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (I) write:
>The drafts were not very explicit on this point [bounds checks], but when
>I was writing a compiler that did bounds checks, I read the then current
> draft and came to the following conclusion. [nonsense omitted]

After looking at a more recent draft (May 88) I found that (at least) 3
things I said were either out of date or plain wrong. Let me make amends
by saying:
(1) The draft is very explicit (3.3.6) that bounds checking is allowed.
(2) Even creating a pointer that points out of the array is undefined
    with the exception of the pointer just past the end.
(3) Merely dereferencing the just past the end pointer is undefined, not
    as I said loading or storing the resultant lvalue (although that is
    naturally undefined too).
Thus
	int A[N], *p;
	for(p=A; p < A+N ; ++p ) { ... } /* Good */
	for(p=A; p < &A[N] ; ++p ) { ... } /* Undefined! */

Theo Norvell
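The form marked "Good" above can be written out in full as follows. This is only a sketch: the function name sum_ints is hypothetical and chosen for illustration. The end pointer is formed by addition and used only in the comparison, never as an operand of unary *.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the n elements starting at a (hypothetical helper name).
   'end' points one past the last element; it is compared against,
   but never dereferenced. */
int sum_ints(const int *a, size_t n)
{
    const int *p;
    const int *end;
    int total = 0;

    assert(a != NULL);
    end = a + n;
    for (p = a; p < end; ++p)
        total += *p;          /* p is always strictly inside the array here */
    return total;
}
```

For an array declared as int A[4], a call would look like sum_ints(A, 4).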
rhg@cpsolv.UUCP (Richard H. Gumpertz) (12/12/89)
In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>	int A[N], *p;
>	for(p=A; p < A+N ; ++p ) { ... } /* Good */
>	for(p=A; p < &A[N] ; ++p ) { ... } /* Undefined! */

Gee, that is kind of interesting. Is the result of &A[N] "used as an
operand of the unary * operator" (which is prohibited in 3.3.6)? That is,
does the & operator cancel out the * implicit in [...]?

I think some special language might be required in 3.3.6 to allow &*
without undefined results, since this is probably what the committee
desired anyway. It would be silly to allow A+N but not &A[N]!
--
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================
bill@twwells.com (T. William Wells) (12/13/89)
In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
: int A[N], *p;
: for(p=A; p < A+N ; ++p ) { ... } /* Good */
: for(p=A; p < &A[N] ; ++p ) { ... } /* Undefined! */
The two are exactly the same:
&A[N] = &(*(A + N)) = A + N
---
Bill { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com
rhg@cpsolv.UUCP (Richard H. Gumpertz) (12/13/89)
In article <1989Dec12.190347.13521@twwells.com> bill@twwells.com (T. William Wells) writes:
>The two are exactly the same:
>
>	&A[N] = &(*(A + N)) = A + N

No, the two are not exactly alike. According to 3.3.6, *(A+N) is
undefined ("...the behavior is undefined if the result is used as an
operand of a unary * operator") and so &(*(A+N)) is undefined. A+N, on
the other hand, is well defined.

I really believe that the wording in 3.3.6 is wrong; I cannot believe
that the committee intended for A+N to be legal but not &A[N].

Anyone on the committee care to respond?
--
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================
gwyn@smoke.BRL.MIL (Doug Gwyn) (12/14/89)
In article <465@cpsolv.UUCP> rhg@cpsolv.uucp (Richard H. Gumpertz) writes:
>In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>>int A[N];
>>&A[N] /* Undefined! */
>I think some special language might be required in 3.3.6 to allow &*
>without undefined results, since this is probably what the committee
>desired anyway. It would be silly to allow A+N but not &A[N]!

I seem to vaguely recall discussion of this point in some X3J11 meeting,
and it is not clear to me whether or not &A[N] being undefined was
intended or not. This is another case where an official query should
be sent to X3.

For people who didn't follow the argument, &A[N] is equivalent to &(*(A+(N)))
but *(A+(N)) is a semantic violation. (A+(N)) is okay, but applying * to
it is not okay.

If this reminds you of the sizeof(((foo*)0)->bar) argument, well...
ejp@bohra.cpg.oz (Esmond Pitt) (12/15/89)
>In article <1989Dec12.190347.13521@twwells.com> bill@twwells.com (T. William Wells) writes:
>The two are exactly the same:
>
>	&A[N] = &(*(A + N)) = A + N

Is this really correct? &(*(anything)) does not have a defined meaning
anywhere else in C. Perhaps we mean:

	&A[N] = &A[0+N] = &A[0]+N = A+N

--
Esmond Pitt, Computer Power Group
ejp@bohra.cpg.oz
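The last two steps of that chain can be checked without ever writing the disputed expression &A[N]. The sketch below is only an illustration: the array size and the function names end_via_decay, end_via_first_element and check_ends_agree are hypothetical. Both functions form the one-past-the-end pointer purely by pointer addition, and neither applies unary * (or [], which implies it) at index N.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 8 };     /* hypothetical array size, for illustration only */
static int A[N];

/* One past the end, starting from the array name: A+N. */
int *end_via_decay(void)
{
    return A + N;
}

/* One past the end, starting from the address of element 0: &A[0]+N.
   Element 0 exists, so &A[0] is not in dispute. */
int *end_via_first_element(void)
{
    return &A[0] + N;
}

/* Both routes should yield the same pointer value. */
void check_ends_agree(void)
{
    assert(end_via_decay() == end_via_first_element());
}
```

Whether the first step of the chain (&A[N] = &A[0+N]) is itself defined is exactly the question the thread leaves open; the sketch takes no position on it.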
bdm659@csc.anu.oz (12/15/89)
In article <11809@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <465@cpsolv.UUCP> rhg@cpsolv.uucp (Richard H. Gumpertz) writes:
>>In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>>>int A[N];
>>>&A[N] /* Undefined! */
>>I think some special language might be required in 3.3.6 to allow &*
>>without undefined results, since this is probably what the committee
>>desired anyway. It would be silly to allow A+N but not &A[N]!
>
> I seem to vaguely recall discussion of this point in some X3J11 meeting,
> and it is not clear to me whether or not &A[N] being undefined was
> intended or not. This is another case where an official query should
> be sent to X3.
>
> For people who didn't follow the argument, &A[N] is equivalent to &(*(A+(N)))
> but *(A+(N)) is a semantic violation. (A+(N)) is okay, but applying * to
> it is not okay.

Section 3.3.6 in the Rationale (Dec. 1988 version) has an example using
&A[N], so at least one member of the committee thought it was ok.

On the other hand, that part of the Rationale is up the creek in another
way. It says "This stipulation [allowing a pointer to point just after
an array] merely requires that every object be followed by one byte whose
address is representable." I don't think that follows at all. An
implementation could easily make a special case in the internal
representation of a pointer to cover the case where an array ends at the
end of addressable memory, or at the end of a memory segment, if it
wanted to. I don't see anything in the Standard which says that a pointer
just past the end of an array must be a pointer to an object. In fact, in
places it is clearly distinguished from pointers to objects.

Brendan McKay. bdm@anucsd.oz.au or bdm659@csc1.anu.oz.au