[comp.std.c] Bounds checks.

norvell@csri.toronto.edu (Theo Norvell) (12/09/89)

In article <809@prles2.UUCP> meulenbr@cstw68.prl.philips.nl (Frans Meulenbroeks) writes:
>(by the way, does ANSI allow index out of
>bound checks? Are they forbidden? Is it left to the implementor? I could
>not find anything in the draft)
>
The drafts were not very explicit on this point, but when I was writing
a compiler that did bounds checks, I read the then-current draft
and came to the following conclusion.

Loading or storing out of bounds results in undefined behaviour.  The
standard does not say this directly, but it does say:
	(1) Adding or subtracting from a pointer such that
	    it points outside of the array it is pointing into
	    results in an invalid pointer (I think that is the
	    term used).
	(2) Loading or storing through an invalid pointer is
	    undefined.
Note that forming an invalid pointer is not always undefined.
In the special case of a pointer value that points just past
the end of an array, you can still compare with it (consider
int A[N]; for(p=A; p < A+N; ++p) ... ) and even dereference it
to form an (invalid) lvalue (consider for(p=A; p < &A[N]; ++p) ...,
recalling that A[N] is the same as *(A+N)), but you cannot load or
store at that lvalue.

Thus the implementor is free to check bounds so long as she is
careful about the one-past-the-end case.  The programmer must
not form pointer values that point out of bounds, except for the
one-past-the-end case, and in any case must not load or store via such a pointer.
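
As a rough illustration of what such a checking implementation has to
allow and reject, here is a minimal sketch of a "fat pointer" that carries
its bounds with it.  The type checked_ptr and the functions cp_add and
cp_load are hypothetical names made up for this example; they are not
taken from any draft or from any real compiler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: the bounds travel with the position. */
struct checked_ptr {
    int    *base;  /* first element of the array */
    size_t  len;   /* number of elements in the array */
    size_t  idx;   /* current position, 0..len inclusive */
};

/* Pointer arithmetic: positions 0..len are accepted, where len is the
   one-past-the-end position.  Anything else is rejected and the
   pointer is left unchanged. */
bool cp_add(struct checked_ptr *p, ptrdiff_t n)
{
    ptrdiff_t target = (ptrdiff_t)p->idx + n;
    if (target < 0 || (size_t)target > p->len)
        return false;
    p->idx = (size_t)target;
    return true;
}

/* Load: only positions 0..len-1 may be read.  The one-past-the-end
   position may be held and compared, but not loaded from. */
bool cp_load(const struct checked_ptr *p, int *out)
{
    if (p->idx >= p->len)
        return false;
    *out = p->base[p->idx];
    return true;
}
```

With a three-element array, cp_add can move the position from 0 to 3, but
cp_load then fails at position 3, and a further cp_add of 1 is rejected.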

Theo Norvell

norvell@csri.toronto.edu (Theo Norvell) (12/12/89)

In article <1989Dec8.161820.24804@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (I) write:
>The drafts were not very explicit on this point [bounds checks], but when
>I was writing a compiler that did bounds checks, I read the then current
> draft and came to the following conclusion. [nonsense omitted]

After looking at a more recent draft (May 88), I found that at least three
things I said were either out of date or plain wrong.  Let me make amends by
saying:
	(1) The draft is very explicit (3.3.6) that bounds checking is allowed.
	(2) Even creating a pointer that points outside the array is
	    undefined, with the exception of the pointer just past the end.
	(3) Merely dereferencing the just-past-the-end pointer is undefined;
	    it is not only, as I said, loading or storing through the
	    resulting lvalue (although that is naturally undefined too).  Thus:
		    int A[N], *p;
		    for(p=A; p < A+N ; ++p ) { ... } /* Good */
		    for(p=A; p <  &A[N] ; ++p ) { ... } /* Undefined! */
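
For concreteness, the first ("Good") form can be written out as a complete
function.  This is only a sketch: the name sum_ints and its parameters are
invented for the example.  The end pointer is formed by addition alone and
is only ever compared against, never dereferenced.

```c
#include <stddef.h>

/* Sum the n ints starting at a. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* one past the end: formed and compared only */
    const int *p;
    long total = 0;

    for (p = a; p < end; ++p)
        total += *p;          /* p is strictly inside the array here */
    return total;
}
```

Called on a four-element array, the loop reads elements 0 through 3 and
stops as soon as p compares equal to end.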

Theo Norvell

rhg@cpsolv.UUCP (Richard H. Gumpertz) (12/12/89)

In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>		    int A[N], *p;
>		    for(p=A; p < A+N ; ++p ) { ... } /* Good */
>		    for(p=A; p <  &A[N] ; ++p ) { ... } /* Undefined! */

Gee, that is kind of interesting.  Is the result of &A[N] "used as an
operand of the unary * operator" (which is prohibited in 3.3.6)?  That
is, does the & operator cancel out the * implicit in [...]?  I think some
special language might be required in 3.3.6 to allow &* without
undefined results, since this is probably what the committee desired
anyway.  It would be silly to allow A+N but not &A[N]!

-- 
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================

bill@twwells.com (T. William Wells) (12/13/89)

In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
:                   int A[N], *p;
:                   for(p=A; p < A+N ; ++p ) { ... } /* Good */
:                   for(p=A; p <  &A[N] ; ++p ) { ... } /* Undefined! */

The two are exactly the same:

	&A[N] = &(*(A + N)) = A + N

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

rhg@cpsolv.UUCP (Richard H. Gumpertz) (12/13/89)

In article <1989Dec12.190347.13521@twwells.com> bill@twwells.com (T. William Wells) writes:
>The two are exactly the same:
>
>	&A[N] = &(*(A + N)) = A + N

No, the two are not exactly alike.  According to 3.3.6, *(A+N) is
undefined ("...the behavior is undefined if the result is used as an
operand of a unary * operator") and so &(*(A+N)) is undefined.  A+N, on
the other hand, is well defined.  I really believe that the wording in
3.3.6 is wrong; I cannot believe that the committee intended for A+N to
be legal but not &A[N].

Anyone on the committee care to respond?

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/14/89)

In article <465@cpsolv.UUCP> rhg@cpsolv.uucp (Richard H. Gumpertz) writes:
>In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>>int A[N];
>>&A[N] /* Undefined! */
>I think some special language might be required in 3.3.6 to allow &*
>without undefined results, since this is probably what the committee
>desired anyway.  It would be silly to allow A+N but not &A[N]!

I seem to vaguely recall discussion of this point in some X3J11 meeting,
and it is not clear to me whether &A[N] being undefined was intended.
This is another case where an official query should be sent to X3.

For people who didn't follow the argument, &A[N] is equivalent to &(*(A+(N)))
but *(A+(N)) is a semantic violation.  (A+(N)) is okay, but applying * to
it is not okay.

If this reminds you of the sizeof(((foo*)0)->bar) argument, well...

ejp@bohra.cpg.oz (Esmond Pitt) (12/15/89)

In article <1989Dec12.190347.13521@twwells.com> bill@twwells.com (T. William Wells) writes:
>The two are exactly the same:
>
>	&A[N] = &(*(A + N)) = A + N

Is this really correct?

	&(*(anything))
	
does not have a defined meaning anywhere else in C. Perhaps we mean:

	&A[N] = &A[0+N] = &A[0]+N = A+N
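
A small sketch of the last two forms in that chain, for anyone who wants
to try it.  The array length N and the function names are assumptions
made up for the example.  Both functions form the end pointer without
ever applying * to the one-past-the-end position, since A[0] is an element
that really exists.

```c
#include <stddef.h>

#define N 8            /* hypothetical array length */
int A[N];

/* End pointer formed by pointer addition on the array name. */
int *end_from_array(void)
{
    return A + N;
}

/* End pointer formed from the address of the first element. */
int *end_from_first(void)
{
    return &A[0] + N;
}

/* Count the elements visited by a loop bounded by the given end pointer. */
size_t count_up_to(int *end)
{
    size_t n = 0;
    int *p;

    for (p = A; p < end; ++p)
        ++n;
    return n;
}
```

The two end pointers compare equal, and a loop bounded by either one
visits exactly N elements.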


-- 
Esmond Pitt, Computer Power Group
ejp@bohra.cpg.oz

bdm659@csc.anu.oz (12/15/89)

In article <11809@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <465@cpsolv.UUCP> rhg@cpsolv.uucp (Richard H. Gumpertz) writes:
>>In article <1989Dec11.181631.3864@jarvis.csri.toronto.edu> norvell@csri.toronto.edu (Theo Norvell) writes:
>>>int A[N];
>>>&A[N] /* Undefined! */
>>I think some special language might be required in 3.3.6 to allow &*
>>without undefined results, since this is probably what the committee
>>desired anyway.  It would be silly to allow A+N but not &A[N]!
>
> I seem to vaguely recall discussion of this point in some X3J11 meeting,
> and it is not clear to me whether or not &A[N] being undefined was
> intended or not.  This is another case where an official query should
> be sent to X3.
>
> For people who didn't follow the argument, &A[N] is equivalent to &(*(A+(N)))
> but *(A+(N)) is a semantic violation.  (A+(N)) is okay, but applying * to
> it is not okay.

Section 3.3.6 in the Rationale (Dec. 1988 version) has an example using
&A[N], so at least one member of the committee thought it was ok.  On the
other hand, that part of the Rationale is up the creek in another way.
It says

"This stipulation [allowing a pointer to point just after an array] merely
requires that every object be followed by one byte whose address is
representable."

I don't think that follows at all.  An implementation could easily make a
special case in the internal representation of a pointer to cover the
case where an array ends at the end of addressable memory, or at the end
of a memory segment, if it wanted to.  I don't see anything in the Standard
which says that a pointer just past the end of an array must be a pointer to
an object.  In fact, in places it is clearly distinguished from pointers to
objects.

Brendan McKay.   bdm@anucsd.oz.au  or  bdm659@csc1.anu.oz.au