[comp.std.c] a[], *p: if 0 <= p - a < sizeof

brnstnd@stealth.acf.nyu.edu (01/05/90)

Basically, I want to have a guaranteed test of whether p points to an
element of a, with foo a[N] and foo *p. I know that *if* p points to an
element of a, *then* 0 <= p - a < sizeof(a); is the reverse true? (If
yes, please explain the logic.)

The standard should define pointer subtraction more carefully.

---Dan

gwyn@smoke.BRL.MIL (Doug Gwyn) (01/05/90)

In article <875@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>Basically, I want to have a guaranteed test of whether p points to an
>element of a, with foo a[N] and foo *p. I know that *if* p points to an
>element of a, *then* 0 <= p - a < sizeof(a); is the reverse true? (If
>yes, please explain the logic.)

No, in fact no strictly conforming program could rely on the converse
since p-a would be an illegal operation when p didn't point into a[]
(or just past its last element).

>The standard should define pointer subtraction more carefully.

I think it was pretty darn careful.

bill@twwells.com (T. William Wells) (01/05/90)

In article <875@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
: Basically, I want to have a guaranteed test of whether p points to an
: element of a, with foo a[N] and foo *p. I know that *if* p points to an
: element of a, *then* 0 <= p - a < sizeof(a); is the reverse true? (If
: yes, please explain the logic.)

No. In general, if p does not point to a member of a, p - a is
undefined.

: The standard should define pointer subtraction more carefully.

It has defined this most explicitly. Since my fingers are tired,
I won't type in the appropriate text. But it is in section 3.3.6.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

brnstnd@stealth.acf.nyu.edu (01/07/90)

In article <1990Jan5.040710.23691@twwells.com> bill@twwells.com (T. William Wells) writes:
> In article <875@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
> : Basically, I want to have a guaranteed test of whether p points to an
> : element of a, with foo a[N] and foo *p. I know that *if* p points to an
> : element of a, *then* 0 <= p - a < sizeof(a); is the reverse true? (If
> : yes, please explain the logic.)
> No. In general, if p does not point to a member of a, p - a is
> undefined.

That doesn't answer the question!

Take a program that computes p - a. There are four possible results for
one run of the program on one machine under one compiler:

  1. FPE is generated.
  2. SEGV is generated.
  3. p - a is between 0 and sizeof(a)/sizeof(*a) - 1 inclusive.
  4. p - a is outside that range.

The standard specifies that if p points to an element of a, then the
result is 3. Also, if p points just past a, then the result is 4.
Otherwise, as you point out, the result is undefined.

My question is about the opposite direction. If cases 1, 2, or 4 happen,
then a conforming program can safely conclude that p does not point to
an element of a. I want to complete that test.

ANSI saw fit to specify that a*(a/b)+(a%b) equals a for any numbers a
and b, provided that a/b doesn't generate a signal. Could they specify
that a + (p - a) equals p for pointers a and p, provided that p - a
doesn't generate a signal? This would make pointer subtraction slightly
slower on most machines: after subtracting the machine values of p and
a and dividing by the size of the type, the code would have to generate
a signal if the division wasn't exact. ANSI has always sacrificed
efficiency for consistency; why not here?

> : The standard should define pointer subtraction more carefully.
> It has defined this most explicitly. Since my fingers are tired,
> I won't type in the appropriate text. But it is in section 3.3.6.

It doesn't define pointer subtraction well enough to answer my questions.

---Dan

bill@twwells.com (T. William Wells) (01/07/90)

In article <1978@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
: In article <1990Jan5.040710.23691@twwells.com> bill@twwells.com (T. William Wells) writes:
: > In article <875@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
: > : Basically, I want to have a guaranteed test of whether p points to an
: > : element of a, with foo a[N] and foo *p. I know that *if* p points to an
: > : element of a, *then* 0 <= p - a < sizeof(a); is the reverse true? (If
: > : yes, please explain the logic.)
: > No. In general, if p does not point to a member of a, p - a is
: > undefined.
:
: That doesn't answer the question!

Well, with just a little thought, it should.

There is no test of whether p points to an element of a that is
built into the language. If you do want to do such a test, you
have to go way 'round the barn to do it. For example,

	for (i = 0; i < N && p != &A[i]; ++i)
		;
	if (i != N) {
		/* is a member */
	} else {
		/* is not a member */
	}

: Take a program that computes p - a. There are four possible results for
: one run of the program on one machine under one compiler:
:
:   1. FPE is generated.
:   2. SEGV is generated.
:   3. p - a is between 0 and sizeof(a)/sizeof(*a) - 1 inclusive.
:   4. p - a is outside that range.

Don't forget 5:

    5. Places a call to your mother and plays a recording saying
       you've been a bad boy and subtracted two pointers you
       shouldn't have. :-) Perfectly legit if p is not in a. And,
       oh yes, p - a could then return 0.

: The standard specifies that if p points to an element of a, then the
: result is 3. Also, if p points just past a, then the result is 4.
: Otherwise, as you point out, the result is undefined.

And the result *may* be any of 0..sizeof whatever - 1. Meaning
that case 3 does not mean that p is in a.

:                                                     Could they specify
: that a + (p - a) equals p for pointers a and p, provided that p - a
: doesn't generate a signal? This would make pointer subtraction slightly
: slower on most machines: after subtracting the machine values of p and
: a and dividing by the size of the type, the code would have to generate
: a signal if the division wasn't exact.

What you are suggesting won't always work. Not all machines have
addresses that can be mapped into integers. But, even when it will
it would make pointer subtraction much slower on some very common
machines. On an 8086 (small model), integer pointer subtraction
looks like:

	mov     ax,p1
	sub     ax,p2
	ror     ax,1

For your test, add:

	jc      error

Timings: without the test, and assuming p1 and p2 in registers, 7
cycles; with the test, 11 cycles. Now, as is typically the case,
that jc really has to be a jnc followed by a jmp (because error
is external), the result is not 11 cycles, but 23 cycles.
Assuming the arguments are both on the stack, these become 37,
41, and 53 cycles.

Hardly slight.

And the biggest cost is in heavily optimized code.

For large model code, this gets really ugly. The standard pointer
subtraction code is essentially the same as in small model with a
cwd instruction added. Because of the undefinedness of p - a when
p is not in a, the compiler can ignore the case where the two
pointers are in different segments. But to achieve the results
you want, you have to add in code for comparing segments,
something like:

	mov     ax,segment p1
	cmp     ax,segment p2
	jne     error

or you can go to the pain of converting segment addresses to flat
addresses and doing long arithmetic with them. Both cases would
be much slower than as it is done now.

:                                        ANSI has always sacrificed
: efficiency for consistency; why not here?

That is a canard. Don't make baseless accusations. We're not
ignorant around here and will catch you out when you do.

: > : The standard should define pointer subtraction more carefully.
: > It has defined this most explicitly. Since my fingers are tired,
: > I won't type in the appropriate text. But it is in section 3.3.6.
:
: It doesn't define pointer subtraction well enough to answer my questions.

Oh yes it does.

What you said: "they should define *pointer subtraction* more
carefully". The implication is that there is some ambiguity or
imprecision in their specification. There are certainly such in
the standard, and there have been many discussion of such on this
newsgroup, but there is no imprecision or ambiguity in the
standard that relates to your specific question, as far as I know.

It is possible that you meant: "the standard should make pointer
subtraction definite in more cases than it does", but, if so, you
need to make your assertions more carefully, since that is not
what you *said*.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

peter@ficc.uu.net (Peter da Silva) (01/09/90)

Here is a real-life example of a system wherein an optimising compiler
could fail to tell whether a pointer pointed into an array: the famous
intel 8086 family.

Suppose a is XXXX:AAAA through XXXX:BBBB (segment XXXX, offset AAAA through
BBBB).

Suppose p is YYYY:CCCC

Suppose AAAA < CCCC < BBBB.

Since pointer subtraction is only defined within an object, the compiler is
free to evaluate !p-a! by comparing only the offset portions of the
address. !p-a! will evaluate to !(CCCC-AAAA)/sizeof *p!, which is positive
and less than !sizeof a/sizeof *a!.

But !p! doesn't even point into the same segment as !a!. More complex,
in 8086 mode segments may overlap, so !p! may actually point into !a!
but in a different place than !a[p-a]!.

There's no way around this for a segmented architecture, except synthesising
your pointer operations. Not a good idea if you want performance.
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \ Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>
\_.--._/
      v  "Have you hugged your wolf today?" `-_-'

brnstnd@stealth.acf.nyu.edu (01/10/90)

The answer to the specific question in the subject: No. The standard
guarantees nothing; and the test will fail on micros in practice. It may
also fail on some real computers if the type is of size 2 or more and if
p points between elements of a.

The answer to the question of how to reliably test whether p points to
an element of a: Run through the elements of a and test each against p.
There doesn't seem to be a faster portable method.

When ANSI says ``undefined,'' it really does mean that your computer
could blow up. This is an unfortunate gap in the standard (though, I
admit, not a sufficient incentive to switch to Ada).

---Dan