[comp.std.c] Testing Equal Pointers

turner@sdti.SDTI.COM (Prescott K. Turner) (03/24/89)

The quotes below are from comp.lang.c.

In article <1989Mar21.085704.15894@ateng.ateng.com> chip@ateng.ateng.com (Chip Salzenberg) writes:
> For totally correct comparisons of all pointers, it's necessary
> to normalize them by hand, or be sure that they are cast to "huge *" when
> any pointer arithmetic is done.  Otherwise, the different combinations of
> segment+offset that actually refer to the same address do not compare equal.

In article <16039@cup.portal.com> Kevin_P_McCarty@cup.portal.com writes:
> It is possible however to have two pointers point to the same storage
> location but which compare unequal.  A one-to-one mapping between
> pointers and storage locations is not required.

Why is this dismal situation reflected in the pANS?  How can standard C
let pointers to the same object compare unequal?  Even in Microsoft C
large model there are ways to be sure that your pointer comparisons for
equality/inequality will yield the appropriate result.  

But STANDARD C PROVIDES NO WAY at all to tell if two pointers point to the
same object!  In fact, a compiler could implement (p1==p2) as
(p1==0 && p2==0) and still pass all tests for standard conformance.

Any program is not strictly conforming if it tests for equality of
pointers, gets a 0 result, and then proceeds on the basis that those
pointers do not point to the same object.

It is reasonable that X3J11 made allowances for hardware in exceptional
situations, as when the result of + overflows.  But that is different,
because a strictly conforming program has means to avoid arithmetic overflow.
--
Prescott K. Turner, Jr.
Software Development Technologies, Inc.
375 Dutton Rd., Sudbury, MA 01776 USA        (508) 443-5779
UUCP: ...{harvard,mit-eddie}!sdti!turner    Internet: turner@sdti.sdti.com

gwyn@smoke.BRL.MIL (Doug Gwyn) (03/27/89)

In article <375@sdti.SDTI.COM> turner@sdti.SDTI.COM (Prescott K. Turner, Jr.) writes:
-The quotes below are from comp.lang.c.

You shouldn't believe everything you read in comp.lang.c.

-In article <1989Mar21.085704.15894@ateng.ateng.com> chip@ateng.ateng.com (Chip Salzenberg) writes:
-> For totally correct comparisons of all pointers, it's necessary
-> to normalize them by hand, or be sure that they are cast to "huge *" when
-> any pointer arithmetic is done.  Otherwise, the different combinations of
-> segment+offset that actually refer to the same address do not compare equal.
-In article <16039@cup.portal.com> Kevin_P_McCarty@cup.portal.com writes:
-> It is possible however to have two pointers point to the same storage
-> location but which compare unequal.  A one-to-one mapping between
-> pointers and storage locations is not required.

If these are talking about ANSI C, they are incorrect.
(Except that certain operations not permitted of conforming applications
might produce pointers that accidentally refer to the same location; but
since that would be totally erroneous code, you shouldn't worry about it.)

-Why is this dismal situation reflected in the pANS?  How can standard C
-let pointers to the same object compare unequal?  Even in Microsoft C
-large model there are ways to be sure that your pointer comparisons for
-equality/inequality will yield the appropriate result.  
-But STANDARD C PROVIDES NO WAY at all to tell if two pointers point to the
-same object!  In fact, a compiler could implement (p1==p2) as
-(p1==0 && p2==0) and still pass all tests for standard conformance.
-Any program is not strictly conforming if it tests for equality of
-pointers, gets a 0 result, and then proceeds on the basis that those
-pointers do not point to the same object.

That's all wrong.  Pointers to the same object compare equal, and object
pointers that compare equal are either both null pointers or both refer
to the same object (or one past it).  Similarly for function pointers.
See sections 3.3.8 and 3.3.9 in the pANS.  Null pointers compare unequal
to pointers to objects or functions; see section 3.2.2.3.
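
As a minimal sketch of the guarantees just listed (the function name is
hypothetical and does not come from the pANS), every comparison below
should hold on a conforming implementation:

```c
#include <stddef.h>

/* Returns 1 if every equality guarantee checked below holds. */
int equality_guarantees_hold(void)
{
    int x = 0;
    int arr[4] = {0, 0, 0, 0};
    int *p = &x;
    int *q = &x;            /* a second pointer to the same object */
    int *null1 = NULL;
    int *null2 = 0;         /* another way to spell a null pointer */

    return p == q                   /* same object: equal */
        && &arr[2] == arr + 2       /* same element, reached two ways */
        && arr + 4 == &arr[1] + 3   /* both one past the last element */
        && null1 == null2           /* two null pointers: equal */
        && p != null1               /* object pointer vs. null: unequal */
        && &arr[0] != &arr[1];      /* distinct elements: unequal */
}
```

All the pointers here come from "&" and ordinary pointer arithmetic, which
is the case the quoted sections cover.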

msb@sq.com (Mark Brader) (03/28/89)

> ... from comp.lang.c ...
> > It is possible however to have two pointers point to the same storage
> > location but which compare unequal.
> 
> Why is this dismal situation reflected in the pANS?  How can standard C
> let pointers to the same object compare unequal?  ...  STANDARD C
> PROVIDES NO WAY at all to tell if two pointers point to the same object!

Wrong (in pANS, "standard", C), no matter how loud you shout.  (I've added
a cross-posting back to comp.lang.c to point out that the compiler(s) referred
to in the original article would not be conforming to the pANS in this respect.
Followups are directed to comp.lang.c also, since this article should close
the topic from a pANS viewpoint!)

The wording here did undergo some late editorial changes and may be
different between the October draft and the final December version of the
proposed standard.  Also, note that the discussion of equality is partly
under relational operators, rather than equality operators.  This is simply
because the presentation (in order of precedence) happens to get to them first.
Equality affects the relational operators too because the two expressions

		a <= b && !(a < b)
	and
		a == b

are equivalent in the absence of undefined behavior.
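
A small sketch of that equivalence, assuming both pointers point into (or
one past the end of) the same array so that the relational operators are
defined.  The function names are made up for illustration:

```c
/* Precondition for both functions: a and b point into, or one past the
   end of, the same array.  Otherwise "<" and "<=" are undefined. */

/* Equality expressed with relational operators only. */
int equal_via_relational(const char *a, const char *b)
{
    return a <= b && !(a < b);
}

/* Equality expressed directly. */
int equal_via_equality(const char *a, const char *b)
{
    return a == b;
}
```

For any pair of positions in one array, including the one-past-the-end
position, the two functions should agree.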

From Section 3.3.8 "Relational operators", page 50, lines 7-10:

#  If two pointers to object or incomplete types both point to the same
#  object, or both point one past the last element of the same array
#  object, they compare equal.  If two pointers to object or incomplete
#  types compare equal, both point to the same object, or both point
#  one past the last element of the same array object.

There is a footnote about invalid prior operations and undefined behavior.

Then, in Section 3.3.9 "Equality operators", page 50, lines 28-36:

#  [These] operators are analogous to the relational operators except
#  for their lower precedence.  Where the operands have types and values
#  suitable for the relational operators, the semantics detailed in
#  Section 3.3.8 apply.
#
#  If two pointers to object or incomplete types are both null pointers,
#  they compare equal.  If two pointers to object or incomplete types
#  compare equal, they both are null pointers, or both point to the same
#  object, or both point one past the last element of the same array
#  object.  If two pointers to function types are both null pointers or
#  both point to the same function, they compare equal.  If two pointers
#  to function types compare equal, either both are null pointers, or
#  both point to the same function.

There is a footnote which is redundant.

Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
	A standard is established on sure bases, not capriciously but with
	the surety of something intentional and of a logic controlled by
	analysis and experiment. ... A standard is necessary for order
	in human effort.				-- Le Corbusier

turner@sdti.SDTI.COM (Prescott K. Turner) (03/28/89)

In article <375@sdti.SDTI.COM> turner@sdti.SDTI.COM (I) write:
>But STANDARD C PROVIDES NO WAY at all to tell if two pointers point to the
>same object!
On second thought, the correct interpretation of the proposed standard is
probably that comparison for equality/inequality will always work in
those situations where the other relational pointer comparisons are
guaranteed to work.

>Why is this dismal situation reflected in the pANS?
Since compilers for MS-DOS and other tricky architectures will compare
pointers correctly that point within the same aggregate object, the
situation is not dismal.  Sorry for the flame.
--
Prescott K. Turner, Jr.
Software Development Technologies, Inc.
P.O. Box 366, Sudbury, MA 01776 USA         (508) 443-5779
UUCP: ...{harvard,mit-eddie}!sdti!turner    Internet: turner@sdti.sdti.com

turner@sdti.SDTI.COM (Prescott K. Turner) (03/28/89)

In article <9930@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>That's all wrong.  Pointers to the same object compare equal ...
Yes, I was off base in my earlier article.  But doesn't the standard permit
odd behavior in a case like the following:
    char a[LIMIT];
    char b[LIMIT];
    ...
    if (a+LIMIT == b) ...
It appears that a+LIMIT may compare equal to b, in which case
3.3.9 says they point to the same object (b[0]) -- this is what one would
expect if the hardware had a simple address space and the allocation happened
a certain way.  But doesn't the standard permit the behavior which happens
with large model compilers for MS-DOS, in which they point to the same object
and compare not equal?  I guess you're implying that the MS-DOS behavior
is covered by saying that a+LIMIT does not point to an object.

>(Except that certain operations not permitted of conforming applications
>might produce pointers that accidentally refer to the same location; but
>since that would be totally erroneous code, you shouldn't worry about it.)
The discussion in comp.lang.c arose from a question of how to detect erroneous
code, i.e. how to verify that a pointer argument points into a particular
array rather than contains garbage.  Erroneous code is definitely a concern.

It seems that the solution proposed (by Karl Heuer I think) which
compares the argument for equality against every element of the array would
do the job (inefficiently).  If the pointer argument "accidentally" pointed
into the array, it might or might not be caught.  But any pointer which
passes the test does point into the array.

chip@ateng.ateng.com (Chip Salzenberg) (03/29/89)

Unfortunately, I apparently did not make myself clear when writing about
pointer comparison under Microsoft C.  Prescott Turner misunderstood me,
posting his expressions of disbelief.  For reference, I wrote in part:

> For totally correct comparisons of all pointers, it's necessary
> to normalize them by hand, or be sure that they are cast to "huge *" when
> any pointer arithmetic is done.  Otherwise, the different combinations of
> segment+offset that actually refer to the same address do not compare equal.

In the context of the quoted article, "all pointers" includes pointers
constructed *by hand*, i.e. non-portably.  If all pointers in a Microsoft C
program are generated in the normal C fashion, that is, by use of the "&"
operator and pointer arithmetic, the "==" and "!=" operators work as
expected.  No need to worry, Prescott.

Incidentally, these comments apply equally to Turbo C.
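
To show why hand-built pointers can alias, here is a rough model of 8086
real-mode segment:offset arithmetic in portable C.  The struct and
function names are hypothetical, and this only simulates the address
calculation (segment * 16 + offset, wrapped to 20 bits); it is not a
description of how Microsoft C or Turbo C represent or compare far and
huge pointers internally.

```c
#include <stdint.h>

/* Hypothetical model of a real-mode segment:offset pair. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* 20-bit linear address: segment * 16 + offset, wrapped at 1 MB. */
uint32_t far_linear(struct far_ptr p)
{
    return (((uint32_t)p.seg << 4) + p.off) & 0xFFFFFUL;
}

/* Normalize so that the offset is in 0..15; each linear address then
   has exactly one representation. */
struct far_ptr far_normalize(struct far_ptr p)
{
    uint32_t lin = far_linear(p);
    struct far_ptr n;
    n.seg = (uint16_t)(lin >> 4);
    n.off = (uint16_t)(lin & 0xFU);
    return n;
}

/* Field-by-field comparison, with no normalization. */
int far_equal_raw(struct far_ptr a, struct far_ptr b)
{
    return a.seg == b.seg && a.off == b.off;
}

/* Comparison after normalizing both operands. */
int far_equal_normalized(struct far_ptr a, struct far_ptr b)
{
    return far_equal_raw(far_normalize(a), far_normalize(b));
}
```

In this model 1234:0010 and 1235:0000 name the same linear address, so the
raw comparison reports them unequal while the normalized one reports them
equal.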

On the other hand, Doug Gwyn comments:

>(Except that certain operations not permitted of conforming applications
>might produce pointers that accidentally refer to the same location; but
>since that would be totally erroneous code, you shouldn't worry about it.)

I beg to disagree.  Non-portable is not the same as erroneous, at least in
the real world of applications programming.  (I'm sure Doug knows what I
mean here, even if he avoids such situations wherever possible, as I do.)

After all, if a given algorithm is non-portable and is therefore surrounded
by "#ifdef MSDOS"/"#endif", why bother trying to make its implementation
portable?

For an example:  Name a portable way to get the complete shift state of a
keyboard.  No, wait, that's too hard.  Name a portable way to *express* the
complete shift state of a keyboard.  Code involving such an entity is non-
portable from the get-go; so there's no reason to avoid non-portable coding
in its implementation.
-- 
Chip Salzenberg             <chip@ateng.com> or <uunet!ateng!chip>
A T Engineering             Me?  Speak for my company?  Surely you jest!
	  "It's no good.  They're tapping the lines."

gwyn@smoke.BRL.MIL (Doug Gwyn) (03/30/89)

In article <377@sdti.SDTI.COM> turner@sdti.UUCP (0006-Prescott K. Turner, Jr.) writes:
>... doesn't the standard permit odd behavior in a case like the following:
>    char a[LIMIT];
>    char b[LIMIT];
>    ...
>    if (a+LIMIT == b) ...
>It appears that a+LIMIT may compare equal to b, in which case 3.3.9

Actually, 3.3.8.

>says they point to the same object (b[0])

or one past the last element of the same array object (a).  These are
indistinguishable possibilities in this example, but that's not
considered to pose a practical problem.

>But doesn't the standard permit the behavior which happens with large
>model compilers for MS-DOS, in which they point to the same object and
>compare not equal?  I guess you're implying that the MS-DOS behavior
>is covered by saying that a+LIMIT does not point to an object.

`a+LIMIT' is not guaranteed to point to an object, although it might
do so "accidentally" as in this example.  There are a number of ways
I might construct a pointer to a valid object, many of them quite
implementation-dependent.  The pANS requires that all "valid" (i.e.
portable) pointer operations that are guaranteed by the standard to
produce pointers to the same object (or one past the last element of
an array), will produce pointers that compare equal.  The important
point is that valid pointer operations that could produce equal
pointers (depending on run-time variables) but happen to have
produced unequal pointers are definitely guaranteed not to refer to
the same object, so that for example simple comparison of links in a
circular list can be used to determine when the list has been
traversed. If MS-DOS implementations can, through valid pointer
operations, produce pointers to the same object that might compare
unequal, then the standard requires some form of "normalization" be
performed before the comparison, in order to assure that the pointers
do compare equal.  This property is considered essential for reliable
C programming.
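
The circular-list case can be sketched as follows.  The node type and
function name are hypothetical; the only point is that "p != start" is
relied on to mean "p is a different node":

```c
#include <stddef.h>

/* Hypothetical node of a singly linked circular list. */
struct node {
    int value;
    struct node *next;
};

/* Counts the nodes in a circular list by walking the links until the
   starting node is reached again.  Returns 0 for a null list. */
size_t ring_length(const struct node *start)
{
    const struct node *p;
    size_t n;

    if (start == NULL)
        return 0;
    n = 1;
    for (p = start->next; p != start; p = p->next)
        n++;
    return n;
}
```

If two valid pointers to the same node could compare unequal, this loop
would never terminate.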

>Erroneous code is definitely a concern.

It is impractical for the standard to attempt to constrain the
behavior of code that does not obey the constraints of the standard.
Think about it.

If a pointer has been produced via valid operations, it may be tested
against the addresses of each element of an array object to tell
whether or not the pointer "lies within" the array.  Comparison with
just the first and last (plus one?) elements of the array is not
sufficient, because "normalization" is not required in such a case.
As I recall, the committee didn't think < or > between different
array object pointers was of sufficient practical importance to
require the extra normalization operations.
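
A minimal sketch of the element-by-element test, assuming the pointer was
produced by valid operations.  The function name is hypothetical.  Only
"==" is used, so no relational comparison between unrelated objects is
ever made:

```c
#include <stddef.h>

/* Returns 1 if p equals the address of one of the n elements of the
   array starting at base, and 0 otherwise.  Runs in O(n) time. */
int points_into(const char *p, const char *base, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (p == base + i)
            return 1;
    }
    return 0;
}
```

The tempting shortcut "base <= p && p < base + n" is what this loop
replaces: it applies "<" to pointers that may belong to different
objects, which the standard leaves undefined.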

Frankly, I have to wonder how anyone could let their code get so far
out of hand as to not know what their pointers are pointing to.  The
only practical application for this question I've been able to think
of is to determine whether it is possible to implement memmove() in
portable ANSI C.  If segmented implementations normalize pointers when
they are passed as function arguments, there should be no problem,
otherwise memmove() requires an implementation-dependent solution.
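
One possible portable approach, offered only as a sketch: decide the copy
direction with equality tests alone, at the cost of an extra O(n) scan.
The name portable_memmove is hypothetical, and this is not claimed to be
how any real library implements memmove():

```c
#include <stddef.h>

/* Copies n bytes from src to dst, allowing the regions to overlap.
   The overlap check uses only "==", never "<" or ">". */
void *portable_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;
    int dst_inside_src = 0;

    /* Does dst coincide with one of the source bytes s[1] .. s[n-1]? */
    for (i = 1; i < n; i++) {
        if (d == s + i) {
            dst_inside_src = 1;
            break;
        }
    }

    if (dst_inside_src) {
        /* dst starts inside the source: copy from the end backward so
           no source byte is overwritten before it has been read. */
        for (i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else {
        /* Disjoint, identical, or dst before src: forward copy is safe. */
        for (i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

This relies on the guarantee discussed above, that validly derived
pointers to the same byte compare equal.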

flaps@dgp.toronto.edu (Alan J Rosenthal) (04/04/89)

chip@ateng.ateng.com (Chip Salzenberg) writes:
>After all, if a given algorithm is non-portable and is therefore surrounded
>by "#ifdef MSDOS"/"#endif", why bother trying to make its implementation
>portable?
>
>For an example:  Name a portable way to get the complete shift state of a
>keyboard.  No, wait, that's too hard.  Name a portable way to *express* the
>complete shift state of a keyboard.  Code involving such an entity is non-
>portable from the get-go; so there's no reason to avoid non-portable coding
>in its implementation.

Wellll, it's true that it's not worthwhile to squeeze the last bit of annoying
unportable code out, if there is such a last bit.  However, I believe that most
of the guidelines for portable programming are also good guidelines for robust
programming, and in particular even though something may be intrinsically
ms-dos dependent it might not be ms-dos v3.3.1 dependent, and more portable
programming techniques can make your code more likely to survive an operating
system upgrade.  (Or compiler upgrade (or change), for that matter.)

ajr

--
"The goto statement has been the focus of much of this controversy."
	    -- Aho & Ullman, Principles of Compiler Design, A-W 1977, page 54.