egisin@mks.UUCP (Eric Gisin) (12/09/88)
How come I can't find a compiler that generates correct code for
pointer subtraction in C on 8086s?  Neither Turbo, Microsoft, nor
Watcom does it right.  Here's an example:

struct six {
	int i[3];
};

int diff(struct six far* p, struct six far* q) {
	return p - q;
}

main(void) {
	struct six s[1];
	printf("%d\n", diff(s+10000, s)); /* 10000 */
	printf("%d\n", diff(s, s+100)); /* -100 */
}

All of the compilers I tried computed a 16-bit difference, then sign
extended it before dividing.  This does not work if the pointers
differ by more than 32K.  The proper method is to propagate the borrow
flag into the high-order 16 bits, like this:

	mov	ax,WORD PTR [bp+4]	;p
	sub	ax,WORD PTR [bp+8]	;q
	sbb	dx,dx			; NOT cwd !!!
	mov	cx,6
	idiv	cx

Now I have to manually grep through a thousand lines of code looking
for any pointer subtractions to fix a bug.
adtaiwo@athena.mit.edu (Ademola Taiwo) (12/10/88)
Hi,
	You should read your reference manual; (Turbo-C) specifically says
that there are no denormalizations done in all memory models except HUGE,
so you may want to use the Huge model and you are guaranteed 32-bit
computations on your pointers, including proper denormals.

	I think the compilers are right in generating 16-bit code for all
other memory models, since you have been warned about the trade-off of
speed/space that you are making by choosing any other model but huge.

	On the same note, pointer comparisons are not guaranteed to be
correct in any model but huge.  So if you want to fool around with very
big arrays, turn on the HUGE flag.
ralf@b.gp.cs.cmu.edu (Ralf Brown) (12/10/88)
In article <597@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
-How come I can't find a compiler that generates correct
-code for pointer subtraction in C on 8086s?
-Neither Turbo, Microsoft, or Watcom do it right.
-Here's an example:
-
-struct six {
- int i[3];
-};
-int diff(struct six far* p, struct six far* q) {
- return p - q;
-}
-main(void) {
- struct six s[1];
- printf("%d\n", diff(s+10000, s)); /* 10000 */
- printf("%d\n", diff(s, s+100)); /* -100 */
-}
-
-All of the compilers I tried computed a 16 bit difference,
-then sign extended it before dividing.
-This does not work if the pointers differ by more than 32K.
In addition to HUGE model, which someone already pointed out, I would like to
mention that K&R only guarantees valid results for pointer subtractions
between pointers to the SAME array. In non-HUGE models, the largest object
is 64K, so for multi-byte data types, the result is correct (since they can't
differ by more than 32K objects). For char arrays, you would have an
ambiguity as to which pointer points lower in the array.
--
{harvard,uunet,ucbvax}!b.gp.cs.cmu.edu!ralf -=-=- AT&T: (412)268-3053 (school)
ARPA: RALF@B.GP.CS.CMU.EDU |"Tolerance means excusing the mistakes others make.
FIDO: Ralf Brown at 129/31 | Tact means not noticing them." --Arthur Schnitzler
BITnet: RALF%B.GP.CS.CMU.EDU@CMUCCVMA -=-=- DISCLAIMER? I claimed something?
few@quad1.quad.com (Frank Whaley) (12/11/88)
In article <597@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
>How come I can't find a compiler that generates correct
>code for pointer subtraction in C on 8086s?
>struct six {
>	int i[3];
>};
>int diff(struct six far* p, struct six far* q) {
>	return p - q;
>}

Yet another subtlety of segmented architectures...

Use "huge" instead of "far" -- and return a "long" instead of an "int".
Both "far" and "huge" imply 32-bit values, but since most programs work
with <64K objects, the optimization has been taken towards 16-bit
arithmetic (offset only).

Note that using "huge model" does not make pointers "huge" -- they must
be individually specified (in Turbo C at least).  I prefer Lattice's
method of handling this via a command line switch, rather than by
requiring source changes.

>The proper method is to propagate the borrow flag into the high
>order 16 bits, like this:
>	mov	ax,WORD PTR [bp+4]	;p
>	sub	ax,WORD PTR [bp+8]	;q
>	sbb	dx,dx			; NOT cwd !!!
>	mov	cx,6
>	idiv	cx

Actually, the correct method is to convert both pointers to 20-bit
absolute addresses, and then perform the arithmetic.  Most compilers
provide subroutines for this, as it's a bit much for inline code.
--
Frank Whaley
Senior Development Engineer
Quadratron Systems Incorporated
few@quad1.quad.com

Water separates the people of the world;
Wine unites them.
sch@sequent.UUCP (Steve Hemminger) (12/12/88)
I believe the ANSI std and K&R say pointer subtraction is only allowed
inside one data structure/array.  Since the small/medium/large models
only allow arrays <64K, only a 16-bit result needs to be computed.  Any
code that depends on subtracting pointers into two totally separate
arrays is non-portable.
scjones@sdrc.UUCP (Larry Jones) (12/13/88)
In article <8455@sequent.UUCP>, sch@sequent.UUCP (Steve Hemminger) writes:
> I believe the ANSI std and K&R say pointer subtraction is only allowed inside
> one data structure/array.  Since the small/medium/large model only allows
> arrays <64K, only a 16bit result needs to be computed.  Any code that
> depends on subtracting pointers into two totally separate arrays is
> non-portable.

But even then you need a **17** bit (intermediate) result, since
intermediate results between -65534 and +65534 are possible (assuming
the worst case -- an array of 2-byte objects).  Since dpANS treats the
element just past the end of the array as valid for pointer arithmetic,
17 bits really only allows objects to be 64K - 1 bytes long.

Huge model is the only one that supports 64K objects completely.  All
the others only support 32767.  Compiler vendors who tell you otherwise
are lying (although not necessarily intentionally).
----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc.uucp
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
"Save the Quayles" - Mark Russell
egisin@mks.UUCP (Eric Gisin) (12/13/88)
In article <8377@bloom-beacon.MIT.EDU>, adtaiwo@athena.mit.edu (Ademola Taiwo) writes:
> Hi,
> 	You should read your reference manual; (Turbo-C) specifically says
> that there are no denormalizations done in all memory models except HUGE, so
> you may want to use the Huge model and you are guaranteed 32-bit computations
> on your pointers, including proper denormals.

You don't know what you are talking about.  My example deals with
pointers into an object of size 32K to 64K, which near * and far * are
claimed to work with.  Huge * is for objects larger than 64K.

My example program may be misleading because it declares diff() as
long, I really only expect an int.  An int is sufficient unless you
are comparing char *'s, but even then if you cast the difference to
size_t you get the correct value when p > q.  Even that should be
unnecessary; I thought ANSI added the ptrdiff_t type specifically for
8086 brain-damage.  It only takes "sbb dx,dx" to widen the difference
of two char *'s to the correct long value, and that can be optimized
away when a long value is not required.

> 	I think the compilers are right in generating 16bit code for all other
> memory models, since you have been warned about the trade-off of speed/space
> that you are making by choosing any other model but huge.

What?  My fix doesn't cost anything, it changes one instruction.  If
the compiler optimizes the divide into an arithmetic shift, then you do
have to add one instruction.

> 	On the same note, pointer comparisons are not guaranteed to be correct
> in any model but huge.  So if you want to fool around with very big arrays,
> turn on the HUGE flag.

They are guaranteed when the operands point to components of the same
object, as was the case in my example.
egisin@mks.UUCP (Eric Gisin) (12/14/88)
In article <8455@sequent.UUCP>, sch@sequent.UUCP (Steve Hemminger) writes:
> I believe the ANSI std and K&R say pointer subtraction is only allowed inside
> one data structure/array.  Since the small/medium/large model only allows
> arrays <64K, only a 16bit result needs to be computed.  Any code that
> depends on subtracting pointers into two totally separate arrays is
> non-portable.

Please read my code.  I have two pointers to "s", a single object.
Declaring "s" with 1 or 10000 makes no difference.  And the pointers
are separated by 60000, which is less than 64K.

struct six {	/* HINT: this struct is 6 bytes long */
	int i[3];
};

int diff(struct six far* p, struct six far* q) {
	return p - q;
}

main(void) {
	struct six s[1];
	printf("%d\n", diff(s+10000, s)); /* 10000 */
	printf("%d\n", diff(s, s+100)); /* -100 */
}
kim@msn034.misemi (Kim Letkeman) (12/14/88)
In article <8455@sequent.UUCP>, sch@sequent.UUCP (Steve Hemminger) writes:
> I believe the ANSI std and K&R say pointer subtraction is only allowed inside
> one data structure/array.  Since the small/medium/large model only allows
> arrays <64K, only a 16bit result needs to be computed.  Any code that
> depends on subtracting pointers into two totally seperate arrays is
> non-portable.

K&R A6.6 (Pointers and Integers) and A7.7 (Additive Operators) state
that adding or subtracting pointers (with integers, or with other
pointers within the same array) gives you the displacement, as in a
relative record number: for adjacent elements, ptr(obj2)-ptr(obj1) is
always 1.  If you want to get a physical (byte) offset, you would
likely have to cast to int or long, which would definitely be
non-portable.

Small data models (tiny, small, medium) use 16-bit pointers as offsets
from DS.  Large data models (compact, large, huge) use 32-bit pointers
consisting of the segment and offset.  The huge model is special in
that all pointer references call routines that normalize the result
(smallest possible segment value).  This allows pointer
addition/subtraction without worrying about segment problems.

The small data models and the huge model appear to be the only really
safe models for heavy pointer arithmetic where byte displacements are
being calculated and used.

The language used in the above comment is a bit loose and ambiguous.
You should tighten it up a bit to avoid misunderstandings.

Kim
davidsen@steinmetz.ge.com (William E. Davidsen Jr) (12/14/88)
In article <597@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
| How come I can't find a compiler that generates correct
| code for pointer subtraction in C on 8086s?

See below.  I think you're doing it wrong...

int pdiff(ptr1, ptr2)
struct six *ptr1, *ptr2;
{
	struct six huge *hptr1 = ptr1;
	struct six huge *hptr2 = ptr2;
	return hptr2 - hptr1;
}
--
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
chasm@killer.DALLAS.TX.US (Charles Marslett) (12/14/88)
In article <3845@pt.cs.cmu.edu>, ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
=> In article <597@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
=> -How come I can't find a compiler that generates correct
=> -code for pointer subtraction in C on 8086s?
=> -Neither Turbo, Microsoft, or Watcom do it right.
=> -Here's an example:
=> -
=> -struct six {
=> -	int i[3];
=> -};
=> -int diff(struct six far* p, struct six far* q) {
=> -	return p - q;
=> -}
=> -main(void) {
=> -	struct six s[1];
=> -	printf("%d\n", diff(s+10000, s)); /* 10000 */
=> -	printf("%d\n", diff(s, s+100)); /* -100 */
=> -}
=> -
=> -All of the compilers I tried computed a 16 bit difference,
=> -then sign extended it before dividing.
=> -This does not work if the pointers differ by more than 32K.
=>
=> In addition to HUGE model, which someone already pointed out, I would like to
=> mention that K&R only guarantees valid results for pointer subtractions
=> between pointers to the SAME array.  In non-HUGE models, the largest object
=> is 64K, so for multi-byte data types, the result is correct (since they can't
=> differ by more than 32K objects).  For char arrays, you would have an
=> ambiguity as to which pointer points lower in the array.

I must disagree: the error occurs if the pointers differ by 32K bytes
-- it is independent of the size of an element.  The result is that if
you use pointer subtraction in your code, you must be K&R *AND* the
pointers must be fewer than 32K bytes apart.

I finally wrote an assembly routine that takes two pointers and an
element size and returns the difference (as an integer).  That was
just because the compiler writers were all too lazy to do the code
generation correctly!

By the way, MSC 5.1 does it wrong in small model too!

Charles Marslett
chasm@killer.dallas.tx.us
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (12/15/88)
In article <3845@pt.cs.cmu.edu> ralf@b.gp.cs.cmu.edu (Ralf Brown) writes:
>In article <597@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
>-How come I can't find a compiler that generates correct
>-code for pointer subtraction in C on 8086s?
>-Neither Turbo, Microsoft, or Watcom do it right.
[stuff deleted]
>In addition to HUGE model, which someone already pointed out, I would like to
>mention that K&R only guarantees valid results for pointer subtractions
>between pointers to the SAME array.

The same error occurs in the following program (with Turbo C 2.0 as
well as MSC 5.0):

main()
{
	static int a[30000];
	printf("%d\n",&a[30000]-a);
}

output: -2768

This seems perfectly legal according to either K&R or the ANSI draft,
so I think this is a bug.

Tapani Tarvainen
------------------------------------------------------------------
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi
BitNet:    tarvainen@finjyu
bobmon@iuvax.cs.indiana.edu (RAMontante) (12/15/88)
Referring to Eric Gisin's code (appended):

Although "diff()" is being called with two pointers to the same object,
it doesn't know that.  In general it knows only that it's getting two
pointers to a struct "six".  Since the pointers are passed by value,
they may as well be pointers into distinct objects; and they certainly
aren't guaranteed to be represented with identical segments.  (Again,
in general there's no assurance that they _could_ be.)

My TurboC manual indicates that, except in huge model, pointers are
tested for inequality by using only the offsets (16 bits).  Inequality
tests often look like subtraction, so I wouldn't be surprised if
pointer subtraction also used only offsets.  (Huge-pointer comparisons
use the full address, in normalized form.)

So I agree with Bill Davidsen that the rigorous way to do this is to
cast the pointers to huge, and subtract the casts.  Viz.,

int diff(struct six far *p, struct six far *q)
{
	return (int)( (struct six huge *)p - (struct six huge *)q );
}

Beauty, eh?
________________
In article <600@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
+
+Please read my code.  I have two pointers to "s", a single object.
+Declaring "s" with 1 or 10000 makes no difference.
+And the pointers are separated by 60000, which is less than 64K.
+
+struct six {	/* HINT: this struct is 6 bytes long */
+	int i[3];
+};
+
+int diff(struct six far* p, struct six far* q) {
+	return p - q;
+}
+
+main(void) {
+	struct six s[1];
+	printf("%d\n", diff(s+10000, s)); /* 10000 */
+	printf("%d\n", diff(s, s+100)); /* -100 */
+}
earleh@eleazar.dartmouth.edu (Earle R. Horton) (12/15/88)
In article <15813@iuvax.cs.indiana.edu> bobmon@iuvax.UUCP (RAMontante) writes:
...
>So I agree with Bill Davidsen that the rigorous way to do this is to
>cast the pointers to huge, and subtract the casts.  Viz.,
...

I don't think anyone has suggested this, but perhaps casting to long
might be a better way to do this, and portable, too!  I don't do much
8086 stuff, but a long int is the same size as a long pointer on just
about every system I have run across.  Subtraction of longs is never a
questionable operation, either, whereas pointer subtraction sometimes
is...

Earle R. Horton.  23 Fletcher Circle, Hanover, NH 03755
(603) 643-4109.  Graduate student.
carlp@iscuva.ISCS.COM (Carl Paukstis) (12/16/88)
In article <600@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
.>In article <8455@sequent.UUCP>, sch@sequent.UUCP (Steve Hemminger) writes:
.>> I believe the ANSI std and K&R say pointer subtraction is only allowed inside
.>> one data structure/array.  Since the small/medium/large model only allows
.>> arrays <64K, only a 16bit result needs to be computed.  Any code that
.>> depends on subtracting pointers into two totally seperate arrays is
.>> non-portable.
.>
.>Please read my code.  I have two pointers to "s", a single object.
.>Declaring "s" with 1 or 10000 makes no difference.
.>And the pointers are separated by 60000, which is less than 64K.
.>
.>struct six {	/* HINT: this struct is 6 bytes long */
.>	int i[3];
.>};
.>
.>int diff(struct six far* p, struct six far* q) {
.>	return p - q;
.>}
.>
.>main(void) {
.>	struct six s[1];
.>	printf("%d\n", diff(s+10000, s)); /* 10000 */
.>	printf("%d\n", diff(s, s+100)); /* -100 */
.>}

Of course, the code you posted is NOT legal, since the two pointers in
the example *do not* point inside the same object.  You have verified
that the incorrect code is generated when you IN FACT declare "struct
six s[10000]"?  If so, it's a bona-fide bug.  But if it won't work with
your example, the worst conclusion you can directly draw is that your
example is "not conformant".
--
Carl Paukstis    +1 509 927 5600 x5321  |"The right to be heard does not
                                        | automatically include the right
UUCP:  carlp@iscuvc.ISCS.COM            | to be taken seriously."
       ...uunet!iscuva!carlp            |                  - H. H. Humphrey
egisin@mks.UUCP (Eric Gisin) (12/17/88)
In article <2237@iscuva.ISCS.COM>, carlp@iscuva.ISCS.COM (Carl Paukstis) writes:
< In article <600@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
< .>Please read my code.  I have two pointers to "s", a single object.
< .>Declaring "s" with 1 or 10000 makes no difference.
...
< Of course, the code you posted is NOT legal, since the two pointers in the
< example *do not* point inside the same object.  You have verified that the
< incorrect code is generated when you IN FACT declare "struct six s[10000]"?
< If so, it's a bona-fide bug.  But if it won't work with your example, the
< worst conclusion you can directly draw is that your example is "not
< conformant".

No, I DO NOT have to verify that it still generates incorrect code when
I declare "s" as s[10000].  "diff" is a global function, and could be
called from another module with a legal object.  A compiler with a dumb
linker cannot generate code for diff depending on how it is called in
that module.
chasm@killer.DALLAS.TX.US (Charles Marslett) (12/18/88)
In article <11477@dartvax.Dartmouth.EDU>, earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
> In article <15813@iuvax.cs.indiana.edu> bobmon@iuvax.UUCP (RAMontante) writes:
> ...
> >So I agree with Bill Davidsen that the rigorous way to do this is to
> >cast the pointers to huge, and subtract the casts.  Viz.,
> ...
>
> I don't think anyone has suggested this, but perhaps casting to long
> might be a better way to do this, and portable, too!  I don't do much
> 8086 stuff, but a long int is the same size as a long pointer on just
> about every system I have run across.  Subtraction of longs is never a
> questionable operation, either, whereas pointer subtraction sometimes
> is...

Of course, pointers are not portably convertible to any kind of scalar
(even longs), so the difference of two longs converted from two
pointers is not necessarily going to give you the number of bytes
separating the two addresses -- this is true of almost any
segmented-address machine, and specifically it is true of the
Microsoft and Borland C compilers.

The point is, subtraction of longs is valid; conversion of pointers to
longs (even if they are the same size) is not.  And subtraction of
pointers OUGHT TO BE portable (to reiterate my original statement, the
Microsoft and Borland compilers are broken since they do not work
"right").

I might add that both of these companies are in business to sell
software -- tell them you do not like this overemphasis on fast over
correct results and we may see improvement next time around!

Charles Marslett
chasm@killer.dallas.tx.us
bobmon@iuvax.cs.indiana.edu (RAMontante) (12/19/88)
chasm@killer.DALLAS.TX.US (Charles Marslett) writes:
>
> [...] And subtraction of
>pointers OUGHT TO BE portable (to reiterate my original statement, the
>Microsoft and Borland compilers are broken since they do not work "right").

I don't buy this; why do you say they don't work "right"?  Pointer
subtraction is only valid when the pointers are to the same array, and
TurboC (at least) is simply referencing off the DS register in small
memory models.  So the pointers may be restricted in range, but they
should give the correct number of ELEMENTS (not bytes, unless the
array elements are byte-sized).

Note that the original code in this thread made the mistaken assumption
that its "diff()" function was dealing in pointers that were guaranteed
to refer to the same array; in fact, they were independent pointers
which happened, in the example call, to have received values based on a
common array in the caller program.

Perhaps this discussion ought to move to comp.lang.c.
chasm@killer.DALLAS.TX.US (Charles Marslett) (12/19/88)
In article <15891@iuvax.cs.indiana.edu>, bobmon@iuvax.cs.indiana.edu (RAMontante) writes:
> chasm@killer.DALLAS.TX.US (Charles Marslett) writes:
> >
> > [...] And subtraction of
> >pointers OUGHT TO BE portable (to reiterate my original statement, the
> >Microsoft and Borland compilers are broken since they do not work "right").
>
> I don't buy this; why do you say they don't work "right"?  Pointer
> subtraction is only valid when the pointers are to the same array, and
> TurboC (at least) is simply referencing off the DS register in small
> memory models.  So the pointers may be restricted in range, but they

An array is a set of elements, each the same size and same structure
(this applies to all the languages I am aware of, not just C).  The
fact that the original posting was not absolutely correctly written
does not make what the poster said wrong (spelling errors are reason
for failure only in school).

The problem is not that pointers are restricted; the problem is that
the documentation says that small and compact model arrays cannot be
bigger than 64K (total size), and large and medium model arrays cannot
be bigger than 64K each -- and the limit for references of the form
"&array[k] - &array[m]" is 32K.  Either the compiler or the
documentation is broken!

> should give the correct number of ELEMENTS (not bytes, unless the array
> elements are byte-sized).

And the answer will be wrong if the values are further apart than 32K
bytes, independent of the size of the element and the absolute value
of the difference -- if an element is 20K long, and the array has 3
elements, &array[2] - &array[0] will give a NEGATIVE result *AND THAT
IS WRONG*.  I repeat: *THE COMPILER IS BROKEN*.

> Note that the original code in this thread made the mistaken assumption
> that its "diff()" function was dealing in pointers that were guaranteed
> to refer to the same array; in fact, they were independent pointers
> which happened, in the example call, to have received values based on a
> common array in the caller program.

I would say that this implies the compiler is broken if it does not
generate an error message, since either the operation is legal and
should generate a valid result, or it is illegal and should generate
an error message.

> Perhaps this discussion ought to move to comp.lang.c.

At last we agree!

Charles Marslett
chasm@killer.dallas.tx.us
carlp@iscuva.ISCS.COM (Carl Paukstis) (12/20/88)
(I apologize in advance for the length of this and for omitting
several comments by other folks.  I wanted to get the whole of the
main context since I am crossposting to comp.lang.c and redirecting
followups there.)

Eric Gisin at Mortice Kern Systems writes:
>
>How come I can't find a compiler that generates correct
>code for pointer subtraction in C on 8086s?
>Neither Turbo, Microsoft, or Watcom do it right.
>Here's an example:
>
>struct six {
>	int i[3];	/* six bytes, at least for MSC (comment by C.P.) */
>};
>
>int diff(struct six far* p, struct six far* q) {
>	return p - q;
>}
>
>main(void) {
>	struct six s[1];
>	printf("%d\n", diff(s+10000, s)); /* 10000 */
>	printf("%d\n", diff(s, s+100)); /* -100 */
>}
>
>All of the compilers I tried computed a 16 bit difference,
>then sign extended it before dividing.
>This does not work if the pointers differ by more than 32K.

(NOTE CRITICAL POINT FOR ERIC'S COMPLAINT: the difference between s
and s+10000 is 60,000 bytes -- easily less than the 64K segment
limit.)

Then I (Carl Paukstis) pick a nit and respond:

>Of course, the code you posted is NOT legal, since the two pointers in the
>example *do not* point inside the same object.  You have verified that the
>incorrect code is generated when you IN FACT declare "struct six s[10000]"?
>If so, it's a bona-fide bug.  But if it won't work with your example, the
>worst conclusion you can directly draw is that your example is "not
>conformant".

And Eric (thoroughly frustrated by Intel architecture by now) responds:

>Summary: Oh my god
>
>No, I DO NOT have to verify that it still generates incorrect code
>when I declare "s" as s[10000].  "diff" is a global function,
>and could be called from another module with a legal object.
>A compiler with a dumb linker cannot generate
>code for diff depending on how it is called in that module.

OK, I admit, I was picking a nit.  I stand by my original comment, but
please note that I wasn't claiming that it DID work, only that Eric's
posted code didn't PROVE it didn't work.  (In fact, of course, it does
NOT work, even if one defines s[10000].)

Anyway, I got interested and did some actual research (who, me?  find
facts before I post?  nontraditional for me, I admit) with Microsoft C
5.1.  I even went so far as to read the manual.

In chapter 6 (Working With Memory Models) of the _Microsoft C
Optimizing Compiler User's Guide_, I find table 6.1 (Addressing of
Code and Data Declared with near, far, and huge).  The row for "far",
column for "Pointer Arithmetic", says "Uses 16 bits".  Hmmm.  This is
consistent with Eric's results, if a tad ambiguous -- they only use 16
bits for ALL the arithmetic, including the (required) signedness of
the address difference.

I also find in section 6.3.5 (Creating Huge-Model Programs) the
following paragraph:

    "Similarly, the C language defines the result of subtracting two
    pointers as an _int_ value.  When subtracting two huge pointers,
    however, the result may be a _long int_ value.  The Microsoft C
    Optimizing Compiler gives the correct result when a type cast like
    the following is used:

        (long)(huge_ptr1 - huge_ptr2)"

So, I altered the "diff()" function as follows:

long diff(struct six huge* p, struct six huge* q) {
	return (long)(p - q);
}

(and changed the printf() format specs to "%ld") and left the rest of
the program exactly as given in the original post.  No great surprise,
it works fine.  Compile with any memory model
{small|medium|compact|large|huge} and it still works.  Split the code
so that diff() is defined in a different source file (but put a
prototype declaration in the file with main()) and it still works.
The prototype gets arguments promoted to a type in which MSC is
capable of doing correct pointer subtraction.  The cast of the return
type is apparently necessary for subtracting "huge" pointers -- even
in "huge" model.  If one removes the model designators (far or huge)
from the prototype for diff() and compiles in "huge" model, it is
still necessary to cast and return a long.

My code presented above seems a complete and fairly non-intrusive
solution; it works for any compilation memory model.  Eric: does this
prototype declaration and return type satisfy your needs to avoid
grepping through thousands of lines of code and changing same?

Gentlepersons all: is this about the best job Microsoft could have
done, given the wonderfulness of Intel segmented address space?

What's the moral of the story?  (nb: not necessarily intended for
Eric, who I'm sure is aware of all this):

1) Examine the manuals for odd requirements of your target environment
   with utmost care.  Experiment in this environment.
2) Use prototypes whenever you have a compiler that supports them.
   They can be a BIG help in odd situations like this.
3) Avoid huge (in the abstract, not Intel/Microsoft, sense) data
   objects whenever possible.
4) Avoid Intel-based systems whenever possible :-)
5) all of the above
6) 1) and 2)
7) Isn't this article too damned long already?  Shaddup!
--
Carl Paukstis    +1 509 927 5600 x5321  |"The right to be heard does not
                                        | automatically include the right
UUCP:  carlp@iscuvc.ISCS.COM            | to be taken seriously."
       ...uunet!iscuva!carlp            |                  - H. H. Humphrey
stuart@bms-at.UUCP (Stuart Gathman) (12/30/88)
In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:

> The same error occurs in the following program
> (with Turbo C 2.0 as well as MSC 5.0):
>
> main()
> {
>	static int a[30000];
>	printf("%d\n",&a[30000]-a);
> }
>
> output: -2768

This is entirely correct.  The difference of two pointers is an *int*.
If you want an unsigned difference, you need to cast to unsigned
(and/or use %u in the printf).  If the difference were defined as
unsigned, how would you indicate negative differences?  If you make
the difference long, all the related arithmetic gets promoted also,
for a big performance hit.  The solution is simple: if you want an
unsigned ptrdiff, cast or assign to unsigned.

This is described in the Turbo C manual.

Don't flame the 8086 either.  The same thing happens on 32-bit
machines (just much less often).  16 bits is 16 bits, and segments are
not the problem.  The VAX restricts user programs to a 31-bit address
space to avoid this.
--
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|daitc}!bms-at!stuart>
chasm@killer.DALLAS.TX.US (Charles Marslett) (12/31/88)
In article <142@bms-at.UUCP>, stuart@bms-at.UUCP (Stuart Gathman) writes: > In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes: > > > The same error occurs in the following program > > (with Turbo C 2.0 as well as MSC 5.0): > > > main() > > { > > static int a[30000]; > > printf("%d\n",&a[30000]-a); > > } > > > output: -2768 > > This is entirely correct. The difference of two pointers is an *int*. And unless you have a 15-bit computer, 30000 is a very representable *INT*, so please pay attention to the discussion before asserting something. The compiler is generating a VERY WRONG ANSWER. > If you want an unsigned difference, you need to cast to unsigned > (and/or use %u in the printf). If the difference were defined as > unsigned, how would you indicate negative differences? If you > make the difference long, all the related arithmetic gets promoted > also for a big performance hit. The solution is simple, if you > want an unsigned ptrdiff, cast or assign to unsigned. The result cast to an unsigned is 62768, still not even close to the correct value of 30000. There are two viable solutions: you can write your own assembly language (or C code, even) to calculate the proper result or you can ignore the issue and assume the size of a segment on the Intel architecture is 32K. I have used both solutions. > > This is described in the Turbo C manual. Unfortunately, the Turbo C manual lies (it does not identify all the cases where the compiler gets it wrong -- in fact it looks very much like the Microsoft C compiler documentation, all the same errors, did someone copy? ;^). > Don't flame the 8086 either. The same thing happens in 32-bit machines > (just much less often). 16 bits is 16 bits, and segments are not > the problem. The VAX restricts user programs to 31-bit address space > to avoid this. 
Actually, in a 32-bit machine the problem is probably more serious if we
assume a real 32-bit address, since it may well not support 33+ bit
arithmetic even as well as Intel boxes support 17+ bit arithmetic.

> --
> Stuart D. Gathman <stuart@bms-at.uucp>
> <..!{vrdxhq|daitc}!bms-at!stuart>

Charles Marslett
chasm@killer.dallas.tx.us
pinkas@hobbit.intel.com (Israel Pinkas ~) (01/04/89)
Please note: I am on my own here.  I work for Intel, but do not speak for them.

In article <6604@killer.DALLAS.TX.US> chasm@killer.DALLAS.TX.US (Charles Marslett) writes:
> In article <142@bms-at.UUCP>, stuart@bms-at.UUCP (Stuart Gathman) writes:
> > In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
> > >
> > > The same error occurs in the following program
> > > (with Turbo C 2.0 as well as MSC 5.0):
> > >
> > > main()
> > > {
> > >     static int a[30000];
> > >     printf("%d\n",&a[30000]-a);
> > > }
> > >
> > > output: -2768
> >
> > This is entirely correct.  The difference of two pointers is an *int*.
>
> And unless you have a 15-bit computer, 30000 is a very representable *INT*,
> so please pay attention to the discussion before asserting something.  The
> compiler is generating a VERY WRONG ANSWER.

The compiler is generating a correct answer.  There is an overflow in
there.  Remember, on a PC, ints are two bytes.

Let's ignore the fact that there is no a[30000] (and that taking its
address is invalid).  a[30000] is offset from a[0] by 60,000 bytes.  The
normal code for pointer subtraction is to subtract the pointers and divide
by the size of the object.  Since many objects have a size that is a power
of two, the compiler can often optimize by using a shift.  However, since
the difference is stored in an int, 60,000 is taken to be a negative number
(-5536).  Dividing this by two (the size of int) gives -2768.

Note that there are similar problems when adding an int to a pointer.  They
just don't show up as often, as the pointer is treated specially.

There is no real solution for this.  You might get the desired result with
the following code:

main()
{
    int a[30000];
    printf("%ld\n", (long) (&a[30000] - a));
}

Of course, you could always just subtract the two indices.  You could also
convert the two pointers to (char *), subtract and convert to unsigned, and
divide by sizeof(object).
> > If you want an unsigned difference, you need to cast to unsigned
> > (and/or use %u in the printf).  If the difference were defined as
> > unsigned, how would you indicate negative differences?  If you
> > make the difference long, all the related arithmetic gets promoted
> > also for a big performance hit.  The solution is simple, if you
> > want an unsigned ptrdiff, cast or assign to unsigned.
>
> The result cast to an unsigned is 62768, still not even close to the
> correct value of 30000.  There are two viable solutions: you can write
> your own assembly language (or C code, even) to calculate the proper result
> or you can ignore the issue and assume the size of a segment on the Intel
> architecture is 32K.  I have used both solutions.

Treating it as an unsigned is wrong, as the next time through you might
want to know a - &a[30000].

This problem is not inherent in the fact that the 80x86 uses segments.  It
is a result of the fact that sizeof(int *) > sizeof(int).  (Actually, the
size of the largest pointer type.)  Since the only class of compilers that
have this problem are the MS-DOS compilers, this is why people blame the
issue on Intel.

The only limitation that the 8088/8086 segments impose is with respect to
the size of an object, either code or data.  It takes more effort to
manipulate an object that is greater than 64K, and a compiler would have to
be very intelligent to generate code for a single procedure that was >64K.

> > Don't flame the 8086 either.  The same thing happens in 32-bit machines
> > (just much less often).  16 bits is 16 bits, and segments are not
> > the problem.  The VAX restricts user programs to 31-bit address space
> > to avoid this.
>
> Actually, in a 32-bit machine the problem is probably more serious if
> we assume a real 32-bit address, since it may well not support 33+ bit
> arithmetic even as well as Intel boxes support 17+ bit arithmetic.

On machines where sizeof(int) >= sizeof(int *), this is never a problem.
On the VAX, 68K, Sparc, 80386, and most other machines that I have worked with, ints are 32 bits. Since most machines do not have 2G of virtual memory, the issue never comes up. -Israel -- -------------------------------------- Disclaimer: The above are my personal opinions, and in no way represent the opinions of Intel Corporation. In no way should the above be taken to be a statement of Intel. UUCP: {amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!cad001!pinkas ARPA: pinkas%cad001.intel.com@relay.cs.net CSNET: pinkas@cad001.intel.com
wacey@paul.rutgers.edu ( ) (01/04/89)
sizeof(int) == 2, so 30000 * 2 = 60000, which is too large for a signed int.

iain wacey
ron@ron.rutgers.edu (Ron Natalie) (01/04/89)
> sizeof(int) == 2
> so 30000 * 2 =60000 which is to large for a signed int

Sizeof is irrelevant for pointer math.  For subtraction, the result is the
number of "elements", which would be 30000, not 30000 times sizeof int.
For addition, the result is a new pointer, incremented by the number of
elements given by the integer added to the pointer.
kyriazis@rpics (George Kyriazis) (01/04/89)
In article <Jan.3.12.44.56.1989.4839@ron.rutgers.edu> ron@ron.rutgers.edu (Ron Natalie) writes:
>> sizeof(int) == 2
>> so 30000 * 2 =60000 which is to large for a signed int
>
>Sizeof is irrelevent for pointer math.  For subtraction, The result is the
>number of "elements" which would be 30000, not 30000 times sizeof
>int.  For addition, the result is a new pointer, incremented by the
>number of elements of the integer added to the pointer.

Sizeof is irrelevant?  What about the following piece of code:

    int *a, b[30000];

    a = &b[20000];
    printf("%d\n", a-b);

Obviously when you do the subtraction ( a-b ), the compiler only knows
their addresses, not the number of elements they differ by.  So it HAS to
calculate the difference in BYTES and then divide by sizeof( int ).

In the example in question ( &a[30000] - a ), the difference is 60000
bytes, which is -5536.  Divide it by sizeof(int)==2 and you get the
wonderful result: -2768.

I am not saying it is correct.  Obviously the compiler is doing a signed
division to get the result instead of the unsigned that it should do.
THAT is the bug.

George Kyriazis
kyriazis@turing.cs.rpi.edu
kyriazis@ss0.cicg.rpi.edu
------------------------------
ron@ron.rutgers.edu (Ron Natalie) (01/04/89)
>Obviously when you do the subtraction ( a-b ), the compiler only knows their
>adrresses, not the number of elements they differ.  So it HAS to calculate the
>diference in BYTES and then divide by sizeof( int ).

NO!  When I do the subtraction, the compiler is expected to do what is
necessary to find out what the number of elements between them is.  It is
not the case that this is always going to be "subtract two byte pointers
(which they aren't) and divide by sizeof".  Consider a machine where int
pointers are different than byte pointers.  You would certainly not want to
use the divide-case example.

Any compiler that gives the wrong answer for pointer subtraction when two
elements less than the maximum representable integer value apart (in the
same array) are subtracted has broken pointer subtraction.

-Ron
rap@olivej.olivetti.com (Robert A. Pease) (01/04/89)
>Sizeof is irrelevant?  What about the following piece of code:
>
> int *a,b[30000];
>
> a = &b[20000];
> printf("%d\n", a-b);
>
>Obviously when you do the subtraction ( a-b ), the compiler only knows their
>adrresses, not the number of elements they differ.  So it HAS to calculate the
>diference in BYTES and then divide by sizeof( int ).
>
>In example in question ( &a[30000] - a ), the difference in 60000 bytes, which
>is -5536.  Divide it by sizeof(int)==2 and you get the wonderful result: -2736.

In MSC these examples reduce to a constant and the compiler places the
values as immediate data in the instruction.  I make no claims as to how
the compiler arrives at these values.

>I am not saying it is correct.  Obviously the compiler is doing a signed
>division to get the result instead of the unsigned that it should do.
>THAT is the bug.

The code generated by MSC 5.1 is shown below.

    mov  WORD PTR _a,OFFSET DGROUP:_b+40000  ; a = &b[20000];
    mov  ax,WORD PTR _a                      ; a - b;
    sub  ax,OFFSET DGROUP:_b                 ;
    sar  ax,1                                ;
    mov  WORD PTR _diff,ax                   ;
    mov  WORD PTR _diff,-2768                ; diff = &b[30000] - b;

This doesn't really show much.  If the constants are replaced with
variables initialized to the same value, the code below is generated.

    ;|***
    mov  ax,_ind2                 ; a = &b[ind2];        /* ind2 = 20000 */
    shl  ax,1
    add  ax,OFFSET DGROUP:_b
    mov  WORD PTR _a,ax
    mov  ax,WORD PTR _a           ; diff = a - b;
    sub  ax,OFFSET DGROUP:_b
    sar  ax,1
    mov  WORD PTR _diff,ax
    mov  ax,_ind3                 ; diff = &b[ind3] - b; /* ind3 = 30000 */
    shl  ax,1
    add  ax,OFFSET DGROUP:_b
    sub  ax,OFFSET DGROUP:_b
    sar  ax,1
    mov  WORD PTR _diff,ax

Note that the code is the same for both examples except for the
intermediate storage.  Now, Intel's description of SHL is that it shifts 0
in on the right and if the sign bit changes then OF is set.  The
description for SAR is that it shifts the sign bit in on the left.  This
does begin to worry me a bit.
If I trace through the actual values with CodeView, I find that the value of the difference between pointers has changed sign between the SHL and SAR instructions for the values used (20000 and 30000). So, the bottom line is that when I expect the answer to be 20000 or 30000, it isn't and the reason it isn't is due to the SAR instruction shifting a sign bit into the left instead of a zero bit. This is totally contrary to what I set out to prove. Robert A. Pease {hplabs|fortune|microsoft|amdahal|piramid|tolerant|sun|aimes}!oliveb!rap
boyne@hplvli.HP.COM (Art Boyne) (01/05/89)
chasm@killer.DALLAS.TX.US (Charles Marslett) writes:
> In article <142@bms-at.UUCP>, stuart@bms-at.UUCP (Stuart Gathman) writes:
> > In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
> > >
> > > The same error occurs in the following program
> > > (with Turbo C 2.0 as well as MSC 5.0):
> > >
> > > main()
> > > {
> > >     static int a[30000];
> > >     printf("%d\n",&a[30000]-a);
> > > }
> > >
> > > output: -2768
> >
> > This is entirely correct.  The difference of two pointers is an *int*.
>
> And unless you have a 15-bit computer, 30000 is a very representable *INT*,
> so please pay attention to the discussion before asserting something.  The
> compiler is generating a VERY WRONG ANSWER.

Ok guys, just hold it a minute and think: it only hurts for a little while.
Let's look at what the compiler is trying to do:

    &a[30000]-a

where a is an integer array.

First, int's are 16-bit, i.e., 2 bytes/int.  So, a[30000] is located 60000
bytes away from a[0].  Ideally, the calculation should look like

    take address of a[30000]
    subtract address of a[0]
    divide by 2 to convert offset to integer pointer difference

But &a[30000]-a = 60000 bytes, too big for a 16-bit signed int.  Unsigned
60000 looks like -5536 signed.  Divided by 2 is -2768.  Voila.

You can argue that the compiler should use unsigned arithmetic for the
pointer calculation, or do a 32-bit calculation, but given the assumptions
the compiler operates under, the answer is reasonable (but not useful).

Art Boyne, boyne@hplvdz.HP.COM
kyriazis@rpics (George Kyriazis) (01/05/89)
>
>In MSC these examples reduce to a constant and the compiler
>places the values as immediate data in the instruction....
>

Ok, I give up.  I didn't know that the compiler produces a constant for it.
I just looked at the assembly code that a SUN4 produces and it also uses a
constant for it.  I guess compilers are smarter than I thought they are :-)

>>I am not saying it is correct.  Obviously the compiler is doing a signed
>>division to get the result instead of the unsigned that it should do.
>
>The code generated by MSC 5.1 is shown below.
>
> [ .. some additional code .. ]
>
>mov ax,_ind3   ; diff = &b[ind3] - b;  /* ind3 = 30000 */
>shl ax,1
>add ax,OFFSET DGROUP:_b
>sub ax,OFFSET DGROUP:_b
>sar ax,1
>mov WORD PTR _diff,ax
>

Ok.  The SHL is obviously a part of the array indexing (since an int is 2
bytes, multiplication by sizeof(int)==2 is a left shift by one place).  The
SAR is the bug I was talking about.  It should've been SLR (assuming that
SAR means Shift Arithmetic Right, SLR would mean Shift Logical Right),
shifting a 0 in from the left (if such an instruction exists).  Or they
could equally well put an AND ax,0x7fff (excuse me if my 8088 is wrong)
after the SAR.

Now excuse me about the blank lines, but I have to put some so rn accepts
the article.  Sorry :-)

George Kyriazis
kyriazis@turing.cs.rpi.edu
kyriazis@ss0.cicg.rpi.edu
------------------------------
rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) (01/05/89)
I really tried to stay out of this, but I want to clarify some things that people seem to be missing, as well as provide a counterexample to an assertion made. > The compiler is generating a correct answer. Who is defining "correct"? If you want K&R to be the final word on the "definition" of the C language, then I'm afraid that the compiler is not generating a "correct" answer. K&R (old version) page 98 explicitly states "p-q is the numebr of elements between p and q". And in section 7.4, "If two pointers to objects of the same type are subtracted, the result is ... an int representing the number of objects separating the pointed-to objects." I suspect that the ANSI draft says the same thing. Therefore, the "correct" answer is 30000. I shall henceforth place the word "correct" in quotation marks to emphasize that what I am referring to is the value that should be returned, as defined by K&R. I am not insisting that this is the most reasonable result, merely that it is what K&R decrees the answer should be. > There is an overflow in there. Exactly. And the compiler should be smart enough to emit code to handle the overflow so as to generate the "correct" answer. > However, since the difference is stored in an int, 60,000 is taken to be a > negaive number (-5536). Dividing this by two (the size of int) gives -2768. It seems you are simply giving a rational explanation for the answer; justifying why the answer of -2768 is not unexpected, but not justifying its correctness. (As stated above, the "correct" answer is DEFINED to be 30000.) The additional code is really very simply. If the carry is clear, then the diference is positive, so we just perform a UNSIGNED division on the resulting difference. If the carry is set, then the difference is negative, and the accumulator contains the two-s complement of the absolulte value of the byte difference. So we negate the accumulator, perform an unsigned division, then negate the result. 
As the counterexample below illustrates, this will not work if the object is only one byte wide, since an int cannot hold all the possible pointer differences. > There is no real solution for this. You might get the desired result with > the following code: > > main() > { > int a[30000]; > printf("%ld\n", (long) (&a[30000] - a)); > } But I suspect most people would argue that the original code (which used the line printf("%ld\n", &a[30000] - a); ) is ANSI-conformant, and therefore should yield the "correct" answer. > This problem is not inherent in the fact that the 80x86 uses segments. Agreed. > It > is a result of the fact the sizeof(int *) > sizeof(int). (Actually, the > aize of the lagrst pointer type.) Untrue. Even if sizeof(int *) == sizeof(int), we would still have problems. Suppose sizeof(int) == sizeof(int *) == 4. And suppose you have an array of chars that takes up 3/4 of the computer's memory. Let p be a pointer to the start of the array, and q a pointer to the end of the array. Let M be 2^sizeof(int), so that an unsigned can represent the values 0 ... M-1, and an int can represent (if using two's complement) -(M/2) ... M/2 - 1. And the computer has M bytes of memory. Let d be the value (0.75 * M). Then the following possible "correct" return values exist: q - p == d p - q == -d So we have to be able to represent 2d-1 different values. But 2d-1 > M. Therefore, you cannot handle all possible return values with just an int, since an int can represent only M possible values. > On machines where sizeof(int) >= sizeof(int *), this is never a problem. See above counterexample. As mentioned many times before, you need your int to be at least one bit bigger than your pointers. This becomes clear when you realize that if an array gobbles up all but one byte of memory, then the difference betwen its head and its tail will be +-(M-1), and you clearly require (lg M)+1 bits to represent that many different values. 
-------------------------

Let me say that I understand why the implementors of compilers have not
bothered to worry about this far-out case: It makes your code larger (since
you have to perform special tests for the carry bit) to handle cases that
don't pop up very often.  Since benchmarks compare both speed and size, I
can see the following exchange (P = a programmer on the compiler
development team, M = an executive in the marketing department):

P: Well, we need to add this extra code to handle this special case which I
   admit doesn't happen very often.
M: Does this special case occur in any of the standard benchmarks?
P: No.
M: Will it hurt our showing in the standard benchmarks?
P: Yes.
M: Leave it out.

------------------------

Respectfully,
-- 
Raymond Chen
UUCP: ...allegra!princeton!{phoenix|pucc}!rjchen
BITNET: rjchen@phoenix.UUCP, rjchen@pucc
ARPA: rjchen@phoenix.PRINCETON.EDU, rjchen@pucc.PRINCETON.EDU
"Say something, please! ('Yes' would be best.)" - The Doctor
rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) (01/05/89)
I know it's bad form to follow up to one's own article, but the number of typos I found after posting it was embarrassing. > explicitly states "p-q is the numebr of elements between p and q". number > The additional code is really very simply. If the carry is clear, then simple > the accumulator contains the two-s complement of the > absolulte value of the byte difference. More clearly (and spelled correctly), the accumulator contains the two's complement of the absolute value of the number of bytes separating the two pointers. > But I suspect most people would argue that the original code > (which used the line printf("%ld\n", &a[30000] - a); ) is ANSI-conformant, "%d\n" ((oops)) > Let M be 2^sizeof(int), so that an unsigned can represent the values > 0 ... M-1, ... Of course, that should be ``Let M be 2^(number of bits in an int)''. Apologetically, -- Raymond Chen UUCP: ...allegra!princeton!{phoenix|pucc}!rjchen BITNET: rjchen@phoenix.UUCP, rjchen@pucc ARPA: rjchen@phoenix.PRINCETON.EDU, rjchen@pucc.PRINCETON.EDU "Say something, please! ('Yes' would be best.)" - The Doctor
stuart@bms-at.UUCP (Stuart Gathman) (01/05/89)
In article <51@rpi.edu>, kyriazis@rpics (George Kyriazis) writes:

> division to get the result instead of the unsigned that it should do.

Unsigned division will not help one whit.  You need 17 bits precision.

sizeof (type) == 2^n:

    mov ax,ptr1
    sub ax,ptr2
    rcr ax,1
    sar ax,1       ; n-1 times, possibly none

sizeof (type) == 2*n:

    mov ax,ptr1
    sub ax,ptr2
    rcr ax,1
    cwd
    mov bx,n
    idiv

sizeof (type) == n:

    xor dx,dx
    mov ax,ptr1
    sub ax,ptr2
    sbb dx,0       ; carry extend
    mov bx,n
    idiv

For the most common (2^n, 2*n) cases, this takes exactly as many bytes and
is just as fast as the wrong code.  The general case is just 3 extra bytes
(and won't occur with word alignment).

This kind of thing is very common (and very maddening) in system code.  I
have yet to see a 16-bit system that didn't break at some 32K boundary
because of signed/unsigned carelessness.  The bottom line is that my 16-bit
programs designed to use 40K of buffers have to get by with 30K until I can
track down all the compiler/system bugs and find work arounds.

BTW, 32-bit machines often mess it up also - they get by with it because 2G
arrays aren't very common.

P.S. - beware the fateful number 32767 when writing code for 16-bit
systems . . .
-- 
Stuart D. Gathman <stuart@bms-at.uucp> <..!{vrdxhq|daitc}!bms-at!stuart>
smcroft@sactoh0.UUCP (Steve M. Croft) (01/05/89)
In article <5137@phoenix.Princeton.EDU>, rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) writes:
> > There is an overflow in there.
>
> Exactly.  And the compiler should be smart enough to emit code to handle
> the overflow so as to generate the "correct" answer.

At what point should the compiler not be expected to fix overflow problems?
Whadda 'bout:

    int a;
    a = (30000*30000*30000*30000*30000*30000)/(30000*30000*30000*30000*30000)

This involves more than resolving the carry bit, as was earlier suggested..

Cheers!
steve
-- 
###############################################################
# steve "whadda guy" croft #
# ...!pacbell!sactoh0smcroft || ...csusac!athena!crofts #
###############################################################
rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) (01/06/89)
From article <607@sactoh0.UUCP>, by smcroft@sactoh0.UUCP (Steve M. Croft): > In article <5137@phoenix.Princeton.EDU>, rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) writes: >> > There is an overflow in there. >> >> Exactly. And the compiler should be smart enough to emit code to handle >> the overflow so as to generate the "correct" answer. > > At what point should the compiler not be expected to fix overflow problems? > Whadda 'bout: > > int a; > a = (30000*30000*30000*30000*30000*30000)/(30000*30000*30000*30000*30000) > > This involves more than resolving the carry bit, as was earlier suggested.. But this involves an explicit calculation which overflows an int, and is therefore ``implementation-defined''. The same goes for division by zero, or generating a floating-point result which is too big. If the answer is too big, then the result is ``implementation-defined''. On the other hand, the subtraction of two pointers to the same array is PROMISED by K&R to return a "correct" answer. Remember: I put the word "correct" in quotation marks to emphasize that it is guaranteed by K&R, and therefore any compiler which returns anything different does not observe the K&R specifications. Whether the K&R specifications are silly, outdated, or unimplementable is completely irrelevant. (The thing about the carry bit was to suggest a possible method of returning the "correct" answer for pointer subtraction. It was certainly not pretending to be a solution to all overflow problems! As other folks have mentioned, extending to a 32-bit difference also solves the problem.) -- Raymond Chen UUCP: ...allegra!princeton!{phoenix|pucc}!rjchen BITNET: rjchen@phoenix.UUCP, rjchen@pucc ARPA: rjchen@phoenix.PRINCETON.EDU, rjchen@pucc.PRINCETON.EDU "Say something, please! ('Yes' would be best.)" - The Doctor
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (01/06/89)
In article <PINKAS.89Jan3082456@hobbit.intel.com> pinkas@hobbit.intel.com (Israel Pinkas ~) writes:
>> > > static int a[30000];
>> > > printf("%d\n",&a[30000]-a);
> Let's ignore the fact that
>there is no a[30000] (and that taking its address is invalid).

I have been told that dpANS explicitly states that the address of the
"one-after-last" element of an array may be taken, and subtractions like
the above are legal and should give the correct result.  I do not have
access to the dpANS - could somebody who does please look this up?  In any
case all compilers I know of do it just fine (unless some kind of overflow
occurs, like in this very example - but that's independent of how big the
array is declared) and a lot of existing code does rely on it.

Regarding the original problem, it *is* possible to do the subtraction
correctly, although not simply by using unsigned division.  Here is one way
I think would work (on the left is what Turbo C generates, for comparison):

                            xor dx,dx
    mov ax,&a[30000]        mov ax,&a[30000]
    sub ax,a                sub ax,a
    mov bx,2                mov bx,2
    cwd                     sbb dx,dx
    idiv bx                 idiv bx

I.e., take advantage of the fact that we can treat carry and AX as one
17-bit register containing the result of the subtraction.  It will cost a
few clock cycles, I'm afraid.  In this particular case it can actually be
done with no speed penalty with the following trick:

    mov ax,&a[30000]
    sub ax,a
    rcr ax,1

In the general case it seems we must choose between doing it fast and
getting it right every time.  Perhaps a compiler option for those who would
otherwise use an old compiler version to save the two cycles or whatever it
costs...

Tapani Tarvainen
------------------------------------------------------------------
Internet: tarvainen@jylk.jyu.fi -- OR -- tarvaine@tukki.jyu.fi
BitNet: tarvainen@finjyu
egisin@mks.UUCP (Eric Gisin) (01/06/89)
In article <147@bms-at.UUCP>, stuart@bms-at.UUCP (Stuart Gathman) writes:
<
< xor dx,dx
< mov ax,ptr1
< sub ax,ptr2
< sbb dx,0 ; carry extend
< mov bx,n
< idiv
<
< For the most common (2^n, 2*n) cases, this takes exactly as many bytes
< and is just as fast as the wrong code. The general case is just 3 extra
< bytes (and won't occur with word alignment).
You can replace the xor and sbb with one sbb:
sbb dx,dx
That's only one byte longer than the incorrect code and it is just as fast.
rap@olivej.olivetti.com (Robert A. Pease) (01/06/89)
In article <64@rpi.edu> kyriazis@turing.cs.rpi.edu (George Kyriazis) writes:
|
|>The code generated by MSC 5.1 is shown below.
|>
|> [ .. some additional code .. ]
|>
|>mov ax,_ind3   ; diff = &b[ind3] - b;  /* ind3 = 30000 */
|>shl ax,1
|>add ax,OFFSET DGROUP:_b
|>sub ax,OFFSET DGROUP:_b
|>sar ax,1
|>mov WORD PTR _diff,ax
|>
|
|The SAR is the bug I was talking about.  It should've been SLR (assuming
|that SAR means Shift Arithmetic Right, SLR would mean Shift Locical Right),
|shifting a 0 from the left (if such an instruction exists).  Or they could
|equally well put an AND ax, 0x7fff (excuse me if my 8088 is wrong) after
|the SAR.

This is not where the problem is.  The problem is not with the SAR but that
the overflow flag is not checked after the SHL instruction.  In this case,
with an index of 20000 or 30000, the SHL changes the sign bit.  Because the
sign bit changed, the overflow flag is set.  Exception processing should
have been included after the SHL to check for overflow and deal with it
properly.

Robert A. Pease
{hplabs|fortune|microsoft|amdahal|piramid|tolerant|sun|aimes}!oliveb!rap
mayer@drutx.ATT.COM (gary mayer) (01/07/89)
In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:

> The same error occurs in the following program
> (with Turbo C 2.0 as well as MSC 5.0):
>
> main()
> {
>     static int a[30000];
>     printf("%d\n",&a[30000]-a);
> }
>
> output: -2768

I grant that this is probably not the answer you would like, but it is the
answer you should expect once pointer arithmetic is understood.

First, pointers are addresses.  On the 8088, etc. processors, memory is
addressed in bytes, and an integer is 2 bytes.  Thus the array above is
60,000 BYTES long, and the absolute difference between "a" (the address of
the first element of the array) and "&a[30000]" (the address of the integer
ONE PAST the end of the array, though that is NOT a problem here and is a
very common practice) is 60,000.

Second, when doing pointer arithmetic a scaling factor is used.  In pointer
subtraction, the result is the number of objects (integers here) between
the two pointers.  The scaling factor is not visible, but is used
internally and is the number of addressing units in the given object.  In
this example, the addressing unit is a byte and the object is a 16 bit
integer, yielding a scaling factor of 2.

Third, you might expect the answer to be 30,000, the result of 60,000 / 2.
This doesn't happen because of the 16 bit size, the fact that the result of
pointer subtraction is specified to be an integer, and a "weak" but
standard way of implementing the underlying code.  What happens is the
result of the initial pointer address subtraction is 60,000 (or xEA60).
The division by 2 is done as a signed operation and is thus interpreted as
"-5536 / 2" (xEA60 as an integer in 16 bits is -5536), yielding the -2768
result.  It is the treatment of this step of the operation that I consider
to be "weak"; it is mixing signed and unsigned operations in an unfavorable
way, and done differently it would yield the correct result in this case.
The problem is complicated on the 8088, etc.
machines further because a "far" pointer allows for arrays larger than the 16 bit integer size can express. The result of the subtraction of far pointers should be a long integer, but I do not know what those compilers do. In summary, be careful with pointers on these machines, and try to learn about how things work "underneath". The C language is very close to the machine, and there are many times that this can have an effect - understanding and avoiding these where possible is what writing portable code is all about.
gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/07/89)
In article <18683@santra.UUCP> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes: >>> > > static int a[30000]; >>> > > printf("%d\n",&a[30000]-a); >I have been told that dpANS explicitly states that the address of >"one-after-last" element of an array may be taken, and subtractions >like the above are legal and should give correct result. Almost. Address of "one after last" is legal for all data objects, but of course cannot be validly used to access the pseudo-object. All objects can be considered to be in effect arrays of length 1. Pointers to elements of the same array object can be subtracted; the result is of type ptrdiff_t (defined in <stddef.h>). The example above assumes that ptrdiff_t is int, which is not guaranteed by the pANS. Casting to another, definite, integral type such as (long) would make the result portably usable in printf() etc.
andrew@stl.stc.co.uk (Andrew Macpherson) (01/08/89)
In article <142@bms-at.UUCP> stuart@bms-at.UUCP (Stuart Gathman) writes:
| In tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
|
| > The same error occurs in the following program
| > main()
| > {
| >     static int a[30000];
| >     printf("%d\n",&a[30000]-a);
| > }
| > output: -2768
|
| This is entirely correct.  The difference of two pointers is an *int*.

This is entirely wrong.  It surely has not escaped your notice that 30000
is less than MAXINT (even on an 8088 or pdp-11)?  The difference of two
pointers is an int, but the necessary intermediate stages have to use
arithmetic such that sizeof(the whole array)+1 can be treated as a signed
quantity, even if this result of the subtraction is going to be divided by
the sizeof the elements of the array and then become (or be squashed into)
an integer.  (The +1 is needed since you are allowed to use the address of
the (virtual) element one beyond the array.)

Hence all the comments from previous posters about using long arithmetic
--- this is the simple way of achieving the required accuracy.  All those
casts though are entirely spurious work-arounds for a very simple *BUG*.

| If you want an unsigned difference, you need to cast to unsigned
| (and/or use %u in the printf).  If the difference were defined as
| unsigned, how would you indicate negative differences?  If you
| make the difference long, all the related arithmetic gets promoted
| also for a big performance hit.  The solution is simple, if you
| want an unsigned ptrdiff, cast or assign to unsigned.

This is a fine red herring.

| This is described in the Turbo C manual.

and this is equally irrelevant unless you subscribe to that philosophy
which holds "A documented bug is a feature".

| Don't flame the 8086 either.  The same thing happens in 32-bit machines
| (just much less often).  16 bits is 16 bits, and segments are not
| the problem.  The VAX restricts user programs to 31-bit address space
| to avoid this.
| --
| Stuart D.
| Gathman <stuart@bms-at.uucp>

As you say the same might happen on a Vax, 68000 or whatever with really
large data structures, though with the restriction on data-space size you
describe this code-generator error cannot occur.  The difference is that
the limit is very much larger, and the structure of the processor is
protecting one from this mistake rather than encouraging the unwary to make
it.
-- 
Andrew Macpherson PSI%234237100122::andrew
andrew@stl.stc.co.uk - or - ...!mcvax!ukc!stl!andrew
"It is always a great mistake to treat the individual on the chance that he
may become a crowd" -- Mr Justice Codd: (A.P.Herbert)
greg@gryphon.COM (Greg Laskin) (01/08/89)
In article <9878@drutx.ATT.COM> mayer@drutx.ATT.COM (gary mayer) writes:
>In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>
>> The same error occurs in the following program
>> (with Turbo C 2.0 as well as MSC 5.0):
>>
>> main()
>> {
>>   static int a[30000];
>>   printf("%d\n",&a[30000]-a);
>> }
>>
>> output: -2768
>
>I grant that this is probably not the answer you would like, but it
>is the answer you should expect once pointer arithmetic is understood.
>
>First, pointers are addresses.

good

>Second, when doing pointer arithmetic a scaling factor is used. In pointer
>subtraction, the result is the number of objects (integers here) between
>the two pointers.

good

>Third, you might expect the answer to be 30,000, the result of 60,000 / 2.
>This doesn't happen because of the 16 bit size, the fact that the result
>of pointer subtraction is specified to be an integer, and a "weak"

but consider:

	unsigned a = 0, b = 60000;
	int size = 2;
	main()
	{
		printf("%d\n", (b - a) / size);
	}

That is what the compiler is supposed to be doing.  Go ahead, try it
on your 16-bit compiler ... you'll get 30,000.

Pointers are not ints.  Pointers have no signs.  The compiler in
question has pointers typed incorrectly.  That the result is an int
has nothing at all to do with the internal math involved in the
calculation.  (Observant readers will point out that the result
cannot correctly express a number of array elements > 32767 on a
16-bit machine, which is a problem with pointer subtraction involving
character arrays.)

>The problem is complicated on the 8088, etc. machines further because
>a "far" pointer allows for arrays larger than the 16 bit integer size
>can express. The result of the subtraction of far pointers should be
>a long integer, but I do not know what those compilers do.

Subtraction of two far pointers that point into the same data
aggregate is guaranteed to yield a result <= 65535.  Only the lower
16 bits of the pointers are subtracted.  If the two pointers are not
pointing into the same aggregate, the result will be wrong (actually
undefined).  You probably mean "huge" pointers.  Huge pointers and
far pointers don't exist in the C language.  The far and huge
extensions to the language guarantee defined results only under
narrowly defined conditions.

>In summary, be careful with pointers on these machines, and try to
>learn about how things work "underneath". The C language is very
>close to the machine, and there are many times that this can have
>an effect - understanding and avoiding these where possible is what
>writing portable code is all about.

good
--
Greg Laskin                         greg@gryphon.COM
<routing site>!gryphon!greg         gryphon!greg@elroy.jpl.nasa.gov
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (01/09/89)
In article <9878@drutx.ATT.COM> mayer@drutx.ATT.COM (gary mayer) writes:
>In article <18123@santra.UUCP>, tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>
>> The same error occurs in the following program
>> (with Turbo C 2.0 as well as MSC 5.0):
>>
>> main()
>> {
>>   static int a[30000];
>>   printf("%d\n",&a[30000]-a);
>> }
>>
>> output: -2768
>
>I grant that this is probably not the answer you would like, but it
>is the answer you should expect once pointer arithmetic is understood.
>
[deleted explanation (very good, btw) about why this happens]
>
>In summary, be careful with pointers on these machines, and try to
>learn about how things work "underneath". The C language is very
>close to the machine, and there are many times that this can have
>an effect - understanding and avoiding these where possible is what
>writing portable code is all about.

I couldn't agree more with the last paragraph.  My point, however,
was that the result above is

(1) Surprising:  It occurs in the small memory model, where both ints
    and pointers are 16 bits, and the result fits in an int.  When I
    use a large data model I expect trouble with pointer arithmetic
    and cast to huge when necessary, but it shouldn't be necessary
    with the small model (or at least the manual should clearly say
    it is).

(2) Unnecessary:  Code that does the subtraction correctly has been
    presented here.

(3) WRONG according to K&R or dpANS -- or does either say that
    pointer subtraction is valid only when the difference *in bytes*
    fits in an int?  If not, I continue to consider it a bug.

Another matter is that the above program isn't portable anyway,
because (as somebody else pointed out) the pointer difference isn't
necessarily an int (according to dpANS).

Indeed, in Turbo C the difference of huge pointers is long, and the
program can be made to work as follows:

	printf("%ld\n", (int huge *)&a[30000] - (int huge *)a);

Actually all large data models handle this example correctly (in
Turbo C), and thus casting to (int far *) also works here, but as
soon as the difference exceeds 64K (or the pointers have different
segment values) they'll break too; only huge is reliable then (but
this the manual _does_ explain).  To sum up:  near pointers are
reliable up to 32K, far up to 64K, anything more needs huge.

With this I think enough (and more) has been said about the behaviour
of the 8086 and the compilers; however, I'd still want somebody with
the dpANS to confirm whether or not this is a bug -- does it say
anything about when pointer arithmetic may fail because of overflow?
------------------------------------------------------------------
Tapani Tarvainen                  BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --   tarvaine@tukki.jyu.fi
msb@sq.uucp (Mark Brader) (01/10/89)
Of the code:

	static int a[30000];
	printf("%d\n",&a[30000]-a);

someone says:

> > I have been told that dpANS explicitly states that the address of
> > the "one-after-last" element of an array may be taken, and
> > subtractions like the above are legal and should give the correct
> > result.

And Doug Gwyn says:

> Almost. ... the result is of type ptrdiff_t (defined in <stddef.h>).
> The example above assumes that ptrdiff_t is int ...

Right so far.  But in addition, it's possible for a valid expression
to result in an overflow.  This is not a problem in the particular
example, since 30000 can't overflow an int, but it's permissible for
subscripts to run higher than the maximum value that ptrdiff_t can
contain.  In that case, the analogous subtraction "like the above"
would not work.  Section 3.3.6 in the October dpANS says:

# As with any other arithmetic overflow, if the result does not fit in
# the space provided, the behavior is undefined.

Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com

	A standard is established on sure bases, not capriciously but
	with the surety of something intentional and of a logic
	controlled by analysis and experiment. ... A standard is
	necessary for order in human effort.	-- Le Corbusier
m5@lynx.uucp (Mike McNally) (01/10/89)
In article <9878@drutx.ATT.COM> mayer@drutx.ATT.COM (gary mayer) writes:
>I grant that this is probably not the answer you would like, but it
>is the answer you should expect once pointer arithmetic is understood.

Pointer subtraction is understood.  According to the C definition,
the statement should work.  If it does not, then the compiler is
busted.  I don't give a crap why it doesn't work, or what sort of
architectural problem the compiler writer had to deal with.  If a
translator for a language exists on a machine, it should translate
programs into a form that executes correctly on the target machine,
or at least inform the operator that a problem exists.  That's the
whole point of a high-level language.
--
Mike McNally                                  Lynx Real-Time Systems
uucp: {voder,athsys}!lynx!m5                  phone: 408 370 2233
            Where equal mind and contest equal, go.
alexande@drivax.UUCP (Mark Alexander) (01/13/89)
When all else fails, look at the actual code generated by your
compiler to see where the problem lies.  The problem people are
having with Turbo C and MS C doing incorrect pointer subtraction is
probably due to the compiler generating a SIGNED shift (SAR) instead
of an UNSIGNED shift (SHR) when subtracting INT pointers.  Datalight
C 3.14 has this problem, as shown below.  With DLC, the code
generated to do the pointer subtraction looks like this:

	mov	BX,offset DGROUP:U0EA60h	; &a[30000]
	mov	SI,offset DGROUP:U0		; &a[0]
	sub	BX,SI				; &a[30000] - &a[0]
	sar	BX,1				; convert bytes to words

That last 'sar' instruction should really be a 'shr'.  I'm sure
Turbo C and MS C have a similar problem.
--
Mark Alexander  (UUCP: amdahl!drivax!alexande)
"Bob-ism: the Faith that changes to meet YOUR needs." --Bob (as heard on PHC)
alexande@drivax.DRI (Mark Alexander) (01/17/89)
In article <4121@drivax.UUCP> I wrote without thinking very hard:
: mov BX,offset DGROUP:U0EA60h ; &a[30000]
: mov SI,offset DGROUP:U0 ; &a[0]
: sub BX,SI ; &a[30000] - &a[0]
: sar BX,1 ; convert bytes to words
:That last 'sar' instruction should really be a 'shr'.
Actually, it should be 'rcr', to handle the case where the first pointer
is less than the second. Thanks to Larry Jones (uunet!sdrc!scjones)
for pointing this out to me. That darned carry flag always confuses me.
--
Mark Alexander (amdahl!drivax!alexande) (no 'r')