[comp.lang.c] Sizes, alignments, and maxima

karl@haddock.ima.isc.com (Karl Heuer) (02/23/89)

In article <830@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>In article <8943@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>>For that reason it's hard to see how a C implementation could possibly
>>do anything but put [an array] in contiguous memory.
>
>How about:  Assume int's are (say) 2 bytes.  Assume further that ... all
>accesses must be on an 8-byte boundary.

Then sizeof(int) is 8, and the elements of the array consist of contiguous
8-byte units, of which only two bytes are significant.  This sounds much like
a Cray-2, in fact.

Question for comp.std.c (to which I've redirected followups): I've been told
that the Cray-2 has sizeof(int) == 8, yet INT_MAX == 0x7FFFFFFF (i.e. the
arithmetic is only accurate to 4 bytes when using int).  Is this legal in a
conforming implementation?  I think I can prove that UINT_MAX must be 2**64-1,
but I'm less sure about INT_MAX.  Section 3.1.2.5 has a restriction to binary
architectures, which by the definition in the footnote seems to require every
bit except the highest to represent a power of two; should this be interpreted
as a requirement that 2**63-1 must be representable in an 8-byte int?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
(I've implicitly assumed 8-bit bytes above, simply because it would be too
cumbersome to type the more correct expressions involving CHAR_BIT.)

boyne@hplvli.HP.COM (Art Boyne) (02/23/89)

badri@valhalla.ee.rochester.edu (Badri Lokanathan) writes:
>For instance, if the array was to be x[101] to x[110] instead of
>x[0] to x[9], what is the easiest way to do it?
>For automatic arrays one could do
>int space[10], *x; 
>x = &space[0] - 101;
>
>In article <1989Feb22.171441.7957@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
>> No, one couldn't, not if one wants to be portable.  There is absolutely no
>> guarantee that x[101] will be the same as space[0], and in fact there is
>> no guarantee that the computation of x won't trap and give you a core dump.
                         ^^^^^^^^^^^
>> On a sufficiently odd segmented machine it might.  Even K&R1 warns you that
>> pointer arithmetic won't necessarily do what you think unless you keep it
>> within an array.
>
>Henry is probably referring to the comments on pages 98 and 189 of K&R1.
>My argument is, since x is an independent variable (a pointer, but
>still a variable,) there is no way of checking any array bounds.
>Thus in order for consistency with the definition of pointer
>arithmetic,
>x -= i; x += i;
>should result in the original value of x (provided the resulting
>intermediate values are within pointer ranges.)
>Since x can be initialized to any value, this should always be possible,
>the result after each operation being an address appropriately offset
>from the original object. Yes, accessing the contents of an illegal location
                                ^^^^^^^^^
>will almost certainly screw things up, but there is no access happening.

Sorry, but read Henry's response more carefully:  it is possible that the
*computation* of x-=i will cause an abort/trap/etc - without attempting
any *access*.  This is especially likely if x-=i would result in x pointing
to a negative-index array location (consider a machine whose address
registers trapped on a negative segment offset and whose compiler put
the array into a separate segment, perhaps due to segment size limitations).  
Henry is correct: pointer arithmetic isn't guaranteed unless you keep it
*within* an array.

Art Boyne, boyne@hplvla.HP.COM

badri@valhalla.ee.rochester.edu (Badri Lokanathan) (02/25/89)

In article <340009@hplvli.HP.COM>, boyne@hplvli.HP.COM (Art Boyne) writes:
> Sorry, but read Henry's response more carefully:  it is possible that the
> *computation* of x-=i will cause an abort/trap/etc - without attempting
> any *access*.  This is especially likely if x-=i would result in x pointing
> to a negative-index array location (consider a machine whose address
> registers trapped on a negative segment offset and whose compiler put
> the array into a separate segment, perhaps due to segment size limitations).  
> Henry is correct: pointer arithmetic isn't guaranteed unless you keep it
> *within* an array.
> 

On a slightly different note and out of curiosity, I tried the
following experiment on a sun 3 running OS3.4:

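#include <stdio.h>

int main(void)
{
	unsigned i, j, k, val;

	i = 100; j = 200; k = 300;

	/* i - j wraps around before k is added */
	val = i - j + k;
	printf("%u\n", val);

	/* same expression with explicit grouping */
	val = (i - j) + k;
	printf("%u\n", val);

	/* the wrapped intermediate value itself */
	val = i - j;
	printf("%u\n", val);

	return 0;
}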

The output was
200
200
4294967196

Thus even though the intermediate value was rubbish (-100), it still
worked correctly. (I checked the assembler output for difference too.)
Could anybody tell me of a machine where this will not run?
Similarly with the pointer problem, while everybody has
said that a problem *might* occur, is there a machine where
failure will definitely occur?
-- 
,------------------------.    "O! the beautiful bombs falling!
| Badri Lokanathan       |     Children leap as deer to catch them.
| badri@ee.rochester.edu |     Mothers burst open like flowers.
`------------------------'     Fathers spin away into orange air."  -Dick Bakken

henry@utzoo.uucp (Henry Spencer) (02/26/89)

In article <1839@valhalla.ee.rochester.edu> badri@valhalla.ee.rochester.edu (Badri Lokanathan) writes:

>	unsigned i, j, k, val;
>
>	i = 100; j = 200; k = 300;
>
>	val = i - j + k;
>	printf("%u\n", val);
>
>	val = (i - j) + k;
>	printf("%u\n", val);
>
>	val = i - j;
>	printf("%u\n", val);
>}
>
>The output was
>200
>200
>4294967196
>
>Thus even though the intermediate value was rubbish (-100), it still
>worked correctly. (I checked the assembler output for difference too.)
>Could anybody tell me of a machine where this will not run?

No ANSI-conforming implementation will refuse to run this.  The intermediate
result was neither rubbish nor -100, it was 4294967196, which is perfectly
correct, legal, and standard-compliant.  Arithmetic on unsigned numbers is
defined to wrap around rather than causing overflow.

If you tried the equivalent with signed arithmetic, on some machines
(I believe the MIPS machines are among them), you would get overflow traps.

I can't immediately think of any currently-extant machines where the same
would happen with pointers, but that's no guarantee that there never will
be one.
-- 
The Earth is our mother;       |     Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

scs@adam.pika.mit.edu (Steve Summit) (02/26/89)

In article <1839@valhalla.ee.rochester.edu> badri@valhalla.ee.rochester.edu (Badri Lokanathan) writes:
>In article <340009@hplvli.HP.COM>, boyne@hplvli.HP.COM (Art Boyne) writes:
>> Henry is correct: pointer arithmetic isn't guaranteed unless you keep it
>> *within* an array.
>On a slightly different note and out of curiosity, I tried the
>following experiment on a sun 3 running OS3.4:
>
[Example demonstrating correct wraparound behavior of unsigned
arithmetic with underflow.]
>
>Thus even though the intermediate value was rubbish (-100), it still
>worked correctly.
>Similarly with the pointer problem, while everybody has
>said that a problem *might* occur, is there a machine where
>failure will definitely occur?

Unsigned arithmetic is guaranteed to be modulo 2**n in the
presence of overflow or underflow (where n is of course the word
size in bits).  The same cannot be said for pointer arithmetic.
For many machines, pointer arithmetic is equivalent to unsigned
arithmetic (uses the same instructions and registers), but this
is not required by the language.

Yes, Virginia, there are machines out there with baroque memory
architectures ("baroque" if you're used to single contiguous
linear address spaces, as most of us are; the more complicated
architectures may have compensating advantages) for which pointer
arithmetic is anything but simple, but instead uses special
registers and/or instructions.  The canonical example is the
80n86 for n>=2 and when operating in certain memory management
modes.  I won't get into a full-blown description of the 8086
family's segmented architecture here; suffice it to say that,
although correct C programs can be made to run under it, it can
be a real mess.

For the purposes of the current discussion, pointer arithmetic
involving intermediate results which overflow or underflow can
and does result in processor traps.  You might ask why "regular"
unsigned arithmetic can't still be used for pointers on such
machines.  The answer is that since they don't have single linear
contiguous address spaces, pointers aren't simple numbers, but
instead (in the case of an 80x86 in other than "small" model) a
segment,offset pair.  Pointer arithmetic typically only operates
on the offset portion, which works as long as the pointer stays
within the same segment, but fails if the offset overflows, since
no carry into the segment portion normally takes place.  (Some
compilers can perform "huge model" addressing, generating
laborious code for each pointer arithmetic operation, to perform
the carry manually.)

If this sort of thing shocks you, you are not alone.  Many people
find that the warts and kludges required to "support" segmented
architectures demolish any of the purported benefits that Intel
marketing literature would lead us to believe are to be derived
from this "feature."  To paraphrase Douglas Adams, this is a very
respectable view, widely held by right-thinking people, who are
largely recognizable as being right-thinking people by the mere
fact that they hold this view.

Several people have recently argued that the use of NULL as a
preprocessor macro is at the root of much needless confusion, and
should therefore be stamped out.  Can we make a similar argument
about segmented architectures and stamp them out too :-) ?  It
occurs to me that segmented architectures are probably more
appropriate for earthworms than computers (or has some other wag
made this suggestion already?).

Followups to comp.arch.intel.flame.

                                            Steve Summit
                                            scs@adam.pika.mit.edu