[comp.lang.c] ANSI C awkward examples

timc@copper.UUCP (11/19/87)

I tried to post this a few days ago, and I don't think it got out.  My
apologies to anyone who gets this twice.

Fun things to know about ANSI C:

    1)  In ANSI C, the usual arithmetic conversions treat int expressions
        differently from long expressions, even when the sizes of int and
        long are the same.

            According to 3.2.1.5:
                unsigned int u = 4000000000;   /* Assume 32 bit int and long */

                (u + 1)  > 50 is true;  but
                (u + 1L) > 50 is false.

    2)  This code fragment is not portable in ANSI C:
            u_long = u_char_1 * u_char_2;

            According to 3.2.1.1, the multiply is to be:
                ((int) u_char_1) * ((int) u_char_2)

            instead of
                ((unsigned int) u_char_1) * ((unsigned int) u_char_2).

            For a typical 8086 implementation (32 bit long, 16 bit int)
            if u_char_1 and u_char_2 each contain 200, the unsigned long
            result in u_long is 4,294,941,760, instead of the correct
            40,000.

            Unfortunately, this is not due to some sort of typo.  ANSI is
            aware of this example.
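
To see where 4,294,941,760 comes from on a machine whose int is wider
than 16 bits, the 16-bit wraparound can be modelled by hand.  This is
only a sketch: mul_as_16bit_int is a made-up name, two's-complement
wraparound is assumed, and since real signed overflow is undefined an
actual 16-bit compiler is free to produce something else.

```c
#include <stdint.h>

/* Hypothetical helper: models what a 16-bit two's-complement int would
   commonly hold after computing a * b, then converts that value to a
   32-bit unsigned type standing in for the 8086's unsigned long. */
uint32_t mul_as_16bit_int(unsigned char a, unsigned char b)
{
    int32_t exact = (int32_t)a * (int32_t)b;   /* 200 * 200 = 40000 */
    uint16_t low16 = (uint16_t)exact;          /* keep the low 16 bits */

    /* Reinterpret the low 16 bits as a signed two's-complement value. */
    int32_t wrapped = (low16 >= 0x8000u) ? (int32_t)low16 - 65536
                                         : (int32_t)low16;

    /* Converting a negative value to uint32_t is defined as modulo 2^32. */
    return (uint32_t)wrapped;
}
```

With both operands equal to 200 the wrapped 16-bit value is -25536,
which becomes 4294941760 once converted to the 32-bit unsigned type.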

keesan@bbn.UUCP (11/25/87)

In article <1470@copper.TEK.COM> timc@copper.UUCP (Tim Carver) writes:
>
>Fun things to know about ANSI C:
>
>    1)  In ANSI C, the usual arithmetic conversions treat int expressions
>        differently from long expressions, even when the sizes of int and
>        long are the same.
>
>            According to 3.2.1.5:
>                unsigned int u = 4000000000;   /* Assume 32 bit int and long */
>
>                (u + 1)  > 50 is true;  but
>                (u + 1L) > 50 is false.
>

Not any more.  I pointed this out in a formal comment on the first Draft, and
the November 9, 1987 Draft (Doc. No: X3J11/87-221) now says in 3.2.1.5:
    [stuff about double and float]
    Otherwise, the integral promotions are performed on both operands.  Then  |
    the following rules are applied:                                          |

	If either operand has type unsigned long int, the other operand is
	converted to unsigned long int.

	Otherwise, if one operand has type long int and the other has type    +
	unsigned int, if a long int can represent all values of an unsigned   +
	int, the operand of type unsigned int is converted to long int;       +
	otherwise both operands are converted to unsigned long int.           +

	Otherwise, if either operand has type long int, the other operand is  
	converted to long int.

	Otherwise, if either operand has type unsigned int, the other operand |
	is converted to type unsigned int.

	Otherwise, both operands have type int.

+ indicates new language, | indicates changed language.


In the example given, this would result in

(u + 1) > 50 ==> (u + 1U) > 50 ==> 4000000001U > 50 ==> 4000000001U > 50U ==> 1

and
(u + 1L) > 50 ==> ((unsigned long int)u + 1LU) > 50 ==> 4000000001LU > 50
    ==> 4000000001LU > 50LU ==> 1
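
The revised rules can be checked on a current compiler with a sketch
like the following.  The function names are invented for illustration,
_Generic needs a C11 compiler, and a value near UINT_MAX is used in
place of 4000000000 so that nothing depends on int being 32 bits.

```c
#include <limits.h>

/* Reports the type the usual arithmetic conversions give to
   (unsigned int) + (long) on this implementation. */
const char *mixed_sum_type(void)
{
    unsigned int u = 0;
    return _Generic(u + 1L,
                    long: "long",
                    unsigned long: "unsigned long",
                    default: "other");
}

/* Nonzero when long can represent every unsigned int value. */
int long_holds_all_unsigned(void)
{
    return (unsigned long)LONG_MAX >= UINT_MAX;
}

/* The two comparisons from the thread; both yield 1 under the
   November 1987 wording, whichever branch of the rule applies. */
int int_case(void)  { unsigned int u = UINT_MAX - 1; return (u + 1)  > 50; }
int long_case(void) { unsigned int u = UINT_MAX - 1; return (u + 1L) > 50; }
```

Where long is wider than int, mixed_sum_type() should report "long";
where the two are the same size it should report "unsigned long".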
-- 
Morris M. Keesan
keesan@bbn.com
{harvard,decvax,ihnp4,etc.}!bbn!keesan

gwyn@brl-smoke.ARPA (Doug Gwyn ) (11/26/87)

In article <1470@copper.TEK.COM> timc@copper.UUCP (Tim Carver) writes:
>Fun things to know about ANSI C:

More precisely, about an early draft of what is intended to become the
ANS for the C programming language.  Read on...

>                (u + 1L) > 50 is false.

This was fixed in the November 9, 1987 draft.

>            For a typical 8086 implementation (32 bit long, 16 bit int)
>            if u_char_1 and u_char_2 each contain 200, the unsigned long
>            result in u_long is 4,294,941,760, instead of the correct
>            40,000.

Note that the integral promotions themselves preserve value
including sign, and the values of the unsigned chars are 200.
The only problem is that the multiplication overflows (the
result is not representable in an int).  Overflow is in the
"undefined" category, so the implementation may legally
produce either answer in this case.  Careful programmers
will provide explicit (unsigned) casts to ensure the
expected result.
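
A minimal sketch of the explicit-cast approach, assuming 8-bit chars;
mul_uchar is a hypothetical name, not something from the draft.

```c
/* Hypothetical helper: forces the multiply into unsigned arithmetic. */
unsigned long mul_uchar(unsigned char a, unsigned char b)
{
    /* Unsigned arithmetic cannot overflow; it is reduced modulo
       UINT_MAX + 1.  With 8-bit chars the largest product is
       255 * 255 = 65025, which fits even in a 16-bit unsigned int. */
    return (unsigned)a * (unsigned)b;
}
```

For operands wider than a char, casting to unsigned long instead keeps
the product from being reduced modulo 65536 on a 16-bit-int machine.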

If the "integral promotions" had been defined to preserve
sign rather than value, then your example would have worked
as you expected it to.  The decision to specify the
value-preserving rules was probably the most hotly debated X3J11
Committee decision, with AT&T on the side of the sign-
preserving rules.  However, the outcome was as you see it,
and now that it takes a 2/3 majority to make a substantive
change to the draft Standard, I doubt the likelihood of the
decision being reversed.

chris@mimsy.UUCP (11/26/87)

In article <6732@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>... The decision to specify the value-preserving rules was probably
>the most hotly debated X3J11 Committee decision,

(`Hotly debated' := shouting match, free-for-all, and/or food fight :-) )

>with AT&T on the side of the sign-preserving rules.  However, the
>outcome was as you see it, and now that it takes a 2/3 majority to
>make a substantive change to the draft Standard, I doubt the
>likelihood of the decision being reversed.

Alas!  This time AT&T was right (yes, I admit that AT&T SysV-oids are
not *always* wrong :-) ).

The new signed/unsigned rules (differing from PCC's), together with
the fact that the value of sizeof is `an unsigned integral type',
will (I predict) be the source of some of the most amazingly subtle bugs....
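
One example of the kind of bug this could produce, as a sketch only.
The function names are hypothetical, and the behavior shown assumes
size_t is at least as wide as int, which holds on common
implementations.

```c
#include <stddef.h>

/* Hypothetical helper: the "obvious" comparison.  n is converted to
   size_t before comparing, so a negative n turns into a very large
   unsigned value and the test comes out false. */
int less_than_size_naive(int n, size_t size)
{
    return n < size;
}

/* Hypothetical helper: handles the negative case before converting. */
int less_than_size_checked(int n, size_t size)
{
    return n < 0 || (size_t)n < size;
}
```

With n equal to -1 and size equal to sizeof(int), the naive version
returns 0 even though -1 is mathematically less than any size.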
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@brl-smoke.ARPA (Doug Gwyn ) (11/26/87)

In article <9538@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>The new signed/unsigned rules (differing from PCC's), together with
>the fact that the value of sizeof is `an unsigned integral type',
>will (I predict) be the source of some of the most amazingly subtle bugs....

Yeah, well, the problem was that there were many more non-PCC-based
implementations of C, especially in the MS-DOS world.  They could
say something similar if sign-preserving rules had been adopted.

msb@sq.UUCP (11/27/87)

>                 unsigned int u = 4000000000;   /* Assume 32 bit int and long */
>                 (u + 1)  > 50 is true;  but
>                 (u + 1L) > 50 is false.

This is fixed in the new draft (November 9, 1987).  The "usual arithmetic
conversions" now specify that when "long" and "unsigned" meet up, the
conversion is to "unsigned long" if "long" and "int" are the same size,
and to "long" otherwise.

>     2)  This code fragment is not portable in ANSI C:
>             u_long = u_char_1 * u_char_2;

As discussed at some length in this group just a little while ago,
multiplying two ints to get a long is indeed dangerous in terms of
overflow.  If you don't want overflow, cast one or both operands
to "long", or in this case "unsigned long", before multiplying.

Incidentally, in email following from that past discussion, Doug Gwyn
pointed out a subtlety in the Draft's specification that I hadn't noticed.
Since the behavior of signed integer types on overflow is undefined, it IS
permissible for
		short i, j;	/* assume 16 bits */
		long p;		/* assume 32 bits */
		p = i*j;
to behave the same as
		p = (long)i*(long)j;
instead of like the commonly expected
		p = (long) (short) (i*j);

This is contrary to what most of us responding to the original article said,
including me.  However, this does not mean that it's permissible on overflow
for the result to spill over into any other object (even if p was also
short).  I presume that if you really want the truncating behavior the
last example form above will suffice.

But if the types were "unsigned short" and "unsigned long", then the behavior
on overflow would be well-defined, and the analog of the last example form
would indeed be required.
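
A hedged sketch of getting the truncated 16-bit product regardless of
how wide int is.  Under the value-preserving rules, an implementation
whose int is wider than unsigned short promotes the operands to signed
int, so this version widens to unsigned long explicitly instead of
relying on the promotions.  mul_low16 is a made-up name.

```c
/* Hypothetical helper: multiplies two 16-bit quantities and keeps only
   the low 16 bits of the product. */
unsigned long mul_low16(unsigned short a, unsigned short b)
{
    /* unsigned long is never promoted to a signed type, so this
       multiply is reduced modulo ULONG_MAX + 1 rather than overflowing. */
    unsigned long product = (unsigned long)a * (unsigned long)b;

    /* Discard everything above bit 15. */
    return product & 0xFFFFUL;
}
```

For example, 300 * 300 is 90000, and 90000 modulo 65536 is 24464.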

Mark Brader				"C takes the point of view
SoftQuad Inc., Toronto			 that the programmer is always right"
utzoo!sq!msb, msb@sq.com				-- Michael DeCorte