[net.lang.c] Casts and conversions; short, long, unsigned

Harbison@TL-20A.ARPA (Sam Harbison) (03/26/85)

There has been a lot of net traffic regarding casts from short to unsigned.  I
would like to try to clarify the general problem and offer a complete set of
arithmetic and integer conversion rules.  Forgive the length of the note...

First, some general points about the definition of C.  All C "casts" are
conceptually conversions, changing objects of one type to objects of another
type, possibly by rearranging the bits.  It happens that many
integer-to-integer conversions do not in fact change any bits, and so the
compiler generates no code for them.  The typical example is a conversion
between signed and unsigned integers of the same size on a two's-complement
machine.  Because a cast always produces a converted value, even when no bits
change, casts are never lvalues, although some compilers erroneously allow
some casts to be lvalues.
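
To make the "conversion that changes no bits" case concrete, here is a minimal
sketch, assuming a two's-complement machine.  The function name
cast_keeps_bits is mine and purely illustrative; it compares the result of the
cast with a raw copy of the same storage.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Returns 1 if (unsigned) i has exactly the bits already stored in i.
   Assumes a two's-complement machine, as in the text above. */
int cast_keeps_bits(int i)
{
    unsigned converted = (unsigned) i;   /* conversion, by value */
    unsigned raw;

    memcpy(&raw, &i, sizeof raw);        /* reinterpretation of the storage */
    return converted == raw;
}

/* The cast yields a value, not the object i itself, so a statement such as
       (unsigned) i = 0;
   should be rejected by the compiler. */
```

On such a machine cast_keeps_bits(-1) and cast_keeps_bits(INT_MIN) should both
return 1, which is why the compiler can omit the conversion entirely.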

Casts between integers of unlike size and signedness are confusing for at least
four reasons: 1) original K&R did not recognize the types unsigned short,
unsigned char, or unsigned long, and so did not specify the usual arithmetic
conversion rules for them; 2) the way in which conversions between integers of
unlike size and signedness are to take place was unspecified in certain
instances; 3) the operand of a cast is subject to the "usual arithmetic
conversions", which is often forgotten; and 4) many compilers get the rare
cases wrong anyway.

If we admit the types "unsigned short" and "unsigned char", we should adjust
the "usual arithmetic conversions" to say that objects of these types are
immediately converted to type "unsigned".  (This is in keeping with the spirit
of the other conversions.)  Since the operand of a cast is subject to these
conversions, this reduces the problem of casting from "unsigned short" or
"unsigned char" to casting from "unsigned int".

If we admit the type "unsigned long", we likewise have to augment the
arithmetic conversions (see below).  The two new non-obvious integral casts are
those from signed int to unsigned long and from unsigned int to signed long.
In those cases, it seems right (using the short/int analogy) to specify:
	(unsigned long) i == (unsigned long) (long) i
		(long) ui == (long) (unsigned long) ui
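
The two identities can be written out as a small sketch.  The helper names
int_to_ulong and uint_to_long are hypothetical; each one simply spells out the
intermediate step that the identity says is implied.

```c
#include <assert.h>
#include <limits.h>

/* signed int -> unsigned long: widen to long first (sign-extending),
   then reinterpret as unsigned long. */
unsigned long int_to_ulong(int i)
{
    return (unsigned long) (long) i;
}

/* unsigned int -> long: widen to unsigned long first (zero-extending),
   then reinterpret as long. */
long uint_to_long(unsigned int ui)
{
    return (long) (unsigned long) ui;
}
```

With these definitions int_to_ulong(-1) is ULONG_MAX, i.e. the sign bit of the
int is propagated into the high-order bits of the unsigned long, and both
helpers should agree with the direct casts (unsigned long) i and (long) ui.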

Anyway, here is a version of the complete arithmetic conversion rules that
Guy Steele and I worked out.  I believe them to be compatible with K&R and
existing practice.  An interesting addition is the combination of types long
and unsigned int (rule 5).

    1. Any operands of type short or char are converted to int, any operands of
    type unsigned short or unsigned char are converted to unsigned int, any
    operands of type float are converted to double, any operands of type "array
    of T" are converted to type "pointer to T", and any operands of type
    "function returning T" are converted to type "pointer to function returning
    T".
    2. Then, if either operand is not of arithmetic type or if the two operands
    have the same type, no additional conversion is performed.
    3. Otherwise, if one operand is of type double, then the other operand is
    converted to type double.
    4. Otherwise, if one operand is of type unsigned long then the other
    operand is converted to type unsigned long.
    5. Otherwise, if one operand is of type long int and the other operand is
    of type unsigned int, then each of the two operands is converted to type
    unsigned long int.
    6. Otherwise, if one operand is of type long, then the other operand is
    converted to type long.
    7. Otherwise, if one operand is of type unsigned int, then the other
    operand is converted to type unsigned int.
    8. Otherwise, both operands must be of type int, and so no additional
    conversion is performed.
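
For those who prefer code to prose, the arithmetic part of rules 1 through 8
can be sketched as a small lookup function.  This is only an illustration: the
enum tags and the names promote and common_type are invented for the purpose,
and the array, function, and other non-arithmetic cases of rules 1 and 2 are
left out.

```c
#include <assert.h>

/* Hypothetical tags for the arithmetic types discussed above. */
enum ctype {
    T_CHAR, T_SHORT, T_UCHAR, T_USHORT, T_FLOAT,   /* widened by rule 1 */
    T_INT, T_UINT, T_LONG, T_ULONG, T_DOUBLE       /* unchanged by rule 1 */
};

/* Rule 1, arithmetic cases only. */
enum ctype promote(enum ctype t)
{
    switch (t) {
    case T_CHAR:  case T_SHORT:  return T_INT;
    case T_UCHAR: case T_USHORT: return T_UINT;
    case T_FLOAT:                return T_DOUBLE;
    default:                     return t;
    }
}

/* Rules 2 through 8: the type both operands end up with. */
enum ctype common_type(enum ctype a, enum ctype b)
{
    a = promote(a);
    b = promote(b);
    if (a == b)                           return a;         /* rule 2 */
    if (a == T_DOUBLE || b == T_DOUBLE)   return T_DOUBLE;  /* rule 3 */
    if (a == T_ULONG  || b == T_ULONG)    return T_ULONG;   /* rule 4 */
    if ((a == T_LONG && b == T_UINT) ||
        (a == T_UINT && b == T_LONG))     return T_ULONG;   /* rule 5 */
    if (a == T_LONG   || b == T_LONG)     return T_LONG;    /* rule 6 */
    if (a == T_UINT   || b == T_UINT)     return T_UINT;    /* rule 7 */
    return T_INT;                                           /* rule 8 */
}
```

For example, common_type(T_LONG, T_UINT) gives T_ULONG by rule 5, and
common_type(T_USHORT, T_INT) gives T_UINT by rules 1 and 7.  The order of the
tests matters: rule 5 has to be checked before rule 6, or long would win.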

Finally, here are the integer conversion rules for two's-complement computers.
I believe them to be compatible with K&R and with existing practice.

    1. When converting between integers of the same size, there is no change of
    representation, regardless of the signedness of the source or destination.
    2. When converting from a longer integer to a shorter one, the high-order
    bits are discarded, regardless of the signedness of source or destination.
    3. When converting from a shorter integer to a longer one, the source is
    sign-extended if it is signed and zero-extended if it is unsigned,
    regardless of the signedness of the destination.
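
A minimal sketch of the three rules follows.  It assumes a compiler that
supplies fixed-width integer types (here the int16_t, uint16_t, int32_t, and
uint32_t names from <stdint.h>) so that "shorter" and "longer" are
unambiguous; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rule 1: same size, no change of representation. */
uint32_t same_size(int32_t x)        { return (uint32_t) x; }

/* Rule 2: longer to shorter, high-order bits discarded. */
uint16_t narrow(uint32_t x)          { return (uint16_t) x; }

/* Rule 3: shorter to longer.  The SOURCE type picks the extension:
   a signed source is sign-extended even into an unsigned destination, */
uint32_t widen_signed(int16_t x)     { return (uint32_t) x; }

/* and an unsigned source is zero-extended even into a signed one. */
int32_t  widen_unsigned(uint16_t x)  { return (int32_t) x; }
```

Under these assumptions same_size(-1) is 0xFFFFFFFF, narrow(0x12345678) is
0x5678, widen_signed(-1) is 0xFFFFFFFF, and widen_unsigned(0xFFFF) is 65535.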

Sam Harbison	(Harbison@TL-20A.arpa)
Tartan Laboratories
477 Melwood Avenue
Pittsburgh PA 15213
(412) 621-2210

-------