Harbison@TL-20A.ARPA (Sam Harbison) (03/26/85)
There has been a lot of net traffic regarding casts from short to unsigned. I would like to try to clarify the general problem and offer a complete set of arithmetic and integer conversion rules. Forgive the length of the note...

First some general points about the definition of C:

All C "casts" are conceptually conversions, changing objects of one type to objects of another type, possibly by rearranging the bits. It happens that many integer-to-integer conversions in fact do not change any bits, so the compiler emits no code for them. The typical example of this is conversion between signed and unsigned integers of the same size on two's-complement machines. Likewise, casts are never lvalues, even though some compilers erroneously allow some casts to be lvalues.

Casts between integers of unlike size and signedness are confusing for at least four reasons:

1) original K&R did not recognize the types unsigned short, unsigned char, or unsigned long, and so did not specify the usual arithmetic conversion rules for them;
2) the way in which conversions between integers of unlike size and signedness are to take place was unspecified in certain instances;
3) the operand of a cast is subject to the "usual arithmetic conversions", which is often forgotten; and
4) many compilers get the rare cases wrong anyway.

If we admit the types "unsigned short" and "unsigned char", we should adjust the "usual arithmetic conversions" to say that objects of these types are immediately converted to type "unsigned". (This is in keeping with the spirit of the other conversions.) Since the operand of a cast is subject to these conversions, this reduces the problem of casting from "unsigned short" or "unsigned char" to that of casting from "unsigned int".

If we admit the type "unsigned long", we likewise have to augment the arithmetic conversions (see below). The two new non-obvious integral casts are those from signed int to unsigned long and from unsigned int to signed long.
In those cases, it seems right (using the short/int analogy) to specify:

    (unsigned long) i == (unsigned long) (long) i
    (long) ui == (long) (unsigned long) ui

Anyway, here is a version of the complete arithmetic conversion rules that Guy Steele and I worked out. I believe them to be compatible with K&R and existing practice. An interesting addition is the combination of types long and unsigned int (rule 5).

1. Any operands of type short or char are converted to int, any operands of type unsigned short or unsigned char are converted to unsigned int, any operands of type float are converted to double, any operands of type "array of T" are converted to type "pointer to T", and any operands of type "function returning T" are converted to type "pointer to function returning T".

2. Then, if either operand is not of arithmetic type or if the two operands have the same type, no additional conversion is performed.

3. Otherwise, if one operand is of type double, then the other operand is converted to type double.

4. Otherwise, if one operand is of type unsigned long then the other operand is converted to type unsigned long.

5. Otherwise, if one operand is of type long int and the other operand is of type unsigned int, then each of the two operands is converted to type unsigned long int.

6. Otherwise, if one operand is of type long, then the other operand is converted to type long.

7. Otherwise, if one operand is of type unsigned int, then the other operand is converted to type unsigned int.

8. Otherwise, both operands must be of type int, and so no additional conversion is performed.

Finally, here are the integer conversion rules for two's-complement computers. I believe them to be compatible with K&R and with existing practice.

1. When converting between integers of the same size, there is no change of representation, regardless of the signedness of the source or destination.

2. When converting from a longer integer to a shorter one, the high-order bits are discarded, regardless of the signedness of source or destination.

3. When converting from a shorter integer to a longer one, the source is sign-extended if it is signed and zero-extended if it is unsigned, regardless of the signedness of the destination.

Sam Harbison (Harbison@TL-20A.arpa)
Tartan Laboratories
477 Melwood Avenue
Pittsburgh PA 15213
(412) 621-2210
-------