msb@sq.com (Mark Brader) (04/17/89)
This comment was posted to comp.lang.c with "Distribution: na" and
"Subject: Re: calloc (actually NULL =?= 0)".  I've added comp.std.c
and directed followups to there.

> Actually, I can't see any particular reason for (int)0 to be a zero
> bit pattern either (unless it's mandated by pANS).

Following are excerpts from section 3.1.2.5, "Types".

# There are four "signed integer types", designated as "signed char", "short
# int", "int", and "long int".  (The signed integer and other types may also
# be designated in several additional ways, as described in section 3.5.2.)
# ...
# For each of the signed integer types, there is a corresponding (but
# different) "unsigned integer type" (designated with the keyword "unsigned")
# that uses the same amount of storage (including sign information) and has
# the same alignment requirements.  The range of nonnegative values of a
# signed integer type is a subrange of the corresponding unsigned integer
# type, and the representation of the same value in each type is the same.*

Footnote:
* The same representation and alignment requirements are meant to imply
* interchangeability as arguments to functions, return values from functions,
* and members of unions.

# ...
# The type "char", the signed and unsigned integer types, and the enumerated
# types are collectively called "integral types".  The representations of
# integral types shall define values by use of a pure binary numeration
# system.*

Footnote:
* A positional representation for integers that uses the binary digits 0
* and 1, in which the values represented by successive bits are additive,
* begin with 1, and are multiplied by successive integral powers of 2,
* except perhaps the bit with the highest position.  (Adapted from the
* "American National Dictionary for Information Processing Systems.")

Now, let me review the usual representations of integers in binary.
Pretend that integers are only 3 bits long, so there are only 8 possible
bit patterns; that way we can enumerate them all in one line of this
article.  The "usual" interpretations are:

    Bits              000  001  010  011  100  101  110  111

    unsigned            0    1    2    3    4    5    6    7
    2's complement      0    1    2    3   -4   -3   -2   -1
    1's complement      0    1    2    3   -3   -2   -1   -0
    sign-magnitude      0    1    2    3   -0   -1   -2   -3

The intention of the second-quoted footnote is to allow each of the
interpretations tabulated above.  Note that "all zero-bits" is an
integer 0 in each one of them.

The issue of "-0" is one on which I have seen different opinions.
My point of view is that since -0 is mathematically equivalent to 0,
there is only one *value* there, and since the pANS speaks of "the"
representation of a value, it can have only one representation.
Consequently, I feel that a conforming 1's complement implementation,
for instance, is required to silently convert any instance of the
all-1's bit pattern to all-0's before doing any bitwise operations on it.

A second issue is whether the following interpretation is allowed:

    Bits              000  001  010  011  100  101  110  111

    "unsigned"          0    1    2    3    0    1    2    3

The question here is whether such an interpretation is "using" all the
bits of the storage, as required by the quoted paragraph about unsigned
types.  On a machine where "int", and therefore "unsigned int", are
16-bit types, "unsigned int" could not use this representation because
the highest unsigned int value would be 32767 and the pANS requires it
to be at least 65535.  But on a machine where ints were 18 bits, it might
(depending on this point of interpretation) be permissible for unsigned
ints to use only 17 of their 18 bits and have the same highest value,
131071, as ints.

I think I've heard it said that allowing this was an error and also that
it was intentional and some implementation was using it.  I don't know
what's right.

A third issue occurred to me as I was writing this article.
I see nothing in all of this text to prohibit the FOLLOWING interpretations:

    Bits              000  001  010  011  100  101  110  111

    "unsigned"          4    5    6    7    0    1    2    3
    "2's complement"   -4   -3   -2   -1    0    1    2    3
    "1's complement"   -3   -2   -1   -0    0    1    2    3
    "sign-magnitude"   -0   -1   -2   -3    0    1    2    3

In THIS interpretation, all 0-bits is NOT necessarily a zero value!

If this is indeed a loophole it's clear from other places in the pANS
that it was unintentional.  For example, the "null character" is defined
in section 2.2.1 to have all bits zero, and the Examples in 3.1.3.4 say
that '\0' is commonly used to represent it, and the body of 3.1.3.4
requires '\0' to be a synonym for 0.  But we can't deduce that 0 has to
have all bits 0 from this, because the Examples are not part of the pANS
proper.  (Neither are the footnotes, for that matter, and I did complain
about the material in the second-quoted footnote belonging in my opinion
in the text proper.)

-- 
Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
    "I'm a little worried about the bug-eater," she said.  "We're embedded
    in bugs, have you noticed?"         -- Niven, "The Integral Trees"

This article is in the public domain.