guy@rlgvax.UUCP (Guy Harris) (01/21/84)
An interesting comment from "The C Reference Manual":

	14.4  Explicit pointer conversions

	A pointer may be converted to any of the integral types large
	enough to hold it.  *Whether an "int" or "long" is required is
	machine dependent.*  ("Italics" mine)

Note that this means that a C implementation with 16-bit "int"s and a
32-bit "long" is perfectly legal.  Whether it is the "right" way to do
it is a valid point of debate, but whether it is a legal way to do it
is *not*.  However, the manual also says:

	7.4  Additive operators

	. . . If two pointers to objects of the same type are
	subtracted, the result is converted (by division by the length
	of the object) to *an "int"* ("italics" mine) representing the
	number of objects separating the pointed-to objects.

This sounds to me like: Machine X has a C implementation with 16-bit
"int"s and 32-bit pointers.  As such,

	long l;
	char *p;

	l = p;

assigns the 32 bits of "p" to the 32 bits of "l", while

	int i;
	char *p;

	i = p;

may throw away bits.  *However*,

	char *p;
	char *q;

	foo(p - q);

must pass an "int", *not* a "long int", to "foo".  I don't know
whether this is what is implied or not.  This, if true, implies that
there must not be more than "maxint" items between any two pointers,
which will probably break lots of programs which subtract two
arbitrary heap pointers - programs like the UNIX kernel, various
memory allocators, etc.

> What makes a program portable?  Adhering strictly to the C reference
> manual is the answer I'd give.  Since the manual states that 0 == NULL,
> I believe that's that.  It is up to the implementation to assure that
> this works.

Well, that isn't that.  First of all, the manual does not state that
"0 == NULL".  For one thing, since NULL is defined in <stdio.h> with a
pre-processor statement, "0 == NULL" expands to "0 == 0", which is a
tautology.  What the manual says is:

	7.14  Assignment operators

	. . . The compilers currently allow a pointer to be assigned
	to an integer (note: an integer, not an "int"), an integer to
	a pointer, and a pointer to a pointer of another type.  (This
	implies that it may not be required that compilers permit
	these assignments, in general.)  The assignment is a pure copy
	operation, with no conversion.  This usage is nonportable, and
	may produce pointers which cause addressing exceptions when
	used.  However, it is guaranteed that assignment of the
	constant 0 to a pointer will produce a null pointer
	distinguishable from a pointer to any object.

This implies to me that

	char *p;

	p = 1;

and

	char *p;

	p = 0;

are not the same sort of thing.  I take it to mean that a C
implementation need not represent null pointers as a bit string all of
whose bits are zero, and that if it doesn't, assignment of 0 to a
pointer is *not* done as a pure copy; it causes the bit string which
represents a null pointer to be stuffed into the pointer.  On page 98
of "The C Programming Language", it says "In general, integers cannot
meaningfully be assigned to pointers; zero is a special case", which
supports this interpretation.

This takes care of pointer conversions in general.  For comparisons,
the manual says:

	7.7  Equality operators

	. . . A pointer may be compared to an integer, but the result
	is machine-dependent unless the integer is the constant 0.  A
	pointer to which 0 has been assigned is guaranteed not to
	point to any object, and will *appear to be equal to* 0
	("italics" mine); in conventional usage, such a pointer is
	considered to be null.
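Before going on, a minimal sketch of the width hazard described above.
The array "big" and its 40000-byte size are mine, chosen only so that
the two pointers are more than 32767 bytes apart; the program runs
happily on a 32-bit-"int" machine (the last number printed is 39999),
and the comments describe what a 16-bit-"int", 32-bit-pointer
implementation would do with it:

	#include <stdio.h>

	char big[40000];	/* the ends are 39999 bytes apart */

	int main()
	{
		char *p = &big[0];
		char *q = &big[39999];
		long l;
		int i, d;

		l = (long)q;	/* all 32 bits of "q" survive */
		i = (int)q;	/* with 16-bit "int"s, only the low 16
				 * bits survive - bits are thrown away,
				 * exactly as described above */

		d = q - p;	/* per section 7.4 the difference is an
				 * "int"; 39999 does not fit in 16 bits,
				 * so "d" would be garbage even though
				 * both pointers are perfectly valid */

		printf("%ld %d %d\n", l, i, d);
		return 0;
	}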
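And a sketch of the representation point just quoted.  "memset" is
used here only as a way to manufacture an all-zero-bits pointer
object; the sketch deliberately assumes nothing about what this
machine's null pointer actually looks like - on a machine where null
is not all-zero bits, even touching "q" may be unsafe, which is the
point:

	#include <stdio.h>
	#include <string.h>

	int main()
	{
		char *p, *q;

		p = 0;		/* a *conversion*: the compiler stuffs
				 * this machine's null-pointer bit
				 * pattern into "p" */

		memset(&q, 0, sizeof q);
				/* a pure copy: all-zero *bits*, which
				 * the manual never promises is a null
				 * pointer */

		/* identical on most machines - but nothing requires it */
		printf("%s\n", p == q ? "same" : "different");
		return 0;
	}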
Again, I read the manual's wording as saying that a 0 pointer, which
is conventionally called a null pointer, need not have the same bit
pattern as an integer (note: not an "int") with the value 0.
Furthermore, "(char *)0" is not the same as "0"; it is a 0 pointer of
type "char *".  As such,

	foo(0);

and

	foo((char *)0);

are not equivalent.  Period.  Note that "lint" agrees with me 100% in
this case; if it finds that "foo" has a "char *" as its first
argument, it will report a type clash for the first statement but not
for the second.

All pointers are not created equal; it just so happens that on *most*
- NOT all - implementations of C, they are all the same sort of beast.
I'm sure that many, if not all, implementations on word-addressable
machines treat them as two very different types.  In fact, they could
even have different sizes!  "foo(0)" could fail for several reasons,
not just the 16-bit versus 32-bit problem.  If a null pointer were
represented as something like 0xff000000 (which is, I think, the
representation of a null pointer in System/360 and successors' PL/I
implementations),

	foo(0);

is passing an "int" of value 0 to "foo", which has the bit pattern
0x00000000 on the 360, while

	foo((char *)0);

is passing a "char *" of value 0 to "foo", which has the bit pattern
0xff000000.

The problem here is that in all C contexts except for subroutine
arguments, the C language can determine what type an expression is to
be converted to and will do that conversion automatically.  Since
there is *currently* no way of declaring the types of arguments to a
function, it assumes that the programmer got it right and will not do
such conversions implicitly.  The programmer *must*, as a result,
specify such conversions explicitly, by writing "foo((char *)0)", in
order to write a correct C program.  (They must also declare the
return value of routines correctly, which is another related problem
I've seen with a lot of code.)  There is some discussion of adding the
ability to declare the types of the arguments to a function to C,
which would obviate most of this problem (you could say "foo(0)", and
the proper bit pattern for a 0 pointer of type "char *" would be
passed to "foo"), although "execl" would still require a cast, as it
takes a variable number of arguments.  (Variable-length argument lists
are *not* explicitly mentioned in the C Reference Manual, so a weird
implementation that breaks them is legal - but it would break so many
UNIX programs, like those that use "printf", "scanf", and "execl",
that anybody who makes such an implementation had better have a very
good reason for it.)

> Offhand, I could not find anything in the manual that says that function
> arguments on the stack are no smaller than type int.  (I could have
> easily overlooked this, however.)  Couldn't machines with 32 bit pointers
> and 16 bit ints push 32 bits on the stack always.  This is analogous to
> how chars are done now.

	7.1  Primary expressions

	. . . A function call is...  Any actual arguments of type
	"float" are converted to "double" before the call; any of type
	"char" or "short" are converted to "int"; and as usual, array
	names are converted to pointers.  No other conversions are
	performed automatically; *in particular, the compiler does not
	compare the types of actual arguments with those of formal
	arguments.  If conversion is needed, use a cast*...
	("Italics" mine)

That's where it says that function arguments ("on the stack" is an
implementation detail) are no smaller than type "int".
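Here is the cast the manual calls for, in sketch form, on a UNIX
system.  "foo" is a stand-in routine of my own; its argument
declaration is written in the proposed style mentioned above (which
1984 compilers lack) purely so the example is self-contained.  The
"execl" line shows the variable-argument case, where even argument
declarations wouldn't help:

	#include <stdio.h>
	#include <unistd.h>

	/* a stand-in routine that expects a "char *" */
	void foo(char *p)
	{
		if (p == 0)
			printf("foo: got a null pointer, as intended\n");
	}

	int main()
	{
		foo((char *)0);	/* right: a null "char *" is passed */

		/*
		 * foo(0) would be the wrong call: without declared
		 * argument types it passes an "int" 0 - possibly the
		 * wrong size, and possibly the wrong bit pattern, as
		 * in the 0xff000000 example above.
		 */

		/* "execl" takes a variable number of arguments, so its
		 * terminating null pointer must be cast no matter what */
		execl("/bin/echo", "echo", "hello", (char *)0);
		return 1;	/* reached only if the exec failed */
	}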
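Those default widenings are easy to see from the caller's side;
"printf" is itself a variable-argument routine, so it depends on them
(the variables here are mine, purely illustrative):

	#include <stdio.h>

	int main()
	{
		char c = 'x';
		short s = 7;
		float f = 1.5;

		/*
		 * The caller widens automatically: "c" and "s" go out
		 * as "int"s and "f" goes out as a "double".  That is
		 * why %c, %d, and %f work on them unchanged - and why
		 * no argument is ever narrower than an "int".
		 */
		printf("%c %d %f\n", c, s, f);
		return 0;
	}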
Machines with 32-bit pointers and 16-bit "int"s *could* push the "int"
and 16 bits of zero, but this wouldn't do the right thing if a zero
pointer didn't consist of 32 bits of zero - remember, this could be a
tagged architecture, for instance.  If it were felt that the
implementation should compensate for deficient programmers (sic!), one
would just have to pay a performance penalty for this - but the whole
point of *having* 16-bit "int"s and 32-bit pointers is to *avoid* the
performance penalty of 32-bit quantities on machines that don't
support them as fully and efficiently as one might like (i.e., they
don't do full 32-bit arithmetic, like the 68K, or they require two
memory fetches on a 16-bit bus).  Since such an implementation would
solely be compensating for programmers too lazy to run their code
through "lint" - programmers who, as such, are writing code which is
not guaranteed to be portable; in fact, code which is probably
*guaranteed not to be portable*, even if this "feature" were put into
16-bit "int" and 32-bit pointer implementations - there's no point in
doing it.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy