greg@utcsri.UUCP (01/01/70)
In article <253@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes: >> franka@mmintl.UUCP (Frank Adams) >> So does arithmetic on a null pointer produce undefined results? I don't >> have a copy of the proposed standard available, so I don't know what it >> says. This is about what it *should* say; if it doesn't, it should be >> changed. ... >on this point is that the standard does *NOT* say that arithmetic on the >null pointer produces undefined results (contrary to my own ... >> [it should be undefined because...] >> So we have the general principle that pointer arithmetic should not be able >> to adjust the value of the pointer outside the guaranteed "neighborhood" of >> legal values near that pointer. >> In the case of a null pointer, the only legal value in that neighborhood >> is null itself; thus "(char *)0 + 1" produces an undefined result. >> ("(char *)0 + 0" would be legal, and equivalent to "(char *)0".) I agree that it should be illegal to do any kind of arithmetic on a null pointer of any type. All this stuff gets very interesting when you consider what happens on an 80286 in its native 'protected' mode (as opposed to the 'fast 8086' mode in which most of them are warming their sockets). In this mode, a pointer is 32 bits; a 16-bit segment number, and a 16-bit offset. It is meaningless to do arithmetic on the segment number since it is just an index into a table maintained by the OS. Pointer arithmetic as we know it in C affects only the offset. The CPU supports a 'null' pointer as follows: Any pointer whose segment part is zero is considered a null pointer. It is not legal to dereference such a pointer, and it is not legal to load one into the stack-pointer register pair (SS:SP) or the program-counter register pair (CS:IP). Violations cause hardware traps. It is legal to load a null pointer as a 'data pointer'. (What this really means is that you can put 0 into DS and ES but not CS or SS). 
Thus the code for incrementing a pointer, when given a null pointer, will always produce a null pointer. The other weird bit concerns the range of these pointers. The compiler may assign a separate segment for every data object. The segment has a size, and any reference to that segment beyond this size causes a trap. Suppose I declare 'int foo[10]', then I may get a 20-byte segment for foo. Then &foo[10] is a pointer which is illegal to dereference. This is good. There are lots of bits of code like this: for( p = foo; p < &foo[10]; ++p ){ which cause p to be repeatedly compared to a constant invalid pointer until it becomes an invalid pointer itself. I can live with that. What gets a little weird is this: pointer inequalities are done by comparing only the offset part, since the comparison is invalid anyway if the segment numbers are different. Also, offset arithmetic is done in 16 bits. This means that foo[-1] is not only an invalid pointer, but it will be 'greater than' foo[0] since it will have an offset of 0xfffe. What this means is that the following won't work: for( p = &foo[9]; p >= foo; --p ){ /* loops forever */ Furthermore, if I declare a 64K segment ( int foo64[32768] ), the (overflowed) value of &foo64[32767] + 1 is the same as &foo64[0]. Thus not even this will work: for( p = foo64; p <= &foo64[32767]; ++p ){ /* loops forever */ In order to avoid these problems, then, we need a class of pointers which cannot be dereferenced but which can be used in comparisons. It is sufficient that these pointers be restricted to the form (&x)+1, where x is any valid data object. (&x)+1 > (&x) must always hold for any data object x (which rules out a full 64k byte segment on a 286). It would be nice if &x-1 were always less than &x, but that is not possible under this segmentation scheme. The ANSI standard must have something about such pointers. Do they say roughly the same thing about them as I have in the preceding paragraph? 
Sorry for all the blather, but I have noticed several previous postings that have overlooked these considerations. These people may never have to program on such an architecture, but it seems like it isn't too much trouble to avoid constructs which won't port. What I am looking for is a somewhat more concrete definition of which constructs will and won't work. [ e.g. what about: p = &foo[-1]; do{ ++p; ... }while( p <= &foo[9] ); Does the first ++p cause p to be &foo[0]? Can I legally add 4123 to &foo[0], and if I then subtract 4120 do I get &foo[3]? ] P.S. I am not a segment fan, but a pragmatist recently transplanted to the real world ( arrggg! ). -- ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg Have vAX, will hack...
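For anyone who wants to see the wraparound concretely, here is a minimal sketch in C that models the offset-only arithmetic and comparison described above. The names seg_ptr, seg_add and seg_ge are made up for illustration and the segment value is arbitrary; this is a simulation in portable unsigned arithmetic, not what any real 286 compiler emits.

```c
#include <stdint.h>

/* Hypothetical model of a 286 protected-mode data pointer. Only the
 * 16-bit offset takes part in arithmetic and in ordered comparison. */
typedef struct {
    uint16_t segment;  /* selector: an index into an OS table, never adjusted */
    uint16_t offset;   /* byte offset within the segment */
} seg_ptr;

/* Advance p by `elems` elements of `elem_size` bytes each.
 * The result is reduced modulo 65536, as 16-bit offset arithmetic would be. */
seg_ptr seg_add(seg_ptr p, long elems, unsigned elem_size)
{
    p.offset = (uint16_t)((long)p.offset + elems * (long)elem_size);
    return p;
}

/* Ordered comparison as described in the post: offsets only. */
int seg_ge(seg_ptr a, seg_ptr b)
{
    return a.offset >= b.offset;
}
```

With foo at offset 0 and 2-byte ints, seg_add(foo, -1, 2) has offset 0xfffe, so seg_ge reports it as not less than foo, which is why the decrementing loop never terminates. Likewise seg_add(foo64, 32768, 2) wraps back to offset 0.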
msb@sq.UUCP (08/11/87)
Regarding the code... > > main(a) > > char (*a)[]; > > { a = 0; printf("a=0x%x\n", a); a++; printf("a=0x%x\n", a); } Wayne Throop writes: > But the scariest thing about all this is that *none* *of* *my* *tools* > *caught* *this* *bug*!!!! Lint happily passed the program ... > And the compiler didn't complain ... Same on our machine, by the way. > (By the way, for those of you who missed it, the program is illegal for > the obvious reason that it increments a pointer to an object of unknown > size, Actually, *declaring* such a pointer is probably illegal. The language in K&R appendix A section 8.4 is a bit fuzzy, but seems to imply this; and section 3.5.3.2 of the (Oct.'86) ANSI draft nails it down clearly. > but *also* because it performs arithmetic on a null pointer, and > of course, this is illegal.) Um, I don't think so, Wayne; it's just that the result, if you indirect through such a pointer, is undefined. K&R is silent on this, but ANSI 3.3.6 seems pretty clear. And here the pointer isn't being indirected through. The OTHER thing that's wrong with the code is that a "%x" format is used to print a pointer variable. "%x" is used to print ints, or at least, things that printf() can pretend are ints. Pointers needn't be the same size as ints. It's much safer to do this: printf ("a=0x%lx\n", (long) a); Then you get surprised only if the pointers won't even fit in a long. ANSI has a better solution to this: the new format "%p". (See 4.9.6.1). On an ANSI compiler, you would write: printf ("a=%p\n", (void *) a); and be guaranteed reasonable results. But I don't think "%p" exists yet. Mark Brader, utzoo!sq!msb C unions never strike!
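A small sketch of the "%p" approach, assuming a compiler and library that already provide it (and snprintf, which is newer still). The helper name format_pointer is hypothetical. The text that "%p" produces is implementation-defined, so nothing here should depend on its exact form.

```c
#include <stdio.h>

/* Format any object pointer into buf as "a=<implementation-defined text>".
 * The cast to void * matters: "%p" expects exactly that type.
 * Returns whatever snprintf returns (the length that was needed). */
int format_pointer(char *buf, size_t size, const void *p)
{
    return snprintf(buf, size, "a=%p", (void *)p);
}
```

Calling format_pointer(buf, sizeof buf, a) with a pointer of any object type works without assuming that pointers fit in an int or a long.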
strouckn@nvpna1.UUCP (Louis Stroucken 42720) (08/12/87)
In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>
>Regarding the code...
>
>> > main(a)
>> > char (*a)[];

[ discussion whether "a++;" should do something sensible ]

>
>Actually, *declaring* such a pointer is probably illegal. The language
>in K&R appendix A section 8.4 is a bit fuzzy, but seems to imply this;
>and section 3.5.3.2 of the (Oct.'86) ANSI draft nails it down clearly.

I haven't got any ANSI draft here, so I'd better stay out of the
discussion, but:

Please note that "a" is a formal argument of main!!
K&R appendix A section 10.4 says on array arguments:
    ...formal parameters declared "array of..." are adjusted to read
    "pointer to...".

The declaration of "a" might as well read "char **a;". "a++;" should
increment "a" with sizeof( char * ) bytes.

If I miss something, please let me know.

Louis Stroucken
UUCP: ...!mcvax!philmds!prle1!nvpna1!strouckn
chris@mimsy.UUCP (Chris Torek) (08/12/87)
In article <234@nvpna1.UUCP> strouckn@nvpna1.UUCP (Louis Stroucken) writes: >Please note that "a" is a formal argument of main!! >K&R appendix A section 10.4 says on array arguments: > ...formal parameters declared "array of..." are adjusted to read > "pointer to...". >The declaration of "a" might as well read "char **a;". "a++;" should >increment "a" with sizeof( char * ) bytes. The type of `a' in `char (*a)[]' is `pointer to array <unspecified size> of char'. Aside from the fact that pointers to arrays of unspecified size are illegal[1], this declaration is correct and cannot be altered. The adjustment is for formal parameters declared `array ...', not `... array ...'. The single ellipsis means that the array type must come first. ----- [1]This illegality is in fact unnecessary; a pointer to an array of unspecified size can be dereferenced. It cannot be used in any pointer arithmetic except to add or subtract zero. Nonetheless it was deemed illegal, and this loses nothing, since C does not have dynamic arrays. (C has dynamic memory allocation, but what you get are flat blocks of address space, though they are not necessarily contained within a globally flat space.) -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris
karl@haddock.ISC.COM (Karl Heuer) (08/13/87)
In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes: >Wayne Throop writes: >>... but *also* because it performs arithmetic on a null pointer, and >>of course, this is illegal.) > >Um, I don't think so, Wayne; it's just that the result, if you indirect >through such a pointer, is undefined. K&R is silent on this, but ANSI >3.3.6 seems pretty clear. "A.6.2 Undefined behavior: ... A pointer that is not to a member of an array object is added to or subtracted from" [Oct86 dpANS]. A null pointer is an extreme example of this. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
karl@haddock.ISC.COM (Karl Heuer) (08/13/87)
In article <234@nvpna1.UUCP> strouckn@nvpna1.UUCP (Louis Stroucken 42720) writes: >[In the declaration "main(a) char (*a)[]; ..."] Please note that "a" is a >formal argument of main!! K&R [says] `...formal parameters declared "array >of..." are adjusted to read "pointer to...".' Which is irrelevant, since "a" is not an array. It is a pointer to an array (note the parens in the declaration), which is not converted. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
throopw@xyzzy.UUCP (Wayne A. Throop) (08/13/87)
> msb@sq.uucp (Mark Brader),
>> throopw@xyzzy.UUCP

>>> main(a)
>>> char (*a)[];
>>> { a = 0; printf("a=0x%x\n", a); a++; printf("a=0x%x\n", a); }

>> the program is illegal for
>> the obvious reason that it increments a pointer to an object of unknown
>> size,
> Actually, *declaring* such a pointer is probably illegal.

True. Draft X3J11 limits arrays of unknown size to formals and
externals. In my opinion, this is wrong.

>> but *also* because it performs arithmetic on a null pointer, and
>> of course, this is illegal.
> Um, I don't think so, Wayne; it's just that the result, if you indirect
> through such a pointer, is undefined. K&R is silent on this, but ANSI
> 3.3.6 seems pretty clear.

Yes indeed. Pretty clear:

    For addition, either both operands shall have arithmetic type, or
    one operand shall be a pointer to an object and the other shall
    have integral type.

I repeat: "pointer to an object". The null pointer doesn't qualify.
Further, K&R are not totally silent on this point either. I quote from
7.4 in the reference:

    A pointer to an object in an array and a value of any integral
    type may be added.

K&R therefore can be considered to be even *more* restrictive, since
they call out that the object must be a member of an array. (Actually,
I think draft X3J11 calls this out also, but this result must be pieced
together from several readings of Holy Writ, and I don't want to be
thought to be in competition with The World Tomorrow TV show.)

> The OTHER thing that's wrong with the code is that a "%x" format is used
> to print a pointer variable. [...]
> On an ANSI compiler, you would write:
> printf ("a=%p\n", (void *) a);
> and be guaranteed reasonable results. But I don't think "%p" exists yet.

Good point.

--
IBM manuals are written by little old ladies in Poughkeepsie who
are instructed to say nothing specific.
                                        --- R. T. Lillington
drw@cullvax.UUCP (Dale Worley) (08/13/87)
strouckn@nvpna1.UUCP (Louis Stroucken 42720) writes:
# >> > main(a)
# >> > char (*a)[];
# [ discussion whether "a++;" should do something sensible ]
# Please note that "a" is a formal argument of main!!
# The declaration of "a" might as well read "char **a;". "a++;" should
# increment "a" with sizeof( char * ) bytes.
And so, Louis is the only one to notice that Lint passes this code,
*because it is correct* (as far as static analysis goes)! Now, let's
make the declaration local, rather than a parameter, and try again...
Dale (hey, I missed it too!)
--
Dale Worley Cullinet Software ARPA: cullvax!drw@eddie.mit.edu
UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw
OS/2: Yesterday's software tomorrow Nuclear war? There goes my career!
bright@dataio.Data-IO.COM (Walter Bright) (08/13/87)
In article <234@nvpna1.UUCP> strouckn@nvpna1.UUCP (Louis Stroucken 42720) writes:
>In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>>Regarding the code...
>>> > main(a)
>>> > char (*a)[];
> [ discussion whether "a++;" should do something sensible ]
>>
>>Actually, *declaring* such a pointer is probably illegal.
>Please note that "a" is a formal argument of main!!
>K&R appendix A section 10.4 says on array arguments:
> ...formal parameters declared "array of..." are adjusted to read
> "pointer to...".
>
>The declaration of "a" might as well read "char **a;". "a++;" should
>increment "a" with sizeof( char * ) bytes.
>
>If I miss something, please let me know.

I'm letting you know :-)

The declaration:
    char (*a)[];
means:
    'a' is a pointer to an array of chars, the size of the array
    is unknown.
Since 'a' is not an "array of...", it is not adjusted to "pointer to..."
and is not equivalent to "char **a;".

The expression "a++" means "add the size of the array to the pointer 'a'".
Since the size of the array is unspecified, the compiler can't do it in
any 'unsurprising' way. Therefore, the attempt to do this should be illegal.

Expressions of the form (*a)[n] are legal, however, since the compiler
does not need to know the size of the array to compile it.
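To make the difference tangible, here is a rough sketch comparing the stride of a pointer to an array with that of a "char **", plus an (*a)[n] access through a pointer to an array of unspecified size. The function names are invented for illustration, the size 10 is arbitrary, and the last function assumes a compiler that accepts a pointer to an array of unspecified size as a parameter, which the thread treats as doubtful under the Oct. '86 draft.

```c
#include <stddef.h>

/* Bytes skipped by incrementing a pointer to an array of 10 chars. */
size_t stride_of_array_pointer(void)
{
    char rows[2][10];
    char (*a)[10] = &rows[0];            /* pointer to array 10 of char */
    return (size_t)((char *)(a + 1) - (char *)a);
}

/* Bytes skipped by incrementing a char **. */
size_t stride_of_char_pp(void)
{
    char *ptrs[2];
    char **pp = ptrs;
    return (size_t)((char *)(pp + 1) - (char *)pp);
}

/* Indexing through a pointer to an array of unspecified size: only the
 * element size (char) is needed, so no array length has to be known. */
char nth_char(char (*a)[], size_t n)
{
    return (*a)[n];
}
```

The first function yields 10 and the second yields sizeof(char *), so the two declarations are not interchangeable even on machines where the numbers happen to be close.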
john@caeco.UUCP (John Rigby) (08/14/87)
About the program

    main(a) char (*a)[]; { a = 0; print("%x\n"); a++; print("%x\n"); }

in article <234@nvpna1.UUCP>, strouckn@nvpna1.UUCP (Louis Stroucken 42720) says:
>
> K&R appendix A section 10.4 says on array arguments:
> ...formal parameters declared "array of..." are adjusted to read
> "pointer to...".
>
> The declaration of "a" might as well read "char **a;". "a++;" should
> increment "a" with sizeof( char * ) bytes.
>
> If I miss something, please let me know.

"a" is NOT an array. It is a pointer to an array. As such, your argument
is invalid.

in article <189@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) says:
>
> But the scariest thing about all this is that *none* *of* *my* *tools*
> *caught* *this* *bug*!!!! Lint happily passed the program, as did other
> typecheckers. And the compiler didn't complain (though on our system
> the output is
>
> a=0x0
> a=0x1
>

On my machine (Sun 3-260 running 3.2) both the compiler and lint give the
same warning:

    warning: zero-length array element

And the output is:

    a=0x0
    a=0x0

Which makes sense since the size is zero.

John Rigby
!utah-cs!caeco!john
CAECO Inc. Salt Lake City, UT
franka@mmintl.UUCP (Frank Adams) (08/15/87)
In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes: >Wayne Throop writes: >>... but *also* because it performs arithmetic on a null pointer, and >>of course, this is illegal.) > >Um, I don't think so, Wayne; it's just that the result, if you indirect >through such a pointer, is undefined. Operations which produce undefined results are a special case of illegal operations. Specifically, they are illegal operations where the response of the program can be anything. Including, at the extremes, aborting with an error message, or performing in some well-defined way as an implementation extension. On the other hand, I don't know of any programs (ala lint) which perform flow analysis on C programs, and nothing less will detect this kind of bug. Lint certainly cannot be expected to find it. -- Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108
throopw@xyzzy.UUCP (Wayne A. Throop) (08/21/87)
) john@caeco.UUCP (John Rigby)
) About the program
) main(a) char (*a)[]; { a = 0; print("%x\n"); a++; print("%x\n");}
) [...]
) On my machine (Sun 3-260 running 3.2) both the compiler and lint give the
) same warning:
) warning: zero-length array element
) [...] Which makes sense since the size is zero.

Well, no, not quite. The size is unknown, which is not the same thing
as having size zero. A small nit, to be sure, but mine own.

(On the other hand, it is a Good Thing to see that some instances of
lint and/or other tools catch the thing and at least complain about it,
however inaccurately.)

--
The best book on programming for the layman is "Alice in Wonderland";
but that's because it's the best book on anything for the layman.
                                        --- Alan J. Perlis
DHowell.ElSegundo@Xerox.COM (08/22/87)
In article <1347@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>In article <234@nvpna1.UUCP> strouckn@nvpna1.UUCP (Louis Stroucken 42720) writes:
>>In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>>>Regarding the code...
>>>> > main(a)
>>>> > char (*a)[];
>> [ discussion whether "a++;" should do something sensible ]
>>>
>>>Actually, *declaring* such a pointer is probably illegal.
>>Please note that "a" is a formal argument of main!!
>>K&R appendix A section 10.4 says on array arguments:
>> ...formal parameters declared "array of..." are adjusted to read
>> "pointer to...".
>>
>>The declaration of "a" might as well read "char **a;". "a++;" should
>>increment "a" with sizeof( char * ) bytes.
>>
>>If I miss something, please let me know.
>
>The declaration:
> char (*a)[];
>means:
> 'a' is a pointer to an array of chars, the size of the array
> is unknown.
>Since 'a' is not an "array of...", it is not adjusted to "pointer to..."
>and is not equivalent to "char **a;".
>
>Expressions of the form (*a)[n] are legal, however, since the compiler
>does not need to know the size of the array to compile it.

I'm confused.

Suppose I declare:

    char (*a)[10];
    char b[10];

Now b is an array of size 10 of char, and a is a pointer to an array of
size 10 of char. So this means I should be able to say:

    a = &b;

However, as I understand it, b is actually &b[0], which means a gets set
to &&b[0], which I'm not sure makes any sense at all.

What exactly does a point to? Does it point to the first element of an
array? Does it point to a descriptor of an array? How would I assign
anything useful to a, if I can't use the above type of assignment? Or
is the assignment valid? If so what is the meaning of &b?

Dan <DHowell.ElSegundo@Xerox.COM>
guy%gorodish@Sun.COM (Guy Harris) (08/22/87)
> So this means I should be able to say:
>
> a = &b;
>
> However, as I understand it, b is actually &b[0], which means a gets set
> to &&b[0], which I'm not sure makes any sense at all.

Well, given the current way array names (or array-valued expressions) are
treated, it doesn't. PCC will treat "&b" as being equivalent to "b".

However, in the language described by the current ANSI C draft standard, the
"b is actually &b[0]" rule does not apply in certain contexts; one such
context is that of an operand of "&". So, in ANSI C, "&b" is valid, and does
make sense; it is a pointer to the array "b", as opposed to being a pointer
to the first member of that array.

(It is a trivial change to PCC to implement this; you just delete a couple of
lines in "cgram.y", namely the one that converts the type of "&b" to "pointer
to <element of b>" and the one that prints a warning message telling you it
is doing so.)

Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com
chris@mimsy.UUCP (Chris Torek) (08/22/87)
In article <8942@brl-adm.ARPA> DHowell.ElSegundo@Xerox.COM writes: >I'm confused. With good reason. >Suppose I declare: > >char (*a)[10]; >char b[10]; > >Now b is an array of size 10 of char, and a is a pointer to an array of >size 10 of char. Right so far. >So this means I should be able to say: > >a = &b; And confusion strikes. K&R C makes &b illegal. >However, as I understand it, b is actually &b[0], which means a gets set >to &&b[0], which I'm not sure makes any sense at all. This is per K&R, and compilers that implement K&R C. &b is illegal, and in PCC, is ignored with a warning, such that the compiler `sees' a = b; which is a type mismatch (<pointer to array 10 of char> = <pointer to char>). Other C definitions, in particular draft ANS X3J11, make &b legal, defining it to produce an rvalue of type <pointer to array 10 of char> and having as its value the address of the entire array, whatever that means in the implementation. Given that other parts of C demand that each array be stored in a flat linear address space (that may nonetheless be disjoint from any or all other flat linear address spaces), this value will probably be the same as that of the address of the first element of b. >What exactly does a point to? This is machine- and implementation-dependent. >Does it point to the first element of an array? Quite possibly. >Does it point to a descriptor of an array? This is possible but unlikely. >How would I assign anything useful to a, if I can't use the above >type of assignment? You could write, e.g., a = (char (*)[10]) malloc(sizeof (char [10])); if (a == NULL) ... (*a)[9] = 'c'; /*or*/ strcpy(a[0], "text"); /* < 10 characters */ This does not entirely rule out an implementation that creates array descriptors, but makes things difficult for such. 
Note that a = (char (*)[10]) malloc(5 * sizeof (char [10])); is also legal, and creates something that can be addressed as a[4][9] = 'c'; or for (i = 0; i < 5; i++) strcpy(a[i], "text"); /* < 10 characters each */ A compiler that creates array descriptors will have quite a job dealing with these. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris
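The two allocations above can be wrapped up roughly as follows. This is only a sketch: the typedef row10 and the helpers alloc_rows and fill_rows are names invented here, and the length check is an addition that the post only hints at with its "< 10 characters" comments.

```c
#include <stdlib.h>
#include <string.h>

typedef char row10[10];            /* "array 10 of char" */

/* Allocate `rows` consecutive arrays of 10 chars; NULL on failure.
 * The result has type "pointer to array 10 of char". */
row10 *alloc_rows(size_t rows)
{
    return malloc(rows * sizeof(row10));
}

/* Copy `text` into every row. Returns 0 on success, or -1 if text
 * (with its terminating NUL) would not fit in 10 chars. */
int fill_rows(row10 *a, size_t rows, const char *text)
{
    size_t i;

    if (strlen(text) >= sizeof(row10))
        return -1;
    for (i = 0; i < rows; i++)
        strcpy(a[i], text);        /* a[i] decays to char * for row i */
    return 0;
}
```

After a = alloc_rows(5), both a[4][9] = 'c' and fill_rows(a, 5, "text") address plain contiguous storage, which is the point being made against array descriptors.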
francus@cheshire.columbia.edu (Yoseff Francus) (08/24/87)
In article <8942@brl-adm.ARPA> DHowell.ElSegundo@Xerox.COM writes:
>I'm confused.
>
>Suppose I declare:
>
>char (*a)[10];
>char b[10];
>
>Now b is an array of size 10 of char, and a is a pointer to an array of
>size 10 of char. So this means I should be able to say:
>
>a = &b;
>
>However, as I understand it, b is actually &b[0], which means a gets set
>to &&b[0], which I'm not sure makes any sense at all.
>
>What exactly does a point to? Does it point to the first element of an
>array? Does it point to a descriptor of an array? How would I assign
>anything useful to a, if I can't use the above type of assignment? Or
>is the assignment valid? If so what is the meaning of &b?
>
>Dan <DHowell.ElSegundo@Xerox.COM>

Since b is the name of an array it is considered to be a constant, and
you cannot use the & operator on a constant. The assignment you want is
simply

    a = b;

Be careful though, since a++ will not move to the next character, but
rather will jump forward by 10*sizeof(char).

******************************************************************
yf
In Xanadu did Kubla Khan a stately pleasure dome decree
But only if the NFL to a franchise would agree.
ARPA: francus@cs.columbia.edu
UUCP: seismo!columbia!francus
karl@haddock.ISC.COM (Karl Heuer) (08/24/87)
In article <8942@brl-adm.ARPA> DHowell.ElSegundo@Xerox.COM writes:
>char (*a)[10]; char b[10]; a = &b;
>
>However, as I understand it, b is actually &b[0],

A better way to state this rule is something like "the array-valued expression
b, if used in an rvalue context, will automatically be converted to the
pointer-valued expression which is conceptually &b[0]". Since the premise is
false, the rule does not apply in this situation. (Others have already
mentioned that PCC interprets &b as a typo for b, and that ANSI has fixed this,
making the above code legal and useful.)

>What exactly does a point to? Does it point to the first element of an
>array? Does it point to a descriptor of an array?

As stated by its declaration, a points to an array. Thus, if you dereference
it, you get an array (which, if used in an rvalue context, will be converted
to a pointer). Your other two questions don't make sense, unless you rewrite
them as "If I cast a into a different pointer type and dereference it, will I
get ...?". In this form, the answers are likely to be "Yes" and "No",
respectively; but I strongly discourage that type of code.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
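A short sketch of the ANSI behaviour described above, assuming a compiler that implements "&b" as a pointer to the whole array. The function names are hypothetical. The address comparison goes through "void *" purely to have a common type to compare in.

```c
#include <stddef.h>

/* 1 if &b and &b[0] designate the same address, 0 otherwise. */
int array_and_first_element_coincide(void)
{
    char b[10];
    char (*a)[10] = &b;              /* pointer to the whole array */
    char *first = b;                 /* b converted to &b[0] */
    return (void *)a == (void *)first;
}

/* Size of what a "pointer to array 10 of char" points at. */
size_t size_via_array_pointer(void)
{
    char b[10];
    char (*a)[10] = &b;
    return sizeof *a;                /* the array itself */
}

/* Size of what a "pointer to char" points at. */
size_t size_via_element_pointer(void)
{
    char b[10];
    char *first = b;
    return sizeof *first;            /* a single char */
}
```

The two pointers hold the same address but differ in type: dereferencing a gives a 10-char array, while dereferencing first gives one char.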
mouse@mcgill-vision.UUCP (der Mouse) (08/26/87)
In article <2310@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes: > In article <1987Aug10.192923.7879@sq.uucp> msb@sq.UUCP (Mark Brader) writes: >> Wayne Throop writes: >>> ... but *also* because it performs arithmetic on a null pointer, >>> and of course, this is illegal.) >> Um, I don't think so, Wayne; it's just that the result, if you >> indirect through such a pointer, is undefined. > Operations which produce undefined results are a special case of > illegal operations. Yes. But arithmetic on a null pointer is not what Mark was saying produces undefined results, it's indirecting through the result. And the sample program didn't indirect through the pointer after it did the arithmetic. (Of course, arithmetic on a null pointer may *also* be illegal, or produce undefined results, I don't know for sure. But that's not my point.) der Mouse (mouse@mcgill-vision.uucp)
franka@mmintl.UUCP (Frank Adams) (09/08/87)
In article <871@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse) writes: >Yes. But arithmetic on a null pointer is not what Mark was saying >produces undefined results, it's indirecting through the result. > >(Of course, arithmetic on a null pointer may *also* be illegal, or produce >undefined results, I don't know for sure. But that's not my point.) A good point. So does arithmetic on a null pointer produce undefined results? I don't have a copy of the proposed standard available, so I don't know what it says. This is about what it *should* say; if it doesn't, it should be changed. Arithmetic on a null pointer should produce an undefined result. To see this, first consider arithmetic on non-null pointers. Given a pointer to an element of an array a of size N. Arithmetic on such a pointer should not produce a pointer to anything outside the range a[0] to a[N]. That is, doing so should produce an undefined result. This is because such an operation may produce an arithmetic overflow in some cases; and implementations should be able to enable interrupts for arithmetic overflow without having to generate special code for pointer arithmetic. So we have the general principle that pointer arithmetic should not be able to adjust the value of the pointer outside the guaranteed "neighborhood" of legal values near that pointer. In the case of a null pointer, the only legal value in that neighborhood is null itself; thus "(char *)0 + 1" produces an undefined result. ("(char *)0 + 0" would be legal, and equivalent to "(char *)0".) As further evidence for this point of view, I note that there could be machines where the hardware traps attempts to increment the null pointer. -- Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108
throopw@xyzzy.UUCP (09/12/87)
> franka@mmintl.UUCP (Frank Adams)
> So does arithmetic on a null pointer produce undefined results? I don't
> have a copy of the proposed standard available, so I don't know what it
> says. This is about what it *should* say; if it doesn't, it should be
> changed.

The conclusion reached by several people I know who studied the wording
on this point is that the standard does *NOT* say that arithmetic on the
null pointer produces undefined results (contrary to my own
interpretation). Most of them, however, agree that it *OUGHT* to make
this undefined.

> [it should be undefined because...]
> So we have the general principle that pointer arithmetic should not be able
> to adjust the value of the pointer outside the guaranteed "neighborhood" of
> legal values near that pointer.
> In the case of a null pointer, the only legal value in that neighborhood
> is null itself; thus "(char *)0 + 1" produces an undefined result.
> ("(char *)0 + 0" would be legal, and equivalent to "(char *)0".)

I think this is not sufficient. The operation of adding zero to the
null pointer should be illegal, because it is not a member of a
neighborhood at all, let alone a neighborhood of one element. That is,
the value "null" is not a member of an ordered set of addresses upon
which arithmetic is meaningful.

> As further evidence for this point of view, I note that there could be
> machines where the hardware traps attempts to increment the null pointer.

And there could be machines where the hardware traps attempts to add
zero to the null pointer as well, so that should be undefined also.

--
To understand a program you must become both the machine and the program.
                                        --- Alan J. Perlis
--
Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw
cabo@tub.UUCP (09/14/87)
Several people have argued that arithmetic on null pointers should be disallowed in the C standard on the grounds that hardware may trap attempts to alter the special cookie that implements a null pointer. While I agree that char *p = 0; p++; should not be allowed by the standard, I see some benign applications for constant expressions involving null pointers, e.g. (char *)&((struct foo *)0)->bar - (char *)&((struct foo *)0)->baz for computing the relative offset of two structure members in character sized units. I'm not saying that the above is a constant expression according to the wording of the draft (I don't have it, unfortunately), but I would like it to be one. The alternative, declaring a dummy object (or worse, a dummy pointer that would have to be initialized via malloc()) just for being able to reference its members, doesn't appeal to me at all (is this PL/1?). Carsten -- Carsten Bormann, <cabo@tub.UUCP> <cabo@db0tui6.BITNET> <cabo@tub.BITNET> Communications and Operating Systems Research Group Technical University of Berlin (West, of course...) Path: ...!pyramid!tub!cabo from the world, ...!unido!tub!cabo from Europe only.
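For comparison, the same relative offset can be had without writing any null-pointer expression by hand, using the offsetof macro from <stddef.h> on compilers that provide it. This swaps in a different technique from the one proposed above. The layout of struct foo and the function name bar_minus_baz are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical structure; only the member names come from the post. */
struct foo {
    int    bar;
    char   pad[3];
    double baz;
};

/* Offset of bar relative to baz, in character-sized units.
 * Negative here because bar precedes baz in the structure. */
ptrdiff_t bar_minus_baz(void)
{
    return (ptrdiff_t)offsetof(struct foo, bar)
         - (ptrdiff_t)offsetof(struct foo, baz);
}
```

No dummy object or malloc() call is needed, and both offsetof operands are integer constant expressions, so the difference could also be used where a constant is required.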
guy@sun.uucp (Guy Harris) (09/15/87)
> The ANSI standard must have something about such pointers. Do they
> say roughly the same thing about them as I have in the preceding paragraph?

Yes. Pointer inequalities are only valid for pointers that "are members of
the same aggregate object", with one exception:

    "If P points to the last member of an array object, the pointer
    expression P+1 compares higher than P, even though P+1 does not
    point to a member of the array object." (3.3.8 Relational operators).

--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
kent@xanth.UUCP (Kent Paul Dolan) (09/18/87)
In article <5391@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes: >All this stuff gets very interesting when you consider what happens on >an 80286 in its native 'protected' mode (as opposed to the 'fast 8086' mode >in which most of them are warming their sockets). > >In this mode, a pointer is 32 bits; a 16-bit segment number, and a 16-bit >offset. It is meaningless to do arithmetic on the segment number since >it is just an index into a table maintained by the OS. Pointer >arithmetic as we know it in C affects only the offset. >The other weird bit concerns the range of these pointers. The >compiler may assign a separate segment for every data object. >The segment has a size, and any reference to that segment beyond this >size causes a trap. >Suppose I declare 'int foo[10]', then I may get a 20-byte segment for >foo. Then &foo[10] is a pointer which is illegal to dereference. >This is good. >There are lots of bits of code like this: > for( p = foo; p < &foo[10]; ++p ){ >which cause p to be repeatedly compared to a constant invalid pointer until it >becomes an invalid pointer itself. I can live with that. > >What gets a little weird is this: pointer inequalities are done by comparing >only the offset part, since the comparison is invalid anyway if the segment >numbers are different. Also, offset arithmetic is done in 16 bits. This >means that foo[-1] is not only an invalid pointer, but it will be 'greater >than' foo[0] since it will have an offset of 0xfffe. What this means is that >the following won't work: > for( p = &foo[9]; p >= foo; --p ){ /* loops forever */ > >The ANSI standard must have something about such pointers. Do they >say roughly the same thing about them as I have in the preceding paragraph? > I love it! OK, X3J11, the ball's in your court. 
Do we teach every C programmer on every architecture in the world that decrementing pointer loops are a no-no, and break half the code in existence, or do we finally bite the bullet and decide that compiler writers for brain dead architectures, and not the whole C community, pay the penalty for bad hardware designs? Especially since these folks are often the perpetrators of the bad hardware design.

Obviously, at a cost in execution speed, several system trap calls and whatever, the pointer can be converted to an integer, the arithmetic done, and the result converted back to a (possibly now illegal) pointer; if the code, as above, doesn't need the pointer, just the arithmetic result for a comparison, then the programmer can go on writing high level language code instead of worrying about boobosities in the hardware.

If people who chose such obscenities to build their systems around were made responsible for making them work like normal computers, instead of having the whole rest of the world "fill the C compiler with 'if' kludges" (from another posting), the forces of evolution would consign these duds to the recycling heap in a heartbeat.

It is only by continuing to cater to the inanities of some hardware (and software, I suppose) designers that we are forced to continue to suffer their stupidities in the systems we use.

Yes, I suffer fools. But I _do_ suffer, and so do you.

Kent, the man from xanth.

"His expression lit up. 'Hey, you wouldn't be a dope smuggler, would you?'
Rail looked confused. 'Why would anyone wish to smuggle stupidity when there is so much of it readily available?'"
-- Alan Dean Foster, GLORY LANE
chris@mimsy.UUCP (09/20/87)
>>    for( p = &foo[9]; p >= foo; --p ){ /* loops forever */

In article <2474@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>I love it!
>
>OK, X3J11, the ball's in your court. Do we teach every C programmer
>on every architecture in the world that decrementing pointer loops are
>a no-no, and break half the code in existence, or do we finally bite
>the bullet and decide that compiler writers for brain dead
>architectures, and not the whole C community, pay the penalty for bad
>hardware designs? Especially since these folks are often the
>perpetrators of the bad hardware design.

Unfortunately for you, it has already been considered, and the answer is that

    for (p = foo; p < &foo[10]; p++)

is portable, and compilers must allow for it, but

    for (p = &foo[9]; p >= foo; p--)

is not, and compilers need not allow for it. This does not `break half the code in existence' except when one tries to run it on an architecture in which &foo[-1] >= &foo[0] (where---surprise!---it already fails).

-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu
Path: uunet!mimsy!chris
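One way to walk an array downward without relying on the non-portable comparison is to count with an unsigned index instead of a pointer. This is only a sketch under that assumption; reverse_copy() is a hypothetical helper, not something from the posts above.

```c
#include <stddef.h>

/*
 * Copy src[n-1] .. src[0] into dst[0] .. dst[n-1].
 * The index is decremented before use, so no pointer to an element
 * before src[0] is ever formed, and i never goes below zero.
 */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t i = n;

    while (i > 0) {
        --i;
        *dst++ = src[i];
    }
}
```

The test "i > 0" is made before the decrement, which is what keeps the unsigned index from wrapping around.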
rbutterworth@orchid.UUCP (09/20/87)
In article <8655@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>     for (p = foo; p < &foo[10]; p++)
> is portable, and compilers must allow for it, but
>     for (p = &foo[9]; p >= foo; p--)
> is not, and compilers need not allow for it. This does not `break
> half the code in existence' except when one tries to run it on an
> architecture in which &foo[-1] >= &foo[0] (where---surprise!---it
> already fails).

Using

    for (p = &foo[9]; p != foo; p--)

gets around this problem of the incorrect inequality.

The question is, does an expression, such as "p--", that generates an illegal address violate the standard if that address is never used as an address? e.g. given "p = &foo[-1];", it is obviously wrong to use "*p", but is there hardware for which the assignment itself would cause a machine fault (assuming "p" is not declared as "register")? If there aren't any such machines, why does this assignment violate the ANSI standard?
greg@gryphon.CTS.COM (Greg Laskin) (09/21/87)
In article <2474@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>In article <5391@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes:
>>All this stuff gets very interesting when you consider what happens on
>>an 80286 in its native 'protected' mode ...
>>    for( p = &foo[9]; p >= foo; --p ){ /* loops forever */
>
>OK, X3J11, the ball's in your court. Do we teach every C programmer
>on every architecture in the world that decrementing pointer loops are
>a no-no, and break half the code in existence, or do we finally bite
>the bullet and decide that compiler writers for brain dead
>architectures, and not the whole C community, pay the penalty for bad
>hardware designs? Especially since these folks are often the
>perpetrators of the bad hardware design.
>

Given:
    extern struct FOO foo[];
Assume:
    sizeof(struct FOO) == n+1
    foo == (struct FOO *) n
where n is any address in any linear address space.

Does the example code loop forever? If so, does this mean that linear address spaces are brain-dead or that the code is broken?

-- 
Greg Laskin
"When everybody's talking and nobody's listening, how can we decide?"
INTERNET: greg@gryphon.CTS.COM
UUCP: {hplabs!hp-sdd, sdcsvax, ihnp4}!crash!gryphon!greg
UUCP: {philabs, scgvaxd}!cadovax!gryphon!greg
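The wrap-around behind both Gregory Smith's 0xfffe offset and the question above can be modelled with plain unsigned arithmetic. The sketch below assumes a hypothetical machine whose addresses (or segment offsets) are 16 bits wide; elem_addr() and before_start_compares_high() are illustrative names, and no real pointers are involved.

```c
#include <stdint.h>

/*
 * Address of element [index] of an array starting at 'base', on a
 * hypothetical machine where address arithmetic wraps modulo 65536.
 * The work is done in uint32_t so the multiplication cannot overflow
 * a signed int; the final cast keeps only the low 16 bits.
 */
uint16_t elem_addr(uint16_t base, uint16_t elem_size, int index)
{
    uint32_t offset = (uint32_t)index * (uint32_t)elem_size;

    return (uint16_t)((uint32_t)base + offset);
}

/* Nonzero when element [-1] compares >= element [0], i.e. "p >= foo" never fails. */
int before_start_compares_high(uint16_t base, uint16_t elem_size)
{
    return elem_addr(base, elem_size, -1) >= elem_addr(base, elem_size, 0);
}
```

With base 0 and 2-byte elements, element [-1] lands at 0xfffe, as in the 80286 example; with base n and element size n+1, it lands at 0xffff, so the same wrap occurs in a linear address space.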
gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/21/87)
In article <48400001@tub.UUCP> cabo@tub.UUCP writes:
>    (char *)&((struct foo *)0)->bar - (char *)&((struct foo *)0)->baz
>for computing the relative offset of two structure members in character
>sized units. I'm not saying that the above is a constant expression
>according to the wording of the draft (I don't have it, unfortunately),
>but I would like it to be one.

Unfortunately, X3J11 cannot guarantee that such use of null pointers would be portable. However, they have specified an offsetof() macro to accomplish what you're trying to do; it's still being debated and may change or (unlikely) vanish before the second public review.
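A sketch of the same computation using offsetof() in the form it takes in <stddef.h> today. The member names bar and baz come from the quoted expression; the layout of struct foo and the helper member_distance() are assumptions made up for illustration.

```c
#include <stddef.h>

/* Hypothetical layout; only the member names appear in the quoted post. */
struct foo {
    int    baz;
    char   pad;
    double bar;
};

/* Distance in bytes from member baz to member bar, with no null-pointer arithmetic. */
ptrdiff_t member_distance(void)
{
    return (ptrdiff_t)offsetof(struct foo, bar)
         - (ptrdiff_t)offsetof(struct foo, baz);
}
```

Because baz is the first member, its offset is zero and the result equals offsetof(struct foo, bar); the exact value depends on the implementation's padding.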
gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/21/87)
In article <2474@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>OK, X3J11, the ball's in your court. Do we teach every C programmer
>on every architecture in the world that decrementing pointer loops are
>a no-no, and break half the code in existence, or do we finally bite
>the bullet and decide that compiler writers for brain dead
>architectures, and not the whole C community, pay the penalty for bad
>hardware designs? Especially since these folks are often the
>perpetrators of the bad hardware design.

A pointer to the [-1]st element of an array may well have an address CONSIDERABLY smaller than that of the [0]th element, if the array elements are large. This is not a problem for the [N+1]st element. Therefore, there is only a small penalty in requiring implementations to ensure that [N+1] pointers have valid addresses (generally only wastes at most one word of "slop" space per segment), but there would be an unacceptably large penalty with requiring that a pointer to the [-1]st element of an array have a valid address.

This was in fact discussed by X3J11 and general agreement reached to require [N+1] pointer validity, but not [-1] pointer validity. This permits one common form of somewhat sloppy coding, but not the other.

Please note that the outlawed form was NEVER safe; I've seen it break in a PDP-11 implementation of bsearch() (due to data space address wrap-around), for example. It is not within X3J11's power to "somehow" make the unworkable work.

And yes, please teach every C programmer in the world how to write reliable code. Thanks in advance.
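As an illustration of staying inside the permitted range, here is a sketch of a binary search over a half-open interval [lo, hi). It is not the PDP-11 bsearch() mentioned above; find_int() is a hypothetical helper restricted to sorted ints, written so that the only out-of-array pointer it ever forms is the one just past the end.

```c
#include <stddef.h>

/*
 * Search the sorted array base[0..n-1] for key.
 * Invariant: base <= lo <= hi <= base + n, so no pointer before
 * base[0] is ever computed, and base + n is compared but not read.
 */
const int *find_int(const int *base, size_t n, int key)
{
    const int *lo = base;
    const int *hi = base + n;

    while (lo < hi) {
        const int *mid = lo + (hi - lo) / 2;   /* lo <= mid < hi */

        if (*mid < key)
            lo = mid + 1;
        else if (*mid > key)
            hi = mid;
        else
            return mid;
    }
    return NULL;
}
```

A closed-interval version that sets hi = mid - 1 can step to base - 1 when the key is smaller than every element, which is the kind of [-1] pointer the post describes as never safe.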
gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/21/87)
In article <10758@orchid.waterloo.edu> rbutterworth@orchid.waterloo.edu (Ray Butterworth) writes:
>Using
>    for (p = &foo[9]; p != foo; p--)
>gets around this problem of the incorrect inequality.

    for ( p = &foo[LIMIT]; p-- != foo; )

would be more correct, since it includes the case of &foo[0] and excludes the case right after the end of the array.

>The question is, does an expression, such as "p--", that generates
>an illegal address violate the standard if that address is never
>used as an address?

Not according to current wording. This was reaffirmed at the Framingham meeting, in the course of revising related wording.
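In the loop above, the decrement in "p-- != foo" still runs on the final test, so p finishes one element before foo[0], though it is never dereferenced there. For readers who would rather not form that pointer at all, here is a sketch of a variant that visits the same elements and stops with p equal to foo. The helper find_last() is hypothetical.

```c
#include <stddef.h>

/*
 * Return a pointer to the last element of foo[0..limit-1] equal to key,
 * or NULL if there is none. p starts one past the end and is decremented
 * only after it has been checked against foo.
 */
const int *find_last(const int *foo, size_t limit, int key)
{
    const int *p = foo + limit;

    while (p != foo) {
        --p;
        if (*p == key)
            return p;
    }
    return NULL;
}
```

Like the loop in the post, this includes foo[0] and excludes the position right after the end of the array.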