pfeiffer@herve.cs.wisc.edu (Phil Pfeiffer) (07/06/88)
Before other comp.compilers readers are quick to point out that my posting
about C's semantics was not totally accurate: when I posted that C's semantic
model allowed unconstrained use of pointers, I said this based on my
experience with the Unix C compiler, and did not double-check K&R before
posting. My mistake.

I received two communiques today from Bob Larson
(blarson%skat.usc.edu@oberon.usc.edu) that I'd like to pass along (with his
permission) before other comp.compilers readers correct me as well.

> But C does constrain pointer arithmetic to the bounds of the array.
> (ANSI will allow the address following the array to be calculated but
> not referenced.) Most compilers don't enforce this, but it is there
> in K&R ....
>
>K&R 1, page 98:
>"But all bets are off if you do arithmetic or comparisons with pointers
>pointing to different arrays. If you're lucky, you'll get obvious
>nonsense on all machines. If you're unlucky, your code will work on one
>machine but collapse mysteriously on another."
>
>This doesn't seem to be restated in the reference manual section.
>
>My copy of K&R 2 is elsewhere, but I'm pretty sure the restriction still
>holds. (My info on ANSI C is mostly from comp.lang.c and comp.std.c,
>so is less than perfectly reliable.)

Also, on page 90 of K&R (version 1):

"You should also note the implications in the declaration that a pointer is
constrained to point to a particular kind of object."

I guess this is why formal language specification and compiler validation
were invented.

-- Phil
[From Phil Pfeiffer <pfeiffer@herve.cs.wisc.edu>]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
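The rule Bob Larson quotes can be sketched in a few lines of C. This is only an illustration, not anything from K&R or the ANSI draft: the helper names sum_ints and index_of are made up for the example. It shows the two things the rule allows: forming (but not dereferencing) the address just past an array, and subtracting or comparing pointers that point into the same array.

```c
#include <stddef.h>

/* Hypothetical helper: sum the n ints starting at a[0].
   `end` is the address just past the array. It may be computed and
   compared against, but it is never dereferenced. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;      /* one past the last element: legal to form */
    const int *p;
    long total = 0;

    for (p = a; p != end; p++)   /* p and end both belong to the same array */
        total += *p;             /* p != end here, so *p is inside the array */
    return total;
}

/* Hypothetical helper: the distance between two pointers is meaningful
   only when both point into the same array, as they must here. */
ptrdiff_t index_of(const int *base, const int *elem)
{
    return elem - base;
}
```

For an array declared as int v[4] = {1, 2, 3, 4}, sum_ints(v, 4) walks v[0] through v[3] and stops when p reaches v + 4 without ever reading that location. Passing index_of two pointers into different arrays is exactly the "all bets are off" case in the K&R quotation.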
louie@trantor.umd.edu (Louis A. Mamakos) (07/06/88)
I use a compiler which really enforces the rule that two pointers being
compared (or operated on together) must point at the same object. The
compiler is for a Unisys 1100 mainframe, which is a ones-complement, 36-bit,
word-addressable machine. This thing is a walking validation test for
well-written C programs. Its pointers are generally 2 words (8 9-bit bytes)
long, except for pointers to functions, which are 8 words (64 9-bit bytes)
long.

Even with all of the obvious adversities, it is not difficult to write C
code. You have to watch out for code which thinks that pointers can be put
into ints (or longs) and then back again, but other than that, things work.
You will also see problems like this, I suspect, on 8086-type segmented
architectures. Or so it would seem; I don't use such things.

Now, if I can just convince people that

    -1 <> ~0

You see, on a ones-complement computer you have both +0 and -0, and

    -0 == ~0

Fun stuff.

Louis A. Mamakos WA3YMH Internet: louie@TRANTOR.UMD.EDU
University of Maryland, Computer Science Center - Systems Programming
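The point about -1 and ~0 can be checked on an ordinary two's-complement machine by simulating the arithmetic. The sketch below is an assumption-laden model, not Unisys code: it keeps a 36-bit ones-complement word in the low 36 bits of a uint64_t, and the names OC_MASK, oc_neg, oc_not and oc_value are invented for the illustration.

```c
#include <stdint.h>

/* Assumption for illustration: a 36-bit ones-complement word lives in
   the low 36 bits of a uint64_t. All names here are hypothetical. */
#define OC_MASK ((uint64_t)0xFFFFFFFFFULL)      /* 36 one bits */
#define OC_SIGN ((uint64_t)1 << 35)             /* sign bit of the word */

/* Arithmetic negation in ones complement: flip every bit. */
uint64_t oc_neg(uint64_t w) { return ~w & OC_MASK; }

/* Bitwise NOT, as the C operator ~ would act on such a word.
   It is the same operation as negation, which is the whole point. */
uint64_t oc_not(uint64_t w) { return ~w & OC_MASK; }

/* Numeric value of a word; +0 (all zeros) and -0 (all ones) both give 0. */
int64_t oc_value(uint64_t w)
{
    if (w & OC_SIGN)                            /* negative: undo the flip */
        return -(int64_t)(~w & OC_MASK);
    return (int64_t)w;
}
```

In this model oc_not(0) and oc_neg(0) are the same all-ones word, which is negative zero, while oc_neg(1) is all ones except the lowest bit. Code that writes ~0 when it means -1 therefore gets a word whose value is zero.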
mrspock@hubcap.clemson.edu (Steve Benz) (07/06/88)
From article <1262@ima.ISC.COM>, by pfeiffer@herve.cs.wisc.edu (Phil Pfeiffer):
> ...my posting about C's semantics was not totally accurate...
>
>>K&R 1, page 98:
>>"But all bets are off if you do arithmetic or comparisons with pointers
>>pointing to different arrays. If you're lucky, you'll get obvious
>>nonsense on all machines. If you're unlucky, your code will work on one
>>machine but collapse mysteriously on another."

To combine this with what was said before:

An optimizer can assume that once a pointer, 'p', is assigned an address
within an array, 'a', 'p' will stay within the bounds of 'a' until it is
assigned an address within some other array or in the heap space.

While this assumption is valid (according to the book), it is also true
that the optimized code may behave differently from the unoptimized code,
but only for programs that are not in the domain of valid K&R C programs.

Nevertheless, I can think of a rather large set of programs that
aren't "valid K&R C programs": all those programs that use varargs.
(At least by the definitions of varargs that I've seen.)

> Also, on page 90 of K&R (version 1):
> "You should also note the implications in the declaration that a pointer is
> constrained to point to a particular kind of object."

I think a C compiler would be very hard pressed to guarantee this sort
of thing, with the cast operation lurking about. The only way to absolutely
guarantee this is to do some unpleasantly complicated typechecking on
every object in memory.

Insofar as mainstream C compilers go, I think you have to accept
Phil's original assertion about pointer operations -- at least within
the heap space. If you can circumvent the varargs problem, you might
be able to get by with the K&R definition in the stack space.

- Steve Benz
daveb@geac.uucp (David Collier-Brown) (07/13/88)
In article <2109@hubcap.UUCP> mrspock@hubcap.clemson.edu (Steve Benz) writes:
[discussion of binding of pointers to arrays/heap]
|Nevertheless, I can think of a rather large set of programs that
|aren't "valid K&R C programs": All those programs that use varargs.
|(At least by the definitions of varargs that I've seen.)

Yes, it is hard to make any useful assumptions about a pointer being
used for varargs processing, or for wandering through a core/executable
file in a debugger (the other "fun" example I run into often).

|> Also, on page 90 of K&R (version 1):
|> "You should also note the implications in the declaration that a pointer is
|> constrained to point to a particular kind of object."
|
|I think a C compiler would be very hard pressed to guarantee this sort
|of thing, with the cast operation lurking about. The only way to absolutely
|guarantee this is to do some unpleasantly complicated typechecking on
|every object in memory.

In some sense, that's what an optimizing compiler tends to have to do.
In tightly-typed languages, there's more information about the use of a
pointer (and syntactic sugar), but one can extract the required
information from C in many of the common cases, based on the
**particular** optimization being applied.

Let's look at the example that started this discussion thread: a
pointer in a copy routine which, because it can point anywhere, is
proposed to invalidate my register history... (I'm assuming that the
optimization in question is fetch minimization).

If the pointer is kept "within bounds" by the copy routine, then we
can assume that passing the pointer to the routine has exactly the same
effect as a local assignment through the pointer.

If the pointer is not being kept within bounds, both the
non-optimized copy code and the optimized calling code fail. But they
fail in a manner which does not involve the optimization! If a function
scribbles on memory, and I'm trying to do fetch minimization from
memory, it doesn't matter if the fetch minimization is rendered
incorrect, because the fetch **itself** has been rendered incorrect by
the scribbler.

You can apply this same kind of argument to a number of common cases
of optimization in the presence of errors, to discover which
optimizations are independent of/orthogonal to the error... and apply
more optimizations than expected at first glance. (In a real sense,
you're doing complicated mental typechecking on the operations when you
write the optimizer.)

In my global pessimizer, I assume all registers and all stack entries
are invalid at all times, and so generate code which depends only on
read-only data in the linkage segment and random numbers in the heap.

--dave (whats a correctness?) c-b
--
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  | Computer science loses its
 350 Steelcase Road,   | memory, if not its mind,
 Markham, Ontario.     | every six months.
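The fetch-minimization argument can be made concrete with a small C sketch. Everything in it is hypothetical (the names copy_ints, buf, limit and twice_limit, and the sizes), and whether a given compiler actually performs this transformation is an assumption, not something stated in the thread.

```c
#include <stddef.h>

int limit = 10;   /* a variable the caller would like to keep in a register */
int buf[4];       /* the array handed to the copy routine */

/* Hypothetical copy routine: stores only into dst[0] .. dst[n-1]. */
void copy_ints(int *dst, const int *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Reads `limit` before and after the copy. */
int twice_limit(const int *src, size_t n)
{
    int before = limit;       /* first fetch of limit */

    copy_ints(buf, src, n);   /* in bounds when n <= 4: touches only buf */

    /* Second fetch. If the copy stayed inside buf, limit is unchanged and
       a fetch-minimizing compiler could reuse `before` here. If the copy
       ran past buf, the program was already wrong without the optimizer. */
    return before + limit;
}
```

With n no larger than 4, the reloaded value and the remembered value agree, so reusing the register copy changes nothing. With a larger n the copy routine scribbles past buf, and both the optimized and unoptimized versions are broken by the scribbling itself, which is the orthogonality being argued for above.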