dfp@cbnewsl.ATT.COM (david.f.prosser) (06/20/89)
In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
}
}Consider the following code fragment:
}
}    b = (char *) malloc(n);
}    c = b + x;
}...
}    t = b;
}    b = (char *) realloc(b, m);
}/*1*/ i = b - t;
}    c += i;
}
}The idea is that c keeps pointing to the same thing.
}Is this guaranteed to work?  I think not:
}pointer subtraction assumes the pointers point to
}the same structure, which b and t don't (unless pANS
}says something about realloc in this context?).
}And indeed, it may fail with Turbo C and probably any 80x86 C with
}large data models.  (The problem came up when porting Gnu grep to
}ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)
}

Any use of the value of t after the realloc call causes undefined behavior.
Section 4.10.3 of the pANS:

    The value of a pointer that refers to freed space is indeterminate.

And the definition of undefined behavior (section 1.6):

    ... behavior, upon use ... of indeterminately-valued objects ...

The point is that the only portable way of doing the above is to calculate
the offset of c from b prior to the realloc.

}
}Then how about this:
}
}/*2*/ c = b + (c - t);
}
}Is this guaranteed to work, or is the compiler free to rearrange it as
}
}    c = (b - t) + c;

"c - t" is undefined already.  Thus the compiler is free to do anything
with this expression.  Assuming that it were valid, the rearrangement
is not permitted if such were to make it invalid.  The translation of
the C code must behave as if the abstract machine were actually to
execute the code as you wrote it.

}
}even though b - t is illegal (and fails)?
}
}I know it can be done safely like this:
}
}    i = c - t;
}    c = b + i;
}
}which is what I did, but I'd like to know what pANS says about /*2*/.

Only if the realloc comes between the first and the second statement.

}
}--
}Tapani Tarvainen                 BitNet:    tarvainen@finjyu
}Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi

Dave Prosser    ...not an official X3J11 answer...
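A minimal sketch of the offset-first approach described above, assuming a
byte buffer with a single interior pointer.  The helper name
resize_keep_cursor and its return convention are illustrative only and do
not come from the thread; the essential part is that the subtraction happens
while the old pointer is still valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the thread): resize *buf to newsize bytes
 * and keep *cursor at the same byte offset within the buffer.
 * Returns 0 on success, -1 if realloc fails; on failure both pointers
 * are left unchanged and remain valid.
 */
int resize_keep_cursor(char **buf, char **cursor, size_t newsize)
{
    ptrdiff_t offset;
    char *grown;

    assert(buf != NULL && cursor != NULL && *buf != NULL);
    assert(newsize > 0);

    /* Take the offset while *buf is still a valid pointer. */
    offset = *cursor - *buf;
    assert(offset >= 0 && (size_t)offset <= newsize);

    grown = realloc(*buf, newsize);
    if (grown == NULL)
        return -1;      /* old block untouched */

    /* From here on the old value of *buf is never read again. */
    *buf = grown;
    *cursor = grown + offset;
    return 0;
}
```

A caller would pass the addresses of its own b and c, for example
resize_keep_cursor(&b, &c, m), and check the return value before using c.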
karl@haddock.ima.isc.com (Karl Heuer) (06/20/89)
In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>[c marks a place in buffer b, which we want to realloc; old value in t]
>/*2*/ c = b + (c - t);
>Is this guaranteed to work, or is the compiler free to rearrange it as
>    c = (b - t) + c;
>even though b - t is illegal (and fails)?

Yes, this is guaranteed to work; parens must be honored in ANSI C.  Any such
rearrangement is now legal only via the as-if rule, which requires that the
rewrites be transparent to the user.  Thus, this optimization would be legal
for a compiler on a flat architecture (e.g. a pdp11 or vax), but not for a
segmented machine if the value (b-t) is not representable.

Similarly, lacking information about the possible values of j and k, an
integer expression like "i=(i-j)+k" cannot be optimized into "i+=(k-j)"
unless the integer-overflow trap is disabled.  (Which it usually is.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
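The integer case can be made concrete with a small checker.  This is only a
sketch: the function names are hypothetical, and it assumes long long is
wider than int so that the intermediate results can be inspected without
overflowing themselves.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if v is representable as an int, else 0. */
int fits_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Does every intermediate of (i - j) + k, evaluated as written, fit in int? */
int written_form_ok(int i, int j, int k)
{
    long long diff;

    assert(sizeof(long long) > sizeof(int));   /* assumption of this sketch */
    diff = (long long)i - j;
    return fits_int(diff) && fits_int(diff + k);
}

/* Does every intermediate of the rearranged i + (k - j) fit in int? */
int rearranged_form_ok(int i, int j, int k)
{
    long long diff;

    assert(sizeof(long long) > sizeof(int));   /* assumption of this sketch */
    diff = (long long)k - j;
    return fits_int(diff) && fits_int(i + diff);
}
```

With i = -10, j = -1 and k = INT_MAX, the written form stays in range at
every step, while the rearranged form needs the intermediate k - j, which is
INT_MAX + 1.  On a machine that traps on integer overflow the two forms are
therefore distinguishable, which is why the as-if rule forbids the rewrite
there.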
karl@haddock.ima.isc.com (Karl Heuer) (06/21/89)
In article <844@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>Any use of the value of t after the realloc call causes undefined behavior.

This is an important point, which I completely ignored in my own followup
(which was concerned with the order-of-evaluation question).

Clearly it's meaningless to attempt to dereference a |free|'d pointer
(including the old value of a |realloc|'d pointer); but what's not as well
known is that you can't reliably do *anything* with that value anymore -- not
even copy it into a new variable, or compare it with |NULL|.

This allows for an implementation on a segmented architecture to have |malloc|
allocate a new segment from the system, and |free| return it.  If the hardware
distinguishes between arithmetic registers and address registers, and if
loading a bogus segment address into an address register causes a hardware
trap, then bad things could happen if the user does anything with a |free|'d
pointer.  So, in order to not place an undue burden on such implementations,
the pANS labels this as undefined behavior.

Hence, the correct way to synchronize mid-array pointers is:

    /* |b| is a |malloc|'d buffer; |c| points somewhere inside */
    something *oldb = b;
    ptrdiff_t dist = c - b;
    if ((b = (something *)realloc((void *)b, newsize)) == NULL) {
        b = oldb;  /* oldb is still valid, since |realloc| failed */
        fprintf(stderr, "sorry, no more space\n");
    } else {
        c = b + dist;
    }

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (07/22/89)
Consider the following code fragment:

    b = (char *) malloc(n);
    c = b + x;
...
    t = b;
    b = (char *) realloc(b, m);
/*1*/ i = b - t;
    c += i;

The idea is that c keeps pointing to the same thing.
Is this guaranteed to work?  I think not:
pointer subtraction assumes the pointers point to
the same structure, which b and t don't (unless pANS
says something about realloc in this context?).
And indeed, it may fail with Turbo C and probably any 80x86 C with
large data models.  (The problem came up when porting Gnu grep to
ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)

Then how about this:

/*2*/ c = b + (c - t);

Is this guaranteed to work, or is the compiler free to rearrange it as

    c = (b - t) + c;

even though b - t is illegal (and fails)?

I know it can be done safely like this:

    i = c - t;
    c = b + i;

which is what I did, but I'd like to know what pANS says about /*2*/.

--
Tapani Tarvainen                 BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi
dfp@cbnewsl.ATT.COM (david.f.prosser) (08/02/89)
In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>
>Consider the following code fragment:
>
>    b = (char *) malloc(n);
>    c = b + x;
>...
>    t = b;
>    b = (char *) realloc(b, m);
>/*1*/ i = b - t;
>    c += i;
>
>The idea is that c keeps pointing to the same thing.
>Is this guaranteed to work?  I think not:
>pointer subtraction assumes the pointers point to
>the same structure, which b and t don't (unless pANS
>says something about realloc in this context?).

The pANS says that this is invalid for a much more fundamental reason:
After a realloc call, the "old" pointer value is indeterminate.  To
make any use of the value causes undefined behavior.

The only valid means of relocating pointers after a realloc is to
compute the distance from the beginning of the allocated block *before*
the realloc call.

>And indeed, it may fail with Turbo C and probably any 80x86 C with
>large data models.  (The problem came up when porting Gnu grep to
>ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)
>
>
>Then how about this:
>
>/*2*/ c = b + (c - t);
>
>Is this guaranteed to work, or is the compiler free to rearrange it as
>
>    c = (b - t) + c;
>
>even though b - t is illegal (and fails)?

This expression fails for the same reason as the first.  However, the
pANS says that the program must behave as if the abstract machine
were executing the code exactly as written.  Thus, only benign
rearrangements of expressions are allowed.  There is no real difference
here though, since the behavior is undefined.

>
>I know it can be done safely like this:
>
>    i = c - t;
>    c = b + i;
>
>which is what I did, but I'd like to know what pANS says about /*2*/.

Assuming this occurs *after* the realloc call, it cannot be done safely
this way.  But your question is whether an expression such as

    int i;
    char *c, *t, *b;

    c = b + (c - t);

can be rearranged to be

    c += b - t;

by a valid ANSI C compiler.  The answer is "maybe", but only if it makes
no detectable difference to the program.  Writing the expression instead as

    i = c - t;
    c = b + i;

forces the assignment to i before the assignment to c, but is otherwise
indistinguishable from the first form, except as regards volatile
objects, interrupts, and so forth.

Dave Prosser    ...not an official X3J11 answer...
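One way to make the "distance before the realloc" rule hard to violate is to
store the mark as an offset in the first place, so that no pointer into the
old block survives the call.  The sketch below assumes a byte buffer; the
struct and function names are hypothetical and are not taken from the thread
or from Gnu grep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer type: the mark is kept as an offset, never as a pointer. */
struct marked_buf {
    char  *base;   /* start of the malloc'd block */
    size_t size;   /* current size in bytes */
    size_t mark;   /* offset of the marked byte, mark <= size */
};

/* Resize the block.  Returns 0 on success, -1 on failure (buffer unchanged). */
int marked_buf_resize(struct marked_buf *mb, size_t newsize)
{
    char *grown;

    assert(mb != NULL && mb->base != NULL);
    assert(newsize > 0 && mb->mark <= newsize);

    grown = realloc(mb->base, newsize);
    if (grown == NULL)
        return -1;

    mb->base = grown;   /* the old base value is never read again */
    mb->size = newsize;
    return 0;
}

/* Form the pointer only when it is needed, from the current base. */
char *marked_buf_ptr(const struct marked_buf *mb)
{
    assert(mb != NULL && mb->base != NULL);
    return mb->base + mb->mark;
}
```

The trade-off is an extra addition on each access in exchange for never
having to relocate anything after a resize.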