[comp.std.c] pointers & order of execution

dfp@cbnewsl.ATT.COM (david.f.prosser) (06/20/89)

In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
}
}Consider the following code fragment:
}
}	b = (char *) malloc(n);
}	c = b + x;
}...
}	t = b;
}	b = (char *) realloc(b, m);
}/*1*/	i = b - t;
}	c += i;
}
}The idea is that c keeps pointing to the same thing.
}Is this guaranteed to work?  I think not:
}pointer subtraction assumes the pointers point to
}the same structure, which b and t don't (unless pANS
}says something about realloc in this context?).
}And indeed, it may fail with Turbo C and probably any 80x86 C with
}large data models.  (The problem came up when porting Gnu grep to
}ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)
}

Any use of the value of t after the realloc call causes undefined behavior.

Section 4.10.3 of the pANS:

	The value of a pointer that refers to freed space is indeterminate.

And the definition of undefined behavior (section 1.6):

	... behavior, upon use ... of indeterminately-valued objects ...

The point is that the only portable way of doing the above is to calculate
the offset of c from b prior to the realloc.

}
}Then how about this:
}
}/*2*/	c = b + (c - t);
}
}Is this guaranteed to work, or is the compiler free to rearrange it as
}
}	c = (b - t) + c;

"c - t" is undefined already.  Thus the compiler is free to do anything with
this expression.  Even if it were valid, the compiler could not rearrange it
in a way that made it invalid.  The translation of the C code must behave as
if the abstract machine were actually executing the code as you wrote it.

}
}even though b - t is illegal (and fails)?
}
}I know it can be done safely like this:
}
}	i = c - t;
}	c = b + i;
}
}which is what I did, but I'd like to know what pANS says about /*2*/.

Only if the realloc comes between the first and the second statement.

}
}-- 
}Tapani Tarvainen                 BitNet:    tarvainen@finjyu
}Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi

Dave Prosser	...not an official X3J11 answer...

karl@haddock.ima.isc.com (Karl Heuer) (06/20/89)

In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>[c marks a place in buffer b, which we want to realloc; old value in t]
>/*2*/	c = b + (c - t);
>Is this guaranteed to work, or is the compiler free to rearrange it as
>	c = (b - t) + c;
>even though b - t is illegal (and fails)?

Yes, this is guaranteed to work; parens must be honored in ANSI C.  Any such
rearrangement is now legal only via the as-if rule, which requires that the
rewrites be transparent to the user.  Thus, this optimization would be legal
for a compiler on a flat architecture (e.g. a pdp11 or vax), but not for a
segmented machine if the value (b-t) is not representable.

Similarly, lacking information about the possible values of j and k, an
integer expression like "i=(i-j)+k" cannot be optimized into "i+=(k-j)" unless
the integer-overflow trap is disabled.  (Which it usually is.)
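
To make the integer case concrete, here is a small sketch that checks whether
each intermediate result of the two forms stays within int range.  The helper
names are hypothetical, and the check assumes that long long (a later addition
to the language) is wider than int on the target.

```c
#include <limits.h>

/* Returns 1 if v is representable as an int, 0 otherwise. */
static int fits_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Does every step of (i - j) + k stay within int range? */
static int written_form_ok(int i, int j, int k)
{
    long long step = (long long)i - j;      /* the parenthesized part */
    return fits_int(step) && fits_int(step + k);
}

/* Does every step of i + (k - j) stay within int range? */
static int rearranged_form_ok(int i, int j, int k)
{
    long long step = (long long)k - j;      /* the regrouped part */
    return fits_int(step) && fits_int((long long)i + step);
}
```

With i = -2, j = -1 and k = INT_MAX, the form as written never leaves int
range, while the regrouped form needs the intermediate value INT_MAX + 1.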

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

karl@haddock.ima.isc.com (Karl Heuer) (06/21/89)

In article <844@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>Any use of the value of t after the realloc call causes undefined behavior.

This is an important point, which I completely ignored in my own followup
(which was concerned with the order-of-evaluation question).  Clearly it's
meaningless to attempt to dereference a |free|'d pointer (including the old
value of a |realloc|'d pointer); but what's not as well known is that you
can't reliably do *anything* with that value anymore -- not even copy it into
a new variable, or compare it with |NULL|.

This allows for an implementation on a segmented architecture to have |malloc|
allocate a new segment from the system, and |free| return it.  If the hardware
distinguishes between arithmetic registers and address registers, and if
loading a bogus segment address into an address register causes a hardware
trap, then bad things could happen if the user does anything with a |free|'d
pointer.  So, in order to not place an undue burden on such implementations,
the pANS labels this as undefined behavior.

Hence, the correct way to synchronize mid-array pointers is:
	/* |b| is a |malloc|'d buffer; |c| points somewhere inside */
	something  *oldb = b;
	ptrdiff_t   dist = c - b;
	if ((b = (something *)realloc((void *)b, newsize)) == NULL) {
		b = oldb;  /* oldb is still valid, since |realloc| failed */
		fprintf(stderr, "sorry, no more space\n");
	} else {
		c = b + dist;
	}
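
For anyone who wants to try this, the same technique can be packaged as a
small self-contained function.  This is only a sketch: the name
grow_keep_cursor and the choice of char buffers are assumptions made for
illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resize *bufp to newsize bytes and keep *curp at the
 * same offset from the start of the block.  Returns 0 on success, -1 on
 * failure; on failure *bufp and *curp are unchanged and still valid.
 * Assumes *curp points into the block and its offset is below newsize.
 */
static int grow_keep_cursor(char **bufp, char **curp, size_t newsize)
{
    /* Take the offset BEFORE realloc, while both pointers are valid. */
    ptrdiff_t dist = *curp - *bufp;
    char *newbuf = realloc(*bufp, newsize);

    if (newbuf == NULL)
        return -1;              /* the old block has not been freed */

    *bufp = newbuf;
    *curp = newbuf + dist;      /* the old pointer value is never used */
    return 0;
}
```

A caller would pass the addresses of its own b and c, for example
grow_keep_cursor(&b, &c, newsize), and report the error if it returns -1.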

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (07/22/89)

Consider the following code fragment:

	b = (char *) malloc(n);
	c = b + x;
...
	t = b;
	b = (char *) realloc(b, m);
/*1*/	i = b - t;
	c += i;

The idea is that c keeps pointing to the same thing.
Is this guaranteed to work?  I think not:
pointer subtraction assumes the pointers point to
the same structure, which b and t don't (unless pANS
says something about realloc in this context?).
And indeed, it may fail with Turbo C and probably any 80x86 C with
large data models.  (The problem came up when porting Gnu grep to
ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)


Then how about this:

/*2*/	c = b + (c - t);

Is this guaranteed to work, or is the compiler free to rearrange it as

	c = (b - t) + c;

even though b - t is illegal (and fails)?

I know it can be done safely like this:

	i = c - t;
	c = b + i;

which is what I did, but I'd like to know what pANS says about /*2*/.

-- 
Tapani Tarvainen                 BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi

dfp@cbnewsl.ATT.COM (david.f.prosser) (08/02/89)

In article <921@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>
>Consider the following code fragment:
>
>	b = (char *) malloc(n);
>	c = b + x;
>...
>	t = b;
>	b = (char *) realloc(b, m);
>/*1*/	i = b - t;
>	c += i;
>
>The idea is that c keeps pointing to the same thing.
>Is this guaranteed to work?  I think not:
>pointer subtraction assumes the pointers point to
>the same structure, which b and t don't (unless pANS
>says something about realloc in this context?).

The pANS says that this is invalid for a much more fundamental reason:
After a realloc call, the "old" pointer value is indeterminate.  To
make any use of the value causes undefined behavior.  The only valid
means of doing relocation of pointers after a realloc is to compute
the distance from the beginning of the allocated block *before* the
realloc call.
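
One way to apply this rule throughout a program is to store the position as
an offset rather than as a pointer, so that nothing needs relocating after
the realloc.  The following is a sketch under that assumption; struct buf and
its functions are hypothetical names, and newsize is assumed to be nonzero.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer type: the cursor is kept as a distance from base. */
struct buf {
    char  *base;    /* start of the allocated block */
    size_t size;    /* current size of the block in bytes */
    size_t cursor;  /* offset of the marked position from base */
};

/* Resize the block; returns 0 on success, -1 if realloc fails. */
static int buf_resize(struct buf *b, size_t newsize)
{
    char *p = realloc(b->base, newsize);

    if (p == NULL)
        return -1;              /* b is unchanged and still valid */

    b->base = p;
    b->size = newsize;
    if (b->cursor > newsize)
        b->cursor = newsize;    /* clamp when the block shrinks */
    return 0;
}

/* Form the pointer only when needed, from the current base. */
static char *buf_cursor(const struct buf *b)
{
    return b->base + b->cursor;
}
```

Because the offset is an ordinary integer, it stays meaningful no matter
where realloc places the block.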

>And indeed, it may fail with Turbo C and probably any 80x86 C with
>large data models.  (The problem came up when porting Gnu grep to
>ms-dos.  See article <920@tukki.jyu.fi> in gnu.utils.bug for details.)
>
>
>Then how about this:
>
>/*2*/	c = b + (c - t);
>
>Is this guaranteed to work, or is the compiler free to rearrange it as
>
>	c = (b - t) + c;
>
>even though b - t is illegal (and fails)?

This expression fails for the same reason as the first.  However, the pANS
says that the program must behave as if the abstract machine were executing
the code exactly as written.  Thus, only benign rearrangements of expressions
are allowed.  There is no real difference here though, since the behavior is
undefined.

>
>I know it can be done safely like this:
>
>	i = c - t;
>	c = b + i;
>
>which is what I did, but I'd like to know what pANS says about /*2*/.

Assuming this occurs *after* the realloc call, it cannot be done safely
this way.

But your question is whether an expression such as

	int i; char *c, *t, *b;

	c = b + (c - t);

can be rearranged to be

	c += b - t;

by a valid ANSI C compiler.  The answer is "maybe", but only if it makes
no detectable difference to the program.  Writing the expression instead as

	i = c - t; c = b + i;

forces the assignment to i before the assignment to c, but is not really
distinguishable from the first form except with regard to volatile objects,
interrupts, and so forth.

Dave Prosser	...not an official X3J11 answer...