[comp.lang.c] More on noalias

chris@mimsy.UUCP (Chris Torek) (01/12/88)

By now I imagine everyone knows that I am opposed to noalias.  This
is not yet another argument against `noalias' *per se*; rather, it
is an argument for a different mechanism---one that is cleaner (I
claim) and more useful.

First, a side point.  While I have not seen the proposed semantics
for the keyword `noalias', I suspect that it does not in fact mean
that some variable is not aliased.  Aliasing, after all, is not a
property of a single variable; aliasing occurs only between some
group of variables.  `*p' may alias `j' without aliasing `v', for
instance.  So the word `noalias' itself is wrong: it probably means
`this variable is not aliased with any other variable that is in
fact used in the same or any inner scope'.  (In other words, called
procedures may alias the variable, so long as the alias is gone,
or remains unused, by the time that procedure returns.)

But let us take for granted that some mechanism like `noalias' is
worthwhile---that if we can give the compiler a hint, it may be able
to produce much faster or smaller code.  What, then, is wrong with
`noalias'?  It is, I claim, the wrong hint---if not always, at least
often enough.  Among other things, it is hardly general.

First, let me construct an example in which aliasing is relatively
hard to detect.  How about this:

	void
	vector_add /* maybe */ (int *a, int i, int j, int k, int n)
	{

		for (; --n >= 0; i++, j++, k++)
			a[i] = a[j] + a[k];
	}

This can use a vector add iff i != j and i != k (on some hypothetical
machine).  Hence the compiler wants to know whether this will be
true.  By declaring `int *noalias a' or `noalias int *a' (whichever
it is), we could say that.  But we could say that much better this
way:

	void
	vector_add /* really */ (int *a, int i, int j, int k, int n)
	{

		for (; --n >= 0; i++, j++, k++) {
			compiler_hint(alias: i != j, alias: i != k);
			a[i] = a[j] + a[k];
		}
	}

Not only are we then saying what we mean, we have a condition we
can test!

		if (i == j || i == k)
			panic("vector_add called illegally");
		for (...) etc.

Of course, by putting in the `panic' and adding

	compiler_hint(flow: dead panic());
	void panic(char *s);

we could leave out the `alias:' hint too.  But if we have only
`noalias', we cannot hint to flow analysis that panic() never
returns.

There is a miniature raging debate in comp.arch going on at this
very minute over the Fortran `frequency' statement.  If there exist
machines on which the proper `frequency' statement might make some
function run ten times faster, but all we have in C is `noalias',
you may have to write that function in assembly.  Too bad you
cannot say

	compiler_hint(flow: probability i < 0: .001);

No matter what it is the compiler wants to know, if you have a
general mechanism, you can tell it.  With `noalias' you can only
tell it about aliasing.  To reword the `summary' line above, if we
must have compiler hints, let us at least make them general.  ANS
X3J11 could specify the form which some compiler hints should take,
specifically those involving aliasing.  (The form I used was made
up on the spur of the moment, and could stand improvement.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 ????)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

dsill@NSWC-OAS.arpa (Dave Sill) (01/13/88)

[Chris Torek suggests that a general mechanism for giving the compiler
hints is needed.  Further, he suggests that noalias could be replaced
by such a hint and that such a mechanism could also give flow control
hints.]

>...  Too bad you cannot say
>
>	compiler_hint(flow: probability i < 0: .001);

This looks like a prime pragma candidate to me:
	#pragma flow (probability, i<0, .001)

>No matter what it is the compiler wants to know, if you have a
>general mechanism, you can tell it.  With `noalias' you can only
>tell it about aliasing.

Unfortunately, as Doug Gwyn has already pointed out, pragmas can't be
used to replace `noalias'.  (I don't understand `noalias' well enough
myself to understand why)

>To reword the `summary' line above, if we
>must have compiler hints, let us at least make them general.  ANS
>X3J11 could specify the form which some compiler hints should take,
>specifically those involving aliasing.

What pragmas need to be useful is a procedure like that used by the
ISO to register escape sequences.  Rather than having X3J11 specify a
list of pragmas and their behavior (which they obviously won't do),
they could maintain a list of pragmas submitted by vendors.  This
would reduce the likelyhood of two vendors implementing the same
pragma but with different behavior.  E.g., "#pragma optimized" on
vendor A's compiler might mean "turn on optimizing" while on vendor
B's compiler it means "this code has been hand optimized, don't try to
improve it".

Perhaps pragmas were intended to be machine dependent, but I think
they could and should be more general.

========
The opinions expressed above are mine.

Which is worse: ignorance or apathy?  Who knows?  Who cares?

mike@arizona.edu (Mike Coffin) (01/13/88)

In article <10129@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> First, let me construct an example in which aliasing is relatively
> hard to detect.  How about this:
> 
> 	void
> 	vector_add /* maybe */ (int *a, int i, int j, int k, int n)
> 	{
> 
> 		for (; --n >= 0; i++, j++, k++)
> 			a[i] = a[j] + a[k];
> 	}
> 
> This can use a vector add iff i != j and i != k (on some hypothetical
> machine).  Hence the compiler wants to know whether this will be
> true.  By declaring `int *noalias a' or `noalias int *a' (whichever
> it is), we could say that.

I wish I knew for sure what noalias meant.  In any case, you can already
(in good old K&R C) write the above function in a way that allows vector
add:

 	void
 	vector_add /* maybe */ (int *a, int i, int j, int k, int n)
 	{
 		if (i==j || i==k) return;

		/* now the compiler can infer that i!=j and i!=k */
 		for (; --n >= 0; i++, j++, k++)
 			a[i] = a[j] + a[k];
 	}

If you don't like the way the if-statement looks, embed it in a macro;
e.g., "opt_assert(i!=j && i !=k)".
-- 

Mike Coffin				mike@arizona.edu
Univ. of Ariz. Dept. of Comp. Sci.	{allegra,cmcl2,ihnp4}!arizona!mike
Tucson, AZ  85721			(602)621-4252

dik@cwi.nl (Dik T. Winter) (01/13/88)

In article <3433@megaron.arizona.edu> mike@arizona.edu (Mike Coffin) writes:
 >  	void
 >  	vector_add /* maybe */ (int *a, int i, int j, int k, int n)
 >  	{
 >  		if (i==j || i==k) return;
 > 
 > 		/* now the compiler can infer that i!=j and i!=k */
 >  		for (; --n >= 0; i++, j++, k++)
 >  			a[i] = a[j] + a[k];
 >  	}
 > 
and supposes that this allows vectorization (just like the parent article
by Chris Torek).

First the conditions are wrong (if i equals both j and k, vectorization
is possible; no vectorization is possible if (j-n) < i < j, or
similar for i and k).  But that is beside the point; the equivalent
Fortran code will not vectorize on any machine that I know (Cray, CDC 205,
NEC SX/2, Fujitsu), because of this aliasing problem.  A better example is:
	void
	vector_add /* maybe */ (int *a, *b, *c, int n)
	{
	int i;
		for(i = 0; i < n; i++)
			a[i] = b[i] + c[i];
	}
This will vectorize in Fortran, because the standard states that aliasing
is not allowed here if a is involved.  It will not vectorize in C.
Adding a noalias for a will solve the problem (if I understand it right).

Of course the statement
		if((a > b && a < b+n) || (a > c && a < c+n)) exit(1);
(or EXIT_FAILURE) will help also, but the existence of such a statement
(or a similar one) is very difficult to check, because although in this case
the condition is extremely simple, it might be much more complex in
other cases. (What is the condition if the loop reads:
	int i, j;
		for(i = 0, j = 0; i < n; j += i, i++)
			a[j] = b[j] + c[j];
not very common, but also not unheard of.  It vectorizes on a number of
machines.)

Note that this example also clarifies Doug Gwyn's remark that noalias
changes the semantics.  Especially if the programmer lies.  (Without
noalias the result is well-defined, with it is undefined.)
-- 
dik t. winter, cwi, amsterdam, nederland
INTERNET   : dik@cwi.nl
BITNET/EARN: dik@mcvax

chris@mimsy.UUCP (Chris Torek) (01/13/88)

In article <3433@megaron.arizona.edu> mike@arizona.edu (Mike Coffin) writes:
>I wish I knew for sure what noalias meant.

So do I---it is hard to argue against something when no one will say
exactly what that something is :-) . . . .

>In any case, you can already (in good old K&R C) write the above
>function in a way that allows vector add:

Actually, that was nearly half the point.

> 	void
> 	vector_add /* maybe */ (int *a, int i, int j, int k, int n)
> 	{
> 		if (i==j || i==k) return;
>
>		/* now the compiler can infer that i!=j and i!=k */
> 		for (; --n >= 0; i++, j++, k++)
> 			a[i] = a[j] + a[k];
> 	}

(`Return' may be the wrong thing to do; but ignore that for now.)

I believe it is far better to *ensure* that some optimisation is
safe than simply to *declare* it so (as `noalias' appears to do).
But I am willing to concede that there may be cases in which this
is infeasible, and compiler hints will be used.  In such cases, it
would be best, I say, to put the hint as close as possible to the
affected code, and to make the hint as direct as possible, so that
it hints no more than is needed.

Or, mixing metaphores again, writing `noalias' is like using a
bandsaw to cut a keyhole.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris