[comp.lang.c] noalias

jss@hector.UUCP (Jerry Schwarz) (12/16/87)

In article <6833@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>
>The "noalias" type qualifier promises that the object can only be
>modified by a single handle (as I said), thereby permitting the
>optimizer to do things that are not safe otherwise; for example,
>for pointers that could be used as aliases accessing the same
>object, without "noalias" qualifying the object, every time it
>was modified via one pointer the compiler would have to assume
>that the modification might affect what is seen via the other
>pointer.  

My gut reaction is that this is a bad idea, but I'm withholding
final judgement until I see the exact wording of the proposal.

The original description suggested that the English constraints
about overlapping strings in functions like "strcpy" could
be replaced by the use of "noalias" in their declarations.
I think this is doubtful because "noalias" is either too
strong or too weak.

Consider  strcpy(s1,s2) 

	If the redeclaration means that s1 and s2 can't point to the
	same char then it is too weak.  The strings could still
	overlap even if they have different initial chars.

	If it means that they can't point into the same array then
	it is too strong because the strings might be known to
	be disjoint, even though they are part of the same array.
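
[Illustration: a minimal sketch of the two cases in plain C, with no
"noalias" qualifier involved. The helper name copy_would_overlap is
hypothetical and not part of any library; it assumes both pointers
point into the same array, so that comparing them is well defined.]

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function: returns nonzero if
 * strcpy(dst, src) would read and write overlapping bytes.
 * Assumes dst and src point into the same array and that dst + n
 * stays within (or one past) that array. */
static int copy_would_overlap(const char *dst, const char *src)
{
    size_t n = strlen(src) + 1;          /* bytes moved, including '\0' */
    return dst < src + n && src < dst + n;
}
```

With char buf[16] = "hello", the call copy_would_overlap(buf + 8, buf)
reports no overlap although both pointers are in the same array, while
copy_would_overlap(buf + 2, buf) reports an overlap although the two
pointers start at different chars.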

Is 
	strcpy((char*)&errno,"a")

allowed?  It depends on whether the body of strcpy refers to errno.
Thus in place of the English requiring disjointness of the arguments
I need something to tell me what globals strcpy refers to.

>Fortunately, the typical programmer can just ignore it and
>never use "noalias" in his code, so it only affects those who
>like to fine-tune things.

But the headaches for maintenance of programs containing "noalias"
are likely to be enormous.  Every time I make even a minor change to
such a program I have to be aware of the implications for pointers
that I might not even be aware of.  For example, if I assign to or
use a global variable in a function that was not mentioned in the
function before my change, I have to know that there is no caller
anywhere who has stuffed the address of that variable into a
"noalias" pointer.
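
[Illustration: a sketch of the maintenance hazard in plain C. The
global call_count and the function change_seen_through are invented
for this example; the increment stands in for the "minor change".]

```c
static int call_count;   /* hypothetical global, invented for this sketch */

/* Returns how much *p changed while the global was being updated.
 * If the caller passed &call_count, the update is visible through p. */
static int change_seen_through(int *p)
{
    int before = *p;
    call_count++;            /* the newly added use of a global */
    return *p - before;
}
```

Called with the address of an unrelated int, the function returns 0.
Called with &call_count it returns 1, so a caller who had promised
that the pointer was the only path to that object would now be wrong.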

I can imagine some reasonable uses of "noalias".  The first
corresponds to "register" in that it would prohibit taking the
address of a top level identifier.  (I initially wrote "global
variable", but then I realized that a compiler might be able to make
good use of the information that a particular function is always
called directly.) Another use might be in a structure member to
indicate that addresses are never taken on that component.   What
these have in common is that they are enforceable by the compiler.

Jerry Schwarz
Bell Labs

P.S.  Normally I would wait to see exactly what was being proposed
before rushing to judgement, but I suspect that the time constraints
here are such that final action on the proposal will occur before I
see its details.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/17/87)

In article <3297@ulysses.homer.nj.att.com> jss@hector (Jerry Schwarz) writes:
>My gut reaction is that this is a bad idea, but I'm withholding
>final judgement until I see the exact wording of the proposal.

Please do.  I haven't tried to present a complete or even very
accurate description of "noalias".  It isn't as bad as you may
have surmised from what little I have said about it, however,
and the X3J11 committee's top theoreticians all seem to have
bought into the notion.

>The original description suggested that the English constraints
>about overlapping strings in functions like "strcpy" could
>be replaced by the use of "noalias" in their declarations.

Yes, that's right.  In fact, it does work, when you check out
the actual rules.  There is no prohibition against using strcpy()
to copy one portion of an array into another non-overlapping
portion of the same array, for example.

>Is 
>	strcpy((char*)&errno,"a")
>allowed?  It depends on whether the body of strcpy refers to errno.

strcpy() does not affect errno.  However, if you assume that it
is permitted to modify errno directly as well as via a noalias
pointer parameter, then that situation is a violation of the
noalias constraints.  The compiler is not obliged to diagnose
this; it is the programmer's responsibility to use functions
correctly.  Note that you have strange behavior completely
independently of the "noalias" issue whenever there are two
paths to modify the same datum; this is known as "aliasing"
and in the absence of something like "noalias" the pointer
parameter would be used less efficiently since every change to
a global variable or via another pointer parameter would
potentially have invalidated whatever the first pointer had
been pointing to.  With "noalias", this invalidation has been
promised to not occur, and more efficient code can be used.
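
[Illustration: a sketch of the optimization at stake, written in plain
C without the qualifier. Both function names are hypothetical. The
second function is a hand-written guess at what a compiler might emit
if promised that k does not alias anything in dst; it is not taken
from any actual compiler.]

```c
/* As written: *k is re-read on every iteration, because a store to
 * dst[i] might have changed it. */
static void add_as_written(int *dst, const int *k, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] += *k;
}

/* With a no-aliasing promise: *k is loaded once, outside the loop. */
static void add_hoisted(int *dst, const int *k, int n)
{
    int i;
    int kval = *k;
    for (i = 0; i < n; i++)
        dst[i] += kval;
}
```

When k points at a separate int the two versions agree. When k points
at dst[0] they do not: starting from {1, 2, 3}, the first yields
{2, 4, 5} and the second {2, 3, 4}.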

>But the headaches for maintenance of programs containing "noalias"
>are likely to be enormous.  Every time I make even a minor change to
>such a program I have to be aware of the implications for pointers
>that I might not even be aware of.  For example, if I assign to or
>use a global variable in a function that was not mentioned in the
>function before my change, I have to know that there is no caller
>anywhere who has stuffed the address of that variable into a
>"noalias" pointer.

There is simply no substitute for correct and complete interface
specifications; however, "noalias" can help in specifying some
parameters.  The issues you raise are not caused by "noalias";
they are caused by aliasing, which has always been possible in C.

I have some sympathy for the viewpoint that "noalias" adds
complexity and that it will be quite a job educating programmers
as to what it really means and how to deal with it.  I hope I
haven't gotten everyone off to a bad start by being imprecise,
but I too don't yet have the official wording for this topic.
It will be in the second formal public review draft, which is
expected to be generally available around early February.

About the only viable alternative to "noalias" would simply be
to prohibit the type of optimizations that it enables.  But I
don't think you're going to be able to get a committee that
consists primarily of compiler implementors to agree to that.
(I tried, more than once.)

lmiller@venera.isi.edu (Larry Miller) (12/18/87)

In article <3297@ulysses.homer.nj.att.com> jss@hector (Jerry Schwarz) writes:
>In article <6833@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>>
>>		 Lots of discussion about noalias

Independent of the virtues or faults of noalias, is the issue of how
this construct could possibly have come up so late in the process of
creating a C standard.  Other issues that have been with C for a very
long time (such as expression rearrangement) created enormous
debate.  Could anyone have believed that an entirely new concept
would not also cause concern?  On this basis alone, it probably should 
not be part of the C standard.

Larry Miller				lmiller@venera.isi.edu (no uucp)
USC/ISI					213-822-1511
4676 Admiralty Way
Marina del Rey, CA. 90292

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/19/87)

In article <4361@venera.isi.edu> lmiller@venera.isi.edu.UUCP (Larry Miller) writes:
>Could anyone have believed that an entirely new concept
>would not also cause concern?

It was not an entirely new concept; aliasing issues had remained unresolved
for at least a couple of quarterly meetings.  The particular form which the
solution took was new, but then it was only in the last couple of months that
we have had the "type qualifiers" cleanly separated out as such in the grammar,
so the "noalias" solution wasn't really viable until the last meeting.

dmr@alice.UUCP (12/19/87)

Larry Miller and others are correct in pointing out that the deepest
sin of X3J11 was to allow themselves to make controversial changes
in what was supposed to be the final draft (save for editorial polishing).
The plan was to accept the second round of public comments starting
in January, deal with them in April, and sail home in August.
They will have a harder time now;  further substantive changes
will probably require more public comment.

I will save detailed discussion until the words actually describing
"noalias" are written (they aren't yet) and the public comment window
opens; the committee should expect a letter considerably less
friendly than the last one I wrote.   "noalias" is a dreadful mistake.
I believe it is bad language design; worse, the committee is
in effect blackmailing those who agree with this position,
because its introduction is so hard to retract now.

I say "mistake" and "in effect blackmailing" with care.  It was not
a plot, but a compromise, that introduced the keyword.  From the outside,
though, its introduction (along with other changes) presents a nasty
choice to people who deeply wish the language standard to be both
good and timely.  Why could not they have left well enough alone?

	Dennis Ritchie

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/20/87)

In article <7552@alice.UUCP> dmr@alice.UUCP writes:
>Why could not they have left well enough alone?

I certainly sympathize with Dennis's point of view; I'm not terribly
fond of "noalias" (nor, for that matter, some other things in the
proposed standard) myself.  However, in defense of X3J11, I suggest
that there was a perception on the part of many committee members
that things were not "well enough" as they stood.  In the end, it
seems to come down to a matter of what one considers important to
include in a language standard.  Many people, including Dennis, I
believe, do not think that features intended solely to aid
optimization belong in the language.  (Of course, "register" is
already one such!)  Others were insisting on them, and the "noalias"
approach seemed the least offensive, if one had to have anything at all.
I must say that adopting "noalias" did defuse an effort to permit this
kind of optimization for "normal" code, which I would have considered
a terrible tragedy, or to bollix up the semantics of "const", which
matters less to me but which would have caused AT&T (and at least one
other company representative) to oppose the proposed standard.

I think that "noalias" is technically sound, but perhaps not viable
for political or even stylistic reasons.  I'm sure there will be many
public comments on this one!

chris@mimsy.UUCP (Chris Torek) (12/23/87)

[Key point: programmers cannot ignore noalias]

In article <10972@brl-adm.ARPA> TLIMONCE%DREW.BITNET@CUNYVM.CUNY.EDU writes:
>Currently, K&R (and every compiler you've used so far) has been
>forced to assume (for safety) that your variables are NOT NOALIAS
>(i.e. they ARE assumed ALIASED).

Not all!  Register variables are not aliased; variables whose
address is never taken are not aliased (the latter includes the
former, by definition).  The problem occurs only with variables
whose address *is* taken.  Of course, this includes all the things
to which pointers point.

>All ANSI is saying is that there should be a way for a programmer
>to point out to the compiler that, "Hey, this is a safe variable...
>let your optimizer have a field day."

Perhaps there should.  Given that there are no existing implementations
that use `noalias' to do this (Microsoft C uses a different format,
as several people have mentioned), the perennial `lack of prior
art' alone should keep this from being included in the draft
standard.

>If you don't understand the purpose of noalias then ignore it.
>It won't effect you.  If you do understand it, use it.

[I should say not!  However, it will *affect* others.  But pardon
the spelling gripe.]

This argument sounds familiar.  Wait ... let me think ... ah!
PL/I.  `Sure we can put all this stuff in the language.  Programmers
can pick a subset they like and stick with that.  The stuff they
don't like, they can ignore.'  I dare say it did not work.

>Someone asked, "Why not use violate?"

[This brings some most unusual images to mind!  `What seems to
be the trouble here?'  `He ... he *violated* that variable!'
`Ugh!  What are you, some kind of sicko?'  But again I digress....]

>Good question.  Violate implies that it is a IO location (i.e.
>memory mapped).

No, `volatile' implies that the variable may change `behind the
compiler's back', so to speak, and that stores to it must be done
exactly as written as they may have some external effect.  Shared
memory, for instance, might be `volatile'.

>Not only can the programmer just plain ignore it if s/he doesn't want to
>use it, a compiler writer can "ignore" it and generate code as usual.

Compilers are free to ignore `noalias'.  Programmers are not.  Here
is why:

	f() {
		noalias int a;
		...
		a = e;	/* e is any expression that evaluates to >=38 */
		g(&a);
		a *= 2;	/* the compiler might use e + e */
		...
	}

	g(int *p) {
		if (*p < 38) *p += 4;
	}

If you do not understand `noalias', and change the code in `g'
or the value of `e', you may change a correct program (in which,
after `a = e' in f(), a >= 38) to an incorrect one (in which,
after `a = e' in f(), a < 38, or g() alters *p even when *p >= 38.)
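
[Illustration: the fragment above cannot be compiled as it stands, so
here is a runnable sketch in plain C that simulates the effect by
hand. g is taken from the fragment; f_as_written and f_as_optimized
are hypothetical names, and the second is only a guess at the code a
compiler trusting the promise might generate.]

```c
static void g(int *p)
{
    if (*p < 38)
        *p += 4;
}

/* What the source says: a is re-read after the call to g(). */
static int f_as_written(int e)
{
    int a = e;
    g(&a);
    a *= 2;
    return a;
}

/* What an optimizer might do if told a has no aliases: use e + e
 * and ignore whatever g() did through the pointer. */
static int f_as_optimized(int e)
{
    int a = e;
    g(&a);
    return e + e;
}
```

For e = 40 both versions return 80. For e = 10 the first returns 28
and the second returns 20, with no diagnostic from the compiler.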

The definition of `noalias' may be such that the program fragment
above is already invalid.  In that case I could devise an even more
convoluted example which could have the same effect.  The point is
this:

* Whatever `noalias' means[1], it has semantics that may be visible
* to the program[2], and hence must be visible to the programmer.

Unlike `register', if you ignore those semantics, you may get a
program that quietly fails.  Ignoring the semantics of `register'
produces only a compile-time error.

-----
[1] It is probably either `I will create no aliases' or `those
aliases I create never alter the value of a variable'.  Both are
impossible for compilers to verify (at compile time).
[2] If the programmer lies, the program is incorrect, but the
compiler cannot detect all possible lies, so the programmer must.
-----

I would like to see the proposed definition of `noalias', as it
may prove that many regular variables (those that are neither
noalias nor volatile) have the same semantics as volatile as far
as read access goes, the only difference being that writes can be
deferred or eliminated.  (It depends on the semantics allowed for
asynchronously invoked code (signal handlers).)  They may even have
the same write semantics.  Or, again depending upon each definition
and all their interactions, they may not have similar semantics at
all.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

C90630JG%WUVMD.BITNET@CUNYVM.CUNY.EDU (01/15/88)

Doug Gwyn writes:

>From:         Doug Gwyn <gwyn@brl-smoke.arpa>
>Subject:      Re: noalias and vectors
>
>In article <531@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe)
> writes:
>>His statement that a "safe" vectorising compiler currently finds exactly
                                                                   -------
>>three safe places in "all the system software" is the kind of evidence I
>>was asking for that noalias would be useful.
>
>No, it's no such thing!  That's three places where "noalias" was NOT
>necessary in order to vectorize.  Because he didn't HAVE "noalias",

For once, Doug failed to carefully read the article to which he was
replying.  Clearly (at least to my mind) Richard, by using "exactly",
meant to make the same point Doug did; namely, that if noalias had
been available more vectorization would (presumably) have been possible,
with a resulting gain in speed, and that this was evidence that the
added keyword would have value.

Jonathan Goldberg

nevin1@ihlpf.ATT.COM (00704A-Liber) (01/19/88)

Will standard debuggers (such as SDB) be able to support noalias (and I don't
mean ignore noalias)?  By support I mean implement the worst case scenario --
do not obtain the data pointed to by a noalias pointer more than once (store it
internally somewhere else, for example).

I can think of at least one case where this cannot be done.  Linking to an
already-compiled routine can not be implemented this way at all.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah