[comp.lang.misc] Whether cc after f2c can optimize arrays as well as f77 can

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/06/90)

In article <4869@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> From article <1990Nov2.183359.6761@maths.nott.ac.uk>, by anw@maths.nott.ac.uk (Dr A. N. Walker):
> > [...stuff that shows that _sombody_ finally understands what I'm saying...]

As you know, I've said the same things before in e-mail, so you don't
have to act so amazingly relieved.

The crux of this side issue is whether the compiler can optimize possibly
aliased functions without asking the programmer. Below I quote an e-mail
message from me to you in June, with interpretations for comp.lang.misc
readers (who are missing the extensive context of the message).

> Exactly the problem I kept trying to point out to someone via private
> email (and the reason I don't deal with that person over email any
> more - he ignored the point and kept claiming that the compiler by
> itself could do all).

I admire your productive attitude.

---Dan

: From brnstnd Wed Jun 13 00:23:57 1990
: Return-Path: <brnstnd>
: Received:  by stealth.acf.nyu.edu (5.51/25-eef)
: 	id AA00647; Wed, 13 Jun 90 00:23:54 CDT
: Date: Wed, 13 Jun 90 00:23:54 CDT
: From: Dan Bernstein <brnstnd>
: Full-Name: Dan Bernstein
: Message-Id: <9006130523.AA00647@stealth.acf.nyu.edu>
: To: dan, jlg%woodsy@lanl.gov
: Subject: Re:  aliasing again ...
: Status: O
: 
: Gaaaargh.

This was after your repeated insistence that proper aliasing analysis
could not be performed by the compiler.

:   [ file1: int a; int b; main() { ... sub(&a); ... } ]
:   [ file2: extern int a; sub(x) int *x; { ... } ]
:   [ file3: extern int b; sub(x) int *x; { ... } ]

You had just said something like ``file2 and file3 are symmetric with
respect to global variables a and b, so aliasing analysis is
impossible!''

: The compiler expands file2's sub() into two functions:
: 
:   subf(x) int *x; /* possibly */ aliased global, x; { ... }
:   subg(x) int *x; /* no aliasing allowed */ { ... }
: 
: Similarly for file3.
: 
: The compiler compiles file1's main() into two functions, only one of
: which will be used:
: 
:   main() /* no aliasing allowed */ { ... subf(x) ... }
: 
: Must I keep explaining?

I was getting a little ticked at having to explain the same thing to you
so many times, particularly when TCMITS understood exactly what I was
saying.

Do any readers understand what's going on here? Walker assumed that
without outside help, the compiler cannot easily test for interference
at run time (let alone compile time), and hence cannot decide which
version of the compiled code to execute. But here I'm showing you
exactly how it can be done, for the case you care about (array
manipulations as in Fortran).

: Hopefully this will elicit an ``Oh, now I understand!'' from you.
: Presumably you'll continue with ``But it's making the wrong decision for
: file3, because file3 can't see a!'' I repeat my original claim: The
: compiler can trivially perform a sufficiently good aliasing analysis
: that the translation of a Fortran program without aliasing will end up
: with the unaliased compiled C code.

That's what you're looking for, right? C array manipulations optimized
as effectively as those in Fortran? Indeed, in the simplified model
above, file3 has to account for aliasing where that aliasing cannot in
fact exist. But that can never happen in the translation of a Fortran
program. And see below.

: To do a better job, of course, you can add more levels between ``all
: variables may be aliased'' and ``no variables are aliased.'' For
: example: ``Any parameter may be aliased with a.'' ``Any parameter may be
: aliased with b.'' And so on. You made some incorrect statements in
: comp.lang.misc about the nature of the combinatorial explosion if you
: enumerate *all* possibilities; it's actually quite feasible for normal
: code.
: 
: The most flexible (and arguably the best) solution is to let the
: programmer assert the lack of aliasing between any variables at any
: time. Q provides sufficiently general mechanisms for this.

As does Nemesis, as you then told me. But that has nothing to do with
current real-world languages like C. I still claim that a C compiler can
detect and avoid the aliasing problems of a Fortran-converted piece of
code at compile time.

Please do not pervert my claim into the following: ``A C compiler can
detect and avoid all aliasing problems at compile time with no help from
the programmer.'' That simply isn't true: pointers in general are much
more flexible than Fortran arrays, and some tests simply have to be
performed at run time. I agree entirely (and in fact told you in several
messages) that the compiler has to get assertions from the programmer if
it wants to deal with general aliasing problems properly.

---Dan

jlg@lanl.gov (Jim Giles) (11/07/90)

It would actually never occur to me to post something out of a private
email conversation onto the net without the other person's permission.
But, since you have done so, I will continue the discussion in a public
venue.  In fact, I was wondering how to introduce this topic onto the
net myself.

From article <2779:Nov608:20:1990@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> In article <4869@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> [...]
> :   [ file1: int a; int b; main() { ... sub(&a); ... } ]
> :   [ file2: extern int a; sub(x) int *x; { ... } ]
> :   [ file3: extern int b; sub(x) int *x; { ... } ]

These three separate file descriptions I sent (in a somewhat more
legible form) as an example of the fact that _some_ decision _must_
be made at load time with respect to aliasing.  As can be seen, the
main code has a global called 'a' and sends it (call-by-reference)
to a procedure called 'sub'.  The other two files give two alternate
definitions of the procedure.  If the main code is loaded with the
version of 'sub' in file2, then the argument and the global would be
aliased in that context - either slow code or unsafe optimizations
must be accepted.  If the version of 'sub' from file3 is loaded,
then this problem doesn't arise and we would prefer the optimized
version.

> [...]
> : The compiler expands file2's sub() into two functions:
> : 
> :   subf(x) int *x; /* possibly */ aliased global, x; { ... }
> :   subg(x) int *x; /* no aliasing allowed */ { ... }
> : 
> : Similarly for file3.

This was his proposed solution which he claims solves the problem
without any loader intervention (indeed, without load-time action
of any kind).  Presumably, this produces 4 object files (or at least
4 object versions of 'sub').  For convenience, let's call these
'subf2.o', 'subg2.o' (both from the definition in file2), 'subf3.o'
and 'subg3.o' (both from the definition of file3).

> : 
> : The compiler compiles file1's main() into two functions, only one of
> : which will be used:
> : 
> :   main() /* no aliasing allowed */ { ... subf(x) ... }
> : 
> : Must I keep explaining?

Yes, I think you must.  At this point, you have 6 object modules.
The two from file1 are presumably identical since there is no
hidden aliasing possible in 'main' (not any shown in the example
anyway).  Let us call this 'main.o' since it doesn't matter which
of the 'main' versions I use.

Now, to load and run the code, I do what?  Well, somehow, I must decide
which of the four versions of 'sub' to load with 'main'.  Half of this
decision is the same as always, I must decide whether I want the version
from file2 of the one from file3.  But, the next decision is more
difficult: which of 'subf.o' and 'subg.o' should I load?  Well, a
quick examination of the files shows me that I should either use
'subf2.o' (the one from file2 that allows aliasing) or I should
use 'subg3.o' (the one from file3 which is optimized).

But, wait a minute!!  How did I decide that?  I did it by looking the
the relevant code in the caller and the callee (an interprocedural
analysis - I always said there'd be one of those).  And, I made the
decision at load-time (exactly the time I always said the decision
would be made).  Further, the loader itself could have made the
decision if the compiler passed the right information along (which
is what I always recommended).

> [...]
> : To do a better job, of course, you can add more levels between ``all
> : variables may be aliased'' and ``no variables are aliased.'' For
> : example: ``Any parameter may be aliased with a.'' ``Any parameter may be
> : aliased with b.'' [...]

I pointed out repeatedly in email that many compilers already have
this feature: it doesn't solve or even address the problem, which
was (as Walker put it), HOW DOES THE USER _KNOW_ what to assert
with all these declaraions?  It requires an interprocedural analysis.

> [...]
> : The most flexible (and arguably the best) solution is to let the
> : programmer assert the lack of aliasing between any variables at any
> : time. Q provides sufficiently general mechanisms for this.

As long as the _LOADER_ checks the consistency of these assertions
across procedure calls, this is indeed adequate.  Feedback from such
_LOADER_ tests would be sufficient to tell the user the interprocedural
information needed (though, through trial and error).

By the way, as I've pointed out before, the _LACK_ of aliasing
should be the default - the presence of possible aliasing should
require the assertion.  It makes sense to make the most commonly
desired configuration be the default.  You always ask for the
reverse of that.

J. Giles

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/08/90)

In article <5085@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> It would actually never occur to me to post something out of a private
> email conversation onto the net without the other person's permission.

That is an amusing accusation: I posted something *I* wrote, with no
more than paraphrases from you. But let's get to technical details.

Here is a very precise description of how a C compiler can detect the
sort of non-aliasing that a Fortranish program has.

The compiler sees a prototype foo(a1;a2;...;an;v1;v2;...;vn;...) where
a1 through an are all the same pointer type, v1 through vn are all void
pointers or any other pointer parameters that may by the language
definition be aliased with the a's. It also sees global arguments of the
same or void pointer type, g1 through gn. As you may verify, the
compiler can make a complete list of all such variables, provided that
there is a prototype in scope. For simplicity let's rename all these
variables x1 through xn.

Whenever the compiler sees a function, it makes a list of possibly
aliased variables as above. It makes one such list for each pointer
type, but we (and it) can consider just one at a time. Via some fixed
mechanism that does not depend on any further information, it selects a
set of partitions of the x's. For instance, it might choose the
following partitions out of four variables, x1 through x4:

  { {x1}, {x2}, {x3}, {x4} }
  { {x1,x2,x3,x4} }
  { {x1,x2}, {x3,x4} }
  { {x1,x3}, {x2,x4} }
  { {x1,x4}, {x2,x3} }

It must include the partition that has every element in a single set.
(The ``Fortran Case'' is where it includes the every-element-in-one-set
partition and the every-element-in-separate-set partition, and no
others. I claim that this is sufficient to detect the nonaliasing in a
program that has been converted from Fortran.)

When the compiler is compiling the function: It produces one version for
each partition. In the version for a particular partition, it may assume
that all pointer references through variables separated by the partition
are unaliased. In the simplest case, where all variables are in the same
set, this gains it absolutely nothing. In the Fortran Case, when it is
compiling the second version, it can assume that all these variables are
unaliased, just as in Fortran. (The compiler might name the results sort
of the way C++ does.)

When the compiler is compiling a call to the function: It now uses the
partitions as requirements, with the same meaning for aliasing as above.
If it thinks all the x's might be aliased, it has to generate code that
calls the simplest version. The crucial point is that in a correct
translation of a correct Fortran program, it knows that none of the
arguments are aliased, so it can choose code that calls the
all-elements-separated version.

Jim, please say *something* that shows you understand the above
description. Try the fully worked-out example below if you want a
concrete case to stare at.

> From article <2779:Nov608:20:1990@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> > In article <4869@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> > [...]
> > :   [ file1: int a; int b; main() { ... sub(&a); ... } ]
> > :   [ file2: extern int a; sub(x) int *x; { ... } ]
> > :   [ file3: extern int b; sub(x) int *x; { ... } ]
> These three separate file descriptions I sent (in a somewhat more
> legible form) as an example of the fact that _some_ decision _must_
> be made at load time with respect to aliasing.

This is not correct. The compiler makes all the decisions at compile
time, as described above.

Now for the seventh or eighth time, here's how it applies to the
three-file case above: There are two global integer variables, a and b,
with which *x might be aliased. When the compiler sees the prototype for
file2's sub(), it chooses (for example) the following partitions:

  { {&a,&b,x} }    Note that it knows &a and &b aren't aliased, but
		   there's neither a need nor a way to express that fact
		   in this case.
  { {&b} {&a,x} }
  { {&a} {&b,x} }
  { {&a} {&b} {x} }

When it compiles file2, it produces four versions, one for each of the
above aliasing assumptions. As it turns out, file2 doesn't have b in
scope, so two of the versions will never be used no matter what else
goes on, but this is a side issue.

When it compiles main(), it says ``Aha! x and &a are aliased here. So I
have to use a partition containing {&a,x}, but I don't need one with
{&b,x}.'' So it calls the version compiled for the partition
{ {&b} {&a,x} }, which as I've explained assumes &a and x aliased. In
other words, IT CORRECTLY FIGURES OUT AT COMPILE TIME TO USE THE SLOW
CODE IN FILE2.

In contrast, if it had used the file3 version, it would generate a call
to the same partition. But file3 can then see that &b and x are not
aliased. In other words, IT CORRECTLY FIGURES OUT AT COMPILE TIME TO USE
THE FAST CODE IN FILE3.

I am getting really, really, really sick of trying to explain this to
you; I hope my excessively detailed and formal approach here does the
job. I will give up if you persist in ignoring the truth.

> version of 'sub' in file2, then the argument and the global would be
> aliased in that context - either slow code or unsafe optimizations
> must be accepted.  If the version of 'sub' from file3 is loaded,
> then this problem doesn't arise and we would prefer the optimized

Yes, that is EXACTLY what happens.

> This was his proposed solution which he claims solves the problem
> without any loader intervention (indeed, without load-time action
> of any kind).

This scheme requires NO changes to the loader. It will make temporary .o
files larger, I admit---a justifiable loss for being able to detect
aliasing at compile time without bothering the programmer. 

Certainly you admit that libraries should have fast and slow versions
when aliasing can make a large difference.

> Yes, I think you must.  At this point, you have 6 object modules.

Incorrect. I have three. One for each file, as always.

> But, wait a minute!!  How did I decide that?  I did it by looking the
> the relevant code in the caller and the callee (an interprocedural
> analysis - I always said there'd be one of those).

No, this requires no interprocedural analysis. The ultimate proof of
this is that my scheme *can* be used for libraries.

(Of course, the compiler can save a lot of code generation if it *does*
use interprocedural analysis to figure out which partitions it needs.
Perhaps the right partitions could be dynamically added to the system
library when user code asks for them. Who knows. The full method does
not generate sufficient overhead for this to be a big problem.)

> Further, the loader itself could have made the
> decision if the compiler passed the right information along (which
> is what I always recommended).

The compiler *has* passed the right information along. In the
particular, all-important-to-you case of Fortran array manipulations,
the compiler can detect aliasing *without any help from the programmer*,
because all arrays are passed as simple arguments and it's trivial to
detect what data flows where.

> HOW DOES THE USER _KNOW_ what to assert
> with all these declaraions?  It requires an interprocedural analysis.

Wrong, and wrong. The user does not do ANYTHING, and there is NO
interprocedural analysis required. You are becoming a bit repetitive.

> > : The most flexible (and arguably the best) solution is to let the
> > : programmer assert the lack of aliasing between any variables at any
> > : time. Q provides sufficiently general mechanisms for this.
> As long as the _LOADER_ checks the consistency of these assertions
> across procedure calls, this is indeed adequate.

The compiler *must* do this check, again because libraries may not be
loaded with programs for a while. You're arguing in a separate thread
that the loader should also do this check, but I still don't see your
justification. In any case, we both agree that it's a benefit to allow
certain classes of assertions to be part of the function prototype.

> Feedback from such
> _LOADER_ tests would be sufficient to tell the user the interprocedural
> information needed (though, through trial and error).

Trial and error? What a ridiculous idea. Why do you want to burden the
user with any of this?

> By the way, as I've pointed out before, the _LACK_ of aliasing
> should be the default - the presence of possible aliasing should
> require the assertion.

The user *NEVER* has to make an assertion to the compiler in order to
handle this case.

> It makes sense to make the most commonly
> desired configuration be the default.  You always ask for the
> reverse of that.

Sir, you appear to be lying. But I've promised myself not to drag you
over the coals for your repeated misquotes, misparaphrases, misleading,
and apparent outright lies.

---Dan

jlg@lanl.gov (Jim Giles) (11/08/90)

From article <5853:Nov720:22:4890@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
>> It makes sense to make the most commonly
>> desired configuration be the default.  You always ask for the
>> reverse of that.
> 
> Sir, you appear to be lying. But I've promised myself not to drag you
> over the coals for your repeated misquotes, misparaphrases, misleading,
> and apparent outright lies.

I'm really tired of your incorrect ad hominem attacks.  The facts are
these: 1) aliasing is desireable less often than non-aliased variables;
2) I recommend that non-aliased be the default and the user make
assertions to _allow_ aliasing; 3) I made that recommendation in
_response_ to your suggestion that a no-alias assertion be used;
4) as above, I said that the default should be the more commonly
desired; 5) you _DID_ suggest (point3) the reverse.  Anyone who
has read this discussion can clearly see the truth in this.

J. Giles

jlg@lanl.gov (Jim Giles) (11/08/90)

Well, I've looked over your little proposal.  And, I'll have to admit
that it does "solve" the specific problem I posted.  If I had known
the misapprehension you were under about the problem I was addressing,
I would have posted a different example.  However, I will address three
objections to your scheme:

1) The number of different possible partitions is exponential in the
   number of procedure arguments, and is further enlarged by additional
   aliasing possibilities from the global variables.  To get the
   efficiency of the scheme I've been recommending, you would either
   have to always generate code for each possible partition of the
   variables, or the user would have to make his aliasing assertions
   as prolific as the scheme I am suggesting (to cause the generation
   of the specific partitions that the code actually needs).  Your
   scheme has the 'advantage' of automatically working even without
   these assertions.

2) I put the word 'advantage' in apostrophes above because I don't
   really consider it an advantage.  Usually, when unexpected aliasing
   occurs in a program it is an error.  Your scheme just goes ahead
   and runs without giving any indication to the user that deep down
   some aliasing has occurred.  This is the kind of obscure and hard
   to find error that was part of the motivation for the scheme that
   am proposing.

3) The last of these objections to your scheme is the fact that the
   first counter-example I tried actually couldn't be done correctly
   by your method:

      static int a;
      main() {
      ...
	 sub(&a);
      ...
      }
      fcn((int *) x) {
      ...
	 /* is 'x' aliased to 'a' here? */
      ...
      }

   Here, sub() is a procedure in a library (that you may or may not
   have the source for).  The procedure fcn() will be called at some
   time by the library.  The question is whether the parameter '&a'
   is passed through the library and back to fcn().  Since the variable
   'a' is declared 'static' (which for some reason, C interprets in
   this context to mean 'private'), 'a' does not show up in any of the
   partition comparisons in the library - in fact, 'a' is completely
   free from aliasing while in the library.  But, unless you do a
   call-chain analysis that propagates procedure arguments you will
   not be able to determine if 'x' and 'a' are aliased in fcn().
   This is one of the things that my scheme _can_ do.

J. Giles

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (11/08/90)

In article <5853:Nov720:22:4890@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>Here is a very precise description of how a C compiler can detect the
>sort of non-aliasing that a Fortranish program has.

[ description deleted ]

Your description doesn't seem to work for Fortran-style aliasing. You
can't just test the starting address of an array; you must consider
the set of all addresses in the arguments which are changed by any
actual control path through the subroutine. Your example was worked
out with 3 scalar arguments. If I were a CS type, I'd probably think
the array case was pretty nasty to solve.

example: from my reading of the standard, this is valid:

      real a(10)
      call foo(a,a(3))
      ...
      subroutine foo(x,y)
      real x(3), y(7)
      x(2) = 1.0
      y(2) = 1.0

x(3) overlaps with y(1), yet since neither x(3) nor y(1) is written by
the routine, the anti-aliasing rule holds. Determining the set of all
addresses written by a subroutine with all sorts of data-dependent
branches is not fun.

Now, hit 'n' because the interesting part of this posting is over.

More quotes from the Dan Bernstein quote collection, all of these
aimed at Jim:

>Jim, please say *something* that shows you understand the above
>description.

>Now for the seventh or eighth time, here's how it applies to the
>three-file case above:

>I am getting really, really, really sick of trying to explain this to
>you; I hope my excessively detailed and formal approach here does the
>job. I will give up if you persist in ignoring the truth.

> You are becoming a bit repetitive.

>Sir, you appear to be lying. But I've promised myself not to drag you
>over the coals for your repeated misquotes, misparaphrases, misleading,
>and apparent outright lies.

And, in a different posting, Dan said to me:

> Weird, you're right.

Dan, I expect these sorts of lines in alt.flame, but not
comp.lang.misc.

peter@ficc.ferranti.com (Peter da Silva) (11/08/90)

In article <5085@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> version of 'sub' in file2, then the argument and the global would be
> aliased in that context - either slow code or unsafe optimizations
> must be accepted.

Aliasing can be detected at run time and appropriate code called. If the
code is short enough that this produces an unacceptable overhead, then
subroutine call time dominates anyway and the whole thing should have
been inline and it becomes a compiler problem.

(yes, I know this can lead to a combinatorial explosion... but I'm not
 convinced that a 2-way case isn't good enough.)
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/09/90)

In article <5243@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> Well, I've looked over your little proposal.  And, I'll have to admit
> that it does "solve" the specific problem I posted.

Thank you! Do you also agree that I wasn't ``ignoring'' your repeated
insistences otherwise in e-mail?

> 1) The number of different possible partitions is exponential in the
>    number of procedure arguments, and is further enlarged by additional
>    aliasing possibilities from the global variables.

I wish you would consider all possibly aliased variables, both arguments
and globals, together. ``The number of different possible partitions is
exponential in the number of possibly aliased variables.''

That is correct. Several months ago I posted the results up to 6
variables.

However, as I've said repeatedly, a compiler would never be crazy enough
to generate *all* possible aliasing possibilities. Probably the best
solution for individual programs is that adopted by Ultrix's -O3, or a
``feedback'' on the many compilers that can take it: generate a list of
all the partitions actually used, with some interprocedural analysis.

But an *adequate* job is done by merely choosing enough partitions to
reasonably cover the cases actually needed. In fact, if a statement of
yours is correct, ``enough'' means two. Viz., you said that there is no
aliasing in most real programs. If you're right, then the compiler can
generate two versions of the procedure: one that assumes no aliasing,
and one that assumes lots of aliasing. The fast version will always, in
practice, be selected.

(By the way, I'd appreciate it if you acknowledged the point in the
previous paragraph, and stopped the whining about a nonexistent
combinatorial explosion. In fact, the only reason I can think of that
you've continually failed to respond to the above point is that you want
to keep up that whining.)

> Your
>    scheme has the 'advantage' of automatically working even without
>    these assertions.

Exactly. The compiler can do some---maybe not a lot, but some---aliasing
analysis without help from the user. I wish you hadn't chopped off our
e-mail conversations just because you didn't understand this point.

> 2) I put the word 'advantage' in apostrophes above because I don't
>    really consider it an advantage.  Usually, when unexpected aliasing
>    occurs in a program it is an error.  Your scheme just goes ahead
>    and runs without giving any indication to the user that deep down
>    some aliasing has occurred.

That's a good point, but who says the compiler can't warn the user
whenever he calls the slow version of a routine? If you want to see
nonexistent error checking, look at how Convex handles aliasing. It
depends on the user for aliasing assertions, and it ``just goes ahead
and runs without giving any indication to the user that deep down some
alaising has occurred.''

> This is the kind of obscure and hard
>    to find error that was part of the motivation for the scheme that
>    am proposing.

Agreed, it's always good to be able to check user assertions. But the
error-checking issue is entirely orthogonal to the point of this thread.

> 3) The last of these objections to your scheme is the fact that the
>    first counter-example I tried actually couldn't be done correctly
>    by your method:
  [ static int a; main() { ... sub(&a); ... } fcn((int *) x) { ... } ]

It *is* handled correctly. fcn's only partition is {{x}}---a cannot be
included, because a is not visible outside the package. Hence it assumes
that &a and x may be aliased, and it generates the slow code for this
case.

I agree that this would be inefficient if x and &a actually weren't
aliased, but so what? It has *nothing* to do with the C equivalent of
Fortran, which is what I'm addressing (and have been since we started
this discussion many months ago). See the subject line. I'm making the
same claim that I made back then: C can optimize Fortran (-converted)
array code as well as Fortran can. Here you're talking about static
variables, which have no real (visibility) equivalent in Fortran.

>    This is one of the things that my scheme _can_ do.

Why do you call it ``your scheme''? It's not ``your scheme'' any more
than it's ``my scheme.'' We both came up with it before mentioning it to
the other, but there's hardly any chance that adding assertions to the
procedure call interface is a new idea. It's like Tarjan (et al., I
forget who his coauthors were) claiming in 1987 that he invented MSD
radix sort. If he had done his homework in Knuth's ACP, he would have
seen credit for ``his scheme'' as already being in wide use---by the
post office!

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/09/90)

In article <5218@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> From article <5853:Nov720:22:4890@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> >> It makes sense to make the most commonly
> >> desired configuration be the default.  You always ask for the
> >> reverse of that.
> > Sir, you appear to be lying.
> I'm really tired of your incorrect ad hominem attacks.
  [ ... ]
> 2) I recommend that non-aliased be the default and the user make
> assertions to _allow_ aliasing; 3) I made that recommendation in
> _response_ to your suggestion that a no-alias assertion be used;
> 4) as above, I said that the default should be the more commonly
> desired; 5) you _DID_ suggest (point3) the reverse.

Sir, you are lying. I quote an article of mine from April:

: Hmmm, this doesn't sound right. After all, if A and B could be aliased,
: and B and C could be aliased, then A and C could be aliased. (This is
: why the natural keyword is ``alias,'' not ``noalias.'')

I have always held the same position. You can see exactly the same
position in many of my articles, and in my complaint that the Convex
anti-aliasing mechanism makes the opposite assumption.

You claim that I have suggested otherwise. I challenge you to prove your
claim. You will fail, because you are lying.

---Dan

jlg@lanl.gov (Jim Giles) (11/10/90)

From article <23333:Nov904:43:1390@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> In article <5243@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> Well, I've looked over your little proposal.  And, I'll have to admit
>> that it does "solve" the specific problem I posted.
> 
> Thank you! Do you also agree that I wasn't ``ignoring'' your repeated
> insistences otherwise in e-mail?

Well, you certainly addressed the examples themselves.  You have
still ignored nearly everything I actually _said_ about them.  As I
pointed out last time, if I had known the extent to which you were
misinterpreting what I was saying, I would have sent a different
example.  In any case, I did continually point out that the example
given was extremely simplified and that usually the aliasing problem
to be discovered was deep in the call chain and not simply resting on
top like the examples showed.  The next example I was going to post
was the one I gave in the last article (and you incorrectly analyse:
see below).

> [...]
> But an *adequate* job is done by merely choosing enough partitions to
> reasonably cover the cases actually needed. In fact, if a statement of
> yours is correct, ``enough'' means two. Viz., you said that there is no
> aliasing in most real programs. If you're right, then the compiler can
> generate two versions of the procedure: one that assumes no aliasing,
> and one that assumes lots of aliasing. The fast version will always, in
> practice, be selected.

No, you claim is only _almost_ true.  The true restatement of the
last line above is: the fast version will _usually_, in practice,
be selected.  As I have repeatedly admitted, aliasing is (on rare
occasions) actually desireable.  Given what you've said above, you
plan to give only two choices: fast or _very_ slow.  But, what I
have consistently insisted upon is for the code to only be slowed
down by the aliasing that is actually present.  This requires
explicit declaration of what aliasing to allow, which in turn
requires a call chain analysis to determine if these constraints
are satisfied by parameters (which can be passed arbitrary levels
deep).  But, your scheme doesn't do this.  Even if you do allow
explicit assertions about aliasing, you don't provide a call
chain analysis tool which will tell the user the deep effects
of those assertions.

In addition, you still have an exponential number of separate cases
to select since you don't know before hand what specific aliasing
assertions the user will make.  But, that's right, you want to
condemn anyone who uses _any_ aliasing to inefficient code which
assumes _all_ possible aliasing.

> [...]
>> Your
>>    scheme has the 'advantage' of automatically working even without
>>    these assertions.
> 
> Exactly. The compiler can do some---maybe not a lot, but some---aliasing
> analysis without help from the user. I wish you hadn't chopped off our
> e-mail conversations just because you didn't understand this point.

I understood completely.  The compiler can do an enormously useful
amount of analysis _locally_ (and a good thing too, I'm all for
it).  The compiler with your scheme can do a tiny amount of shallow
non-local analysis (at least, assuming the .h files are from the
same version of your library routines as your .o files are).  This
latter is indeed new to me, it would never have occurred to me to
use so much effort to achieve such a trifling result.

> [...]
>> This is the kind of obscure and hard
>>    to find error that was part of the motivation for the scheme that
>>    am proposing.
> 
> Agreed, it's always good to be able to check user assertions. But the
> error-checking issue is entirely orthogonal to the point of this thread.

Well, again maybe I've not been clear.  In _my_ view, error checking
was always an inherent part of this discussion.  It is not second to
efficiency.  In fact, it is _more_important_ that efficiency.  The
two issues are linked by the fact that the call chain analysis I
recommend achieves _both_ error detection from propagated procedure
arguments _and_ allows selection of efficient code.

> [...]
>> 3) The last of these objections to your scheme is the fact that the
>>    first counter-example I tried actually couldn't be done correctly
>>    by your method:
>   [ static int a; main() { ... sub(&a); ... } fcn((int *) x) { ... } ]
> 
> It *is* handled correctly. fcn's only partition is {{x}}---a cannot be
> included, because a is not visible outside the package. Hence it assumes
> that &a and x may be aliased, and it generates the slow code for this
> case.

That's just the thing!  I mean, it shouldn't, should it?  You stated
above: "The fast version will always, in practice, be selected."
What happened to that?  Certainly, Fortran would select the _fast_
version.  Your claim to always do as well as Fortran is slipping.

My scheme will select the fast version for most cases and generate
an error in the rare (I hope) case when &a is actually passed all
the way through to fcn().  My scheme also allows me to declare a
slow version of fcn() if I _want_ to allow &a all the way through.
I can even (with function overloading) have both the fast and slow
versions loaded with the same code and the slow version only show up
on those call chains where it is really necessary.  (My function
overloading allows two functions with the same name which differ
in any specifically declared way in the interface specification.)

> [...]
> I agree that this would be inefficient if x and &a actually weren't
> aliased, but so what? It has *nothing* to do with the C equivalent of
> Fortran, which is what I'm addressing (and have been since we started
> this discussion many months ago). See the subject line. I'm making the
> same claim that I made back then: C can optimize Fortran (-converted)
> array code as well as Fortran can. Here you're talking about static
> variables, which have no real (visibility) equivalent in Fortran.

And yet, the Fortran-like version of this code would pick the fast
compilation.  Certainly Fortran Extended (which _does_ have the
concept of private variables) would optimize this code as if there
were no aliasing through the call chain - Fortran disallows such
aliasing.  My scheme allows such aliasing (if explicitly requested)
_and_ detects and disallows it (when it's _not_ explicitly resuested).

> 
>>    This is one of the things that my scheme _can_ do.
> 
> Why do you call it ``your scheme''? It's not ``your scheme'' any more
> than it's ``my scheme.'' We both came up with it before mentioning it to
> the other, but there's hardly any chance that adding assertions to the
> procedure call interface is a new idea.  [...]

I use this terminology to differentiate between the two different
concepts we support with a short phrase.  Instead of being insulting
or ragging me about some supposed attempt on my part to take credit
for the ideas, why don't you propose some alternate set of short
phrases which express the different ideas?  Why must you style of
argument _always_ be confrontational?  I have that unfortunate trait
to a great extent myself, but I _try_ not to supress it.

Besides, it's not clear that the idea of using the loader to
propagate the constraints through the call chain _isn't_ an original
idea.  I _may_ have every right to call it "my scheme".  I don't
know.  But, since I don't take a proprietary view toward the idea,
the use of the phrase "my scheme" should be regarded as merely
shorthand.

J. Giles

jlg@lanl.gov (Jim Giles) (11/10/90)

From article <24389:Nov906:00:4690@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
> I have always held the same position. You can see exactly the same
> position in many of my articles, and in my complaint that the Convex
> anti-aliasing mechanism makes the opposite assumption.
> 
> You claim that I have suggested otherwise. I challenge you to prove your
> claim. You will fail, because you are lying.

From article <2779:Nov608:20:1990@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
> : The most flexible (and arguably the best) solution is to let the
> : programmer assert the lack of aliasing between any variables at any
> : time. Q provides sufficiently general mechanisms for this.

Note: you state here that you assert the _lack_ of aliasing - which
means that you must be assuming that things are aliased by default.
This is the opposite direction to the order I would choose.  This
is a consistent position of yours.  If you really _mean_ that
the default is no-alias then your above solution should read:

   The most flexible (and arguably the best) solution is to let the
   programmer assert the _presence_ of aliasing between any
   variables at any time.

You should either say what you mean or you should stop accusing people
of lying when they take you at your word.

J. Giles

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/10/90)

In article <5483@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> No, you claim is only _almost_ true.  The true restatement of the
> last line above is: the fast version will _usually_, in practice,
> be selected.

Good enough! You have admitted that a local analysis (``usually'')
suffices to detect this all-important nonaliasing. Thank you. I won't
bother to argue this further.

> Given what you've said above, you
> plan to give only two choices: fast or _very_ slow.

That ``_very_ slow'' is the speed of current C, which isn't a total
disaster. Especially if (as you admit) it usually doesn't happen.

> In addition, you still have an exponential number of separate cases
> to select since you don't know before hand what specific aliasing
> assertions the user will make.  But, that's right, you want to
> condemn anyone who uses _any_ aliasing to inefficient code which
> assumes _all_ possible aliasing.

Fgs, Jim, all I'm trying to do is disperse the overly negative aura
you've been trying to cast upon C. The fact is that C array code can run
as quickly as Fortran array code, even if the C compiler doesn't know
beforehand that there's no aliasing. I'm not saying that either language
is perfect; just that Fortran doesn't have a big advantage over C here.

> (at least, assuming the .h files are from the
> same version of your library routines as your .o files are).

Why do you keep bringing up this assumption? ``At least, assuming the
library doesn't have bugs that will make your computer explode.'' For
crying out loud.

  [ supposed counterexample ]
> That's just the thing!  I mean, it shouldn't, should it?  You stated
> above: "The fast version will always, in practice, be selected."
> What happened to that?  Certainly, Fortran would select the _fast_
> version.

As I said before, your example has NOTHING to do with Fortran, since
there is no real (visibility) equivalent to static.

> (My function
> overloading allows two functions with the same name which differ
> in any specifically declared way in the interface specification.)

The guy publicly proclaims the virtues of error checking, but privately
overloads his functions. Ungood.

> Why must you style of
> argument _always_ be confrontational?  I have that unfortunate trait
> to a great extent myself, but I _try_ not to supress it.

My style of argument is not always confrontational. I'm sorry you have
that unfortunate trait to such a great extent.

> Besides, it's not clear that the idea of using the loader to
> propagate the constraints through the call chain _isn't_ an original
> idea.  I _may_ have every right to call it "my scheme".  I don't

C++ uses it, as I've said before in this thread. Jeez.

---Dan

jlg@lanl.gov (Jim Giles) (11/14/90)

From article <6787:Nov1008:04:0290@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
>   [ supposed counterexample ]
>> That's just the thing!  I mean, it shouldn't, should it?  You stated
>> above: "The fast version will always, in practice, be selected."
>> What happened to that?  Certainly, Fortran would select the _fast_
>> version.
> 
> As I said before, your example has NOTHING to do with Fortran, since
> there is no real (visibility) equivalent to static.

Actually, Fortran _does_ have the capability.  The exact analog to the
C example I gave before is:

      Function main(what, ever)
      integer a
      ...
      call sub(a)
      ...
      return
      entry fcn(x)
C     ... is a aliased to x here? ...
      return
      end

And, as I pointed out, Fortran will pick the FAST version of the
code.  What you propose will pick the SLOW version.  What I propose
will pick the _appropriate_ version.

Further, as I also pointed out, this was just the _FIRST_ counter
example that occurred to me.  The fact that your proposal doesn't
track constraints through the call chain suggests others.

      int a;            sub1(x){        int a;
      main(){           ...             sub2(y){
      ...               sub2(x);        ...
      sub1(&a);         sub3(x);        /* is a aliased to y? */
      }                 }               }


      int b;
      sub3(z){
      ...
      /* is b aliased to z? */
      }

If these routines are in four separate files, the method you
propose will not be able to answer the questions.  A Fortran
version will always optimize as if no aliasing occurred: most
implementations will not (unfortunately) catch the error.  No
matter which decision your method guesses, it will be wrong on
one of the two questions - you can only get the answers right
by propagating the information about the parameters through the
call chain.

J. Giles

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/14/90)

In article <5805@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> From article <6787:Nov1008:04:0290@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> > [...]
> >   [ supposed counterexample ]
> >> That's just the thing!  I mean, it shouldn't, should it?  You stated
> >> above: "The fast version will always, in practice, be selected."
> >> What happened to that?  Certainly, Fortran would select the _fast_
> >> version.
> > As I said before, your example has NOTHING to do with Fortran, since
> > there is no real (visibility) equivalent to static.
> Actually, Fortran _does_ have the capability.  The exact analog to the
> C example I gave before is:

Now you're changing the problem, because in Fortran a is visible with a
wider scope. It is *not* an exact analogue. And the fast version *will*
be selected---if, in fact, aliasing does not occur.

> What I propose
> will pick the _appropriate_ version.

What you propose has nothing to do with optimizing *current* computer
languages. I'm not talking about Nemesis or Q or Fortran 2001. I'm
talking about C.

  [ as another supposed counterexample, an unreadable rendition of: ]
  [ int a; main() { ... sub1(&a); } ]
  [        sub1(x) { ... sub2(x); sub3(x); } ]
  [ int a; sub2(y) { ... } ]
  [ int b; sub3(z) { ... } ]

Now a and b are GLOBAL! My solution DOES detect what aliasing goes on!

Sorry for shouting, but I'm getting really sick of your unjustified
attacks. If your variables have visibility as in Fortran, it *works*. If
you're not talking about a program that can be expressed the same way in
Fortran, then you're not talking about the same problem.

> you can only get the answers right
> by propagating the information about the parameters through the
> call chain.

You are so fascinated by whatever method you think of that you are blind
to the validity of other methods.

---Dan

jlg@lanl.gov (Jim Giles) (11/14/90)

From article <4610:Nov1323:35:0390@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
>   [ as another supposed counterexample, an unreadable rendition of: ]

Look who's talking.  You're the one that always flattens them out to
unreadible one-liners.

>   [ int a; main() { ... sub1(&a); } ]
>   [        sub1(x) { ... sub2(x); sub3(x); } ]
>   [ int a; sub2(y) { ... } ]
>   [ int b; sub3(z) { ... } ]
> 
> Now a and b are GLOBAL! My solution DOES detect what aliasing goes on!

Really?  I'm sorry.  I don't see it.  The method you wrote out before
will make the _same_ decision for _both_ sub2() and sub3().  The
partition in your method for sub1() is {{x}} - that's the only one.
The partitions for sub2() are either {{a}{y}} or {{a y}}.  The
partitions for sub3() are either {{a}{z}} or {{a z}}.

So, you claim that the compiler determines which versions of sub2()
and sub3() to use while it's compiling sub1().  But, why would the
compiler pick the _slow_ version of sub2() in this case?  How does
the compiler _know_ that main() is going to pass &a?  On the other
hand, why would the compiler _not_ pick the slow version of sub3()?
How does the compiler know (while working on sub1()) that main()
doesn't access b and pass &b as an argument?

> [...]
> Sorry for shouting, but I'm getting really sick of your unjustified
> attacks. If your variables have visibility as in Fortran, it *works*. If
> you're not talking about a program that can be expressed the same way in
> Fortran, then you're not talking about the same problem.

_THIS_ problem _can_ be exactly specified in Fortran.  No more hiding
behind loopholes about Fortran not (presently) having 'private' data.
This is a (vastly) simplified example of 'deep' aliasing which is very
common in large scale computing.  Unless the constraints on aliasing
are propagated through the call tree, you can't find the answers.
(Or, if you can, you certainly haven't demonstrated it yet.)

> [...]
> You are so fascinated by whatever method you think of that you are blind
> to the validity of other methods.

If you'd actually post a valid method, I'd look again.  So far, what
you've posted _won't_do_ what you claim.  The only way I can think of
for you to make this scheme work is for the function prototypes in
your header files to contain all aliasing information, not just for
the function being prototyped, but for all procedures _called_ by
the function, and all those called by them, and all those called
by them, ....  But, this is just the scheme I've been proposing
except the constraints are propagated _up_ the call chain instead
of down.

J. Giles

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/14/90)

In article <5837@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> From article <4610:Nov1323:35:0390@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> >   [ as another supposed counterexample, an unreadable rendition of: ]
> Look who's talking.  You're the one that always flattens them out to
> unreadible one-liners.

This is much more readable than your huge cut-and-paste format. Remember
the importance of context: it's much easier to read things that fit on
one page. My version below takes four lines, shows the variables and
procedures clearly, and doesn't distract the reader.

> >   [ int a; main() { ... sub1(&a); } ]
> >   [        sub1(x) { ... sub2(x); sub3(x); } ]
> >   [ int a; sub2(y) { ... } ]
> >   [ int b; sub3(z) { ... } ]
> > Now a and b are GLOBAL! My solution DOES detect what aliasing goes on!
> Really?  I'm sorry.  I don't see it.

Fortran does not have separate compilation like C. It does not have
static variables like C. Hence a and b are visible throughout---though
your C version does not reflect this accurately. The partitions for
sub1() include a and b, as do the partitions for sub2() and sub3().

Since you apparently didn't understand what I was saying, here goes
again:

> > Sorry for shouting, but I'm getting really sick of your unjustified
> > attacks. If your variables have visibility as in Fortran, it *works*. If
> > you're not talking about a program that can be expressed the same way in
> > Fortran, then you're not talking about the same problem.

---Dan

jlg@lanl.gov (Jim Giles) (11/15/90)

From article <4962:Nov1405:45:5090@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> [...]
>> >   [ int a; main() { ... sub1(&a); } ]
>> >   [        sub1(x) { ... sub2(x); sub3(x); } ]
>> >   [ int a; sub2(y) { ... } ]
>> >   [ int b; sub3(z) { ... } ]
>> > Now a and b are GLOBAL! My solution DOES detect what aliasing goes on!
>> Really?  I'm sorry.  I don't see it.
> 
> Fortran does not have separate compilation like C. [...]

Fortran has _only_ separate compilation.  It's C that is allowed to
compile several routines in the same file as a unit.  In Fortran,
even if the source of several routines are in the same file, the
compiler _must_ process each of them independently.

> [...]                                              It does not have
> static variables like C. [...]

Irrelevant, since this example has no private variables (which C
rather peculiarly calls 'static').

> [...]                    Hence a and b are visible throughout---though
> your C version does not reflect this accurately. [...]

Huh?  It's separate compilation which makes a and b _invisible_
to those functions that don't use them.  Or, are you saying that
a legal C version of sub1() _must_ mention a and b (even though
it doesn't use them)?  I assure you that _no_ C compiler in
existence requires this.  Certainly such a requirement would
violate both ANSI and K&R.

> [...]                                            The partitions for
> sub1() include a and b, as do the partitions for sub2() and sub3().

What?  How does the compiler even _know_ about a and b when it compiles
sub1()?  The file containing sub1() doesn't even mention them.  How in
the world does the compiler generate partition entries for variables
that aren't even mentioned in the source file?

> [...]
>> > Sorry for shouting, but I'm getting really sick of your unjustified
>> > attacks. If your variables have visibility as in Fortran, it *works*.[...]

WHAT visibility "as in Fortran?"  In Fortran, only locally declared
objects are visible.  If you want to access a global variable in
Fortran, you must locally declare it and locally import it (with a
COMMON statement) in each procedure that uses it.  To get the same
visibility rules in C, you have to compile the C code in a separate
file for each procedure.  This is _exactly_ the form of the example
I've given in this thread.

In Fortran, the call to sub2() in the example would be illegal.  The
compiler would generate the FAST (alias free) code for all procedures.
In C, the call to sub2() is legal.  Since the codes in sub2() and
sub3() are symmetric, the compiler clearly must make the _same_
decision about which aliasing status to pick on _both_ procedures.
Since the only safe pick is the SLOW (aliasing present) versions,
you will pick the wrong one for sub3().  Either that, or you're
going to choose an unsafe version of sub2().  The only way to
avoid this difficulty is to propagate the aliasing constraints
through the call tree.

You said above (and I repeat here):

> [...]                                            The partitions for
> sub1() include a and b, as do the partitions for sub2() and sub3().

And indeed, for your scheme to work, the partitions for sub1()
_should_ include a and b.  But, your method (as stated) will _not_
include them.  Those two variables are _never_ mentioned in the
file containing sub1().

J. Giles

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (11/16/90)

In article <6787:Nov1008:04:0290@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

> The fact is that C array code can run
>as quickly as Fortran array code, even if the C compiler doesn't know
>beforehand that there's no aliasing. 

The fact is that you had to hand-optimize to get C to do my example fast.

> I'm not saying that either language
>is perfect; just that Fortran doesn't have a big advantage over C here.

The fact is that not having to hand-optimize is an advantage.

See what I mean, Dan?

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/16/90)

In article <1990Nov16.042408.21235@murdoch.acc.Virginia.EDU> gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) writes:
> In article <6787:Nov1008:04:0290@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> > The fact is that C array code can run
> >as quickly as Fortran array code, even if the C compiler doesn't know
> >beforehand that there's no aliasing. 
> The fact is that you had to hand-optimize to get C to do my example fast.

Only for current compilers. With the modifications I've proposed, C
would handle your case. That's the point I've made from the first
article in this thread.

The problem with having discussions around you is that you refuse to pay
attention to what's going on. Then you insult everyone just for fun.

---Dan