[comp.arch] "interprocedural analysis useless for code optmization"

bcase@cup.portal.com (10/13/88)

There is a paper by S. Richardson and M. Ganapathi at Stanford that might be
of interest to those engaging in discussions that consider alias analysis a
problem (e.g., the machine-independent-intermediate language discussion).

"Interprocedural Analysis Useless for Code Optimization," Richardson &
Ganapathi, Stanford Tech. Report No.  CSL-TR-87-342, Nov. 1987.

At least for Pascal programs, the conclusion that not only is interprocedural
analysis pretty much useless for real programs, but "ALIAS information, which
is the most difficult to compute algorithmically, turns out to be the least
useful.  Our data show that only about 1% of the loads and stores in a given
program are susceptible to being potentially aliased to a var parameter, and
that the speedup gained by making such relationships explicit is almost nil."

Perhaps the case is different for C programs, but the old assumption (at
leat my assumption) that interprocedural optimization would be the next
important level of optimization technology, appears to be false.  Sigh.  [As
side note, I showed this paper to John Banning who, in '79, wrote a thesis
containing the "classic" algorithm for interproc. analysis.  He said "Damn,
if I'd known, I would have done something else." :-)]

The paper shows the speedups for 27 programs (some are small benchmarks);
the best that could be managed was a respectable 12%, but only *4* of the
programs showed any improvement; the other three speedups were 1%, 4%, and
7%.  Pure alias analysis contributed nothing to the speedups!  Wow.  Again,
maybe C programs provide more opportunities.  Any data out there?

barmar@think.COM (Barry Margolin) (10/14/88)

In article <9988@cup.portal.com> bcase@cup.portal.com writes:
>maybe C programs provide more opportunities.  Any data out there?

I don't have any data, but I do suspect that C provides more
opportunities than Pascal.  The difference is that in Pascal, aliases
can only occur due to VAR parameters.  In C, however, the address of
ANY variable (except register variables) may be taken, so there is
more possibility of aliasing.

I think PL/I provides even more possibilities, because of nested
procedures.


Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

henry@utzoo.uucp (Henry Spencer) (10/15/88)

In article <9988@cup.portal.com> bcase@cup.portal.com writes:
>At least for Pascal programs, the conclusion that not only is interprocedural
>analysis pretty much useless for real programs, but "ALIAS information, which
>is the most difficult to compute algorithmically, turns out to be the least
>useful..."  ...Perhaps the case is different for C programs...

C is a very different situation because of all the pointers running around.
Pascal's aliasing issues are insignificant by comparison.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

hankd@pur-ee.UUCP (Hank Dietz) (10/15/88)

In article <29285@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> In article <9988@cup.portal.com> bcase@cup.portal.com writes:
[Stuff as to frequency of aliases being basically a function of
 the language that code is written in...]

I have references as to frequency of aliases (I can send copies if you're
interested).  Basically, C is truly a mess (unless you use register a whole
lot), but any language with arrays, call-by-address (e.g., Pascal VARs),
etc. has these problems.  The extent to which they impact code quality
depends mostly on what the code does (algorithm) and what the target machine
can do, not on the language in which it is written.

							-hankd

mac3n@babbage.acc.virginia.edu (Alex Colvin) (10/15/88)

The possibilities for aliasing are MUCH greater in C.  C allows any non-
register variable to be aliased.  Technically, it may be aliased only by
another variable of the same type (modulo union types).

Pascal seems to have been designed to obviate the need for aliasing
analysis.  It looks like it was successful.

Perhaps the conclusion here is that language design up front simplifies
optimization in back.  What does this tell us about universal intermediate
languages?

Unfortunately, someone's got to compile those millions of lines of C.  And
PL/I and F*TRAN.

henry@utzoo.uucp (Henry Spencer) (10/16/88)

In article <371@babbage.acc.virginia.edu> mac3n@babbage.acc.virginia.edu (Alex Colvin) writes:
>Pascal seems to have been designed to obviate the need for aliasing
>analysis.  It looks like it was successful.

It wasn't particularly designed for it, it just happened that way.  The
strictness of the Pascal type system and the very limited availability of
pointers are the main reasons.  Aliasing can still be a problem in Pascal;
note that the tech report in question didn't say it doesn't happen, just
that there are few opportunities for optimization as a result.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

kirchner@uklirb.UUCP (Reinhard Kirchner) (10/19/88)

From article <371@babbage.acc.virginia.edu>, by mac3n@babbage.acc.virginia.edu (Alex Colvin):
> 
> Pascal seems to have been designed to obviate the need for aliasing
> analysis.  It looks like it was successful.
> 
Also Pascal has some difficult to detect aliases. Among them are:

- VAR-Parameters and global variables
- VAR-Parameters and heap objects
- WITH-accessed parts of structures and direct accessed ones
- VAR-ARRAYs and VAR-Scalars

The VAR-Parameters are the main problem, because many objects which are
accessible as VAR-Parameter may also be accessed directly. So each access
to a VAR kills globals held in Registers and vice versa. I had lots of trouble
with these things when writing the codegenerator for Pascal-SC ( the one
which lets You prove that a CRAY is a way to compute wrong numbers VERY FAST-:)

R. Kirchner

firth@sei.cmu.edu (Robert Firth) (10/24/88)

In article <2982@uklirb.UUCP> kirchner@uklirb.UUCP (Reinhard Kirchner) writes:

>Also Pascal has some difficult to detect aliases. Among them are:
>
>- VAR-Parameters and global variables
>- VAR-Parameters and heap objects
>- WITH-accessed parts of structures and direct accessed ones
>- VAR-ARRAYs and VAR-Scalars
>
>The VAR-Parameters are the main problem, because many objects which are
>accessible as VAR-Parameter may also be accessed directly. So each access
>to a VAR kills globals held in Registers and vice versa...

These are valid points, but I don't think the situation is quite that
bad.  (I've said this before, so will be brief)

a. Pascal is strongly typed. Therefore, a VAR of one type cannot alias
   an object of a different type (minor exception is when one type is
   composed from the other).  This in my experience means that one store
   to a VAR kills less than 20% of known tracked globals

b. Heap objects cannot be aliased with non-heap objects.  Usually, the
   programmer passes a pointer by value rather than the designated object
   by reference, in which case no alias is possible.

c. The language definition prohibits threatening access to a structure
   that has been bound in a WITH statement, so the codegenerator may
   legally assume no aliasing.

Things are even better if you are using an extended Pascal with separate
compilation facilities.  The "globals" are then partitioned into disjoint
sets; a routine defined in one module can directly access a global defined
in another only if there is an explicit IMPORT dependency.  This information
should be kept with each compiled module ("modules threatened list"), and
can be used to filter potential aliases (" a call of M1.ALPHA cannot threaten
globals declared in M2 so I can slave their values across this call").