[comp.arch] Compilers vs. architecture

preston@titan.rice.edu (Preston Briggs) (02/09/90)

In article <19233@dartvax.Dartmouth.EDU> jskuskin@eleazar.dartmouth.edu (Jeffrey Kuskin) writes:

>Isn't one of the RISC folks' main arguments for simple instruction sets
>that current compilers don't effectively exploit the complex addressing
>modes and instructions supported in CISC chips?  Perhaps someone would
>like to speculate on what progress the next decade will bring in
>compiler technology...

Code generators are powerful enough to handle most complex addressing
modes in a locally optimal fashion.  However, if we're doing lots
of global optimization, the need for complex addressing modes
doesn't arise very often, at least when you've got enough registers.
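
For concreteness, here's a made-up, source-level caricature of what I
mean (the compiler actually does this on its intermediate form, and the
function names here are invented):

/* Sketch: with enough registers, strength reduction makes a scaled
   base+index addressing mode unnecessary. */

double sum_naive(double a[], int n)
{
    double s = 0.0;
    int i;

    for (i = 0; i < n; i++)
        s += a[i];          /* address is a + 8*i every iteration: the
                               natural customer for base+index*scale   */
    return s;
}

double sum_reduced(double a[], int n)
{
    double s = 0.0;
    double *p = a;          /* induction variable lives in a register */
    double *limit = a + n;

    while (p < limit)
        s += *p++;          /* plain register-indirect access: the
                               multiply has been strength-reduced into
                               the pointer increment                   */
    return s;
}

Once p and limit live in registers, the "complex" address computation
has simply disappeared, with no cleverness required of the code
generator.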

My small bet for the 90's is lots more work on using dependence
analysis for scalar machines, particularly managing memory
hierarchies.  Everybody who optimizes languages with arrays
is going to have to get into dependence analysis.
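
A made-up example of the kind of decision I'm talking about, with C
source standing in for what the dependence analyzer actually sees
(N is just an illustrative constant):

/* A loop nest where dependence analysis licenses an interchange
   that helps the memory hierarchy. */
#define N 512

void add_bad(double a[N][N], double b[N][N])
{
    int i, j;

    for (j = 0; j < N; j++)        /* column order: stride-N accesses, */
        for (i = 0; i < N; i++)    /* poor cache behaviour for large N */
            a[i][j] = a[i][j] + b[i][j];
}

void add_good(double a[N][N], double b[N][N])
{
    int i, j;

    for (i = 0; i < N; i++)        /* interchanged: unit-stride accesses */
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + b[i][j];
}

/* The interchange is legal because this nest carries no dependences.
   A statement like
       a[i][j] = a[i-1][j+1] + 1.0;
   carries a (<,>) dependence and would forbid it; deciding which case
   you're in is exactly the dependence analyzer's job. */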

More work in software pipelining, interprocedural analysis and
optimization, pointer analysis???  Tools for writing parallel
and distributed code??  Minimizing TLB misses?

Preston Briggs
preston@titan.rice.edu

mshute@r4.uucp (Malcolm Shute) (02/09/90)

In article <4812@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>My small bet for the 90's is [...]
>Tools for writing parallel
>and distributed code??

As I understand it, the 'tools for writing parallel code' are already
available, to the extent that we have 'tools for writing code'
(though obviously the quest for developing 'better tools for writing code'
will continue).  The point is that the programmer should not *have* to
specify the parallelism information in his program.  The programmer
already has too much to worry about in codifying the problem in the
programming language, without having another housekeeping task thrust
upon him.  We have machines to take on our housekeeping work.
The declarative programming languages seem (in my opinion) to be one of
the most promising ways of letting the machine find the parallelism
for itself, without the programmer having to go to any special lengths
to expose it.

Having said that, your original point is right if read as its corollary:
the real problem is how to keep the code from distributing too freely,
and too far, across the system (again, of course, a task which should be
performed mechanically, not added as a burden on the programmer).
Harnessing 'locality' is perhaps the main stumbling block for most
multiprocessor research machines.  Exploiting 'locality' is, and always
has been, the most important thing to get right if one hopes to achieve
high performance (witness the effect of adding banks of general-purpose
registers, cache memory, etc. to conventional machines).  This is, and
will always be, a prime place to focus research effort.
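
For the conventional uniprocessor case at least, the kind of locality
harnessing I mean can be sketched in a few lines of C (the constants
are invented, and ideally the compiler, not the programmer, would do
this):

/* Blocked (tiled) matrix multiply: work on tiles small enough to
   stay resident in the cache while they are reused.  N and BS are
   made-up constants; a real system would pick the block size from
   the cache parameters of the target machine.  c[][] is assumed to
   be zeroed by the caller. */
#define N  512
#define BS  64

void matmul_blocked(double c[N][N], double a[N][N], double b[N][N])
{
    int ii, jj, kk, i, j, k;

    for (ii = 0; ii < N; ii += BS)
        for (jj = 0; jj < N; jj += BS)
            for (kk = 0; kk < N; kk += BS)
                for (i = ii; i < ii + BS; i++)
                    for (j = jj; j < jj + BS; j++)
                        for (k = kk; k < kk + BS; k++)
                            c[i][j] += a[i][k] * b[k][j];
}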

Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all

mshute@r4.uucp (Malcolm Shute) (02/09/90)

>In article <4812@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>>My small bet for the 90's is [...]
>>Tools for writing parallel
>>and distributed code??
>
In article <651@m1.cs.man.ac.uk> I write:
> [Some things which interpret the above statement to mean 'Tools to help
> the programmer to explicitly express parallel and distributed code']

After hitting the send key, I realised that the author might have intended
to include the possibility of the tools being used 'behind the scenes'
by the compiler/run-time environment.

Since I accept that this interpretation exists, I agree with the statement,
and ask for 'no flames please' wrt my previous posting.

Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all

glass@qtc.UUCP (David N. Glass) (02/14/90)

> In article <19233@dartvax.Dartmouth.EDU> jskuskin@eleazar.dartmouth.edu (Jeffrey Kuskin) writes:
> 
> >Isn't one of the RISC folks' main arguments for simple instruction sets
> >that current compilers don't effectively exploit the complex addressing
> >modes and instructions supported in CISC chips?  Perhaps someone would
> >like to speculate on what progress the next decade will bring in
> >compiler technology...
> 

As processors contain more parallel functional units, compiler technology
is going to have to work harder to keep all the concurrent units as busy as
possible for a given application.  For this reason, software pipelining and
local/global compaction are becoming very hot topics.
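
A rough, source-level sketch of the software pipelining idea (our tools
actually work on the machine instructions, not on C, and the function
below is made up):

/* Start the "load" for the next iteration before finishing the work
   of the current one, so a machine with parallel functional units
   can overlap them. */
void scale(double *dst, const double *src, double k, int n)
{
    int i;
    double cur, next;

    if (n <= 0)
        return;

    cur = src[0];                  /* prologue: prime the pipeline     */
    for (i = 0; i < n - 1; i++) {
        next = src[i + 1];         /* load for iteration i+1 ...       */
        dst[i] = cur * k;          /* ... overlaps the multiply and
                                      store for iteration i            */
        cur = next;
    }
    dst[n - 1] = cur * k;          /* epilogue: drain the pipeline     */
}

The prologue and epilogue are what make pipelined loops grow in size;
the payoff is that the load for iteration i+1 can issue while the
multiply and store for iteration i are still in flight.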

It's easy to see why when you look at some benchmarks we've produced with
our own set of code rescheduling products.  The tools are used as an
assembly-language-to-assembly-language translator between the compiler
and assembler steps.  Given current i860 compilers, for example, we have
seen execution speed improvements of as much as 206% (Livermore loop
kernel 7), with an average of 51% over the entire Livermore loop suite.
Our Intel 960CA scheduler averages a 21% execution improvement, with
applications like matrix multiply getting a 104% speedup.  The 88000
version improves Linpack by 40%, and the Livermore loops by up to 80%
(Livermore loop 1).

I see more work being done in the area of global, or inter-basic-block,
optimizations: things like software pipelining over multiple blocks, or
over an outer loop that contains an inner loop (or loops).  Further, more
analysis of data dependencies will happen, especially in relation to
managing memory hierarchies.

-- Dave Glass

+----------------------------------------------------------------------
|   David N. Glass                         ...ogcvax            
|   Quantitative Technology Corp.          ...verdix   !qtc!glass 
|   8700 SW Creekside, Suite D             ...sequent             
|   Beaverton, OR  97005
+----------------------------------------------------------------------