preston@titan.rice.edu (Preston Briggs) (02/09/90)
In article <19233@dartvax.Dartmouth.EDU> jskuskin@eleazar.dartmouth.edu (Jeffrey Kuskin) writes:
>Isn't one of the RISC folks' main arguments for simple instruction sets
>that current compilers don't effectively exploit the complex addressing
>modes and instructions supported in CISC chips?  Perhaps someone would
>like to speculate on what progress the next decade will bring in
>compiler technology...

Code generators are powerful enough to handle most complex addressing
modes in a locally optimal fashion.  However, if we're doing lots of
global optimization, the need for complex addressing modes doesn't arise
(very often, when you've got enough registers, ...).

My small bet for the 90's is lots more work on using dependence analysis
for scalar machines, particularly managing memory hierarchies.  Everybody
who optimizes languages with arrays is going to have to get into
dependence analysis.

More work in software pipelining, interprocedural analysis and
optimization, pointer analysis???  Tools for writing parallel and
distributed code??  Minimizing TLB misses?

Preston Briggs
preston@titan.rice.edu
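A minimal sketch of the kind of transformation dependence analysis buys
you for memory hierarchies (my own illustration, not anything from the
Rice compiler; the function names and array size are invented):

    /* Hypothetical example: once dependence analysis proves that no
     * iteration reads a value written by another iteration, the compiler
     * may legally interchange the loops, turning a column-order walk of
     * a row-major C array into a unit-stride walk and cutting cache and
     * TLB misses.
     */
    #define N 512

    void scale(double a[N][N], double s)
    {
        int i, j;

        /* As written: each inner iteration jumps N doubles in memory. */
        for (j = 0; j < N; j++)
            for (i = 0; i < N; i++)
                a[i][j] *= s;
    }

    void scale_interchanged(double a[N][N], double s)
    {
        int i, j;

        /* After interchange (legal -- iterations are independent):
         * consecutive accesses touch consecutive memory. */
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                a[i][j] *= s;
    }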
mshute@r4.uucp (Malcolm Shute) (02/09/90)
In article <4812@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>My small bet for the 90's is [...]
>Tools for writing parallel
>and distributed code??

As I understand it, the 'tools for writing parallel code' are already
available, to the extent that we have 'tools for writing code' (though
obviously the quest for developing 'better tools for writing code' will
continue).

The point is that the programmer should not *have* to specify the
parallelism information in his program.  The programmer already has too
much to worry about in codifying the problem in the programming language,
without having another housekeeping task thrust upon him.  We have
machines to take on our housekeeping work.  The declarative programming
languages seem (opinion) to be one of the most promising ways to allow
the machine to find its own parallelism information without the
programmer having to go to any special lengths to allow for it.

Having said that, your original point is right to the extent that the
corollary of what you actually seemed to say is the problem: how to keep
the code from distributing too freely, and too far, across the system
(again, of course, a task which should be performed mechanically, not
added as a burden on the programmer).  The harnessing of 'locality' is
perhaps the main stumbling block for most multiprocessor research
machines.  Use of 'locality' is, and always has been, the most important
thing to get right if one hopes to achieve high performance (witness the
effect of adding banks of general-purpose registers, cache memory, etc.
in conventional machines).  This is, and will always be, a prime place to
focus research effort.

Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all
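One concrete picture of the 'locality' point (my illustration, not
Shute's; N and the tile size B are arbitrary values chosen for the
sketch): blocking, or tiling, a matrix multiply so that each small tile
is reused while it is still resident in cache -- exactly the kind of
restructuring one would like the machine to perform mechanically.

    /* Illustrative only: a hand-blocked matrix multiply.  The caller is
     * assumed to have zeroed c.  Each B-by-B tile of a, b, and c is
     * reused many times before the loop moves on, so most accesses hit
     * in cache instead of going to memory.
     */
    #define N 256
    #define B 32   /* tile edge; chosen so three tiles fit in cache */

    void matmul_blocked(double c[N][N], double a[N][N], double b[N][N])
    {
        int i, j, k, ii, jj, kk;

        for (ii = 0; ii < N; ii += B)
            for (kk = 0; kk < N; kk += B)
                for (jj = 0; jj < N; jj += B)
                    /* work entirely within one tile of each operand */
                    for (i = ii; i < ii + B; i++)
                        for (k = kk; k < kk + B; k++)
                            for (j = jj; j < jj + B; j++)
                                c[i][j] += a[i][k] * b[k][j];
    }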
mshute@r4.uucp (Malcolm Shute) (02/09/90)
>In article <4812@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>>My small bet for the 90's is [...]
>>Tools for writing parallel
>>and distributed code??

In article <651@m1.cs.man.ac.uk> I write:
> [Some things which interpret the above statement to mean 'Tools to help
> the programmer to explicitly express parallel and distributed code']

After hitting the send key, I realised that the author might have
intended to include the possibility of the tools being used 'behind the
scenes' by the compiler/run-time environment.  Since I accept that this
interpretation exists, I agree with the statement, and ask for 'no flames
please' wrt my previous posting.

Malcolm Shute.         (The AM Mollusc:   v_@_ )        Disclaimer: all
glass@qtc.UUCP (David N. Glass) (02/14/90)
> In article <19233@dartvax.Dartmouth.EDU> jskuskin@eleazar.dartmouth.edu (Jeffrey Kuskin) writes:
>
> >Isn't one of the RISC folks' main arguments for simple instruction sets
> >that current compilers don't effectively exploit the complex addressing
> >modes and instructions supported in CISC chips?  Perhaps someone would
> >like to speculate on what progress the next decade will bring in
> >compiler technology...

As processors contain more parallel functional units, compiler technology
is going to have to work harder at keeping all concurrent units as busy
as possible, given the application.  For this reason, software pipelining
and local/global compaction is becoming a very hot topic.

It's easy to see why when you look at some benchmarks we've produced with
our own set of code-rescheduling products.  The tools are used as an
assembly-language-to-assembly-language translator between the compiler
and assembler steps.  Given current i860 compilers, for example, we have
seen execution speed improvements of as much as 206% (Livermore loop
kernel 7), with an average of 51% over the entire Livermore loop suite.
Our Intel 960CA scheduler averages 21% execution improvement, with
applications like matrix multiply getting 104% speedup.  The 88000
version improves Linpack 40%, and Livermore loops up to 80% (Livermore 1).

I see more work being done in the area of global or inter-basic-block
optimizations: things like software pipelining over multiple blocks, or
over an outer loop that contains an inner loop (or loops).  Further, more
analysis of data dependencies will happen, especially in relation to
managing memory hierarchies.

-- Dave Glass

+----------------------------------------------------------------------
| David N. Glass                                  ...ogcvax
| Quantitative Technology Corp.                   ...verdix    !qtc!glass
| 8700 SW Creekside, Suite D                      ...sequent
| Beaverton, OR 97005
+----------------------------------------------------------------------
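A rough C-level picture of what software pipelining does (my own sketch,
written by hand and not the output of QTC's tools, which work on machine
instructions): the load for iteration i+1 is issued while iteration i's
multiply-add is still in flight, so the load unit and the floating-point
unit overlap instead of taking turns.

    /* Illustrative sketch only: software pipelining expressed in C.
     * Prologue starts the pipeline, the steady-state loop overlaps the
     * "load" stage of one iteration with the "compute/store" stage of
     * the previous one, and the epilogue drains the pipeline.
     */
    void saxpy(int n, float a, const float *x, float *y)
    {
        int i;
        float xi, yi, xnext, ynext;

        if (n <= 0)
            return;

        /* prologue: fill the pipeline */
        xi = x[0];
        yi = y[0];

        for (i = 0; i < n - 1; i++) {
            xnext = x[i + 1];        /* load stage for iteration i+1   */
            ynext = y[i + 1];
            y[i] = a * xi + yi;      /* compute/store for iteration i  */
            xi = xnext;
            yi = ynext;
        }

        /* epilogue: drain the pipeline */
        y[n - 1] = a * xi + yi;
    }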