[gnu.gcc] Instruction Scheduling and Branch Scheduling

tiemann@YAHI.STANFORD.EDU (Michael Tiemann) (06/17/89)

I don't know if this is the right place to post technical information
about GNU CC, but here goes...

The following is the title and abstract for a paper I recently handed in
for a compiler class I took at Stanford this quarter:

		The GNU Instruction Scheduler

			Michael Tiemann


        The GNU C compiler is a mostly portable, optimizing
        compiler which was written by Richard Stallman and
        others to provide the GNU project with a free C
        compiler.  This report describes the GNU Instruction
        Scheduler, a new pass written by the author for the
        GNU C compiler.  This report begins with a brief
        overview describing the architecture of the GNU C
        compiler, followed by a description of list-based
        instruction scheduling.  Design goals of the
        scheduler and accomplishments to date are then
        presented, followed by a section on work yet to be
        done.  This report concludes with an evaluation of
        the scheduler, its limitations and its strengths.

If anyone has any intererest in reading the paper, you can get get it by
anonymous ftp from yahi.stanford.edu, from the file dbr.texinfo.Z.  It
is written using texinfo, and a modified texinfo.tex (to make chapter
fonts smaller) is also provided in the file texinfo.tex.Z.  I will be
too busy catching up with GNU C++ to "maintain" the paper, but if you do
send me comments, I will file them away for review at a later date.

For people who may find themselves in an advanced compiler course, the
following parameters not mentioned in the paper may be of interest.  The
course I took was a 1 quarter (10 week) course titled "Advanced Topics
in Compilers", taught by Monica Lam.  Reading material was plentiful
(usually 2-3 recent journal articles per week) and varied (different
topic every week).  There was a mid-term paper and a final project.  The
time I allotted to the project was just over two weeks, including the
time to write the 20 page final report.  The paper I wrote for the
mid-term was unrelated to instruction scheduling or free software.

Michael

brooks@vette.llnl.gov (Eugene Brooks) (06/18/89)

In article <8906170506.AA29525@yahi> tiemann@lurch.stanford.edu writes:
>I don't know if this is the right place to post technical information
>about GNU CC, but here goes...
WARNING: THIS IS A PURELY TECHNICAL RESPONSE TO A PURELY TECHNICAL POSTING
ALL YOU POLITICAL TYPES HIT THE KILL KEY NOW!  I APOLOGIZE IN ADVANCE FOR
POSTING THIS TO gnu.gcc.

I snarfed Tiemann's paper across the net and read it, I hope that this paper
appears as part of the documentation of the GCC distribution at some point.
Perhaps one could start collecting and stuffing these things in a subdirectory.
At the risk of suggesting that net bandwidth be tied up with such technical
stuff, I suggest that Tiemann actually post his paper to gnu.gcc for those
without ftp access.

Instruction scheduling is a very machine specific operation because it
depends so strongly on the exact characteristics of the target machine.
For the recent pipeline RISC chips with short pipes, Tiemann's results would
appear to apply well.  However, for supercomputers and next years RISC chips
where the pipeline delays for loads and floating point operations
are a bit longer, some of the scheduling tricks which were tried
and found not to work will have a much different performance character.

If would appear that any attempt to do a "machine independent" implementation
of instruction scheduling might require a couple dials that could be set
from within machine dependent files, say output.c, which would control the
action of the scheduling algorithms.  Tiemann's first cut would work well
on a Cyber 205 which has really long pipeline delays and lots of registers.
Tiemann's last cut clearly works well on the SPARC and will probably work
well on the MIPS and other chips with short pipes.  The pipeline delays
of the 88K are longer, for instance three clocks for a load and 6 for
floating point, and the first cut might have different performance results
for this chip.

brooks@maddog.llnl.gov, brooks@maddog.uucp