[comp.compilers] Debugging optimized code

lyle@cse.ogi.edu (Lyle Cool) (07/25/90)

Greetings all!

I would like to come up with a master's thesis on some aspect of debugging
optimized code. I am looking for two things right now: 1) references to
relevant articles and/or books, and 2) suggestions for projects appropriate
for a master's thesis. If there is sufficient interest, and if the moderator
is willing, I will post a summary of what I receive.

Thanks,

Lyle Cool

lyle@cse.ogi.edu
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus| world}!esegue.  Meta-mail to compilers-request@esegue.

jac@paul.rutgers.edu (Jonathan A. Chandross) (07/27/90)

> I would like to come up with a master's thesis on some aspect of debugging
> optimized code. I am looking for two things right now: 1) references to
> relevant articles and/or books, and 2) suggestions for projects appropriate
> for a master's thesis.

I am writing this with some reluctance.  Partially because I am not certain
if the topic is appropriate for this group, partially because I am uncertain
as to the reaction.

I constantly see individuals in graduate school posting requests for
references for various topics.  Is the art of using a library totally dead?
Is it unreasonable to expect someone who claims they want to do research to
know how to use a library?  How can someone say they want to do research in a
field when they don't even know the rudiments of how to find information or
even what the field is about?  Even a single paper will give you a dozen
references that can be tracked down with only minor effort.

So that this diatribe is not totally content-free, here is a brief
introduction to finding information on compilers.

Computing Surveys 
	Survey articles on various topics.  The references cited should
	give you a good idea of where to look for more information.
SigPlan Compiler Conferences (conference proceedings)
  Symposium on Compiler Construction
  Programming Language Design and Implementation
	These proceedings contain many papers on compilers and related
	systems.  Tends to be practically oriented, but some theory. 
SigPlan Notices
	Journal.  Often contains interesting things.  Not refereed;
	beware -- quality varies tremendously.
POPL (Principles of Programming Languages) Conference proceedings
	Papers on compilers; tends to be more theoretical than SigPlan.
TOPLAS (Transactions On Programming Languages and Systems)
	Journal.  Papers on compilers; tends to be more theoretical than SigPlan.
ASPLOS (Architectural Support for Programming Languages and Operating Systems)
	More hardware oriented; explains hardware requirements but often
	describes compiler-related issues.
MICRO (Microprogramming) conference proceedings
	Issues regarding LIW, VLIW machines.  Compaction, flow analysis,
	and hardware issues.  Mostly microprogrammed hardware design.
AFIPS proceedings
	The old volumes (sixties, early seventies) contain descriptions
	of many old compilers and systems.  Very helpful to understand
	where things come from.

If you've looked through these proceedings and journals and you still
can't find what you want, ask somebody.

BUT DO YOUR HOMEWORK FIRST.

Don't expect others to do your research for you.  You won't learn anything
at all that way.

You can also look at PhD dissertations and tech reports that are cited in the
references.  Or, if you have some time to spare, browse through the tech
reports issued by Berkeley, Rice, University of Illinois-UC, Stanford,
Cornell, Princeton, NYU, Columbia, Rutgers, etc. to find dissertations and
tech reports that look like they would be helpful.  A paper is too short to
contain much introduction and explanation; a tech report often contains good,
solid, background material.  They are also much more recent than papers.  A
paper often takes a year or more to get published; by the time it comes out,
the information may no longer be state of the art. 

As for your second question, i.e. a good topic, read through some of the
papers.  Get a feel for the issues; what's hard, what's easy, and why.
Critique the work -- write down a list of pros and cons for each paper.  What
does it do well?  What does it not do well?  What does it avoid entirely?
Group the papers into categories.  Trace ideas through them.  See how the
cross-fertilization process works.  See if you can spot places where a
combination of the techniques might work.  Think some more.  Talk to people
about your conclusions.  Think a lot more.  Re-read what you've written.  Is
it correct?  Is it fair?  Keep thinking.

Then, try to find a problem that needs solving.  Do a little preliminary
investigation of the topic to ensure you haven't bitten off too much.  Talk
to your advisor about the topic and see if he/she agrees that it is
appropriate.

If you want to do research you have to think.  It isn't easy and it isn't
always pleasant.  But it is the only way.  And the end result is usually
worth it.

Jonathan A. Chandross
Internet: jac@paul.rutgers.edu
UUCP: rutgers!paul.rutgers.edu!jac
[This point is well taken; in the future I'll encourage senders of similar
messages to find out enough to ask something more specific. -John]

johnson@cs.uiuc.edu (Ralph Johnson) (07/31/90)

One of the good reasons for posting to the net is that you can get information
that hasn't been published.  This is especially important in the area of
programming languages, where there are not a lot of places to publish.

Larry Zurawski did his PhD thesis on debugging optimized code.  We have a
(long) paper on the subject that I am polishing.  I would be happy to send it
to people like Lyle Cool, especially if they criticize it.  The basic idea is
to achieve the best optimizations possible while still ensuring that the
debugger behaves as expected.  This is done by noting every point in the
program where the debugger could be invoked and ensuring that it can recover
the unoptimized state.  The compiler stores recovery information for each of
these points in the program, and it occasionally omits an optimization.
However, it doesn't omit many; many programs are not slowed down at all by
omitting optimizations, while the worst case we found was 15% slower.
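
The recovery-information idea in the paragraph above can be illustrated
with a minimal Python sketch.  This is not the scheme from the thesis or
the paper, only one way such a table might look.  Every name in it
(RecoveryEntry, RECOVERY_TABLE, source_view, the "loop_body" point, the
slots p and base) is hypothetical, and the two optimizations shown
(strength reduction and constant propagation) are assumed for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Optimized machine state at a stop: slot or register name -> value.
# (Hypothetical model; a real debugger would read registers and memory.)
State = Dict[str, int]


@dataclass
class RecoveryEntry:
    """How to recompute one source variable at one debugger-visible point."""
    variable: str
    recover: Callable[[State], int]


# Hypothetical table the compiler would emit: program point -> entries.
RECOVERY_TABLE: Dict[str, List[RecoveryEntry]] = {
    "loop_body": [
        # Assumed strength reduction: index i was replaced by p = base + 4*i.
        RecoveryEntry("i", lambda s: (s["p"] - s["base"]) // 4),
        # Assumed constant propagation: the store to `scale` was removed.
        RecoveryEntry("scale", lambda s: 8),
    ],
}


def source_view(point: str, optimized: State) -> Dict[str, int]:
    """Rebuild the source-level variables the user expects to see."""
    return {e.variable: e.recover(optimized) for e in RECOVERY_TABLE[point]}
```

Calling source_view("loop_body", {"p": 1012, "base": 1000}) rebuilds i and
scale even though neither exists in the optimized state.  An optimization
for which no such recovery function can be written is the kind the
compiler would have to omit.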

I can send a paper copy if you send me your paper mail address, or the LaTeX
source if that is suitable.

Ralph Johnson -- University of Illinois at Urbana-Champaign

[I am certainly not opposed to people asking for info on the net, but do think
it appropriate that there be some evidence that they have done some homework
before they send out a message.  That's why you haven't seen any messages here
saying "Are there any books on how to write a compiler?" while we've had quite
a lot of discussion on the pros and cons of various books that people have
looked at. -John]

pattis@cs.washington.edu (Richard Pattis) (07/31/90)

Polle Zellweger.  Her thesis was done with John Hennessy at Stanford, but
Polle got her Ph.D. at Berkeley (both institutions may have published the
thesis). She is now at Xerox PARC.

Rich Pattis
[Does anyone know if it was actually published, or does one have to order
the microfilm? -John]

jclark@src.honeywell.com (Jeff Clark) (08/01/90)

In article <1990Jul30.225825.28729@esegue.segue.boston.ma.us> pattis@cs.washington.edu (Richard Pattis) writes:

> Polle Zellweger.  Her thesis was done with John Hennessy at Stanford, but
> Polle got her Ph.D. at Berkeley (both institutions may have published the
> thesis). She is now at Xerox PARC.

It is available as Tech. Rpt. CSL-84-5 from Xerox PARC.  The citation and
abstract follow.

@TechReport{zell84,
  author = 	"Polle T. Zellweger",
  title = 	"Interactive Source-Level Debugging of Optimized Programs",
  institution =	"Xerox Palo Alto Research Center",
  year = 	"1984",
  OPTtype = 	"Research Report (PhD Thesis)",
  OPTnumber = 	"CSL-84-5",
  OPTaddress = 	"3333 Coyote Hill Road, Palo Alto, California 94304",
  OPTmonth = 	"May",
  OPTabstract =	"The transformations performed by an optimizing compiler have
		 traditionally impeded interactive debugging in source
		 language terms:  after optimization, a program's source text
		 and object code do not have a straightforward correspondence.
		 This dissertation shows that effective interactive
		 source-level debuggers can be provided for optimized
		 programs.  Such debuggers can reduce debugging time and
		 programmer confusion.  These benefits are especially
		 important given the increasing availability of optimizing
		 compilers.

		 The first half of the dissertation studies the overall
		 problem of debugging optimized programs.  It presents a
		 general view of debuggers and defines two important levels of
		 debugger behavior for optimized programs.  A debugger
		 provides ``expected behavior'' if it hides the effects of the
		 optimizations from the user by doing behind the scenes
		 processing.  It provides ``truthful behavior'' if it
		 indicates that it cannot give the exact answer to a debugging
		 query (because the executing program differs from the source
		 program).  The user may be able to deduce the correct answer
		 from the partial information displayed by a truthful
		 response.  A thorough study of the interactions between
		 optimization and debugging is included.  In addition, a
		 collection of solution techniques to relieve the problems
		 caused by optimization are described.

		 The second half of the dissertation describes implementation
		 experience with one aspect of the problem.  A prototype
		 debugging system called Navigator was developed for the Cedar
		 programming environment at the Xerox Palo Alto Research
		 Center.  Navigator can be used interactively to monitor
		 program execution flow in the presence of two simple but
		 nontrivial optimizations: inline procedure expansion and
		 cross-jumping (merging identical tails of code paths that
		 join).  Navigator provides expected behavior by combining
		 information collected by the compiler about the effects of
		 the optimizations and information collected by the debugger
		 about the control-flow history of the computation.  Program
		 execution space and speed are almost totally unaffected when
		 no debugging requests are active.  When debugging is
		 requested, Navigator provides its added functionality without
		 noticeably degrading debugger response time for most
		 programs.  Proofs of correctness of the compiler and the
		 debugger algorithms are given, as well as an analysis of
		 their efficiency."}
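
The distinction the abstract draws between expected and truthful behavior
can be sketched for cross-jumping in a few lines of Python.  This is only
an illustration under stated assumptions, not Navigator's algorithm: the
table MERGED_TAILS, the function source_line, the address 0x40, and the
line numbers are all hypothetical.

```python
from typing import Dict, List, Optional, Union

# Hypothetical compiler output: for an object-code address that
# cross-jumping merged, the source line it stands for on each branch arm.
MERGED_TAILS: Dict[int, Dict[str, int]] = {
    0x40: {"then": 12, "else": 15},
}


def source_line(address: int,
                last_branch: Optional[str]) -> Union[int, List[int]]:
    """Map a merged object-code address back to source line(s).

    Returns one line when control-flow history settles the question
    (expected behavior), or every candidate line when it does not
    (truthful behavior).
    """
    arms = MERGED_TAILS[address]
    if last_branch in arms:
        # History recorded by the debugger identifies the arm taken.
        return arms[last_branch]
    # No usable history: report the ambiguity rather than guess.
    return sorted(arms.values())
```

With history, source_line(0x40, "else") gives a single line; without it,
source_line(0x40, None) returns both candidates and leaves the deduction
to the user.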

Jeff Clark	Honeywell Systems and Research Center	Minneapolis, MN
inet: jclark@src.honeywell.com		tel: 612-782-7347
uucp: jclark@srcsip.UUCP		fax: 612-782-7438
[A similar message from johnson@cs.uiuc.edu pointed out that the PARC
report is preferable to other sources because it contains graphics
missing from other versions. -John]