[comp.lang.c] What I expect from a C optimizer

stuart@bms-at.UUCP (Stuart Gathman) (01/12/89)

1)	Jump optimization - easy for machines, hard for humans and machine
	dependent to boot, almost any decent compiler does it.  This 
	includes choosing optimal 'switch' strategies.

2)	Common subexpression elimination.  This includes pointer expressions
	and should apply to an entire function, not just one statement.

	All variables used should be divided into two classes, 'alias'
	and 'noalias', by one of three methods (in increasing order of expense):

	a) 'register' variables are noalias.
	b) 'register' variables and 'auto' variables whose address is 
		never taken within the function are noalias.
	c) 'register' variables, 'auto' variables whose address is never
		taken within the function, and 'static' variables whose
		address is never taken within the module are noalias.

	For structured variables, taking the address of any member at any
	level cancels noalias status for the entire structure.

	(One can see why compiler makers wanted the 'noalias' keyword.
	 One can also see why it would be dangerous.)

	Subexpression trees are not considered identical when a member
	of the tree is 'alias' and there is a function call or modify
	indirect operation seperating the references.    When expression
	rearrangement allows, the offending operation should be moved.

	Of course, modification of any variable member of the tree also
	renders them non-identical.  Some compilers may be able to detect 
	identities (e.g. *p++, *--p), but this is not required.  The
	key thing is that a->b->c->d used twice will not follow the pointer
	chain each time.

	CSE elimination should not be confined to values in registers.
	The compiler should allocate and compute temporaries where
	required.

3)	Constant reduction.  This is specified as a requirement in
	K&R, but many compilers don't do it!  Gcc does it in the
	parse phase.  In particular this includes:

	if (0) { ... }

4)	Register optimization: let CSE determine register usage.
	There is no reason to assign a fixed register to a variable.
	Registers that are never flushed do not need memory allocated.
	Register flushes never followed by a reload should be eliminated.

	Flush variables declared 'register' as a last resort.
	Flushing strategy can range from a simple LRU over sequential
	program flow to attempts to guess frequency of usage.  The
	latter should still give 'register' priority because the
	compiler doesn't know usage context (apart from profiler
	feedback used by some systems - but it is still unknown the
	first time it is compiled, and the profile results from every
	potential application would have to be combined to do library
	routines properly).

	LRU over program flow with 'register' priority would suit me
	just fine.

5)	Loop unrolling: this is OK as long as code size is not increased
	significantly.  If I want speed over size, I'd rather unroll it myself.

GCC does the important things in several steps:

	1) constant reduction is done while parsing.

		output: tree representation of program with all variables
			as memory references.  Gcc calls this form 'C-tree'

	2) translate to program using infinite registers.  Do CSE and
	   assign 1 register to each unique sub expression.  GCC calls
	   this form 'RTL'.

	3) assign virtual to real registers with appropriate flushes and
	   reloads from memory.  This is still in RTL form.

	4) translate to actual machine instructions with peephole
	   optimizations.  This is programmable using a machine 
	   description in Gcc, the other steps are portable.

	5) 4 can be done before 3

Apparently, CSE is the hardest part since very few compilers do a decent
job of it.
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|daitc}!bms-at!stuart>