donn@utah-cs.UUCP (Donn Seeley) (10/18/84)
Hacking my merry way through the 4.2 BSD C compiler recently, I discovered some peculiarities in the sucomp() routine in lib/pcc/order.c. This routine estimates the register requirements (the 'Sethi-Ullman' number) of an expression, the idea being that you can predict ahead of time which subexpressions will have to be stored away on the stack and which can be computed with the available 'temporary' registers. (On the VAX, the PCC's temporary registers are r0-r5.)

One useful thing that sucomp() does is to check the forms of leaf nodes to see whether they can avoid using any temporary registers at all. A leaf node gets a zero SU number if it is not already a temporary register and it is of int, unsigned or double type (ignoring some minor complications with a special node type that is used for things like stack variables). The restriction on arithmetic types seems reasonable -- integral types smaller than int must be widened, and the same applies to floats, because of the rules of type conversion in C expressions. My question is, why aren't pointer types allowed the same optimization? The obvious answer (they need space to be dereferenced) doesn't really work, since the calculation for the dereferencing operator already takes this into account. The next argument might be that when the pointer is dereferenced into one or more temporary registers, the SU number for the dereference is still the size of the pointed-to type, so the optimization buys us nothing; but that fails to explain why pointer expressions which DON'T involve dereferencing have to suffer... Does anyone have a good reason why something would break if pointers were added to the list of types which don't require temporary registers? A side issue: since f77 doesn't promote floats, should f77 put float in the list too?

Another oddity in sucomp() involves commutative operators. There is an optimization which swaps the operands of a commutative operator if the right operand is 'tougher' (requires more temporary registers) than the left. The idea is that the left operand is evaluated first and its value is held in a temporary register (if necessary), possibly reducing the number of temporary registers available for computing the right operand. Thus there are situations where the right operand could be computed without a store to memory if only the result of the left operand weren't tying up a needed temporary register; swapping the operands solves the problem. (It would be nice if sucomp() could count on the lower levels evaluating the tougher operand first, regardless of whether the operation is commutative... I suppose some difficulty I haven't noticed yet interferes with this. Suggestions?) Anyway, the list of commutative operators is 'plus', bitwise inclusive 'or' and bitwise exclusive 'or'. I can sort of see why bitwise 'and' might not be included, if there are special optimizations that require a particular order for the operands of 'and'. But why isn't 'multiply' in the list? I have hacked the f77 code generator to treat 'multiply' as a commutative operator for the purpose of this optimization, and as far as I can tell the change hasn't broken anything (and it has had noticeable beneficial effects). Can anybody give a good reason why 'multiply' shouldn't be in the list of commutative operators?

This stuff hasn't exactly been keeping me awake nights, but it would be nice to know whether my surmises are correct... To make the two proposals concrete, rough sketches of both follow.
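Here is a toy version of the leaf check, with the pointer case added the way I would like to see it. This is NOT the real sucomp() from order.c; the node layout and the type codes are my own inventions, purely for illustration, but they show where pointers would slot in:

    /*
     * NOT the real sucomp() from order.c -- just a toy illustration of
     * the leaf check described above.  The node layout and the type
     * codes are made up.
     */
    #include <stdio.h>

    #define T_CHAR     0
    #define T_SHORT    1
    #define T_INT      2
    #define T_UNSIGNED 3
    #define T_FLOAT    4
    #define T_DOUBLE   5
    #define T_PTR      6        /* the type I'd like to add to the list */

    struct leaf {
        int type;               /* one of the T_ codes above */
        int in_temp;            /* already sitting in one of r0-r5? */
    };

    /* Estimated temporary-register cost of evaluating a leaf. */
    int
    su_leaf(struct leaf *p)
    {
        if (p->in_temp)
            return 1;           /* it already occupies a temporary */
        switch (p->type) {
        case T_INT:
        case T_UNSIGNED:
        case T_DOUBLE:
            return 0;           /* usable where it sits, no widening needed */
        case T_PTR:
            return 0;           /* proposed: pointers need no widening either */
        default:
            return 1;           /* char/short/float must be converted first */
        }
    }

    int
    main(void)
    {
        struct leaf p;

        p.type = T_PTR;
        p.in_temp = 0;
        printf("SU of a pointer leaf: %d\n", su_leaf(&p));
        return 0;
    }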
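And here is the operand swap for commutative operators, with 'multiply' added the way I added it in the f77 code generator. Again, the operator codes and the tree node are inventions of mine, not the real order.c data structures:

    /*
     * Not the real order.c code: a toy version of the operand swap for
     * commutative operators, with MUL added the way I added it in the
     * f77 code generator.  Operator codes and the tree node are made up.
     */
    #include <stdio.h>

    #define PLUS 1
    #define OR   2              /* bitwise inclusive or */
    #define ER   3              /* bitwise exclusive or */
    #define MUL  4              /* the operator I think belongs here too */

    struct tnode {
        int op;
        int su;                 /* Sethi-Ullman estimate for this subtree */
        struct tnode *left, *right;
    };

    int
    commutative(int op)
    {
        return op == PLUS || op == OR || op == ER || op == MUL;
    }

    /*
     * If the right subtree is tougher than the left, swap them, so the
     * cheap operand isn't computed first and left sitting in a temporary
     * register while the expensive one fights over what remains.
     */
    void
    maybe_swap(struct tnode *p)
    {
        struct tnode *t;

        if (commutative(p->op) && p->right->su > p->left->su) {
            t = p->left;
            p->left = p->right;
            p->right = t;
        }
    }

    int
    main(void)
    {
        struct tnode cheap, tough, prod;

        cheap.su = 0;           /* a simple leaf */
        tough.su = 3;           /* an expensive subexpression */
        prod.op = MUL;
        prod.left = &cheap;
        prod.right = &tough;
        maybe_swap(&prod);
        printf("left su = %d, right su = %d\n",
            prod.left->su, prod.right->su);
        return 0;
    }

Nothing in the swap itself seems to care which commutative operator is involved, which is why the omission of 'multiply' puzzles me.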
If no one can rebut me, I may just go ahead and try them out and see what breaks.

Donn Seeley    University of Utah CS Dept    donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W    (801) 581-5668    decvax!utah-cs!donn