[net.lang.pascal] UCB PC Optimizations

grunwald@uiucdcsb.UUCP (10/28/84)

I'm trying to understand the UCB PC compiler a little more, particularly with
an eye to improving the quality of some of the code produced.

Most of these are issues regarding inefficiencies I've noticed when looking at
the code produced for the 68000.

My first point concerns constants. When emitting a small constant, say "5",
to an integer, you'll often see the code sequence:

	movql	#5,d0
	extw	d0
	extl	d0

The compiler says "Ah, 5 fits in one byte, so emit a P2ICON 5, P2CHAR, and then
do an SCONV to promote it to an integer. This is really a problem with the
68000 (I was using Sun) /lib/f1 phase -- Why doesn't it catch the fact that
the SCONV is not needed in this case? This seems to be fairly frequent.
   I'm not familar with the /lib/f1 phase, but it seems that a little work on
this would cut out a lot of useless code in Pascal programs.

The next thing I noticed was in for loops in the Pascal compiler. They shadow
the iteration variable, hopefully in a register. This works O.K., but I've
noticed two things which seem to be very easy to improve:

(1) The shadow variable has the same type as the iteration variable. The
    increment is done by:

	Emit RV of shadow
	       Emit RV of shadow
	       Convert shadow to integer
	       Emit 1, Integer
	   Plus
	   Convert to shadow-type
	Assign

    If the shadow type is "integer", this produces:
		movl	d7,d0
		addql	#1,d0
		movl	d0,d7
    The "movl" stuff is caused by the type conversions. 

(2) When the loop iteration variable is anything but an integer, it never
    is really put into a register, which is the intent of shadowing in the
    first place. This might be considered a fault of the temporary allocator,
    but I think there are some sound reasons for not putting shorts and chars
    in registers.

Now, since the we know that the iteration variable is within the range of the
lower and upper bounds (a call to RANG is done if -C was specified), we can
actually use any kind of integer which is at least as large as the iteration
variable. So, if we set the shadow variable to be an integer, it gets cast
to a register and we get better code. This is typically very true if you use
sub-ranges for indicies (1..10, etc) or characters to index an array.

An example of the improvements is shown: This is from the following program:

var
	k : 0..10;
	s : array[0..10] of integer;
begin
    for k := 0 to 10 do s[k] := 0;

Ignoring the initilization, etc, which doesn't change much, the loop body is:

L6:				    L6:
	movb	a6@(-77),_k		    movb    d7,d0
	movb	a6@(-77),d0		    movb    d0,_k
	extw	d0			    movl    d7,d0
	extl	d0			    lsll    #2,d0
	lsll	#2,d0			    movl    d0,a0
	movl	d0,a0			    addl    #_s,a0
	addl	#_s,a0			    moveq   #0,d0
	moveq	#0,d0			    extw    d0
	extw	d0			    extl    d0
	extl	d0			    movl    d0,a0@
	movl	d0,a0@			    cmpl    a6@(-82),d7
	movb	a6@(-77),d0		    jge	    L5
	extw	d0			    addql   #1,d7
	extl	d0			    jra	    L6
	cmpl	a6@(-76),d0
	jge	L5
	movb	a6@(-77),d0
	extw	d0
	extl	d0
	addql	#1,d0
	movb	d0,a6@(-77)
	jra	L6

As I said, a lot more data remains in the registers longer, and the actuall
code is smaller for this admittedly simple (but common) example.

These changes were seen by making the shadow variable be of type integer all
the time and changing the last increment to be a P2ASG P2PLUS instead of the
longer increment given above in (1). The fact that the 0 is extended is still
annoying in my book, but I think that it is really a problem for the code
generator.

If anyone sees a problem with these changes that I don't, please give a yell.
If you're interested in the code for the changes, I can assure you it is very
simple, and I can mail you a DIFF of the 4.2 copies of forop.c and mine.

Dirk Grunwald
{ihnp4, pur-ee} ! uiucdcs ! grunwald
grunwald@uiuc.CSNET

grunwald@uiucdcsb.UUCP (10/29/84)

Actually, I fibbed a bit -- you have to ensure that the for loop index is
type compatible with integers before making the shadow variable be a full
integer in a register. Otherwise, you'll get typing violations and whatnot.
But, that's a simple test to "compat", and the change to the increment still
holds.