[comp.lang.c] loop strength reduction

chris@mimsy.UUCP (Chris Torek) (05/23/89)

In article <1677@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>	for (i = 0; i < LEN; i++)
>		a[i] = 0;

>... on most architectures, this requires that the value in "i" be
>multiplied by "sizeof a[0]" before being added to the address
>represented by the address of "a[0]", and do a strength reduction on
>that multiplication; you then find the induction variable not used, and
>eliminate it, and by the time the smoke clears you have the loop in the
>first example generating the same code as the loop in the second
>example.  (I don't know whether there are any compilers that do this or
>not.)
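
For readers following along, the transformation described in the quoted text can be sketched in C roughly as below. This is only an illustration: LEN of 20 is assumed from the VAX listing that follows, the "second example" is assumed to be a pointer-stepping loop, and the names clear_indexed, clear_scaled, clear_pointer and nonzero_after are made up here, not taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 20                  /* assumed: 20 longwords, per the VAX listing */

int a[LEN];

/* The loop as posted: "i" is scaled by sizeof a[0] on every trip. */
void clear_indexed(void)
{
    int i;

    for (i = 0; i < LEN; i++)
        a[i] = 0;
}

/* The same loop with the hidden multiplication written out. */
void clear_scaled(void)
{
    int i;

    for (i = 0; i < LEN; i++)
        *(int *)((char *)a + i * sizeof a[0]) = 0;
}

/*
 * After strength reduction (the multiply becomes a running add on the
 * address) and removal of the now-unused induction variable "i".
 */
void clear_pointer(void)
{
    int *p;

    for (p = a; p < a + LEN; p++)
        *p = 0;
}

/* Hypothetical check: dirty the array, run one variant, count leftovers. */
int nonzero_after(void (*clear)(void))
{
    int i, n = 0;

    assert(clear != NULL);
    for (i = 0; i < LEN; i++)
        a[i] = i + 1;
    clear();
    for (i = 0; i < LEN; i++)
        if (a[i] != 0)
            n++;
    return n;
}
```

All three variants store the same zeros; only the address arithmetic per iteration differs.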

I fed this through gcc (1.35/vax) using -fstrength-reduce and it
produced the following equivalent:

		movl $19,r1
		addl3 $_a,$76,r0
	L4:
		clrl (r0)
		subl2 $4,r0
		decl r1
		jgeq L4
		...

(using `-mgnu' one gets a jsobgeq instead of decl+jgeq).  I am surprised
that, while it inverts the loop counter, it does not use the predecrement
addressing mode---I rather expected

		movl	$19,r1
		moval	_a+80,r0	# or movab; gcc uses movab elsewhere
	L4:	clrl	-(r0)
		jsobgeq	r1,L4

Or, alternatively, post-increment:

		movl	$19,r1
		moval	_a,r0
	L4:	clrl	(r0)+
		jsobgeq	r1,L4

Or (better but less likely) quadword operations:

		movl	$9,r1
		moval	_a,r0
	L4:	clrq	(r0)+
		jsobgeq	r1,L4
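
The loop counts in the three hand-written listings above can be checked against a rough C rendering. This is a sketch, not compiler output: it assumes a 20-element array of 4-byte ints (80 bytes, matching _a+80), and the function names are invented for illustration. A count of 19 with jsobgeq gives 20 trips because the branch is taken while the decremented counter is still >= 0.

```c
#include <assert.h>

#define LEN 20                  /* assumed: 20 longwords, 80 bytes */

int a[LEN];

/* Pre-decrement: pointer starts one past the end and walks down. */
void clear_predec(void)
{
    int n = LEN - 1;            /* movl $19,r1 */
    int *p = a + LEN;           /* moval _a+80,r0 */

    do
        *--p = 0;               /* clrl -(r0) */
    while (--n >= 0);           /* jsobgeq r1,L4 */
}

/* Post-increment: pointer starts at a[0] and walks up. */
void clear_postinc(void)
{
    int n = LEN - 1;            /* movl $19,r1 */
    int *p = a;                 /* moval _a,r0 */

    do
        *p++ = 0;               /* clrl (r0)+ */
    while (--n >= 0);           /* jsobgeq r1,L4 */
}

/* Quadword flavor: two longwords per trip, so half as many trips. */
void clear_pairs(void)
{
    int n = LEN / 2 - 1;        /* movl $9,r1 */
    int *p = a;                 /* moval _a,r0 */

    assert(LEN % 2 == 0);       /* pairing only works for an even count */
    do {
        p[0] = 0;               /* clrq (r0)+ clears both halves */
        p[1] = 0;
        p += 2;
    } while (--n >= 0);         /* jsobgeq r1,L4 */
}

/* Hypothetical check: dirty the array, run one variant, count leftovers. */
int nonzero_after(void (*clear)(void))
{
    int i, n = 0;

    for (i = 0; i < LEN; i++)
        a[i] = i + 1;
    clear();
    for (i = 0; i < LEN; i++)
        if (a[i] != 0)
            n++;
    return n;
}
```
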
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

guy@auspex.auspex.com (Guy Harris) (05/24/89)

>>(I don't know whether there are any compilers that do this or not.)
>
>I fed this through gcc (1.35/vax) using -fstrength-reduce and it
>produced the following equivalent:

...


>(using `-mgnu' one gets a jsobgeq instead of decl+jgeq). ...

What, not a "sobjgeq"?  Perhaps they decided it was unpronounceable.... :-)

...

>...I am surprised that, while it inverts the loop counter, it does not
>use the predecrement addressing mode---I rather expected

...

>Or, alternatively, post-increment:

Oh well.  Silly me; I should have just tried "cc -S -O4" on the Sun-3
here if I wanted to see if any compilers would do the optimizations in
question; when I did so on the "for (i = 0; ...)" loop, I got:

	...
	moveq	#0,d7		# i = 0;
	movl	#_a,a0		# a0 = &a[0];
	moveq	#0,d1
	addl	d1,a0		# a0 += 0;	/* say WHAT? */
	jra	LY00000
LY00001:
	clrl	a0@+		# *a0++ = 0;
	addql	#1,d7		# i++;
LY00000:
	moveq	#33,d1		# if (i < 33)
	cmpl	d1,d7
	jlt	LY00001		# keep looping

	...

which did use post-increment, but generated some superfluous goo (the
"a0 += 0" stuff).  It also didn't bother optimizing the loop test or
nuking "i" entirely, which it could get away with given that "i" is only
used as an array index in the body of the loop and isn't used at all
after the loop terminates.
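
As a rough aid to reading the Sun-3 listing, here is a line-by-line C transliteration. It is a sketch under assumptions: the bound 33 is simply the constant that appears in the listing, and the names clear_sun3_style and leftovers are hypothetical. The function returns the final "i" only so the check can look at it; in the original loop nothing reads "i" afterwards.

```c
#include <assert.h>

#define LEN 33                  /* the bound that appears in the listing */

int a[LEN];

/* Mirrors the listing: the pointer is stepped, but "i" still drives the test. */
int clear_sun3_style(void)
{
    int i = 0;                  /* moveq #0,d7 */
    int *p = a;                 /* movl #_a,a0 */

    p += 0;                     /* moveq #0,d1 ; addl d1,a0 */
    goto test;                  /* jra LY00000 */
loop:
    *p++ = 0;                   /* clrl a0@+ */
    i++;                        /* addql #1,d7 */
test:
    if (i < LEN)                /* moveq #33,d1 ; cmpl d1,d7 ; jlt LY00001 */
        goto loop;

    assert(p == a + LEN);       /* the pointer alone could end the loop */
    return i;
}

/* Hypothetical check: dirty the array, run the loop, count leftovers. */
int leftovers(void)
{
    int i, n = 0;

    for (i = 0; i < LEN; i++)
        a[i] = i + 1;
    (void)clear_sun3_style();
    for (i = 0; i < LEN; i++)
        if (a[i] != 0)
            n++;
    return n;
}
```

Since the pointer always finishes at a + LEN, comparing it against that address would let the compiler drop "i" and its increment altogether.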