[net.bugs.4bsd] [4bsd-f77 #41] F77 bombs if it tries to perform code motion on strings

4bsd-f77@utah-cs.UUCP (4.2 BSD f77 bug reports) (09/04/84)

From: Donn Seeley <donn@utah-cs.arpa>

Subject: F77 bombs if it tries to perform code motion on strings
Index:	usr.bin/f77/src/f77pass1/optloop.c 4.2BSD

Description:
	F77's loop optimizer tries to move CHARACTER expressions whose
	parameters don't change inside a loop out of the loop.
	Unfortunately it can't create variable-length temporaries to
	hold them, so the compiler bombs.  At the same time f77 misses
	an obvious optimization: recognizing substrings of the form
	'c(i:i)' as having constant length 1.  This bug was reported by
	Mike Brown at NOAO.

Repeat-By:
	Try to compile the following program (from Mike Brown) with
	the optimizer enabled:

	----------------------------------------------------------------
		character numstf*12, iarray*12
		iarray = ' '
		numstf = '0123456789+-'
	c
		ia = 1
		i = 1
		do 10 i = 1, 12
			if (iarray(i:i) .eq. numstf(ia:ia)) go to 20
	10	continue
	20	continue
		end
	----------------------------------------------------------------

	The compiler will complain:

	----------------------------------------------------------------
	brown/ch.f:
	   MAIN:
	Error on line 12 of brown/ch.f: adjustable length
	Termination code 138
	----------------------------------------------------------------

	A core dump of pass 1 will be left behind.

Fix:
	The problem is that the loop optimization routine finds a
	CHARACTER expression whose length is variable but doesn't
	change over the course of the loop, and tries to move the
	expression out of the loop.  Alas, the stack temporary
	allocator can't make temporaries of variable length, hence the
	error message.  The code that calls the allocator doesn't
	expect any problem, and this mistaken assumption causes the
	core dump a little later on.

	The straightforward thing to do is to insist that the compiler
	NOT try to move variable length strings out of loops.  This is
	a simple change to worthcost() in optloop.c:

	----------------------------------------------------------------
	*** /tmp/,RCSt1029138	Mon Aug 20 19:03:21 1984
	--- optloop.c	Sun Aug  5 17:05:29 1984
	***************
	*** 645,650
		return NO;
	  
	      case TADDR:
		if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
		return YES;
		else if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))

	--- 649,656 -----
		return NO;
	  
	      case TADDR:
	+       if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
	+ 	return NO;	/* Can't make variable length temporaries */
		if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
		return YES;
		else
	***************
	*** 646,653
	      case TADDR:
		if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
		return YES;
	-       else if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
	-	return YES;
		else
		return NO;

	--- 652,657 -----
		if ((vleng = p->addrblock.vleng) && ! ISCONST(vleng))
		return NO;	/* Can't make variable length temporaries */
		if ((memoffset = p->addrblock.memoffset) && ! ISCONST(memoffset))
		return YES;
		else
		return NO;
	----------------------------------------------------------------
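	For readers without the f77 sources at hand, the reordered test
	can be modelled in isolation.  This is only a sketch: 'struct
	expr', 'struct addr' and worth_moving() are hypothetical
	stand-ins invented for the example, not the compiler's own
	declarations, and an 'is_const' flag takes the place of the
	ISCONST() macro.

```c
#include <stddef.h>

#define YES 1
#define NO  0

/* Hypothetical stand-in for an expression node; is_const marks a
 * compile-time constant (the real compiler uses ISCONST()). */
struct expr {
    int is_const;
};

/* Hypothetical stand-in for the address block seen in the TADDR case. */
struct addr {
    struct expr *memoffset;   /* NULL if there is no offset expression */
    struct expr *vleng;       /* NULL if there is no length expression */
};

/* Same order of tests as the patched code: reject a variable length
 * first, then accept a non-constant offset, otherwise decline. */
int worth_moving(const struct addr *p)
{
    if (p->vleng && !p->vleng->is_const)
        return NO;      /* can't make variable length temporaries */
    if (p->memoffset && !p->memoffset->is_const)
        return YES;
    return NO;
}
```

	The point of the ordering is that a variable length vetoes the
	move even when the offset alone would have made it look
	worthwhile.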

	This solves the problem of the compiler crashing, but look at
	the code that now gets generated for the loop:

	----------------------------------------------------------------
		movl	$1,{ia}
		movl	$1,{i}
		movl	{ia},r10
		movl	{i},r9
		movl	$1,r9
	L17:
		subl3	$1,r10,r0
		subl3	r0,r10,-(sp)
		subl3	$1,r9,r0
		subl3	r0,r9,-(sp)
		movab	{numstf}+-1,r0
		addl3	r10,r0,-(sp)
		movab	{iarray}+-1,r0
		addl3	r9,r0,-(sp)
		calls	$4,_s_cmp
		tstl	r0
		jeql	L2000000
		acbl	$12,$1,r9,L17
	L2000000:
		movl	r9,{i}
		ret
	----------------------------------------------------------------

	The function s_cmp() is being called on every trip through the
	loop to compare a single pair of bytes.  Notice that the length
	of each substring is being computed explicitly with an
	expression like 'i - (i-1)'.  Now
	we all know that strings in Fortran are ridiculous, but this
	takes the cake.  The compiler knows that it can use simple byte
	instructions on strings of length one; why can't it figure out
	that 'iarray(i:i)' is a substring of length 1 and make an
	efficient loop with byte instructions?
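	To see why nothing is lost by using byte instructions, here is
	a sketch of a blank-padded comparison next to a plain byte
	test.  padded_cmp() is a simplified stand-in written for this
	example, assuming only that Fortran character comparison pads
	the shorter operand with blanks; it is not the library source
	for s_cmp().  byte_eq() is likewise a made-up name.

```c
#include <stddef.h>

/* Blank-padded comparison of a[0..la-1] with b[0..lb-1].
 * Returns a negative, zero or positive value.  Simplified stand-in,
 * not the run-time library's own routine. */
int padded_cmp(const char *a, const char *b, size_t la, size_t lb)
{
    size_t n = la > lb ? la : lb;
    size_t k;

    for (k = 0; k < n; k++) {
        unsigned char ca = k < la ? (unsigned char)a[k] : ' ';
        unsigned char cb = k < lb ? (unsigned char)b[k] : ' ';
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    return 0;
}

/* When both lengths are known to be 1, the .eq. test needs no call,
 * no lengths and no padding: it is a single byte compare. */
int byte_eq(const char *a, const char *b)
{
    return *a == *b;
}
```

	For operands of length 1, 'padded_cmp(a, b, 1, 1) == 0' and
	'byte_eq(a, b)' always agree, which is what lets the compiler
	drop the call once it knows the lengths.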

	All it takes is a little test in the function mklhs() in expr.c
	to notice that the two operands of a substring operation are
	identical variables.  (It would be nice if we could notice
	identical expressions, but that's considerably more
	complicated.) Here is the change:

	----------------------------------------------------------------
	*** /tmp/,RCSt1029315	Mon Aug 20 19:33:34 1984
	--- expr.c	Sun Aug  5 23:06:34 1984
	***************
	*** 1122,1129
			if(p->lcharp == NULL)
				p->lcharp = (expptr) cpexpr(s->vleng);
			if(p->fcharp)
	! 			s->vleng = mkexpr(OPMINUS, p->lcharp,
	! 				mkexpr(OPMINUS, p->fcharp, ICON(1) ));
			else	{
				frexpr(s->vleng);
				s->vleng = p->lcharp;

	--- 1137,1151 -----
			if(p->lcharp == NULL)
				p->lcharp = (expptr) cpexpr(s->vleng);
			if(p->fcharp)
	! 			{
	! 			if(p->fcharp->tag == TPRIM && p->lcharp->tag == TPRIM
	! 			&& p->fcharp->primblock.namep == p->lcharp->primblock.namep)
	! 				/* A trivial optimization -- upper == lower */
	! 				s->vleng = ICON(1);
	! 			else
	! 				s->vleng = mkexpr(OPMINUS, p->lcharp,
	! 					mkexpr(OPMINUS, p->fcharp, ICON(1) ));
	! 			}
			else	{
				frexpr(s->vleng);
				s->vleng = p->lcharp;
	----------------------------------------------------------------
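	The shape of the new test can be tried out on its own.  The
	following is a sketch under stated assumptions: 'struct bound',
	length_is_one() and substring_length() are hypothetical names
	made up for the example, and the run-time 'value' field stands
	in for the expression tree that the compiler would build with
	mkexpr().

```c
#include <stddef.h>

enum kind { NAME, OTHER };

/* Hypothetical stand-in for one bound of a substring reference. */
struct bound {
    enum kind kind;       /* NAME for a plain variable reference */
    const void *namep;    /* symbol table entry when kind == NAME */
    long value;           /* run-time value, used only by the fallback */
};

/* True when both bounds are the very same variable, so the length is
 * known to be 1 without evaluating anything. */
int length_is_one(const struct bound *first, const struct bound *last)
{
    return first->kind == NAME && last->kind == NAME
        && first->namep == last->namep;
}

long substring_length(const struct bound *first, const struct bound *last)
{
    if (length_is_one(first, last))
        return 1;                               /* upper == lower */
    return last->value - (first->value - 1);    /* general case */
}
```

	As in the patch, only identical variables are recognized; two
	bounds that are equal expressions but not simple names still
	take the general path.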

	Now the code for the loop becomes:

	----------------------------------------------------------------
		movl	$1,{ia}
		movl	$1,{i}
		movab	{numstf}+-1,r0
		movl	{ia},r1
		movb	(r0)[r1],-1(fp)
		movl	{i},r10
		movl	$1,r10
	L17:
		cmpb	{iarray}+-1[r10],-1(fp)
		jeql	L2000000
		aobleq	$12,r10,L17
	L2000000:
		movl	r10,{i}
		ret
	----------------------------------------------------------------

	Since the string length is now a constant, the compiler can
	also perform the code motion that the first fix above would
	otherwise have prevented: the byte numstf(ia:ia) is loaded
	into a temporary once, before the loop.  The loop is now nice
	and compact, free of expensive subroutine calls.
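
	For those who don't read VAX assembler, the final loop
	corresponds roughly to the C below.  This is a hand-written
	sketch, not compiler output; first_match() and its parameter
	names are made up for the example.

```c
/* Returns the 1-based index of the first byte of s[0..n-1] equal to
 * key[ia-1], or n + 1 if there is none (the value the DO variable
 * holds when the loop runs to completion). */
int first_match(const char *s, int n, const char *key, int ia)
{
    char t = key[ia - 1];   /* hoisted: loaded once, before the loop */
    int i;

    for (i = 1; i <= n; i++)
        if (s[i - 1] == t)  /* one byte compare per trip */
            break;
    return i;
}
```

	With the test program's data (a blank 'iarray' and 'ia' equal
	to 1) no byte matches '0', so the loop runs all 12 trips and
	the function returns 13.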

Donn Seeley    University of Utah CS Dept    donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W    (801) 581-5668    decvax!utah-cs!donn