ado@elsie.UUCP (02/12/84)
Subject: 4.?bsd C compiler error (better fix) Index: .../pcc/local2.c in 4.?BSD Description: The C compiler generates incorrect code in some cases. A fix posted earlier fixed the problem while degrading code in some cases; this fix avoids code degradation. Repeat-By: Use the command cc -S test.c where test.c contains: test() { register struct { short i; short j; } * sp; while ((sp++)->i != 0); } and then look at the "test.s" file produced. You'll see that the "while" loop generates this code: L16: tstw (r11)+ jeql L17 jbr L16 which only increments r11 by two (rather than four) each time through. Fix: This is the fix to the version distributed by Berkeley. ed - .../pcc/local2.c /ISPTR(p->in.left->in.type)/c #ifdef OLDVERSION if ( ISPTR(p->in.left->in.type) ) { #else /* ** We want to look for a pointer to a pointer, rather than a pointer. */ if ( ISPTR(p->in.left->in.type) && ISPTR(DECREF(p->in.left->in.type))) { #endif OLDVERSION . w q -- UUCP: decvax!harpo!seismo!rlgvax!cvl!elsie!ado DDD: (301) 496-5688
chris@umcp-cs.UUCP (08/25/84)
It's obviously a compiler bug. More interesting is the code that comes out of the (4.2BSD) compiler: % /lib/ccom int one = (unsigned) 2 >> 1; [compiler output] .data .align 2 .globl _one _one: movl $1,r0 movl r0,-4(fp) movl -4(fp),r0 movl r0,-8(fp) movl -8(fp),r0 . . . It only happens with the typecasts; ``int one = 2 >> 1;'' works. ``int one = (char) 2 >> 1;'' generates a similar endless stream, but it begins differently: .data .align 2 .globl _one _one: movl $2,r0 movl r0,-4(fp) mnegl $1,r0 movl r0,-8(fp) movl -8(fp),r0 movl r0,-12(fp) . . . Obviously what's happening is this: during expression tree optimization, the compiler is trying to compute the value at RUNtime. However, as there are no free registers, it keeps sticking values into r0, finding out it can't get another free register, pushing r0 for safekeeping, getting the next number into r0, finding out it doesn't have a free register for the ``ashl'' operation, . . . . Things work if the operation is done inside {}s because there are some registers free. The bug is probably simply that the compiler is missing optimizations for expressions involving types other than `int' (and `long') for the ">>" and "<<" operators when both LHS and RHS are integer constants. (Both << and >> generate the endless stream of junk). (I wonder why the register allocation code doesn't scream loudly when asked for registers while code generation is off?) -- In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690 UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@maryland
chris@umcp-cs.UUCP (Chris Torek) (08/25/84)
Index: lib/pcc 4.2BSD
Description:
The C compiler aborts with a compiler error if function arguments
are declared `static'.
(I happened to notice that the code which does arguments doesn't
check for `static's among its declared arguments; probably that
is all that is required to `fix' it.)
Repeat-By:
Compile this:
foo (i)
static i;
{
}
donn@utah-cs.UUCP (08/25/84)
To recap: The following program blows up the 4.2 (== System III?) C compiler: ------------------------------------------------------------------------ int one = (unsigned) 2 >> 1; ------------------------------------------------------------------------ The precise message you get is: ------------------------------------------------------------------------ "one.c", line 1: compiler error: expression causes compiler loop: try simplifying ------------------------------------------------------------------------ Chris Torek's observation is close to the mark (if not on it), but perhaps it's not as clear as it could be. The problem is that the initializer is not being reduced to a constant before the code generator receives it. If the code generator gets an expression, it goes ahead and generates the instructions needed to compute it; apparently it doesn't mind if it ends up generating instructions into data space, nor does it seem to care if any of the startup code for processing text has been run. The code generator can handle constant initializers (with a fake opcode of INIT) and that's it. More directly, what's happening is that when certain casts are transmuted into conversions, the compiler neglects to canonicalize the resulting trees. (Some casts get a second chance -- try substituting '+' for '<<' in the example and see what happens.) It's a fairly simple matter to ensure that makety(), the routine which does the transmutations, always canonicalizes the trees it returns. Here are the changes to /usr/src/lib/mip/trees.c: ------------------------------------------------------------------------ *** /tmp/,RCSt1008238 Sat Aug 25 06:30:16 1984 --- trees.c Sat Aug 25 04:37:26 1984 *************** *** 1061,1067 if( t & TMASK ){ /* non-simple type */ ! return( block( PCONV, p, NIL, t, d, s ) ); } if( p->in.op == ICON ){ --- 1061,1067 ----- if( t & TMASK ){ /* non-simple type */ ! return( clocal( block( PCONV, p, NIL, t, d, s ) ) ); } if( p->in.op == ICON ){ *************** *** 1079,1085 } } ! return( block( SCONV, p, NIL, t, d, s ) ); } --- 1079,1085 ----- } } ! return( clocal( block( SCONV, p, NIL, t, d, s ) ) ); } ------------------------------------------------------------------------ I'm not sure that the first change is strictly necessary, but better safe than sorry. At any rate the sample program now compiles correctly. I don't think these changes have any nasty side effects -- I recompiled the compiler with the changes installed and it produced exactly the same binary as the old compiler did, for what it's worth. If the System V C compiler fixes this, I don't want to hear about it, Donn Seeley University of Utah CS Dept donn@utah-cs.arpa 40 46' 6"N 111 50' 34"W (801) 581-5668 decvax!utah-cs!donn
chris@umcp-cs.UUCP (Chris Torek) (08/26/84)
As long as we have people fixing bugs, how about someone patching up the pcc/ code that converts RS into LS and negates the shift? It fixes up constants but it doesn't remove double negations. Example: foo () { register int i,j,k; k = i >> -j; } produces (prettified) .align 1 .globl _foo _foo: .word 0xe000 mnegl r10,r0 mnegl r0,r0 ashl r0,r11,r0 movl r0,r9 ret What's the idea of the double mnegl? I got disgusted enough with this (and other similar but tougher-to-optimize) stuff in the output stage of my Versatec TeX filter that I stuck in asm()s (under #ifdef vax though). -- In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690 UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@maryland
root@ur-valhalla.UUCP (08/27/84)
Well, I've installed Donn Seeley's fix for the C compiler screwing up on
initializing shifted, unsigned variables, and found out a nasty side effect:
The (unsigned) cast is IGNORED! This is my program:
main()
{ int big_number = ((~((unsigned) 0)) >> 1);
printf("this is a big number: %d\n",big_number);
}
Before installing the fix the program produced this output:
this is a big number: 2147483647
(To remind the original problem: should big_number be declared as static or
as a global variable, C compiler would blow up and go into a loop).
After installing the fix, I got the following output:
this is a big number: -1
Sigh....
--
Krzysztof Kozminski (truly sorry for not being able to fix the problem myself)
{seismo,allegra,decvax}!rochester!ur-valhalla!kris