dca@toylnd.UUCP (David C. Albrecht) (06/18/88)
Oooh, I'm annoyed. For those of you that use Lattice 4.1 I have found what I
consider a big BUG. Not the 'code doesn't work' variety but rather the
'produces lousy code' type. Over the last year or so I have developed the
instinct of localizing register variables and auto variables to their area of
usage, to give maximum efficiency in register allocation and use of stack
space. Come to find out this can produce worse code than not using register
declarations at all. Gack! Ack! Phooey!

Example: If I was to loop through an array assigning the elements to a value
I might have:

    {
        register short i;

        for (i = 0; i < MAX_SIZE; i++) {
            some_array[i] = -1;
        }
    }

One expects the register variable i to be dead after the last brace
and thus available for re-use, right? Wrongo! Lattice doesn't
think so.

Or how 'bout this:

    {
        short i[100];
        i[0] = 12;
    }

You would expect the space for i on the stack to be freed after the last
brace for re-use. Nope.

Want proof? Take a look at the following code section:

    main(argc,argv)
    int argc;
    char **argv;
    {
        {
            register long r1, r2, r3, r4, r5;
            long a[1];

            r1 = 0;
            a[r1] = 1;
            r2 = 2;
            r3 = 3;
            r4 = 4;
            r5 = r4 + r3;
            a[r1] = r1 + r2;
        }
        {
            register long r1, r2, r3, r4;
            long a[1];

            r1 = 0;
            a[r1] = r1 + r1;
            r2 = 2;
            r3 = 3;
            r4 = a[r1];
        }
    }

Now let's look at the omd:

Some notes: Lattice seems to reserve D0-D3 for its own use and allocates
starting with D0. It allocates D4-D7 to register variables and starts with D7.
LATTICE OBJECT MODULE DISASSEMBLER V2.00

Amiga Object File Loader V1.00
68000 Instruction Set

EXTERNAL DEFINITIONS

_main 0000-00

SECTION 00 "testcase.o" 00000054 BYTES

       main(argc,argv)
       int argc;
       char **argv;
       {
0000 4E55FFE8       LINK      A5,FFE8
0004 48E73F00       MOVEM.L   D2-D7,-(A7)
       {
       register long r1, r2, r3, r4, r5;
       long a[1];
       r1 = 0;
0008 7E00           MOVEQ     #00,D7                 First register alloc
       a[r1] = 1;
000A 2007           MOVE.L    D7,D0                  First lattice reg
000C 2200           MOVE.L    D0,D1                  Second lattice reg
000E E541           ASL.W     #2,D1
0010 7401           MOVEQ     #01,D2                 Third lattice reg
0012 2B8210E8       MOVE.L    D2,E8(A5,D1.W)         Note a[] is at E8.
       r2 = 2;
0016 7C02           MOVEQ     #02,D6                 Second register alloc
       r3 = 3;
0018 7A03           MOVEQ     #03,D5                 Third register alloc
       r4 = 4;
001A 7804           MOVEQ     #04,D4                 Last register alloc
       r5 = r4 + r3;
001C 2404           MOVE.L    D4,D2                  Reuse lattice reg
001E 2404           MOVE.L    D4,D2                  ?
0020 D485           ADD.L     D5,D2
       a[r1] = r1 + r2;
0022 2607           MOVE.L    D7,D3                  Last lattice reg
0024 2600           MOVE.L    D0,D3
0026 D686           ADD.L     D6,D3
0028 2B8310E8       MOVE.L    D3,E8(A5,D1.W)
       }
       {
       register long r1, r2, r3, r4;
       long a[1];
       r1 = 0;                                       Note that it put r1 in
002C 7200           MOVEQ     #00,D1                 a lattice reg not a user reg.
       a[r1] = r1 + r1;
002E 2601           MOVE.L    D1,D3
0030 E543           ASL.W     #2,D3
0032 2001           MOVE.L    D1,D0
0034 D081           ADD.L     D1,D0
0036 2B8030EC       MOVE.L    D0,EC(A5,D3.W)         Note a[] is at EC.
       r2 = 2;                                       Out of lattice regs
003A 7002           MOVEQ     #02,D0                 Saves r2 on the stack
003C 2B40FFF8       MOVE.L    D0,FFF8(A5)            gag!
       r3 = 3;                                       Ditto!
0040 7003           MOVEQ     #03,D0
0042 2B40FFF4       MOVE.L    D0,FFF4(A5)
       r4 = a[r1];                                   Ditto again.
0046 2B7530ECFFF0   MOVE.L    EC(A5,D3.W),FFF0(A5)
       }
       }
004C 4CDF00FC       MOVEM.L   (A7)+,D2-D7
0050 4E5D           UNLK      A5
0052 4E75           RTS

SECTION 01 "__MERGED" 00000000 BYTES

Moral of the story is check your register variables they may not be producing
the code you expect. We are not amused. Time to go find the LBBS number.
Growl, snarl, snap.

David Albrecht
dillon@CORY.BERKELEY.EDU (Matt Dillon) (06/22/88)
: One expects the register variable i to be dead after the last brace
: and thus available for re-use, right? Wrongo! Lattice doesn't
: think so.

    Damn right it should. I do the same sort of thing... use localized
register variables.

    One thing I have yet to see addressed properly by either Aztec or
Lattice is the following:

    {
        register short i, j, k;

        for (i = 0; i < 10; ++i)
            <blah>
        for (j = 0; j < 10; ++j)
            <blah>
        for (k = 0; k < 10; ++k)
            <blah>
    }

    I is not used while J is being used, neither I or J are being
    used while K is being used, etc....

    ONLY ONE REGISTER SHOULD BE USED FOR ALL THREE REGISTER VARIABLES!!

    Often, I have all sorts of temporary variables of differing types
(usually differing pointer types), and even though I use them sequentially
(where the same register could have been used), the compiler always assigns
different registers to them. Allowing multiple register variables to 'share'
registers under the above circumstances greatly increases register
utilization.

:Or how bout this:
:
: { short i[100];
:   i[0] = 12;
: }
:
: You would expect the space for i on the stack to be freed after the last
: brace for re-use. Nope.

    Actually, no. Usually, stack space is overlayed:

    {
        {
            char x[256];
            <blah>
        }
        {
            char y[256];
            <blah>
        }
    }

    I.e. all stack is allocated at entry. In this case, 256 bytes should
be allocated because x and y do not mix. Think about the efficiency this
gives you. You have a tight loop:

    for (i = 0; i < 1000; ++i) {
        short x = i << 2;
        <blah>
    }

    You do NOT want stack space to be allocated and deallocated for every
loop!!!! That's right, the variable x DIES on every loop by semantics.

                    -Matt
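The overlay arithmetic described above can be sketched as a small compile-time calculation. This is only an illustration under assumptions: `struct scope`, `frame_overlaid` and `frame_summed` are hypothetical names invented here, not anything taken from Lattice, Aztec, or any other compiler in the thread. Each scope carries the bytes of locals declared directly in it plus its nested blocks; with overlaying, sibling blocks share space, so a frame needs its own locals plus only its largest child.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scope record: bytes of locals declared directly in this
   block, plus the blocks nested immediately inside it. */
struct scope {
    size_t local_bytes;
    const struct scope *children;
    size_t nchildren;
};

/* Frame size when sibling blocks are overlaid: own locals plus the
   largest child, because only one child block is active at a time. */
size_t frame_overlaid(const struct scope *s)
{
    size_t worst = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < s->nchildren; i++) {
        size_t need = frame_overlaid(&s->children[i]);
        if (need > worst)
            worst = need;
    }
    return s->local_bytes + worst;
}

/* Frame size when every local in the routine gets its own slot. */
size_t frame_summed(const struct scope *s)
{
    size_t total;
    size_t i;

    assert(s != NULL);
    total = s->local_bytes;
    for (i = 0; i < s->nchildren; i++)
        total += frame_summed(&s->children[i]);
    return total;
}
```

For the x[256]/y[256] example, a body with two 256-byte child blocks gives 256 from frame_overlaid and 512 from frame_summed. Either way the space is reserved once at function entry, so nothing is adjusted per loop iteration.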
glewis@cit-vax.Caltech.Edu (Glenn M. Lewis) (06/23/88)
In article <8806212123.AA01328@cory.Berkeley.EDU> Matt Dillon writes:
>...
> One thing I have yet to see addressed properly by either Aztec or
>Lattice is the following:
>
> {
>     register short i, j, k;
>
>     for (i = 0; i < 10; ++i)
>         <blah>
>     for (j = 0; j < 10; ++j)
>         <blah>
>     for (k = 0; k < 10; ++k)
>         <blah>
> }
>
> I is not used while J is being used, neither I or J are being
> used while K is being used, etc....
>
> ONLY ONE REGISTER SHOULD BE USED FOR ALL THREE REGISTER VARIABLES!!

    Are you suggesting that the compiler ought to figure out that the
value of 'i' will not be used again in this function, and the same for 'j'
and 'k'? It seems that you are. That would be interesting. Often, when
manipulating strings, I have a register variable such as 'i' that runs
through the string in a for loop, and then I use the value after it, and then
continue in another for loop. But I don't believe that I have ever written
a routine that declared more than one register variable where that variable
couldn't be re-used, if the value was no longer needed. In other words, if I
had a situation like the one above, I would just say
    "register int i"
and just let 'i' handle all those loops. I don't see any need to allocate
two more variables to do the work that the first one could have done.

    I believe it would be dangerous to let the compiler re-use a variable
in the manner that you describe, especially when using a debugger. If you
check the address and/or value of 'i' or 'j', they would be the same, and
you would think that a bug has been found.

    But yes, I agree that if the compiler were smart enough to look
through the entire routine before allowing re-use of a register variable,
it should work properly. I would just like to point out that it is
easy enough for the programmer to detect these situations, and use the
register variable over "manually".

                    -- Glenn
-- 
glewis@cit-vax.caltech.edu
scott@applix.UUCP (Scott Evernden) (06/23/88)
In article <8806212123.AA01328@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
> One thing I have yet to see addressed properly by either Aztec or
>Lattice is the following:
>
>     register short i, j, k;
>
>     for (i = 0; i < 10; ++i)
>         <blah>
>     for (j = 0; j < 10; ++j)
>         <blah>
>     for (k = 0; k < 10; ++k)
>         <blah>
>...
> ONLY ONE REGISTER SHOULD BE USED FOR ALL THREE REGISTER VARIABLES!!

As much as I might agree with you, I have yet to encounter
a compiler that will do this. Even the best optimizing compilers
fare no better than Manx and Lattice in this area. Does anyone
know different??

BTW (and to fill some space here so this will pass rn), I noted
while studying PDC some years ago, that it actually would ignore
'register' declarations altogether. The dang thing actually
selected registers based on a computed usage of variables in
generated blocks. Incredible.

-scott
louie@trantor.umd.edu (Louis A. Mamakos) (06/23/88)
In article <728@applix.UUCP> scott@applix.UUCP (Scott Evernden) writes:
>
>As much as I might agree with you, I have yet to encounter
>a compiler that will do this. Even the best optimizing compilers
>fare no better than Manx and Lattice in this area. Does anyone
>know different??
>

I've used two compilers which do what you want. Given this test program:

main(argc, argv)
int argc;
char **argv;
{
    register int i, j, k;

    for(i = 0; i < 20; i++) {
        foo();
    }
    for(j = 0; j < 30; j++) {
        foo();
    }
    for(k = 0; k < 660; k++) {
        foo();
    }
}

The Greenhills 68000 C compiler produces this code:

        SECTION 9
        XDEF    main
main:
        MOVE.L  D2,-(SP)
        MOVE.L  8(SP),D0
        MOVE.L  12(SP),D0
        MOVEQ   #0,D2
.L10:
        JSR     foo
        ADDQ.L  #1,D2
        MOVEQ   #20,D0
        CMP.L   D2,D0
        BGT     .L10
        MOVEQ   #0,D2
.L7:
        JSR     foo
        ADDQ.L  #1,D2
        MOVEQ   #30,D0
        CMP.L   D2,D0
        BGT     .L7
        MOVEQ   #0,D2
.L4:
        JSR     foo
        ADDQ.L  #1,D2
        CMPI.L  #660,D2
        BLT     .L4
        MOVE.L  (SP)+,D2
        RTS
        SECTION 14
* allocations for main
*       D2      i
*       D2      j
*       D2      k
*       8(SP)   argc
*       12(SP)  argv
        SECTION 9
        SECTION 14
        XREF    foo
* allocations for module
        SECTION 9
        END

Note that it does, in fact, use D2 for all three register variables. Looking
at the output of the GNU C compiler (unfortunately, I only have the VAX
target around at the moment..) we see much the same thing:

#NO_APP
.text
        .align 1
.globl _main
_main:
        .word 0x40
        clrl r6
L4:
        calls $0,_foo
        incl r6
        cmpl r6,$20
        jlss L4
        clrl r6
L8:
        calls $0,_foo
        incl r6
        cmpl r6,$30
        jlss L8
        clrl r6
L12:
        calls $0,_foo
        incl r6
        cmpl r6,$660
        jlss L12
        ret

This uses r6 for all three register variables. I'd love to have a GNU C
compiler hosted on my Amiga, just need a few more meg of memory.

Louis A. Mamakos WA3YMH Internet: louie@TRANTOR.UMD.EDU
University of Maryland, Computer Science Center - Systems Programming
tom@garth.UUCP (Tom Granvold) (06/23/88)
The C compiler for the Intergraph Clipper is able to know the lifetime
of a variable and can reuse the same register. They call this 'register
allocation by coloring'. This is done in addition to many other
optimizations, and the variables do not need to be declared register in
order for the compiler to do this. Of course this is not of much help to
us Amiga owners.

Tom Granvold
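The idea of sharing a register between variables whose lifetimes do not overlap can be sketched in a few lines. This is not the Clipper compiler's algorithm (the thread gives no details of it) and it is not a full graph-coloring allocator; it is a simple greedy pass over live ranges, where the names `live_range`, `assign_registers` and `NREGS` and the use of statement indices as range bounds are all assumptions made for illustration.

```c
#include <assert.h>

#define NREGS 4   /* assumption: four allocatable registers, like D4-D7 */

/* Hypothetical live range: statement indices (>= 0) of the first and
   last point at which a variable's value is needed, inclusive. */
struct live_range {
    int start;
    int end;
};

/* Greedy assignment over ranges sorted by start. reg_out[i] receives a
   register number 0..NREGS-1, or -1 if variable i must live in memory. */
void assign_registers(const struct live_range *r, int n, int *reg_out)
{
    int busy_until[NREGS];
    int i, k;

    for (k = 0; k < NREGS; k++)
        busy_until[k] = -1;              /* every register starts free */

    for (i = 0; i < n; i++) {
        assert(r[i].start >= 0 && r[i].start <= r[i].end);
        assert(i == 0 || r[i - 1].start <= r[i].start);

        reg_out[i] = -1;
        for (k = 0; k < NREGS; k++) {
            /* A register is reusable once its previous owner is dead. */
            if (busy_until[k] < r[i].start) {
                busy_until[k] = r[i].end;
                reg_out[i] = k;
                break;
            }
        }
    }
}
```

With three loop counters live over statements 0-2, 3-5 and 6-8, all three land in register 0, which matches the single D2 and single r6 in the compiler listings posted earlier. A variable whose range spans the others keeps its own register.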
dillon@CORY.BERKELEY.EDU (Matt Dillon) (06/24/88)
>I thought (to an extent) that the idea behind C was that a sufficiently
>simple, machine-oriented language would make optimizers unnecessary,
>since you could specify HOW you wanted things done at the simplest level.

    No. It *allows* you to get down to the bare bones when you want,
but a programmer would do it only for very critical sections of code, as it
usually makes the code unreadable.

>For example, isn't it reasonable that a compiler should produce better
>code for B than A?
>
>A: *p = ~mask[column & 15] & *p         B: register int x;
>        | mask[column & 15] & value;       x = mask[column & 7];
>                                           *p = (~x & *p) | (x & value);
>

    See what I mean?

>Then again, why program in a high-level language at all? :-)

    Portability. For example, you might wonder why the very first
language available for the MC88000 is C (by Greenhills, in fact)? Because
with that, one can port just about any UNIX OS to it in less than a month,
the libraries in even less time. Once you've got that, suddenly thousands
of programs are available without having to be ported at all... simply
recompile and <poof>.

    Also, debuggability (is that a word?). In fact, one is less likely to
make an error coding in a high level language than coding in assembly,
assuming he knows the language of course.

>The point is, compilers shouldn't put their time into making ridiculous code
>reasonable; they should spend their time making reasonable code tight.

    It depends what your definition of reasonable code is, doesn't it?
Frankly, I would rather have something that's readable.

                    -Matt
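For anyone who wants to compare forms A and B on their own compiler, here is a compilable sketch of both. Several things are assumptions, since the thread never shows them: the contents of `mask` (taken here to be 16 single-bit masks), the 16-bit type of `*p` and `value`, and the function names `store_a` and `store_b`. The quoted B indexes with `column & 7` where A uses `column & 15`; the sketch uses `& 15` in both so that the two forms compute the same thing.

```c
#include <assert.h>

/* Hypothetical table of 16 single-bit masks, leftmost bit first. */
static const unsigned short mask[16] = {
    0x8000, 0x4000, 0x2000, 0x1000, 0x0800, 0x0400, 0x0200, 0x0100,
    0x0080, 0x0040, 0x0020, 0x0010, 0x0008, 0x0004, 0x0002, 0x0001
};

/* Form A: the table is indexed twice within one expression. */
void store_a(unsigned short *p, int column, unsigned short value)
{
    assert(p != 0);
    *p = (unsigned short)((~mask[column & 15] & *p)
                          | (mask[column & 15] & value));
}

/* Form B: the table entry is fetched once into a register temporary. */
void store_b(unsigned short *p, int column, unsigned short value)
{
    register unsigned int x;

    assert(p != 0);
    x = mask[column & 15];
    *p = (unsigned short)((~x & *p) | (x & value));
}
```

Both functions replace the masked bits of *p with the corresponding bits of value and leave the rest alone. Whether a given compiler emits the same code for the two is exactly the question being argued, and comparing the generated assembly is the way to find out.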
dca@toylnd.UUCP (David C. Albrecht) (06/25/88)
> One thing I have yet to see addressed properly by either Aztec or
> Lattice is the following:
>
> {
>     register short i, j, k;
>
>     for (i = 0; i < 10; ++i)
>         <blah>
>     for (j = 0; j < 10; ++j)
>         <blah>
>     for (k = 0; k < 10; ++k)
>         <blah>
> }
>
> I is not used while J is being used, neither I or J are being
> used while K is being used, etc....
>
> ONLY ONE REGISTER SHOULD BE USED FOR ALL THREE REGISTER VARIABLES!!
>

Well, this is a bit more complicated as it requires live/dead analysis.
On brace entry/exit the compiler already has to clear the variables from the
symbol table, so in a proper implementation freeing register variable
allocations and local variable stack space at that point ought to be
relatively easy. Virtually every C compiler I know of gets this right. Even
pcc (gasp). To some degree you can use this 'basic' feature to get good
register allocation in the absence of a good register allocator. This is what
really miffs me about Lattice's screw up.

> :Or how bout this:
> :
> : { short i[100];
> :   i[0] = 12;
> : }
> :
> : You would expect the space for i on the stack to be freed after the last
> : brace for re-use. Nope.
>
> Actually, no. Usually, stack space is overlayed:
>

Apologies for obscure terminology. The compiler should be maintaining
the value for the stack necessary to store local variables for the routine.
On exit from a set of braces any local variables should be 'freed' and thus
that stack space be available for re-use. Note that when I say 'freed' I
am referring to a compile time concept here, not any sort of actual run-time
adjustment of the stack pointer. I would expect that the high water mark
of the stack space required in the routine would be allocated at runtime on
entry and deallocated on exit, but local variables would reuse portions of
the allocated section if they are not simultaneously active or, as Matt put
it, they should be 'overlayed'.
The point remains, however, that Lattice 4.1 doesn't 'overlay' variables
local to braces within the body of a routine but rather allocates space
enough for every variable in the routine. This isn't as big a faux pas
as the register allocation but it is generally wasteful.

postnews f o o d

David Albrecht
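The brace-level bookkeeping described above can be sketched as a compile-time register pool that remembers how many registers were in use at each open brace and rolls back at the matching close. Everything here is hypothetical: `reg_pool`, `enter_block`, `exit_block` and `declare_register` are illustrative names, and the only detail borrowed from the thread is that user register variables go to D7 first and then down to D4.

```c
#include <assert.h>

#define NUSER_REGS 4     /* assumption: D7, D6, D5, D4, in that order */
#define MAX_DEPTH  16    /* assumption: deepest brace nesting handled */
#define SPILLED    (-1)  /* variable has to live on the stack instead */

struct reg_pool {
    int in_use;              /* user registers currently handed out */
    int mark[MAX_DEPTH];     /* value of in_use at each open brace  */
    int depth;               /* number of braces currently open     */
};

void pool_init(struct reg_pool *p)
{
    p->in_use = 0;
    p->depth = 0;
}

/* Open brace: remember how many registers were taken so far. */
void enter_block(struct reg_pool *p)
{
    assert(p->depth < MAX_DEPTH);
    p->mark[p->depth++] = p->in_use;
}

/* Close brace: registers declared inside the block become free again. */
void exit_block(struct reg_pool *p)
{
    assert(p->depth > 0);
    p->in_use = p->mark[--p->depth];
}

/* 'register' declaration: returns 7 for D7 down to 4 for D4, or
   SPILLED when the user registers are exhausted. */
int declare_register(struct reg_pool *p)
{
    if (p->in_use == NUSER_REGS)
        return SPILLED;
    return 7 - p->in_use++;
}
```

Run against the test case from the first post, the first block gets D7, D6, D5, D4 and spills r5; after exit_block the second block starts again at D7. A pool that never rolls back on a close brace would spill every register variable in the second block, which is close to what the Lattice listing shows.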
dillon@CORY.BERKELEY.EDU.UUCP (06/30/88)
>where that variable couldn't be re-used, if the value was no longer needed.
>In other words, if I had a situation like the one above, I would just say
>    "register int i"
>and just let 'i' handle all those loops. I don't see any need to allocate
>two more variables to do the work that the first one could have done.

    Please, don't remark on my lack of a good example. I *DID* say that
this comes up (all the time in my case) not when you have the same type,
but differing types... usually differing pointer types.

    FOREXAMPLE, I might want to put a passed pointer variable in a register
so I can initialize some other structure, but beyond that never use the passed
pointer variable again.

    poof(ss)
    register SOMESTRUCTURE *ss;
    {
        register BLAH *blah;

        ss->x = 43;
        ss->t = 23;
        ss->querty = "hello";
        blah->ss = ss;

        blah blah blah ...
        <lots of code that never uses ss again>
    }

    In my case, this occurs often enough that I use up all the available
registers and then I'm up shit creek. Using sub code blocks only partially
fixes the situation.

> I believe it would be dangerous to let the compiler re-use a variable
>in the manner that you describe, especially when using a debugger. If you
>check the address and/or value of 'i' or 'j', they would be the same, and
>you would think that a bug has been found.

    I won't argue with you too much, since you are obviously unaware
that this is a standard compiler design practice.

>that it should work properly. I would just like to point out that it is
>easy enough for the programmer to detect these situations, and use the
>register variable over "manually".

    I'm glad you agree with me, though I disagree with your last remark.

>
>                    -- Glenn
>

                    -Matt