lindwall@sdsu.UUCP (John Lindwall) (03/02/89)
This posting was prompted by a friendly competition between two IBM
PC compilers - Microsoft C 5.0 and Turbo C 2.0. We compared the assembly
output generated by these compilers from a simple piece of code (I had a
similar statement in a graphics program I was writing at the time). Here
is the code:
main()
{
int x, y, z;
x = 8*y - 8*z;
}
[Keep reading - this IS an amiga posting.]
It turned out that MSC generated nicer code for this statement: it
fetched y and z, performed the subtraction, shifted the result left by
three bits, and stored the result in x. Turbo fetched y, shifted it,
fetched z, shifted it, subtracted, and stored the result in x.
So what does this have to do with Amigas? Well, out of curiosity I
tried the same test on the only Amiga C compiler I have, Manx 3.6a.
Here is how I compiled it:
cc -a +L test.c
And here is the code generated:
move.l -8(a5),d0 ; get y
asl.l #3,d0 ; y *= 8
move.l -12(a5),d1 ; get z
asl.l #3,d1 ; z *= 8
sub.l d1,d0 ; x = 8*y - 8*z
move.l d0,-4(a5)
Just out of curiosity I was wondering if someone could compile this
code using Lattice so we can compare results. Use long integers. Please
reveal the command line used to perform the compilation.
Thanks!
johnl@tw-rnd.SanDiego.NCR.COM
john.lindwall@tw-rnd.SanDiego.NCR.COM
deven@pawl.rpi.edu (Deven Corzine) (03/03/89)
Lattice C V5.00 generated similar code; it did not factor the
expression and do the shift once only (with or without the global
optimizer). With your example, the compiler complained about
uninitialized automatic variables and the optimizer simply killed the
whole thing as dead code. Recoded as:
test(x,y)
int x,y;
{
return(x*8-y*8);
}
compiled with:
lc -O -v test.c
nothing complained, produced code was:
link a5,#0000 ; (exactly what does this do?)
; [something w/stack]
move.l 000c(a5),d0 ; y (second arg, at 12(a5))
asl.l #3,d0 ; y*8
move.l 0008(a5),d1 ; x (first arg, at 8(a5))
asl.l #3,d1 ; x*8
sub.l d0,d1 ; x*8-y*8
move.l d1,d0 ; return x*8-y*8
unlk a5 ; undo the link
rts ; return
Deven
--
------- shadow@pawl.rpi.edu ------- Deven Thomas Corzine ---------------------
Cogito shadow@acm.rpi.edu 2346 15th Street Pi-Rho America
ergo userfxb6@rpitsmts.bitnet Troy, NY 12180-2306 (518) 272-5847
sum... In the immortal words of Socrates: "I drank what?" ...I think.
bader+@andrew.cmu.edu (Miles Bader) (03/04/89)
deven@pawl.rpi.edu (Deven Corzine) writes:
> Lattice C V5.00 generated similar code; it did not factor the
> expression and do the shift once only. (with or without the global
> optimizer.)
>...
> test(x,y)
> int x,y;
> {
> return(x*8-y*8);
> }
Just for reference, this is what gcc -O (which does a pretty damn good
job optimizing) on a sun outputs:
_addt8:
link a6,#0
movel a6@(8),d0
asll #3,d0
movel a6@(12),d1
asll #3,d1
subl d1,d0
unlk a6
rts
It manages to save one instruction by subtracting into the return
register. I don't think this is accidental, as it manages to do the
same thing with the order of the subtraction reversed.
-Miles
brianr@tekig5.PEN.TEK.COM (Brian Rhodefer) (03/05/89)
Why would an optimizing compiler put `link a5, 0000' and `unlnk a5'
instructions into a subroutine that needed no local variables?
dillon@POSTGRES.BERKELEY.EDU (Matt Dillon) (03/05/89)
>Why would an optimizing compiler put `link a5, 0000' and 'unlnk a5'
>instructions into a subroutine that needed no local variables?
(1) So A5 can be used to reference arguments
(2) So a debugger can backtrace the stack frame.
Apart from that, there is no reason to use link/unlk at all. If you
want to talk about optimization, one can cut the call-return overhead
by half or more by:
(1) caller passes the return address in a register and jmp's or bra's
to the routine instead of jsr'ing
(2) callee pops caller's arguments (it can pop the args and free up
its own stack (local vars) in one instruction... an add).
-Matt
darin@nova.laic.uucp (Darin Johnson) (03/07/89)
In article <3839@tekig5.PEN.TEK.COM> brianr@tekig5.PEN.TEK.COM (Brian Rhodefer) writes:
>Why would an optimizing compiler put `link a5, 0000' and 'unlnk a5'
>instructions into a subroutine that needed no local variables?
1) To support alloca-type stuff?
2) Because it's simple. Otherwise compilers would have to backpatch
the generated code. If it was determined later that the routine didn't
need a link/unlk, it would have to remove that instruction, shuffle
things around, etc. This isn't that difficult, but a lot of compilers
don't do it. I see this the most on UN*X systems, whose compilers were
derived from PCC. Usually, the code generated is something like this:
link a5,$T997
.
.
unlk a5
$T997 equ 42
Remember, the words "optimizing compiler" don't mean much. If a simple
peephole optimizer is thrown in, they can call it an optimizing
compiler. (Such as a few bigname UN*X machines, whose compilers got
rid of the equ statements in the above example, but did very little
else in my tests. [1985ish])
Darin Johnson (leadsv!laic!darin@pyramid.pyramid.com)
Can you "Spot the Looney"?
jesup@cbmvax.UUCP (Randell Jesup) (03/09/89)
In article <462@laic.UUCP> darin@nova.UUCP (Darin Johnson) writes:
>In article <3839@tekig5.PEN.TEK.COM> brianr@tekig5.PEN.TEK.COM (Brian Rhodefer) writes:
>>Why would an optimizing compiler put `link a5, 0000' and 'unlnk a5'
>>instructions into a subroutine that needed no local variables?
...
>2) Because it's simple. Otherwise compilers would have to backpatch
>the generated code. If it was determined later that the routine
>didn't need a link/unlnk, then it would have to remove that instruction,
>shuffle things around, etc. This isn't that difficult, but a lot of
>compilers don't do it. I see this the most on UN*X systems, whose
>compilers were derived from PCC. Usually, the code generated is something
Lattice will not normally put in LINK #0,An's if the optimizer is
turned on. Otherwise it will, for debugger support (debuggers usually
use LINKs to find the stack frames). However, there are some cases
where LINK #0 will still be generated.
--
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup