kaufman@Neon.Stanford.EDU (Marc T. Kaufman) (12/18/90)
In article <ewright.661465844@convex.convex.com> ewright@convex.com (Edward V. Wright) writes:
>In <1990Dec17.172613.7941@cs.umn.edu> sec@cs.umn.edu (Stephen E. Collins) writes:
>Actually, this would have to be
->x++: LOAD X
->     INC X
->     STORE X
>Unless you have an instruction to increment variables in memory!

Well, since we ARE in a Mac group, let's just look at the code MPW C generates
for just such constructs:

i = i+1;
        MOVE.L  i,D2
        ADDQ.L  #$1,D2
        MOVE.L  D2,i

i++;
        MOVE.L  i,D2
        ADDQ.L  #$1,i

++i;
        ADDQ.L  #$1,i

Behold. The 68K does, indeed, have an instruction to increment variables in
memory.

Marc Kaufman (kaufman@Neon.stanford.edu)
philip@pescadero.Stanford.EDU (Philip Machanick) (12/18/90)
In article <1990Dec18.001753.3756@Neon.Stanford.EDU>, kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
|> Well, since we ARE in a Mac group, let's just look at the code MPW C generates
|> for just such constructs:
|>
|> i = i+1;
|>         MOVE.L  i,D2
|>         ADDQ.L  #$1,D2
|>         MOVE.L  D2,i
|>
|> i++;
|>         MOVE.L  i,D2
|>         ADDQ.L  #$1,i
|>
|> ++i;
|>         ADDQ.L  #$1,i
|>
|> Behold. The 68K does, indeed, have an instruction to increment variables in
|> memory.

Interesting - but remember looking at "toy" examples doesn't tell you much.
In "real" code, where performance really matters, I would hope the compiler
would have loaded the variable into a register for as long as possible.

Still, it's bad that the compiler doesn't pick up i=i+small constant as a
special case - it would be even worse if the Pascal compiler also did this
since (the original point) you have no option of asking for i++.

Many of the programmer-directed "optimizations" in C, like register
variables, ought to be unnecessary with a modern optimizing compiler.
--
Philip Machanick
philip@pescadero.stanford.edu
kaufman@Neon.Stanford.EDU (Marc T. Kaufman) (12/18/90)
In article <1990Dec18.004838.5623@Neon.Stanford.EDU> philip@pescadero.stanford.edu writes:
>In article <1990Dec18.001753.3756@Neon.Stanford.EDU>, kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
|>> Well, since we ARE in a Mac group, let's just look at the code MPW C generates
|>> for just such constructs:
|>>
|>> i = i+1;
|>>         MOVE.L  i,D2
|>>         ADDQ.L  #$1,D2
|>>         MOVE.L  D2,i
|>>
|>> i++;
|>>         MOVE.L  i,D2
|>>         ADDQ.L  #$1,i
|>>
|>> ++i;
|>>         ADDQ.L  #$1,i
|>>
|>> Behold. The 68K does, indeed, have an instruction to increment variables in
|>> memory.

>Interesting - but remember looking at "toy" examples doesn't tell you much.
>In "real" code, where performance really matters, I would hope the compiler
>would have loaded the variable into a register for as long as possible.

Well, to generate the above code, I declared 'external int i'.  The C
compiler DOES put things in registers.  However, MPW C still generates
pretty crufty code for i = i+1:

        MOVE.L  D2,D0
        ADDQ.L  #$1,D0
        MOVE.L  D0,D2

and this is with optimization ON!

The MOVE.L i,D2 in the i++ case is because the value of the expression
(i++) is (i) before the +1.  D2 is dead because we never use the
expression value, and it's a shame that the compiler doesn't remove
it.  I'm going to look at gcc and see if it doesn't generate better
code.  With the complexity of today's applications, every little
5% helps.

Marc Kaufman (kaufman@Neon.stanford.edu)

>Still, it's bad that the compiler doesn't pick up i=i+small constant as a
>special case - it would be even worse if the Pascal compiler also did this
>since (the original point) you have no option of asking for i++.
>
>Many of the programmer-directed "optimizations" in C, like register
>variables, ought to be unnecessary with a modern optimizing compiler.
>--
>Philip Machanick
>philip@pescadero.stanford.edu
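The point about the value of (i++) can be sketched in C. The following is only an illustration, not code from the thread: the helper names are made up, and each helper returns the value of the expression it wraps so the difference between the three forms is visible. When that value is thrown away, all three leave the variable in the same state, which is why a compiler is free to emit a single ADDQ for any of them.

```c
/* Hypothetical helpers (names are not from the thread).
 * Each one adds 1 to *p and returns the value of the C expression used. */

/* i = i + 1 : the expression's value is the NEW value. */
int assign_add(int *p)
{
    return (*p = *p + 1);
}

/* i++ : the expression's value is the OLD value, which is why MPW C
 * copied i into D2 before incrementing it. */
int post_increment(int *p)
{
    return (*p)++;
}

/* ++i : the expression's value is the NEW value. */
int pre_increment(int *p)
{
    return ++(*p);
}
```

Starting from 1, all three helpers leave the variable at 2; only post_increment hands back 1 instead of 2. If the caller ignores the return value, the three are indistinguishable.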
wayner@cello.cs.cornell.edu (Peter Wayner) (12/18/90)
kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
>In article <ewright.661465844@convex.convex.com> ewright@convex.com (Edward V. Wright) writes:
>>In <1990Dec17.172613.7941@cs.umn.edu> sec@cs.umn.edu (Stephen E. Collins) writes:
>>Actually, this would have to be
>->x++: LOAD X
>->     INC X
>->     STORE X
>>Unless you have an instruction to increment variables in memory!
>Well, since we ARE in a Mac group, lets just look at the code MPW C generates
>for just such constructs:
> i = i+1;
>         MOVE.L  i,D2
>         ADDQ.L  #$1,D2
>         MOVE.L  D2,i
> i++;
>         MOVE.L  i,D2
>         ADDQ.L  #$1,i
> ++i;
>         ADDQ.L  #$1,i
>Behold. The 68K does, indeed, have an instruction to increment variables in
>memory.
>Marc Kaufman (kaufman@Neon.stanford.edu)

This is good information, but how many cycles does each of these take?
They are both going to memory, incrementing and returning a value. I
suspect that the latest versions of the 68040 will heavily pipeline
this operation, perhaps, but it is not clear that just using one instruction
is faster than using 3. There are many examples from the fine old VAX
where it was faster to use 3 simple instructions instead of one
super-duper one. This observation was the impetus for the RISC movement.

So what is it? Anyone have a 680?0 manual handy?

Peter Wayner   Department of Computer Science Cornell Univ. Ithaca, NY 14850
EMail:wayner@cs.cornell.edu    Office: 607-255-9202 or 255-1008
Home: 116 Oak Ave, Ithaca, NY 14850  Phone: 607-277-6678
kaufman@Neon.Stanford.EDU (Marc T. Kaufman) (12/18/90)
In article <49832@cornell.UUCP> wayner@cello.cs.cornell.edu (Peter Wayner) writes:
>kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
-> ++i;
->         ADDQ.L  #$1,i
->Behold. The 68K does, indeed, have an instruction to increment variables in
->memory.

>This is good information, but how many cycles does each of these take?
>They are both going to memory, incrementing and returning a value.
>...but it is not clear that just using one instruction is faster than using 3.
>Anyone have a 680?0 manual handy?

Well, we could just resort to first principles and note that to add 1 to a
location in memory requires both reading the old value and writing the new
value.  Even RISC machines are constrained to do that.  In fact, the ADDQ
is one of the faster instructions, as it is only 16 bits with no extensions.

Marc Kaufman (kaufman@Neon.stanford.edu)
Lewis_P@cc.curtin.edu.au (Peter Lewis) (12/18/90)
In article <1990Dec18.001753.3756@Neon.Stanford.EDU>, kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
> Well, since we ARE in a Mac group, let's just look at the code MPW C generates
> for just such constructs:
>
> i = i+1;
>         MOVE.L  i,D2
>         ADDQ.L  #$1,D2
>         MOVE.L  D2,i
>
> i++;
>         MOVE.L  i,D2
>         ADDQ.L  #$1,i
>
> ++i;
>         ADDQ.L  #$1,i
>
> Behold. The 68K does, indeed, have an instruction to increment variables in
> memory.
>
> Marc Kaufman (kaufman@Neon.stanford.edu)

Since the discussion was started because of the deficiency in Pascal of not
having the ++ operator (which, BTW, I agree is a deficiency), I thought it
would be interesting to see what THINK Pascal produced.  In fact, TP compiles
thusly ...

var i:integer;
begin
  i:=1;      { MOVEQ #1,D7 }
  i:=i+1;    { ADDQ.W #1,D7 }
end.

So much for C being more efficient than Pascal :-).  Of course, as someone
else pointed out, these are toy examples, and comparing MPW C to THINK Pascal
is a bit iffy (would someone like to tell us what THINK C produces, with and
without a register statement?).  But it's interesting ...

Peter.
--
Disclaimer:Curtin & I have an agreement:Neither of us listen to either of us.
*-------+---------+---------+---------+---------+---------+---------+-------*
Internet: Lewis_P@cc.curtin.edu.au              I Peter Lewis
ACSnet: Lewis_P@cc.cut.oz.au                    I NCRPDA, Curtin University
Bitnet: Lewis_P%cc.curtin.edu.au@cunyvm.bitnet  I GPO Box U1987
UUCP: uunet!munnari.oz!cc.curtin.edu.au!Lewis_P I Perth, WA, 6001, AUSTRALIA
Hack: ResEdit ResEdit 2.0b2, change CODE=5, 00091C: 4EBA 02A4 to 4E71 4E71
olson@bootsie.UUCP (Eric Olson) (12/18/90)
In article <49832@cornell.UUCP>, many people calmly discuss:
>>>Actually, this would have to be
>
>>->x++: LOAD X
>>->     INC X
>>->     STORE X
>
>>>Unless you have an instruction to increment variables in memory!
>>...let's just look at the code MPW C generates for just such constructs:
>
[Next line quoted out of position -EO]
>This is good information, but how many cycles does each of these take?

                                ; For 68000, no wait state memory
>
>> i = i+1;
>>      MOVE.L  i,D2            ; 16(4/0)
>>      ADDQ.L  #$1,D2          ;  8(1/0)
>>      MOVE.L  D2,i            ; 16(2/2)
>                               ;=40(7/2)
>> i++;
>>      MOVE.L  i,D2            ; 16(4/0)
>>      ADDQ.L  #$1,i           ; 12(1/2) + 12(3/0)
>                               ;=40(8/2)
>> ++i;
>>      ADDQ.L  #$1,i           ; 12(1/2) + 12(3/0)
>                               ;=24(4/2)

                                ; For 68020, no wait state memory
>                               ; Best Case   Cache Case  Worst Case
>> i = i+1;
>>      MOVE.L  i,D2            ;  3(1/0/0)    7(1/0/0)    9(1/2/0)
>>      ADDQ.L  #$1,D2          ;  0(0/0/0)    2(0/0/0)    3(0/1/0)
>>      MOVE.L  D2,i            ;  3(0/0/1)    5(0/0/1)    7(0/1/1)
>                               ;= 6(1/0/1)   14(1/0/1)   19(1/4/1)
>> i++;
>>      MOVE.L  i,D2            ;  3(1/0/0)    7(1/0/0)    9(1/2/0)
>>      ADDQ.L  #$1,i           ;  3(0/0/1)    4(0/0/1)    6(0/1/1)
                                ;+ 3(1/0/0)    5(1/0/0)    6(1/1/0)
>                               ;= 9(2/0/1)   16(2/0/1)   21(2/4/1)
>> ++i;
>>      ADDQ.L  #$1,i           ;  3(0/0/1)    4(0/0/1)    6(0/1/1)
                                ;+ 3(1/0/0)    5(1/0/0)    6(1/1/0)
>                               ;= 6(1/0/1)    9(1/0/1)   12(1/2/1)

For the 68000, the numbers mean:

  Total Clock Cycles (Read Cycles/Write Cycles)

Read Cycles and Write Cycles == 4 Clock Cycles.  So, for example, 18(3/1) is
18 clock cycles, of which 12 (4*3) are read cycles, 4 (1*4) are write cycles,
and the remainder (2) are cycles required for some internal function of the
processor.  The assumption that zero wait state memory is used isn't valid
for all 68000 based Macintoshes; I can't remember which.

For the 68020, the numbers mean:

  Total Clock Cycles (Read Cycles/Instruction Access Cycles/Write Cycles)

Read, Write and Instruction Access Cycles == 3 Clock Cycles.  The timings
shown for the 68020 assume all operands are longword aligned, a 32-bit data
bus, and zero wait state memory.

Sorry, I don't have a 68030 manual.

So, what does this all mean?

1.  68020s are faster than 68000s.
2.  Knowing how fast anything runs on a 68020 is context dependent.
3.  Running two instructions takes longer than running one of the two.
4.  I'm a sucker when somebody says "Anybody got a manual?" :-)

Cheers!

-Eric
--
Eric K. Olson, Editor, Prepare()       NOTE: olson@bootsie.uucp will not work!
Lexington Software Design              Internet: olson@endor.harvard.edu
72A Lowell St., Lexington, MA 02173    Usenet:   harvard!endor!olson
(617) 863-9624                         Bitnet:   OLSON@HARVARD
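The 68000 totals in the table above are plain sums, and the "internal" clocks fall out of the 4-clocks-per-bus-cycle rule. A small C sketch of that arithmetic follows; the struct, function, and array names are invented for illustration, and the figures are the ones quoted in this post (with the two-part ADDQ.L timing already added up to 24(4/2)).

```c
/* One 68000 timing in the post's notation: clocks(reads/writes).
 * All names here are hypothetical; the numbers come from the table above. */
struct timing {
    int clocks;
    int reads;
    int writes;
};

/* Add up the timings of an instruction sequence. */
struct timing total_timing(const struct timing *seq, int n)
{
    struct timing sum = {0, 0, 0};
    for (int k = 0; k < n; k++) {
        sum.clocks += seq[k].clocks;
        sum.reads  += seq[k].reads;
        sum.writes += seq[k].writes;
    }
    return sum;
}

/* Clocks left over after bus cycles, at 4 clocks per read or write. */
int internal_clocks(struct timing t)
{
    return t.clocks - 4 * (t.reads + t.writes);
}

/* i = i+1 : MOVE.L i,D2 / ADDQ.L #$1,D2 / MOVE.L D2,i */
const struct timing assign_seq[]  = { {16, 4, 0}, {8, 1, 0}, {16, 2, 2} };
/* i++     : MOVE.L i,D2 / ADDQ.L #$1,i  (12(1/2) + 12(3/0)) */
const struct timing postinc_seq[] = { {16, 4, 0}, {24, 4, 2} };
/* ++i     : ADDQ.L #$1,i */
const struct timing preinc_seq[]  = { {24, 4, 2} };
```

Summing assign_seq gives 40(7/2), postinc_seq gives 40(8/2), and preinc_seq gives 24(4/2), matching the table; internal_clocks applied to the 18(3/1) example gives the 2 leftover clocks mentioned in the text.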
leonardr@svc.portal.com (Leonard Rosenthol) (12/19/90)
In article <1990Dec18.015258.8631@Neon.Stanford.EDU>, kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
> Well, to generate the above code, I declared 'external int i'.  The C
> compiler DOES put things in registers.  However, MPW C still generates
> pretty crufty code for i = i+1:
>
>         MOVE.L  D2,D0
>         ADDQ.L  #$1,D0
>         MOVE.L  D0,D2
>
> and this is with optimization ON!
>
Just out of curiosity, I tried this experiment on our handy UNIX box
(which unfortunately is a NeXT running system 2.0) running gcc -O
(GNU v1.36/NeXT v3.11) and using the routines:

alpha() { int i; i = 1; i = i +1;}
beta() { int i; i = 1; i++;}
main() {alpha(); beta();}

And it generated the following:

        link    fp, #0
        unlk    fp
        rts

If a return(i) was added to both alpha and beta, then they both generate:

        link    fp, #0
        moveq   #2, d0
        unlk    fp
        rts

Seems like a REAL nice optimization, eh?!?  Oh, it should be pointed out
that the debugger (gdb) requires the link/unlk instructions, and it may be
possible to even have THEM optimized out.
--
----------------------------------------------------------------------
+ Leonard Rosenthol              | Internet: leonardr@sv.portal.com  +
+ Software Ventures              | GEnie:    MACgician               +
+ MicroPhone II Development Team | AOL:      MACgician1              +
----------------------------------------------------------------------
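For anyone wanting to repeat the experiment, here is a tidied version of the two routines with the return(i) added, written with explicit return types (an adjustment for present-day compilers, not what was posted). Compiling it with gcc -O -S and reading the assembly should show whether a given compiler folds both bodies to a constant; the exact output will of course differ by compiler version and target.

```c
/* Both functions compute 1 + 1 through a local variable.
 * An optimizer that tracks the constant can reduce each body to
 * "return 2", which is the moveq #2,d0 seen in the gcc output above. */
int alpha(void)
{
    int i;
    i = 1;
    i = i + 1;
    return i;
}

int beta(void)
{
    int i;
    i = 1;
    i++;
    return i;
}
```

Whatever code is generated, both functions must return 2; the only question the thread is arguing about is how many instructions it takes to get there.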
urlichs@smurf.sub.org (Matthias Urlichs) (12/19/90)
In comp.sys.mac.programmer, article <1990Dec18.015258.8631@Neon.Stanford.EDU>,
kaufman@Neon.Stanford.EDU (Marc T. Kaufman) writes:
<
< Well, to generate the above code, I declared 'external int i'. The C
< compiler DOES put things in registers. However, MPW C still generates
< pretty crufty code for i = i+1:
<
< MOVE.L D2,D0
< ADDQ.L #$1,D0
< MOVE.L D0,D2
<
< and this is with optimization ON!
<
MPW Pascal is about as bad.
< [...] I'm going to look at gcc and see if doesn't generate better code.
Substantially; it generates direct ADDQ instructions in all three cases
(x=x+1; x++; ++x), for registers as well as memory locations.
<
While gcc misses a few obvious optimizations like MOVE.L D0,D0 in some cases,
its code quality approaches being comparable to what a human programmer might
create. That can't be said for the MPW compilers, unfortunately, and there's
no Gnu Pascal yet.
(On the other hand, it's relatively easy to translate Pascal to Modula-2, and
the p1 Modula compiler generates good code... ;-) )
--
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330) \o)/
urlichs@smurf.sub.org (Matthias Urlichs) (12/19/90)
In comp.sys.mac.programmer, article <49832@cornell.UUCP>,
wayner@cello.cs.cornell.edu (Peter Wayner) writes:
<
< This is good information, but how many cycles does each of these take?
< They are both going to memory, incrementing and returning a value. I
< suspect that the latest versions of the 68040 will heavily pipeline
< this operation, perhaps, but it is not clear that just using one instruction
< is faster than using 3.
Don't forget the two memory cycles to read these other instructions, which
also translate to taking more time to read them from disk and effectively
fewer instructions that can be kept in the instruction cache.
The MC68000 manual says that
ADDQ.L #1,(Ax)
will take 20 clock cycles (assuming four cycles per memory access), while
MOVE.L (Ax),Dy
ADDQ.L #1,Dy
MOVE.L Dy,(Ax)
takes 12+8+12=32 cycles. (All of these instructions take up 16 bits.)
--
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330) \o)/
Invader@cup.portal.com (Michael K Donegan) (12/20/90)
This is all very interesting, but the truth is that there isn't one program in 10,000 for which it matters what code gets generated in this case. And in that one, it only matters in about one percent of its lines.

mkd
Lawson.English@p88.f15.n300.z1.fidonet.org (Lawson English) (12/22/90)
Philip Machanick writes in a message to All

PM> Many of the programmer-directed "optimizations" in C, like register
PM> variables, ought to be unnecessary with a modern optimizing compiler.

With pipelining, and global optimization of registers (is this last even
possible on a Mac?), the consensus is that the "register" variable days of
C are fast fading (though, aside from HyperC, I have yet to see a good
optimizing compiler on the Mac).

BTW, has anyone thought to time the trap dispatcher for the '020/030 series
computers? My work a few years back on an accelerator card indicated that
the faster the processor, the more overhead the trap dispatcher gave.

Lawson
--
Uucp: ...{gatech,ames,rutgers}!ncar!asuvax!stjhmc!300!15.88!Lawson.English
Internet: Lawson.English@p88.f15.n300.z1.fidonet.org
Lawson.English@p88.f15.n300.z1.fidonet.org (Lawson English) (12/22/90)
Marc T. Kaufman writes in a message to All

MTK> The MOVE.L i,D2 in the i++ case is because the value of the expression
MTK> (i++) is (i) before the +1. D2 is dead because we never use the
MTK> expression value, and its a shame that the compiler doesn't remove
MTK> it. I'm going to look at gcc and see if doesn't generate better
MTK> code. With the complexity of today's applications, every little
MTK> 5% helps.

So are you checking to see how much time is spent in the trap dispatcher?
The average program (unless you aren't following the Mac User Interface
Guidelines or have massive amounts of number crunching) spends more time
there than in any other part of the program: 20% on a Plus, 40+% on a
Mac II or higher. Mac optimization is a whole different world...

Lawson
--
Uucp: ...{gatech,ames,rutgers}!ncar!asuvax!stjhmc!300!15.88!Lawson.English
Internet: Lawson.English@p88.f15.n300.z1.fidonet.org
keith@Apple.COM (Keith Rollin) (12/24/90)
In article <33019.27737F47@stjhmc.fidonet.org> Lawson.English@p88.f15.n300.z1.fidonet.org (Lawson English) writes:
>Marc T. Kaufman writes in a message to All
>
>MTK> I'm going to look at gcc and see if doesn't generate better
>MTK> code. With the complexity of today's applications, every little
>MTK> 5% helps.

I took a look at one of the samples posted here compiled with both MPW C
and gnu C. This was the source:

void empty(int i) {};
main() { int i = 1; empty(++i); };

Under MPW C, we get the following:

empty():
00000000: 4E56 0000          LINK       A6,#$0000
00000004: 4E5E               UNLK       A6
00000006: 4E75               RTS

main():
00000000: 4E56 0000          LINK       A6,#$0000
00000004: 2F07               MOVE.L     D7,-(A7)
00000006: 7E01               MOVEQ      #$01,D7
00000008: 5287               ADDQ.L     #$1,D7
0000000A: 2F07               MOVE.L     D7,-(A7)
0000000C: 4EBA 0000          JSR        empty           ; id: 1
00000010: 2E2E FFFC          MOVE.L     -$0004(A6),D7
00000014: 4E5E               UNLK       A6
00000016: 4E75               RTS

The LINK/UNLK in empty() is interesting. This doesn't occur under MPW 3.1 C,
but it seems to be back in MPW 3.2 C. I'm not sure why. Also, when compiling
the sample, I got the warning from C that said "Parameter "i" not used within
the Body of the function : empty". However, when I removed "i" from the
definition header, gC complained with an error that "parameter name omitted".

Under gnu C, we get the following:

empty():
00000000: 4E75               RTS

main():
00000000: 4878 0002          PEA        $0002
00000004: 4EBA 0000          JSR        empty           ; id: 5
00000008: 584F               ADDQ.W     #$4,A7
0000000A: 4E75               RTS

Here, we see the same results that one other person noticed when compiling
on their NeXT block. Namely, that the i=1;++i; gets optimized to "2".
Another interesting thing is that empty() is reduced to a simple RTS. Should
an optimizing compiler recognize that this is a null procedure, and remove
the call to it altogether?

By the way, I also tried out gC on a much larger program. Compiled
under MPW 3.2 C, this program was 83K long. Under gC, the program
was 81K. Either:

a) MPW C is better than we thought
b) gC is not as good as we thought
c) the author of the program took advantage of constructs that lent
   themselves to being compiled better, no matter what the compiler was
   (I know that this alternative is not the case, though)
d) or the example I chose just happened to be a bad example ("bad", that
   is, for anyone trying to show that MPW C is a crummy compiler).
--
------------------------------------------------------------------------------
Keith Rollin  ---  Apple Computer, Inc.  ---  Developer Technical Support
INTERNET: keith@apple.com
    UUCP: {decwrl, hoptoad, nsc, sun, amdahl}!apple!keith
"Argue for your Apple, and sure enough, it's yours" - Keith Rollin, Contusions
peirce@outpost.UUCP (Michael Peirce) (12/24/90)
In article <47576@apple.Apple.COM>, keith@Apple.COM (Keith Rollin) writes:
> By the way, I also tried out gC on a much larger program. Compiled
> under MPW 3.2 C, this program was 83K long. Under gC, the program
> was 81K. Either:
>
> a) MPW C is better than we thought
> b) gC is not as good as we thought
> c) the author of the program took advantage of constructs that lent
>    themselves to being compiled better, no matter what the compiler was
>    (I know that this alternative is not the case, though)
> d) or the example I chose just happened to be a bad example ("bad", that
>    is, for anyone trying to show that MPW C is a crummy compiler).

I'm no compiler expert (and it probably shows :-), but I think we're barking
up the wrong trees with our simple examples.

Really good compilers shine in the complex programs. They look at the
context of big complex programs and do wondrous things. I remember working
with some DEC compilers on VAX/VMS that did such things. One example was
some very fancy automatic inlining of functions, followed by optimizing the
result down to a very few instructions. It allowed programmers to write very
abstract code and still get the efficiency some people only believe you get
in C.

If a compiler can't figure out that i++ and i=i+1 are the same thing, this
is just plain stupid. Now, I realize that some compilers are still fairly
stupid and we need to pay attention to details in certain critical pieces of
code to squeeze the most out of our Macs, but someday the compilers on the
Mac will catch up with the rest of the industry. (please please please!)

Another point to keep in mind is that all the time spent hand tweaking can
often be better spent rethinking the algorithm in the first place. Very
efficient implementations of poor algorithms will be slower than fair
implementations of superior algorithms.

-- michael, shooting his mouth off again...

-- Michael Peirce         -- {apple,decwrl}!claris!outpost!peirce
-- Peirce Software        -- Suite 301, 719 Hibiscus Place
-- Macintosh Programming  -- San Jose, California 95117
-- & Consulting           -- (408) 244-6554, AppleLink: PEIRCE
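The last point, that a better algorithm beats a hand-tuned poor one, can be illustrated with a small C sketch. The example (searching a sorted array) and the function names are chosen here for illustration and do not come from the thread. No amount of instruction-level tuning of the first routine changes the fact that it inspects up to n elements, while the second inspects about log2(n).

```c
#include <stddef.h>

/* Linear scan: up to n comparisons. Returns the index of key, or -1. */
long linear_search(const int *a, size_t n, int key)
{
    for (size_t k = 0; k < n; k++) {
        if (a[k] == key)
            return (long)k;
    }
    return -1;
}

/* Binary search: about log2(n) comparisons, but a[] must be sorted
 * in ascending order. Returns the index of key, or -1. */
long binary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;                  /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;    /* avoids overflow of lo + hi */
        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

On a million-element table the scan may touch a million entries and the binary search about twenty, a gap far larger than anything saved by choosing ADDQ over MOVE/ADDQ/MOVE inside the loop.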
freek@fwi.uva.nl (Freek Wiedijk) (12/24/90)
keith@Apple.COM (Keith Rollin) writes:
>                  Another interesting thing is that empty() is
>reduced to a simple RTS. Should an optimizing compiler recognize
>that this is a null procedure, and remove the call to it altogether?

What happens if you declare empty static and compile the program with the
flag -finline-functions?  I do not have gcc for MPW, so I cannot try it
myself.  I am very curious what will be left.

Freek "the Pistol Major" Wiedijk                  E-mail: freek@fwi.uva.nl
#P:+/ = #+/P?*+/ = i<<*+/P?*+/ = +/i<<**P?*+/ = +/(i<<*P?)*+/ = +/+/(i<<*P?)**
keith@Apple.COM (Keith Rollin) (12/28/90)
In article <1530@carol.fwi.uva.nl> freek@fwi.uva.nl (Freek Wiedijk) writes:
>keith@Apple.COM (Keith Rollin) writes:
>>                  Another interesting thing is that empty() is
>>reduced to a simple RTS. Should an optimizing compiler recognize
>>that this is a null procedure, and remove the call to it altogether?
>
>What happens if you declare empty static and compile the program with the
>flag -finline-functions?  I do not have gcc for MPW, so I cannot try it
>myself.  I am very curious what will be left.

Pretty slick! The program:

static void empty(int i) {};
main() { int i = 1; empty(++i); };

reduces to:

Module: Flags=$00=(Local Code) Module="main%"(1) Segment="Main"(2)
00000000: 4E56 0000      'NV..'            LINK       A6,#$0000
00000004: 4E5E           'N^'              UNLK       A6
00000006: 4E75           'Nu'              RTS

Now...I wonder why the LINK/UNLK are still there. I also used -mbg off, so
it's not for debugging purposes...
--
------------------------------------------------------------------------------
Keith Rollin  ---  Apple Computer, Inc.  ---  Developer Technical Support
INTERNET: keith@apple.com
    UUCP: {decwrl, hoptoad, nsc, sun, amdahl}!apple!keith
"Argue for your Apple, and sure enough, it's yours" - Keith Rollin, Contusions
urlichs@smurf.sub.org (Matthias Urlichs) (12/31/90)
In comp.sys.mac.programmer, article <47610@apple.Apple.COM>,
keith@Apple.COM (Keith Rollin) writes:
< In article <1530@carol.fwi.uva.nl> freek@fwi.uva.nl (Freek Wiedijk) writes:
< >
< >What happens if you declare empty static and compile the program with the
< >flag -finline-functions?  I do not have gcc for MPW, so I cannot try it
< >myself.  I am very curious what will be left.
<
< static void empty(int i) {};
< main() { int i = 1; empty(++i); };
< reduces to:
< Module: Flags=$00=(Local Code) Module="main%"(1) Segment="Main"(2)
< 00000000: 4E56 0000      'NV..'            LINK       A6,#$0000
< 00000004: 4E5E           'N^'              UNLK       A6
< 00000006: 4E75           'Nu'              RTS
<
<Now...I wonder why the LINK/UNLK are still there. I also used -mbg off, so
<it's not for debugging purposes...
<
Probably because you didn't use -fomit-frame-pointer.
(Procedure entry/exit code doesn't go thru gcc's optimizer.)
--
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de     /(o\
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(0700-2330)   \o)/
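A version of the test program that can still be checked after the optimizer has had its way with it might look like the sketch below. It is an adaptation, not the posted source: the wrapper function call_empty and its return value are additions so that the effect of ++i stays observable even when the call to empty() is inlined away. Compiling with the flags named in this exchange (-O, -finline-functions, -fomit-frame-pointer) and inspecting the assembly should show what a given gcc leaves behind; results will vary by version and target.

```c
/* A static function with an empty body. Because it is static, nothing
 * outside this file can call it, so the compiler may inline it and then
 * discard the function entirely. */
static void empty(int i)
{
    (void)i;    /* silences the "parameter not used" warning */
}

/* Hypothetical wrapper (not in the original post): performs the same
 * empty(++i) call and returns i so the increment can be verified. */
int call_empty(void)
{
    int i = 1;
    empty(++i);
    return i;
}
```

However the call is compiled, call_empty() has to return 2, since the ++i side effect must survive even if the call itself disappears.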
lins@Apple.COM (Chuck Lins) (01/03/91)
In article <47576@apple.Apple.COM> keith@Apple.COM (Keith Rollin) writes:
>By the way, I also tried out gC on a much larger program. Compiled
>under MPW 3.2 C, this program was 83K long. Under gC, the program
>was 81K. Either:

Code size is not necessarily a valid measure of code quality. For the 020 and
030 the compiler can generate MORE instructions and yet the code will run
faster. Alignment of data to longword boundaries is very important on these
processors. (And both MPW C and Pascal are poor in this regard.) There are
other factors as well, but there's no sense going into them unless you want
to write a compiler.
--
Chuck Lins               | "Is this the kind of work you'd like to do?"
Apple Computer, Inc.     | -- Front 242
20525 Mariani Avenue     | Internet: lins@apple.com
Mail Stop 37-BD          | AppleLink: LINS@applelink.apple.com
Cupertino, CA 95014      | "Self-proclaimed Object Oberon Evangelist"
The intersection of Apple's ideas and my ideas yields the empty set.
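As a rough illustration of what "alignment of data to longword boundaries" costs in space, the C sketch below rounds an offset up to the next multiple of an alignment, which is the usual way a compiler or allocator places a 4-byte value on a 4-byte boundary. The function names are invented for this example and it says nothing about how MPW C or gcc actually lay out data.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * Assumes align is a power of two (2, 4, 8, ...). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed in front of a field that would
 * otherwise start at offset. */
size_t padding_for(size_t offset, size_t align)
{
    return align_up(offset, align) - offset;
}
```

A longword that would otherwise start at offset 2 gets moved to offset 4 at the cost of 2 bytes of padding. That is the kind of trade Lins describes: the program grows slightly, but on a 68020 or 68030 with a 32-bit bus the aligned access can complete in one bus cycle instead of two.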