newton@cit-vax.Caltech.Edu (Mike Newton) (08/11/87)
Hi -- This (rather long) message will hopefully save a fair number of people some money when buying compilers. It is also a rather strong flame against current Mac compilers. I suspect their state is largely a result of the marketplace. With 10 times as many customers buying %*&*&&^%& 8xxx86 systems, a lot more time and effort goes into producing more competitive 80(*&^*&^%$^%)86 compilers.

SUMMARY (for those that don't want to read the whole message): If you're going to get A/UX for your Mac II, DON'T buy a compiler ... get a copy of Gnu CC and send a contribution. For those planning to run native Mac OS, bitch to Apple and others about the state of the compilers. You are wasting your machine..... (BTW: I have heard that Apple distributes a C compiler with A/UX, but the compiler that they used for all of their work (the Green Hills compiler) costs much extra.)

Probably like a lot of Mac II buyers, when I saw the latest issue of Byte, I was very disappointed. The article causing this disappointment was the one comparing the Mac II vs. the 80386 based PS2/80. First I was disappointed in the article -- I could not tell which compilers were being used (I may have just not read the article carefully). From a lot of experience programming the 8086 and the 68020, I was shocked. The 68020 __should__ be a faster system, and like a lot of these tests, this seemed to be more of a comparison of compilers than machines. (I can provide a couple of references (some good, some bad) on this. One of them is an IEEE article.)

My first reaction was that there might be some mistake. So, I ported the Dhrystone benchmark from Unix to the Mac (ie: I changed the calls to the timer routines and nothing else), and compiled it under MPW C (the Green Hills Compiler). At least the MPW compiler produced better code than the compiler used in the Byte article. The benchmark clocked at 2777 Dhrystones (faster than the Sun-3/52 with the Sun 3.2 cc -O!).
However, this was still 10-20 percent slower than the PS2/80. I couldn't believe this, so I went and disassembled the compiled code....

---> NO WONDER THE PS/2 GETS BETTER TIMINGS THAN THE MAC II. <---
---> THE COMPILER PRODUCES SHITTY CODE <---
(and it is the best one currently available. I hate to see what the others are like!!!!!!!!!!!!!!)

Before I go on, some disclaimers, comments ...:

[a] I am a compiler writer myself. I'm currently working on a peephole optimization paper, and have written the code generator and run time system for the fastest running version of Prolog.
[b] I know some of the Green Hills people, and am not particularly fond of them. I pick on their compiler, but currently their compiler produces the best code of any.
[c] I'm thinking of writing my own C compiler for the Mac someday. Unlikely to ever occur, but . . .
[d] I DO plan on writing some optimizers.
[e] This code was compiled on release 2.0B (I think)
[f] I FUCKING HATE IT WHEN THE COMPILERS WON'T GIVE YOU ASSEMBLY, BUT INSTEAD GO STRAIGHT TO OBJECT CODE. (MPW may have an option to do this, but it was NOT listed in the documentation that I had access to.)
[g] This message was done after a long day. There is easily the possibility that one or two of my samples below are wrong. However, that still leaves MANY!
[h] I'll send the full disassembled code to anyone that asks and that I can get my mailer to send to...
[i] ALL OF THE SAMPLES SHOWN BELOW COULD BE DETECTED BY A PEEPHOLE OPTIMIZER. MORE GLOBAL THINGS ARE HARD TO POINT OUT AND PRODUCE EVEN MORE DRAMATIC EFFECTS ON CODE SPEED IF DONE RIGHT.
[j] It's far easier to point out problems with other people's compilers than to actually write one yourself.
[k] I haven't included the fact that 68020 code was not being produced.
[l] I hate 8086s. I programmed them for a year.
So, using tests done on Suns and Macs, I concluded that Gnu CC produced much better code than Green Hills (or LSC or any of the other current Mac compilers), and that it was also better than the Sun 'cc'. In particular, it really seems as if there was no peephole optimizer when the following instruction is generated (the condition codes it sets were NOT used...):

528: 1000           MOVE.B  D0,D0                   ; <<--- STUPID !!!!!!!!!!!!!!!!

Anyway, the 'appendix' contains the gory details for anyone that wants proof.

- mike

ps: At one of the places that I consult, I had a chance to look at the Clipper code produced by another version of their compiler. It showed MANY of the same problems. Considering the price of GH compilers, if I were Apple or Fairchild, I'd feel a little cheated.

Now, some examples from a disassembled copy of the Dhrystone program

;;; _proc0:
        . . .
01C: 4EBA 0572      JSR     *+$0574 ; 00590         ; ReadDateTime()
020: 301F           MOVE.W  (A7)+,D0                ; <-- STUPID (see comments 10 lines below)
022: 48C0           EXT.L   D0                      ; <-- STUPID (see comments 10 lines below)
024: 7A00           MOVEQ   #$00,D5                 ; i = 0 in LOOP
026: 6002           BRA.S   *+$4 ; 2A               ; <-- STUPID, Branch 1 more
028: 5285           ADDQ.L  #$1,D5                  ; i < 50000
02A: 0C85 0000 C350 CMPI.L  #$0000C350,D5
030: 6500 FFF6      BCS     *-$0008 ; 00028         ; no, -- loop
034: 558F           SUBQ.L  #$2,A7
036: 486E FFEC      PEA     $FFEC(A6)
03A: 4EBA 0554      JSR     *+$0556 ; 00590         ; ReadDateTime
03E: 301F           MOVE.W  (A7)+,D0                ; <-- STUPID Since we don't look at the
040: 48C0           EXT.L   D0                      ; <-- STUPID return value, why do this
042: 202E FFE8      MOVE.L  $FFE8(A6),D0            ;     when we are going to overwrite IT!!!!
046: 91AE FFEC      SUB.L   D0,$FFEC(A6)            ; nulltime = nulltime - starttime
04A: 4878 002A      PEA     $002A
04E: 4EBA 0ACA      JSR     *+$0ACC ; 00B1A         ; malloc
052: 2B40 F68C      MOVE.L  D0,$F68C(A5)
056: 4878 002A      PEA     $002A
05A: 4EBA 0ABE      JSR     *+$0AC0 ; 00B1A         ; malloc
05E: 2B40 F688      MOVE.L  D0,$F688(A5)            ; PtrGlb = (RecordPtr)malloc(...)
062: 206D F688      MOVEA.L $F688(A5),A0            ; <-- STUPID just move D0 to A0
        . . .
08E: 4868 000A      PEA     $000A(A0)
092: 4EBA 0D32      JSR     *+$0D34 ; 00DC6         ; <-- STRCPY is a proc call
        . . .
0AE: 4EBA 04E0      JSR     *+$04E2 ; 00590         ; ReadDateTime
0B2: 301F           MOVE.W  (A7)+,D0                ; <-- STUPID (see above)
0B4: 48C0           EXT.L   D0                      ; <-- STUPID
        . . .
0D0: 2D48 FFFC      MOVE.L  A0,$FFFC(A6)
0D4: 4FEF 0018      LEA     $0018(A7),A7
0D8: 6000 0130      BRA     *+$0132 ; 0020A         ; <-- STUPID (branch to end
                                                    ;     of loop, even though
                                                    ;     compiler can detect not to)
0DC: 4EBA 0294      JSR     *+$0296 ; 00372         ; Proc5()
0E0: 4EBA 0278      JSR     *+$027A ; 0035A         ; Proc4()
0E4: 7402           MOVEQ   #$02,D2                 ; IntLoc1 = 2;
0E6: 2D42 FFE0      MOVE.L  D2,$FFE0(A6)
1CE: 2D42 FFE4      MOVE.L  D2,$FFE4(A6)
        . . .
1D2: 222E FFE0      MOVE.L  $FFE0(A6),D1            ; This only affects a register so this:
1D6: 202E FFE4      MOVE.L  $FFE4(A6),D0            ; <-- STUPID (!) since we KNOW it is in D2
1DA: 4EBA 078E      JSR     *+$0790 ; 0096A
1DE: 2D40 FFF6      MOVE.L  D0,$FFF6(A6)
1E2: 242E FFE4      MOVE.L  $FFE4(A6),D2            ; <-- STUPID (!) (see above)
1E6: 94AE FFF6      SUB.L   $FFF6(A6),D2
        . . .
;;; _proc1
26E: 2F0A           MOVE.L  A2,-(A7)
270: 246F 0008      MOVEA.L $0008(A7),A2            ; structassign(NextRec,*PtrGlb)
274: 2052           MOVEA.L (A2),A0
276: 226D F688      MOVEA.L $F688(A5),A1
27A: 7014           MOVEQ   #$14,D0                 ; This should be 7 so that
27C: 30D9           MOVE.W  (A1)+,(A0)+             ; <-- STUPID this could be
27E: 51C8 FFFC      DBF     D0,*-$0002 ; 0027C      ;     32bit moves
282: 7005           MOVEQ   #$05,D0
284: 2540 0006      MOVE.L  D0,$0006(A2)
288: 2052           MOVEA.L (A2),A0
28A: 216A 0006 0006 MOVE.L  $0006(A2),$0006(A0)     ; <-- STUPID (!) previous stmt can't
                                                    ;     affect memory, so: move.l d0,$6(a0) !!
290: 2052           MOVEA.L (A2),A0                 ; NextRecord.PtrComp = PtrParIn->PtrComp
292: 2092           MOVE.L  (A2),(A0)               ; A good compiler (but NOT a peephole analyser)
                                                    ; could get rid of the next line!!!!!!!!!
294: 2052           MOVEA.L (A2),A0                 ; Proc3(NextRecord.PtrComp);
296: 2F10           MOVE.L  (A0),-(A7)
298: 4EBA 008E      JSR     *+$0090 ; 00328
        . . .
2E4: 204A           MOVEA.L A2,A0
2E6: 7014           MOVEQ   #$14,D0
2E8: 30D9           MOVE.W  (A1)+,(A0)+             ; This could have been 32 bit moves!!
2EA: 51C8 FFFC      DBF     D0,*-$0002 ; 002E8
2EE: 245F           MOVEA.L (A7)+,A2
2F0: 4E75           RTS
        . . .
;;; _Proc7
3E6: 202F 0004      MOVE.L  $0004(A7),D0
3EA: 222F 0008      MOVE.L  $0008(A7),D1            ; <-- STUPID
3EE: 206F 000C      MOVEA.L $000C(A7),A0            ; <-- STUPID
3F2: 5480           ADDQ.L  #$2,D0
3F4: D081           ADD.L   D1,D0                   ; <-- ADD.L $08(A7),D0
3F6: 2080           MOVE.L  D0,(A0)                 ; <-- MOVE.L D0,$0C(A7)
3F8: 4E75           RTS
;;; _Proc8()
40A: 2A00           MOVE.L  D0,D5                   ; this sequence
40C: 5A85           ADDQ.L  #$5,D5                  ; is:
40E: 2005           MOVE.L  D5,D0                   ; <-- STUPID STUPID STUPID (see 2 lines above)
410: E580           ASL.L   #$2,D0
412: 2040           MOVEA.L D0,A0
414: D1C3           ADDA.L  D3,A0
416: 20AF 0024      MOVE.L  $0024(A7),(A0)
41A: 2005           MOVE.L  D5,D0                   ; <-- see this
41C: 5280           ADDQ.L  #$1,D0                  ; <--
41E: E580           ASL.L   #$2,D0                  ; <--
420: 2040           MOVEA.L D0,A0
422: D1C3           ADDA.L  D3,A0
424: 2005           MOVE.L  D5,D0                   ; <-- STUPID
426: E580           ASL.L   #$2,D0                  ; <-- COMMON SUBEXPRESSION ELIM
428: 2240           MOVEA.L D0,A1
42A: D3C3           ADDA.L  D3,A1
42C: 2091           MOVE.L  (A1),(A0)
42E: 2005           MOVE.L  D5,D0
430: 721E           MOVEQ   #$1E,D1                 ; <-- STUPID -- just add it to D0 as . . .
432: D081           ADD.L   D1,D0
434: E580           ASL.L   #$2,D0
436: 2040           MOVEA.L D0,A0
438: D1C3           ADDA.L  D3,A0
43A: 2085           MOVE.L  D5,(A0)
43C: 2C05           MOVE.L  D5,D6
43E: 2005           MOVE.L  D5,D0
440: 2200           MOVE.L  D0,D1                   ; . . .this destroys D1 before it is used again
442: 2401           MOVE.L  D1,D2
444: C0FC 00CC      MULU.W  #$00CC,D0
        . . .
4A4: 2401           MOVE.L  D1,D2                   ; <-- See this??
4A6: C0FC 00CC      MULU.W  #$00CC,D0
4AA: 4841           SWAP    D1
4AC: C2FC 00CC      MULU.W  #$00CC,D1
4B0: 7400           MOVEQ   #$00,D2                 ; <-- STUPID LINE ABOVE !! WHY????
4B2: D481           ADD.L   D1,D2                   ; <-- STUPID
4B4: 4842           SWAP    D2
        . . .
;;; Func1
4D8: 102F 0007      MOVE.B  $0007(A7),D0
4DC: 122F 000B      MOVE.B  $000B(A7),D1
4E0: B001           CMP.B   D1,D0
4E2: 6704           BEQ.S   *+$0006 ; 004E8
4E4: 4201           CLR.B   D1
4E6: 6002           BRA.S   *+$0004 ; 004EA         ; <-- STUPID -- just put
                                                    ;     instructions here
4E8: 7201           MOVEQ   #$01,D1                 ; Oh no....
4EA: 7000           MOVEQ   #$00,D0                 ; ....
4EC: 1001           MOVE.B  D1,D0                   ; <-- STUPID (THIS IS RIDICULOUS!)
4EE: 4E75           RTS
        . . .
;;; _Func2()
524: 4EBA FFB2      JSR     *-$004C ; 004D8
528: 1000           MOVE.B  D0,D0                   ; <-- STUPID !!!!!!!!!!!!!!!!
52A: 508F           ADDQ.L  #$8,A7
        . . .
;;; Func3
57E: 102F 0007      MOVE.B  $0007(A7),D0
582: 0C00 0002      CMPI.B  #$02,D0
586: 6604           BNE.S   *+$0006 ; 0058C
588: 7001           MOVEQ   #$01,D0
58A: 6002           BRA.S   *+$0004 ; 0058E         ; <-- STUPID Branching uncond. to a RTS??
58C: 7000           MOVEQ   #$00,D0
58E: 4E75           RTS
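[Editorial illustration, not part of the original post.] Point [i] in the post says every sample could be caught by a peephole optimizer. As a rough idea of what one such rule involves, here is a minimal C sketch that drops a data-register move to itself (the MOVE.B D0,D0 case at 528) when nothing later reads the condition codes. The Insn record, its cc_live flag, and both function names are hypothetical, made up for this sketch; they are not taken from MPW, Green Hills, or Gnu CC, and a real pass would work on the compiler's own instruction representation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction record, invented for this sketch. */
typedef struct {
    char op[12];   /* mnemonic, e.g. "MOVE.B" */
    char src[16];  /* source operand text */
    char dst[16];  /* destination operand text */
    int  cc_live;  /* nonzero if a later instruction tests the condition codes */
} Insn;

/* True for MOVE.x Dn,Dn whose condition codes are never looked at.
   Only plain data registers ("D0".."D7") qualify, so the move has
   no addressing side effects such as post-increment. */
static int is_redundant_self_move(const Insn *i)
{
    return strncmp(i->op, "MOVE.", 5) == 0 &&
           strlen(i->src) == 2 && i->src[0] == 'D' &&
           strcmp(i->src, i->dst) == 0 &&
           !i->cc_live;
}

/* Compacts the array in place, skipping redundant self-moves.
   Returns the new instruction count. */
int peephole(Insn *code, int n)
{
    int out = 0;
    assert(code != NULL || n == 0);
    for (int in = 0; in < n; in++) {
        if (is_redundant_self_move(&code[in]))
            continue;            /* drop the instruction */
        code[out++] = code[in];  /* keep everything else */
    }
    return out;
}
```

Fed the three instructions shown for _Func2 (JSR, MOVE.B D0,D0, ADDQ.L #$8,A7) with cc_live set to 0, the pass keeps the JSR and the ADDQ.L and returns a count of 2. The other samples in the listing (reloading a value that is still in a register, branching to an RTS) would each need their own rule and a window of more than one instruction.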
dsc@izimbra.CSS.GOV (David S. Comay) (08/11/87)
the one thing i remember from the byte article is that the binaries they used on the mac // were the same ones they used on the mac plus. namely, they contained strictly 68000 instructions. dsc
rokicki@rocky.STANFORD.EDU (Tomas Rokicki) (08/11/87)
> SUMMARY (for those that don't want to read the whole message): If you're
> going to get A/UX for your Mac II, DON'T buy a compiler ... get a copy of
> Gnu CC and send a contribution.

Well, I don't know. I was going to port GCC to the Amiga, but I looked closely at the code it generated first. As it turns out, Manx 3.4a and 3.4b for the Amiga generate better code than GCC, even though GCC attempts some pretty hairy optimizations.

Could someone post benchmarks for the latest Manx compiler for the Mac? (It should be 3.4 something, not the old 1.foo or anything.)

Give the Intel processors some credit, however; their architecture may not be the cleanest, but they are pretty quick.

-tom
ayac071@ut-ngp.UUCP (William T. Douglass) (08/12/87)
In article <46745@seismo.CSS.GOV> dsc@izimbra.CSS.GOV (David S. Comay) writes:
>the one thing i remember from the byte article is that the binaries
>they used on the mac // were the same ones they used on the mac plus.
>namely, they contained strictly 68000 instructions.

I thought that the compiled code used in the BYTE benchmark came from a Mac SE with a HyperCharger 68020 board, 16 MHz chip with a 7.83 MHz 68881 co-processor. Seems that the code generated would be the same for both machines.

Anyone able to clarify this point?

Bill Douglass
ayac071@ngp.UUCP
newton@cit-vax.Caltech.Edu (Mike Newton) (08/12/87)
Considering the lead time until publication, and the only recent availability of '020' compilers, I am pretty sure that no new 020 instructions were used.

BTW: I did _not_ mean to start a benchmark war (or pc vs mac war). I am hoping [a] that these compilers (MPW, LSC, ...) will feel some pressure to improve, [b] to save some people some money (Gnu is good), and [c] that the disassembly listing will give people an idea of how their programs are translated.

- mike

--
newton@csvax.caltech.edu    {ucbvax!cithep,amdahl}!cit-vax!newton
Caltech 256-80              818-356-6771 (afternoons,nights)
Pasadena CA 91125           Beach Bums Anonymous, Pasadena President

Life's a beach. Then you graduate.
wetter@tybalt.caltech.edu (Pierce T. Wetter) (08/12/87)
In article <3560@cit-vax.Caltech.Edu> newton@cit-vax.UUCP (Mike Newton) writes:
>
>
>Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
>I was very disappointed. The article causing this disappointment was the
>one comparing the Mac II vs. the 80386 based PS2/80. First I was
>disappointed in the article -- I could not tell which compilers were being
>used (I may have just not read the article carefully). From a lot of
>experience programming the 8086 and the 68020, I was shocked. The 68020
>__should__ be a faster system, and like a lot of these tests, this seemed
>to be more of a comparison of compilers than machines. (I can provide a
>couple of references (some good, some bad) on this. One of them is an IEEE
>article.)
>
>My first reaction was that there might be some mistake. So, I ported the
>Dhrystone benchmark from Unix to the mac (ie: I changed the calls to the
>timer routines and nothing else), and compiled it under MPW C (the Green
>Hills Compiler). At least the MPW compiler produced better code than the
>compiler used in the Byte article. The benchmark clocked at 2777
>Dhrystones (faster than the Sun-3/52 with the Sun 3.2 cc -O!).
>

The MPW C compiler did NOT, until just recently, produce 68020 code. The MPW Pascal compiler did. The Byte articles probably compared Pascal to C (which is really unfair since the 68xxx series instruction set looks a lot like C) or compiled code on a compiler not built for that microprocessor. The only way to be sure your code comparisons are accurate or even meaningful is to get MPW 2.0 (yes, that's 2.0, not 2.0b3 or 2.0d or even 2.1d1; 2.0) and try the following command:

    c -mc68020 -mc68881 -elems foo.c

That should give you the fastest possible benchmarks.

Pierce Wetter

Apple Forever!!!!!!!!!!

(to "The Caissons Go Rolling Along")

Scratch the disks, dump the core,         Shut it down, pull the plug
Roll the tapes across the floor,          Give the core an extra tug
And the system is going to crash.         And the system is going to crash.
Teletypes smashed to bits.                Mem'ry cards, one and all,
Give the scopes some nasty hits           Toss out halfway down the hall
And the system is going to crash.         And the system is going to crash.
And we've also found                      Just flip one switch
When you turn the power down,             And the lights will cease to twitch
You turn the disk readers into trash.     And the tape drives will crumble in a flash.
Oh, it's so much fun,                     When the CPU
Now the CPU won't run                     Can print nothing out but "foo,"
And the system is going to crash.         The system is going to crash.

--------------------------------------------
wetter@tybalt.caltech.edu
--------------------------------------------
tim@ism780c.UUCP (Tim Smith) (08/13/87)
In article <3560@cit-vax.Caltech.Edu> newton@cit-vax.UUCP (Mike Newton) writes:
< Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
< I was very disappointed. The article causing this disappointment was the
< one comparing the Mac II vs. the 80386 based PS2/80. First I was
                                   ^^^^^
< disappointed in the article -- I could not tell which compilers were being
< used (I may have just not read the article carefully). From a lot of
< experience programming the 8086 and the 68020, I was shocked. The 68020
                             ^^^^         ^^^^^
< __should__ be a faster system, and like a lot of these tests, this seemed
< to be more of a comparison of compilers than machines. (I can provide a
< couple of references (some good, some bad) on this. One of them is an IEEE
< article.)
A 68020 system *IS* usually faster than an 8086 system. As you note
above, the PS2/80 is an 80386, not an 8086. The fact that an 8086 is
slow has no bearing whatsoever on what an 80386 can do.
--
Tim Smith, Knowledgian {sdcrdcf,uunet}!ism780c!tim
tim@ism780c.isc.com
hilfingr@tully.Berkeley.EDU.berkeley.edu (Paul Hilfinger) (08/17/87)
Mike Newton writes:
> ...
> Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
> I was very disappointed. The article causing this disappointment was the
> one comparing the Mac II vs. the 80386 based PS2/80. First I was
> disappointed in the article -- I could not tell which compilers were being
> used (I may have just not read the article carefully). From a lot of
> experience programming the 8086 and the 68020, I was shocked. The 68020
> __should__ be a faster system, and like a lot of these tests, this seemed
> to be more of a comparison of compilers than machines. (I can provide a
> couple of references (some good, some bad) on this. One of them is an IEEE
> article.)
> ...

I showed this article to Robert Dewar at the Courant Institute, NYU, who had the following reaction.

------
Date: Sun, 16 Aug 87 18:42:54 EDT
From: dewar@acf2.nyu.edu

A couple of comments to whoever wrote this. First I would expect an 80386 to run faster than a 68020 at the same clock rate. The store overlap alone will improve things. Also I assume that the comparison was on a MAC II without the memory management chip, does this chip slow things down further? If so the comparison is unfair in any case, since the 80386 has built in memory management. Of course the PS2/80 is a fairly poor implementation of the 80386 -- there are unconditional wait states. The DP386 is generally faster than the PS2/80, and faster designs with static RAM (e.g. the PC Limited design), are faster still.

I am quite surprised that anyone would expect the 68020 to be faster than the 80386, or even as fast, where does this idea come from? Also the gratuitous comments on the 8086 are of course completely irrelevant [when talking about the 80386.]

I quite agree that most compilers for BOTH classes of machines are very poor. This is easy to understand, the compiler markets have largely been wrecked, and no company can make money selling high quality C compilers for the end user market. There are just too many people who want cheap compilers, so this is all the market can provide. My brother has always complained that he has a multi-million dollar investment depending on a compiler which costs $400. He has always said that he would be happy to pay 100 times that amount if it would really make a difference in support and quality.

Reminds me of the airline market at the moment, where people expect to be able to go from A to B at ridiculously low fares AND to get first class service. You can't have it both ways!

[Robert Dewar
Net Address: dewar@acf2.nyu.edu ]

-------- End of Forwarded Message --------

Paul Hilfinger
University of California
Net Address: hilfingr@ginger.berkeley.edu
alan@pdn.UUCP (08/19/87)
In article <20149@ucbvax.BERKELEY.EDU> hilfingr@tully.Berkeley.EDU.UUCP (Paul Hilfinger) quotes Robert Dewar:
>
>Date: Sun, 16 Aug 87 18:42:54 EDT
>From: dewar@acf2.nyu.edu
>
>
>... First I would expect an 80386
>to run faster than a 68020 at the same clock rate. The store overlap alone

At the same clock rate, same memory access speed, same bus speed and same algorithm in hand-coded assembly, the 68020 averages twice the speed of the '386 according to IEEE benchmarks--using the 68020's cache, 68020 opcodes and addressing modes and '386 opcodes and addressing modes, and using 150ns DRAMs. Using 45ns SRAMs, the '020 averages 30 percent faster than the '386. These are objective, reproducible facts.

>will improve things. Also I assume that the comparison was on a MAC II
>without the memory management chip, does this chip slow things down further?

Yes, the 68851 or 68461 MMUs engender one extra clock cycle per memory fetch. Score one for the '386. Remember also that all Mac II's come with either the 68461 or the 68851 as standard equipment, and that Sun, Apollo, Masscomp and others produce machines using 68020/MMU CPU chip sets that are 2 to 3 times faster than the Mac II, which has TWO wait states (120ns DRAMs).

However, consider the following chart of minimum cycle times for memory accesses (assuming no MMU for the '020, built-in MMUs for the other two):

                    '386    '020    '030
    data in cache    ---      2       1
    data in ram       4       3       2

Considering the fact that the '030 has a data cache as well as an instruction cache, its cycle time superiority is even more significant.

>If so the comparison is unfair in any case, since the 80386 has built in
>memory management. Of course the PS2/80 is a fairly poor implementation of
>the 80386 -- there are unconditional wait states. The DP386 is generally
>faster than the PS2/80, and faster designs with static RAM (e.g. the
>PC Limited design), are faster still.
>
>I am quite surprised that anyone would expect the 68020 to be faster than
>the 80386, or even as fast, where does this idea come from?
>

From people who have done professional, unbiased and complete benchmarks of the two CPUs.

>[comments on dearth of good compilers for either CPU]

Microsoft C v5.0 is a VERY good compiler, certainly exceeding any Mac compiler by a long shot. My Modula-2 compiler for my Stride 440 (68000, 12 MHz, 1 wait state, 120ns DRAMs) produces better code than any Macintosh compiler for any language (Sieve runs 10 times in 1.14 seconds, for example, which is better than the Mac II's time as reported in Byte).

>[Robert Dewar
>Net Address: dewar@acf2.nyu.edu ]
>

--Alan "Heard the latest rumor on the 68040, yet?" Lovejoy
jww@sdcsvax.UCSD.EDU (Joel West) (08/19/87)
While I might disagree with the tone, I certainly agree that the existing 68000 compilers for the Mac are disappointing in their use of peephole optimization. The 68000 is no VAX, but it has many peephole optimizations that can be done; why every compiler does not remove needless code (like ignoring the function result) is beyond me.

I'd have to disagree on the assembly code output issue. MPW allows you to DumpObj to reconstruct the assembly code, which is good enough for most purposes.

--
Joel West (c/o UCSD)
Palomar Software, Inc., P.O. Box 2635, Vista, CA 92083
{ucbvax,ihnp4}!sdcsvax!jww          jww@sdcsvax.ucsd.edu
or ihnp4!crash!palomar!joel         joel@palomar.cts.com
daveb@geac.UUCP (Brown) (08/19/87)
In article <3676@sdcsvax.UCSD.EDU> jww@sdcsvax.UCSD.EDU (Joel West) writes:
>While I might disagree with the tone, I certainly agree that
>the existing 68000 compilers for the Mac are disappointing
>in their use of peephole optimization. The 68000 is no VAX,
>but it has many peephole optimizations that can be done;
>why every compiler does not remove needless code
>(like ignoring the function result) is beyond me.

It is interesting to note that the Pascal compilers don't seem to be as bad. In a time-critical loop, I found that writing an explicit array addressing calculation in MPW Pascal produced code only one instruction away from my optimal assembler (a register transfer). I promptly switched to using the Pascal, since it didn't have the parameter passing overhead of the quasi-inline assembler inserter.

Can someone with MPW please dump and annotate the inefficiencies in the Pascal compiler? Pretty please?

--dave (I am _NOT_ a DCB) collier-brown
--
David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
Geac Computers International Inc.,   | Computer Science loses its
350 Steelcase Road,Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279 | every 6 months.
jww@sdcsvax.UCSD.EDU (Joel West) (08/20/87)
I would note that most Pascal compilers can assume that arrays are < 32 Kb long, or can at least tell short arrays from long arrays. Because of the nature of C, any reasonable compiler must assume that any array can be any length, including longer than 32 Kb.
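[Editorial illustration, not part of the original post.] The 32 Kb limit matters because the 68000's indexed addressing mode can take a 16-bit index register that is sign-extended, so a compiler that knows every offset into an array fits in 0..32767 can stay in word arithmetic. The C sketch below only shows the size check and the full-width fallback; the function names are hypothetical and no real compiler is claimed to decide things this way.

```c
#include <stdint.h>

/* Byte offset of element `index`, always computed in 32-bit arithmetic.
   This is what a compiler must fall back to when it cannot bound the array. */
int32_t long_offset(int32_t index, int32_t elem_size)
{
    return index * elem_size;
}

/* Nonzero if the offset of the last element fits in a signed 16-bit
   index, i.e. the whole array can be reached with word arithmetic. */
int fits_word_offset(int32_t n_elems, int32_t elem_size)
{
    if (n_elems <= 0 || elem_size <= 0)
        return 0;
    /* Widen before multiplying so the check itself cannot overflow. */
    int64_t last = (int64_t)(n_elems - 1) * elem_size;
    return last <= INT16_MAX;
}
```

A Pascal compiler sees the declared bounds, so it can evaluate something like fits_word_offset at compile time. A C compiler handed a bare pointer has no bounds to test and has to assume the long form.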
jww@sdcsvax.UCSD.EDU (Joel West) (08/21/87)
Incidentally, I checked with Byte, and they said they'd gotten more complaints on that one article than anything else ever, and that complaints had come from both the Mac and PC sides. Expect a new set of benchmarks in a future issue, although I have no idea of how they'll go about it differently this time.
daveb@geac.UUCP (Brown) (08/21/87)
In article <3688@sdcsvax.UCSD.EDU> jww@sdcsvax.UCSD.EDU (Joel West) writes:
>I would note that most Pascal compilers can assume that arrays
>are < 32 Kb long, or can at least tell short arrays from long
>arrays. Because of the nature of C, any reasonable compiler
>must assume that any array can be any length, including longer
>than 32 Kb.

Actually the compiler was assuming (and demanding!) that the array was less than 32k bytes. I wrote an explicit address expression and looked to see how well the compiler was doing with it. Surprise! it did remarkably well. And the code on either side looked good, too.

But, back to the orig. request: will someone *please* do the same stupidity test with MPW or a suitable Mac Pascal?

--
David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
Geac Computers International Inc.,   | Computer Science loses its
350 Steelcase Road,Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279 | every 6 months.