[comp.sys.mac] Mac C Compilers, Benchmarks, Stupidity

newton@cit-vax.Caltech.Edu (Mike Newton) (08/11/87)

Hi --

This (rather long message) will hopefully save a fair number of people some
money when buying compilers.  It is also a rather strong flame against
current Mac compilers.  I suspect this is largely a result of the
market-place.  With 10 times as many customers buying %*&*&&^%& 8xxx86
systems, a lot more time and effort goes into producing more competitive
80(*&^*&^%$^%)86 compilers.

SUMMARY (for those that don't want to read the whole message): If you're
going to get A/UX for your Mac II, DON'T buy a compiler ... get a copy of
Gnu CC and send a contribution.  For those planning to run native Mac OS,
bitch to Apple and others about the state of the compilers.  You are
wasting your machine.....  (BTW: I have heard that Apple distributes
a C compiler with A/UX, but the compiler that they used for all of their
work (the Green Hills compiler) costs much extra).

Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
I was very disappointed.  The article causing this disappointment was the
one comparing the Mac II vs. the 80386 based PS2/80. First I was
disappointed in the article -- I could not tell which compilers were being
used (I may have just not read the article carefully).  From a lot of
experience programming the 8086 and the 68020, I was shocked.  The 68020
__should__ be a faster system, and like a lot of these tests, this seemed
to be more of a comparison of compilers than machines.  (I can provide a
couple of references (some good, some bad) on this.  One of them is an IEEE
article.)

My first reaction was that there might be some mistake.  So, I ported the
Dhrystone benchmark from Unix to the Mac (i.e., I changed the calls to the
timer routines and nothing else), and compiled it under MPW C (the Green
Hills Compiler).  At least the MPW compiler produced better code than the
compiler used in the Byte article.  The benchmark clocked at 2777
Dhrystones (faster than the Sun-3/52 with the Sun 3.2 cc -O!).
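
For anyone wanting to repeat the measurement, the port comes down to something like the sketch below: time an empty loop, time the benchmark loop, subtract, and divide. This is an illustration in portable C, not the code that was actually run. `time()` stands in for the Mac's ReadDateTime call (both tick once per second), and the names `timed_loop`, `net_seconds` and `per_second` are made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Whole seconds (as seen by a one-second clock) spent running
 * `iterations` passes of body().  Pass body == NULL to time the bare loop. */
long timed_loop(long iterations, void (*body)(void))
{
    volatile long i;                 /* volatile: keep the empty loop alive */
    time_t start = time(NULL);
    for (i = 0; i < iterations; ++i) {
        if (body != NULL)
            body();
    }
    return (long)difftime(time(NULL), start);
}

/* Benchmark seconds with the loop overhead taken out. */
long net_seconds(long bench_secs, long null_secs)
{
    return bench_secs - null_secs;
}

/* Iterations per second, truncated; 0 when the run was too short to time. */
long per_second(long iterations, long secs)
{
    assert(iterations >= 0);
    return secs > 0 ? iterations / secs : 0;
}
```

With a one-second clock the result is coarse. Taking 50000 iterations as an example, 18 net seconds reports 2777 per second, while 17 or 19 seconds would report 2941 or 2631, so a single run is only good to roughly five percent.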

However, this was still 10-20 percent slower than the PS2/80.  I couldn't
believe this, so I went and disassembled the compiled code....

      ---> NO WONDER THE PS/2 GETS BETTER TIMINGS THAN THE MAC II. <---
               --->    THE COMPILER PRODUCES SHITTY CODE <---

(and it is the best one currently available.  I hate to see what the others
are like!!!!!!!!!!!!!!)

Before I go on, some disclaimers, comments ...:
	[a] I am a compiler writer myself.  I'm currently working on a
		peephole optimization paper, and have written the code
		generator and run time system  for the fastest running
		version of Prolog.
	[b] I know some of the Green Hills people, and am not particularly
		fond of them.  I pick on their compiler,  but currently
		their compiler produces the best code of any.
	[c] I'm thinking of writing my own C compiler for the Mac someday.
		Unlikely to ever occur,  but . . .
	[d] I DO plan on writing some optimizers.
	[e] This code was compiled on release 2.0B (I think).
	[f] I FUCKING HATE IT WHEN THE COMPILERS WON'T GIVE YOU ASSEMBLY, BUT
		INSTEAD GO STRAIGHT TO OBJECT CODE.  (MPW may have an option
		to do this, but it was NOT listed in the documentation that
		I had access to.)
	[g] This message was done after a long day.  There is easily the
		possibility that one or two of my samples below are wrong.
		However, that still leaves MANY!
	[h] I'll send the full disassembled code to anyone that asks and
		that I can get my mailer to send to...
	[i] ALL OF THE SAMPLES SHOWN BELOW COULD BE DETECTED BY A PEEPHOLE
		OPTIMIZER.  MORE GLOBAL THINGS ARE HARD TO POINT OUT AND
		PRODUCE EVEN MORE DRAMATIC EFFECTS ON CODE SPEED IF DONE RIGHT.
	[j] It's far easier to point out problems with other people's
		compilers than to actually write one yourself.
	[k] I haven't included the fact that 68020 code was not being produced.
	[l] I hate 8086s.  I programmed them for a year.


So, using tests done on Suns and Macs, I concluded that Gnu CC produced
much better code than Green Hills (or LSC or any of the other current Mac
compilers), and that it was also better than the Sun 'cc'.

In particular, it really seems as if there was no peephole optimizer when
the following instruction is generated: (the condition codes it sets were
NOT used...):

528: 1000 MOVE.B D0,D0 ; <<--- STUPID !!!!!!!!!!!!!!!!
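
To make the point concrete, here is roughly what such a peephole rule looks like. This is a toy sketch in C, not anything taken from MPW or Gnu CC: the `insn` record, the opcode names and the `flags_used` field are invented for illustration, and the liveness of the condition codes is assumed to have been worked out by some earlier pass.

```c
#include <assert.h>
#include <stddef.h>

/* Toy opcode set; these are not real 68000 encodings. */
enum op { OP_MOVE, OP_ADD, OP_JSR, OP_RTS };

struct insn {
    enum op op;
    int src, dst;      /* register numbers, -1 when unused */
    int flags_used;    /* nonzero if later code reads the condition
                          codes set by this instruction */
};

/* Delete every "MOVE Rn,Rn" whose condition codes are never read.
 * Compacts code[] in place and returns the new length. */
size_t drop_self_moves(struct insn *code, size_t n)
{
    size_t in, out = 0;
    assert(code != NULL || n == 0);
    for (in = 0; in < n; ++in) {
        const struct insn *p = &code[in];
        int useless = p->op == OP_MOVE
                   && p->src >= 0
                   && p->src == p->dst
                   && !p->flags_used;
        if (!useless)
            code[out++] = code[in];
    }
    return out;
}
```

The `flags_used` test matters: on the 68000 a MOVE sets the condition codes, so a self-move followed by a conditional branch is a (clumsy) test instruction and has to stay.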

Anyway, the 'appendix' contains the gory details for anyone that wants proof.


- mike


ps: At one of the places that I consult, I had a chance to look at the
Clipper code produced by another version of their compiler.  It showed MANY
of the same problems.  Considering the price of GH compilers, if I were
Apple or Fairchild, I'd feel a little cheated.



Now, some examples from a disassembled copy of the Dhrystone program


;;; _proc0:

. . .
01C: 4EBA 0572     JSR      *+$0574   ; 00590  ; ReadDateTime()
020: 301F          MOVE.W   (A7)+,D0 ; <-- STUPID  (see comments 10 lines below)
022: 48C0          EXT.L    D0   ; <-- STUPID  (see comments 10 lines below)
024: 7A00          MOVEQ    #$00,D5                 ; i = 0 in LOOP
026: 6002          BRA.S    *+$4   ; 2A  ; <-- STUPID, Branch 1 more
028: 5285          ADDQ.L   #$1,D5                  ; i < 50000 ($C350)
02A: 0C85 0000 C350  CMPI.L   #$0000C350,D5
030: 6500 FFF6       BCS      *-$0008   ; 00028  ; no, -- loop
034: 558F            SUBQ.L   #$2,A7
036: 486E FFEC       PEA      $FFEC(A6)
03A: 4EBA 0554       JSR      *+$0556   ; 00590  ; ReadDateTime
03E: 301F            MOVE.W   (A7)+,D0  ; <-- STUPID   Since we don't look at the
040: 48C0            EXT.L    D0  ; <-- STUPID   return value, why do this
042: 202E FFE8       MOVE.L   $FFE8(A6),D0 ; when we are going to overwrite IT!!!!
046: 91AE FFEC       SUB.L    D0,$FFEC(A6)  ; nulltime = nulltime - starttime
04A: 4878 002A       PEA      $002A
04E: 4EBA 0ACA       JSR      *+$0ACC   ; 00B1A ; malloc
052: 2B40 F68C       MOVE.L   D0,$F68C(A5)
056: 4878 002A       PEA      $002A
05A: 4EBA 0ABE       JSR      *+$0AC0   ; 00B1A ; malloc
05E: 2B40 F688       MOVE.L   D0,$F688(A5)  ; PtrGlb = (RecordPtr)malloc(...)
062: 206D F688       MOVEA.L  $F688(A5),A0  ; <-- STUPID just move D0 to A0

. . .


08E: 4868 000A       PEA      $000A(A0)
092: 4EBA 0D32       JSR      *+$0D34   ; 00DC6  ; <-- STRCPY is a proc call


. . .

0AE: 4EBA 04E0       JSR      *+$04E2   ; 00590 ; ReadDateTime
0B2: 301F            MOVE.W   (A7)+,D0               ; <-- STUPID (see above)
0B4: 48C0            EXT.L    D0                     ; <-- STUPID

. . .

0D0: 2D48 FFFC       MOVE.L   A0,$FFFC(A6)
0D4: 4FEF 0018       LEA      $0018(A7),A7
0D8: 6000 0130       BRA      *+$0132   ; 0020A ; <-- STUPID (branch to end
                                                        ; of loop, even though
                                                        ; compiler can detect not to.)
0DC: 4EBA 0294       JSR      *+$0296   ; 00372 ; Proc5()
0E0: 4EBA 0278       JSR      *+$027A   ; 0035A ; Proc4()
0E4: 7402            MOVEQ    #$02,D2                ; IntLoc1 = 2;
0E6: 2D42 FFE0       MOVE.L   D2,$FFE0(A6)

. . .

1CE: 2D42 FFE4       MOVE.L   D2,$FFE4(A6)
1D2: 222E FFE0       MOVE.L   $FFE0(A6),D1  ; This only affects a register so this:
1D6: 202E FFE4       MOVE.L   $FFE4(A6),D0  ; <-- STUPID (!) since we KNOW it is in D2
1DA: 4EBA 078E       JSR      *+$0790   ; 0096A
1DE: 2D40 FFF6       MOVE.L   D0,$FFF6(A6)
1E2: 242E FFE4       MOVE.L   $FFE4(A6),D2  ; <-- STUPID (!) (see above)
1E6: 94AE FFF6       SUB.L    $FFF6(A6),D2

. . .


;;; _proc1

26E: 2F0A            MOVE.L   A2,-(A7)
270: 246F 0008       MOVEA.L  $0008(A7),A2        ; structassign(NextRec,*PtrGlb)
274: 2052            MOVEA.L  (A2),A0
276: 226D F688       MOVEA.L  $F688(A5),A1
27A: 7014            MOVEQ    #$14,D0             ; This should be 7 so that
27C: 30D9            MOVE.W   (A1)+,(A0)+         ; <-- STUPID this could be 
27E: 51C8 FFFC       DBF      D0,*-$0002  ; 0027C  ; 32bit moves
282: 7005            MOVEQ    #$05,D0
284: 2540 0006       MOVE.L   D0,$0006(A2)
288: 2052            MOVEA.L  (A2),A0
28A: 216A 0006 0006  MOVE.L   $0006(A2),$0006(A0) ; <-- STUPID (!) previous stmt can't
                                             ; affect memory, so: move.l d0,$6(a0) !!
290: 2052            MOVEA.L  (A2),A0 ; NextRecord.PtrComp = PtrParIn->PtrComp
292: 2092            MOVE.L   (A2),(A0) ; A good compiler (but NOT a peephole analyser)
                             ; could get rid of the next line!!!!!!!!!
294: 2052            MOVEA.L  (A2),A0 ; Proc3(NextRecord.PtrComp);
296: 2F10            MOVE.L   (A0),-(A7)
298: 4EBA 008E       JSR      *+$0090   ; 00328

. . .


2E4: 204A            MOVEA.L  A2,A0
2E6: 7014            MOVEQ    #$14,D0
2E8: 30D9            MOVE.W   (A1)+,(A0)+  ; This could have been 32 bit moves!!
2EA: 51C8 FFFC       DBF      D0,*-$0002  ; 002E8
2EE: 245F            MOVEA.L  (A7)+,A2
2F0: 4E75            RTS      
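
Reading the listing as written, the MOVEQ #$14 / DBF pair runs the MOVE.W 21 times (DBF counts down through -1), which is 42 bytes, the same $2A handed to malloc earlier. The sketch below shows the same copy done both ways in C: 21 word moves versus 10 long moves plus one word for the tail. The `record` union and both function names are made up for illustration, and it assumes the record is aligned well enough for 32-bit access.

```c
#include <assert.h>
#include <stdint.h>

enum { RECORD_BYTES = 42, WORDS = RECORD_BYTES / 2, LONGS = RECORD_BYTES / 4 };

union record {
    uint16_t w[WORDS];   /* 21 x 16 bits */
    uint32_t l[LONGS];   /* 10 x 32 bits; the 21st word is the tail */
};

/* What the listing does: one 16-bit move per word.
 * Returns the number of moves performed. */
int copy_by_words(union record *dst, const union record *src)
{
    int i;
    assert(dst != 0 && src != 0);
    for (i = 0; i < WORDS; ++i)
        dst->w[i] = src->w[i];
    return i;
}

/* Ten 32-bit moves, then one 16-bit move for bytes 40..41.
 * Returns the number of moves performed. */
int copy_by_longs(union record *dst, const union record *src)
{
    int i;
    assert(dst != 0 && src != 0);
    for (i = 0; i < LONGS; ++i)
        dst->l[i] = src->l[i];
    dst->w[WORDS - 1] = src->w[WORDS - 1];
    return i + 1;
}
```

That is 11 moves instead of 21 for the same bytes. How much time it saves depends on the bus width of the machine in question.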

. . .

;;; _Proc7


3E6: 202F 0004       MOVE.L   $0004(A7),D0
3EA: 222F 0008       MOVE.L   $0008(A7),D1 ; <-- STUPID
3EE: 206F 000C       MOVEA.L  $000C(A7),A0 ; <-- STUPID
3F2: 5480            ADDQ.L   #$2,D0             
3F4: D081            ADD.L    D1,D0        ; <-- ADD.L $08(A7),D0
3F6: 2080            MOVE.L   D0,(A0)      ; <-- MOVE.L D0,$0C(A7)
3F8: 4E75            RTS      
;;; _Proc8()



40A: 2A00            MOVE.L   D0,D5 ; this sequence
40C: 5A85            ADDQ.L   #$5,D5  ; is:
40E: 2005            MOVE.L   D5,D0 ; <-- STUPID STUPID STUPID (see 2 lines above)
410: E580            ASL.L    #$2,D0
412: 2040            MOVEA.L  D0,A0
414: D1C3            ADDA.L   D3,A0
416: 20AF 0024       MOVE.L   $0024(A7),(A0)
41A: 2005            MOVE.L   D5,D0  ; <-- see this
41C: 5280            ADDQ.L   #$1,D0 ; <--
41E: E580            ASL.L    #$2,D0 ; <--
420: 2040            MOVEA.L  D0,A0
422: D1C3            ADDA.L   D3,A0
424: 2005            MOVE.L   D5,D0  ; <-- STUPID
426: E580            ASL.L    #$2,D0 ; <-- COMMON SUBEXPRESSION ELIM
428: 2240            MOVEA.L  D0,A1
42A: D3C3            ADDA.L   D3,A1
42C: 2091            MOVE.L   (A1),(A0)
42E: 2005            MOVE.L   D5,D0
430: 721E            MOVEQ    #$1E,D1  ; <-- STUPID -- just add it to D0 as . . .
432: D081            ADD.L    D1,D0
434: E580            ASL.L    #$2,D0
436: 2040            MOVEA.L  D0,A0
438: D1C3            ADDA.L   D3,A0
43A: 2085            MOVE.L   D5,(A0)
43C: 2C05            MOVE.L   D5,D6
43E: 2005            MOVE.L   D5,D0
440: 2200            MOVE.L   D0,D1  ; . . .this destroys D1 before it is used again
442: 2401            MOVE.L   D1,D2
444: C0FC 00CC       MULU.W   #$00CC,D0

. . .


4A4: 2401            MOVE.L   D1,D2    ; <-- See this??
4A6: C0FC 00CC       MULU.W   #$00CC,D0
4AA: 4841            SWAP     D1
4AC: C2FC 00CC       MULU.W   #$00CC,D1
4B0: 7400            MOVEQ    #$00,D2  ; <-- STUPID LINE ABOVE !! WHY???? 
4B2: D481            ADD.L    D1,D2    ; <-- STUPID 
4B4: 4842            SWAP     D2
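
In C terms, the flagged sequence scales D5 and adds the array base again for every subscript. A sketch of the difference, assuming the source looks like the usual Dhrystone Proc8 body (the function and parameter names here are invented, not taken from the benchmark):

```c
#include <assert.h>

/* Straightforward form: each subscript is scaled and added to the
 * base again, which is what the listing above does. */
void store_naive(long *arr, long i1, long val)
{
    long loc = i1 + 5;
    arr[loc] = val;
    arr[loc + 1] = arr[loc];
    arr[loc + 30] = loc;
}

/* Same stores with the address arr + loc computed once and the
 * remaining subscripts turned into constant offsets from it. */
void store_cse(long *arr, long i1, long val)
{
    long loc = i1 + 5;
    long *p;
    assert(arr != 0);
    p = arr + loc;
    p[0] = val;
    p[1] = p[0];
    p[30] = loc;
}
```

The second form is what common subexpression elimination should produce without the programmer's help: one shift and one add, then fixed displacements off a single address register.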

. . .


;;; Func1


4D8: 102F 0007       MOVE.B   $0007(A7),D0
4DC: 122F 000B       MOVE.B   $000B(A7),D1
4E0: B001            CMP.B    D1,D0
4E2: 6704            BEQ.S    *+$0006   ; 004E8
4E4: 4201            CLR.B    D1             
4E6: 6002            BRA.S    *+$0004   ; 004EA ; <-- STUPID -- just put 
                                                        ; instructions here
4E8: 7201            MOVEQ    #$01,D1   ; Oh no....
4EA: 7000            MOVEQ    #$00,D0   ; ....
4EC: 1001            MOVE.B   D1,D0     ; <-- STUPID  (THIS IS RIDICULOUS!)
4EE: 4E75            RTS      

. . .

;;; _Func2()

524: 4EBA FFB2       JSR      *-$004C   ; 004D8
528: 1000            MOVE.B   D0,D0   ; <-- STUPID !!!!!!!!!!!!!!!!
52A: 508F            ADDQ.L   #$8,A7


. . .

;;; Func3



57E: 102F 0007       MOVE.B   $0007(A7),D0
582: 0C00 0002       CMPI.B   #$02,D0
586: 6604            BNE.S    *+$0006   ; 0058C
588: 7001            MOVEQ    #$01,D0
58A: 6002            BRA.S    *+$0004   ; 0058E  ; <-- STUPID Branching uncond. to a RTS??
58C: 7000            MOVEQ    #$00,D0
58E: 4E75            RTS      
- mike

dsc@izimbra.CSS.GOV (David S. Comay) (08/11/87)

the one thing i remember from the byte article is that the binaries
they used on the mac // were the same ones they used on the mac plus.
namely, they contained strictly 68000 instructions.

dsc

rokicki@rocky.STANFORD.EDU (Tomas Rokicki) (08/11/87)

> SUMMARY (for those that don't want to read the whole message): If you're
> going to get A/UX for your Mac II, DON'T buy a compiler ... get a copy of
> Gnu CC and send a contribution.

Well, I don't know.  I was going to port GCC to the Amiga, but I looked
closely at the code it generated first.  As it turns out, Manx 3.4a and
3.4b for the Amiga generate better code than GCC, even though GCC attempts
some pretty hairy optimizations.  Could someone post benchmarks for the
latest Manx compiler for the Mac?  (It should be 3.4 something, not the
old 1.foo or anything.)

Give the Intel processors some credit, however; their architecture may
not be the cleanest, but they are pretty quick.

-tom

ayac071@ut-ngp.UUCP (William T. Douglass) (08/12/87)

In article <46745@seismo.CSS.GOV> dsc@izimbra.CSS.GOV (David S. Comay) writes:
>the one thing i remember from the byte article is that the binaries
>they used on the mac // were the same ones they used on the mac plus.
>namely, they contained strictly 68000 instructions.

I thought that the compiled code used in the BYTE benchmark came from a
Mac SE with a HyperCharger 68020 board, 16 MHz chip with a 7.83 MHz 68881
co-processor.  Seems that the code generated would be the same for both
machines.  Anyone able to clarify this point?

Bill Douglass
ayac071@ngp.UUCP

newton@cit-vax.Caltech.Edu (Mike Newton) (08/12/87)

Considering the lead time until publication, and the only recent availability
of '020' compilers, I am pretty sure that no new 020 instructions were used.

BTW: I did _not_ mean to start a benchmark war (or pc vs mac war).  I am
hoping [a] that these compilers (MPW, LSC, ...) will feel some pressure
to improve, [b] to save some people some money (Gnu is good), and [c] that
the disassembly listing will give people an idea of how their programs
are translated.

- mike
-- 
newton@csvax.caltech.edu	{ucbvax!cithep,amdahl}!cit-vax!newton
Caltech 256-80			818-356-6771 (afternoons,nights)
Pasadena CA 91125		Beach Bums Anonymous, Pasadena President

	Life's a beach.  Then you graduate.

wetter@tybalt.caltech.edu (Pierce T. Wetter) (08/12/87)

In article <3560@cit-vax.Caltech.Edu> newton@cit-vax.UUCP (Mike Newton) writes:
>
>
>Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
>I was very disappointed.  The article causing this disappointment was the
>one comparing the Mac II vs. the 80386 based PS2/80. First I was
>disappointed in the article -- I could not tell which compilers were being
>used (I may have just not read the article carefully).  From a lot of
>experience programming the 8086 and the 68020, I was shocked.  The 68020
>__should__ be a faster system, and like a lot of these tests, this seemed
>to be more of a comparison of compilers than machines.  (I can provide a
>couple of references (some good, some bad) on this.  One of them is an IEEE
>article.)
>
>My first reaction was that there might be some mistake.  So, I ported the
>Dhrystone benchmark from Unix to the Mac (i.e., I changed the calls to the
>timer routines and nothing else), and compiled it under MPW C (the Green
>Hills Compiler).  At least the MPW compiler produced better code than the
>compiler used in the Byte article.  The benchmark clocked at 2777
>Dhrystones (faster than the Sun-3/52 with the Sun 3.2 cc -O!).
>
   
     The MPW C compiler did NOT, until just recently, produce 68020 code. The
MPW Pascal compiler did. The Byte article probably compared Pascal to C (which
is really unfair since the 68xxx series instruction set looks a lot like C) or
compiled code on a compiler not built for that microprocessor. The only way to
be sure your code comparisons are accurate or even meaningful is to get MPW 2.0
(yes, that's 2.0, not 2.0b3 or 2.0d or even 2.1d1, 2.0) and try the following
command ( c -mc68020 -mc68881 -elems foo.c )  That should give you the fastest
possible benchmarks.
    Pierce Wetter
  Apple Forever!!!!!!!!!!

		(to "The Caissons Go Rolling Along")
Scratch the disks, dump the core,	Shut it down, pull the plug
Roll the tapes across the floor,	Give the core an extra tug
And the system is going to crash.	And the system is going to crash.
Teletypes smashed to bits.		Mem'ry cards, one and all,
Give the scopes some nasty hits		Toss out halfway down the hall
And the system is going to crash.	And the system is going to crash.
And we've also found			Just flip one switch
When you turn the power down,		And the lights will cease to twitch
You turn the disk readers into trash.	And the tape drives will crumble
						in a flash.
Oh, it's so much fun,			When the CPU
Now the CPU won't run			Can print nothing out but "foo,"
And the system is going to crash.	The system is going to crash.

--------------------------------------------

wetter@tybalt.caltech.edu

--------------------------------------------

tim@ism780c.UUCP (Tim Smith) (08/13/87)

In article <3560@cit-vax.Caltech.Edu> newton@cit-vax.UUCP (Mike Newton) writes:
< Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
< I was very disappointed.  The article causing this disappointment was the
< one comparing the Mac II vs. the 80386 based PS2/80. First I was
				   ^^^^^
< disappointed in the article -- I could not tell which compilers were being
< used (I may have just not read the article carefully).  From a lot of
< experience programming the 8086 and the 68020, I was shocked.  The 68020
			     ^^^^         ^^^^^
< __should__ be a faster system, and like a lot of these tests, this seemed
< to be more of a comparison of compilers than machines.  (I can provide a
< couple of references (some good, some bad) on this.  One of them is an IEEE
< article.)

A 68020 system *IS* usually faster than an 8086 system.  As you note
above, the PS2/80 is an 80386, not an 8086.  The fact that an 8086 is
slow has no bearing whatsoever on what an 80386 can do.
-- 
Tim Smith, Knowledgian		{sdcrdcf,uunet}!ism780c!tim
				tim@ism780c.isc.com

hilfingr@tully.Berkeley.EDU.berkeley.edu (Paul Hilfinger) (08/17/87)

Mike Newton writes:

> ...
> Probably like a lot of Mac II buyers, when I saw the latest issue of Byte,
> I was very disappointed.  The article causing this disappointment was the
> one comparing the Mac II vs. the 80386 based PS2/80. First I was
> disappointed in the article -- I could not tell which compilers were being
> used (I may have just not read the article carefully).  From a lot of
> experience programming the 8086 and the 68020, I was shocked.  The 68020
> __should__ be a faster system, and like a lot of these tests, this seemed
> to be more of a comparison of compilers than machines.  (I can provide a
> couple of references (some good, some bad) on this.  One of them is an IEEE
> article.)
> ...

I showed this article to Robert Dewar at the Courant Institute, NYU, who had the
following reaction.

------

Date: Sun, 16 Aug 87 18:42:54 EDT
From: dewar@acf2.nyu.edu

A couple of comments to whoever wrote this. First I would expect an 80386
to run faster than a 68020 at the same clock rate. The store overlap alone
will improve things. Also I assume that the comparison was on a MAC II
without the memory management chip, does this chip slow things down further?
If so the comparison is unfair in any case, since the 80386 has built in
memory management. Of course the PS2/80 is a fairly poor implementation of
the 80386 -- there are unconditional wait states. The DP386 is generally
faster than the PS2/80, and faster designs with static RAM (e.g. the
PC Limited design), are faster still.

I am quite surprised that anyone would expect the 68020 to be faster than
the 80386, or even as fast, where does this idea come from?

Also the gratuitous comments on the 8086 are of course completely
irrelevant [when talking about the 80386.]

I quite agree that most compilers for BOTH classes of machines are very
poor. This is easy to understand, the compiler markets have largely been
wrecked, and no company can make money selling high quality C compilers
for the end user market. There are just too many people who want cheap
compilers, so this is all the market can provide. My brother has always
complained that he has a multi-million dollar investment depending on
a compiler which costs $400. He has always said that he would be happy
to pay 100 times that amount if it would really make a difference in
support and quality.

Reminds me of the airline market at the moment, where people expect to
be able to go from A to B at ridiculously low fares AND to get first
class service. You can't have it both ways!

[Robert Dewar 
Net Address: dewar@acf2.nyu.edu ]

-------- End of Forwarded Message --------

Paul Hilfinger
University of California

Net Address: hilfingr@ginger.berkeley.edu

alan@pdn.UUCP (08/19/87)

In article <20149@ucbvax.BERKELEY.EDU> hilfingr@tully.Berkeley.EDU.UUCP (Paul Hilfinger) quotes Robert Dewar:
>
>Date: Sun, 16 Aug 87 18:42:54 EDT
>From: dewar@acf2.nyu.edu
>
>
>... First I would expect an 80386
>to run faster than a 68020 at the same clock rate. The store overlap alone

At the same clock rate, same memory access speed, same bus speed and
same algorithm in hand-coded assembly, the 68020 averages twice the
speed of the '386 according to IEEE benchmarks--using the 68020's cache,
68020 opcodes and addressing modes and '386 opcodes and addressing
modes, and using 150ns DRAMS.  Using 45ns SRAMS, the '020 averages 30
per cent faster than the '386.  These are objective, reproducible facts.

>will improve things. Also I assume that the comparison was on a MAC II
>without the memory management chip, does this chip slow things down further?

Yes, the 68851 or 68461 MMUs engender one extra clock cycle per memory
fetch.  Score one for the '386.  

Remember also that all Mac II's come with either the 68461 or the 68851
as standard equipment, and that Sun, Apollo, Masscomp and others produce
machines using 68020/MMU CPU chip sets that are 2 to 3 times faster than
the Mac II, which has TWO wait states (120ns DRAMs).

However, consider the following chart of minimum cycle times for memory
accesses (assuming no MMU for the '020, built-in MMUs for the other
two):

                '386       '020      '030
data in cache    ---          2         1
data in ram        4          3         2

Considering the fact that the '030 has a data cache as well as an
instruction cache, its cycle time superiority is even more significant.

>If so the comparison is unfair in any case, since the 80386 has built in
>memory management. Of course the PS2/80 is a fairly poor implementation of
>the 80386 -- there are unconditional wait states. The DP386 is generally
>faster than the PS2/80, and faster designs with static RAM (e.g. the
>PC Limited design), are faster still.
>
>I am quite surprised that anyone would expect the 68020 to be faster than
>the 80386, or even as fast, where does this idea come from?
>

From people who have done professional, unbiased and complete benchmarks
of the two CPUs.

>[comments on dearth of good compilers for either CPU]

Microsoft C v5.0 is a VERY good compiler, certainly exceeding any
Mac compiler by a long shot.  My Modula-2 compiler for my Stride 440
(68000 12Mhz 1 wait state 120ns drams) produces better code than
any Macintosh compiler for any language (Sieve runs 10 times in
1.14 seconds, for example, which is better than the Mac II's time
as reported in Byte).

>[Robert Dewar 
>Net Address: dewar@acf2.nyu.edu ]
>
--Alan "Heard the latest rumor on the 68040, yet?" Lovejoy

jww@sdcsvax.UCSD.EDU (Joel West) (08/19/87)

While I might disagree with the tone, I certainly agree that
the existing 68000 compilers for the Mac are disappointing
in their use of peephole optimization.  The 68000 is no VAX,
but it has many peephole optimizations that can be done;
why every compiler does not remove needless code
(like ignoring the function result) is beyond me.

I'd have to disagree on the assembly code output issue.
MPW allows you to DumpObj to reconstruct the assembly
code, which is good enough for most purposes.
-- 
	Joel West  (c/o UCSD)
	Palomar Software, Inc., P.O. Box 2635, Vista, CA  92083
	{ucbvax,ihnp4}!sdcsvax!jww 	jww@sdcsvax.ucsd.edu
   or	ihnp4!crash!palomar!joel	joel@palomar.cts.com

daveb@geac.UUCP (Brown) (08/19/87)

In article <3676@sdcsvax.UCSD.EDU> jww@sdcsvax.UCSD.EDU (Joel West) writes:
>While I might disagree with the tone, I certainly agree that
>the existing 68000 compilers for the Mac are disappointing
>in their use of peephole optimization.  The 68000 is no VAX,
>but it has many peephole optimizations that can be done;
>why every compiler does not remove needless code
>(like ignoring the function result) is beyond me.

It is interesting to note that the Pascal compilers don't seem to be
as bad.  In a time-critical loop, I found that writing an explicit
array addressing calculation in MPW Pascal produced code only one
instruction away from my optimal assembler (a register transfer). I
promptly switched to using the Pascal, since it didn't have the
parameter passing overhead of the quasi-inline assembler inserter.

Can someone with MPW please dump and annotate the inefficiencies
in the Pascal compiler, please? pretty please?

--dave (I am _NOT_ a DCB) collier-brown

-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

jww@sdcsvax.UCSD.EDU (Joel West) (08/20/87)

I would note that most Pascal compilers can assume that arrays
are < 32 Kb long, or can at least tell short arrays from long
arrays.  Because of the nature of C, any reasonable compiler
must assume that any array can be any length, including longer
than 32 Kb.
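
The 68000 connection, as a sketch: its indexed addressing mode takes a 16-bit index and MULU.W multiplies 16-bit operands, so a compiler that knows an array's whole extent fits in a signed word can do all subscript arithmetic in words. The check below is the kind of test such a compiler might apply; the function name and the choice of 32767 as the cutoff are illustrative, not taken from any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the total size in bytes of an array of `count` elements,
 * each `elem_size` bytes, fits in a signed 16-bit value, so every
 * byte offset into it can be held in a word. */
int extent_fits_in_word(uint32_t count, uint32_t elem_size)
{
    uint64_t bytes;
    assert(elem_size > 0);
    bytes = (uint64_t)count * elem_size;
    return bytes <= (uint64_t)INT16_MAX;
}
```

A Pascal compiler sees the declared bounds of every array and can run a test like this at compile time. A C compiler handed a bare pointer usually has no bounds to test, which is why it falls back to 32-bit arithmetic.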

jww@sdcsvax.UCSD.EDU (Joel West) (08/21/87)

Incidentally, I checked with Byte, and they said they'd gotten
more complaints on that one article than anything else ever,
and that complaints had come from both the Mac and PC sides.
Expect a new set of benchmarks in a future issue, although I
have no idea of how they'll go about it differently this time.

daveb@geac.UUCP (Brown) (08/21/87)

In article <3688@sdcsvax.UCSD.EDU> jww@sdcsvax.UCSD.EDU (Joel West) writes:
>I would note that most Pascal compilers can assume that arrays
>are < 32 Kb long, or can at least tell short arrays from long
>arrays.  Because of the nature of C, any reasonable compiler
>must assume that any array can be any length, including longer
>than 32 Kb.
 Actually the compiler was assuming (and demanding!) that the array was
less than 32k bytes.  I wrote an explicit address expression and
looked to see how well the compiler was doing with it.  Surprise!
it did remarkably well.  And the code on either side looked good, too.

  But, back to the orig. request: will someone *please* do the same
stupidity test with MPW or a suitable Mac Pascal?

-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.