[comp.sys.mac] Compiler efficiency

palarson@watdragon.waterloo.edu (Paul Larson) (10/31/87)

I have heard some murmurs (they weren't loud enough to be termed complaints) that some (many?  all?) compilers for the Mac produce code of questionable
efficiency.  I don't know enough assembly to prove or disprove this statement.  Is there any merit to these murmurs?

Please post replies to comp.sys.mac.  If this is true, I should think it deserves net discussion (not to mention _outrage_).


	Johan Larson
		DA - Devil's Advocate

stew@endor.UUCP (11/02/87)

In article <3987@watdragon.waterloo.edu> palarson@watdragon.waterloo.edu (Paul Larson) writes:
>I have heard some murmurs (they weren't loud enough to be termed
>complaints) that some (many?  all?) compilers for the Mac produce code
>of questionable efficiency.  I don't know enough assembly to prove or
>disprove this statement.  Is there any merit to these murmurs?

Compared to other machines I've worked on, Macintosh C compilers (I don't
have experience with other languages) are a big disappointment, even the
professional MPW C from Green Hills.  They don't even do simple peephole
optimizations, let alone the global optimization necessary for really
good code generation.

Anyone want to work on porting Gnu C to MPW?

Stew
Stew Rubenstein
Cambridge Scientific Computing, Inc.
UUCPnet:    seismo!harvard!rubenstein            CompuServe: 76525,421
Internet:   rubenstein@harvard.harvard.edu       MCIMail:    CSC

gary@fairlight.oz (Gary Evesson) (11/02/87)

In article palarson@watdragon.waterloo.edu (Paul Larson) writes:

>I have heard some murmurs (they weren't loud enough to be termed complaints)
>that some (many?  all?) compilers for the Mac produce code of questionable
>efficiency.  I don't know enough assembly to prove or disprove this statement.
>Is there any merit to these murmurs?
>
>	Johan Larson
>		DA - Devil's Advocate

The last time I looked at the assembler produced by a Mac compiler, it was
absolutely disgusting. Simple things such as redundant branches were generated.
E.g.
		jmp	.1

		...

	.1	jmp	.2

I believe that some compilers have optimising assemblers, but I don't think
that is really much of an excuse for the above horror.

							gary@fairlight.oz
							Gary Evesson

raylau@dasys1.UUCP (Raymond Lau) (11/03/87)

In article <3987@watdragon.waterloo.edu>, palarson@watdragon.waterloo.edu (Paul Larson) writes:
> I have heard some murmurs (they weren't loud enough to be termed complaints) that some (many?  all?) compilers for the Mac produce code of questionable
> efficiency.  I don't know enough assembly to prove or disprove this statement.  Is there any merit to these murmurs?
> 
> Please post replies to comp.sys.mac.  If this is true I should think it deserves net discussion (not to mention _outrage_.)

Although not a compiler efficiency problem, I have noticed that when LSC links together a file which uses a certain library, it includes the entire library even if only one procedure/function in the library is actually called.  When it links in a project, each object in the project (or whatever you wish to term it) is linked in its entirety even if only one procedure/function within it is actually called.

I know that this is documented...  I'd guess that it's probably done because it's faster...  and I can understand that.  But when Build Application (or code/DA) is invoked, I feel that we should have the option of having the final form of the program optimized.

Before, using the MacTraps with 2.01, I've had relatively small programs (20-35k) come out 3-8k smaller than they do now under the 2.11 MacTraps.  Those numbers add up.

-----------------------------------------------------------------------------
Raymond Lau                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!raylau
New York, NY, USA                               {sun}!hoptoad/         
GEnie:RayLau   Delphi:RaymondLau   CIS:76174,2617
"Take it and StuffIt."

hannon@clio.las.uiuc.edu (11/03/87)

palarson@watdragon.waterloo.edu(Johan Larson) writes in comp.sys.mac

>I have heard some murmurs (they weren't loud enough to be termed
>complaints) that some (many?  all?) compilers for the Mac produce code of
>questionable efficiency.  I don't know enough assembly to prove or disprove
>this statement.  Is there any merit to these murmurs?
>
	After disassembling some Lightspeed Pascal generated code I can
say without a shadow of a doubt that it is some of the WORST code (if not
the worst) I have ever seen.  Now, I have to admit that one of the reasons
for this is that LSP is a ONE-PASS COMPILER, which accounts for its speed, but
that should not affect the code as much as it does...  It is not uncommon to
find a sequence of NOPs in the code, let alone places where three
instructions are used instead of one.
	I've even had one instance where the code generation was so poor
that it caused my INIT to crash because it somehow modified the PC!!!
	I do recommend LSP for debugging code, but I refuse to compile
something that will be published/distributed under LSP (I use MPW because its
code generation is EXCELLENT and it is very code compatible with LSP).


+--------------------------------------------------------------------------+
+                                   |                                      +
+  Leonard Rosenthol                |  USnail:   205 E. Healey  #33        +
+  Halevai Software                 |            Champaign, Il  61820      +
+  GEnie: MACgician                 |                                      +
+  ARPA:  hannon@clio.las.uiuc.edu  |  Bitnet:   3FLOSDQ@UIUCNOSA.BITNET   +
+  {ihnp4|convex|pur-ee}!uiucuxc!clio!hannon                               +
+--------------------------------------------------------------------------+
+ Disclaimer #1: Since I own the company, I can say whatever I want, and   +
+                not be responsible for it!                                +
+                                                                          +
+ Disclaimer #2: Anything I say may be construed as being under the        +
+                jurisdiction of Disclaimer #1                             +
+--------------------------------------------------------------------------+

conybear@moncsbruce.oz (Roland Conybeare) (11/04/87)

	I have a Mac 512ke.  I can fit one project per disk without too much
hassle, but I usually end up with well under 100k of disk space left.  If
an optimising compiler uses up more disk space, I will have to stick with
the older, unoptimising version I already have.
	Most of my projects spend far more time in development than in use.  The
savings from optimising appear modest at this stage.  They are of little 
value *to me* (in other words I find other improvements of more pressing
importance).
	Let me propose an alternative that would alleviate the burden of
optimisation by hand.  If rumours of a symbolic debugger are correct,
perhaps we could have a command to take selected program source which has
already been compiled, and append the compiled assembler code.  This would
provide a base for hand optimisation.

Roland Conybeare
(conybear@moncsbruce.oz)

singer@endor.harvard.edu (Richard Siegel) (11/04/87)

In article <1881@dasys1.UUCP> raylau@dasys1.UUCP (Raymond Lau) writes:
>Although not a compiler efficiency problem, I have noticed that when LSC
>links together a file which uses a certain library, it includes the entire
>library even if only one procedure/func in the lib is actually called.
>When it links in a project, each object in the project (or whatever you
>wish to term it) is linked in its entirety even if only one proc/func
>w/in it is actually called.

	This is correct, and that is done when doing a Check Link on a project
for the simple reason that it's much faster.

>I know that this id documented...  I'd guess that it's probably done bec. 
>it's faster....  and I can understand that.  But when the Build 
>Application (or code/da) is invoked, I feel that we should have the option 
>of having the final form of the prgm optimized.

	In fact, LightspeedC *does* smart-link on a project-file basis;
if there's a module in a project, and that module contains routines that
never get called, that module will not get built into the final application.
Likewise, if you're using a project file as a library in another 
project, that library will be smart-linked as well when you build an
Application.

>Before, using the MacTraps w/2.01, I've had relatively small prgms (20-35k) 
>come out 3-8k smaller than they do now under the 2.11 MacTraps.  
>Those numbers add up.

	This is for the simple reason that the modules that you use
from MacTraps are bigger; this isn't the linker's fault. The 2.11 MacTraps
is simply larger than the 2.01 MacTraps.


I agree that it would be desirable to link on a procedure-by-procedure
basis, but there are tradeoffs: project files will get bigger, and
build times will get slower.

		--Rich

**The opinions stated herein are my own opinions and do not necessarily
represent the policies or opinions of my employer (THINK Technologies, Inc).

* Richard M. Siegel | {decvax, ucbvax, sun}!harvard!endor!singer    *
* Customer Support  | singer@endor.harvard.edu			    *
* Symantec, THINK Technologies Division.  (No snappy quote)         *

lippin@wheatena (The Apathist) (11/05/87)

Recently singer@endor.UUCP (Richard Siegel) said:
>
>I agree that it would be desirable to link on a procedure-by-procedure
>basis, but there are tradeoffs: project files will get bigger, and
>build times will get slower.

While I chose Lightspeed mostly for its compile-link-and-run speed,
the build time it uses doesn't matter to me significantly.  At the
most, I use "Build Application" about twice a month.  Doing it that
often, I wouldn't mind if it had to go back and recompile everything,
and take half an hour, if it would give me a well-optimized product.

						--Tom Lippincott
						..ucbvax!bosco!lippin

    "There is no place I know like the land of pure imagination..."
						--W. Wonka

alan@pdn.UUCP (Alan Lovejoy) (11/05/87)

In article <304@fairlight.oz> gary@fairlight.oz (Gary Evesson) writes:
>In article palarson@watdragon.waterloo.edu (Paul Larson) writes:
>
>>I have heard some murmurs (they weren't loud enough to be termed complaints)
>>that some (many?  all?) compilers for the Mac produce code of questionable
>>efficiency.  I don't know enough assembly to prove or disprove this statement.
>>Is there any merit to these murmurs?

You must remember that it's only been in the last year or two that
really *good* (by mini-/mainframe standards) compilers for the IBM PC
family have become available.  The 68000 in general, and the Mac in
particular, just haven't been successful enough long enough to attract the
critical mass necessary for state-of-the-art compilers to be developed.

The failings of most Mac compilers are as follows:

1) Poor register optimization.  This is a killer, since the architecture
of the 68000 *depends* upon efficient register usage for its
performance.  Also, most "C" compilers won't keep values in registers
unless told specifically by the programmer to use "reg"ister storage
for that variable.  This is one of the reasons for the Mac II's poorer
than expected showing against the '386 machines in the BYTE Benchmarks.
 
2) Poor use of the available addressing modes.  This, too, is a very  
serious problem.  The 68000 has some 14 addressing modes (even the 386
only has 9).  There is a tendency to use d8(An, Dn) for iterating over
arrays, when (An)+ would be much more efficient.  Most compilers won't
use 'LEA <complicated addressing mode operand>, An' followed by repeated
references to (An) inside a loop, but instead use <complicated
addressing mode> inside the loop.  The expression 'a = b + c' should
be generated as 'move b(A6),D0; add c(A6),D0; move D0,a(A6)' (assuming
that a, b and c start out in memory and are not referenced again frequently
and/or soon enough to be worth keeping in the registers).  But many
compilers can't do things quite that efficiently.  You'll often see
'move c(A6),D0; move b(A6),D1; add D0,D1; move D1,a(A6)'.  I've seen
worse.  I could go on...

3) Poor run-time data structures and/or techniques.  By this I mean things 
like procedure activation records, stack frames, procedure calling 
conventions, argument passing techniques, local and/or external procedure
calling techniques, local, global and/or external variable reference 
techniques, inefficient jump tables and failure to use in-line procedure
expansion for short procedures (some of these problems are the fault
of the C language and/or the Mac's OS, but not most of them).

4) Few compilers make heavy use of the classical and mostly
machine-independent optimizations such as constant folding, common
sub-expression elimination, copy propagation, dead-code elimination, 
code-motion, induction variable elimination and strength reduction.

5) I don't know of *any* Mac compilers that attempt multiple generation
strategies for the same code block, and pick the best one based on the
estimated number of machine cycles each strategy will require for
execution. (This has to be done dynamically for an optimizing compiler,
because the "code template" for each source language construct is not
a static entity when the optimizer is making changes all over the
place).  

I have a Modula-2/68000 compiler that does all these optimizations, but it
doesn't run on the Mac. Its Sieve code does 10 iterations 
in 1.14 seconds on a 12 MHz 68000 (1 wait state).  You might expect
a time of 1.8 seconds if this code were run on an SE, 0.45 seconds
on the Mac II.  The best Mac compilers I've seen run around 3 seconds
on the SE and around 0.7 seconds on the II for this benchmark.  Go back
to the BYTE benchmark articles and compare these numbers!

--alan@pdn

jwhitnel@csi.UUCP (Jerry Whitnell) (11/05/87)

In article <1380@cartan.Berkeley.EDU> lippin@wheatena.UUCP (Tom Lippincott, ..ucbvax!bosco!lippin) writes:
>While I chose Lightspeed mostly for its compile-link-and-run speed,
>the build time it uses doesn't matter to me significantly.  At the
>most, I use "Build Application" about twice a month.  Doing it that
>often, I wouldn't mind if it had to go back and recompile everything,
>and take half an hour, if it would give me a well-optimized product.

No, No, don't change it!!! :-)  I have a project I'm working on that
I can't run from the compiler, so I have to build an application
every time (don't ask why).  Make optimization an option so that we have
to wait for the application to be rebuilt only if we want to.

>
>						--Tom Lippincott


Jerry Whitnell				Lizzi Borden took an axe
Communication Solutions, Inc.		And plunged it deep into the VAX;
					Don't you envy people who
					Do all the things You want to do?

raylau@dasys1.UUCP (Raymond Lau) (11/06/87)

> 
> I agree that it would be desirable to link on a procedure-by-procedure
> basis, but there are tradeoffs: project files will get bigger, and
> build times will get slower.
> 
> 		--Rich
> 
Then why not have it as an option on the final build?  Keep the module by
module for the Run....but offer a checkbox for the build....


-----------------------------------------------------------------------------
Raymond Lau                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!raylau
New York, NY, USA                               {sun}!hoptoad/
GEnie:RayLau   Delphi:RaymondLau   CIS:76174,2617
"Take it and StuffIt."

borton@net1.UUCP (11/12/87)

Thank you Alan for the description of [in]efficient 68000 code generation.

I am curious to hear 'expert' opinions on the code generated by LightspeedC,
MPW C, LS Pascal, and MPW Pascal.  I personally am most familiar with LS C
through TMON, and often wonder if a certain construct is really necessary :-).

-cbb
Chris "Johann" Borton, UC San Diego	...!sdcsvax!borton
					borton@ucsd.edu or BORTON@UCSD.BITNET
Letztes Jahr in Deutschland, nog een jaar hier, en dan naar Amsterdam!
"H = F cubed.  Happiness = Food, Fun, & Friends."  --Steve Wozniak

singer@endor.UUCP (11/12/87)

In article <4285@sdcsvax.UCSD.EDU> borton@net1.UUCP (Chris Borton) writes:
>Thank you Alan for the description of [in]efficient 68000 code generation.
>
>I am curious to hear 'expert' opinions on the code generated by LightspeedC,

	Me too. :-)

I'm not sure I can talk about LSC's code generation, and I know I can't
talk about the runtime constructs, but I'll see what I can say, and post
at a later time.

		--Rich

**The opinions stated herein are my own opinions and do not necessarily
represent the policies or opinions of my employer (THINK Technologies, Inc).

* Richard M. Siegel | {decvax, ucbvax, sun}!harvard!endor!singer    *
* Customer Support  | singer@endor.harvard.edu			    *
* Symantec, THINK Technologies Division.  (No snappy quote)         *

jwhitnel@csi.UUCP (Jerry Whitnell) (11/13/87)

In article <3181@husc6.UUCP> singer@endor.UUCP (Richard Siegel) writes:
|In article <4285@sdcsvax.UCSD.EDU> borton@net1.UUCP (Chris Borton) writes:
|>Thank you Alan for the description of [in]efficient 68000 code generation.
|>
|>I am curious to hear 'expert' opinions on the code generated by LightspeedC,
|
|	Me too. :-)

Ask, and ye shall receive...

The code generated by a compiler is generally determined by two phases, the
optimizer and the code generator.  The former generally does machine-independent
optimizations such as common subexpression elimination (where, if an expression
is performed more than once, the compiler modifies the code to execute the
expression once and save the result in a temporary), code hoisting (where,
for each loop, any code that generates the same value every time through the
loop is 'hoisted' out of the loop and done once), etc.  The code generator
is what takes the internal machine-independent representation and converts
it to the assembly or machine language of the target machine.

For LightspeedC, the code generator is as good as any I've seen.  I've never
seen it generate a branch-to-branch, which suggests Michael Kahl did not do
a straightforward code generator but spent a lot of time on it.  About the
only thing that it needs is a peephole optimizer for some cases that
occur between statements such as:

     move.w	d0, d7
     move.w	d7, d0		; This instruction is not needed
     ...

which is generated for the expression

    register int rc;

    rc = function();
    if (rc == 0x80)

(Note, I don't have the code in front of me, so this example may not generate
the code I'm thinking of, but the problem exists in this area.)

The thing LightspeedC lacks is an optimizer.  The reason is that optimizers
are large and (usually) slow.  So for testing, which is probably 99% of
the compiles done, an optimizer is not needed, since it doesn't matter how
fast a program doesn't work :-).   The lack of an optimizer in LightspeedC
is why MPW will generate smaller code than LightspeedC.  The code will usually
be faster, but MPW loses a lot here by using 32-bit ints vs. LightspeedC's
16-bit ints.

|
|* Richard M. Siegel | {decvax, ucbvax, sun}!harvard!endor!singer    *


Jerry Whitnell				Lizzi Borden took an axe
Communication Solutions, Inc.		And plunged it deep into the VAX;
					Don't you envy people who
					Do all the things You want to do?

stew@endor.harvard.edu (Stew Rubenstein) (11/14/87)

In article <4285@sdcsvax.UCSD.EDU> borton@net1.UUCP (Chris Borton) writes:
>Thank you Alan for the description of [in]efficient 68000 code generation.
>
>I am curious to hear 'expert' opinions on the code generated by LightspeedC,
>MPW C, LS Pascal, and MPW Pascal.  I personally am most familiar with LS C
>through TMON, and often wonder if a certain construct is really necessary :-).

I second Alan's comments.  I have used both LightspeedC and MPW C
extensively.  MPW C generally produces better code than LightspeedC,
mostly because it automatically puts things in registers.  On the other
hand, since ints are 32 bits in MPW C, it produces a whole lot of
unnecessary EXT instructions, particularly when working with Mac
toolbox calls, most of which work with 16 bit INTEGERs.  Liberal
use of the "register" storage class in LightspeedC produces code
which is nearly as good as MPW.

Both of them are crummy.

Stew Rubenstein
Cambridge Scientific Computing, Inc.
UUCPnet:    seismo!harvard!rubenstein            CompuServe: 76525,421
Internet:   rubenstein@harvard.harvard.edu       MCIMail:    CSC

singer@endor.harvard.edu (THINK Technologies) (11/23/87)

In article <17000058@clio> hannon@clio.las.uiuc.edu writes:
>	After disassembling some Lightspeed Pascal generated code I can
>say without a shadow of a doubt that it is some of the WORST code (if not
>the worst) I have ever seen.  Now, I have to admit that one of the reasons
>for this is that LSP is a ONE-PASS COMPILER, which accounts for its speed, but
>that should not affect the code as much as it does...  It is not uncommon to
>find a sequence of NOPs in the code, let alone places where three
>instructions are used instead of one.

	Lightspeed Pascal's compiler is a one-pass compiler, which means
that no optimization is done; this means that in many cases you'll see
instances of three instructions being used where one would serve.
The NOPs in the code are used only for procedures that don't need to save
the machine registers... Granted, this is not good style, but it
doesn't affect the overall efficiency of the code.

>	I've even had one instance where the code generation was so poor
>that it caused my INIT to crash because it somehow modified the PC!!!

	If you can show me a reproducible example where Lightspeed Pascal
generated bad code, I would be most happy to see it. The few instances where
garbage code gets generated are well documented, and this is not one of them.

>+  Leonard Rosenthol                |  USnail:   205 E. Healey  #33        +

		--Rich




**The opinions stated herein are my own opinions and do not necessarily
represent the policies or opinions of my employer (THINK Technologies).

* Richard M. Siegel | {decvax, ucbvax, sun}!harvard!endor!singer    *
* Customer Support  | singer@endor.harvard.edu			    *
* Symantec, THINK Technologies Division.  (No snappy quote)         *

newbery@comp.vuw.ac.nz (Michael Newbery) (11/26/87)

A paper was presented at a local Apple Consortium Developers conference a
couple of days ago on the efficiency of four Pascal compilers:
MPW, LSP, Turbo & TML.
The authors can be reached via michael@otago.ac.nz (internet address),
but in summary, MPW is the best of a bad bunch (except for string assignments).
AVOID sets if using LSP or TML; the code is horrendous.
None of them are up to things like common sub-expression elimination, and
auto-increment/decrement seems to be an almost unknown addressing mode.
-- 
Michael Newbery

ACSnet:	newbery@vuwcomp.nz  UUCP: newbery@vuwcomp
Une boule qui roule tue les poules.		(Landslides kill chickens)