[comp.sys.mac] Code Optimization

palarson@watdragon.waterloo.edu (Paul Larson) (01/24/88)

Currently, single-pass compilers are very popular, due to their speed.  This
speed allows the development cycle to be shortened, sometimes by very
significant amounts.  However, unless I'm very wrong, the speed of the
single-pass compiler is gained at the expense of code quality.  One
immediately obvious solution is a code optimizer, which could be used on a
compiled program once the critical debugging is done.  However, such a
code optimizer could also be put to another, and potentially more powerful,
use.

The Mac and Amiga worlds are currently in the process of changing from the
68000 processor to the similar, but more powerful 68020.  The 68030 has also
been rumoured to be lurking somewhere on the horizon.  However, regardless of
whether a '000 or a '020 is installed, the same code is used.  Thus, if a 
system is using the more advanced '020, it is using code which was written for
a more primitive, though similar, processor and thus the code does not take
proper advantage of the more advanced features of the more modern processors.
The same argument applies for the math co-processors, the 68881 and its new
sibling the 68882, which are rapidly becoming more popular on home computers.

The code optimizer, if properly written, could remove the inefficiency of
using primitive code on an advanced processor, simply by modifying the
code to take advantage of the features of the more advanced processor.
Since not everyone has a 68020, it would be left up to the user to decide
whether or not (s)he wants the code of a program modified in such a way.
The optimizer could also be written to rewrite the program code so as
to take advantage of a coprocessor, if one is present.

Is there some fundamental problem with what is described above?  Is anyone
working on such an optimizer?
I welcome your comments and criticisms.

	Johan Larson

denbeste@bbn.COM (Steven Den Beste) (01/25/88)

Johan Larson writes:

> Currently, single pass compilers are very popular, due to their speed.  This
> speed allows the development cycle to be shortened, sometimes by very
> significant amounts.  However, unless I'm very wrong, the speed of the
> single pass compiler is gained at the expense of code quality.  One 
> immediately obvious solution is a code optimizer, which could be used on a
> compiled program once the critical debuggion is done.  However, such a 
> code optimizer could also be put to another, and potentially more powerful
> use.
It depends on what you mean by a "single pass" compiler. If you mean a
single-pass compiler which yields assembly language, then you are sort of
cheating, because the two passes of the assembler represent the second and
third passes of your "one-pass" compiler.

If you mean a single-pass compiler which goes directly to object code (or
executable - did I mention the two linking passes above, too?), then the big
problem with single-pass compilers is that they have to keep everything in
memory. If your
program won't fit (and it has to be the WHOLE program, too - you can't break it
up unless you have a separate linker (which means two more passes)) then you
are SOL.

The biggest advantage of a multi-pass compiler is that the available memory at
compilation is almost unrelated to the size of the program being compiled (the
only thing which is related is the symbol table size).

If a single-pass compiler operates in memory, there is no fundamental reason
why it can't generate code just as efficient as a multi-pass compiler.

(By the way, I think you must be a Mac person. I don't know of any single-pass
C compilers for the Amiga, and very little development is done in any other
languages for it.)

> The Mac and Amiga worlds are currently in the process of changing from the
> 68000 processor to the similar, but more powerful 68020.  The 68030 has also
> been rumoured to be lurking somewhere on the horizon.  However, regardless of
> whether a '000 or a '020 is installed, the same code is used.  Thus, if a 
> system is using the more advanced '020, it is using code which was written for
> a more primitive, though similar, processor and thus the code does not take
> proper advantage of the more advanced features of the more modern processors.
> The same agrument applies for the math co-processors, the 68881 and its new
> sibling the 68882, which are rapidly becoming more popular on home computers.
Now, I may be wrong about this, but the big advantage of the 68020 isn't so
much new instructions (there are a few, but they aren't commonly used) as the
fact that it executes the old instructions much faster - because it uses a
32-bit internal data path and the 68000 uses a 16-bit path, so it takes twice
as many internal cycles on the 68000 for a given amount of wide data.

The execution speed of a program using the 68000 instruction set, but run on
the 68020, will typically not be much worse than the same program compiled to
use the full
68020 set - but both will beat the pants off of a 68000 running the same clock
rate.

The one big exception to this is that the 68000 cannot work with a math
co-processor and the 68020 (and the 68010 before it) can - so for math stuff if
you use the coprocessor instructions you can get orders of magnitude increases
in speed. There are ways of working around this so that the same code works on
both, but takes advantage of the math chip if it is there:

Way #1: All math operations are handled as calls to a run-time-loaded library
of math routines. On a 68000 or a 68020 without math chip, you load a software
floating point library. On the 68020 with math chip, you load one which calls
the chip. (This is what the Amiga does.)

Way #2: Everyone's code uses the math chip instructions, but on a 68000 or a
68020 without the math chip they trap as illegal instructions - to a special
routine which emulates them in software and returns!

In the former case you pay a small performance hit on the 68020 with math
chip as compared to the latter.
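
To make Way #1 concrete, here is a minimal C sketch of a math library selected
at startup through function pointers.  This is only an illustration; the names
(MathOps, init_math, and so on) are made up, and this is not the actual Amiga
or Mac library mechanism.

#include <stdio.h>

typedef struct {
    double (*add)(double, double);
    double (*mul)(double, double);
} MathOps;

/* Software emulation: always works, on a 68000 or a bare 68020. */
static double add_soft(double a, double b) { return a + b; }
static double mul_soft(double a, double b) { return a * b; }

/* On real hardware these would be thin wrappers around 68881
   instructions; here they merely stand in for the coprocessor library. */
static double add_fpu(double a, double b) { return a + b; }
static double mul_fpu(double a, double b) { return a * b; }

static MathOps ops;

static void init_math(int have_fpu)      /* presence detected at boot */
{
    if (have_fpu) { ops.add = add_fpu;  ops.mul = mul_fpu;  }
    else          { ops.add = add_soft; ops.mul = mul_soft; }
}

int main(void)
{
    init_math(0);                        /* pretend no 68881 is present */
    printf("%g\n", ops.mul(ops.add(1.5, 2.5), 3.0));
    return 0;
}

The program pays one indirect call per operation, which is the small
performance hit mentioned above; Way #2 avoids that cost on machines that
have the chip.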


> The code optimizer, if properly written, could remove the inefficiency of
> using primitive code on an advanced processor, simply by modifying the
> code to take advantage of the features of the more advanced processor.
> Since not everyone has a 68020, it would be left up to the user to decide
> whether or not (s)he wants the code of a program modified in such a way.
> The optimizer could also be written to rewrite the program code so as
> to take advantage of a coprocessor, if one is present.
An optimizer cannot run on compiled binary - the problem is indeterminate. The
reason is simple: The optimizer cannot differentiate between data and code.
(And if it optimizes the data, the program dies.) Worse than that, even if it
knew that it was in code, it cannot unambiguously identify the beginnings of
instructions.

In any case, for anything more than a peephole optimizer, even the raw
assembly language isn't enough - the optimizer must be given information about
the original program.

An example is "common subtree" optimization in a complex
expression: To identify this is difficult and fraught with errors in the
assembly language. The right place to do it is in the parser when building the
expression tree.
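
A tiny made-up example of what is meant here: the parser can see that (b + c)
occurs twice in one expression tree and arrange to evaluate it once, something
that is much harder to recognize reliably in the generated instructions.

/* Common-subtree (common-subexpression) opportunity, obvious in the
   source but hard to spot in compiled code.                          */
double f(double a, double b, double c)
{
    return a * (b + c) + (b + c) / a;    /* (b + c) appears twice     */
}

/* What the tree-level optimization effectively produces.             */
double f_opt(double a, double b, double c)
{
    double t = b + c;                    /* computed once             */
    return a * t + t / a;
}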

Another example (probably closer to your heart) would be "Oh, I can use a 68881
instruction here". Unless your math library emulates the 68881 on a
one-call-per-one-instruction basis, you may not be able to figure out that a
68881 instruction will help - without reference to the original source.
(And if you are operating on binary, you may not know that this particular
subroutine call was a call to the math library, anyway!)

Peep-hole optimizers have their place, but the kind of optimization you are
asking for can't be done that way.

...and even if it could, I gather that you are thinking of this as a COMMERCIAL
thing - and what manufacturer will distribute the assembly language of their
program, complete with labels? (I gather that your idea was that someone buys a
commercial program intended for a 68000, passes it through the magic optimizer,
and all-of-a-sudden it is optimized for the 68020 with math chip.)


> Is there some fundamental problem with what is described above?  Is anyone
> working on such an optimizer?
> I welcome your comments and criticisms.
> 
> 	Johan Larson

It seems to me you've missed an easy answer to your problem: Use your
inefficient fast single-pass compiler while you develop, and when you are done
use the efficient slow multi-pass compiler for the production version.

Why do you need an optimizer at all? Use the one that's built into the
multi-pass compiler.
-- 
Steven C. Den Beste,   Bolt Beranek & Newman, Cambridge MA
denbeste@bbn.com(ARPA/CSNET/UUCP)    harvard!bbn.com!denbeste(UUCP)

cmcmanis%pepper@Sun.COM (Chuck McManis) (01/26/88)

In article <4776@watdragon.waterloo.edu> (Paul Larson) writes:
> Currently, single pass compilers are very popular, due to their speed.  This
> speed allows the development cycle to be shortened, sometimes by very
> significant amounts.  However, unless I'm very wrong, the speed of the
> single pass compiler is gained at the expense of code quality.  One 
> immediately obvious solution is a code optimizer, which could be used on a
> compiled program once the critical debugging is done. ...
>                                                           ...  Thus, if a 
> system is using the more advanced '020, it is using code which was written for
> a more primitive, though similar, processor and thus the code does not take
> proper advantage of the more advanced features of the more modern processors.
> The same argument applies for the math co-processors, the 68881 and its new
> sibling the 68882, which are rapidly becoming more popular on home computers.

Interesting point, Paul; a couple of comments:

On the Amiga, the Lattice C compiler optimizes to use code that is faster on
the '010 and '020 by using instructions, like DBRA, that take advantage of the
caching behaviour of those chips. Since this imposes no penalty on the 68000
code, it seems to be a good optimization choice.
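
For illustration only (this is not necessarily the code Lattice emits; the
routine name and the instruction sequence in the comment are hypothetical),
the kind of count-down loop that maps onto DBRA looks like this:

/* A loop shaped for DBRA: count down in a 16-bit register and let one
   instruction do the decrement, test, and branch.  A 68000-family
   compiler might emit roughly:
       MOVE.W  count,D0
   loop:
       CLR.B   (A0)+
       DBRA    D0,loop                                                 */
void zero_fill(char *p, short count)     /* assumes count >= 0         */
{
    short i;
    for (i = count; i >= 0; i--)         /* executes count+1 times     */
        *p++ = 0;
}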

And yes, incremental and single-pass compilers cannot implement some
optimization schemes, but they are fast at compiling, so they are really good
for prototyping. Thus I think a more reasonable solution would be to have,
in effect, two compilers living in the same executable: one an incremental
compiler, the other a multi-pass optimizing compiler. Then you offer the user
a choice of which mode to compile the program in. I believe this is the
approach that Microsoft has taken with Quick-C: they claim it is 100%
code-compatible with their regular C compiler, so that once you have a
working program you can recompile with the regular compiler to get optimized,
space-efficient code.


--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one elses, but you knew that didn't you.

nick@ccicpg.UUCP (Nick Crossley) (01/27/88)

In article <6252@cc5.bbn.COM> denbeste@bbn.COM (Steven Den Beste) writes:
>Johan Larson writes:
...
>> The Mac and Amiga worlds are currently in the process of changing from the
>> 68000 processor to the similar, but more powerful 68020.
...
>Now, I may be wrong about this, but the big advantage of the 68020 isn't so
>much new instructions (there are a few, but they aren't commonly used) but the
>fact that it executes the old instructions much faster - because it uses a
>32-bit internal data path and the 68000 uses a 16-bit path, so it takes twice
>as many internal cycles on the 68000 for a given amount of wide data.
>
...
>denbeste@bbn.com(ARPA/CSNET/UUCP)    harvard!bbn.com!denbeste(UUCP)

Another significant difference is that the 68020 has several additional
addressing modes, so that double indirection (ie for handles) and scaled
indexing for arrays, amongst other things, can be done in one instruction.
It also has 32-bit multiply and divide instructions, which the 68000 does
not.  This is significant if your integer size is 32 bits (as in MPW C).
A peephole optimiser could conceivably improve things here, but as Steven
Den Beste says, you would really need assembler input for this, not the
finished linked binary program.
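
As a made-up illustration of the operations those additions target (the
function names here are hypothetical):

/* On a 68000 each of these takes a multi-instruction sequence; on a
   68020 the memory-indirect and scaled-index addressing modes, and
   MULS.L, can shorten them considerably.                              */
long get_element(long **handle, long index)
{
    /* Double indirection (a Mac-style handle) plus an array index
       scaled by 4 bytes.                                              */
    return (*handle)[index];
}

long area(long w, long h)
{
    /* A full 32-bit multiply: one MULS.L on the '020, a run-time
       routine or multi-step sequence on the '000 when ints are
       32 bits (as in MPW C).                                          */
    return w * h;
}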

Nick Crossley, CCI CPG.
...!uunet!ccicpg!nick

gary@fairlight.oz (Gary Evesson) (01/27/88)

This is an idea that I thought about in the early days of the Mac. I think it's
a good idea, but convincing someone to do it is another question. Writing
optimisers is an art in itself - not an easy thing to do. As far as I can tell,
none of the manufacturers of compilers for these machines are interested
(I would welcome a contradiction from anyone). I guess the
trick is to apply enough market pressure for them to do something about the
quality of their code, which I agree is atrocious.

							gary@fairlight.oz
							Gary Evesson

rs4u+@andrew.cmu.edu (Richard Siegel) (01/27/88)

robertj@yale.UUCP (Rob Jellinghaus) writes:

>Does such a wonder compiler exist?  And if so, what is it?  MPW is not
>terribly fast, and LSC (at least the current version) is not renowned
>for its code generation.  (It's also single-pass.)  Will the next version
>of LSC be just such a super-compiler?  Or will it at least have an op-
>timization option, as the first poster suggested?

Unfortunately, I can't comment on future versions of the compiler right now... 
:-)

Sorry.

	--rs

===================================================================
Richard Siegel
THINK Technologies, QA Technician (on leave)

Arpa: rs4u@andrew.cmu.edu
UUCP: {decvax,ucbvax,sun}!andrew.cmu.edu!rs4u
==================================================================

bobc@oophost.UUCP (Bob Campbell) (01/28/88)

While I agree with Steven's conclusions about code optimization I find
that I must take exception to many of his comments about one-pass
compilers.
In article <6252@cc5.bbn.COM> denbeste@bbn.COM (Steven Den Beste) writes:
>...
>cheating, because the two passes of the assembler represent the second and
>third passes of your "one-pass" compiler.

Your analysis of what a "one-pass" compiler is is a little weak.  The
one-pass compilers that I am aware of do not generate assembler
output.  Why?  Because using the assembler removes all of the speed
advantage that was gained by using a "one-pass" compiler.  A true one-pass
compiler generates code in a single pass, using back-patching.  It
does require that the object code reside in memory.  This is the method
used by MacMETH, TML Modula-2, and most likely by the LightSpeed
products (but I can't say for sure about them).  (It should also be
noted that one-pass assemblers exist.)
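
A minimal sketch of the back-patching idea (hypothetical code, not taken from
MacMETH, TML, or LightSpeed): because the generated object code stays in
memory, a forward branch can be emitted with a dummy displacement and patched
once the target is reached.

#include <stdio.h>

static unsigned short code[1024];        /* in-memory object code      */
static int            pc = 0;            /* index of next free word    */

static int emit16(unsigned short w) { code[pc] = w; return pc++; }

int main(void)
{
    emit16(0x6700);                      /* BEQ, word displacement     */
    int fixup = emit16(0);               /* displacement unknown yet   */

    emit16(0x4E71);                      /* NOP - stands in for the    */
    emit16(0x4E71);                      /*       code being skipped   */

    /* Target reached: 68000 Bcc displacements are in bytes, relative
       to the word after the opcode, so patch (pc - fixup) * 2.        */
    code[fixup] = (unsigned short)((pc - fixup) * 2);

    printf("patched displacement: %u bytes\n", code[fixup]);
    return 0;
}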

>If your program won't fit (and it has to be the WHOLE program, too -
>you can't break it up unless you have a separate linker (which means
>two more passes)) then you are SOL.

This is not true.  MacMETH is a good example of a one-pass compiler
that directly produces loadable code.  By the nature of Modula-2,
it needs to have at most one module in memory at a time.  Granted,
this does not produce the smallest programs possible, but it is very
fast.  MacMETH, and I assume LightSpeed C, have dynamic loading, which
allows the compiled modules to be used together without requiring a
linker.  In effect each object file becomes a "code" resource (not
exactly like a Mac "CODE" resource, but similar), and the one-pass
compiler generates a jump table that the dynamic loader uses when
the program is loaded.
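
A rough sketch of the jump-table idea (this is not the actual MacMETH or
LightSpeed format; all names are invented): calls go through a table of
pointers that the dynamic loader fills in when the module is brought into
memory, so no separate link step is needed.

#include <stdio.h>

typedef void (*entry_fn)(void);

/* Compiled calls are indirect through this table; the loader patches
   the slots at load time instead of a linker doing it beforehand.     */
static entry_fn jump_table[8];

static void module_greet(void) { printf("hello from a loaded module\n"); }

static void dynamic_load(void)
{
    jump_table[0] = module_greet;        /* loader resolves entry 0    */
}

int main(void)
{
    dynamic_load();
    jump_table[0]();                     /* call via the jump table    */
    return 0;
}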

>The biggest advantage of a multi-pass compiler is that the available memory at
>compilation is almost unrelated to the size of the program being compiled (the
>only thing which is related is the symbol table size).

This is only partly true.  A multi-pass compiler removes many of the
memory limits placed on the compiler, which is important for an
optimizing compiler, where extra tables of information are needed.
However, many code optimizations require several passes over the code to
do their work.

>If a single-pass compiler operates in memory, there is no fundamental reason
>why it can't generate code just as efficient as a multi-pass compiler.

Read the "Dragon Book" especially the chapters on code optimization,
you will find that even if you keep all of the code in memory, you
will need to take several passes over that code, and therefore you
will be a "multi-pass" compiler.

>Now, I may be wrong about this, but the big advantage of the 68020 isn't so
>much new instructions (there are a few, but they aren't commonly used) but the
>fact that it executes the old instructions much faster...

I listened to someone from Motorola speak on this subject: while all
of the 68000 instructions are faster on an '020, the '020 has a number of
instructions and addressing modes that, if used, can speed up the code
by a noticeable amount.

>An optimizer cannot run on compiled binary - the problem is indeterminate. The
>reason is simple: The optimizer cannot differentiate between data and code.

From this point on things are basically clear: it is not reasonable, as
pointed out, to optimize binaries, for the simple reason that it is not
possible to tell what is code and what is data.  In many cases an
optimizer could be constructed (if it knew all of the details of how
the compiler generated its code), but it would most likely be much
more efficient to build the optimizer into the compiler.

One idea that has been tossed around is having the
compilers produce assembly language output and then optimizing that.
This would produce a language-independent optimizer (except that it
would depend on the assembly language); it would, however, require a
large effort....  In short, the solution provided by Steven seems to be
the most correct, assuming that the multi-pass compiler has a good
optimizer (which many don't):

>It seems to me you've missed an easy answer to your problem: Use your
>inefficient fast single-pass compiler while you develop, and when you are done
>use the efficient slow multi-pass compiler for the production version.

Bob Campbell

jwhitnel@csi.UUCP (Jerry Whitnell) (01/28/88)

In article <22039@yale-celray.yale.UUCP> robertj@yale.UUCP writes:
>
>Does such a wonder compiler exist?  And if so, what is it?  MPW is not
>terribly fast, and LSC (at least the current version) is not renowned
>for its code generation.  (It's also single-pass.)  Will the next version
>of LSC be just such a super-compiler?  Or will it at least have an op-
>timization option, as the first poster suggested?

I'm sure Rich will correct me if I'm wrong, but the next version of
LightspeedC will not have an optimizer, just a source-level debugger
and (I think) code generation for the 020/881.  Availability is sometime
before summer.  Please note I do NOT work for Think, and
all of the above is based on public comments by employees of Think.
In other words, I don't know nothing about nothing :-).

As another alternative, I'm looking into porting GNU C over to the Mac.
The code generation/optimization is supposed to be better than MPW's
(although I haven't seen it yet).  However, based on other GNU programs
I've worked with, this will be a very slow compiler, suited mostly for
final compiles, not debugging.  Porting this compiler is a non-trivial exercise
(an understatement if I ever heard one :-)), so don't hold your breath.
If I do decide to go ahead with the port, I'll be looking for others
to help, and I will keep you informed of the
progress.

Jerry Whitnell				Been through Hell?
Communication Solutions, Inc.		What did you bring back for me?
						- A. Brilliant

harald@ccicpg.UUCP ( Harald Milne) (01/28/88)

In article <4776@watdragon.waterloo.edu>, palarson@watdragon.waterloo.edu (Paul Larson) writes:
> The Mac and Amiga worlds are currently in the process of changing from the
> 68000 processor to the similar, but more powerful 68020.  The 68030 has also
> been rumoured to be lurking somewhere on the horizon.  However, regardless of
> whether a '000 or a '020 is installed, the same code is used. 

	Yes and no. There should be no difference in code generation on the
Amiga.

> Thus, if a 
> system is using the more advanced '020, it is using code which was written for
> a more primitive, though similar, processor and thus the code does not take
> proper advantage of the more advanced features of the more modern processors.

	Well, here is the no answer. The Amiga makes use of shared libraries,
not unlike the shared memory concept in System V UNIX. Just what advanced
features are you talking about? Cache management, the floating point
coprocessor interface, or what?

	Cache management is one thing, and the Amiga OS uses it (yes, the
cache is turned on) if a 68020 is detected. The OS does this at boot.

	In terms of floating point, things are very different here. Floating
point operations on the Amiga are not compiled "inline"; they are called.
If you have a 68000, you can use the standard software emulation (IEEE). If
you have the 68020 and have the 68881 or 68882 present (also detected at
boot), you can simply replace the math library and use 68881/68882
instructions directly. No recompiles, no changes; existing software is
upward compatible.

> The same argument applies for the math co-processors, the 68881 and its new
> sibling the 68882, which are rapidly becoming more popular on home computers.

	Uh, they are? What do you have? A Mac II? At a $4000.00 base price, this
revolution is not coming soon. We are talking about some expensive hardware!
But after what I read in the latest Unix/World, the 68000 family could fall
out of favor, meaning lower prices for us all. That could make it popular.

> The code optimizer, if properly written, could remove the inefficiency of
> using primitive code on an advanced processor, simply by modifying the
> code to take advantage of the features of the more advanced processor.

	Well you lost me completely here. What advantages are not being used?

> Since not everyone has a 68020, it would be left up to the user to decide
> whether or not (s)he wants the code of a program modified in such a way.
> The optimizer could also be written to rewrite the program code so as
> to take advantage of a coprocessor, if one is present.

	No rewriting needed for the Amiga.

	The Amiga handles all this.

> Is there some fundamental problem with what is described above?  Is anyone
> working on such an optimizer?
> I welcome your comments and criticisms.

	Just what I mentioned so far.

> 	Johan Larson
-- 
Work: Computer Consoles Inc. (CCI), Advanced Development Group (ADG)
      Irvine, CA (RISCy business! Home of the CCI POWER 6/32)
UUCP: uunet!ccicpg!harald

dgold@apple.UUCP (David Goldsmith) (01/29/88)

In article <10000@ccicpg.UUCP> nick@ccicpg.UUCP (Nick Crossley) writes:
>Another significant difference is that the 68020 has several additional
>addressing modes, so that double indirection (ie for handles) and scaled
>indexing for arrays, amongst other things, can be done in one instruction.

Actually, it turns out it's both smaller and faster to dereference a handle
with two separate instructions, just like on the 68000.  The single 68020
instruction is larger and slower than the two separate 68000 instructions.
-- 
David Goldsmith
Apple Computer, Inc.

AppleLink: GOLDSMITH1
UUCP:  {nsc,dual,sun,voder,ucbvax!mtxinu}!apple!dgold
CSNET: dgold@apple.CSNET, dgold%apple@CSNET-RELAY
BIX: dgoldsmith

dillon@CORY.BERKELEY.EDU (Matt Dillon) (01/29/88)

:	In terms of floating point, things are very different here. Floating
:point operations on the Amiga are not compiled "inline", they are called.
:If you have a 68000, you can use the standard software emulation (IEEE). If
:you have the 68020, and have the 68881 or 68882 present (also detected on boot)
:you can simply replace the math library, and use 68881/68882 instructions
:directly. No recompiles, no changes, existing software is upward compatible.

	What do you mean floating point operations are not compiled inline?
The *real* answer is that you can have it both ways... inline if you don't 
care about downward compatibility to a 68000, and via shared library calls
if you want downward compatibility (i.e. 68020 machines would have a different
library implementing the same functions).

:> The same argument applies for the math co-processors, the 68881 and its new
:> sibling the 68882, which are rapidly becoming more popular on home computers.
:
:	Uh, they are? What do you have? A MacII? At a $4000.00 base price, this
:revolution is not coming soon. We are talking about some expensive hardware!

	Huh?  What kind of answer is that?  "Since I don't use it because it
costs too much, nobody else uses it." ... give me a break!

:> The code optimizer, if properly written, could remove the inefficiency of
:> using primitive code on an advanced processor, simply by modifying the
:> code to take advantage of the features of the more advanced processor.
:
:	Well you lost me completely here. What advantages are not being used?

	I can think of two right off.  (1) The 68020 bit field instructions,
and (2) the 68020's extra addressing modes that allow for automatic
premultiplication of the index (read: array indexing in one instruction).
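
For example (a made-up routine, not from any particular compiler), a bit-field
extraction like the one below needs a shift-and-mask sequence on the 68000,
while the 68020's BFEXTU can do the job in one instruction:

/* Extract `width` bits (1..31) starting at bit `offset` of a 32-bit
   value.  The 68020 bit-field instructions handle this directly; the
   68000 needs explicit shifts and masks.                              */
unsigned long extract_bits(unsigned long value, int offset, int width)
{
    return (value >> offset) & ((1UL << width) - 1UL);
}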

					-Matt

haitex@pnet01.cts.com (Wade Bickel) (01/29/88)

gary@fairlight.oz (Gary Evesson) writes:
>This is an idea that I thought about in the early days of the Mac. I think it's
>a good idea, but convincing someone to do it is another question. Writing
>optimisers is a art in itself - not an easy thing to do. It doesn't seem that
>any of the manufacturers of compilers for these machines are interested
>(I would welcome a contradiction from anyone) as far as I can tell. I guess the
>trick is to apply enough market pressure for them to do something about the
>quality of their code, which I agree is atrocious.

        Leon Frenkel's Benchmark Modula-2 compiler for the Amiga is fairly
well optimised.  I've had a few long discussions with him as to how to write
optimised source, and it is surprising the degree to which the coder can be
sloppy in his structuring and still get the same results.  Still, it is
essential that certain coding guidelines be followed when possible (such as
creating optimized mathematical expressions).

        I've been trying to convince Leon to implement some kind of
programmer-controlled register allocation, and his counter-argument has been
that it is non-standard, and that in most cases the compiler makes better
overall register allocations than programmers would.  A strong supporting
argument on his side is that if the coder should change previously optimized
code, the compiler will automatically re-optimize, whereas the programmer
often would not.
        Nonetheless, I think at some point he will include programmer
register allocation via compiler directives, which I suggested (as it avoids
most portability problems) - eventually...


                                                        Thanks,


                                                            Wade.


UUCP: {cbosgd, hplabs!hp-sdd, sdcsvax, nosc}!crash!pnet01!haitex
ARPA: crash!pnet01!haitex@nosc.mil
INET: haitex@pnet01.CTS.COM

rs4u+@andrew.cmu.edu (Richard Siegel) (01/29/88)

gary@fairlight.oz (Gary Evesson) writes:

"It doesn't seem that any of the manufacturers of compilers for these machines 
are interested (I would welcome a contradiction from anyone) as far as I can 
tell."

	You're going to get your contradiction. :-)

THINK Technologies, for one, is interested in producing high-quality 
compilers. However, you must realize the following: when producing a 
revolutionary language system such as LightspeedC or Lightspeed Pascal, it's 
of the first order of importance to release a product that works correctly: 
there should be as few bugs as possible in the environment and in the 
compiler.
	Writing an optimizing code generator is, as you say, an art. It's damned 
difficult, and it takes a long time to debug such an animal. Had we chosen to 
write LightspeedC or Lightspeed Pascal as multi-pass optimizing compilers, the 
release of both products very well could have been delayed by some 
(indefinite) amount of time. And these products might or might not have had 
compiler bugs.

	As it is now, both compilers have a minimum of codegen bugs, and both work 
fairly reliably. Since I'm no longer on the spot at THINK, I don't know what 
kind of code-generation work is going on, but I can tell you that we're 
committed to producing high-quality compilers for the Mac.

(I know, it sounds like an advertising plug, but it's true. :-))

		--Rich

===================================================================
Richard Siegel
THINK Technologies, QA Technician (on leave)

Arpa: rs4u@andrew.cmu.edu
UUCP: {decvax,ucbvax,sun}!andrew.cmu.edu!rs4u
==================================================================

rs4u+@andrew.cmu.edu (Richard Siegel) (01/30/88)

jwhitnel@csi.UUCP (Jerry Whitnell) writes:

"I'm sure Rich will correct me if I'm wrong but the next version of
LightspeedC will not have an optimizer, just a source-level debugger
and (I think) code generation for 020/881.  Availability sometime
before summer."

Actually, I won't correct you, even if you *are* wrong. I honestly don't know 
what code-gen changes will be made, and even if I did, I'd likely develop 
laryngitis before I could tell anyone. :-)


	--Rich

===================================================================
Richard Siegel
THINK Technologies, QA Technician (on leave)

Arpa: rs4u@andrew.cmu.edu
UUCP: {decvax,ucbvax,sun}!andrew.cmu.edu!rs4u
==================================================================

dwb@apple.UUCP (David W. Berry) (02/02/88)

In article <8801281857.AA09606@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>:	In terms of floating point, things are very different here. Floating
>:point operations on the Amiga are not compiled "inline", they are called.
>:If you have a 68000, you can use the standard software emulation (IEEE). If
>:you have the 68020, and have the 68881 or 68882 present (also detected on boot)
>:you can simply replace the math library, and use 68881/68882 instructions
>:directly. No recompiles, no changes, existing software is upward compatible.
>
>	What do you mean floating point operations are not compiled inline?
>The *real* answer is that you can have it both ways... inline if you don't 
>care about downward compatibility to a 68000, and via shared library calls
>if you want downward compatibility (i.e. 68020 machines would have a different
>library implementing the same functions).
	The same thing is true of the Macintosh, if you regard the trap
mechanism as a "shared library" - which it really is.  Most of the existing
68020 upgrades for the SE replace the standard traps (which run on a 68000,
and even faster on an '020) with direct calls to the '881.  You lose a little
to the extra trap overhead, but you do with shared libraries too.
-- 
	David W. Berry
	dwb@well.uucp                   dwb@Delphi
	dwb@apple.com                   973-5168@408.MaBell
Disclaimer: Apple doesn't even know I have an opinion and certainly
	wouldn't want if they did.

dillon@CORY.BERKELEY.EDU (Matt Dillon) (02/02/88)

>	The same thing is true of the Macintosh if you regard the trap
>mechanism as a "shared library" which it really is.  Most of the existing
>68020 upgrades for the SE replace the standard traps (which run on a 68000
>and even faster on an 020) with direct calls to the 881.  You lose a little
>to the extra trap overhead, but you do with shared libraries too.

	I suppose one could call the Mac traps shared libraries.  The
overhead of a trap, though, is too much for my tastes.  Of course *nothing*
beats inline code, not even an optimal Amiga library call (jsr offset(Ax)/rts).

				-Matt

daveb@geac.UUCP (David Collier-Brown) (02/10/88)

In article <312@oophost.UUCP> bobc@oophost.UUCP (Bob Campbell) writes:
|... analysis of what a "one-pass" compiler is is a little weak.  The
|one pass compilers that I am aware of do not generate assembler
|output.  Why? because using the assembler removes all of the speed
|advantages that were gained by using "one-pass" compiler.  
|
|>An optimizer cannot run on compiled binary - the problem is indeterminate. The
|>reason is simple: The optimizer cannot differentiate between data and code.

  Er... It can run on the output of the code generator if that
output is in something close to relocatable format, or if it is in
linkable format (a Unix .o, for example).  It just requires more work.
  If most suppliers wrote linkable code in a standard format,
writing a post-optimizer would become more worthwhile...

 --dave
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind) 
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.