[comp.sys.amiga.tech] Another question on the 80286- er, 68000 memory models

page@ulowell.UUCP (05/18/88)

It's the 16-bit vs 32-bit int distinction that produces the far/near
large/small models.  I agree, it's yukky.

rminnich@udel.EDU (Ron Minnich) wrote:
>The amiga has shared libraries and device code. So, what model
>do the libraries use?

32-bit ints.  Call that far/large I guess.

>Whoever decided on adding 80286-style models to the 68000- i am gonna
>send SCA after you.

I think it was Manx's idea; they had all those different libraries
a long time ago.

..Bob
-- 
Bob Page, U of Lowell CS Dept.  page@swan.ulowell.edu  ulowell!page

rminnich@udel.EDU (Ron Minnich) (05/18/88)

OK, i have had it explained to me (thanks!) by someone that my worst
fears were true, that i was not somehow misinterpreting things, and
that lattice really does have different compile-time libraries
for their different models. ouch.
   Now there are two code models (let's call them small and large).
There are two data models. There are four combinations. Let's call
them tiny, small, medium, and large (i am a mean guy, aren't i?- 
making a 68000 look like the brain-dead 286!)
   Now the Amiga is not some pea-brained PC running Messy-DOS.
The amiga has shared libraries and device code. So, what model
do the libraries use? Are they tiny, small, or what? Or do we 
need four different versions of each library/device for each
model that uses them? See where this stuff leads? I am
feeling a little depressed about this. Even if we settle on one
model, then everybody has to use that model, or they are 
not compatible. We now have silly limitations like, say,
Mac DAs (desk accessories). Please, somebody, say it isn't so, then tell
me why. Please, please, please.
   Whoever decided on adding 80286-style models
to the 68000- i am gonna send SCA after you.

-- 
ron (rminnich@udel.edu)

rminnich@udel.EDU (Ron Minnich) (05/20/88)

In article <7146@swan.ulowell.edu> page@swan.ulowell.edu (Bob Page) writes:
>>what model do the libraries/devices use?
>32-bit ints.  Call that far/large I guess.
Hmm, so what does that mean? That small-model programs that call 
the libraries may occasionally guru in strange ways? (e.g. with 
pointers and (hypothetical) function parameters)
I am more confused now.
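
For instance (a made-up sketch, not tried -- hypothetical names,
but this is the shape of the bug i am afraid of):

        /* this file compiled with 16-bit ints, linked against a
           stub library built for 32-bit ints: */
        extern void LibFunc();

        void caller()
        {
            int n = 10;
            LibFunc(n);        /* pushes 2 bytes, the stub reads 4:
                                  garbage argument, maybe a guru */
            LibFunc((long)n);  /* an explicit long is safe either way */
        }
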
>I think it was Manx's idea; they had all those different libraries
>a long time ago.
ah, that means lattice did it right first, then screwed up :-(
-- 
ron (rminnich@udel.edu)

rminnich@udel.EDU (Ron Minnich) (05/20/88)

In article <8805182223.AA20918@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>	I.E. *ANY* run-time (shared) library cannot assume that A4 will
>point to its data when it is called, period... it's one of those things
That is not the sort of thing i am worried about. internet.device
munges a4, for example. But what about function parameters? 
>and restore it before returning.  Small-code doesn't enter in the picture
>at all because it uses the relative(pc) format which doesn't depend on
>any (standard) register.
ah, so the lattice -r1 switch is useless for the amiga? Sounds like
if you use -r1, then you should not use libraries? So now the
problem is reduced to dealing with the medium and large models,
and the sort of falls-between-the-holes type questions which 
i am not 100% sure the rules you cite will always address. At least,
they haven't on xenix kernels ...
   No, i have no problem with your discussion. I knew all that 
stuff anyway. But the fun i am having with xenix kernel hacking
has me a little worried, even though i have not been burned yet ...
For example, i put a pointer into a structure. Now if i am
in the small data mode, then maybe what gets put in is 16 bits. 
Then i go to the library which is large model, ... bang!
   Anyway, i will try to work up a more concrete example and see
if this scenario is possible. I sure hope not. And i have not
seen it yet, or been burned by it, but it bothers me somehow.
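
Roughly the shape of the worry (made-up code, not tried -- and
maybe it is the int members rather than the pointers that change
size):

        struct args {
            int  count;        /* 2 bytes under 16-bit ints ... */
            char *buf;         /* ... but always 4 bytes */
        };

        extern void LibCall(); /* imagine this built with 32-bit ints */

        void worry()
        {
            struct args a;
            LibCall(&a);       /* the callee reads 4 bytes of count and
                                  then looks for buf at the wrong
                                  offset ... bang! */
        }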

-- 
ron (rminnich@udel.edu)

rokicki@polya.STANFORD.EDU (Tomas G. Rokicki) (05/21/88)

>    No, i have no problem with your discussion. I knew all that 
> stuff anyway. But the fun i am having with xenix kernel hacking
> has me a little worried, even though i have not been burned yet ...
> For example, i put a pointer into a structure. Now if i am
> in the small data mode, then maybe what gets put in is 16 bits. 
> Then i go to the library which is large model, ... bang!
>    Anyway, i will try to work up a more concrete example and see
> if this scenario is possible. I sure hope not. And i have not
> seen it yet, or been burned by it, but it bothers me somehow.

Let's get this all straight.  There are three independent things
that affect the `model' on the Lattice and Manx compilers:

	* data addressing.  This can either be 32-bit absolute
	  or 16-bit relative.  *Pointers are 32-bits in either
	  case*.  In the latter case, you are only allowed
	  64K of data.  You can mix these two models, so long as
	  any data used by the 16-bit relative stuff is within
	  the first 64K of the data segment (right?)  You should
	  always use 16-bit relative, and if your code has more
	  than 64K of static data, malloc it anyway.  Please.
	  It's real simple to do.  Another problem is in interrupt
	  code, where you need to set up the base register with
	  a call to _geta4() in Manx, and something similar in
	  Lattice (I've forgotten what it is) at the very
	  beginning (see the sketch after this list).

	* code addressing (jsr's).  These can either be 32-bit
	  absolute or 16-bit relative.  Again, you can mix
	  models.  If using 16-bit relative, then you have
	  a small problem with branches that are farther than
	  +/- 32K.  Manx solves this by indirecting off an
	  automatically created vector list in the data portion
	  of the program.  Lattice does something similar (but
	  I seem to be having troubles with it somehow . . .)
	  Always use 16-bit relative; no sense not to, since
	  the long ones are taken care of automatically, and
	  smaller code size is usually good.  Again, in
	  interrupt code, you need to set up a4 to get access
	  to these fake long branches in 16-bit offset mode.

	* int size.  Both Manx and Lattice allow 16 and 32-bit
	  ints.  Again, you can mix models, but be *very*
	  careful when you do so, and pull in the correct
	  libraries, due to the way args are passed on the
	  stack and the way printf/scanf work.  You should
	  use one or the other and stick with it.
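
Here's that interrupt-code sketch I promised above (Manx flavor,
from memory, so treat the details as approximate):

        extern void geta4();    /* supplied by Manx */
        long ticks;             /* a (hypothetical) small-data global */

        void myserver()         /* imagine this hung off an interrupt */
        {
            geta4();            /* load A4 first, so 16-bit relative
                                   data references can find ticks */
            ticks++;
        }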

In summary?  The only thing that really matters is int size, and that
you use the correct libraries for the int size you are working with.
The other models should mix freely, assuming you don't have more than
64K of statically allocated data that pushes the compiler's data out
of range.
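
To make the int-size trap concrete, a sketch -- imagine this
compiled with 16-bit ints but linked against the 32-bit-int
library by mistake:

        #include <stdio.h>

        int main()
        {
            int n = 5;
            printf("%d\n", n);         /* caller pushes a 2-byte int;
                                          a 32-bit-int printf pops 4 */
            printf("%ld\n", (long)n);  /* an explicit long matches %ld
                                          in either model */
            return 0;
        }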

Someone please correct me where I'm wrong.

-tom
-- 
    /-- Tomas Rokicki         ///  Box 2081  Stanford, CA  94309
   / o  Radical Eye Software ///                  (415) 326-5312
\ /  |  . . . or I       \\\///   Gig 'em, Aggies! (TAMU EE '85)
 V   |  won't get dressed \XX/ Bay area Amiga Developer's GroupE

michael@stb.UUCP (Michael) (05/26/88)

In article <2869@polya.STANFORD.EDU> rokicki@polya.Stanford.EDU (Tomas G. Rokicki) writes:
>	  Always use 16-bit relative; no sense not to, since
>	  the long ones are taken care of automatically, and
>	  smaller code size is usually good.  Again, in

IF you are passing pointers to functions and using them for comparisons,
THEN you should (must?) use LARGE CODE model. (At least for large programs)

Reason: Otherwise, you will be comparing the address of the jump table
in some cases with the address of the routine itself in other cases.
(The pointer will still be 32 bits, but if you are close to the routine
you will get a pointer to the code, while if you are far, you will get a
pointer to the jump table. At least this was true in the past, and I think
it's still true; never did check recently.)
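
A sketch of the hazard (hypothetical names; both files compiled
16-bit relative, one ending up near the routine and one far from
it):

        /* file1.c -- linked near handler() */
        extern void handler();
        void (*near_ptr)() = handler;  /* may resolve to the code */

        /* file2.c -- linked far from handler() */
        extern void handler();
        void (*far_ptr)() = handler;   /* may resolve to the jump
                                          table entry instead */

        /* so (near_ptr == far_ptr) can be false, even though both
           were assigned handler */
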
: --- 
: Michael Gersten			 uunet.uu.net!denwa!stb!michael
:				 ihnp4!hermix!ucla-an!denwa!stb!michael
:				sdcsvax!crash!gryphon!denwa!stb!michael
: "Machine Takeover? Just say no."
: "Sockets? Just say no."     <-- gasoline

peter@sugar.UUCP (Peter da Silva) (06/01/88)

In article ... rokicki@polya.STANFORD.EDU (Tomas G. Rokicki) writes:
> Let's get this all straight.  There are three independent things
> that affect the `model' on the Lattice and Manx compilers:

> 	* data addressing.  This can either be 32-bit absolute
> 	  or 16-bit relative.  *Pointers are 32-bits in either
> 	  case*.

This sounds more like a job for Mr. Optimiser than something the programmer
should have to deal with. You would have as many or as few base registers
as you needed to get quick access to the data, allocated and set up as needed
within the procedure, block, or even statement for which it's appropriate.

No need for a memory model, unless you're used to doing things the intel way.
The whole A4 business is an obvious copy of 8086 small data models.

Use it, but keep on your vendor's tail about doing it right.

> 	* code addressing (jsr's.)  These can either be 32-bit
> 	  absolute or 16-bit relative.  Again, you can mix
> 	  models.

And again, this is an optimisation that should be applied to all code. The
UNIX *assembler* on the PDP-11 did this: you just specified that you were
branching, and it'd generate either the short branch or the long branch
depending on how far away the destination was. Surely a compiler can do this
within a routine and for calls to static globals.

Again, no need for a memory model. Use it, but remember to let your vendor
know that they still need an optimiser.

> 	* int size.  Both Manx and Lattice allow 16 and 32-bit
> 	  ints.

This is the only case that I know of for which the concept of "models" is
appropriate. This is also the case that bites everyone. I would recommend
32-bit mode, for four reasons:

	1) It'll be more efficient when we're using 68020s.
	2) It makes it easier to port VAX-style code that assumes (sizeof int)
	   is the same as (sizeof char *).
	3) The Amiga libraries take 32-bit ints.
	4) Manx did a half-assed implementation of the X3J11 draft: no
	   function prototyping. When they wake up and put it in then the
	   whole thing will become a non-issue.

Point 1 is a bit of a fake. After all, they still don't align the stack
right for 32-bit memory.
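
Point 2 in action (a sketch of the kind of code I mean):

        void vaxism(p)
        char *p;
        {
            int i = (int)p;    /* harmless when sizeof(int) ==
                                  sizeof(char *); truncates to 16 bits
                                  under 16-bit ints */
            p = (char *)i;     /* ... and now p is garbage */
        }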

> Someone please correct me where I'm wrong.

You're not wrong, just too willing to make allowances for 8086 mentality.

#ifdef HUMOR
The question is: how do you get on Manx' tail? Manxes don't have any tails :->.
And what sort of beast is a Lattice anyway?
#endif

What do the Modulans do?
-- 
-- Peter da Silva      `-_-'      ...!hoptoad!academ!uhnix1!sugar!peter
-- "Have you hugged your U wolf today?" ...!bellcore!tness1!sugar!peter
-- Disclaimer: These may be the official opinions of Hackercorp.

mwm@eris.berkeley.edu (Mike (I'm in love with my car) Meyer) (06/03/88)

In article <2069@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
<> 	* data addressing.  This can either be 32-bit absolute
<> 	  or 16-bit relative.  *Pointers are 32-bits in either
<> 	  case*.
<
<This sounds more like a job for Mr. Optimiser than something the programmer
<should have to deal with.

Yes, but - the optimiser may not know the answer when it's generating
code. You see, the problems tend to be with globally allocated data,
whose address isn't known until link time. So what's an optimiser
supposed to do?

<> 	* code addressing (jsr's.)  These can either be 32-bit
<> 	  absolute or 16-bit relative.  Again, you can mix
<> 	  models.
<
<And again, this is an optimisation that should be applied to all code. The
<UNIX *assembler* on the PDP-11 did this: you just specified that you were
<branching, and it'd generate either the short branch or the long branch
<depending on how far away the destination was. Surely a compiler can do this
<within a routine and for calls to static globals.

Note: the PDP-11 "as" handled _branches_, not subroutine jumps. So it
knew both the source & destination of the branch. If your 68K compiler
doesn't handle internal branches as described above, get a smarter
compiler!

On the other hand, subroutine jumps could be to routines that you
don't have an address for yet (they may not even have been written yet
:-). So once again, the optimiser is missing some important data.

<> 	* int size.  Both Manx and Lattice allow 16 and 32-bit
<> 	  ints.
<
<	4) Manx did a half-assed implementation of the X3J11 draft: no
<	   function prototyping. When they wake up and put it in then the
<	   whole thing will become a non-issue.

Lattice did a quarter-assed job. They got function prototypes into
4.0. They also got hooks in to generate calls to libraries inline,
instead of through a stub, and tied the latter to the former in their
distributed files. If you ask for prototypes, you get the inline code.
They could be separated, but who would not want either one of them?

It's not clear that the "models" on the 68000 should be called
"models". After all, you can mix code from between all the models
without too much trouble. This isn't really true on the intel
processors.

	<mike

--
The road is full of dangerous curves			Mike Meyer
And we don't want to go too fast			mwm@berkeley.edu
We may not make it first				ucbvax!mwm
But I know we're going to make it last.			mwm@ucbjade.BITNET

toebes@sas.UUCP (John Toebes) (06/04/88)

Let me comment ahead of time.  I think that we are off base calling the
compiler options 'memory models', but read on...
In article <2069@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>In article ... rokicki@polya.STANFORD.EDU (Tomas G. Rokicki) writes:
>> Let's get this all straight.  There are three independent things
>> that affect the `model' on the Lattice and Manx compilers:
>> 	* data addressing.  This can either be 32-bit absolute
>> 	  or 16-bit relative.  *Pointers are 32-bits in either
>> 	  case*.
>This sounds more like a job for Mr. Optimiser than something the programmer
>should have to deal with. You would have as many or as few base registers
>as you needed to get quick access to the data, allocated and set up as needed
>within the procedure, block, or even statement for which it's appropriate.
>
>No need for a memory model, unless you're used to doing things the intel way.
>The whole A4 business is an obvious copy of 8086 small data models.
Wrong.  The A4 business is straight out of the Motorola data books from
looking at the timings and sizes.  The concept of a single register to address
data comes from many machines including Apollo and Macintosh - both 68000
machines.

Using A4 as a base register has other implications that are much
more important.  Because of the C language definition (and even Modula2)
you do not have any idea of the total amount of data that a program might
contain.  Given this, the code must be generated knowing AHEAD of time
how big the data is going to be.  The ONLY real optimization that can be
done to do this without the programmer giving information ahead of time is
to delay generation of code until link time.  There are indeed articles
discussing this type of optimization, but it is far from acceptable in
performance for a simple piece of information.  The word here is 'TRADEOFF':
you EITHER have the programmer specify that he has more than 64K of data
or less than 64K of data and the compiler does its work from that, OR
you have the compiler figure it out by trying to compile everything one
way and then let the linker REDO all the work later on (as expensive if not
more so than recompiling everything a second time).

Perhaps you should spend some time studying the subject first.  I have
done quite a bit of research into what can be done on optimizing code -
4.0 should be evidence of that.  Optimizers only tend to clean up BAD and
sloppy code; the best improvements in speed of a program come from
ALGORITHMIC improvements and general code generation strategies.
Compiler technology being researched and written about currently includes
more global application-wide optimizations that take advantage of profiling
and inter-procedural data flow analysis.  These are not cheap optimizations;
figures of 100 lines per minute occur in the literature frequently.
>Use it, but keep on your vendor's tail about doing it right.
Which means encourage them to do exactly as they are.  Unless you are
advocating INCREASING compilation (actually link) time.  I was under the
impression that people don't want to wait for the compiler - no matter
how good the code it can generate.  Besides, this is one switch that has a global
positive impact for a simple user choice.
>> 	* code addressing (jsr's.)  These can either be 32-bit
>> 	  absolute or 16-bit relative.  Again, you can mix
>> 	  models.
>
>And again, this is an optimisation that should be applied to all code. The
>UNIX *assembler* on the PDP-11 did this: you just specified that you were
>branching, and it'd generate either the short branch or the long branch
>depending on how far away the destination was. Surely a compiler can do this
>within a routine and for calls to static globals.
As it is by default for Lattice.  However, you seem to neglect the fact that
there are times when the code is large enough that the fixups take up
more space than using the long branches.  Unlike Manx, Lattice does not treat
this as a 'model', allowing you to freely mix the code without any concern.
The linker will patch up any addresses out of range, BUT the user must be
able to disable this if it gets to be too unwieldy.
>> 	* int size.  Both Manx and Lattice allow 16 and 32-bit
>> 	  ints.
>
>This is the only case that I know of for which the concept of "models" is
>appropriate. This is also the case that bites everyone.
'models' is the wrong word here.  A model IMPLIES incompatibility between
the choices for compilation options.  With the Lattice, you can quite readily
mix ALL of the code generation options within the same compilation.  The
only concern comes when calling between short- and long-integer default code,
where you must ensure that the correct parameterization is used.
>> Someone please correct me where I'm wrong.
>You're not wrong, just too willing to make allowances for 8086 mentality.
The only 8086 mentality here is trying to equate the models.  What we have done
with the Lattice 4.0 compiler is completely unrelated to any other machine.
Each code generation option was designed to take advantage of a feature
pointed out in the MOTOROLA data books.  If we had an 8086 mentality, you
wouldn't be able to mix the code generation options so freely (I can't comment
for any other compiler vendors).  Also, we would have to whack off half
the pointers to give the extra level of frustration :-)
Are you suggesting that we spend more time trying to mimic the 8086? ;-)

/*---------------------All standard Disclaimers apply---------------------*/
/*----Working for but not officially representing SAS or Lattice Inc.-----*/
/*----John A. Toebes, VIII             usenet:...!mcnc!rti!sas!toebes-----*/
/*------------------------------------------------------------------------*/

peter@sugar.UUCP (Peter da Silva) (06/04/88)

In article ... mwm@eris.berkeley.edu (Mike Meyer) writes:
> In article <2069@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
> < [16 or 32-bit data addressing]

> <This sounds more like a job for Mr. Optimiser than something the programmer
> <should have to deal with.

> You see, the problems tend to be with globally allocated data,
> whose address isn't known until link time. So what's an optimiser
> supposed to do?

If you're using the variable frequently in a routine (say, in a loop) it should
store the address of the variable, or the base address of the array, in a
register. If not, then who cares how you get to it?

That's why I was talking about having registers allocated on the fly, instead
of dedicating A4 to ALL global data.
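
Something like this (a sketch):

        extern long table[];           /* hypothetical global array */

        void clear(n)
        int n;
        {
            register long *t = table;  /* load the base address once... */
            while (n-- > 0)
                *t++ = 0;              /* ...then run off the register */
        }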

> < [16 or 32-bit code addressing]

> <And again, this is an optimisation that should be applied to all code...
> <[story about UNIX assembler optimising branches]

> Note: the PDP-11 "as" handled _branches_, not subroutine jumps.

True. But I'd hope that a compiler would be smarter than an assembler. I'm
not happy that it's just as smart and no smarter.

> On the other hand, subroutine jumps could be to routines that you
> don't have an address for yet (they may not even have been written yet
> :-). So once again, the optimiser is missing some important data.

Not at all. You apply this optimisation to the routines you know are local:
that will be in the same load module. You're going to want to make long jumps
outside that load module anyway, since it could be scatter-loaded anywhere.
You *do* want to scatter load: it can make the difference between having your
code run or not when memory is tight.

It's interesting to note that the Sun software for their RISC box does
code modifications like these in the linker.
-- 
-- Peter da Silva      `-_-'      ...!hoptoad!academ!uhnix1!sugar!peter
-- "Have you hugged your U wolf today?" ...!bellcore!tness1!sugar!peter
-- Disclaimer: These may be the official opinions of Hackercorp.

toebes@sas.UUCP (John Toebes) (06/06/88)

In article <2085@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
>If you're using the variable frequently in a routine (say, in a loop) it should
>store the address of the variable, or the base address of the array, in a
>register. If not, then who cares how you get to it?
>
>That's why I was talking about having registers allocated on the fly, instead
>of dedicating A4 to ALL global data.
Have you tried what you are suggesting?  I have.  It is a general problem on
the 370 for which I have worked on a C compiler.  With the general philosophy
of C code you come out BEHIND.  Just take a second and THINK about it.

Externals must be addressed in some manner.  If you are going to get the
address of the external, it will cost you 6 bytes to load it into the
register (2 bytes for the instruction plus 4 bytes for the relocated address).
Now assuming that on the average you address 5 external variables in a
subroutine, that is a minimum of 30 bytes PER SUBROUTINE of additional
overhead that cannot be eliminated in any way.  Without a GLOBAL PROGRAMWIDE
optimizer, this cost cannot be spread out among modules, resulting in this
cost being paid by almost every subroutine.  (If you want such an optimizer, I
hope you have a long time to wait for it to finish on any reasonably sized
program.)

Note that in general references to EXTERNALS are sparse such that
only one or two modifications/references occur in a module.  With a program
that is written to be reentrant, this issue is completely moot BECAUSE there
are no externals.

Again, I suggest that you research your subject.  If you had, you would
recognize the other areas that could benefit compilers in general.  I spend
a lot of effort and time in improving our compiler technology by looking at
real live user programs and general coding philosophies.  When you see what
type of code is really written you get a feel for what makes sense to
implement and what is theoretical hogwash.

>Not at all. You apply this optimisation to the routines you know are local:
>that will be in the same load module. You're going to want to make long jumps
>outside that load module anyway, since it could be scatter-loaded anywhere.
>You *do* want to scatter load: it can make the difference between having your
>code run or not when memory is tight.
What you are referring to is already in BLINK in a general form that does
not limit itself to just compiler generated code. ALVs are constructed to
perform any necessary bridging.  The compiler has always generated short
calls for subroutine branches.

/*---------------------All standard Disclaimers apply---------------------*/
/*----Working for but not officially representing SAS or Lattice Inc.-----*/
/*----John A. Toebes, VIII             usenet:...!mcnc!rti!sas!toebes-----*/
/*------------------------------------------------------------------------*/

mwm@eris.berkeley.edu (Mike (I'm in love with my car) Meyer) (06/07/88)

In article <2085@sugar.UUCP> peter@sugar.UUCP (Peter da Silva) writes:
<> You see, the problems tend to be with globally allocated data,
<> whose address isn't known until link time. So what's an optimiser
<> supposed to do?
<
<If you're using the variable frequently in a routine (say, in a loop) it should
<store the address of the variable, or the base address of the array, in a
<register. If not, then who cares how you get to it?

In that case, I'd much rather have the _variable_ in a register!
Trouble is, you have to spill it back to memory for function calls and
references through pointers. Maybe I should just declare it noalias
:-)>. (there's your beard, Peter!)

<> On the other hand, subroutine jumps could be to routines that you
<> don't have an address for yet (they may not even have been written yet
<> :-). So once again, the optimiser is missing some important data.
<
<Not at all. You apply this optimisation to the routines you know are local:
<that will be in the same load module. You're going to want to make long jumps
<outside that load module anyway, since it could be scatter-loaded anywhere.

This is roughly what Lattice/BLINK does now, as John Toebes already
pointed out.

<It's interesting to note that the Sun software for their RISC box does
<code modifications like these in the linker.

Not really. Putting stuff like this in the linker makes a lot of sense
- so long as you don't lose all the advantages in doing it. Somebody
at Sun almost certainly knew that, and made sure that didn't happen on
the SPARC.

	<mike
--
When all our dreams lay deformed and dead		Mike Meyer
We'll be two radioactive dancers			mwm@berkeley.edu
Spinning in different directions			ucbvax!mwm
And my love for you will be reduced to powder		mwm@ucbjade.BITNET

cg@myrias.UUCP (Chris Gray) (06/10/88)

>Article 1035 of comp.sys.amiga.tech:
>From: peter@sugar.UUCP (Peter da Silva)
>In article ... rokicki@polya.STANFORD.EDU (Tomas G. Rokicki) writes:
>> Let's get this all straight.  There are three independent things
>> that affect the `model' on the Lattice and Manx compilers:
>> 	* data addressing.  This can either be 32-bit absolute
>> 	  or 16-bit relative.  *Pointers are 32-bits in either
>> 	  case*.
>This sounds more like a job for Mr. Optimiser than something the programmer
>should have to deal with. You would have as many or as few base registers
>as you needed to get quick access to the data, allocated and set up as needed
>within the procedure, block, or even statement for which it's appropriate.
>
>No need for a memory model, unless you're used to doing things the intel way.
>The whole A4 business is an obvious copy of 8086 small data models.

The problem is that the same data must be accessible from everywhere in the
program (except for statics, which are more restricted). Your idea of caching
a pointer to somewhere in the globals has merit - I think I'll look at putting
something like that in my Draco compiler (I'm into optimizing right now). Note,
however, that a solution like that will never be as efficient as a program-wide
convention of having a single register always point into the globals. For most
programs, the single-base-register technique works, and can produce significantly
faster/smaller code.

>> 	* code addressing (jsr's.)  These can either be 32-bit
>> 	  absolute or 16-bit relative.  Again, you can mix
>> 	  models.
>And again, this is an optimisation that should be applied to all code. The
>UNIX *assembler* on the PDP-11 did this: you just specified that you were
>branching, and it'd generate either the short branch or the long branch
>depending on how far away the destination was. Surely a compiler can do this
>within a routine and for calls to static globals.

The decision as to whether to use a long or short branch can only be made when
the location of the target is known. With the Amiga's scatter loading, this
isn't known until program load time. If the target is too far away, a 32 bit
address may be required, so if the compiler has put in a 16 bit offset only,
someone will have to fake it (which is what both Lattice and Manx do). The
decisions done by the UNIX assembler were for a single assembly (one .c file)
only - interfile references were all long mode. Any compiler worth its salt
will use a short PC relative branch within a function whenever possible.
(Draco doesn't yet, but it's coming :-) )

Further comments about int sizes deleted.

Compiler writers are often in direct competition with one another. One area of
competition is the speed/size of the generated code. Any technique by which
a compiler can be "improved" in this respect is considered fair game. Given
two identical programs, one compiled with a fully "large model", and one
compiled with a fully "small model", the latter will be smaller and faster.
Which one would sell better? Which one would you keep on your machine?
-- 
Chris Gray		Myrias Research, Edmonton	+1 403 428 1616
	{uunet!mnetor,ubc-vision,watmath,vax135}!alberta!myrias!cg

fnf@fishpond.UUCP (Fred Fish) (06/10/88)

In article <527@sas.UUCP> toebes@sas.UUCP (John Toebes) writes:
>             Because of the C language definition (and even Modula2)
>you do not have any idea of the total amount of data that a program might
>contain.  Given this, the code must be generated knowing AHEAD of time
>how big the data is going to be.  The ONLY real optimization that can be
>done to do this without the programmer giving information ahead of time is
>to delay generation of code until link time.  There are indeed articles
>discussing this type of optimization, but it is far from acceptable in
>performance for a simple piece of information.

As a side note to this discussion, since it's not really relevant
to the Amiga (yet :-), the M88000 linker solves the small/large
model problem by actually patching the object code as necessary.
If a particular data item is in the lower 64K of memory, or within
64K of any known base pointer, the load or store access can be done
with a single instruction (small model equivalent).  If not, then
the access can be done with either a two or three instruction sequence
(large model equivalent), which the linker synthesizes and uses to replace
the original instruction.  The actual code to do this is only a few
pages of C code, and took me about two days to write and debug.
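
The test itself is tiny; a sketch (made-up names, details from
memory):

        /* will a 16-bit unsigned displacement from base reach addr? */
        static int fits_short(addr, base)
        unsigned long addr, base;
        {
            return (addr - base) < 0x10000L;  /* within 64K above base */
        }

        /* base 0 covers "the lower 64K of memory"; when no base
           reaches, the linker substitutes the two or three
           instruction sequence that builds the full 32-bit address. */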

-Fred

-- 
# Fred Fish    hao!noao!mcdsun!fishpond!fnf     (602) 921-1113
# Ye Olde Fishpond, 1346 West 10th Place, Tempe, AZ 85281  USA