[comp.lang.c] Memory Models

drezac@dcscg1.UUCP (Duane L. Rezac) (08/10/89)

Hello: 

I am just getting into C and have a question on Memory Models. I have not
seen a clear explanation on just what they are and how to determine which 
one to use. Does anyone have a short, clear explanation of these for  
someone just starting out in C?  


Duane L. Rezac



-- 
+-----------------------+---------------------------------------------------+
| Duane L. Rezac |These views are my own, and NOT representative of my place|
| dsacg1!dcscg1!drezac    drezac@dcscg1.dcsc.dla.mil      of Employment.    |
+-----------------------+---------------------------------------------------+

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/11/89)

In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>I am just getting into C and have a question on Memory Models.

That is not a C language issue.  It's kludgery introduced specifically
in the IBM PC environment.  Unless you have a strong reason not to,
just always use the large memory model.  (A strong reason would be
compatibility with an existing object library, for example.)

peter@ficc.uu.net (Peter da Silva) (08/11/89)

In article <10703@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> That is not a C language issue.  It's kludgery introduced specifically
> in the IBM PC environment.

Emphatically agree.

> Unless you have a strong reason not to,
> just always use the large memory model.

Disagree. Always use the smallest model you can get away with, but if
the program won't work under a small model don't play games with
NEAR and FAR pointers... just go to a larger model. You will thank
yourself later when you get a real computer.

> (A strong reason would be
> compatibility with an existing object library, for example.)

The massive performance advantage of small model over large is a
strong reason... so long as you don't have to use kludges to fit
into small model.

After all, all of UNIX ran in small model once upon a time :->.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Business: peter@ficc.uu.net, +1 713 274 5180. | "The sentence I am now
Personal: peter@sugar.hackercorp.com.   `-_-' |  writing is the sentence
Quote: Have you hugged your wolf today?  'U`  |  you are now reading"

davidsen@sungod.crd.ge.com (ody) (08/11/89)

In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:

| I am just getting into C and have a question on Memory Models. I have not
| seen a clear explanation on just what they are and how to determine which 
| one to use. Does anyone have a short, clear explanation of these for  
| someone just starting out in C?  

  I'll provide some information, but bear in mind that models are a
characteristic of the linker, rather than something just in C.
Segmented machines can support the models in all languages including
assembler.

The question is whether the code and/or data space is limited to 64k or not.
Here's a table of the common models:

		       code
	       64k             >64k
	 _________________________________
	|                |                |
d  64k  |   small        |   medium       |
a	|________________|________________|
t	|                |                |
a >64k  |   compact      |   large        |
	|________________|________________|

  Two other models are tiny (code and data share the same segment) and
huge, in which array and aggregate objects may be larger than 64k.

  The reason for using the smaller models is performance. Data access is
faster in small or medium model.
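
  To make the difference concrete, here is a tiny sketch (the "far"
keyword is an MSC/Turbo C extension, not portable C, and the printed
sizes are what you typically see, not something guaranteed):

	#include <stdio.h>

	char c;

	int main(void)
	{
	    char *np = &c;         /* plain pointer: near in small model */
	    char far *fp = &c;     /* far pointer: segment plus offset   */

	    printf("near %u, far %u\n",
	           (unsigned) sizeof np, (unsigned) sizeof fp);
	    /* typically prints "near 2, far 4" under small model */
	    return 0;
	}
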
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

brs@beach.cis.ufl.edu (Ray Seyfarth) (08/12/89)

There is one significant reason to choose the small memory model if
it is sufficient:   pointers will not point outside the program's
address space.  This is important in MS/DOS, since there is no
memory protection.

A compact, large, or huge model program can easily confuse a programmer
for a long time if a stray pointer wipes out part of DOS.  The result can
be delayed for a while, which adds to the confusion.

The Moral:  If you are trying to learn C, use the small model.
If you know C and want to write programs using a lot of data, choose
your own poison.
--
In Real Life:		UUCP: {gatech|mailrus}!uflorida!beach.cis.ufl.edu!brs
Ray Seyfarth		Internet: brs@beach.cis.ufl.edu
University of Florida	"Ninety percent of life is just showing up." Woody Allen

bright@Data-IO.COM (Walter Bright) (08/12/89)

In article <1633@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>  I'll provide some information, but bear in mind that [memory] models are a
>characteristic of the linker, rather than something just in C.

The linker has nothing to do with it. Memory model is determined by the
compiler (usually with a command line switch) and by which runtime library
is used. The linker doesn't know or care which memory model is used.

>  Two other models are tiny (code and data share the same segment) and
>huge, in which array and aggregate objects may be larger than 64k.

In some C compilers, the tiny model has code and data sharing the same
segment. In Zortech C, they *do not*. The limitation with Zortech's
tiny model is that (code size) + (static data size) < 64k. This means
that the amount of memory available for the stack and heap is
64k - (static data size) instead of 64k - (static data size) - (code size).

It's worth noting that the only difference between the tiny and small memory
models is that a different startup object file is used.

P.S. I should know, I implemented the Zortech tiny model, after listening
to people tell me it couldn't be done.

seanf@sco.COM (Sean Fagan) (08/12/89)

In article <10703@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>Unless you have a strong reason not to,
>just always use the large memory model.  (A strong reason would be
>compatibility with an existing object library, for example.)

Another one would be speed and size of executables.

Using large model is slower than small model, sometimes considerably.

-- 
Sean Eric Fagan  |    "Uhm, excuse me..."
seanf@sco.UUCP   |      -- James T. Kirk (William Shatner), ST V: TFF
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

scs@adam.pika.mit.edu (Steve Summit) (08/12/89)

In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>I am just getting into C and have a question on Memory Models. I have not
>seen a clear explanation on just what they are and how to determine which 
>one to use. Does anyone have a short, clear explanation of these for  
>someone just starting out in C?

The short answer is, stay as far away from this disgusting
concept as possible.  Segments buy nothing but trouble.

In article <5653@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>Always use the smallest model you can get away with...
>The massive performance advantage of small model over large is a
>strong reason...

"Massive"?  I've never noticed any difference at all.  (This is
not to say that there is none, only that it is not noticeable in
the programs I write.  I am sure that some difference can be
demonstrated, but I believe neither that the far pointer overhead
is unacceptable, nor that the overhead is inherent in 32-bit
addressing -- that is, the problem is with the segment/offset
distinction and the gory code it forces compilers to generate.
A sensible, "flat" 32-bit address space could certainly be
implemented with little more overhead than the 16-bit near
addressing modes.)

I find segments to be significantly deleterious to _my_
performance.  Just today a program began misbehaving; it turned
out that a few utility object files it was importing from another
source directory were of the wrong model.  (Since the last time
I'd built this program, one source directory's makefile had been
changed to use large model, but the other hadn't.)

The Intel and Microsoft object file formats and linkers only make
a bad idea even worse: there are no explicit warning or error
messages when object files compiled with different memory models
are linked together, although a program so linked is virtually
guaranteed not to work.  If you're lucky you'll get a "fixup
overflow" or some such error at link time (not exactly
descriptive, but better than nothing.)  More likely, though, the
link completes silently but the near/far subroutine call mismatch
causes the machine to lock up on the first function call.  Since
a lockup is also the symptom of seemingly every other simple
programmer error on this benighted architecture, memory model
mismatch isn't always the first thing I look for.

Ever since I started having these object file incompatibility
problems, I've determined that I ought to just use large model
everywhere, since it's the most universal.  (I've got lots of
different directories sharing object and library modules; while
some programs need large model, none need small.  I like to think
that sharing object code is a good idea, preferable both to
reinventing the wheel and to maintaining multiple, potentially
inconsistent copies of source or object files in multiple
directories.  The presence of multiple memory models, however,
actively discourages such sharing.)  If I had the time and the
courage I'd convert every makefile in sight, but there's no
telling how much I'd break in the interim.

(Another horrific problem has to do with binary data files.  If
you write out a structure that happens to contain a pointer, you
can't read it in with a program compiled with a different memory
model.  I always knew binary data files were a bad idea, but I
used to think they only compromised transportability between
machines of different architectures, not different programs on
the same machine.  Now I have several thousand data files,
written by an old, small-model program, which can't be read by
new, fancier programs which want to be built large-model.)
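
(A stripped-down sketch of the trap, with the structure and file name
invented for illustration; the only point is that sizeof(struct rec)
differs between memory models, so the record layout on disk does too:)

	#include <stdio.h>

	struct rec {
	    long  count;
	    char *name;      /* 2 bytes in small model, 4 bytes in large */
	};

	int main(void)
	{
	    struct rec r;
	    FILE *fp = fopen("data.bin", "wb");

	    if (fp == NULL)
	        return 1;
	    r.count = 1;
	    r.name = 0;
	    /* a small-model writer and a large-model reader disagree
	       about the record size, so the file can't be shared */
	    fwrite(&r, sizeof r, 1, fp);
	    fclose(fp);
	    return 0;
	}
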

In article <20728@uflorida.cis.ufl.EDU> brs@beach.cis.ufl.edu (Ray Seyfarth) writes:
>There is one significant reason to choose the small memory model if
>it is sufficient:   pointers will not point outside the program's
>address space.  This is important in MS/DOS, since there is no
>memory protection.

An interesting observation, which may have some merit, although
I've been crashing my PC daily for as long as I've been using it
(due primarily to the lack of memory or any other protection),
and I don't always use large model, so using small model is not
sufficient if you want to avoid baffling problems.

                                            Steve Summit
                                            scs@adam.pika.mit.edu

clyde@hitech.ht.oz (Clyde Smith-Stubbs) (08/13/89)

From article <5653@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva):
> In article <10703@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>> That is not a C language issue.  It's kludgery introduced specifically
>> in the IBM PC environment.
> 
> Emphatically agree.

It IS a language issue.  You could argue that near and far address spaces
are not part of Standard C, and therefore that it is not a C issue;
however, there are enough architectures that REQUIRE a useful language to
permit specification of more than one memory space that it becomes a
language issue.  The 8086/286 may be the best known example of an
architecture that requires far and near address spaces, but it is not
the only one.  Others are:
	8051 (on chip vs off chip RAM)
	6801/HC11 (basepage memory)
	64180 (banked code memory)
	65816 (banked code and data)
	8096 (on chip memory)
	Z8000 (another segmented architecture)

I have implemented C compilers for the first five of the above (plus the
8086/286).  In doing so I devised a machine-independent model for the
semantics of far, near, and differing memory models.  Such a model allows
code to be written which is both efficient for strange architectures like
the 8051 AND portable.

>> Unless you have a strong reason not to,
>> just always use the large memory model.
> 
> Disagree. Always use the smallest model you can get away with, but if
> the program won't work under a small model don't play games with
> NEAR and FAR pointers... just go to a larger model. You will thank
> yourself later when you get a real computer.

Generally speaking a small model would mean economy of addressing,
and therefore smaller and faster code. If a large address space is not
required then the small model should be used. Where only a limited
set of data structures needs to be in far memory, then far pointers can
be used to advantage, PROVIDED a) you program within the bounds of
a portable model as described above, AND b) the compiler is gracious
enough to issue decent warnings (e.g. when converting a far pointer to a
non-far pointer).  Without these precautions you can easily get into big
trouble.
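
For example, the usual trick (the macro name and the predefined symbol
below are mine, and every compiler spells them differently, so treat
this only as a sketch) is to hide the keyword behind a macro:

	#ifdef __MSDOS__       /* or whatever symbol your compiler predefines */
	#define FAR far
	#else
	#define FAR            /* expands to nothing on linear machines */
	#endif

	/* "far" on the PC, a plain pointer everywhere else */
	long sum_bytes(unsigned char FAR *p, long n)
	{
	    long s = 0;

	    while (n-- > 0)
	        s += *p++;
	    return s;
	}
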

> 
> After all, all of UNIX ran in small model once upon a time :->.

UCB and the VAX have a lot to answer for :-)
----
-- 
Clyde Smith-Stubbs
HI-TECH Software, P.O. Box 103, ALDERLEY, QLD, 4051, AUSTRALIA.
INTERNET:	clyde@hitech.ht.oz.au		PHONE:	+61 7 300 5011
UUCP:		uunet!hitech.ht.oz.au!clyde	FAX:	+61 7 300 5246

schaut@madnix.UUCP (Rick Schaut) (08/13/89)

In article <1633@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>  The reason for using the smaller models is performance. Data access is
>faster in small or medium model.

And procedure calls and returns are faster in the small and compact
models.

-- 
   Richard Schaut     Madison, WI              Madison: an alternative
ArpaNet: madnix!schaut@cs.wisc.edu                      to reality.
UseNet: ...uwvax!astroatc!nicmad!madnix!schaut
             {decvax!att}!

blarson@basil.usc.edu (bob larson) (08/13/89)

In article <309@hitech.ht.oz> clyde@hitech.ht.oz (Clyde Smith-Stubbs) writes:
>From article <5653@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva):
>>> That is not a C language issue.  It's kludgery introduced specifically
>>> in the IBM PC environment.
>It IS a language issue - you could argue that near and far address spaces
>are not part of the Standard C, therefore it is not a C issue, however
>there are sufficient architectures that REQUIRE a useful language  to

Just because your address space is segmented doesn't mean you have to
kludge your language around.  Prime C does quite nicely without memory
models.  (Actually, it does have one compiler switch to enable paranoia
about pointers possibly pointing into arrays larger than 128K bytes; PC
compilers would call this huge.)  (Additional instructions were added
for C to handle the concept of an efficient pointer to a byte.)
Bob Larson	Arpa:	blarson@basil.usc.edu
Uucp: {uunet,cit-vax}!usc!basil!blarson
Prime mailing list:	info-prime-request%ais1@usc.edu
			usc!ais1!info-prime-request

johnl@esegue.uucp (John Levine) (08/14/89)

In article <19158@usc.edu> blarson@basil.usc.edu (bob larson) writes:
>In article <309@hitech.ht.oz> clyde@hitech.ht.oz (Clyde Smith-Stubbs) writes:
>>From article <5653@ficc.uu.net>, by peter@ficc.uu.net (Peter da Silva):
>>>> [near and far pointers are] not a C language issue.  It's kludgery
>>>> introduced specifically in the IBM PC environment.

>>It IS a language issue - you could argue that near and far address spaces
>>are not part of the Standard C, therefore it is not a C issue, however
>>there are sufficient architectures that REQUIRE a useful language  to

>Just because your address space is segmented doesn't mean you have to
>kludge your language around.  Prime C does quite nicely without memory
>models.  ...

I am probably one of the few people in the world to have had the dubious
pleasure of writing an assembler for the Prime, intended to be the back end
for a C compiler that never saw the light of day.  The Prime address space
is only sort of segmented, because some instructions and address formats pay
attention to segment boundaries and some don't.  The C compiler wherever
possible uses the latter set, so the segmentation is more or less invisible
at the architecture level.  (Well, not quite, the address mode you have to
use to get to the stack is segmented, so the stack frame for any single
procedure is limited to slightly under one segment.  Don't even think about
alloca.)

I also note that since the word-addressed Prime was unable to reference
characters in a reasonable way they added some new instructions specifically
for the benefit of the C compiler that load characters, store characters,
and increment character pointers, which of course means that Prime's
customers whose machines predate the new instructions are unable to run C
programs.  Great.

The '86 architecture, all its innumerable faults notwithstanding, has quite
consistent memory addressing.  The segments are really there, and from other
extensive experience (I did most of the Prime assembler in Turbo C at home!)
I can testify that large model C code is much larger and slower than small
model, so you really need to have some sort of address space hackery to
write useful programs.  Sad, but true.

What I would really like to see is a data declaration language that lets you
take advantage of segments rather than just tolerating them, e.g. group
useful data into segments, map structures into segments in interesting ways,
pass segment handles around.  Never seen one.  Even Multics PL/I just
kludged it.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869
{ima|lotus}!esegue!johnl, johnl@ima.isc.com, Levine@YALE.something
Massachusetts has 64 licensed drivers who are over 100 years old.  -The Globe

cowan@marob.masa.com (John Cowan) (08/15/89)

In article <10703@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>>I am just getting into C and have a question on Memory Models.
>
>That is not a C language issue.  It's kludgery introduced specifically
>in the IBM PC environment.  Unless you have a strong reason not to,
>just always use the large memory model.  (A strong reason would be
>compatibility with an existing object library, for example.)


I have to disagree here.  "Always use the large memory model" is a
prescription for disaster under MS-DOS, due to the use of real mode.
Large model programs use pointers that, if damaged by bugs, can access
every part of memory, including the operating system.  Using small model
whenever possible gives a modicum of protection: at most, runaway pointers
can access up to 64K, most of which is probably above the current program
and not in use by anybody.  This makes operating system crashes far less
likely: at worst, the program goes down without taking DOS with it.

I use the small model exclusively when writing programs small enough to fit
into it.  Only if needed do I fire up the large model.  I agree that, modulo
the question of compatibility with existing libraries, the other models are
not very useful.
-- 
Internet/Smail: cowan@marob.masa.com	Dumb: uunet!hombre!marob!cowan
Fidonet:  JOHN COWAN of 1:107/711	Magpie: JOHN COWAN, (212) 420-0527
		Charles li reis, nostre emperesdre magnes
		Set anz toz pleins at estet in Espagne.

md@sco.COM (Michael Davidson) (08/15/89)

In article <10703@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>>I am just getting into C and have a question on Memory Models.
>
>That is not a C language issue.  It's kludgery introduced specifically
>in the IBM PC environment.  Unless you have a strong reason not to,
>just always use the large memory model.  (A strong reason would be
>compatibility with an existing object library, for example.)
Sorry, but it is an evil necessity brought about by the segmented
architecture of the INTEL 8086 and 80286 - although the most common
place that these processors show up is in the IBM PC environment,
this kludgery follows these CPUs wherever they go.

Actually, better advice is to always use small model (i.e. up to
64k code and 64k data), unless you really don't care about performance.
The cost of continually reloading segment registers (which is what
large model will tend to do) is bad in real mode and horrific in
protected mode. Just remember that small is beautiful....

Tim_CDC_Roberts@cup.portal.com (08/15/89)

I read here not too long ago that several folks had implemented C compilers
for the CDC Cyber series in 170 state, which has 60-bit words with no
byte addressing.  Could someone involved in one of these compilers please
post or e-mail a message describing what kind of "memory model" you used?
For example, did you make char = int = long = 60-bits and waste 54 bits 
for chars, or did you make char = (6 bits) and sizeof(int) = 10, and do
some horrendous shift-and-masking to perform conversions?

Inquiring minds want to know.
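
Purely as a guess at what the second option might look like (the type
name and helpers below are invented; on the Cyber the "word60" stand-in
would be the native 60-bit word, and I'm assuming it really holds at
least 60 value bits):

	#define CHARS_PER_WORD  10
	#define CHAR_BITS       6
	#define CHAR_MASK       077

	typedef unsigned long word60;   /* stand-in for a 60-bit word */

	/* fetch the i'th 6-bit character packed into a word */
	int fetch_char(word60 w, int i)
	{
	    int shift = CHAR_BITS * (CHARS_PER_WORD - 1 - i);

	    return (int) ((w >> shift) & CHAR_MASK);
	}

	/* store a 6-bit character back into the packed word */
	word60 store_char(word60 w, int i, int c)
	{
	    int shift = CHAR_BITS * (CHARS_PER_WORD - 1 - i);

	    w &= ~((word60) CHAR_MASK << shift);
	    return w | ((word60) (c & CHAR_MASK) << shift);
	}
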

Tim_CDC_Roberts@cup.portal.com                | Control Data...
...!sun!portal!cup.portal.com!tim_cdc_roberts |   ...or it will control you.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/16/89)

In article <888@fiasco.sco.COM> md@sco.COM (Michael Davidson) writes:
>>In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>>>I am just getting into C and have a question on Memory Models.
>Sorry, but it is an evil necessity brought about by the segmented
>architecture of the INTEL 8086 and 80286 ...

People who aren't "wedded" to the *86 architecture generally don't
seem to think it was necessary to cause memory models to be visible
in higher-level programming languages.  The Apple IIGS architecture
(65816) has a similar trade-off, and the available C compilers for
it do not have "near" and "far" foolishness.  And yes, it is
possible for an Apple IIGS C programmer to select which model is to
be used for his application, at least with Orca/C.

ckl@uwbln.UUCP (Christoph Kuenkel) (08/16/89)

In article <1989Aug14.163909.9920@esegue.uucp>, johnl@esegue.uucp (John Levine) writes:
> I can testify that large model C code is much larger and slower than small
> model, so you really need to have some sort of address space hackery to
> write useful programs.  Sad, but true.
I see it the other way round: it's true and fine.  My experience
was with Zilog's Z8000.  I always tried to compile and load my
programs using the small (unsegmented, in their terminology) model,
and the result was astonishingly quick programs.  This was in the
early days of the M68000, and the 68000 was quite uninteresting
compared to the Z8000 (which was some years older, as far as I
know) because of this effect.

When problems came up, I switched to the segmented mode with a
compiler flag and everything was fine.  No language kludgery%.
To me it's a feature!

christoph

%	except for static objects larger than 64K.  But in
	contrast to the x86, large objects were allocatable via
	malloc.
-- 
# include <std/disclaimer.h>
Christoph Kuenkel/UniWare GmbH       Kantstr. 152, 1000 Berlin 12, West Germany
ck@tub.BITNET                ckl@uwbln             {unido,tmpmbx,tub}!uwbln!ckl

blarson@basil.usc.edu (bob larson) (08/17/89)

In article <1989Aug14.163909.9920@esegue.uucp> johnl@esegue.UUCP (John Levine) writes:
>In article <19158@usc.edu> blarson@basil.usc.edu (bob larson) writes:
>>Just because your address space is segmented doesn't mean you have to
>>kludge your language around.  Prime C does quite nicely without memory
>>models.  ...

>I am probably one of the few people in the world to have had the dubious
>pleasure of writing an assembler for the Prime.  The Prime address space
>is only sort of segmented, because some instructions and address formats pay
>attention to segment boundaries and some don't.

You must have a different definition of segmented than I do.  All
Prime addresses include a segment number and offset, and (with the
exception of the new ix mode instructions) no address calculation
instruction or addressing mode will carry from the offset to the
segment portion of the address.  (Instructions not designed to
manipulate addresses may of course be used to do so.)

>the address mode you have to
>use to get to the stack is segmented, so the stack frame for any single
>procedure is limited to slightly under one segment.

The C compiler could be taught how to allocate additional memory on
procedure entry if this ever becomes a problem.

>Don't even think about alloca.

It's available from pl1 (called aloc$s) in the standard system
library, and there is a special instruction (stex) just for dynamically
extending the stack frame.  Why shouldn't I think of it?  (I don't use
alloca because I consider it a bad non-portable hack, but one that
could easily be supported by prime C.)

>I also note that since the word-addressed Prime was unable to reference
>characters in a reasonable way they added some new instructions specifically
>for the benefit of the C compiler that load characters, store characters,
>and increment character pointers,

The instructions were added to do these operations efficiently; they
are the difference between the i and ix instruction sets.  C compilers
are available for both the V and ix instruction sets.

> which of course means that Prime's
>customers whose machines predate the new instructions are unable to run C
>programs.  Great.

The V mode C compiler was written before the ix instruction set was
available, and works reasonably well.  Prime has versions of all the
programs they sell, except the ix mode C compiler, that work on non-ix
Prime systems.  They are planning future code that won't work without
the ix mode instructions, however.  Prime hasn't made a system that
won't support ix mode in 5 years.  Not supporting old systems isn't
something I would fault Prime for.

>The '86 architecture, all its innumerable faults notwithstanding, has quite
>consistent memory addressing.

Consistently bad isn't a virtue in my book.

-- 
Bob Larson	Arpa:	blarson@basil.usc.edu
Uucp: {uunet,cit-vax}!usc!basil!blarson
Prime mailing list:	info-prime-request%ais1@usc.edu
			usc!ais1!info-prime-request

mdfreed@ziebmef.uucp (Mark Freedman) (08/19/89)

In article <10703@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>>I am just getting into C and have a question on Memory Models.
>
>That is not a C language issue.  It's kludgery introduced specifically
>in the IBM PC environment.  Unless you have a strong reason not to,
>just always use the large memory model.  (A strong reason would be
>compatibility with an existing object library, for example.)


   and remember that objects larger than 64K can run into problems because of
the segmented architecture. In Turbo C 2.0, malloc, calloc don't work for
objects larger than 64K (use farmalloc, farcalloc), and far pointers wrap
(the segment register is unchanged ... only the offset has been changed to
protect the innocent :-)).
   huge pointers are normalized (all arithmetic is done via function calls
which perform normalization), but pointers must be explicitly declared as
huge. Even the huge memory model uses far pointers as the default (because of
the overhead, I would imagine).
   I haven't used Microsoft or other MS-DOS implementations, but I suspect that
they have similar design compromises.
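
   (A sketch of the wrap, with Turbo C-style keywords and a made-up
address; this is from memory, so check it before relying on it:)

	int main(void)
	{
	    char far  *fp = (char far *)  0x40000000L;   /* 4000:0000 */
	    char huge *hp = (char huge *) 0x40000000L;

	    fp += 0x10000L;   /* far: only the 16-bit offset changes, so this
	                         wraps right back to 4000:0000              */
	    hp += 0x10000L;   /* huge: renormalized, roughly 5000:0000      */
	    return 0;
	}
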

   (apologies for the Intel-specific followup, but it might save someone some
aggravation).

mdfreed@ziebmef.uucp (Mark Freedman) (08/19/89)

(if you write out a structure that happens to contain a pointer)

    why would you write a pointer to a memory-location to a file ??
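
    (If the structure really has to go to disk, the usual fix is to
store an index or file offset in its place.  A sketch, with invented
names:)

	struct node {               /* in-memory form                      */
	    long  value;
	    struct node *next;      /* meaningless once written to a file  */
	};

	struct disknode {           /* on-disk form                        */
	    long value;
	    long next_index;        /* index of the next record, or -1     */
	};
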

clyde@hitech.ht.oz (Clyde Smith-Stubbs) (08/21/89)

From article <10744@smoke.BRL.MIL>, by gwyn@smoke.BRL.MIL (Doug Gwyn):
> People who aren't "wedded" to the *86 architecture generally don't
> seem to think it was necessary to cause memory models to be visible
> in higher-level programming languages.

There are probably more 80x86 processors out there than any other,
and the 80x86 architecture has brought C and Unix within reach of  people
who would not have access to it otherwise. That doesn't alter the fact that
its memory organization is horrible, but it should be some incentive to
devise rational, portable  techniques of managing address spaces that are
not linear.

> The Apple IIGS architecture
> (65816) has a similar trade-off, and the available C compilers for
> it do not have "near" and "far" foolishness.  And yes, it is
> possible for an Apple IIGS C programmer to select which model is to
> be used for his application, at least with Orca/C.


Actually, the 65816 is a prime example of a processor that cries out
for far and near pointers. The lack of them in the very few compilers
available for the IIgs is just that, a lack, not a feature. 

Hopefully there won't be too many more processors designed that have
architectures like the 80x86 and 65xxx, but I wouldn't like to put
money on it.
-- 
Clyde Smith-Stubbs
HI-TECH Software, P.O. Box 103, ALDERLEY, QLD, 4051, AUSTRALIA.
INTERNET:	clyde@hitech.ht.oz.au		PHONE:	+61 7 300 5011
UUCP:		uunet!hitech.ht.oz.au!clyde	FAX:	+61 7 300 5246

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/22/89)

In article <319@hitech.ht.oz> clyde@hitech.ht.oz (Clyde Smith-Stubbs) writes:
>Actually, the 65816 is a prime example of a processor that cries out
>for far and near pointers.

Somehow I don't find that this is a problem in my programs.

wjr@ftp.COM (Bill Rust) (08/23/89)

In article <1989Aug18.210404.13183@ziebmef.uucp> mdfreed@ziebmef.UUCP (Mark Freedman) writes:
>>In article <562@dcscg1.UUCP> drezac@dcscg1.UUCP (Duane L. Rezac) writes:
>>>I am just getting into C and have a question on Memory Models.
>   huge pointers are normalized (all arithmetic is done via function calls
>which perform normalization), but pointers must be explicitly declared as
>huge. Even the huge memory model uses far pointers as the default (because of
>the overhead, I would imagine).
>   I haven't used Microsoft or other MS-DOS implementations, but I suspect
>that they have similar design compromises.

Note that MSC huge pointers are normalized in a strange way. While TC 
normalizes the segment register after every increment (ptr++), MSC does
not. That is, if you have a record that is 3000h bytes and a starting point
of 400:0, incrementing it six times results in values of 400:3000, 400:6000,
400:9000, 400:c000, 400:f000, and 1400:2000. Referencing the 400:f000 value
will cause a segment wrap.  I haven't checked what operations other than
increment do, but I was most distressed to find this, which I consider an
error but MS does not.  They apparently feel that the performance hit is too
great to normalize every time. I feel that if you ask for huge pointers, you
should get them with as much memory addressable from the pointer as possible.
(For grins sake, think about huge pointers under OS/2, a protected mode 
program).
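
In code, the sequence above looks roughly like this (going by the
behavior I just described; the start address is cooked up so the
segment:offset values work out as in the example):

	struct rec { char body[0x3000]; };

	int main(void)
	{
	    struct rec huge *p = (struct rec huge *) 0x04000000L; /* 400:0000 */

	    p += 5;   /* MSC leaves this at 400:f000; the record now runs past
	                 offset 0xFFFF, so touching its tail wraps the segment */
	    p += 1;   /* only now is it renormalized, to 1400:2000             */
	    return 0;
	}
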

Bill Rust (wjr@ftp.com)

ray@philmtl.philips.ca (Raymond Dunn) (08/24/89)

In article <319@hitech.ht.oz> clyde@hitech.ht.oz (Clyde Smith-Stubbs) writes:
>From article <10744@smoke.BRL.MIL>, by gwyn@smoke.BRL.MIL (Doug Gwyn):
>> People who aren't "wedded" to the *86 architecture generally don't
>> seem to think it was necessary to cause memory models to be visible
>> in higher-level programming languages.
>
>There are probably more 80x86 processors out there than any other,
>and the 80x86 architecture has brought C and Unix within reach of  people
>who would not have access to it otherwise. That doesn't alter the fact that
>its memory organization is horrible, but it should be some incentive to
>devise rational, portable  techniques of managing address spaces that are
>not linear.

Hey, it's easy.  If you don't want to bother yourself with memory models,
then always use the large or huge models and forget about it.

All your code will then of course have to carry the overhead of data and
program pointers greater than 16 bits, just like it does on the processors of
wonderful design.

However if you want to take *advantage* of the fact that you can see a
significant improvement in code size and execution speed by using 16-bit
addresses when possible, then use the small and medium memory models.

(:-) * 0.5
-- 
Ray Dunn.                    | UUCP: ..!uunet!philmtl!ray
Philips Electronics Ltd.     | TEL : (514) 744-8200  Ext: 2347
600 Dr Frederik Philips Blvd | FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9 | TLX : 05-824090

fredex@cg-atla.UUCP (Fred Smith) (08/24/89)

In article <664@philmtl.philips.ca> ray@philmtl.philips.ca (Raymond Dunn) writes:
>In article <319@hitech.ht.oz> clyde@hitech.ht.oz (Clyde Smith-Stubbs) writes:
. . .
>Hey, it's easy.  If you don't want to bother yourself with memory models,
>then always use the large or huge models and forget about it.



Well, even then it ain't transparent!  You will still get bitten by
the stupid segmented architecture, because there are, even in large
or huge model, restrictions on either the size of an array, or on
the size and/or alignment of the elements of that array!

See?  Even in Large/Huge model There Ain't No Free Lunch!


Fred

peter@ficc.uu.net (Peter da Silva) (08/24/89)

[ warning, the following discussion is done without reference to a data
  sheet for exact addressing modes and timings. Someone want to come up
  with the exact numbers? ]

In article <664@philmtl.philips.ca>, ray@philmtl.philips.ca (Raymond Dunn) writes:
> Hey, it's easy.  If you don't want to bother yourself with memory models,
> then always use the large or huge models and forget about it.

> All your code will then of course have to carry the overhead of data and
> program pointers greater than 16 bits, just like it does on the processors of
> wonderful design.

Yeh, right. I have a 32-bit value in memory on a 32-bit machine, and I want
to store a word where it points to:

	Load value into 32-bit register, 1 instruction fetch, 1 data fetch.
	Store indirect through 32-bit register, 1 instruction fetch, 1 store.

Worst case, let's say it's a 68000 (32 bit registers, 16 bit bus):

	Store indirect via memory, 1 instruction fetch, 2 data fetches,
		2 stores.

Now I'm on an 80286:

	Load segment into segment register, 1 instruction fetch, 1 data fetch.
	Load offset into register, 1 instruction fetch, 1 data fetch.
	Segment prefix, 1 instruction fetch.
	Store indirect through segment:register, 1 instruction fetch, 2 stores.

At the very least you've had to fetch and decode two extra instructions, not
to mention that writing into segment registers is expensive. Finally, you're
limited to 64K objects unless you make pointer arithmetic subroutines.

OK, let's look at a 16-bit memory model, on the 80286 and the 68000. Store
indirect 16 bits via a word in memory based on a segment register:

80286:
	Load word into register, 1 instruction fetch, 1 data fetch.
	Store indirect via DS:register, 1 instruction fetch, 1 store.

68000:
	Load word into register, 1 instruction fetch, 1 data fetch.
	Store indirect with offset, 1 instruction fetch, 1 store.

There might be a little more overhead on the 68000, but you CAN write 16
bit code for it. And there are compilers that do this (memory models on
the 68000, no less!). And you don't have to cripple people with 64K
limitations.

Finally, on a real 32-bit machine like the 68020 or 80386 there is NO
advantage to 16-bit code... but the 68000 16-bit code will run in native
mode on the 68020. Not so for the 80286 code on the 80386.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"export ENV='${Envfile[(_$-=1)+(_=0)-(_$-!=_${-%%*i*})]}'" -- Tom Neff     'U`
"I didn't know that ksh had a built-in APL interpreter!" -- Steve J. Friedl

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/25/89)

In article <7555@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:
>Well, even then it ain't transparent!  You will still get bitten by
>the stupid segmented architecture, because there are, even in large
>or huge model, restrictions on either the size of an array, or on
>the size and/or alignment of the elements of that array!

There are such restrictions in almost any C environment; this is not
unique to *86-style architectures.  The compiler is responsible for
arranging proper alignment etc. for you.  

gregg@cbnewsc.ATT.COM (gregg.g.wonderly) (08/25/89)

From article <664@philmtl.philips.ca>, by ray@philmtl.philips.ca (Raymond Dunn):
> Hey, it's easy.  If you don't want to bother yourself with memory models,
> then always use the large or huge models and forget about it.

Funny, I have yet to see a compiler for the Intel 80x (x < 386) family that
can increment a pointer through more than 64K.  Anyone else seen one?

-- 
-----
gregg.g.wonderly@att.com   (AT&T bell laboratories)

wjr@ftp.COM (Bill Rust) (08/25/89)

In article <2694@cbnewsc.ATT.COM> gregg@cbnewsc.ATT.COM (gregg.g.wonderly) writes:
>From article <664@philmtl.philips.ca>, by ray@philmtl.philips.ca (Raymond Dunn):
>> Hey, it's easy.  If you don't want to bother yourself with memory models,
>> then always use the large or huge models and forget about it.
>
>Funny, I have yet to see a compiler for the Intel 80x (x < 386) family that
>can increment a pointer through more than 64K.  Anyone else seen one?

Both Turbo C and MSC have huge pointers that go through 64K boundaries.  (I
think that TC does it better, at the penalty of longer execution times.)  Since
those are the only two compilers that I have had much experience with in the
last couple of years, my guess would be that most of the others have some
support as well.

Bill Rust (wjr@ftp.com)

ray@philmtl.philips.ca (Raymond Dunn) (08/26/89)

In article <2694@cbnewsc.ATT.COM> gregg@cbnewsc.ATT.COM (gregg.g.wonderly) writes:
>From article <664@philmtl.philips.ca>, by ray@philmtl.philips.ca (Raymond Dunn):
>> Hey, it's easy.  If you don't want to bother yourself with memory models,
>> then always use the large or huge models and forget about it.
>
>Funny, I have yet to see a compiler for the Intel 80x (x < 386) family that
>can increment a pointer through more than 64K.  Anyone else seen one?

MSC 5.1 Huge memory model:

"The huge-model option is similar to the large model option, except that the
restriction on the size of individual data items [to 64K] is removed for
arrays."

There are some restrictions and problems of course, as there are on *most*
architectures: specifically, no array *element* can be more than 64K.  There
are obvious difficulties with sizeof and pointer subtraction unless the
appropriate cast is used.  This is of course a consequence of an int being 16
bits, not of the segmentation.

Let's not get into an architecture war again.  It's fairly generally accepted
that the more orthogonal architectures are intrinsically "better" than the more
ad-hoc, and that segmentation does have *some* advantages hidden amongst the
anguish.  That is *not* what's being discussed here.
-- 
Ray Dunn.                    | UUCP: ..!uunet!philmtl!ray
Philips Electronics Ltd.     | TEL : (514) 744-8200  Ext: 2347
600 Dr Frederik Philips Blvd | FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9 | TLX : 05-824090

karl@haddock.ima.isc.com (Karl Heuer) (08/28/89)

In article <671@philmtl.philips.ca> ray@philmtl.philips.ca (Raymond Dunn) writes:
>[In huge model] there are obvious difficulties with sizeof and pointer
>subtraction unless the appropriate cast is used.  This is of course a
>consequence of an int being 16 bits, not of the segmentation.

If the vendor would implement Real ANSI C instead of a close approximation
that coddles existing code that assumes too much about array sizes, these
difficulties would go away.  The correct typedefs for huge model are:
	typedef unsigned long int size_t;
	typedef long int ptrdiff_t;
With these in place, everything works fine without casts (provided the user is
actually using these types, and not assuming that |int| always works).  An
implementation that always typedefs size_t to be a 16-bit object, and requires
the user to attach a cast to sizeof() in huge model, is BROKEN.
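
With those typedefs in place, huge-model code like the following needs no
casts at all (the array name and sizes are invented for the example, and
"huge" is spelled the way MSC and TC spell it):

	#include <stddef.h>

	char huge big[100000L];

	void demo(void)
	{
	    size_t    nbytes = sizeof big;               /* 100000, not truncated */
	    ptrdiff_t span   = &big[90000L] - &big[0];   /* 90000, fits in long   */
	}
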

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint