[net.micro.pc] Segment Registers -- 8086/80286

skip@ubvax.UUCP (06/02/86)

In article <146@apr.UUCP> las@apr.UUCP (Larry Shurr) writes:
> ...
>I haven't tried to bring up 3.6 yet, but I have brought up 3.5.  The
>D model works great iiiiiiffffffff..... you also use the -s switch
>(or use -mds if you compile with lc).  This tells the compiler that
>you don't want him to "normalize" all pointers and all results of
>pointer arithmetic.  A normalized pointer is one in which the offset
>is always in the range 0x0 - 0xF and the segment is pointing at the
>appropriate paragraph.  For example:
>
>	0900:1237	is converted to		0A23:0007
>
>The advantages include: comparisons between arbitrary pointers are
>meaningful and addressing (using pointers) > 64k of data.  Disadvantage
>is significant CPU time consumed normalizing pointers.
> ...
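
The normalization described above can be sketched in C as follows.  This is
only an illustration, not the code any particular compiler emits; the names
far_ptr and normalize are made up for the example, and it assumes the 8086
(or 286 real mode), where a segment value is a paragraph number.

```c
#include <stdint.h>

/* Hypothetical real-mode far pointer: 16-bit segment, 16-bit offset. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Normalize so that the offset ends up in the range 0x0..0xF.
   Only meaningful where segment values are paragraph numbers
   (8086, or 286 real mode). */
struct far_ptr normalize(struct far_ptr p)
{
    struct far_ptr n;
    /* Whole 16-byte paragraphs move from the offset into the segment. */
    n.seg = (uint16_t)(p.seg + (p.off >> 4));
    /* The remaining 0..15 bytes stay in the offset. */
    n.off = (uint16_t)(p.off & 0x000F);
    return n;
}
```

Passing in 0900:1237 gives back 0A23:0007, matching the example in the
quoted text.  The shift, add and mask on every pointer operation is where the
CPU time goes.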

... and if you ever plan on writing programs for the i80286, don't
"normalize" pointers.  To paraphrase some Microsoft documentation:

	When a program loads a segment register on an 8086, it is
	actually loading a displacement value.  On the 80286, what
	is loaded is actually a segment number indexing to an entry
	in the master segment table.  The actual displacement value
	is loaded from that table.  There is NO relation between
	segment N and segment N+1 (ie N+1:0 != N:10h).

I assume that they're talking about the protected mode of the 286.  Since
DOS presently doesn't use the protected mode, you're OK for now.  Still,
any programming practice which counts on two segment:offset pairs pointing
to the same place in memory will need to be changed before DOS 5.0.
If segment:offset pairs are used properly, normalization is not needed.
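
For contrast, here is a small C sketch of how the 8086 forms an address, which
is why N+1:0 and N:10h land on the same byte there.  The function name
real_mode_linear is invented for the illustration, and the 20-bit wrap models
the 8086 address bus only.

```c
#include <stdint.h>

/* 8086 address formation: segment * 16 + offset.
   The mask models the 20-bit address bus of the 8086 (wrap at 1 MB). */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

With this rule, real_mode_linear(0x1001, 0x0000) and
real_mode_linear(0x1000, 0x0010) give the same result.  In protected mode no
such arithmetic applies, because the segment value is looked up in a table
instead of being shifted.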

-- Skip Addison
   {lll-crg, decwrl, ihnp4}!amdcad!cae780!ubvax!skip

kpk@gitpyr.UUCP (Kevin P. Kleinfelter) (06/04/86)

In article <494@ubvax.UUCP>, skip@ubvax.UUCP writes:
> In article <146@apr.UUCP> las@apr.UUCP (Larry Shurr) writes:
> > ...
> ... and if you ever plan on writing programs for the i80286, don't
> "normalize" pointers.  To paraphrase some Microsoft documentation:
> 
> 	When a program loads a segment register on an 8086, it is
> 	actually loading a displacement value.  On the 80286, what
> 	is loaded is actually a segment number indexing to an entry
> 	in the master segment table.  The actual displacement value
> 	is loaded from that table.  There is NO relation between
> 	segment N and segment N+1 (ie N+1:0 != N:10h).
>
So how will one compare pointers?  If you can't normalize them, two pointers
could reference the same address, but not contain the same values.  In
a language such as Pascal, equality is a valid test on pointers.  In a language
such as Modula-2, pointer arithmetic is valid.  How can items such as these
be implemented without normalization of pointers?

                                 Confused,
                                 Kevin Kleinfelter (kpk@gitpyr.UUCP)
 

dmt@mtuxt.UUCP (D.TUTELMAN) (06/06/86)

In article <1851@gitpyr.UUCP>, Kevin Kleinfelter writes:

> In article <494@ubvax.UUCP>, skip@ubvax.UUCP writes:
> > 
> > 	When a program loads a segment register on an 8086, it is
> > 	actually loading a displacement value.  On the 80286, what
> > 	is loaded is actually a segment number indexing to an entry
> > 	in the master segment table.  The actual displacement value
> > 	is loaded from that table.  There is NO relation between
> > 	segment N and segment N+1 (ie N+1:0 != N:10h).
> >
> So how will one compare pointers?  If you can't normalize them, two pointers
> could reference the same address, but not contain the same values.  In
> a language such as Pascal, equality is a valid test on pointers.  In a language
> such as Modula-2, pointer arithmetic is valid.  How can items such as these
> be implemented without normalization of pointers?

A valid concern, but also a concern with the 8086.  Actually, the inequality
above should not say "not equal"; it should say "doesn't necessarily equal".
It is possible to design compilers (or even write your own ASM code -- ugh!)
for the 286 that (1) preserve the 8086 segment-offset relation, or
(2) use protection to allow any address to be accessible from only ONE	
possible segment value.  Note that:

   -	(1) is wasteful of the 286 power.  That doesn't mean it's a silly
	thing to do.  A recent Intel AppNote on the LOADALL instruction
	shows how to emulate REAL MODE in protected mode to run
	pre-existing programs from the 8086.
	
   -	(2) is a very safe way to allow pointer comparison.  It allows you
	to use the SEG:OFS as a REAL 32-bit pointer.  (Well, sort of "real".
	You still have to do your pointer arithmetic being consciously
	aware that it's two 16-bit words.)  I suspect it will
	be a popular choice for native-mode 286 code generators.

Note that (1) and (2) are only a couple of the myriad choices you have
for using segments in the 286.  How the segments are used is less
a function of the hardware than how your compiler chooses to generate
its code.
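
A rough C sketch of choice (2), assuming the operating system guarantees that
each address is reachable through only one segment value.  The names sel_ptr,
sel_ptr_equal and sel_ptr_add are hypothetical, not taken from any compiler.

```c
#include <stdint.h>

/* Hypothetical protected-mode pointer: selector plus offset. */
struct sel_ptr {
    uint16_t sel;
    uint16_t off;
};

/* Pack both halves into one 32-bit value for comparison. */
uint32_t sel_ptr_bits(struct sel_ptr p)
{
    return ((uint32_t)p.sel << 16) | p.off;
}

/* Equality is a plain 32-bit compare; no normalization is needed
   as long as segments never overlap. */
int sel_ptr_equal(struct sel_ptr a, struct sel_ptr b)
{
    return sel_ptr_bits(a) == sel_ptr_bits(b);
}

/* Arithmetic touches the offset word only.  Returns 0 on success,
   or -1 (leaving *p unchanged) if the offset would wrap past 0xFFFF. */
int sel_ptr_add(struct sel_ptr *p, uint16_t delta)
{
    if (delta > (uint16_t)(0xFFFFu - p->off))
        return -1;
    p->off = (uint16_t)(p->off + delta);
    return 0;
}
```

The carry out of the offset is never propagated into the selector, which is
the "two 16-bit words" caveat: the pointer compares like a 32-bit quantity
but does not add like one.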

             		Dave Tutelman
                	Physical - AT&T Information Systems
                  		Room 1H120
                   		Juniper Plaza, Route 9
                  		Freehold, NJ 07728
                	Logical -  ...ihnp4!mtuxo!mtuxt!dmt
             		Audible -  (201) 577 4232
---------------------------------------------------------------

las@apr.UUCP (Larry Shurr) (06/10/86)

In article <1851@gitpyr.UUCP> kpk@gitpyr.UUCP (Kevin P. Kleinfelter) writes:
>In article <494@ubvax.UUCP>, skip@ubvax.UUCP writes:
>> In article <146@apr.UUCP> las@apr.UUCP (Larry Shurr) writes:
>> > ...
>> ... and if you ever plan on writing programs for the i80286, don't
>> "normalize" pointers.  To paraphrase some Microsoft documentation:
>> 
>> 	...There is NO relation between
>> 	segment N and segment N+1 (ie N+1:0 != N:10h).
>>
>So how will one compare pointers?  If you can't normalize them, two pointers
>could reference the same address, but not contain the same values...

Presumably, you would have to ensure that you never set up a mapping which
creates overlapping segments.  I believe that this is an assumption made by
Intel (at least in their assumptions about how virtual memory -- i.e.,
swapping whole segments, since the 286 has no paging hardware -- works).
Of course this brings on the dreaded "64K addressing horizon" problem, which
limits in-memory objects to a maximum size of 64K.  Normalization of pointers
was introduced to extend the addressing horizon using a software solution,
but it is probably not consistent with Intel's intentions.  Keeping segments
disjoint also limits pointer arithmetic to addressing within a segment.

One could also imagine setting up a funky mapping in the segment table so
that arithmetic on segment:offset pairs was meaningful, but the results
would be very arcane and complicated.  Operating system and language
implementors would probably not want to introduce this sort of software
fix for the addressing horizon problem, especially with the 386 on the
way (though who knows what problems that will create?).

Regards,
Larry



-- 
------------------------------------------------------------------------------
BRITANNUS (shocked): Caesar, this is not proper.
THEODOTUS (outraged): How?
CAESAR (recovering his self-possession):
  Pardon him Theodotus: he is a barbarian, and thinks that
  the customs of his tribe and island are the laws of nature.
(_Caesar and Cleopatra_, Act II - G. B. Shaw)

Larry A. Shurr (osu-eddie!apr!las || 137c South Towne Ln; Delaware, OH 43015)

johnl@ima.UUCP (John R. Levine) (06/25/86)

In article <1851@gitpyr.UUCP> kpk@gitpyr.UUCP (Kevin P. Kleinfelter) writes:
>In article <494@ubvax.UUCP>, skip@ubvax.UUCP writes:
>> ... and if you ever plan on writing programs for the i80286, don't
>> "normalize" pointers.
> ...
>So how will one compare pointers?  If you can't normalize them, two pointers
>could reference the same address, but not contain the same values. ...

Not likely.  The 286 is a genuine segmented architecture (as opposed to the
8086 which is an unsatisfactory approximation.)  The operating system
controls where each segment is mapped.  Unless you have a strange
operating system, you can assume that the contents of different segments are
disjoint, so that if two pointers point into different segments, they're not
pointing at the same thing.

Just so you shouldn't think the 286 is a reasonable chip, though, you have
to take into account that the lowest two bits of the segment address are
not part of the segment number but rather tell what protection level the
segment is allegedly addressed at.  You really should mask off those bits
before comparing pointers, although in user programs, their value is
unimportant since the user can only reference user mode data, so a sensible
program will always make them zero.  I'm sure lots of great bugs will show
up, though.
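
A minimal C sketch of the masking step, assuming the 286 selector layout in
which bits 0 and 1 hold the requested privilege level.  The helper names are
made up for the example.

```c
#include <stdint.h>

/* Bits 0..1 of a 286 selector carry the requested privilege level. */
#define RPL_MASK 0x0003u

/* Clear the privilege bits so only the segment-identifying part remains. */
uint16_t selector_without_rpl(uint16_t sel)
{
    return (uint16_t)(sel & ~RPL_MASK);
}

/* Two selectors name the same segment if they match once the
   privilege bits are ignored. */
int same_segment(uint16_t a, uint16_t b)
{
    return selector_without_rpl(a) == selector_without_rpl(b);
}
```

For instance, same_segment(0x001F, 0x001C) is true, although a raw 16-bit
compare of the two values would say otherwise.
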
-- 
John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.EDU
The opinions expressed herein are solely those of a 12-year-old hacker
who has broken into my account and not those of any person or organization.

jrv@siemens.UUCP (06/26/86)

>In article <1851@gitpyr.UUCP> kpk@gitpyr.UUCP (Kevin P. Kleinfelter) writes:
>>In article <494@ubvax.UUCP>, skip@ubvax.UUCP writes:
>>> ... and if you ever plan on writing programs for the i80286, don't
>>> "normalize" pointers.
>> ...
>>So how will one compare pointers?  If you can't normalize them, two pointers
>>could reference the same address, but not contain the same values. ...
>
>Not likely.  The 286 is a genuine segmented architecture (as opposed to the
>8086 which is an unsatisfactory approximation.)  The operating system
>controls where each segment is mapped.  Unless you have a strange
>operating system, you can assume that the contents of different segments are
>disjoint, so that if two pointers point into different segments, they're not
>pointing at the same thing.
>

The problem with this is that the "thing" you want to point to cannot be
larger than one segment.  Sometimes one needs a big data item.  By normalizing
pointers and converting them to linear addresses for pointer calculations,
the segment boundaries can be overcome.
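
Roughly, the linear-address approach looks like this in C.  It is a sketch
for real mode only, it assumes the resulting address stays below 1 MB, and
the names huge_ptr, huge_to_linear, linear_to_huge and huge_add are invented
for the illustration.

```c
#include <stdint.h>

/* Hypothetical real-mode pointer used for objects larger than 64K. */
struct huge_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Segment:offset to linear address (real mode: seg * 16 + off). */
uint32_t huge_to_linear(struct huge_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Linear address back to a normalized segment:offset pair.
   Assumes linear is below 1 MB so the segment fits in 16 bits. */
struct huge_ptr linear_to_huge(uint32_t linear)
{
    struct huge_ptr p;
    p.seg = (uint16_t)(linear >> 4);
    p.off = (uint16_t)(linear & 0xFu);
    return p;
}

/* Pointer arithmetic done on the linear form, so it can cross
   64K boundaries in either direction. */
struct huge_ptr huge_add(struct huge_ptr p, int32_t delta)
{
    return linear_to_huge((uint32_t)((int32_t)huge_to_linear(p) + delta));
}
```

Adding 0x20 to 2000:FFF0 yields 3001:0000, a step that plain 16-bit offset
arithmetic would have wrapped back into the same segment.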

These processors were designed when 48K was a lot of memory on a
microprocessor system. Maybe the basic problem with the 8086 and 80286 is
that their segment size is smaller than what is needed for many of the
current programming problems.

Question:
	On other machine architectures/operating systems which use
	segmentation are the segment sizes larger? Do the high level
	languages allow you a way to work around this size limit?


Jim Vallino
Siemens Research and Technology Lab.
Princeton, NJ
{allegra,ihnp4,seismo,philabs}!princeton!siemens!jrv

syncro@looking.UUCP (Tom Haapanen) (06/29/86)

In article <140@ima.UUCP> johnl@ima.UUCP (John R. Levine) writes:
>>> ... and if you ever plan on writing programs for the i80286, don't
>>> "normalize" pointers.

>>So how will one compare pointers?  If you can't normalize them, two pointers
>>could reference the same address, but not contain the same values. ...

>Not likely.  The 286 is a genuine segmented architecture (as opposed to the
>8086 which is an unsatisfactory approximation.)  The operating system
>controls where each segment is mapped.  Unless you have a strange
>operating system, you can assume that the contents of different segments are
>disjoint, so that if two pointers point into different segments, they're not
>pointing at the same thing.

Ummm, if you ask DOS to allocate memory for you, it returns a *segment* 
address, and you are to assume zero offset (see DOS function calls 48h and
49h).  Now, unless you are allocating in 64K-sized chunks, the segments
won't likely be disjoint.  I am no great fan of DOS, but I'm not sure I'd
call it *strange* in net.micro.pc...
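
To make the overlap concrete, here is a small C check, assuming real-mode
addressing where each segment value opens a 64K window starting at
segment * 16.  The name windows_overlap is hypothetical, and wraparound at
the top of the 1 MB address space is ignored.

```c
#include <stdint.h>

/* 64K expressed in 16-byte paragraphs. */
#define PARAS_PER_64K 0x1000u

/* Nonzero when the 64K windows addressed by two real-mode segment
   values share at least one byte (1 MB wraparound is ignored). */
int windows_overlap(uint16_t seg_a, uint16_t seg_b)
{
    uint16_t gap = (seg_a > seg_b) ? (uint16_t)(seg_a - seg_b)
                                   : (uint16_t)(seg_b - seg_a);
    return gap < PARAS_PER_64K;
}
```

Two small blocks handed out at, say, segments 2000h and 2010h overlap by this
test, so an offset of 100h from the first block reaches into the second.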

--
\tom haapanen					looking glass software ltd.
syncro@looking.UUCP				waterloo, ontario, canada
watmath!looking!syncro				(519) 884-7473

"These opinions are solely mine, although even I would like to deny them..."

ddb@starfire.UUCP (David Dyer-Bennet) (07/24/86)

> >>So how will one compare pointers?  If you can't normalize them, two pointers
> >>could reference the same address, but not contain the same values. ...
> 
> >Not likely.  The 286 is a genuine segmented architecture (as opposed to the
> >8086 which is an unsatisfactory approximation.)  The operating system
> >controls where each segment is mapped.  Unless you have a strange
> >operating system, you can assume that the contents of different segments are
> >disjoint, so that if two pointers point into different segments, they're not
> >pointing at the same thing.
> 
> Ummm, if you ask DOS to allocate memory for you, it returns a *segment* 
> address, and you are to assume zero offset (see DOS function calls 48h and
> 49h).  Now, unless you are allocating in 64K-sized chunks, the segments
> won't likely be disjoint.  I am no great fan of DOS, but I'm not sure I'd
> call it *strange* in net.micro.pc...
> 
> --
> \tom haapanen					looking glass software ltd.
> syncro@looking.UUCP				waterloo, ontario, canada
> watmath!looking!syncro				(519) 884-7473

If you're running DOS, the 286 is running in real mode and its differences
are irrelevant.

If you're running Xenix V2, then it's in protected (virtual address) mode.
Segments have a length associated with them, so again you have
non-overlapping segments
(many smaller than 64k).  There's no hardware to prevent you from having
segments pointing at the same place, but it doesn't happen naturally to
a user program.
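
A simplified C model of the table lookup and length check.  It is a sketch
only: the descriptor struct and the translate function are invented for the
example, and it ignores the table-indicator bit, privilege checks, the
present bit and access rights.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified segment descriptor: physical base and last valid offset. */
struct descriptor {
    uint32_t base;   /* 24-bit physical base address on the 286 */
    uint16_t limit;  /* highest offset that may be accessed */
};

/* Translate selector:offset to a physical address.
   Returns 0 on success; -1 stands in for a protection fault. */
int translate(const struct descriptor *table, size_t entries,
              uint16_t selector, uint16_t off, uint32_t *phys)
{
    size_t index = selector >> 3;  /* bits 3..15 index the table */

    if (index >= entries || off > table[index].limit)
        return -1;
    *phys = table[index].base + off;
    return 0;
}
```

With a table whose entry 1 has base 100000h and entry 2 has base 020000h,
selectors 0008h and 0010h translate to unrelated addresses, and an offset
past an entry's limit is rejected instead of spilling into a neighbor.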

		-- David Dyer-Bennet
		Usenet:  ...ihnp4!umn-cs!starfire!ddb
		Fido: sysop of fido 14/341, (612) 721-8967
		Telephone: (612) 721-8800
		USmail: 4242 Minnehaha Ave S
			Mpls, MN 55406