[comp.arch] 386/486 Virtual Memory Question...

hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) (06/07/89)

The 386 and 486 architectures claim a virtual address space size of
2^46 bytes.  The virtual address is formed from a 14 bit selector
and a 32 bit offset.  What does this really mean?  Currently we have
two interpretations:

	a) This scheme gives us 2^14 different ways to map into
	   the same 32 bit address space.

	b) This scheme gives us 2^14 independent address spaces
	   of 2^32 bytes each.  In other words, the virtual memory
	   scheme consists of as many as 2^14 segments of 2^32
	   bytes each.

Which one is the correct interpretation?  If (a) is the answer, why was
it done this way?  If (b) is the answer, how does the processor
communicate the 14 bits of selector information to the memory mapping
hardware/software?

I think (b) is the correct answer, but after spending a few minutes with
some Intel literature, I'm more confused than I was when I started.
Any comments would be appreciated.  Thanks.

Brinkley Sprunt
Electrical & Computer Engineering
Carnegie Mellon University
sprunt@maxwell.ece.cmu.edu

cliff@ficc.uu.net (cliff click) (06/07/89)

In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes:
> 
> The 386 and 486 architectures claim a virtual address space size of
> 2^46 bytes.  The virtual address is formed from a 14 bit selector
> and a 32 bit offset.  
> 	b) This scheme gives us 2^14 independent address spaces of 2^32 
>      bytes each.  In other words, the virtual memory scheme 
>      consists of as many as 2^14 segments of 2^32 bytes each.
> 
> If (b) is the answer, how does the processor communicate the 14 bits of 
> selector information to the memory mapping hardware/software?

The standard addressing hardware looks the selector up in a local (per 
process) or global selector table.  In this table each selector has a
base physical address, a size, and some privilege bits (r/w/x).  This info
is cached on-chip; when you load a new selector it all gets loaded.
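
The lookup described above can be sketched roughly as follows.  This is
only a toy model in Python: the Descriptor fields and the translate()
helper are illustrative names, not Intel's descriptor layout.  The base
is treated here as an address in the 32-bit linear space, which paging
(when enabled) translates further.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    base: int       # start of the segment in the 32-bit linear space
    limit: int      # highest valid offset within the segment
    writable: bool  # stand-in for the access-rights bits


class SegmentFault(Exception):
    """Raised when a reference violates the segment's limit or rights."""


def translate(table, index, offset, write=False):
    """Map (segment index, offset) to a linear address, checking limit and rights."""
    desc = table[index]
    if offset > desc.limit:
        raise SegmentFault("offset beyond segment limit")
    if write and not desc.writable:
        raise SegmentFault("write to read-only segment")
    return (desc.base + offset) & 0xFFFFFFFF


# Hypothetical table with two segments.
table = {
    1: Descriptor(base=0x00100000, limit=0xFFFF, writable=False),
    2: Descriptor(base=0x00200000, limit=0x0FFF, writable=True),
}
print(hex(translate(table, 1, 0x10)))  # 0x100010
```

An offset past the limit, or a write through a read-only descriptor,
raises SegmentFault in this model; on the real chip it is a protection
exception.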

When you get a page fault or other interrupt the faulting selector is 
made available to the interrupt task (it's on the stack or something).

Maybe some Intel guru should fill in the details...

-- 
Cliff Click, Software Contractor at Large
Business: uunet.uu.net!ficc!cliff, cliff@ficc.uu.net, +1 713 274 5368 (w).
Disclaimer: lost in the vortices of nilspace...       +1 713 568 3460 (h).

chasm@killer.DALLAS.TX.US (Charles Marslett) (06/08/89)

In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes:
> 
> The 386 and 486 architectures claim a virtual address space size of
> 2^46 bytes.  The virtual address is formed from a 14 bit selector
> and a 32 bit offset.  What does this really mean?  . . .

[stuff omitted]

> 	a) This scheme gives us 2^14 different ways to map into
> 	   the same 32 bit address space.
This is it:  the chip has 32 address pins so everything has to be mapped
down into the 2^32 address space before it can go to memory.  Unlike some
architectures, with tag fields for caches and such, the Intel design does
not provide for anything else.  On the other hand, in a virtual memory
environment, the software can use those extra bits of information so the
virtual address space (on the very, very big swapping disk) can be IMMENSE.

> Brinkley Sprunt
> Electrical & Computer Engineering
> Carnegie Mellon University
> sprunt@maxwell.ece.cmu.edu

Charles Marslett
chasm@killer.dallas.tx.us

munck@linus.UUCP (Robert Munck) (06/08/89)

In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu> hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes:
>
>The 386 and 486 architectures claim a virtual address space size of
>2^46 bytes.  The virtual address is formed from a 14 bit selector
>and a 32 bit offset.  What does this really mean?  Currently we have
>two interpretations:
>
>	a) This scheme gives us 2^14 different ways to map into
>	   the same 32 bit address space.
>
>	b) This scheme gives us 2^14 independent address spaces
>	   of 2^32 bytes each.  In other words, the virtual memory
>	   scheme consists of as many as 2^14 segments of 2^32
>	   bytes each.
>

No, both are wrong, though the second sentence of (b) is true.  Each
task has a (single) *two-dimensional* address space of 2^14 segments,
each of which can be as large as 2^32 bytes (2^13 segments shared with
all other tasks - Global Descriptor Table, 2^13 potentially private -
Local Descriptor Table).
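
As a small illustration of where the 14 bits come from: a selector is 16
bits wide, of which 13 are a table index and 1 picks GDT or LDT; the
remaining 2 are the requested privilege level and do not select a
segment.  The function name decode_selector below is made up for this
sketch.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    selector &= 0xFFFF
    index = selector >> 3            # bits 15..3: entry number in the table
    use_ldt = bool(selector & 0x4)   # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 0x3             # bits 1..0: requested privilege level
    return index, use_ldt, rpl


SEGMENTS_PER_TABLE = 1 << 13              # 13-bit index
TOTAL_SEGMENTS = 2 * SEGMENTS_PER_TABLE   # GDT plus LDT gives 2^14

print(decode_selector(0x002F))                # (5, True, 3)
print(TOTAL_SEGMENTS * (1 << 32) == 1 << 46)  # True
```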

Also, (a) is partially true.  All segments currently in use (i.e. with
their descriptors in segment registers or flagged "Present") must map
into a single, potentially private "linear address space" of 2^32 bytes. 
To allow use of the whole 2^46 address space, the OS must field segment
exceptions and manipulate the task's page directory and page tables to
change the mapping of the linear address space.

In addition to the global 2^13 segment descriptors, tasks can share
directories (second-level page tables) or individual page tables or
individual pages.  Segments can overlap each other in the linear address
space in arbitrary ways.  The number of design alternatives is immense.

For example, in the OS I'm building, I define a "process" as a set of
tasks sharing a single directory (therefore sharing a single linear
address space) and sharing a single LDT.  1024 of the LDT entries are
mapped 1:1 to entries in the directory and to individual page tables,
therefore defining 1024 4Mbyte segments.  Each of these can be connected
to a disk file of 1..1024 page images.  To run a program, connect one of
the segments to a file of executable code and set the CS (Code Segment)
register to it and the IP (Instruction Pointer) register to the offset
of the entry point.  Instruction fetch will cause a page fault, the OS
brings in the page, and the program is off and running.  To do "disk
I/O," connect the data file to another segment and access it with move
instructions.  (Consequences: files are 4Mbytes max and a program can
have no more than 1023 of them in use at once; not serious limits.)
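
To make the 1:1 layout concrete, here is a minimal sketch in Python.  It
assumes LDT slot i is given linear base i * 4 Mbytes so that it lines up
with directory entry i; the exact layout is not spelled out above, and
segment_base() and locate() are made-up helper names.

```python
PAGE_SIZE = 4096                            # 386 page size in bytes
PAGES_PER_TABLE = 1024                      # entries in one page table
SEGMENT_SIZE = PAGE_SIZE * PAGES_PER_TABLE  # 4 Mbytes per directory entry


def segment_base(slot):
    """Linear base of segment `slot` when LDT slot i is tied to directory entry i."""
    if not 0 <= slot < 1024:
        raise ValueError("slot must be in 0..1023")
    return slot * SEGMENT_SIZE


def locate(slot, offset):
    """Return (directory index, page-table index, byte within page) for slot:offset."""
    if not 0 <= offset < SEGMENT_SIZE:
        raise ValueError("offset outside the 4 Mbyte segment")
    linear = segment_base(slot) + offset
    return linear >> 22, (linear >> 12) & 0x3FF, linear & 0xFFF


print(locate(7, 0x1234))  # (7, 1, 564)
```

Under this assumption the directory index always equals the slot number,
so a fault on an address identifies the segment (and hence the file)
directly, and the page-table index gives the page image within the file.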

(BTW, my OS implements B3 security, is written in Pascal, and will be
Public Domain when finished.)

                            -- Bob Munck, MITRE-Washington
                            -- munck@mitre.org, ...!linus!munck
                            -- 703/883-6688

sundar@mipos2.intel.com (Sundar Iyengar~) (06/12/89)

In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, Hugh Brinkley Sprunt writes:
> 
> The 386 and 486 architectures claim a virtual address space size of
> 2^46 bytes.  The virtual address is formed from a 14 bit selector
> and a 32 bit offset.  
> 	b) This scheme gives us 2^14 independent address spaces of 2^32 
>      bytes each.  In other words, the virtual memory scheme 
>      consists of as many as 2^14 segments of 2^32 bytes each.
> 
> If (b) is the answer, how does the processor communicate the 14 bits of 
> selector information to the memory mapping hardware/software?

b) is the correct answer.  Page 2-2 of the 80386 Programmer's Ref Manual
says this: "Applications programmers view the logical address space of
the 80386 as a collection of 16,383 one-dimensional subspaces, each
with a specified length [ranging] from one byte up to a maximum
of 2^32 bytes".

The 14 bit selector information is held in a segment register.  For code
segments, the segment register is CS.  For data segments DS, ES, FS and GS
may contain the segment selector bits.  The stack segment selector is SS.

During code execution, the instructions are fetched from the code segment
selected by CS.  The data may come from any data segment selected by the
four data segment registers.  More than four data segments can be accessed
by appropriately loading the data segment registers with selector
information.

The default segment selection is:
   CS for code
   SS for stack
   DS for local data
   ES for string instruction destinations

Special instruction prefix elements may be used to override the default
segment selection.

So, to answer your question, "the processor [communicates] the
14 bits of selector information to the memory mapping hardware"
by looking up the corresponding selector register.
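
A rough model of the default selection and override rule, written in
Python.  The access-kind names and the segment_for() helper are invented
for this sketch, and it simplifies by treating only ordinary data
references as overridable.

```python
# Default segment register for each kind of memory reference.
DEFAULT_SEGMENT = {
    "code": "CS",
    "stack": "SS",
    "data": "DS",
    "string_dest": "ES",
}

# In this simplified model only ordinary data references honour a prefix.
OVERRIDABLE = {"data"}


def segment_for(access, override=None):
    """Pick the segment register used for a reference of the given kind."""
    if override is not None and access in OVERRIDABLE:
        return override
    return DEFAULT_SEGMENT[access]


print(segment_for("data"))               # DS
print(segment_for("data", "FS"))         # FS
print(segment_for("string_dest", "FS"))  # ES
```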

Sundar Iyengar                      Microprocessor Design

UUCP:  intelca!mipos3!mipos2!sundar Intel, SC4-59
ARPA:  sundar@mipos2.intel.com      2625, Walsh Avenue
CSNET: sundar@mipos2.intel.com      Santa Clara, CA 95051
AT&T:  O: (408) 765-5206

johnl@ima.ima.isc.com (John R. Levine) (06/12/89)

In article <243@mipos3.intel.com> sundar@mipos2.intel.com (Sundar Iyengar~) writes:
>Page 2-2 of the 80386 Programmer's Ref Manual
>says this: "Applications programmers view the logical address space of
>the 80386 as a collection of 16,383 one-dimensional subspaces, each
>with a specified length [ranging] from one byte up to a maximum
>of 2^32 bytes".

Unfortunately, since all 2^46 addresses are mapped through a page table with
only room for 2^32 addresses, the 386's segmentation is considerably less
useful than it might otherwise be.  If, for example, you did the obvious
unix trick of putting your static data and heap at the low end of a segment
where it can grow up, and your stack at the high end of the segment where it
can grow down, you have to map all 2^32 addresses in that segment into the
page table, but that doesn't leave room for anything else.  Oops.
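
The arithmetic behind this can be sketched in a few lines of Python.
The helper name directory_entries_spanned() is made up; the constants
are the 386's 4 Kbyte pages and 1024-entry directory and page tables.

```python
DIRECTORY_ENTRIES = 1024
TABLE_ENTRIES = 1024
PAGE_SIZE = 4096

# One directory of page tables covers exactly the 32-bit linear space.
assert DIRECTORY_ENTRIES * TABLE_ENTRIES * PAGE_SIZE == 2**32


def directory_entries_spanned(base, size):
    """Number of page-directory slots claimed by a segment at [base, base + size)."""
    first = base >> 22
    last = (base + size - 1) >> 22
    return last - first + 1


print(directory_entries_spanned(0, 2**32))        # 1024
print(directory_entries_spanned(0, 64 * 2**20))   # 16
```

A segment with heap at offset 0 and stack at offset 2^32 - 1 spans all
1024 directory slots even if only a handful of its pages are present,
which is the "no room for anything else" problem.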

A single process' address space is limited to 2^32 simultaneously mapped
addresses. In theory you could have a larger total address space and play
games with mapping segments in and out, but that puts strange limits on
applications as to which segments they can reference at the same time, so in
reality the limit translates to 2^32 total addresses per process. Unix systems
map all segments to the same place and just use the paging.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869
{ bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something
Massachusetts has 64 licensed drivers who are over 100 years old.  -The Globe

davidsen@sungod.crd.ge.com (William Davidsen) (06/13/89)

From your limited description I suspect that you looked at the Multics
filesystem during the design phase.
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

munck@linus.UUCP (Robert Munck) (06/13/89)

In article <4056@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
>Unfortunately, ... the 386's segmentation is considerably less
>useful...  If, for example, you did the obvious unix trick of putting your 
> static data and heap at the low end of a (2^32-byte) segment
>and your stack at the high end ... 
>you have to map all 2^32 addresses in that segment into the
>page table, but that doesn't leave room for anything else.  Oops.
>
Boy, there's a limitation!  You can't separate your heap and stack by
the full four gigabytes!!  I suppose that if you could only leave two
gigabytes between them, there wouldn't be enough room for them to grow.

>A single process' address space is limited to 2^32 simultaneously mapped
>addresses. 
>
Not true.  The address space is limited to 2^14 segments, each of up to
2^32 bytes.  That's not quite what you said with "simultaneously mapped."

>In theory you could have a larger total address space and play
>games with mapping segments in and out, but that puts strange limits on
>applications as to which segments they can reference at the same time, so in
>reality the limit translates to 2^32 total addresses per process. 
>
The "games" are no stranger than mapping virtual pages into and out of
real memory; in fact, the two are done exactly the same way.  Your "in
reality" should read "when you don't need to or want to use the chip's
segmentation facilities."  Note that there are NO limitations
on the applications; all this is invisible to them.

>Unix systems map all segments to the same place and just use the paging.
>
I'm happy for you that UNIX is sufficiently primitive that you can leave
such a powerful facility unused, although I'm not sure you can do a
secure UNIX with that limitation.  I suggest you look into doing the coding
to use full segmentation; it'll make the higher-level code much simpler.

                          -- Bob Munck, MITRE Corporation
                          -- munck@mitre.org, [backbone]!linus!munck

bfranke@peun39.UUCP (06/13/89)

/* Written  9:36 pm  Jun  6, 1989 by hs0l+@andrew.cmu.edu.UUCP in peun39:comp.arch */
/* ---------- "386/486 Virtual Memory Question..." ---------- */

/* End of text from peun39:comp.arch */

munck@linus.UUCP (Robert Munck) (06/13/89)

In article <734@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>From your limited description I suspect that you looked at the Multics
>filesystem during the design phase.

Why, yes, I did look at the Multics filesystem while it was being designed.
I was in the same building (545 Tech Sq) working on CP-67/CMS (now VM/370).
R. Kogut and I wrote a version of CP that used virtual memory to implement
CP minidisks that was in production use at Brown well into the 1980's.
                                -- Bob Munck, MITRE

boyne@hplvli.HP.COM (Art Boyne) (06/13/89)

munck@linus.UUCP (Robert Munck) writes:
> (Consequences: files are 4Mbytes max and a program can
> have no more than 1023 of them in use at once; not serious limits.)

Actually, for some of us who do simulations, the 4 Mbyte file limit
is a serious limitation.  I have seen a simulation whose output file
was 100 Mbyte (it took 3 days to run on a minicomputer).  I have also
seen CAD/CAM design files that frequently go 6-8 Mbyte each.

Art Boyne, boyne@hplvla.hp.com

stuart@bms-at.UUCP (Stuart Gathman) (06/14/89)

The 386 has one page register that points to tables mapping 2^32 bytes.
In addition, there are up to 2^14 entries in a segment table.  Each
segment has a 32 bit offset and size.  (The size is actually less than
32 bits, it uses kind of floating point with 1 bit exponents.)
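
The "floating point" size can be illustrated with a short Python sketch:
the descriptor holds a 20-bit limit plus a granularity bit that selects
byte or 4 Kbyte units.  The function name segment_size() is made up, and
the sketch covers ordinary (expand-up) segments only.

```python
def segment_size(limit20, granular):
    """Size in bytes implied by a 20-bit limit field and the granularity bit."""
    if not 0 <= limit20 < (1 << 20):
        raise ValueError("limit field is 20 bits")
    if granular:
        # Limit counts 4 Kbyte pages; the low 12 bits of the last offset are ones.
        last_offset = (limit20 << 12) | 0xFFF
    else:
        # Limit counts bytes.
        last_offset = limit20
    return last_offset + 1


print(segment_size(0xFFFFF, False))  # 1048576
print(segment_size(0xFFFFF, True))   # 4294967296
```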

The logical address space is therefore 2^46.

	However,

Changing the page register causes all kinds of TLB misses and extra memory
reads to load needed pieces of the new page tables.  This is very inefficient.

The most efficient way to use both segments and paging is to map the
segments onto a single 2^32 paged space and never change the page register.
The paged space should be treated as a scarce resource and only the most
recently used segments mapped to it.  Stale segments are unmapped and the
segment entry marked not present.  On a segment fault, the oldest segment
is mapped out, and the faulting segment mapped back in.  Since the processor
can have 6 segment registers loaded simultaneously, you have to guarantee
that all 6 will fit into 2^32 bytes.  (Segment faults are detected when
loading, not when referencing segment registers.)  The simplest way to do
this is to limit the size of a single segment to 2^32/6.  This gives you
a virtual space of 2^46/6 bytes.  A max segment size of 2^32/8 has certain
advantages, so I would make the virtual space 2^43 bytes.

	Therefore, the maximum address space actually usable is somewhat
smaller than claimed by Intel, but still quite large.  I can see using
2^45 bytes with severe restrictions on application code.  (Only two
segments are valid at any one time.  Don't mess with the code segment.)
I can't see any reasonable way to use 2^46.
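
The figures above follow from simple arithmetic, sketched here in Python
under the assumption that N segments must be resident at once and so
each is capped at 2^32/N bytes.  virtual_space() is a made-up helper
name.

```python
from fractions import Fraction

SEGMENTS = 2**14   # selectable segments (GDT plus LDT)
LINEAR = 2**32     # bytes of linear address space


def virtual_space(max_resident):
    """Total bytes addressable if `max_resident` segments must fit in
    the linear space at the same time."""
    return SEGMENTS * Fraction(LINEAR, max_resident)


print(virtual_space(6) == Fraction(2**46, 6))  # True
print(virtual_space(8) == 2**43)               # True
print(virtual_space(2) == 2**45)               # True
print(virtual_space(1) == 2**46)               # True
```

The last line is the case with no reasonable use: reaching the full 2^46
would mean only one segment could be mapped at any moment.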

AT&T UNIX doesn't provide for two level VM without some hacking,
so *nix implementations simply ignore the segments.
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|daitc}!bms-at!stuart>

johnl@ima.ima.isc.com (John R. Levine) (06/15/89)

In article <55957@linus.UUCP> you write:
>In article <4056@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
>>Unfortunately, ... the 386's segmentation is considerably less
>>useful...  
>...  I suppose that if you could only leave two
>gigabytes between them, there wouldn't be enough room for them to grow.

Of course you can do that, and indeed that's exactly what real systems do,
but it's a hack forced on you by the poor implementation of the chip's
addressing architecture.

>>In theory you could have a larger total address space and play
>>games with mapping segments in and out, but that puts strange limits on
>>applications as to which segments they can reference at the same time, so in
>>reality the limit translates to 2^32 total addresses per process. 
>>
>The "games" are no stranger than mapping virtual pages into and out of
>real memory; in fact, the two are done exactly the same way.  Your "in
>reality" should read "when you don't need to or want to use the chip's
>segmentation facilities."  Note that there are NO limitations
>on the applications; all this is invisible to them.

Actually, the games are considerably stranger than mapping virtual pages in
and out, and there are real limitations. When you map pages, every page is the
same size and is tiny compared to the size of the address space. When you map
segments, they're all different sizes and each one is potentially as big as
the full address space. Assume you have two sparse segments, each slightly
bigger than 2GB, and you do a MOVS or something from one to the other. The
system has to map in both segments at the same time, but it can't because the
linear address space isn't big enough. This is a real problem, and the only
way I know of to ensure that programs don't die from address space starvation
is to limit the total addressable space to less than 4GB. As noted by another
poster, the cost of mapping segments is also very high -- any time you change
a page table you have to flush the entire TLB.
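
The two-segment case can be checked with a small Python sketch.  It
assumes each segment reserves a contiguous run of linear addresses equal
to its full size, however sparsely it is populated, and that segments
are placed back to back without overlap.  place() is a made-up helper
name.

```python
LINEAR_SPACE = 2**32
GB = 2**30


def place(segments):
    """Assign non-overlapping linear bases to segments (name -> size), in order.

    Returns a dict of bases, or None if they cannot all be resident at once.
    """
    bases, next_free = {}, 0
    for name, size in segments.items():
        if next_free + size > LINEAR_SPACE:
            return None
        bases[name] = next_free
        next_free += size
    return bases


print(place({"src": 2 * GB + 4096, "dst": 2 * GB + 4096}))  # None
print(place({"src": GB, "dst": 2 * GB}))  # {'src': 0, 'dst': 1073741824}
```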

Had Intel made the page tables per-segment rather than per-process your
comments would be true, but for some reason, they didn't. I suspect that
paging was shoved into the 386 rather late in the design.

>>Unix systems map all segments to the same place and just use the paging.
>>
>I'm happy for you that UNIX is sufficiently primitive that you can leave
>such a powerful facility unused, although I'm not sure you can do a
>secure UNIX with that limitation.  I suggest you look into doing the coding
>to use full segmentation; it'll make the higher-level code much simpler.

Lots of us have looked into segmented Unix, and 286 Unix implementations work
that way. It's not very satisfactory. One problem is that there seems to be no
programming language that matches the 386's 48-bit pointer semantics very
well. People have added near and far pointers to C, but it is a widely reviled
hack and in practice makes code more complicated, not simpler. The other is
that there is a dreadful performance hit from using segments; dereferencing an
in-segment pointer takes 4 cycles while dereferencing a 48-bit pointer takes
22 or 25.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869
{ bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something
Massachusetts has 64 licensed drivers who are over 100 years old.  -The Globe

sbf10@uts.amdahl.com (Samuel Fuller) (06/16/89)

In article <48000001@peun39> bfranke@peun39.UUCP writes:
>
>/* Written  9:36 pm  Jun  6, 1989 by hs0l+@andrew.cmu.edu.UUCP in peun39:comp.arch */
>/* ---------- "386/486 Virtual Memory Question..." ---------- */
>
>The 386 and 486 architectures claim a virtual address space size of
>2^46 bytes.  The virtual address is formed from a 14 bit selector
>and a 32 bit offset.  What does this really mean?  Currently we have
>two interpretations:
>
>	a) This scheme gives us 2^14 different ways to map into
>	   the same 32 bit address space.
>
>	b) This scheme gives us 2^14 independent address spaces
>	   of 2^32 bytes each.  In other words, the virtual memory
>	   scheme consists of as many as 2^14 segments of 2^32
>	   bytes each.
>
>Which one is the correct interpretation?  If (a) is the answer, why was
>it done this way?  If (b) is the answer, how does the processor
>communicate the 14 bits of selector information to the memory mapping
>hardware/software?
>
>I think (b) is the correct answer, but after spending a few minutes with
>some Intel literature, I'm more confused than I was when I started.
>Any comments would be appreciated.  Thanks.
>

If they did it right the correct answer should be (a) or (b).  You want
(a) because it allows different processes operating in different address
spaces to still share data.  The kernel can be shared in this way.  (B)
is usually handled by tagging the page table entries in the TLB (MMU) with
the Address Space ID.  The real page number should be a function of both
the ASID and the Virtual Page number.  If the page translator is incapable
of seeing the ASID then it is useless except for flushing virtually
addressed caches.
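
A toy model of the tagging idea in Python, assuming a dictionary keyed
by (ASID, virtual page number).  The class and method names are invented
for illustration and do not describe the 386's TLB.

```python
class TaggedTLB:
    """Toy TLB whose entries are keyed by (ASID, virtual page number)."""

    def __init__(self):
        self.entries = {}

    def insert(self, asid, vpn, frame):
        """Record that page `vpn` of address space `asid` lives in `frame`."""
        self.entries[(asid, vpn)] = frame

    def lookup(self, asid, vpn):
        """Return the real page frame, or None on a miss."""
        return self.entries.get((asid, vpn))


tlb = TaggedTLB()
tlb.insert(asid=1, vpn=0x10, frame=0x200)
tlb.insert(asid=2, vpn=0x10, frame=0x300)
print(tlb.lookup(1, 0x10), tlb.lookup(2, 0x10), tlb.lookup(3, 0x10))  # 512 768 None
```

Because the same virtual page number resolves differently per ASID,
switching address spaces in this model needs no flush; sharing is
obtained by inserting the same frame under two ASIDs.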

>Brinkley Sprunt
>Electrical & Computer Engineering
>Carnegie Mellon University
>sprunt@maxwell.ece.cmu.edu

Sam Fuller / Amdahl System Performance Architecture
amdahl!sbf10

munck@linus.UUCP (Robert Munck) (07/22/89)

In article <310003@hplvli.HP.COM> boyne@hplvli.HP.COM (Art Boyne) writes:
>Actually, for ... simulations, the 4 Mbyte file limit is serious.  I have
>seen a simulation whose output file was 100 MByte... also 
>CAD/CAM design files that frequently go 6-8 Mbyte each.
>

Sure, no matter what upper limit you choose, there'll be users who
need it to be higher.  The question is, can the job be done within
the limits?  For example, the simulation output could be directed
to a tape or into a sequence of files, since it's (probably)
sequential.  The CAD/CAM file is more of a problem, as it's probably
random access.  It happens that the higher-level OS that is to
be implemented on my secure kernel has an entity-relationship-attribute
file system that can put multiple tiny files in one of my 4K byte
minimum segments or build structures of larger files.  The CAD/CAM
file is probably a data structure that could be spread over many
nodes in the ERA structure.  In other words, the CAD/CAM system writers
could construct their data structure in the ERA filespace rather than
in a single linear file.

The idea that the file system be used to construct arbitrarily-complex
applications data structures is relatively new in the world, as is having
the components of such a file system be strongly typed, with full 
inheritance.
                                -- Bob Munck

munck@linus.UUCP (Robert Munck) (07/22/89)

In article <4067@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
>In article <55957@linus.UUCP> you write:
>>...  I suppose that if you could only leave two
>>gigabytes between them, there wouldn't be enough room for them to grow.
>
>Of course you can do that, and indeed that's exactly what real systems do,
>but it's a hack forced on you by the poor implementation of the chip's
>addressing architecture.

The sarcasm of "only two gigabytes" seems to have been missed.  Given
that there has to be an upper limit, it seems to me that one three
orders of magnitude higher than the largest code/stack that 99.9% of
applications currently use is "big enough."  I'm sure there are
pathological applications that need more.

In fact, since the 386 implements "grow-down" segments intended for
things like stacks, the cleanest implementation will give code and stack
separate segments.

>>>In theory you could have a larger total address space and play
>>>games with mapping segments in and out. 
>>>
>>The "games" are no stranger than mapping virtual pages into and out of
>>real memory; in fact, the two are done exactly the same way..
>
>Actually, the games are considerably stranger than mapping virtual pages in
>and out, and there are real limitations. When you map pages, every page is the
>same size and is tiny compared to the size of the address space. When you map
>segments, they're all different sizes and each one is potentially as big as
>the full address space.

Sure, but you still have page mapping to help.  By changing the mapping,
you can move segments around in the linear address space very easily,
and therefore it's easy to make them "fit."

>Assume you have two sparse segments, each slightly
>bigger than 2GB, and you do a MOVS or something from one to the other. The
>system has to map in both segments at the same time, but it can't because the
>linear address space isn't big enough. This is a real problem.

It's a real problem only if you and a lot of other people are constantly
in need of segments on the order of a gigabyte, and the nature of your
applications is such that there's no way to use multiple smaller
segments.  As I said, there has to be an upper limit.  Criticism of Intel
is valid only if that upper limit is demonstrably too small.

>As noted by another poster, the cost of mapping segments is also very
>high -- any time you change a page table you have to flush the entire TLB.

In my experience, not a problem.  The cost is a handful of microseconds
to reload the TLB -- having to map through the directory and page tables
a couple of dozen times, maybe as much as ten microseconds.  This is
several orders of magnitude less than other costs of a virtual memory
such as page read time.
                        -- Bob Munck, MITRE Corporation