hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) (06/07/89)
The 386 and 486 architectures claim a virtual address space size of 2^46 bytes. The virtual address is formed from a 14 bit selector and a 32 bit offset. What does this really mean? Currently we have two interpretations:

  a) This scheme gives us 2^14 different ways to map into the same 32 bit address space.

  b) This scheme gives us 2^14 independent address spaces of 2^32 bytes each. In other words, the virtual memory scheme consists of as many as 2^14 segments of 2^32 bytes each.

Which one is the correct interpretation? If (a) is the answer, why was it done this way? If (b) is the answer, how does the processor communicate the 14 bits of selector information to the memory mapping hardware/software?

I think (b) is the correct answer, but after spending a few minutes with some Intel literature, I'm more confused than I was when I started. Any comments would be appreciated. Thanks.

Brinkley Sprunt
Electrical & Computer Engineering
Carnegie Mellon University
sprunt@maxwell.ece.cmu.edu
cliff@ficc.uu.net (cliff click) (06/07/89)
In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes:
>
> The 386 and 486 architectures claim a virtual address space size of
> 2^46 bytes. The virtual address is formed from a 14 bit selector
> and a 32 bit offset.
> b) This scheme gives us 2^14 independent address spaces of 2^32
> bytes each. In other words, the virtual memory scheme
> consists of as many as 2^14 segments of 2^32 bytes each.
>
> If (b) is the answer, how does the processor communicate the 14 bits of
> selector information to the memory mapping hardware/software?

The standard addressing hardware looks the selector up in a local (per process) or global selector table. In this table each selector has a base physical address, a size, and some privilege bits (r/w/x). This info is cached on-chip; when you load a new selector it all gets loaded. When you get a page fault or other interrupt the faulting selector is made available to the interrupt task (it's on the stack or something).

Maybe some Intel guru should fill in the details...

--
Cliff Click, Software Contractor at Large
Business: uunet.uu.net!ficc!cliff, cliff@ficc.uu.net, +1 713 274 5368 (w).
Disclaimer: lost in the vortices of nilspace... +1 713 568 3460 (h).
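As a rough illustration of the table lookup described above: a 16-bit selector value carries a 13-bit table index, one bit choosing between the global and the local table, and a 2-bit requested privilege level, which is where the "14 bits" of addressing information come from. The sketch below only decodes those fields; the names `Selector`, `decode_selector` and `descriptor_offset` are made up for this example, and the descriptor contents themselves are not modelled.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int   # 13-bit index into the descriptor table
    local: bool  # table indicator: False = global table, True = local table
    rpl: int     # 2-bit requested privilege level


def decode_selector(raw: int) -> Selector:
    """Split a 16-bit selector value into its three fields."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return Selector(index=raw >> 3, local=bool(raw & 0x4), rpl=raw & 0x3)


def descriptor_offset(sel: Selector) -> int:
    """Byte offset of the selected 8-byte descriptor within its table."""
    return sel.index * 8
```

For instance, `decode_selector(0x000F)` picks entry 1 of the local table with requested privilege level 3.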
chasm@killer.DALLAS.TX.US (Charles Marslett) (06/08/89)
In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes: > > The 386 and 486 architectures claim a virtual address space size of > 2^46 bytes. The virtual address is formed from a 14 bit selector > and a 32 bit offset. What does this really mean? . . . [stuff omitted] > a) This scheme gives us 2^14 different ways to map into > the same 32 bit address space. This is it: the chip has 32 address pins so everything has to be mapped down into the 2^32 address space before it can go to memory. Unlike some architectures, with tag fields for caches and such, the Intel design does not provide for anything else. On the other hand, in a virtual memory environment, the software can use those extra bits of information so the virtual address space (on the very, very big swapping disk) can be IMMENSE. > Brinkley Sprunt > Elecetical & Computer Engineering > Carnegie Mellon University > sprunt@maxwell.ece.cmu.edu Charles Marslett chasm@killer.dallas.tx.us
munck@linus.UUCP (Robert Munck) (06/08/89)
In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu> hs0l+@andrew.cmu.edu (Hugh Brinkley Sprunt) writes:
>
>The 386 and 486 architectures claim a virtual address space size of
>2^46 bytes. The virtual address is formed from a 14 bit selector
>and a 32 bit offset. What does this really mean? Currently we have
>two interpretations:
>
> a) This scheme gives us 2^14 different ways to map into
> the same 32 bit address space.
>
> b) This scheme gives us 2^14 independent address spaces
> of 2^32 bytes each. In other words, the virtual memory
> scheme consists of as many as 2^14 segments of 2^32
> bytes each.
>

No, both are wrong, though the second sentence of (b) is true. Each task has a (single) *two-dimensional* address space of 2^14 segments, each of which can be as large as 2^32 bytes (2^13 segments shared with all other tasks - Global Descriptor Table, 2^13 potentially private - Local Descriptor Table).

Also, (a) is partially true. All segments currently in use (i.e., with their descriptors in segment registers or flagged "Present") must map into a single, potentially private "linear address space" of 2^32 bytes. To allow use of the whole 2^46 address space, the OS must field segment exceptions and manipulate the task's page directory and page tables to change the mapping of the linear address space.

In addition to the global 2^13 segment descriptors, tasks can share directories (second-level page tables) or individual page tables or individual pages. Segments can overlap each other in the linear address space in arbitrary ways. The number of design alternatives is immense.

For example, in the OS I'm building, I define a "process" as a set of tasks sharing a single directory (therefore sharing a single linear address space) and sharing a single LDT. 1024 of the LDT entries are mapped 1:1 to entries in the directory and to individual page tables, therefore defining 1024 4Mbyte segments. Each of these can be connected to a disk file of 1..1024 page images.
To run a program, connect one of the segments to a file of executable code and set the CS (Code Segment) register to it and the IP (Instruction Pointer) register to the offset of the entry point. Instruction fetch will cause a page fault, the OS brings in the page, and the program is off and running. To do "disk I/O," connect the data file to another segment and access it with move instructions. (Consequences: files are 4Mbytes max and a program can have no more than 1023 of them in use at once; not serious limits.) (BTW, my OS implements B3 security, is written in Pascal, and will be Public Domain when finished.) -- Bob Munck, MITRE-Washington -- munck@mitre.org, ...!linus!munck -- 703/883-6688
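The 1:1 mapping described in this post can be sketched as simple address arithmetic. This is only an illustration under stated assumptions: segment number i is taken to start at linear address i * 4 Mbytes, so that it lines up with page directory entry i, and the function names are invented for the example rather than taken from the OS being described.

```python
PAGE_SIZE = 4096                             # bytes per page
PAGES_PER_TABLE = 1024                       # entries in one page table
SEGMENT_SIZE = PAGE_SIZE * PAGES_PER_TABLE   # 4 Mbytes per segment


def linear_address(segment: int, offset: int) -> int:
    """Linear address of `offset` inside the 4 Mbyte segment `segment`."""
    if not 0 <= segment < 1024:
        raise ValueError("only 1024 segments exist in this scheme")
    if not 0 <= offset < SEGMENT_SIZE:
        raise ValueError("offset beyond the 4 Mbyte segment limit")
    return segment * SEGMENT_SIZE + offset


def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (directory index, table index, byte offset)."""
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF
```

With this layout the directory index of any address inside a segment equals the segment number, which is what lets one page table back one segment (and one file of up to 1024 page images).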
sundar@mipos2.intel.com (Sundar Iyengar~) (06/12/89)
In article <MYX2SzW00XoF42d1lf@andrew.cmu.edu>, Hugh Brinkley Sprunt writes:
>
> The 386 and 486 architectures claim a virtual address space size of
> 2^46 bytes. The virtual address is formed from a 14 bit selector
> and a 32 bit offset.
> b) This scheme gives us 2^14 independent address spaces of 2^32
> bytes each. In other words, the virtual memory scheme
> consists of as many as 2^14 segments of 2^32 bytes each.
>
> If (b) is the answer, how does the processor communicate the 14 bits of
> selector information to the memory mapping hardware/software?

b) is the correct answer. Page 2-2 of the 80386 Programmer's Ref Manual says this: "Applications programmers view the logical address space of the 80386 as a collection of 16,383 one-dimensional subspaces, each with a specified length [ranging] from one byte up to a maximum of 2^32 bytes".

The 14 bit selector information is held in a segment register. For code segments, the segment register is CS. For data segments DS, ES, FS and GS may contain the segment selector bits. The stack segment selector is SS.

During code execution, the instructions are fetched from the code segment selected by CS. The data may come from any data segment selected by the four data segment registers. More than four data segments can be accessed by appropriately loading the data segment registers with selector information. The default segment selection is:

  CS for code
  SS for stack
  DS for local data
  ES for string instruction destinations

Special instruction prefix elements may be used to override the default segment selection.

So, to answer your question, "the processor [communicates] the 14 bits of selector information to the memory mapping hardware" by looking up the corresponding selector register.

Sundar Iyengar
Microprocessor Design
Intel, SC4-59
2625, Walsh Avenue
Santa Clara, CA 95051
UUCP: intelca!mipos3!mipos2!sundar
ARPA: sundar@mipos2.intel.com
CSNET: sundar@mipos2.intel.com
AT&T: O: (408) 765-5206
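The default segment selection and override described above can be pictured with a small model. It is a sketch under assumptions, not a description of the hardware: each segment register is represented directly by the base and limit of the descriptor it currently selects, the access kinds `"code"`, `"stack"`, `"data"` and `"string_dest"` are labels invented here, and the model does not attempt to capture which accesses actually permit an override prefix.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class SegmentLimitFault(Exception):
    """Stands in for the fault raised when an offset is outside the segment."""


@dataclass(frozen=True)
class Descriptor:
    base: int   # linear address where the segment starts
    limit: int  # highest valid offset within the segment


# Default segment register for each kind of access, as listed in the post.
DEFAULT_REGISTER = {
    "code": "CS",
    "stack": "SS",
    "data": "DS",
    "string_dest": "ES",
}


def to_linear(registers: Dict[str, Descriptor], access: str, offset: int,
              override: Optional[str] = None) -> int:
    """Translate (segment register, offset) into a 32-bit linear address."""
    name = override or DEFAULT_REGISTER[access]
    desc = registers[name]
    if offset > desc.limit:
        raise SegmentLimitFault(f"offset {offset:#x} exceeds the limit of {name}")
    return (desc.base + offset) & 0xFFFFFFFF  # linear addresses are 32 bits wide
```

A data access with no override goes through whatever DS selects; passing `override="ES"` models a segment override prefix on the same instruction.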
johnl@ima.ima.isc.com (John R. Levine) (06/12/89)
In article <243@mipos3.intel.com> sundar@mipos2.intel.com (Sundar Iyengar~) writes: >Page 2-2 of the 80386 Programmer's Ref Manual >says this: "Applications programmers view the logical address space of >the 80386 as a collection of 16,383 one-dimensional subspaces, each >with a specified length [ranging] from one byte upto a maximum >of 2^32 bytes". Unfortunately, since all 2^46 addresses are mapped through a page table with only room for 2^32 addresses, the 386's segmentation is considerably less useful than it might otherwise be. If, for example, you did the obvious unix trick of putting your static data and heap at the low end of a segment where it can grow up, and your stack at the high end of the segment where it can grow down, you have to map all 2^32 addresses in that segment into the page table, but that doesn't leave room for anything else. Oops. A single process' address space is limited to 2^32 simultaneously mapped addresses. In theory you could have a larger total address space and play games with mapping segments in and out, but that puts strange limits on applications as to which segments they can reference at the same time, so in reality the limit translates to 2^32 total addresses per process. Unix systems map all segments to the same place and just use the paging. -- John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869 { bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something Massachusetts has 64 licensed drivers who are over 100 years old. -The Globe
davidsen@sungod.crd.ge.com (William Davidsen) (06/13/89)
From your limited description I suspect that you looked at the Multics filesystem during the design phase. bill davidsen (davidsen@crdos1.crd.GE.COM) {uunet | philabs}!crdgw1!crdos1!davidsen "Stupidity, like virtue, is its own reward" -me
munck@linus.UUCP (Robert Munck) (06/13/89)
In article <4056@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes: >Unfortunately, ... the 386's segmentation is considerably less >useful... If, for example, you did the obvious unix trick of putting your > static data and heap at the low end of a (2^32-byte) segment >and your stack at the high end ... >you have to map all 2^32 addresses in that segment into the >page table, but that doesn't leave room for anything else. Oops. > Boy, there's a limitation! You can't separate your heap and stack by the full four gigabytes!! I suppose that if you could only leave two gigabytes between them, there wouldn't be enough room for them to grow. >A single process' address space is limited to 2^32 simultaneously mapped >addresses. > Not true. The address space is limited to 2^14 segments, each of up to 2^32 bytes. That's not quite what you said with "simultaneously mapped." >In theory you could have a larger total address space and play >games with mapping segments in and out, but that puts strange limits on >applications as to which segments they can reference at the same time, so in >reality the limit translates to 2^32 total addresses per process. > The "games" are no stranger than mapping virtual pages into and out of real memory; in fact, the two are done exactly the same way. Your "in reality" should read "when you don't need to or want to use the chip's segmentation facilities." Note that there are NO limitations on the applications; all this is invisible to them. >Unix systems map all segments to the same place and just use the paging. > I'm happy for you that UNIX is sufficiently primitive that you can leave such a powerful facility unused, although I'm not sure you can do a secure UNIX with that limitation. I suggest you look into doing the coding to use full segmentation; it'll make the higher-level code much simpler. -- Bob Munck, MITRE Corporation -- munck@mitre.org, [backbone]!linus!munck
munck@linus.UUCP (Robert Munck) (06/13/89)
In article <734@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes: >From your limited description I suspect that you looked at the Multics >filesystem during the design phase. Why, yes, I did look at the Multics filesystem while it was being designed. I was in the same building (545 Tech Sq) working on CP-67/CMS (now VM/370). R. Kogut and I wrote a version of CP that used virtual memory to implement CP minidisks that was in production use at Brown well into the 1980's. -- Bob Munck, MITRE
boyne@hplvli.HP.COM (Art Boyne) (06/13/89)
munck@linus.UUCP (Robert Munck) writes:
> (Consequences: files are 4Mbytes max and a program can
> have no more than 1023 of them in use at once; not serious limits.)

Actually, for some of us who do simulations, the 4 Mbyte file limit is a serious limitation. I have seen a simulation whose output file was 100 Mbyte (it took 3 days to run on a minicomputer). I have also seen CAD/CAM design files that frequently go 6-8 Mbyte each.

Art Boyne, boyne@hplvla.hp.com
stuart@bms-at.UUCP (Stuart Gathman) (06/14/89)
The 386 has one page register that points to tables mapping 2^32 bytes. In addition, there are up to 2^14 entries in a segment table. Each segment has a 32 bit offset and size. (The size is actually less than 32 bits; it uses a kind of floating point with a 1 bit exponent.) The logical address space is therefore 2^46.

However, changing the page register causes all kinds of TLB misses and extra memory reads to load needed pieces of the new page tables. This is very inefficient. The most efficient way to use both segments and paging is to map the segments onto a single 2^32 paged space and never change the page register. The paged space should be treated as a scarce resource and only the most recently used segments mapped to it. Stale segments are unmapped and the segment entry marked not present. On a segment fault, the oldest segment is mapped out, and the faulting segment mapped back in.

Since the processor can have 6 segment registers loaded simultaneously, you have to guarantee that all 6 will fit into 2^32 bytes. (Segment faults are detected when loading, not when referencing segment registers.) The simplest way to do this is to limit the size of a single segment to 2^32/6. This gives you a virtual space of 2^46/6 bytes. A max segment size of 2^32/8 has certain advantages, so I would make the virtual space 2^43 bytes.

Therefore, the maximum address space actually usable is somewhat smaller than claimed by Intel, but still quite large. I can see using 2^45 bytes with severe restrictions on application code. (Only two segments are valid at any one time. Don't mess with the code segment.) I can't see any reasonable way to use 2^46.

AT&T UNIX doesn't provide for two level VM without some hacking, so *nix implementations simply ignore the segments.

--
Stuart D. Gathman <stuart@bms-at.uucp> <..!{vrdxhq|daitc}!bms-at!stuart>
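The sizes quoted in this post follow from multiplying the number of selectors by the largest segment the scheme allows. A small check of that arithmetic, assuming nothing beyond the figures given above (the function name is invented for the example):

```python
SELECTORS = 2**14      # segments addressable through the 14 selector bits
LINEAR_SPACE = 2**32   # bytes in the single paged (linear) space


def usable_virtual_space(max_segment_size: int) -> int:
    """Total bytes addressable if every segment may be up to max_segment_size."""
    return SELECTORS * max_segment_size


# Six loaded segment registers must fit at once: cap each segment at 2^32/6.
cap_for_six = LINEAR_SPACE // 6
# Rounding the cap down to a power of two gives 2^32/8 = 2^29 bytes.
cap_for_eight = LINEAR_SPACE // 8
# Allowing only two valid segments at a time raises the cap to 2^31 bytes.
cap_for_two = LINEAR_SPACE // 2

space_with_six = usable_virtual_space(cap_for_six)      # roughly 2^46 / 6
space_with_eight = usable_virtual_space(cap_for_eight)  # exactly 2^43
space_with_two = usable_virtual_space(cap_for_two)      # exactly 2^45
```

The 2^29 cap is the one that yields the 2^43 figure, and the two-segment restriction is what would be needed to reach 2^45.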
johnl@ima.ima.isc.com (John R. Levine) (06/15/89)
In article <55957@linus.UUCP> you write:
>In article <4056@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes:
>>Unfortunately, ... the 386's segmentation is considerably less
>>useful...
>... I suppose that if you could only leave two
>gigabytes between them, there wouldn't be enough room for them to grow.

Of course you can do that, and indeed that's exactly what real systems do, but it's a hack forced on you by the poor implementation of the chip's addressing architecture.

>>In theory you could have a larger total address space and play
>>games with mapping segments in and out, but that puts strange limits on
>>applications as to which segments they can reference at the same time, so in
>>reality the limit translates to 2^32 total addresses per process.
>>
>The "games" are no stranger than mapping virtual pages into and out of
>real memory; in fact, the two are done exactly the same way. Your "in
>reality" should read "when you don't need to or want to use the chip's
>segmentation facilities." Note that there are NO limitations
>on the applications; all this is invisible to them.

Actually, the games are considerably stranger than mapping virtual pages in and out, and there are real limitations. When you map pages, every page is the same size and is tiny compared to the size of the address space. When you map segments, they're all different sizes and each one is potentially as big as the full address space. Assume you have two sparse segments, each slightly bigger than 2GB, and you do a MOVS or something from one to the other. The system has to map in both segments at the same time, but it can't because the linear address space isn't big enough. This is a real problem, and the only way I know of to ensure that programs don't die from address space starvation is to limit the total addressable space to less than 4GB.

As noted by another poster, the cost of mapping segments is also very high -- any time you change a page table you have to flush the entire TLB.
Had Intel made the page tables per-segment rather than per-process your comments would be true, but for some reason, they didn't. I suspect that paging was shoved into the 386 rather late in the design. >>Unix systems map all segments to the same place and just use the paging. >> >I'm happy for you that UNIX is sufficiently primitive that you can leave >such a powerful facility unused, although I'm not sure you can do a >secure UNIX with that limitation. I suggest you look into doing the coding >to use full segmentation; it'll make the higher-level code much simpler. Lots of us have looked into segmented Unix, and 286 Unix implementations work that way. It's not very satisfactory. One problem is that there seems to be no programming language that matches the 386's 48-bit pointer semantics very well. People have added near and far pointers to C, but it is a widely reviled hack and in practice makes code more complicated, not simpler. The other is that there is a dreadful performance hit from using segments; dereferencing an in-segment pointer takes 4 cycles while dereferencing a 48-bit pointer takes 22 or 25. -- John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869 { bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something Massachusetts has 64 licensed drivers who are over 100 years old. -The Globe
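The two-segment MOVS case above comes down to whether the segments that must be live at the same moment fit side by side in one 2^32-byte linear space. A minimal sketch of that check, assuming each segment needs its own contiguous, non-overlapping range rounded up to whole 4K pages (the function names are invented for the example):

```python
LINEAR_SPACE = 2**32  # bytes in the single linear address space
PAGE_SIZE = 4096      # bytes per page


def pages_needed(size: int) -> int:
    """Number of whole pages required to cover `size` bytes."""
    return -(-size // PAGE_SIZE)  # ceiling division


def fit_together(segment_sizes) -> bool:
    """True if all the given segments can be mapped into linear space at once."""
    total = sum(pages_needed(size) * PAGE_SIZE for size in segment_sizes)
    return total <= LINEAR_SPACE
```

Two segments of exactly 2GB just fit; make each even one byte larger and the pair no longer does, however sparsely the segments are populated.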
sbf10@uts.amdahl.com (Samuel Fuller) (06/16/89)
In article <48000001@peun39> bfranke@peun39.UUCP writes: > >/* Written 9:36 pm Jun 6, 1989 by hs0l+@andrew.cmu.edu.UUCP in peun39:comp.arch */ >/* ---------- "386/486 Virtual Memory Question..." ---------- */ > >The 386 and 486 architectures claim a virtual address space size of >2^46 bytes. The virtual address is formed from a 14 bit selector >and a 32 bit offset. What does this really mean? Currently we have >two interpretations: > > a) This scheme gives us 2^14 different ways to map into > the same 32 bit address space. > > b) This scheme gives us 2^14 independent address spaces > of 2^32 bytes each. In other words, the virtual memory > scheme consists of as many as 2^14 segments of 2^32 > bytes each. > >Which one is the correct interpretation? If (a) is the answer, why was >it done this way? If (b) is the answer, how does the processor >communicate >the 14 bits of selector information to the memory mapping >hardware/software? > >I think (b) is the correct answer, but after spending a few minutes with >some Intel literature, I'm more confused than I was when I started. >Any comments would be appreciated. Thanks. > If they did it right the correct answer should be (a) or (b). You want (a) because it allows different processes operating in different address spaces to still share data. The kernel can be shared in this way. (B) is usually handled by tagging the page table entries in the TLB (MMU) with the Address Space ID. The real page number should be a function of both the ASID and the Virtual Page number. If the page translator is incapable of seeing the ASID then it is useless except for flushing virtually addressed caches. >Brinkley Sprunt >Elecetical & Computer Engineering >Carnegie Mellon University >sprunt@maxwell.ece.cmu.edu Sam Fuller / Amdahl System Performance Architecture amdahl!sbf10
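The tagging approach mentioned here, where translations are keyed by both an address space ID and a virtual page number, can be sketched in a few lines. This illustrates the general technique only and is not a model of the 386, whose TLB is not tagged this way; the class and method names are invented for the example.

```python
from typing import Dict, Optional, Tuple


class TaggedTLB:
    """Toy translation cache keyed by (address space ID, virtual page number)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, int], int] = {}

    def insert(self, asid: int, vpn: int, frame: int) -> None:
        """Record that page `vpn` of address space `asid` lives in `frame`."""
        self._entries[(asid, vpn)] = frame

    def lookup(self, asid: int, vpn: int) -> Optional[int]:
        """Return the cached frame, or None on a miss."""
        return self._entries.get((asid, vpn))

    def flush_asid(self, asid: int) -> None:
        """Drop only the entries belonging to one address space."""
        stale = [key for key in self._entries if key[0] == asid]
        for key in stale:
            del self._entries[key]
```

Because the ASID is part of the key, the same virtual page number can map to different real pages in different address spaces, and switching between them needs no flush.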
munck@linus.UUCP (Robert Munck) (07/22/89)
In article <310003@hplvli.HP.COM> boyne@hplvli.HP.COM (Art Boyne) writes:
>Actually, for ... simulations, the 4 Mbyte file limit is serious. I have
>seen a simulation whose output file was 100 MByte... also
>CAD/CAM design files that frequently go 6-8 Mbyte each.
>

Sure, no matter what upper limit you choose, there'll be users who need it to be higher. The question is, can the job be done within the limits? For example, the simulation output could be directed to a tape or into a sequence of files, since it's (probably) sequential. The CAD/CAM file is more of a problem, as it's probably random access.

It happens that the higher-level OS that is to be implemented on my secure kernel has an entity-relationship-attribute file system that can put multiple tiny files in one of my 4K byte minimum segments or build structures of larger files. The CAD/CAM file is probably a data structure that could be spread over many nodes in the ERA structure. In other words, the CAD/CAM system writers could construct their data structure in the ERA filespace rather than in a single linear file.

The idea that the file system be used to construct arbitrarily-complex applications data structures is relatively new in the world, as is having the components of such a file system be strongly typed, with full inheritance.

-- Bob Munck
munck@linus.UUCP (Robert Munck) (07/22/89)
In article <4067@ima.ima.isc.com> johnl@ima.UUCP (John R. Levine) writes: >In article <55957@linus.UUCP> you write: >>... I suppose that if you could only leave two >>gigabytes between them, there wouldn't be enough room for them to grow. > >Of course you can do that, and indeed that's exactly what real systems do, >but it's a hack forced on you by the poor implementation of the chip's >addressing architecture. The sarcasm of "only two gigabytes" seems to have been missed. Given that there has to be an upper limit, it seems to me that one three orders of magnitude higher than the largest code/stack that 99.9% of applications currently use is "big enough." I'm sure there are pathological applications that need more. In fact, since the 386 implements "grow-down" segments intended for things like stacks, the cleanest implementation will give code and stack separate segments. >>>In theory you could have a larger total address space and play >>>games with mapping segments in and out. >>> >>The "games" are no stranger than mapping virtual pages into and out of >>real memory; in fact, the two are done exactly the same way.. > >Actually, the games are considerably stranger than mapping virtual pages in >and out, and there are real limitations. When you map pages, every page is the >same size and is tiny compared to the size of the address space. When you map >segments, they're all different sizes and each one is potentially as big as >the full address space. Sure, but you still have page mapping to help. By changing the mapping, you can move segments around in the linear address space very easily, and therefore it's easy to make them "fit." >Assume you have two sparse segments, each slightly >bigger than 2GB, and you do a MOVS or something from one to the other. The >system has to map in both segments at the same time, but it can't because the >linear address space isn't big enough. This is a real problem. 
It's a real problem only if you and a lot of other people are constantly in need of segments on the order of a gigabyte, and the nature of your applications is such that there's no way to use multiple smaller segments. As I said, there has to be an upper limit. Criticism of Intel is valid only if that upper limit is demonstrably too small.

>As noted by another poster, the cost of mapping segments is also very
>high -- any time you change a page table you have to flush the entire TLB.

In my experience, not a problem. The cost is a handful of microseconds to reload the TLB -- having to map through the directory and page tables a couple of dozen times, maybe as much as ten microseconds. This is several orders of magnitude less than other costs of a virtual memory such as page read time.

-- Bob Munck, MITRE Corporation