rose@uw-beaver (Scott Rose) (12/06/85)
I have some questions about how virtual memory is implemented in the 80386;
the little "Introduction to the 80386" blurb that the local rep. mailed me
does not have much detail, and he doesn't seem to have the time required to
explain it to me on the phone.  But netters have all the time in the world,
and I am hoping that one of you will throw together a quick response for me.
Please flame directly to me and I promise not to summarize to the net.
Here is what I wanted to ask my overworked Intel rep.:

% How is the TLB organized?  Is it fully-associative, or direct mapped?

% Doesn't the TLB need to be flushed on a task switch?  How is this flushing
% specified?  Are there any instructions for overtly managing the TLB?

% On a page fault, how is the TLB entry, if any, updated for the replaced
% page?  The present bit must be cleared, and the dirty bit must be read.

% When the OS clears the accessed bits for an address space, does it
% update the TLB entries?

% Can one fault on a page table?  If not, how are these locked into memory?
% How can the processor be sure that a page table directory is present?

% What if IBM had...

-Scott Rose
rose@uw-beaver
jss@sjuvax.UUCP (J. Shapiro) (12/09/85)
I got a bit long-winded on this, and I am sorry, but it seems to me that
some readers of net.arch may be newcomers, and this may be helpful to them.

Since I have a fair amount of the docs, let me try to answer some of these.

% How is the TLB organized?  Is it fully-associative, or direct mapped?

The TLB is a "four-way set associative 32 entry page table cache" (p. 50,
80386 Data Book).  I suspect it uses the 32 most recently used addresses.

% Doesn't the TLB need to be flushed on a task switch?  How is this flushing
% specified?  Are there any instructions for overtly managing the TLB?

Yes, the TLB is process specific.  It therefore will need to be flushed when
tasks switch.  While it is not stated anywhere that I can find, the TLB would
seem to be flushed automatically as part of the execution of the CALL task
instruction.

% On a page fault, how is the TLB entry, if any, updated for the replaced
% page?  The present bit must be cleared, and the dirty bit must be read.

I think you may not have this right.  For a page to be in the TLB, it
must have been there at one point.  The question is whether or not the
invalidation of a page frame will cause appropriate invalidation in
the TLB.  The answer is yes, as the page is not given up voluntarily
by that process.  Some other process goes and invalidates the page,
and the consequent TLB flush will cause the TLB to be properly
maintained.

On the first read from a page, the accessed bit of the page table entry gets
set.  On the first write to that page, the dirty bit gets set.  Dirty and
accessed bits are also kept in the TLB, so these are the only accesses to
the translation tables that are necessary.

% When the OS clears the accessed bits for an address space, does it
% update the TLB entries?

No, the TLB entries for that address space have been flushed.  They will be
reestablished when the processor attempts to access the pages after
returning to this process.

% Can one fault on a page table?  If not, how are these locked into memory?
% How can the processor be sure that a page table directory is present?

One can fault on a page table, so long as the page handling code and the
page tables necessary to locate that code don't get swapped out.  This is a
general problem, and it is up to the pager to see to it that these things
stay around.  At the pager level, remember, you are looking in essence at
physical memory, and it is therefore possible to see to it that the wrong
things don't get swapped out by reserving locations for them and refusing to
invalidate those page frames.

Often enough, the pager will reserve different areas of memory for kernel
paging and the rest of the world paging.  This makes the maintenance easier.

If the translation of segments is clear to you, the underlying page
translation scheme is in essence identical to that used in the National
chips, except that a few extra bits are available due to the 4k page size
(instead of 512 bytes on NatSemi's).  These are made available to the user
as "user defined bits".

/*YOU MAY WANT TO "n" HERE - the remainder discusses virtual mem.
fundamentals */

Forgive me if I am presumptuous, but it seems to me that your questions are
most likely motivated by a lack of understanding of what happens in page
faulting.  The fault handler is not a part of the process in which the fault
occurs.

When a table lookup occurs, the running process won't be kicked out (except
perhaps by something unrelated to paging), but will take a bit longer to
fetch the translation.  This translation is then added to the TLB, and the
accessed bits are set.  Note that a read access automatically causes the
accessed bits to be set in both places.  Access is implicit in the fact that
you had to do the translation.  If a write access occurs, the mmu (in this
case) does another memory access to the page table entry to set the dirty
bit.

A fault occurs when it is discovered that the desired page isn't present in
memory.  In this event, the pager process is invoked, which probably queues
up a disk request, sets your process to be nonrunnable, and sets the system
back to multitasking.  When the pager discovers that the disk request has
come in, it finds a place for it, adds it into your process page tables, and
sets your process state back to being runnable.  The next time your number
is up, your process will be run beginning with the faulted instruction.

Note that at least two process switches have occurred, and that the saving
of your process's state causes the instruction to be retried as though
(almost) nothing has happened.  The translation lookaside buffer, however,
has been completely invalidated, so it will be slowly reconstructed by your
process.

One could be careful, save the TLB entries, and have the pager be smart
about invalidating them too, but in practice this takes as much work as
simply invalidating all of them, if not more, so it usually isn't done.

For an adequate reference on paging, may I suggest chapters 5 and 6 of
"Operating System Concepts" by Peterson and Silberschatz.
--
Jonathan S. Shapiro
Haverford College

"It doesn't compile pseudo code... What do you expect for fifty dollars?"
	- M. Tiemann
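
The table lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dict standing in for physical memory and the names split, translate and PageFault are invented here and come from no Intel document. The 10/10/12 split of the linear address and the positions of the present, accessed and dirty bits follow the 80386 page table entry layout.

```python
# Sketch of an 80386-style two-level page walk. The dict-based "memory" and
# all function names are illustrative assumptions, not Intel's implementation.

PRESENT = 1 << 0      # P bit of a directory or table entry
ACCESSED = 1 << 5     # A bit, set on any access through the entry
DIRTY = 1 << 6        # D bit, set on a write to the page
FRAME_MASK = 0xFFFFF000


class PageFault(Exception):
    """Raised when a directory or table entry is not present."""


def split(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(mem, cr3, linear, write=False):
    """Walk the tables held in mem (physical address -> 32-bit entry)."""
    d, t, off = split(linear)
    pde_addr = (cr3 & FRAME_MASK) + 4 * d
    pde = mem.get(pde_addr, 0)
    if not pde & PRESENT:
        raise PageFault(linear)          # the page table itself is missing
    mem[pde_addr] = pde | ACCESSED
    pte_addr = (pde & FRAME_MASK) + 4 * t
    pte = mem.get(pte_addr, 0)
    if not pte & PRESENT:
        raise PageFault(linear)          # the page is missing
    pte |= ACCESSED                      # access is implicit in the translation
    if write:
        pte |= DIRTY                     # a write also marks the page dirty
    mem[pte_addr] = pte
    return (pte & FRAME_MASK) | off
```

A pager would catch PageFault, bring the page in, mark the entry present and retry the access, which is the retry-the-instruction behaviour described in the post.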
clif@intelca.UUCP (Clif Purkiser) (12/11/85)
> I have some questions about how virtual memory is implemented in the 80386;
> the little "Introduction to the 80386" blurb that the local rep. mailed me
> does not have much detail, and he doesn't seem to have the time required to
> explain it to me on the phone.  But netters have all the time in the world,
> and I am hoping that one of you will throw together a quick response for me.
> Please flame directly to me and I promise not to summarize to the net.
> Here is what I wanted to ask my overworked Intel rep.:
>
> % How is the TLB organized?  Is it fully-associative, or direct mapped?
>

32 entries, 4-way set associative.

> % Doesn't the TLB need to be flushed on a task switch?  How is this flushing
> % specified?  Are there any instructions for overtly managing the TLB?
>

No, not explicitly.  The TLB is flushed by either a MOV CR3, Reg instruction,
which loads the page directory root register, or a task switch that causes
the value of CR3 to be different.  No explicit instructions exist for
managing the TLB other than instructions used by a test engineer.

> % On a page fault, how is the TLB entry, if any, updated for the replaced
> % page?  The present bit must be cleared, and the dirty bit must be read.
>

The new TLB entry replaces an old TLB entry using an Intel proprietary
algorithm.  The present, R/W, and U/S bits in the TLB are set.

> % When the OS clears the accessed bits for an address space, does it
> % update the TLB entries?
>

Yes, it should flush the TLB to ensure that the TLB entries match the page
table entries.

> % Can one fault on a page table?  If not, how are these locked into memory?
> % How can the processor be sure that a page table directory is present?
>

Yes, one can fault on a page table.  The page table directory, however, must
always be present.

> % What if IBM had...

Used a 6502.  My former company VisiCorp would still be in business because
Lotus 1-2-3 wouldn't fit in 64K.

>
> -Scott Rose
> rose@uw-beaver

--
Clif Purkiser, Intel, Santa Clara, Ca.
HIGH PERFORMANCE MICROPROCESSORS
{pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!clif

{standard disclaimer about how these views are mine and may not reflect
the views of Intel, my boss, or USENET goes here. }
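
For readers new to the term, "32 entries, 4-way set associative" works out to 8 sets of 4 entries each. The Python sketch below models that geometry only. Selecting the set from the low bits of the page number and replacing entries round-robin are both assumptions made for illustration; the post says the real replacement algorithm is proprietary. The class name TLB and its methods are likewise hypothetical.

```python
# Toy model of a 32-entry, 4-way set-associative TLB (8 sets).
# Set selection by the low bits of the page number and round-robin
# replacement are assumptions; the thread says the real policy is proprietary.

WAYS = 4
SETS = 32 // WAYS   # 8 sets of 4 entries


class TLB:
    def __init__(self):
        self.flush()

    def flush(self):
        """Invalidate every entry, as a CR3 load does on the real part."""
        self.sets = [[None] * WAYS for _ in range(SETS)]  # slot: (vpn, frame)
        self.next_victim = [0] * SETS

    def lookup(self, linear):
        """Return the physical address on a hit, or None on a miss."""
        vpn = linear >> 12
        for entry in self.sets[vpn % SETS]:
            if entry is not None and entry[0] == vpn:
                return entry[1] | (linear & 0xFFF)
        return None

    def insert(self, linear, frame):
        """Cache a translation after a miss, evicting one way of the set."""
        vpn = linear >> 12
        s = vpn % SETS
        way = self.next_victim[s]
        self.sets[s][way] = (vpn, frame & ~0xFFF)
        self.next_victim[s] = (way + 1) % WAYS
```

Only the 4 entries of one set are compared on each lookup, which is the practical difference from a fully associative design, where all 32 would be searched.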
rfm@x.UUCP (Bob Mabee) (12/13/85)
In article <2619@sjuvax.UUCP> jss@sjuvax.UUCP (J. Shapiro) writes,
in response to rose@uw-beaver (Scott Rose):
>% On a page fault, how is the TLB entry, if any, updated for the replaced
>% page?  The present bit must be cleared, and the dirty bit must be read.
>
>I think you may not have this right.  For a page to be in the TLB, it
>must have been there at one point.  The question is whether or not the
>invalidation of a page frame will cause appropriate invalidation in
>the TLB.  The answer is yes, as the page is not given up voluntarily
>by that process.  Some other process goes and invalidates the page,
>and the consequent TLB flush will cause the TLB to be properly
>maintained.

That isn't good enough.  Reasonable systems may require invalidating a page
within one process, rather than whenever a page-scrounging daemon runs.
Consider exec, or suppose that sbrk is actually implemented to give back
memory.  Also, if you have multiple processors there is not necessarily a
process exchange on this CPU every time the scrounger takes away a page.

If Intel has not provided a clear-TLB command, then the OS will have to use
the magic context-switch instruction even when only the TLB side effect is
wanted.  By the way, a clear-TLB pin would expedite multi-processor systems.
--
Bob Mabee @ Charles River Data Systems
decvax!frog!rfm
tim@ism780c.UUCP (Tim Smith) (12/14/85)
In article <150@intelca.UUCP> clif@intelca.UUCP (Clif Purkiser) writes:
>> % What if IBM had...
>Used a 6502.  My former company VisiCorp would still be in business because
>Lotus 1-2-3 wouldn't fit in 64K.

No, because by now someone would come out with a 6508 which has multiple
64k segments .... :-)
--
Tim Smith
sdcrdcf!ism780c!tim || ima!ism780!tim || ihnp4!cithep!tim
henry@utzoo.UUCP (Henry Spencer) (12/16/85)
> >Used a 6502.  My former company VisiCorp would still be in business because
> >Lotus 1-2-3 wouldn't fit in 64K.
>
> No, because by now someone would come out with a 6508 which has multiple
> 64k segments .... :-)

Don't laugh.  That's exactly what the 65816 (or whatever the silly number
may be) is.  Barf.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
kds@intelca.UUCP (Ken Shoemaker) (12/17/85)
> In article <2619@sjuvax.UUCP> jss@sjuvax.UUCP (J. Shapiro) writes,
> in response to rose@uw-beaver (Scott Rose):
> >% On a page fault, how is the TLB entry, if any, updated for the replaced
> >% page?  The present bit must be cleared, and the dirty bit must be read.
> >
> >I think you may not have this right.  For a page to be in the TLB, it
> >must have been there at one point.  The question is whether or not the
> >invalidation of a page frame will cause appropriate invalidation in
> >the TLB.  The answer is yes, as the page is not given up voluntarily
> >by that process.  Some other process goes and invalidates the page,
> >and the consequent TLB flush will cause the TLB to be properly
> >maintained.
>
> That isn't good enough.  Reasonable systems may require invalidating a page
> within one process, rather than whenever a page-scrounging daemon runs.
> Consider exec, or suppose that sbrk is actually implemented to give back
> memory.  Also, if you have multiple processors there is not necessarily a
> process exchange on this CPU every time the scrounger takes away a page.
>
> If Intel has not provided a clear-TLB command, then the OS will have to use
> the magic context-switch instruction even when only the TLB side effect is
> wanted.  By the way, a clear-TLB pin would expedite multi-processor systems.
> --
> Bob Mabee @ Charles River Data Systems
> decvax!frog!rfm

Any time you write to CR3, the page table TLB is invalidated.  Thus, you
needn't switch the entire context of the processor to invalidate the TLB.
As part of the context switch, CR3 gets reloaded, but there is also an
instruction that only loads CR3.

Also, one would hope that the operation in which the page tables are
examined to toss out a couple here and there to virtual memory happens
pretty infrequently, otherwise you spend quite a bit of time just making
this determination.  Relatively, the amount of time for all the processors
in your system to respond to an interrupt that forces them to reload their
respective CR3s is pretty insignificant.
--
remember, if you do it yourself, sooner or later you'll need a bigger hammer

Ken Shoemaker, Santa Clara, Ca.
{pur-ee,hplabs,amd,scgvaxd,dual,qantel}!intelca!kds

---the above views are personal.
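
The point about CR3 can be shown with a toy model in Python: a cached translation goes stale when the OS edits a page table, and stays stale until CR3 is written. Everything here is a simplifying assumption made for illustration. A flat dict stands in for the real two-level tables, the TLB is unbounded, and ToyMMU, load_cr3 and translate are hypothetical names.

```python
# Toy model: a cached translation goes stale when the OS edits a page table
# entry, and is discarded when CR3 is written. Class and method names are
# hypothetical; a flat single-level table stands in for the real two levels.

class ToyMMU:
    def __init__(self, page_table):
        self.page_table = page_table   # vpn -> frame address; absent = not present
        self.tlb = {}                  # cached vpn -> frame address

    def load_cr3(self, page_table):
        """Model of a CR3 write: select the tables and invalidate the whole TLB."""
        self.page_table = page_table
        self.tlb.clear()

    def translate(self, linear):
        vpn, off = linear >> 12, linear & 0xFFF
        if vpn not in self.tlb:                    # TLB miss: consult the table
            if vpn not in self.page_table:
                raise LookupError("page fault at %#x" % linear)
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn] | off
```

If the OS removes a page from the table after it has been translated once, translate keeps returning the old frame. Calling load_cr3 with the same table, which models reloading CR3 with its current value, is enough to make the next access fault. On a multiprocessor, each CPU would have to do this for its own TLB, which is the interrupt-driven reload mentioned above.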