cprice@mips.COM (Charlie Price) (05/26/90)
In article <39006@mips.mips.COM> mark@mips.COM (Mark G. Johnson) writes:
>In article <26274@super.ORG> rminnich@super.UUCP (Ronald G Minnich) writes:
> >I attended a talk once by someone who had worked on porting a popular
> >OS to an R2000. He mentioned a problem he had been bit by which he
> >called TLB freezeup. I have since wondered exactly the sequence of
> >events that could cause this. Can anyone out there give a good description
> >of this phenomenon? Is there a hardware basis for this problem or is it
> >merely a result of bugs in the software which is responsible for loading
> >the TLBs? I was unclear whether it was just a deadlock or a true hardware
> >event.
>
>
>"TLB freezeup" may be a common name for some kind of software phenomenon, in
>which case all of the hardware discussion below is misleading.  Apologies
>in advance, do not hold in hand, use only under adult supervision.
>
>
> The System Coprocessor (aka memory management unit) of the R2000
> contains a 32b register called the Status Register.  Bit 21 of that
> register is a special flag that is asserted in case the hardware
> detects it is in danger.  It's described on page 5-7 of the R2000/
> R3000 book by Gerry Kane.  Locally among the chip designers this bit is
> called the "TLB Burnout" protector.  Perhaps you misremembered this as
> "TLB freezeup".  In the book it's called "TLB Shutdown".
>
> Suppose for a moment that the OS software went crazy and wrote utter
> nonsense into the virtual-to-physical mapping entries.  For example,
> what would happen if the TLB was told that (all within one ProcessID):
>        Virtual Address 37 <====> Physical Address 6
>        Virtual Address 37 <====> Physical Address 51 ...
> If this sort of silliness ever happens (and hopefully it doesn't),
> the R2000 sets its TLB Shutdown bit, telling you that you tried to
> use the TLB in a REALLY unexpected fashion.
> The reason for doing this is both selfish and altruistic: it lets S/W
> know something went wrong, and it protects the TLB hardware from
> mangling itself.

Mark fails to mention what the hardware difficulty is.

A TLB is generally just a small cache of virtual-to-physical translations.
The TLB in the R2000 and R3000 is fully associative: any translation
can go into any of the cache locations.
The lookup mechanism for such a fully associative cache is
a content-addressable memory keyed by the virtual page number.
In effect, you yell out the index (the page number) and if the translation
is in any TLB entry, that entry shouts the physical page number back.

The problem arises when more than one location answers at a time,
which can happen if the same virtual address is mistakenly mapped
to more than one physical address.
If that happens, more than one value will be driven at the same time.
(Here is where the software-understanding-of-hardware handwaving starts.)
If the hardware is trying to simultaneously drive a 0 and a 1 out
for a particular bit position, this causes a highly undesirable current
that can actually destroy the circuits involved.

The MIPS TLB detects the situation where it gets multiple answers
to a lookup and disables the TLB before it can destroy itself.
-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave.  / Sunnyvale, CA   94086
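
For readers who think better in code, the multiple-match situation
described above can be sketched as a toy software model. This is only
an illustration, not how the R2000 is built: the names ToyTlb, TlbEntry
and TlbShutdown are invented here, a Python loop stands in for the
parallel compare that a content-addressable memory does, and a real TLB
entry carries more fields than the three shown. The sketch also assumes
that writes are not checked for duplicates, so the conflict only shows
up at lookup time.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TlbEntry:
    asid: int  # process ID tag
    vpn: int   # virtual page number (the key you "yell out")
    pfn: int   # physical page frame number (what gets "shouted back")


class TlbShutdown(Exception):
    """Hypothetical signal that the toy TLB has disabled itself."""


@dataclass
class ToyTlb:
    entries: List[TlbEntry] = field(default_factory=list)
    shutdown: bool = False  # stands in for the TLB Shutdown status bit

    def write(self, entry: TlbEntry) -> None:
        # Assumption: nothing stops software from loading a duplicate mapping.
        self.entries.append(entry)

    def lookup(self, asid: int, vpn: int) -> Optional[int]:
        if self.shutdown:
            raise TlbShutdown("TLB is disabled after an earlier multiple match")
        # Every entry is compared against the key; hardware does this in parallel.
        matches = [e for e in self.entries if e.asid == asid and e.vpn == vpn]
        if len(matches) > 1:
            # More than one entry answered: disable the TLB rather than
            # let conflicting values be driven at once.
            self.shutdown = True
            raise TlbShutdown(f"{len(matches)} entries matched page {vpn}")
        return matches[0].pfn if matches else None  # None models a TLB miss


if __name__ == "__main__":
    tlb = ToyTlb()
    tlb.write(TlbEntry(asid=1, vpn=37, pfn=6))
    print(tlb.lookup(1, 37))
    tlb.write(TlbEntry(asid=1, vpn=37, pfn=51))  # Mark's bad mapping
    try:
        tlb.lookup(1, 37)
    except TlbShutdown as exc:
        print("shutdown:", exc)
```

The example at the bottom loads Mark's two conflicting mappings for
virtual page 37; the first lookup succeeds and the second one trips the
shutdown flag, after which every further lookup is refused.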