davet@oakhill.UUCP (Dave Trissel) (02/08/85)
Xref: seismo net.unix-wizards:11815

>> Back to the hypervisor simulating the virtual RTE. Assuming a bus retry was
>> indicated then the hypervisor simply does its own RTE back to the emulated
>> environment. The bus cycle will complete normally (remember the interception
>> of the mapping fixup in the virtual OS) and the emulation will continue
>> normally.
>>
>> Now, wasn't that easy? :-)

>No, it's not that easy. There's no guarantee that the virtual RTEs occur
>immediately after the virtual bus error; the real operating system must
>be able to pair every virtual RTE to a preceding virtual/real bus error.
>Any number of tasks in the virtual operating system may be waiting to have
>their pages brought in so that the virtual RTEs may occur, so the real
>operating system has to keep the real bus error information around for a long
>time (forever). However, there is no guarantee that the virtual RTE will
>ever occur (virtual segmentation fault), so either the real operating system
>will slowly accumulate unused bus error information or else it must have some
>disgusting heuristic for knowing when to get rid of it.

As you point out, there are several methods for handling the long frame stack
information for RTEs. The first is to keep the entire frames around for the
duration of the virtual machine and ensure that any long RTE issued by the
virtual machine matches one of them for validity (after which that frame can
be tossed). This indeed means that an ever-expanding collection of unreturned
bus error frames will accumulate. How many depends on the exact nature of the
virtual system being implemented. (If that system is a non-virtual memory
system then no frames need be saved, since none will be returned.)

There are several ways to minimize the overhead of saving frame states. The
first is in recognizing that a virtual fault represents an immediate request
to satisfy a reference, and in most cases that request will be satisfied
within a few seconds.
Knowing this, only the last 5 or so frames need be saved in main memory. When
the table fills, stage the oldest onto a disk file. Notice that this disk file
will seldom have to be searched, because nearly always a match will be found
in the resident list. In essence the disk file becomes a throw-away mechanism.
(Even easier yet, if the hypervisor itself runs virtually it can just keep all
entries in a large resident list and always search in "most recently entered"
order.)

Another way to minimize space is to checksum the frame. This would reduce each
long stack frame to a simple 32-bit or 64-bit key. Staging keys to disk (or
using a virtual array) would work for these cases as well. Alternatives to
remembering things are mentioned next...

>The way around the problem is to push the actual information onto the
>virtual operating system's stack and let it do whatever with it.
>Detection of invalidly modified info on RTEs can be done with a "signature"
>such as checksumming or encryption. However, checksumming is easily fooled
>by a malicious virtual operating system and encryption is expensive.

Checksumming or signatures as a way to avoid holding state information won't
work if you are going to support hypervisors running other hypervisors. The
checksum itself cannot be stored within the long frame, since the virtual
machine hypervisor will also try to do the same, overwriting the value. If one
is not worried about supporting other hypervisors then this scheme will work,
assuming that a suitable position in the stack frame can be found to store the
key. The remark about malicious operating systems is commented on later.

The encryption or checksumming can be done quite efficiently with small
instruction loops. Considering the significant amount of overhead required for
each and every privileged simulation by the hypervisor (all TRAPs and virtual
I/O must be verified completely and then simulated), the occasional bus
exception is practically unnoticed.
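
To make the resident-list idea concrete, here is a minimal sketch in Python
(not 68000 code, and not a description of any existing hypervisor). The class
name SavedFrameTable, its method names, the in-memory list standing in for the
disk file, and the choice of CRC-32 as the 32-bit key are all assumptions made
for illustration. A frame is treated as an opaque byte string.

```python
import zlib
from collections import deque


class SavedFrameTable:
    """Hypothetical hypervisor-side record of bus error frames given to a guest."""

    def __init__(self, resident_limit=5, keys_only=False):
        self.resident_limit = resident_limit
        self.keys_only = keys_only
        self.resident = deque()  # newest entry at the left
        self.staged = []         # stand-in for the disk file, oldest first

    def _entry(self, frame):
        # Store either the whole frame or a 32-bit key (CRC-32 is an assumed choice).
        frame = bytes(frame)
        return zlib.crc32(frame) if self.keys_only else frame

    def record_fault(self, frame):
        """Remember a frame that was just pushed onto the guest's stack."""
        self.resident.appendleft(self._entry(frame))
        if len(self.resident) > self.resident_limit:
            # Table is full: stage the oldest resident entry out.
            self.staged.append(self.resident.pop())

    def validate_rte(self, frame):
        """Return True and forget the entry if the guest's frame matches one."""
        entry = self._entry(frame)
        try:
            # deque.remove drops the first match from the left, which is
            # the most recently entered one.
            self.resident.remove(entry)
            return True
        except ValueError:
            pass
        # Rare path: search the staged entries, newest first.
        for i in range(len(self.staged) - 1, -1, -1):
            if self.staged[i] == entry:
                del self.staged[i]
                return True
        return False


table = SavedFrameTable(resident_limit=2)
for f in (b"frame-1", b"frame-2", b"frame-3"):
    table.record_fault(f)
print(table.validate_rte(b"frame-3"))  # resident hit
print(table.validate_rte(b"frame-1"))  # found in the staged list
print(table.validate_rte(b"forged"))   # never issued, so rejected
```

With keys_only=True the table holds only checksums. That saves space, but two
different frames with the same key become indistinguishable, which is the
weakness the quoted text raises about checksumming.
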
When I was a systems programmer at a site where we ran VM/370, our measurement
hardware showed that 30 percent of our mainframe processing time was spent
just in the overhead of providing for virtual machine services.

There is yet another way for the hypervisor to know when to throw away saved
checking frames, and that is for the virtual operating system itself to
indicate that a frame just received will not be returned. This method may have
been alluded to in the following remark...

> ...or else it must have some disgusting heuristic for knowing when to get
> rid of it.

In any case, VM/370 uses the DIAGNOSE machine instruction to communicate both
with subordinates beneath it and hypervisors above it (including hypervisor
versions of itself). It turns out that running a demand paged operating system
underneath yet another layer of virtual paging is, as can be expected, just
not too swift. Because of this, paging status and other information is passed
to VM/370 by various versions of IBM's operating systems which support demand
paging.

>I think Motorola blew it on this one; I don't think we'll see a virtual
>machine environment as useful and reliable as VM/370 on the MC680x0.
> Tom Lyon
> Sun Microsystems, Inc.

Well, we blew it in the sense that the situation has to be handled at all. But
I don't think that this will have any bearing whatsoever on virtual machine
environments for the M68000 family. And having been involved with VM/370
(admittedly from several years ago) I would not classify it as significantly
useful or reliable. I was shocked when I found out that local users here at
Motorola still have to do primitive pseudo card reading and printing just to
swap files with other users. I thought such things would have been improved
years ago. (It's rather interesting to hear them curse while attempting to
edit files on our IBM systems, especially after they've just been on a
Macintosh.)

There are two primary reasons for virtual machines.
The first is to assist in the debugging of an operating system itself, or of
portions of an operating system where hardware is not yet available. Of course
such an operating system may clobber the stack it is attempting an RTE on, but
this is just one of hundreds of thousands of things an operating system can do
wrong. Minimal checking of the exception frame will catch most of these.

The second reason for virtual machines is to allow the execution of diverse
standard operating systems on one piece of hardware. In this regard it
certainly can be expected that 1) the operating system has been suitably
debugged in the first place and 2) such an operating system will not try to
"malicious"ly fool a hypervisor which it knows nothing about to begin with.

Therefore, the only reason for concern about the stack RTE validation is where
there is an environment (such as at a university) where there may be an
attempt by someone to intentionally foil the checksumming (or whatever) method
to bomb the system. The answer here is simple. Use the saved stack and compare
method suggested first. This will guarantee system integrity.

As an added note, I think it's interesting to mention that we at Motorola are
unaware of any customers interested in virtual machine capability. If this is
the case and continues to be the case, then this discussion is mostly
academic. If you are out there, let's hear from you. Any comments from the
community are welcome!

Motorola Semiconductor Inc.                Dave Trissel
Austin, Texas          {ihnp4,seismo,ctvax,gatech}!ut-sally!oakhill!davet