chris@umcp-cs.UUCP (03/14/84)
Actually, if you have hardware support for registers that "fall off the end of the chip", then process context switch doesn't have to be difficult, even for thousands of registers. All you need is a little clever trickery (and probably lots more silicon). The basic idea is to treat registers as a funny kind of cache. Example:

    r0 r1 ... r9 r10 r11    r12 ... r225    r226 ... r781
    --------mine--------    ---0x43210---   ---0x44444---
     (therefore valid)        mem addr        mem addr
                                         (actually one per reg)

If each register has a memory address (either physical memory or the virtual address of a given process, with a possible "no address") then process context switch is very simple: just fiddle with the "Valid Pointer" (which says in the above picture that r11 is the last valid register). The registers (r0-r11) are already attached to the right process (since they have to have a mem address already assigned in case they "fall off" because of n thousand subroutine calls). The Valid Pointer gets reset to include no registers, and its mem address value gets changed to that of the new process.

Whenever you touch a register, if it's not valid the following sequence of events happens:

    if was_assigned_to_other_process then
        store_at_other_process_memory_address
        read_from_my_memory_address
    elif was_assigned_to_my_process then
        fancy_trickery_to_try_to_avoid_memory_references
    else
        read_from_my_memory_address
    fi
    make_register_valid

Anyway, the result should be a very inexpensive process context switch with a little bit of extra overhead per register fetch/store (which you can probably do in parallel with something else you have to do anyway).

Hm, I wonder if I can patent this?
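For anyone who wants to poke at the idea, here is a minimal software sketch of the lazy-validation sequence above. It is a toy model, not a hardware design. The class name `LazyRegisterFile`, the epoch counter standing in for the "Valid Pointer", the per-register `owner` tag, and the `(pid, register)` memory keys are all assumptions made for illustration. The "fancy trickery" branch is modeled as simply revalidating the register with no memory reference.

```python
class LazyRegisterFile:
    """Toy model: registers as a cache that is refilled lazily after a switch."""

    def __init__(self, nregs):
        self.value = [0] * nregs
        self.owner = [None] * nregs       # process whose data each register holds
        self.valid_epoch = [-1] * nregs   # epoch in which each register was validated
        self.epoch = 0                    # hypothetical stand-in for the "Valid Pointer"
        self.current = None               # currently running process
        self.memory = {}                  # (pid, register index) -> saved value
        self.mem_refs = 0                 # memory references caused by lazy refills

    def context_switch(self, pid):
        # Constant time: nothing is copied, every register just becomes "not valid".
        self.epoch += 1
        self.current = pid

    def _touch(self, i):
        if self.valid_epoch[i] == self.epoch:
            return                        # already valid for the current process
        owner = self.owner[i]
        if owner is not None and owner != self.current:
            # Was assigned to another process: spill its value, then load ours.
            self.memory[(owner, i)] = self.value[i]
            self.value[i] = self.memory.get((self.current, i), 0)
            self.mem_refs += 2
        elif owner == self.current:
            pass                          # still holds our data: no memory reference
        else:
            # Never assigned: load ours from memory.
            self.value[i] = self.memory.get((self.current, i), 0)
            self.mem_refs += 1
        self.owner[i] = self.current
        self.valid_epoch[i] = self.epoch  # make_register_valid

    def read(self, i):
        self._touch(i)
        return self.value[i]

    def write(self, i, v):
        self._touch(i)
        self.value[i] = v
```

A process that is switched out and back in without anyone else touching its registers pays nothing for them. A register that another process used in the meantime costs one store and one load on first touch.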
phipps@fortune.UUCP (Clay Phipps) (03/19/84)
I would like to pose a general question: how are multiple sets of registers best utilized?

Some architectures, e.g., ModComp and ELXSI, have one complete set of registers per process (up to some number of processes like 16), used to speed up process context switching.

The RISC has its "overlapping register windows" scheme, where each level of routine call causes an overlapped (to allow parameter passing) window of registers to move, stack-like, over a much larger set (128? 256?) of registers, to speed up routine call context switching.

Forest Baskett had his "register cube" or "3-D register" scheme, which combined the two ideas (although I don't believe the windows overlapped).

If you could only use one approach (not Forest's combination), which would get the biggest speed win, and for what classes of programs? For example, in UNIX, are process context switches more or less common than routine call context switches?

Just rattling a few more cages ...

-- Clay Phipps
-- {allegra,amd70,cbosgd,dsd,floyd,harpo,hpda,ihnp4,
    megatest,nsc,oliveb,sri-unix,twg,varian,VisiA,wdl1}
    !fortune!phipps
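To make the overlapping-window idea concrete, here is a rough sketch of how a window can slide over a larger physical register set so that the caller's outgoing registers are the callee's incoming ones. The class name `WindowedRegisters`, the window sizes, and the direction in which the window moves are all hypothetical choices for illustration and are not taken from any particular machine.

```python
class WindowedRegisters:
    """Toy model of overlapping register windows (all sizes are made up)."""

    def __init__(self, windows=4, ins=2, locals_=4):
        self.ins = ins
        self.stride = ins + locals_          # registers the window advances per call
        self.windows = windows
        # The last window still needs room for its outgoing registers.
        self.phys = [0] * (windows * self.stride + ins)
        self.cwp = 0                         # current window index

    def _index(self, r):
        # Logical layout: [0, ins) incoming, [ins, stride) local,
        # [stride, stride + ins) outgoing.
        if not 0 <= r < self.stride + self.ins:
            raise IndexError("no such register in the window")
        return self.cwp * self.stride + r

    def read(self, r):
        return self.phys[self._index(r)]

    def write(self, r, v):
        self.phys[self._index(r)] = v

    def call(self):
        if self.cwp + 1 >= self.windows:
            # Real hardware would trap here and spill a window to memory.
            raise OverflowError("window overflow")
        self.cwp += 1

    def ret(self):
        if self.cwp == 0:
            raise RuntimeError("window underflow")
        self.cwp -= 1
```

With the default sizes, a caller that writes logical register 6 (its first outgoing register) and then calls will find the same value in the callee's logical register 0, with no copying. A return value can travel back the same way.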
howard@metheus.UUCP (Howard A. Landman) (03/22/84)
The question in the referenced article was whether interprocess context switches were more or less common than procedure call context switches.

Actually, we ASKED that question before designing RISC I. The measurements taken indicated that for VAX UNIX, interprocess switches were about 100 times less frequent than procedure calls. Thus the payoff for handling them in special hardware is also much less. We decided it wasn't worth the effort.

I still think that was the correct decision, especially considering that some interprocess switching (e.g. simple interrupts) can be coerced into using the register window mechanism much like a procedure call. (One interrupt which CAN'T be so coerced is the window overflow interrupt, for obvious reasons.)

Howard A. Landman
ogcvax!metheus!howard
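The payoff argument can be put as a back-of-the-envelope calculation. The sketch below takes only the roughly 100:1 frequency ratio from the measurements mentioned above. The function name `switch_share` and the per-event costs passed to it are hypothetical placeholders, not measured values.

```python
def switch_share(call_cost, switch_cost, calls_per_switch=100):
    """Fraction of total save/restore overhead spent on process switches.

    call_cost and switch_cost are per-event costs in any common unit
    (hypothetical inputs); calls_per_switch is the frequency ratio.
    """
    total_call_cost = calls_per_switch * call_cost
    return switch_cost / (switch_cost + total_call_cost)
```

Under these assumptions, even if a process switch cost ten times as much as a procedure call, `switch_share(1, 10)` puts process switches at about 9% of the combined overhead. Hardware that speeds up calls is then attacking the much larger share.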