hascall@atanasoff.cs.iastate.edu (John Hascall) (02/10/89)
One of the benefits of a simple instruction set (RISC) is that it frees up chip area for more registers. I think some papers have proposed register counts > 100; what is the largest number of general purpose registers in an existing chip?

What I am curious about is, what (if any) special techniques can be employed to prevent a large performance hit at context switch time (i.e., saving all those registers for the current process, and restoring them for the new process)? Do they just rely on the PCBs being in the (data) cache? What about a special cache for PCBs? Is it worth it? Is it workable?

I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

Consider a machine with say 32 GP registers, suppose further that the processor was built with say 544 (32 + (32*16)) GP registers and a special PID (process index) register. Process slots 0-15 are reserved for "real-time" processes (when a new process is created, it will not use one of those slots unless it requests it). Now, at context switch time if the "outgoing" process has an index of 0-15 no save is needed, and if the "incoming" process has an index also in the range of 0-15 no restore is needed either. For a process whose index is 16+ the 17th register set is used, and is saved/restored as in a "normal" system.

It seems to me that such a scheme would take little extra hardware (other than the extra registers). I just pulled the number 16 out of the air, any power of 2 would be as easily implemented--perhaps enough that on a workstation most or all of the processes could have a "real-time" slot.

             PID register          program specified
                                    register number
    +-+-+-+-+-+-+-+     +-+-+-+-+-+   +-+-+-+-+-+
    | | | | | | | | ... | | | | | |   | | | | | |
    +-+-+-+-+-+-+-+     +-+-+-+-+-+   +-+-+-+-+-+
     | | | | | | |  ...  | | | | |     | | | | |
    \ or together to /  \  concatenate to form  /
     \ form "use#17"/    \  actual register #  /
      \   signal   /      \    (if ~use#17)   /

A hair-brained scheme or what?

John Hascall
ISU Comp Center
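The bank-selection logic proposed in this post can be sketched in a few lines of Python. This is only an illustration under the post's own numbers (32 registers per bank, 16 resident slots plus one shared 17th bank); the names `use_17`, `physical_register` and `context_switch_cost` are invented here and do not describe any real chip.

```python
REGS_PER_BANK = 32      # registers a program can name
RESIDENT_SLOTS = 16     # process indexes 0-15 each own a private bank
SHARED_BANK = 16        # the "17th" bank, shared by every index >= 16
TOTAL_REGS = REGS_PER_BANK * (RESIDENT_SLOTS + 1)   # 544, as in the post


def use_17(pid_index: int) -> bool:
    """OR together the PID bits above the low four: true for any index >= 16."""
    return (pid_index >> 4) != 0


def physical_register(pid_index: int, reg: int) -> int:
    """Map (process index, program-specified register) to a physical register."""
    if not 0 <= reg < REGS_PER_BANK:
        raise ValueError("register number out of range")
    bank = SHARED_BANK if use_17(pid_index) else (pid_index & 0xF)
    # Equivalent to concatenating the bank bits with the register bits.
    return bank * REGS_PER_BANK + reg


def context_switch_cost(outgoing: int, incoming: int) -> int:
    """Register transfers to or from memory for one switch (0, 32 or 64)."""
    saves = REGS_PER_BANK if use_17(outgoing) else 0
    restores = REGS_PER_BANK if use_17(incoming) else 0
    return saves + restores
```

Under these assumptions a switch between two resident processes moves no registers at all, while a switch between two non-resident processes costs the same 64 transfers as on a machine with a single 32-register file.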
robertb@june.cs.washington.edu (Robert Bedichek) (02/11/89)
In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
> <some lines deleted>
>
> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

I believe that you are thinking of the TI9900, one of the first 16-bit microprocessors. It was very slow, I think at least partly because it kept its registers in memory.

> <proposal to have register banks which switch at context switch>
>
> It seems to me that such a scheme would take little extra hardware (other than the extra registers). I just pulled the number 16 out of the air, any power of 2 would be as easily implemented--perhaps enough that on a workstation most or all of the processes could have a "real-time" slot.

Yes, but the extra hardware for the registers takes a lot of silicon area! Some Xerox machines had such a scheme. They had something like 8 register banks and could do a context switch in a few cycles. A large semiconductor company copied this idea in an IO processor that I think will never see the light of day.

>A hair-brained scheme or what?

Well, I don't think it's hair-brained; it makes sense if you want very fast context switch time. But having lots of register banks is very expensive in silicon area and in register access time. Register files tend to be multiported, so each bit takes a lot of area. Increasing the size of the register file will often lead to an increased cycle time or an increase in the number of cycles to do a basic operation.

To pick an example: The Motorola 88000 has 31 general registers. If you added your register bank idea to this machine you would get very little benefit when running UNIX. There are so many other things that have to be done on a context switch that the time to save the 31 general registers is insignificant.

Btw, I think the designers of the 88k made *excellent* trade-offs in its design. After spending a year working on system software for the machine, there is almost nothing that I would change.

Rob Bedichek
colwell@mfci.UUCP (Robert Colwell) (02/12/89)
In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>
> One of the benefits of a simple instruction set (RISC) is that it frees up chip area for more registers. I think some papers have proposed register counts > 100, what is the largest number of general purpose registers in an existing chip?

You're apparently talking about single-chip micros. That's not the only domain in which RISC/CISC concepts are interesting, and I think once you leave the single-chip domain, your premise isn't obviously correct.

> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

I think this was the TI 9900, the first 16-bit micro, which for some reason didn't seem to catch on very well. It did indeed have all its registers in main memory. And this isn't as dumb an idea as it first appears -- you need far fewer address bits to refer to a register than to memory addresses, so having "registers" that reside in memory is still better than no "registers" at all.

The BellMac-8 microprocessor from Bell Labs, ca 1977-1979, borrowed this idea. I'm not sure of the TI chip, but Bell's also had the overlapped sliding register window for parameter-passing that later showed up again in the RISC-I from Berkeley.

The BellMac-8 also had one of the nicest assemblers I've seen -- had lots of high level constructs like if-then-else, while, do-until, switch, etc. If you really wanted one-for-one mapping of code to machine you didn't have to use those features, but it was often very nice to have them.

Bob Colwell            ..!uunet!mfci!colwell
Multiflow Computer     or colwell@multiflow.com
175 N. Main St.
Branford, CT 06405     203-488-6090
kyriazis@rpics (George Kyriazis) (02/12/89)
In article <7239@june.cs.washington.edu> robertb@uw-june.UUCP (Robert Bedichek) writes:
>In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>> <some lines deleted>
>>
>> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register...
>
>I believe that you are thinking of the TI9900, one of the first 16-bit microprocessors. It was very slow, I think at least partly because it kept its registers in memory.
>

No, it wasn't slow because of that. It wasn't optimised at all. It had 4 non-overlapping clocks, and the internal algorithms were terribly slow. If you are thinking of the TI99/4A, yes it was much slower simply because it was expanding each bus cycle into 6 (!!).

An 8/16-bit successor of the 9900, the 9995, was faster than the 8088, and the 99000 (built to fight the 68000) was benchmarking better than the 68000 (at least that's what they claim).

I really liked that architecture, but I guess that it wasn't enough :-) Oh well..

George Kyriazis
kyriazis@turing.cs.rpi.edu
kyriazis@rdrc.rpi.edu
henry@utzoo.uucp (Henry Spencer) (02/12/89)
In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

You can use the AMD 29000 that way, in fact, although doing register windows is more popular in Unix environments. If you dedicate a set of 16 registers to each process, and dedicate most of the global registers to saving the rest of the state for the processes, you can have 8 processes running with a context-switch time of something like 17 cycles.
--
The Earth is our mother;       |     Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
bradb@ai.toronto.edu (Brad Brown) (02/12/89)
In article <7239@june.cs.washington.edu> robertb@uw-june.UUCP (Robert Bedichek) writes:
>In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>> <some lines deleted>
>>
>> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

This is not a bad idea if you have the silicon to do it (as other posters have pointed out.) Actually it's been used in some designs. The most interesting is actually the IBM 8100, a fast transaction processing machine which is kind of old and has now been discontinued. The 8100 had a total of 1024 registers, divided up into banks of 32 registers. That means 32 processes could each have their own context and you could switch between processes REALLY fast.

There is a somewhat related problem when you make a subroutine call -- the calling function usually has to save its registers so it gets its "context" restored when the function returns. Machines like MIPS have made use of their very large number of registers (192?) by having a pointer to one of the registers that is effectively the base pointer for the stack of registers that the currently executing function can use. When you want to make a function call you just advance the pointer past the registers that you are using, zap arguments into the registers just after the pointer, and branch to the function. (Of course it's more complicated than that, but you can see where the time savings comes from...)

(-: Brad Brown :-) bradb@ai.toronto.edu
moore%cdr.utah.edu@wasatch.UUCP (Tim Moore) (02/13/89)
In article <89Feb12.125852est.10867@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
)There is a somewhat related problem when you make a subroutine call --
)the calling function usually has to save its registers so it gets its
)"context" restored when the function returns. Machines like MIPS have
)made use of their very large number of registers (192?) by having a pointer
)to one of the registers that is effectively the base pointer for the
)stack of registers that the currently executing function can use.
You're confusing MIPS and SPARC here. The MIPS chips have a fairly
conventional set of general registers; SPARC has a large file of
registers that are divided into "windows" in the manner you describe.
-Tim Moore
4560 M.E.B. internet:moore@cs.utah.edu
University of Utah ABUSENET:{ut-sally,hplabs}!utah-cs!moore
Salt Lake City, UT 84112
hascall@atanasoff.cs.iastate.edu (John Hascall) (02/13/89)
In article <89Feb12.125852est.10867@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
>In article <7239@june.cs.washington.edu> robertb@uw-june.UUCP (Robert Bedichek) writes:
>>In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>>> <some lines deleted>
>There is a somewhat related problem when you make a subroutine call -- the calling function usually has to save its registers so it gets its "context" restored when the function returns. Machines like MIPS have made use of their very large number of registers (192?) by having a pointer to one of the registers that is effectively the base pointer for the stack of registers that the currently executing function can use. ....

This was part of my question... I take it, at context switch the MIPS processor has to save and restore all those registers (at least as far "up" as the "topmost" register in use--potentially all of them). Doesn't that mean roughly 400 memory accesses (assuming 192 is correct), all at once--just the sort of thing RISC is supposed to avoid?

What effect (if any) does this have on the suitability of these processors for "real-time" systems?

John Hascall
ISU Comp Center
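The "roughly 400" figure in this post is simple arithmetic: one store to save each register of the outgoing process and one load to restore each register of the incoming one. A small Python check, taking the 192-register count at face value (other posts in this thread attribute that number to the Am29000 rather than MIPS); the function name is invented here for illustration.

```python
def switch_memory_accesses(registers_in_use: int) -> int:
    """One store to save plus one load to restore each live register."""
    return 2 * registers_in_use


print(switch_memory_accesses(192))  # 384, i.e. "roughly 400"
print(switch_memory_accesses(31))   # 62, for a conventional 31-register file
```

The count scales with the number of registers actually in use, which is why the post asks about saving only as far "up" as the topmost live register.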
mikes@oakhill.UUCP (Mike Schultz) (02/13/89)
In article <640@m3.mfci.UUCP> colwell@mfci.UUCP (Robert Colwell) writes:
>In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>
>> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?
>
>I think this was the TI 9900, the first 16-bit micro, which for some reason didn't seem to catch on very well.

Probably because they couldn't figure out that if you gave hardware away to universities, then you grow people who knew TI when they graduated and took that to the market place. They also tended to be very business and industrial oriented. IMHO.

>It did indeed have all its registers in main memory. And this isn't as dumb an idea as it first appears -- you need far fewer address bits to refer to a register than to memory addresses, so having "registers" that reside in memory is still better than no "registers" at all.

Also consider that the 9900 was simply a single chip version of the TI 990 minicomputer. I'm not sure of all my facts here, but when it was introduced, the 990's CPU speed was not all that far from the memory speed, thus the penalty wasn't that much. Later, as memory became slower compared to the CPU, they cached the current register set into fast static RAM on the CPU board and flushed them to memory as needed. (I'm told that it made for some interesting hardware considering that programs could, and did, go to the memory address of a register to fiddle with the low order byte of the register.)

Mike Schultz
mikes@oakhill.UUCP
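The workspace-pointer idea discussed above can be modelled in a few lines. This is a toy sketch, not a 9900 emulator: it assumes sixteen 16-bit registers located at byte address WP + 2*n, and the class and method names are invented for illustration. It also shows the aliasing mentioned in the last paragraph, where a program reaches a register through its memory address.

```python
class WorkspaceCPU:
    """Toy model: 16 word-sized registers that live in main memory at WP + 2*n."""

    def __init__(self, memory_words: int = 1024):
        self.memory = [0] * memory_words   # word-addressed backing store
        self.wp = 0                        # workspace pointer (even byte address)

    def _index(self, reg: int) -> int:
        if not 0 <= reg < 16:
            raise ValueError("register number out of range")
        return (self.wp + 2 * reg) // 2    # byte address -> word index

    def read_reg(self, reg: int) -> int:
        return self.memory[self._index(reg)]

    def write_reg(self, reg: int, value: int) -> None:
        self.memory[self._index(reg)] = value & 0xFFFF

    def context_switch(self, new_wp: int) -> int:
        """Point at another workspace; return the old WP so it can be restored."""
        old_wp, self.wp = self.wp, new_wp
        return old_wp


cpu = WorkspaceCPU()
cpu.wp = 0x100
cpu.write_reg(3, 0xBEEF)
old = cpu.context_switch(0x200)    # the whole register set changes at once
cpu.write_reg(3, 0x1234)
cpu.context_switch(old)            # and comes back untouched
```

In this model no register contents are copied on a switch; the cost is paid instead on every register access, which goes to memory.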
bradb@ai.toronto.edu (Brad Brown) (02/14/89)
In article <792@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>In article <> bradb@ai.toronto.edu (Brad Brown) writes:
>>There is a somewhat related problem when you make a subroutine call -- the calling function usually has to save its registers so it gets its "context" restored when the function returns. Machines like MIPS have made use of their very large number of registers (192?) by having a pointer to one of the registers that is effectively the base pointer for the stack of registers that the currently executing function can use. ....
>
> This was part of my question... I take it, at context switch the MIPS processor has to save and restore all those registers (at least as far "up" as the "topmost" register in use--potentially all of them). Doesn't that mean roughly 400 memory accesses (assuming 192 is correct),
>
>
> What effect (if any) does this have on the suitability of these processors for "real-time" systems?

[As some people have pointed out, I got mixed up between MIPS and SPARC -- my comments above should apply to SPARC...]

I think the idea is that in most systems there are a *lot* more function calls than full context switches, which are quite different from the point of view of the amount of work that has to be done. If you can save some time on the function calls then you can afford to waste a little more on the time to save the registers for a full context switch.

I don't know whether this would be a big performance hit for real-time systems. Perhaps there are ways of knowing how many registers are actually in use and saving them in a burst. Perhaps there are ways of handling some kinds of real-time events by just allocating a new register window. Perhaps this would work for some "lightweight" interrupts, though it's obviously unsuitable for a full context switch.

(-: Brad Brown :-) bradb@ai.toronto.edu
tim@crackle.amd.com (Tim Olson) (02/14/89)
In article <1101@wasatch.UUCP> moore%cdr.utah.edu.UUCP@wasatch.UUCP (Tim Moore) writes:
| In article <89Feb12.125852est.10867@ephemeral.ai.toronto.edu> bradb@ai.toronto.edu (Brad Brown) writes:
|
| )There is a somewhat related problem when you make a subroutine call -- the calling function usually has to save its registers so it gets its "context" restored when the function returns. Machines like MIPS have made use of their very large number of registers (192?) by having a pointer to one of the registers that is effectively the base pointer for the stack of registers that the currently executing function can use.
|
| You're confusing MIPS and SPARC here. The MIPS chips have a fairly conventional set of general registers; SPARC has a large file of registers that are divided into "windows" in the manner you describe.

I think he was talking about the Am29000 (192 registers). The 29k has 64 globals and 128 locals, all of which are accessible by the instructions. An internal stack pointer allows a register-window implementation that uses variable-sized windows (tailored to the size of each individual function's needs), rather than the fixed-sized windows of the SPARC.

-- Tim Olson
Advanced Micro Devices
(tim@crackle.amd.com)
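A rough sketch of what "variable-sized windows" means in practice: local register numbers are taken relative to a stack pointer, and each call moves that pointer by however many registers the callee asked for. The 128-entry local file is the figure from the post; everything else (the class, the modulo addressing, the direction of growth) is an assumption made for illustration and is not a description of the actual Am29000 calling convention. Spilling to memory when the file fills up is not modelled.

```python
LOCAL_REGS = 128   # size of the local register file given in the post


class RegisterStack:
    """Toy register stack with per-function window sizes."""

    def __init__(self):
        self.regs = [0] * LOCAL_REGS
        self.sp = 0                      # physical index of the current frame's register 0

    def local(self, n: int) -> int:
        """Physical index of local register n in the current frame."""
        return (self.sp + n) % LOCAL_REGS

    def call(self, frame_size: int) -> None:
        """Enter a function that needs frame_size registers of its own."""
        self.sp = (self.sp - frame_size) % LOCAL_REGS

    def ret(self, frame_size: int) -> None:
        """Leave that function, giving its registers back."""
        self.sp = (self.sp + frame_size) % LOCAL_REGS
```

A function that needs 5 registers consumes 5 and one that needs 20 consumes 20, whereas a fixed-window scheme would charge both the same full window.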
schmitz@fas.ri.cmu.edu (Donald Schmitz) (02/14/89)
In article Robert Bedichek writes:
>In article John Hascall writes:
>> <proposal to have register banks which switch at context switch>
>>
>> It seems to me that such a scheme would take little extra hardware (other than the extra registers). I just pulled the number 16 out of the air, any power of 2 would be as easily implemented--perhaps enough that on a workstation most or all of the processes could have a "real-time" slot.
>
>Yes, but the extra hardware for the registers takes a lot of silicon area! Some Xerox machines had such a scheme. They had something like 8 register banks and could do a context switch in a few cycles. A large semiconductor company copied this idea in an IO processor that I think will never see the light of day.
>
>>A hair-brained scheme or what?

A similar thread went around a year ago, and I came up with the idea of CPUs with externally addressable register/state files, plus a "scheduling CPU". The "scheduler" would make the CPUs context switch by externally halting them, dumping/updating their register file via a DMA or block xfer operation to fast memory used as a PCB cache (via the hardware interface to the register file), and then restarting them.

The real win is not so much the reduced context switch time, but the ability to run the scheduling process on a dedicated CPU in parallel with the "real" processes. The extra cycles available for scheduling can (hopefully) be used for more sophisticated scheduling algorithms. This would be a real win in a multi CPU system, as "real" processes could be scheduled to avoid conflicts for system resources, such as main memory bandwidth and disk accesses. The hardware cost of this is an extra data/address path to the register file, plus some additional multiplexing of the chip pins - not insignificant in a really high perf CPU but much less costly than multiple register files.

If you don't want to build mutant chips, you can do a similar thing with conventional processors, shared memory, interrupts and software, without quite the savings in the raw context switch time (but still a win in scheduling time and hopefully a big win in overall utilization).

Anyway, I got 2 or 3 responses from places working on such systems, although I still haven't seen one released.

Don Schmitz (schmitz@fas.ri.cmu.edu)
robertb@june.cs.washington.edu (Robert Bedichek) (02/15/89)
In article <4274@pt.cs.cmu.edu> schmitz@fas.ri.cmu.edu (Donald Schmitz) writes:
>
>A similar thread went around a year ago, and I came up with the idea of CPUs with externally addressable register/state files, plus a "scheduling CPU". The "scheduler" would make the CPUs context switch by externally halting them, dumping/updating their register file via a DMA or block xfer operation to fast memory used as a PCB cache (via the hardware interface to the register file), and then restarting them.
>
>The real win is not so much the reduced context switch time, but the ability to run the scheduling process on a dedicated CPU in parallel with the "real" processes. The extra cycles available for scheduling can (hopefully) be used for more sophisticated scheduling algorithms. This would be a real win in a multi CPU system, as "real" processes could be scheduled to avoid conflicts for system resources, such as main memory bandwidth and disk accesses. The hardware cost of this is an extra data/address path to the register file, plus some additional multiplexing of the chip pins - not insignificant in a really high perf CPU but much less costly than multiple register files.

If the processor is halted while the dumping of registers is going on then you don't need any extra data paths to the registers. The CDC 6600 did what you describe: its PPs (Peripheral Processors) made the processor do an "exchange jump", where the registers were swapped with an image in memory. I don't know where the scheduling algorithm was done though. The 6600 was considerably easier to program than the PPs, so I suspect that it was done on the 6600. (The relative difficulty of programming is generally a problem with dedicated special purpose attached processors, such as IO processors. It can be done, of course, but faced with the decision of where to implement some new feature, system programmers tend to put it on the main CPU.)

And if the processor is going to be waiting while its registers are dumped, why not just have the processor do the dumping ... and now the scheme has degenerated to the software solution. I don't see any advantage to your scheme in current general purpose systems. If you want to run the scheduling algorithm in parallel, then why not just run it on another "real processor"? Why statically allocate a machine to an activity unless it is a big win in doing so?

>If you don't want to build mutant chips, you can do a similar thing with conventional processors, shared memory, interrupts and software, without quite the savings in the raw context switch time (but still a win in scheduling time and hopefully a big win in overall utilization).

Right, but what's the difference between this (degenerating to having everything done in software) and what is done "conventionally" on shared memory multiprocessors (e.g., Sequent)?

>
>Anyway, I got 2 or 3 responses from places working on such systems, although I still haven't seen one released.
>
>Don Schmitz (schmitz@fas.ri.cmu.edu)
>--

Rob
"Live to code  Code to live"
beg, plea to all: run spell on your text before posting
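The "exchange jump" mentioned in this post, where the live registers are swapped with an image held in memory, reduces to a two-way copy. The following Python sketch shows only that swap; the function name and the flat list layout are assumptions for illustration, and the real 6600 exchange package layout is not modelled.

```python
def exchange_jump(registers: list, memory: list, base: int) -> None:
    """Swap the live register contents with the image stored at memory[base:]."""
    n = len(registers)
    image = memory[base:base + n]          # copy of the incoming context
    if len(image) != n:
        raise ValueError("register image runs past the end of memory")
    memory[base:base + n] = registers      # outgoing context takes its place
    registers[:] = image                   # incoming context becomes live


regs = [1, 2, 3]
mem = [0, 0, 0, 7, 8, 9]
exchange_jump(regs, mem, 3)    # regs now holds the image, mem holds the old registers
```

Because the operation is a swap, performing it twice with the same base address restores the original context.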
petolino%joe@Sun.COM (Joe Petolino) (02/15/89)
>> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?

>You can use the AMD 29000 that way, in fact, although doing register windows is more popular in Unix environments. If you dedicate a set of 16 registers to each process, and dedicate most of the global registers to saving the rest of the state for the processes, you can have 8 processes running with a context-switch time of something like 17 cycles.

This same trick could be used with SPARC, too, for example if you were writing a real-time OS that needed fast, predictable context switch timing. The 'Current Window Pointer' (CWP) is a field of the PSR - writing a new value into the PSR gives you a whole new set of window registers, preserving the old register values.

For those not familiar, here's a quick overview of the way SPARC registers work: there are eight global registers (one of them, g0, is hard-wired as a constant 0), plus a circular file of windowed registers. The size of this register file is implementation-dependent (it's 112 registers on the Sun4 chip). At any one time, the processor has access to a 'window' of 24 of these registers, starting at the one pointed to by the CWP field of the PSR (the CWP always points to a register whose number is a multiple of 16). The CWP can change (- or +, mod the size of the register file) in increments of 16 registers, in response to two instructions (save and restore) which are normally used in conjunction with the instructions that do procedure calls and returns. Thus, the 24-register window of a called routine overlaps its caller's 24-register window by eight registers. When a trap occurs, the CWP automatically moves up by sixteen registers.

You can think of it as a poor-man's stack cache - the poverty part is that the stack pointer can only move in increments of 16, the CPU can only look at the top 24 words of the stack, and it has a finite size that must be managed by the OS. That management is facilitated by the 'Window Invalid Mask' (WIM), a special register with a bit for each possible value of the CWP. If a save or restore instruction would cause the CWP to decrement or increment to a value whose corresponding WIM bit is 1, then that instruction traps, and the OS must free up some registers (and update the WIM) before continuing.

Note that, in an application where fast context switches between a small number of processes were the most important factor, you wouldn't even use the WIM. You'd write all the code without save and restore instructions (note that these operations are *not* part of the call/return instructions), and instead use normal loads and stores to save the state of the registers across procedure calls. The OS could then allocate 32 consecutive registers (i.e. two adjacent CWP values) for each process: one 24-register window to run in, and another 8 (in the next window above) for trap handlers to use.

-Joe Petolino
"I don't work for Marketing. Nobody told me to write this. As far as I know, it's all true!"
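The overview above translates fairly directly into a small model. The Python sketch below follows the post's description only: a circular file of 112 registers, a 24-register window starting at 16 * CWP, save decrementing and restore incrementing the CWP, and a WIM bit that makes the move trap. The class names and the `phys` helper are invented here, and trap entry, the global registers and the PSR itself are left out.

```python
class WindowTrap(Exception):
    """Raised when save/restore would enter a window marked invalid in the WIM."""


class WindowedFile:
    def __init__(self, nwindows: int = 7):       # 7 * 16 = 112 registers
        self.nwindows = nwindows
        self.regs = [0] * (16 * nwindows)
        self.cwp = 0                              # current window pointer
        self.wim = 0                              # bit i set => window i is invalid

    def phys(self, r: int, cwp=None) -> int:
        """Physical index of window register r (0-23) for the given CWP."""
        if not 0 <= r < 24:
            raise ValueError("window register out of range")
        cwp = self.cwp if cwp is None else cwp
        return (16 * cwp + r) % len(self.regs)

    def _move(self, step: int) -> None:
        new = (self.cwp + step) % self.nwindows
        if (self.wim >> new) & 1:
            raise WindowTrap(new)                 # OS must free registers first
        self.cwp = new

    def save(self) -> None:
        self._move(-1)

    def restore(self) -> None:
        self._move(+1)
```

In this model the callee's registers 16-23 are the same physical registers as the caller's registers 0-7, which is the eight-register overlap the post describes. The fixed allocation in the last paragraph (32 consecutive registers per process) would fit at most 112 // 32 = 3 processes in a file of this size.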
andrew@eve.oz (Andrew McRae) (02/15/89)
From article <89Feb12.125852est.10867@ephemeral.ai.toronto.edu>, by bradb@ai.toronto.edu (Brad Brown):
>> [ Discussion about multiple register sets ]
>
>This is not a bad idea if you have the silicon to do it (as other posters have pointed out.) Actually it's been used in some designs. The most interesting is actually the IBM 8100, a fast transaction processing machine which is kind of old and has now been discontinued.

The Concurrent Computer Corp. 3200 series has multiple register sets (up to 16), the operative set selected by a 4 bit field in the processor status register. Generally these are not used to speed context switching between processes, but to allocate one set to the user level processes, and the other sets to the OS. Each interrupt level had a different register set, so that no register saving had to occur at interrupt service time, and there was no need to save user registers during kernel operations. This register swapping was tied up with the architecture (e.g. at interrupt time some of the registers had useful values stored in them by the microcode such as vector number, device address, previous status/program counter etc). Not being a stack based machine, the register swapping tended to be fundamental to the way the OS did things (I'm speaking of course for the native OS/32, not Unix).

I'm surprised that this idea (different register sets for different interrupt levels) has not been taken up in some of the more modern architectures, but there are problems (try doing a splX, and see if your registers hold the same values...).

Andrew McRae            inet: andrew@megadata.oz{.au}
Megadata Pty Ltd,       uucp: ..!uunet!munnari!megadata.oz!andrew
NSW AUSTRALIA
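The splX problem hinted at in the last paragraph is easy to see in a toy model where a 4-bit status field selects the active register set. The sketch below is an illustration only: the 16 registers per set and the one-to-one mapping from priority level to register set are assumptions, and the class and method names are invented.

```python
NUM_SETS = 16        # "up to 16" sets, selected by a 4-bit status field
REGS_PER_SET = 16    # assumption for this sketch


class BankedCPU:
    def __init__(self):
        self.sets = [[0] * REGS_PER_SET for _ in range(NUM_SETS)]
        self.set_select = 0                 # the 4-bit field in the status register

    def reg(self, n: int) -> int:
        return self.sets[self.set_select][n]

    def set_reg(self, n: int, value: int) -> None:
        self.sets[self.set_select][n] = value

    def change_level(self, level: int) -> int:
        """Interrupt entry or splX-style priority change; returns the old level."""
        old = self.set_select
        self.set_select = level & 0xF
        return old


cpu = BankedCPU()
cpu.set_reg(5, 42)             # kernel code computes a value at level 0
old = cpu.change_level(3)      # raises its priority, as splX would
value_after = cpu.reg(5)       # reads set 3, so the 42 is no longer visible
cpu.change_level(old)
```

Interrupt entry costs nothing here because nothing is copied, but code that changes its own level in mid-routine finds a different set of register contents afterwards.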
des@inmos.co.uk (David Shepherd) (02/18/89)
In article <640@m3.mfci.UUCP> colwell@mfci.UUCP (Robert Colwell) writes:
>In article <784@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>> I seem to recall there was (is?) a TI processor which had all of its registers in memory except 1 register which pointed to the other registers, so a context switch was just save/restore that one register. Could a similar concept be implemented with all the registers in the chip?
>
>I think this was the TI 9900, the first 16-bit micro,

The INMOS transputer has a similar idea. It has a 3 deep register stack, an instruction pointer and a workspace pointer that points into memory and (currently) 4k of on chip RAM. Loading and storing to on chip RAM relative to the workspace gives you 16 fast (1 cycle store, 2 cycle load) "registers" and 256 slightly slower (2 cycle store, 3 cycle load) "registers". Switching from one concurrent process to another only involves storing the instruction pointer, workspace pointer, adding the descheduled process to the end of the scheduling queue and taking the new one off the front.

> which for some reason
>didn't seem to catch on very well.

hmmm ... perhaps it didn't have a decent C compiler either ;-)

david shepherd
INMOS ltd
disclaimer: any opinions expressed above are mine -- so don't steal them
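The process switch described in this post amounts to a queue rotation over a very small saved state. A minimal Python sketch, assuming that the instruction pointer and workspace pointer are the only things kept per process (as the post says); the `Process` and `Scheduler` names are invented here and the transputer's actual queue representation is not modelled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Process:
    iptr: int   # saved instruction pointer
    wptr: int   # workspace pointer; the "registers" live in memory at this address


class Scheduler:
    def __init__(self, current: Process):
        self.current = current
        self.queue = deque()            # processes ready to run, oldest first

    def add(self, process: Process) -> None:
        self.queue.append(process)

    def deschedule(self) -> Process:
        """Put the running process at the back and resume the one at the front."""
        self.queue.append(self.current)
        self.current = self.queue.popleft()
        return self.current
```

Nothing is copied except the two pointers, because each process's working values already sit in memory relative to its own workspace pointer.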
khb%chiba@Sun.COM (chiba) (02/18/89)
In article <103@eve.oz> andrew@eve.oz (Andrew McRae) writes:
>From article <89Feb12.125852est.10867@ephemeral.ai.toronto.edu>, by bradb@ai.toronto.edu (Brad Brown):
>>> [ Discussion about multiple register sets ]
>> .....
>The Concurrent Computer Corp. 3200 series has multiple register sets (up to 16), the operative set selected by a 4 bit field in the processor status register. Generally these are not used to speed ...
>tied up with the architecture (e.g. at interrupt time some of the registers had useful values stored in them by the microcode such as vector number, device address, previous status/program counter etc). Not being a stack based machine, the register swapping tended to be fundamental to the way the OS did things (I'm speaking of course for the native OS/32, not Unix).
>
>I'm surprised that this idea (different register sets for different interrupt levels) has not been taken up in some of the more modern architectures, but there are problems (try doing a splX, and see if your registers hold the same values...).

The IBM Series/1 also had this sort of setup (4 sets, as best I can recall). More modern architectures haven't done this because most folks aren't trying to optimize that kind of context switch (at least none of the popular benchmarks are testing for it). It _is_ handy for certain types of real time tasking...so perhaps folks deeply involved in that area can comment.

Wearing my application programmer hat, I'd rather have all those registers for _my_ task, if they can be usefully employed.

Keith H. Bierman
It's Not My Fault ---- I Voted for Bill & Opus