[net.arch] Use Of Multiple Register Sets / Re: RISC perspective

chris@umcp-cs.UUCP (03/14/84)

Actually, if you have hardware support for registers that "fall
off the end of the chip", then process context switch doesn't have
to be difficult, even for thousands of registers.  All you need is
a little clever trickery (and probably lots more silicon).  The
basic idea is to treat registers as a funny kind of cache.

Example:

r0 r1 ... r9 r10 r11 r12 ... r225  r226 ... r781
--------mine-------- ---0x43210--- ---0x44444---
  (therefore valid)    mem addr       mem addr (actually one per reg)

If each register has a memory address (either a physical address or
a virtual address in a given process, with a possible "no address"),
then process context switch is very simple:  just fiddle with the
"Valid Pointer" (which says in the above picture that r11 is the last
valid register).  The registers (r0-r11) are already attached to
the right process (since they must have a mem address already
assigned in case they "fall off" because of n thousand subroutine
calls).  The Valid Pointer gets reset to include no registers, and
its mem address value is changed to that of the new process.

Whenever you touch a register, if it's not valid the following
sequence of events happens:

	if was_assigned_to_other_process then
		store_at_other_process_memory_address
		read_from_my_memory_address
	elif was_assigned_to_my_process then
		fancy_trickery_to_try_to_avoid_memory_references
	else
		read_from_my_memory_address
	fi
	make_register_valid

Anyway, the result should be a very inexpensive process context
switch with a little bit of extra overhead per register fetch/store
(which you can probably do in parallel with something else you
have to do anyway).
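
A minimal sketch of the scheme above, as a toy simulation rather than
a hardware design.  Everything named here (Process, RegisterFile,
save_area, mem_traffic) is hypothetical.  Two simplifications are
assumed: one valid bit per register stands in for the single Valid
Pointer, and the "fancy trickery" case is modelled as simply reusing a
value that is still sitting in the register.

```python
class Process:
    def __init__(self, name, nregs):
        self.name = name
        # Memory backing this process's registers (one address per register).
        self.save_area = [0] * nregs


class RegisterFile:
    """Registers treated as a cache, tagged with the process they belong to."""

    def __init__(self, nregs):
        self.value = [0] * nregs
        self.owner = [None] * nregs    # process whose value the register holds
        self.valid = [False] * nregs   # valid for the current process?
        self.current = None
        self.mem_traffic = 0           # loads plus stores to backing memory

    def switch_to(self, proc):
        # The cheap context switch: mark nothing valid and change owner.
        # No register is saved or restored here.
        self.valid = [False] * len(self.valid)
        self.current = proc

    def _touch(self, r):
        if self.valid[r]:
            return
        prev = self.owner[r]
        if prev is not None and prev is not self.current:
            # was_assigned_to_other_process: spill theirs, load ours
            prev.save_area[r] = self.value[r]
            self.value[r] = self.current.save_area[r]
            self.mem_traffic += 2
        elif prev is self.current:
            # was_assigned_to_my_process: value is still here (assumed trick)
            pass
        else:
            # never assigned: load ours
            self.value[r] = self.current.save_area[r]
            self.mem_traffic += 1
        self.owner[r] = self.current
        self.valid[r] = True           # make_register_valid

    def read(self, r):
        self._touch(r)
        return self.value[r]

    def write(self, r, v):
        self._touch(r)
        self.value[r] = v
```

In this model a switch costs nothing by itself; memory traffic is
paid only for registers that the new process actually touches and
that another process had been using.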

Hm, I wonder if I can patent this?

phipps@fortune.UUCP (Clay Phipps) (03/19/84)

I would like to pose a general question:
how are multiple sets of registers best utilized?

Some architectures, e.g., ModComp and ELXSI, have one complete set of registers 
per process (up to some number of processes like 16),
used to speed up process context switching.

The RISC has its "overlapping register windows" scheme,
where each level of routine call moves a window of registers, stack-like,
over a much larger set (128? 256?) of registers,
with adjacent windows overlapping to allow parameter passing.
The aim is to speed up routine call context switching.
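
A rough sketch of how such a window might map onto the larger register
set.  The sizes below are illustrative assumptions, not figures quoted
from the RISC design, and the names (physical, N_OVERLAP, and so on)
are hypothetical.  Each routine sees globals, an "in" group shared with
its caller, private locals, and an "out" group shared with its callee.

```python
N_GLOBAL = 10    # visible to every routine (assumed size)
N_OVERLAP = 6    # shared between caller and callee (assumed size)
N_LOCAL = 10     # private to one call level (assumed size)
N_WINDOWS = 8    # call levels held on chip (assumed size)

STRIDE = N_OVERLAP + N_LOCAL              # registers consumed per call level
N_WINDOWED = N_WINDOWS * STRIDE           # circular part of the file
N_PHYSICAL = N_GLOBAL + N_WINDOWED
N_VISIBLE = N_GLOBAL + N_OVERLAP + N_LOCAL + N_OVERLAP


def physical(window, reg):
    """Map register `reg`, as seen at call level `window`, to a physical index."""
    if not 0 <= reg < N_VISIBLE:
        raise ValueError("register number out of range")
    if reg < N_GLOBAL:
        return reg                        # globals never move
    offset = reg - N_GLOBAL               # position within in/local/out
    # The windowed part is circular, so deep call chains wrap around
    # (real hardware would trap and spill to memory at that point).
    return N_GLOBAL + (window * STRIDE + offset) % N_WINDOWED
```

With these numbers, the caller's first "out" register and the callee's
first "in" register are the same physical register, that is,
physical(0, 26) equals physical(1, 10), so parameters are passed
without any copying.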

Forest Baskett had his "register cube" or "3-D register" scheme,
which combined the two ideas (although I don't believe the windows overlapped).

If you could only use one approach (not Forest's combination),
which would get the biggest speed win, and for what classes of programs?
For example, in UNIX, are process context switches more or less common
than routine call context switches?

Just rattling a few more cages ...

-- Clay Phipps

-- 
   {allegra,amd70,cbosgd,dsd,floyd,harpo,hpda,ihnp4,
    megatest,nsc,oliveb,sri-unix,twg,varian,VisiA,wdl1}
   !fortune!phipps

howard@metheus.UUCP (Howard A. Landman) (03/22/84)

The question in the referenced article was whether interprocess context
switches were more or less common than procedure call context switches.

Actually, we ASKED that question before designing RISC I.  The measurements
taken indicated that for VAX UNIX, interprocess switches were about 100 times
less frequent than procedure calls.  Thus the payoff for handling them in
special hardware is also much less.  We decided it wasn't worth the effort.
I still think that was the correct decision, especially considering that some
interprocess switching (e.g. simple interrupts) can be coerced into using the
register window mechanism much like a procedure call.  (One interrupt which
CAN'T be so coerced is the window overflow interrupt, for obvious reasons.)
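
The frequency argument can be put as back-of-the-envelope arithmetic.
Only the factor of 100 comes from the measurements mentioned above;
the cycle counts in the usage note are made-up placeholders, and the
function names are hypothetical.

```python
def relative_payoff(calls_per_switch, saving_per_call, saving_per_switch):
    """How many times more total time call hardware saves than switch hardware.

    calls_per_switch  -- procedure calls per interprocess switch
    saving_per_call   -- cycles saved on each call by call hardware
    saving_per_switch -- cycles saved on each switch by switch hardware
    """
    return (calls_per_switch * saving_per_call) / saving_per_switch


def breakeven_switch_saving(calls_per_switch, saving_per_call):
    """Cycles switch hardware must save per switch to match call hardware."""
    return calls_per_switch * saving_per_call
```

For instance, relative_payoff(100, 20, 200) gives 10.0: even if
switch hardware saved ten times as many cycles per event as call
hardware, the call hardware would still save ten times more in total.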

	Howard A. Landman
	ogcvax!metheus!howard