[comp.arch] separate integer and float register

joe@modcomp.UUCP (08/15/88)

>... what were the tradeoffs that went into the decision to make the
>floating point and integer registers the same?

Special floating point registers also slow down context switching, due
to the extra time needed to save/restore them.  This can be an important
factor in real-time or communication applications that have high context
switch rates.

mac3n@babbage.acc.virginia.edu (Alex Colvin) (08/16/88)

> Special floating point registers also slow down context switching, due
> to the extra time needed to save/restore them.

Depends.

In some systems the FP registers are only saved if they're going
to be re-used.
In a simple interrupt handler, where you promise not to touch FP, you leave
them active.
In other environments you make certain registers global, shared by all
tasks & handlers, never saved.

This assumes a lot of control over the code.

sah@mips.COM (Steve Hanson) (08/16/88)

>
>>... what were the tradeoffs that went into the decision to make the
>>floating point and integer registers the same?
In article <6800002@modcomp> joe@modcomp.UUCP writes:
>
>Special floating point registers also slow down context switching, due
>to the extra time needed to save/restore them.  This can be an important
>factor in real-time or communication applications that have high context
>switch rates.



	Real-time executives should only save floating point context
for processes that use the FPU. For example, for the MIPS R2000 you
would simply disable the FPU prior to a context switch and only save
floating point context for the last FPU owner if you get a coprocessor
unusable exception. A coprocessor unusable exception occurs due to an
attempt to execute a coprocessor instruction when the corresponding
coprocessor unit has not been marked usable. Variations of this game
are also available for other processors.
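
The scheme described above can be sketched as a small simulation in C.
This is only a model under stated assumptions, not MIPS kernel code: the
"coprocessor unusable" exception is modeled as an ordinary function call,
and struct task, fpu_usable, fpu_owner and fp_saves are all hypothetical
names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_FP_REGS 32          /* assumed size of the simulated FP file */

/* Hypothetical per-task state; a real executive keeps much more. */
struct task {
    double fp_save[NUM_FP_REGS];    /* saved FP context */
    bool   has_fp_context;          /* has this task ever used the FPU? */
};

/* Simulated machine state (all names are assumptions). */
static double       fpu_regs[NUM_FP_REGS];
static bool         fpu_usable;     /* stands in for a "coprocessor usable" bit */
static struct task *fpu_owner;      /* task whose values are live in fpu_regs */
static struct task *current;        /* task now running */
static unsigned     fp_saves;       /* FP saves actually performed */

/* Context switch: integer state would be saved here; the FPU is only
 * disabled, so no FP register is touched. */
void context_switch(struct task *next)
{
    current = next;
    fpu_usable = false;
}

/* Stand-in for the coprocessor-unusable exception handler. */
void fpu_unusable_trap(void)
{
    if (fpu_owner != current) {
        if (fpu_owner != NULL) {
            /* Save FP context for the last FPU owner, and only now. */
            memcpy(fpu_owner->fp_save, fpu_regs, sizeof fpu_regs);
            fp_saves++;
        }
        if (current->has_fp_context)
            memcpy(fpu_regs, current->fp_save, sizeof fpu_regs);
        else
            memset(fpu_regs, 0, sizeof fpu_regs);   /* fresh FP context */
        current->has_fp_context = true;
        fpu_owner = current;
    }
    fpu_usable = true;
}

/* Simulated FP instructions: each one "traps" if the FPU is disabled. */
void fp_write(int r, double v)
{
    assert(current != NULL && r >= 0 && r < NUM_FP_REGS);
    if (!fpu_usable)
        fpu_unusable_trap();
    fpu_regs[r] = v;
}

double fp_read(int r)
{
    assert(current != NULL && r >= 0 && r < NUM_FP_REGS);
    if (!fpu_usable)
        fpu_unusable_trap();
    return fpu_regs[r];
}
```

In this model a task that never executes an FP instruction costs nothing
in FP saves, and switching away from an FP task and back again (with only
integer tasks in between) also saves nothing, since the owner's values are
still live in the registers.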


-- 
UUCP: {ames,decwrl,prls,pyramid}!mips!sah
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086

walter@garth.UUCP (Walter Bays) (08/17/88)

In article <6800002@modcomp> joe@modcomp.UUCP writes:
>Special floating point registers also slow down context switching, due
>to the extra time needed to save/restore them.

More registers slow down context switching, whether integer or floating
point, windowed or conventional.  The net effect depends on the
workload; as you point out, some real-time applications may have a very
high context switch rate.  In choosing/designing a real-time executive
for such an application on a machine with many registers I might adopt
register usage conventions that treated most registers as volatile.
-- 
------------------------------------------------------------------------------
My opinions are my own.  Objects in mirror are closer than they appear.
E-Mail route: ...!pyramid!garth!walter		(415) 852-2384
USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California 94303
------------------------------------------------------------------------------

keith@mips.COM (Keith Garrett) (08/17/88)

In article <6800002@modcomp> joe@modcomp.UUCP writes:
>
>Special floating point registers also slow down context switching, due
>to the extra time needed to save/restore them.  This can be an important
>factor in real-time or communication applications that have high context
>switch rates.

At MipsCo loads and stores to fp registers take the same amount of time as
for integer registers.
-- 
Keith Garrett        "This is *MY* opinion, OBVIOUSLY"
UUCP: keith@mips.com  or  {ames,decwrl,prls}!mips!keith
USPS: Mips Computer Systems,930 Arques Ave,Sunnyvale,Ca. 94086

rcd@ico.ISC.COM (Dick Dunn) (08/17/88)

> Special floating point registers also slow down context switching, due
> to the extra time needed to save/restore them...

How?  If you have m integer registers and n floating-point registers, it
ought to take about the same time as the same number (m+n) of general
registers.  Let's not confuse the issue of separate registers with the
matter of how many registers you have total...you'd expect that if you have
more registers, it's going to take longer to save/restore.

However, there are two other considerations which make things a little more
complicated:
	- Separate FP registers may actually require that you have more
	  total registers to get the same performance, since you can't
	  trade off the use of a register between int and fp.  If your
	  registers are completely GP, you might use half of them as fp in
	  one task but none of them as fp in another.
			but
	- You can often skip the FP register save/restore by keeping track
	  of whether the registers get used...for example, if you can
	  arrange to get an exception on the first fp op, you don't have to
	  save fp registers for a process that has never gotten that
	  exception.
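
A back-of-envelope model of the points above, as a C sketch.  It assumes
save cost is simply proportional to the number of registers saved, and it
introduces a hypothetical parameter p_fp for the fraction of context
switches at which FP state really has to be saved; neither assumption
comes from measurements.

```c
#include <assert.h>

/*
 * Average number of registers saved per context switch.
 *   m    : integer registers, always saved
 *   n    : floating-point registers
 *   p_fp : assumed fraction of switches at which FP state must be saved
 *          (1.0 models "always save", the same as m+n general registers)
 */
double regs_saved_per_switch(unsigned m, unsigned n, double p_fp)
{
    assert(p_fp >= 0.0 && p_fp <= 1.0);
    return (double)m + p_fp * (double)n;
}
```

With 16 integer and 16 FP registers, p_fp = 1.0 gives 32 registers per
switch, the same as 32 general registers; p_fp = 0.25 gives 20.  The first
bullet is not captured here: if separate files force a larger n to reach
the same performance, the always-save cost grows with it.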
-- 
Dick Dunn      UUCP: {ncar,nbires}!ico!rcd           (303)449-2870
   ...I'm not cynical - just experienced.

cik@l.cc.purdue.edu (Herman Rubin) (08/18/88)

In article <1241@garth.UUCP>, walter@garth.UUCP (Walter Bays) writes:
> In article <6800002@modcomp> joe@modcomp.UUCP writes:
> >Special floating point registers also slow down context switching, due
> >to the extra time needed to save/restore them.
> 
> More registers slow down context switching, whether integer or floating
> point, windowed or conventional.  The net effect depends on the
> workload; as you point out, some real-time applications may have a very
> high context switch rate.  In choosing/designing a real-time executive
> for such an application on a machine with many registers I might adopt
> register usage conventions that treated most registers as volatile.

If good arithmetic is involved, any convention is bad in not too rare
situations.  Treating registers as volatile is bad if there are conditional
subroutine calls or even conditional branches where the condition is rare.
Also, it is a nuisance to have separate integer and floating point registers.
I am frequently doing arithmetic which uses both simultaneously, and I would
suggest that pack and unpack instructions be part of the general set.
Since loads and stores are undesirable, the sets of registers should be
available simultaneously.

Unless there is a big time improvement in having separate register sets,
one gains from the ability to trade off between registers used for integers
and floating point registers.  I doubt that enough can be gained by having
separate register sets to compensate for the losses.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

colwell@mfci.UUCP (Robert Colwell) (08/18/88)

In article <879@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>In article <1241@garth.UUCP>, walter@garth.UUCP (Walter Bays) writes:
>> In article <6800002@modcomp> joe@modcomp.UUCP writes:
>> >Special floating point registers also slow down context switching, due
>> >to the extra time needed to save/restore them.
>> 
>> More registers slow down context switching, whether integer or floating
>> point, windowed or conventional.
>Unless there is a big time improvement in having separate register sets,
>one gains from the ability to trade off between registers used for integers
>and floating point registers.  I doubt that enough can be gained by having
>separate register sets to compensate for the losses.

For conventional machines you have a point.  For a machine with
multiple independent functional units (a VLIW, for instance) I don't
think it's even debatable.  Ideally, the compiler would most like a
huge register file, shared by all functional units.  In our case,
that would be something like 28 functional units sharing a register
file that was 32 bits wide and something like 1K deep.  That's 28
read ports, assuming you don't want more to help with stores.  And
unless you're willing to put crossbars or shared ports you'd need
about that many write ports.  Even slicing the register file by 1
bit at a time you'd never fit the pins needed (28*10 just for reg
read selection).  Never mind how the heck you'd get all those
functional units anywhere near the register file to try to get some
usable cycle time.
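
The pin arithmetic above is easy to reproduce.  The C sketch below assumes
one register-select field per read port and a select field of
ceil(log2(depth)) bits; the function names are made up for illustration.

```c
#include <assert.h>

/* Bits needed to select one of `depth` registers: ceil(log2(depth)). */
unsigned select_bits(unsigned depth)
{
    unsigned bits = 0;
    unsigned n;

    assert(depth > 0);
    for (n = depth - 1; n > 0; n >>= 1)
        bits++;
    return bits;
}

/* Select pins needed if every read port carries its own register number. */
unsigned read_select_pins(unsigned read_ports, unsigned depth)
{
    return read_ports * select_bits(depth);
}
```

For a file 1K deep, select_bits(1024) is 10, so 28 read ports need
read_select_pins(28, 1024) = 280 select pins, matching the 28*10 figure,
and that is before any data or write-port pins are counted.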

For some machine architectures, you have to split up the registers
into sets, and once you've done that, I think an I/F split makes
better sense than some IF/IF partitioning.  In our machine, the F
side can do integer ops anyway.



Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090

walter@garth.UUCP (Walter Bays) (08/19/88)

As others have pointed out, the fastest way to handle floating point
registers on a context switch is to save them only if they're used.
Clipper uses a dirty-bit on the FP register file for this purpose.
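
A generic dirty-bit scheme can be sketched in C as follows.  This is a
model of the idea only, not a description of Clipper's actual hardware or
software interface: the register count, the struct, and every function
name here are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DIRTY_NUM_FP_REGS 8     /* assumed size of the simulated FP file */

/* Hypothetical per-task FP save area. */
struct fp_task {
    double fp_save[DIRTY_NUM_FP_REGS];
    bool   fp_valid;            /* save area holds real data */
};

static double   dirty_fp_regs[DIRTY_NUM_FP_REGS];
static bool     fp_dirty;       /* set by any write to the FP file */
static unsigned dirty_saves;    /* FP saves actually performed */

/* Simulated FP write: the hardware would set the dirty bit itself. */
void dirty_fp_write(int r, double v)
{
    assert(r >= 0 && r < DIRTY_NUM_FP_REGS);
    dirty_fp_regs[r] = v;
    fp_dirty = true;
}

/* On switch-out, save the FP file only if it was written. */
void save_on_switch_out(struct fp_task *t)
{
    if (fp_dirty) {
        memcpy(t->fp_save, dirty_fp_regs, sizeof dirty_fp_regs);
        t->fp_valid = true;
        dirty_saves++;
    }
}

/* On switch-in, restore any saved FP state and clear the dirty bit. */
void restore_on_switch_in(struct fp_task *t)
{
    if (t->fp_valid)
        memcpy(dirty_fp_regs, t->fp_save, sizeof dirty_fp_regs);
    fp_dirty = false;
}
```

Unlike the trap-on-first-use approach, this version still pays for a
restore on every switch into a task with valid FP state; what it avoids is
the save for any time slice in which the FP file was not written.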

>In article <1241@garth.UUCP>, I wrote:
>> In choosing/designing a real-time executive
>> for such an application on a machine with many registers I might adopt
>> register usage conventions that treated most registers as volatile.

In article <879@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>If good arithmetic is involved, any convention is bad in not too rare
>situations.  Treating registers as volatile is bad if there are conditional
>subroutine calls or even conditional branches where the condition is rare.

I was speaking of a hypothetical real-time executive for the original
poster's embedded application.  I certainly wouldn't do that in UNIX!

>Also, it is a nuisance to have separate integer and floating point registers.
>I am frequently doing arithmetic which uses both simultaneously, and I would
>suggest that pack and unpack instructions be part of the general set.
>Since loads and stores are undesirable, the sets of registers should be
>available simultaneously.

Clipper has integrated on-chip floating point, and the integer and
floating point registers are simultaneously available.  I think that
some CPUs with separate floating point co-processors access those
registers through loads and stores.  I don't know how long it takes
such CPUs to transfer data from floating point to integer registers,
but I assume it's faster than main memory.

>Unless there is a big time improvement in having separate register sets,
>one gains from the ability to trade off between registers used for integers
>and floating point registers.  I doubt that enough can be gained by having
>separate register sets to compensate for the losses.

Actually you do get quite a win from separate access paths, avoiding
arbitrary resource conflicts for what are often independent instructions.
Whether that outweighs more flexible allocation from a single register
pool is debatable.  It depends on how much you get from smoother pipeline
flow, how the compiler uses its registers, and the workload.  In the
absence of profiling information, it would be difficult for a compiler
to decide whether a register would be better spent for (say) remembering
a floating point array value or remembering its address.  A worse loss
is probably pre-allocation of registers for specific purposes before
compilation.