[comp.arch] how big are the queues on the SPARC floating point units?

torek@elf.ee.lbl.gov (Chris Torek) (03/16/91)

The SPARC documentation describes the FP queue as `one or more 64 bit
[values]' and also says that there will be at least one entry per
`parallel functional unit'.  It puts no architectural limit on the
queue length.  I need to be able to save these queues across context
switches and signals, however, so I need to know how much space to
allocate.  The larger this is, the more expensive signal handling
becomes (although I can, and will, alleviate the problem somewhat for
processes that are not doing FP work; but I still need to allocate
space for this).
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

krueger@tms390.Micro.TI.COM (Steve Krueger) (03/22/91)

Since no one seems to have answered this, I'll take a cut.

torek@elf.ee.lbl.gov (Chris Torek) writes:

>The SPARC documentation describes the FP queue as `one or more 64 bit
>[values]' and also says that there will be at least one entry per
>`parallel functional unit'.  It puts no architectural limit on the
>queue length.  I need to be able to save these queues across context
>switches and signals, however, so I need to know how much space to
>allocate.  The larger this is, the more expensive signal handling
>becomes (although I can, and will, alleviate the problem somewhat for
>processes that are not doing FP work; but I still need to allocate
>space for this).
>-- 
>In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
>Berkeley, CA		Domain:	torek@ee.lbl.gov

I take it you would like to know the lengths of the queues in current
implementations.

I thought this was contained in Appendix L, Implementation
Characteristics, in the new SPARC Architecture version 8 manual,
available soon from SPARC International.  The information is, alas,
not there.  So I will answer for the TI SPARC FPU, the TMS390C602A.
This FPU has a 2-entry Floating-Point Deferred Trap Queue, which our
designers chose as optimum for our FP pipe length.  Each entry is a
doubleword and can be accessed only with the STDFQ instruction.
STDFQ is a privileged instruction.

Quoting from the version 8 manual in Appendix L (Implementation
Characteristics):

	After an fp_exception trap occurs, the first entry in the
	queue is the address of the FP instruction that caused the
	exception, together with the instruction itself.  Any remaining
	entries [(for the TI part, at most one more entry)] in the
	queue contain the address/instruction pairs for other FP
	instructions that have not yet finished execution when the
	fp_exception trap occurred.

Also in the manual in Appendix L:

	Note that the L64814, TMS390C602A, and WTL3171 FPU's are
	architecturally (and pin-) equivalent implementations.

From this you might guess that the queue lengths are the same but I
don't really know about the LSI Logic or Weitek chips.

I believe that none of the SPARC FPUs has had a queue length longer
than 3.  Future chips might have longer FQ's, but I know of none that
do.

While in version 7 the queue had a minimum length of one, in version 8
a queue length of zero is possible iff the implementation architecture
executes FP instructions synchronously.

Taken together, FP trap software that must run on all current SPARC
FPUs only needs to deal with queue depths in the range of 1 to 3.  In
the future, the range will extend from 0 to 3 and possibly higher.
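
To make the space question concrete, here is a minimal sketch of a
save area sized for a fixed worst case.  Everything named here is
hypothetical: FQ_MAX_ENTRIES, struct fq_entry, struct fq_save and
fq_bytes() are illustrative only, and the bound of 16 is an arbitrary
choice for the example, not an architectural limit.  It assumes each
entry is one 64-bit address/instruction pair, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compile-time bound; the architecture sets no limit. */
#define FQ_MAX_ENTRIES 16

/* One queue entry: address of the FP instruction plus the instruction. */
struct fq_entry {
    uint32_t addr;   /* address of the FP instruction */
    uint32_t insn;   /* the instruction word itself */
};

/* Hypothetical per-process save area for the FP queue. */
struct fq_save {
    uint32_t        count;               /* entries actually saved */
    struct fq_entry q[FQ_MAX_ENTRIES];   /* worst-case storage */
};

/* Bytes of entry storage needed for a queue of the given depth. */
static size_t fq_bytes(unsigned depth)
{
    assert(depth <= FQ_MAX_ENTRIES);
    return (size_t)depth * sizeof(struct fq_entry);
}
```

With 8-byte entries, a depth of 3 costs 24 bytes per saved context and
a bound of 16 costs 128 bytes, which is the trade-off between headroom
for future chips and the cost of signal delivery.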

When using STDFQ to read the FQ, you must monitor the FSR.QNE bit to
know when to stop, as it is an error (fp_sequence_error on the 'C602)
to execute STDFQ when the FQ is empty.  Version 8 allows the format of
the data in the queue and much of the semantics of STDFQ to be
implementation dependent.
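
The drain loop described above might look roughly like the sketch
below.  It is plain C run against a mock queue, not real trap-handler
code: on hardware the FSR read and the STDFQ would be privileged
assembly.  struct mock_fpu, mock_read_fsr(), mock_stdfq() and
fq_drain() are all hypothetical names, and the position of QNE (bit 13
of the FSR) is from my reading of the version 8 layout and should be
checked against the manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FSR_QNE (1u << 13)   /* queue-not-empty bit (assumed V8 position) */

struct fq_entry {
    uint32_t addr;           /* address of the FP instruction */
    uint32_t insn;           /* the instruction word itself */
};

/* Hypothetical software stand-in for the FPU's queue. */
struct mock_fpu {
    struct fq_entry fq[8];
    unsigned head, tail;     /* queue is empty when head == tail */
};

/* Stand-in for reading the FSR: reports only the QNE bit. */
static uint32_t mock_read_fsr(const struct mock_fpu *f)
{
    return (f->head != f->tail) ? FSR_QNE : 0;
}

/* Stand-in for STDFQ: pops the front entry.  The assert models the
 * rule that executing STDFQ on an empty queue is an error. */
static struct fq_entry mock_stdfq(struct mock_fpu *f)
{
    assert(f->head != f->tail);
    return f->fq[f->head++];
}

/* Save entries while QNE is set.  Returns the number saved, or -1 if
 * the queue holds more than max entries (save area too small). */
static int fq_drain(struct mock_fpu *f, struct fq_entry *out, size_t max)
{
    size_t n = 0;

    while (mock_read_fsr(f) & FSR_QNE) {
        if (n == max)
            return -1;       /* QNE still set but no room left */
        out[n++] = mock_stdfq(f);
    }
    return (int)n;
}
```

Checking QNE before every STDFQ means the loop never issues the
instruction on an empty queue, and the explicit bound turns an
unexpectedly deep queue into a reportable failure rather than an
overrun of the save area.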

I hope this has been of some help.

	-Steve Krueger				krueger@micro.ti.com
	 Texas Instruments
	 SPARC Applications
	 Houston, Texas
	 (713) 274-2479

guy@auspex.auspex.com (Guy Harris) (03/27/91)

>Taken together, FP trap software that must run on all current SPARC
>FPUs only needs to deal with queue depths in the range of 1 to 3.  In
>the future, the range will extend from 0 to 3 and possibly higher.

Note that SunOS 4.x allows for a depth of up to 16.

ps@fps.com (Patricia Shanahan) (04/03/91)

In article <6829@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
 >>Taken together, FP trap software that must run on all current SPARC
 >>FPUs only needs to deal with queue depths in the range of 1 to 3.  In
 >>the future, the range will extend from 0 to 3 and possibly higher.
 >
 >Note that SunOS 4.x allows for a depth of up to 16.


The FPS 500 Series SPARC uses an FPU with a queue depth of 5.
--
	Patricia Shanahan
	ps@fps.com
	uucp : ucsd!celerity!ps
	phone: (619) 271-9940