fsset@bach.lerc.nasa.gov (Scott E. Townsend) (04/25/91)
This sounds too strange to be true, but is it possible for the FPSR to
return 'too fresh' data? Or put another way, why should the following two
code fragments behave differently?

#1
        fdiv.ddd r8,r2,r4
        fldcr    r12,fcr62      ; fcr62 == FPSR
        bb1      0,r12,@L21     ; AFINX bit

#2
        fdiv.ddd r8,r2,r4
        tb1      0,r0,0         ; trap not taken, but system 'synced'
        fldcr    r12,fcr62
        bb1      0,r12,@L21

This code is buried a bit, so I don't have the exact different results,
but the behaviour _is_ different. Is it possible the fldcr gets whatever
is 'current' rather than the result of the fdiv? (which will take a while)

Please show me I'm sniffing glue!
kenton@abyss.zk3.dec.com (Jeff Kenton OSG/UEG) (04/25/91)
In article <1991Apr24.200412.7483@eagle.lerc.nasa.gov>, fsset@bach.lerc.nasa.gov (Scott E. Townsend) writes:
|> This sounds too strange to be true, but is it possible for the FPSR to
|> return 'too fresh' data? Or put another way, why should the following two
|> code fragments behave differently?
|>
|> #1
|>         fdiv.ddd r8,r2,r4
|>         fldcr    r12,fcr62      ; fcr62 == FPSR
|>         bb1      0,r12,@L21     ; AFINX bit
|>
|> #2
|>         fdiv.ddd r8,r2,r4
|>         tb1      0,r0,0         ; trap not taken, but system 'synced'
|>         fldcr    r12,fcr62
|>         bb1      0,r12,@L21
|>
|> This code is buried a bit, so I don't have the exact different results,
|> but the behaviour _is_ different. Is it possible the fldcr gets whatever
|> is 'current' rather than the result of the fdiv? (which will take a while)
|>

Yes, that is exactly what's happening. As a matter of fact, most of the
status bits in the FPSR are set by exception handlers, not hardware. The
imprecise exceptions (AFINX is one) are not detected as early as the next
cycle, which is when your example #1 expects to see them.

-----------------------------------------------------------------------------
== jeff kenton        Consulting at kenton@decvax.dec.com ==
== (617) 894-4508                   (603) 881-0011        ==
-----------------------------------------------------------------------------
hamilton@siberia.rtp.dg.com (Eric Hamilton) (04/25/91)
In article <1991Apr24.200412.7483@eagle.lerc.nasa.gov>, fsset@bach.lerc.nasa.gov (Scott E. Townsend) writes:
|> This sounds too strange to be true, but is it possible for the FPSR to
|> return 'too fresh' data? Or put another way, why should the following two
|> code fragments behave differently?
|>
|> #1
|>         fdiv.ddd r8,r2,r4
|>         fldcr    r12,fcr62      ; fcr62 == FPSR
|>         bb1      0,r12,@L21     ; AFINX bit
|>
|> #2
|>         fdiv.ddd r8,r2,r4
|>         tb1      0,r0,0         ; trap not taken, but system 'synced'
|>         fldcr    r12,fcr62
|>         bb1      0,r12,@L21
|>
|> This code is buried a bit, so I don't have the exact different results,
|> but the behaviour _is_ different. Is it possible the fldcr gets whatever
|> is 'current' rather than the result of the fdiv? (which will take a while)

Yes. These two code fragments behave differently, and for exactly the
reason that you suspect.

In fragment #1 the fdiv instruction has started but not completed when the
fldcr executes. The floating point imprecise exceptions (overflow,
underflow, and inexact) are signalled when the operation is complete, so
code fragment #1 is reading the FPSR prematurely. The trap-not-taken drains
the pipelines (at the cost of waiting sixty-odd cycles for the fdiv to
complete) so that code fragment #2 will show the effect of any imprecise
exceptions provoked by the fdiv.

Note that any attempt to use r8 or r9 will have the same effect of waiting
for the fdiv to complete: there will be a scoreboard hold. The oddity is
that the FPSR, which is also a "destination" register for the fdiv, is not
interlocked with the floating point pipe.
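The difference between the two fragments can be illustrated with a toy timing model in Python. This is only a sketch under stated assumptions, not a description of the 88100: the class `ToyFpu`, its method names, and the 60-cycle `FDIV_LATENCY` are hypothetical (the post says only "sixty-odd cycles"), and the model simply applies the status flag when the divide completes, ignoring the role of the exception handlers mentioned earlier in the thread.

```python
from dataclasses import dataclass, field

AFINX = 1 << 0        # bit 0, the bit tested by `bb1 0,r12,@L21` in the post
FDIV_LATENCY = 60     # assumed value; the thread only says "sixty-odd cycles"


@dataclass
class ToyFpu:
    """Hypothetical model: the FPSR changes only when a pending op completes."""
    cycle: int = 0
    fpsr: int = 0
    pending: list = field(default_factory=list)   # (done_cycle, flags) pairs

    def _step(self):
        # Advance one cycle and apply the flags of every op finished by now.
        self.cycle += 1
        running = []
        for done, flags in self.pending:
            if done <= self.cycle:
                self.fpsr |= flags
            else:
                running.append((done, flags))
        self.pending = running

    def fdiv(self, inexact):
        # Issue takes one cycle; the flags arrive FDIV_LATENCY cycles later.
        flags = AFINX if inexact else 0
        self.pending.append((self.cycle + FDIV_LATENCY, flags))
        self._step()

    def sync(self):
        # Stand-in for the untaken `tb1 0,r0,0`: wait for all pending ops.
        while self.pending:
            self._step()

    def fldcr_fpsr(self):
        # No interlock: return whatever the FPSR holds at this cycle.
        value = self.fpsr
        self._step()
        return value


def fragment_1():
    fpu = ToyFpu()
    fpu.fdiv(inexact=True)
    return fpu.fldcr_fpsr() & AFINX      # read one cycle after issue


def fragment_2():
    fpu = ToyFpu()
    fpu.fdiv(inexact=True)
    fpu.sync()                           # drain the pipeline first
    return fpu.fldcr_fpsr() & AFINX
```

In this model `fragment_1()` reads the status register before the divide has completed and sees the flag clear, while `fragment_2()` sees it set, which mirrors the behaviour described above.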
marvin@oakhill.sps.mot.com (Marvin Denman) (04/26/91)
In article <1991Apr24.200412.7483@eagle.lerc.nasa.gov> fsset@bach.lerc.nasa.gov (Scott E. Townsend) writes:
>This sounds too strange to be true, but is it possible for the FPSR to
>return 'too fresh' data? Or put another way, why should the following two
>code fragments behave differently?
>
>#1
>        fdiv.ddd r8,r2,r4
>        fldcr    r12,fcr62      ; fcr62 == FPSR
>        bb1      0,r12,@L21     ; AFINX bit
>
>#2
>        fdiv.ddd r8,r2,r4
>        tb1      0,r0,0         ; trap not taken, but system 'synced'
>        fldcr    r12,fcr62
>        bb1      0,r12,@L21
>

On the 88100 there is no built-in interlock between floating point
instructions and reads or writes to the FPSR. The "safe" way of modifying
the FPSR is to sync the processor just as your example did. Future
implementations of the 88000 architecture may or may not need this
synchronization, but it should never cause you any problems other than an
additional instruction.

Note that if this were an fstcr to the FPSR, your value might be
overwritten when the fdiv completes.

I looked briefly through the 88100 User's Manual and I could not find this
behavior documented either, but I will continue to look.

--
Marvin Denman
Motorola 88000 Design
cs.utexas.edu!oakhill!marvin
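The write hazard mentioned above (a value stored with fstcr being overwritten when the fdiv completes) can be sketched the same way. This is a hypothetical Python model, not the 88100's documented behaviour: the function `final_fpsr`, the op names, the 60-cycle latency, and the choice to OR the flag into the FPSR on completion are all assumptions made for illustration.

```python
AFINX = 1 << 0        # bit 0, as tested in the posted fragments
FDIV_LATENCY = 60     # assumed; the thread only says "sixty-odd cycles"


def final_fpsr(program):
    """Run (op, arg) pairs on a toy FPSR and return its value once all ops drain."""
    cycle = 0
    fpsr = 0
    pending = []                      # (done_cycle, flags) of in-flight ops

    def tick():
        nonlocal cycle, fpsr
        cycle += 1
        # Apply the flags of every op that has finished by this cycle.
        for item in [p for p in pending if p[0] <= cycle]:
            fpsr |= item[1]
            pending.remove(item)

    for op, arg in program:
        if op == "fdiv":              # arg: flags the divide will raise
            pending.append((cycle + FDIV_LATENCY, arg))
            tick()
        elif op == "sync":            # stand-in for the untaken tb1
            while pending:
                tick()
        elif op == "fstcr":           # arg: value written to the FPSR
            fpsr = arg
            tick()
        else:
            raise ValueError(f"unknown op: {op}")

    while pending:                    # let any remaining op complete
        tick()
    return fpsr


# Clearing the FPSR right after the divide: the late flag lands on top of it.
unsynced = [("fdiv", AFINX), ("fstcr", 0)]
# Syncing first: the divide has finished, so the cleared value survives.
synced = [("fdiv", AFINX), ("sync", None), ("fstcr", 0)]
```

In this model `final_fpsr(unsynced)` ends with the flag set even though the program wrote zero, while `final_fpsr(synced)` ends at zero.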