[comp.arch] IEEE floating point & various approaches; long

mash@mips.COM (John Mashey) (11/01/90)

In article <2451@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
...
>The true Cray and Convex users do not care about IEEE.  Their machines do
>not have IEEE conformant arithmetic; far from it.  They want results; fast.
>They do not care about correctness.  :-)

Note, also, that this somewhat applies to the IBM RS/6000 series.
Legally, but unlike almost all other IEEE FP implementations,
FP OPERATIONS DO NOT NORMALLY TRAP ON EXCEPTION CONDITIONS;
i.e., SIGFPE doesn't normally do anything.

Your choices are:
	a) Run the code in a mode where FP operations are sequentialized,
	which of course seriously degrades performance, but gives
	precise exceptions. (good for debug)
	b) Make explicit calls in your code to routines that test the
	status of the various flags. (probably not favored by programmers,
	especially since it de-portabilizes otherwise-portable code.)
	c) (I'm not sure if this is shipped yet, or not; does anybody KNOW?):
	use compiler switches that
	generate the relevant tests, like at end of statement, end of
	function, or program exit.

There are some reasonable reasons for this, and I attach the relevant
quotes from the IBM J of R & D that describes the RS/6000.
HOWEVER, people may want to be warned that code ported from other machines,
may possibly NOT trap exceptions the way that most other IEEE machines do.
(I would assume that if you trap SIGFPE, that the runtime at least, on
program exit, tells you that there has been an exception "sometime".
If it doesn't do this, I suspect there is a bunch of code out there
that thought it was protecting itself, and didn't.)

Here's what the IBM man page says:
  fp_clr_flag, fp_set_flag, fp_read_flag, or fp_swap_flag 
.....
    Description

      The RISC System/6000 currently does not generate an interrupt
  for floating-point exceptions.  Therefore, the common  method  of
  catching  the  signal SIGFPE  and  calling  an  appropriate  trap
  handler to identify a floating-point exception is not supported.

      These subroutines aid in determining when an exception has
  occurred and the exception type.  These subroutines can be called
  explicitly  around blocks of code that may cause a floating-point
  exception.

Then, here is some explanation (long):

From IBM J Res & Dev, Vol 34, No 1, January 1990 (IBM RS/6000 issue)

p.33-34 (CAPITALS MINE)
"Another very important aspect of fully exploiting floating-point performance
is the method of presentation of floating-point exceptions and the precision
in identifying the instructions that cause floating-point exceptions.
Exceptions are a natural and perhaps expected consequence of floating-point
operations, and most can be handled by default rules.  (Default exception
handling is defined by the IEEE standard.)  These default rules can be
managed completely in hardware and require no program intervention after
initialization.

The IEEE default rules do not always provide the desired result, however.
Since the standard allows for a program fix-up after an exception, the
architecture problem is then to define a mechanism to permit program fix-up.
The most straightforward approach is to specify that a floating-point
interrupt at the failing instruction will occur whenever there is a
floating-point exception that is not defaulted.  The hardware implication
of this is that all instructions after a floating point instruction must
be conditional until it is known that no exceptions are possible on that
instruction.  Some floating point instructions take many cycles, and exceptions
may not be known until the last cycle of the instruction.  Therefore, most
implementations would serialize on floating-point instructions--if not the
first, then the second; if not all, then some.  The inclusion of a floating-
point interrupt would sacrifice much of the potential floating-point
performance.

AN ALTERNATIVE STRATEGY IS NOT TO REPORT AN INTERRUPT AT ALL, BUT SIMPLY TO
SET A BIT INDICATING THAT A FLOATING-POINT EXCEPTION HAS OCCURRED.  IT IS
THEN UP TO A PROGRAM TO TEST FOR FLOATING-POINT EXCEPTIONS.  Different
compiler strategies can be used as to where it is appropriate to test for
these exceptions.  Since the definition of the exception also includes the
setting of summary information, it is possible to test at the end of a program,
at the end of a subprogram, or at the end of a statement where a floating-point
operation was used.  This level of precision can be controlled by
linker/compiler option.  None of these tell exactly where the exception
occurred; they simply identify where it occurred.  In most cases, this
information is sufficient.

However, if the exact failing instruction must be known, there are two
possible strategies.  One can insert a test for the exception after each
floating-point instruction, or one can tag each queued and/or executing
floating-point instruction with its address.  Inserting code to test for
every possible exception is yet another mode for the compiler to manage,
necessitates recompilation, and can significantly expand execution time.
Address tagging of "active" floating-point instructions identifies the
failing instruction exactly.  However, it does require that the
implementation keep track of the address tags.  Moreover, it is not
synchronous; that is, if an exception occurs, the location of the
failing instruction is reported, but not before the program
has gone beyond that point.  Fix-up may still be possible, but in general
this method only permits localization of the failing instruction.
Consider the case of the inner-loop product described in Figure 7.
This loop consists of two floating-point loads, one floating-point
multiply-add, and one branch.  The "active" floating-point instructions
will all be instances of the same multiply-add instruction.  If an
exception occurs, what is known is the address of the instruction, not
the iteration number.  The benefit of this approach is speed; floating-point
performance is not limited by exception recognition.  The drawback, as
outlined above, is the precision with which the fault is determined.

RISC System/6000 architecture adopted a two-part strategy.  THE PRINCIPAL
APPROACH WOULD BE TEST-CODE INSERTION, with the compilers able to insert
such code at the statement or (sub)program level.  The linker also
supports the enabling of test code at program exit, ensuring the ability 
to report a floating-point exception if it occurs anywhere within the
program.

To avoid recompilation in order to identify the failing operation exactly,
the architecture also adopted a synchronize mode, in which an interrupt
can be generated, identifying the failing instruction by running the
machine with one floating-point instruction dispatched at a time.  This
technique has the same weakness as code insertion; THAT IS, FLOATING-POINT
PERFORMANCE IS GREATLY REDUCED.  However, it may not be as bad as code
insertion, because the synchronization can be managed by hardware rather than
by extra code inserted by software.  IT IS EXPECTED THAT THE MODE WILL ONLY
BE USED BY CERTAIN PROGRAMS AND THEN ONLY TO DEBUG THEIR ALGORITHMS."
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/01/90)

In article <42597@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
> Note, also, that this somewhat applies to the IBM RS/6000 series.
> Legally, but unlike almost all other IEEE FP implementations,
> FP OPERATIONS DO NOT NORMALLY TRAP ON EXCEPTION CONDITIONS;
> i.e., SIGFPE doesn't normally do anything.

Am I missing something?  Any floating-point system where the *default*
mode of operation is to generate traps does *NOT* in that mode conform
to the IEEE 754 standard.  If you _want_ traps, then you have to call
some system-specific extension to get them.  So the complaint appears
to be that the RS/6000 (like the Sun implementations of floating point)
conforms to the letter of the standard.
-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.

mash@mips.COM (John Mashey) (11/02/90)

In article <4174@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <42597@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
>> Note, also, that this somewhat applies to the IBM RS/6000 series.
>> Legally, but unlike almost all other IEEE FP implementations,
>> FP OPERATIONS DO NOT NORMALLY TRAP ON EXCEPTION CONDITIONS;
>> i.e., SIGFPE doesn't normally do anything.
>
>Am I missing something?  Any floating-point system where the *default*
>mode of operation is to generate traps does *NOT* in that mode conform
>to the IEEE 754 standard.  If you _want_ traps, then you have to call
>some system-specific extension to get them.  So the complaint appears
>to be that the RS/6000 (like the Sun implementations of floating point)
>conforms to the letter of the standard.

Sorry,  I guess I wasn't clear enough. Let me say what I said again,
but explain further:

1) In IEEE 754:
	1a
	"The implementor may, at his option, implement the following
	modes: traps disabled/enabled, to handle exceptions."

	1b
	"There are five types of exceptions that shall be signaled when
	detected.  The signal entails setting a status flag, taking a
	trap, or possibly doing both. With each exception should be
	associated a trap under user control, as specified in Section 8.
	THE DEFAULT RESPONSE TO AN EXCEPTION SHALL BE TO PROCEED WITHOUT
	A TRAP....For each type of exception the implementation shall
	provide a status flag that shall be set on any occurrence of
	the corresponding exception when no corresponding trap occurs."

	1c
	"A user should be able to request a trap on any of the five
	exceptions by specifying a handler for it......When an
	exception whose trap is disabled is signaled, it shall be
	handled in the manner specified in Section 7. [mash: i.e.,
	by setting flags]  WHEN AN EXCEPTION WHOSE TRAP IS ENABLED
	IS SIGNALED THE EXECUTION OF THE PROGRAM IN WHICH THE EXCEPTION
	OCCURRED SHALL BE SUSPENDED, THE TRAP HANDLER PREVIOUSLY SPECIFIED
	BY THE USER SHALL BE ACTIVATED, AND A RESULT, IF SPECIFIED IN
	SECTION 7, SHALL BE DELIVERED TO IT."  goes on to describe what
	trap handlers can do.

2) Now, I take from this that what IBM did IS legal, from 1a above:
they basically made trapping exceptions impossible in normal code
(unless you run in "sequentialize" mode, which is NOT something you'd
do except for debugging.) 

3) Now, what I meant in the original quote is:
	a) The default (for anybody) is to ignore exceptions,
	if nobody says anything.
	b) On most, if not all IEEE-compliant machines, if you put
	a trap in for SIGFPE, you get feature 1c; at least, you get
	an exception signalled, and at worst, you go to a part of the
	run-time that tells you something went wrong, and where, and how.
	At best, you get a trap handler that can do what 1c says,
	fixing up values, etc. (I don't know how good all the various
	implementations are at doing this.)
	You get this on the same binary that you would normally distribute,
	that runs at full speed, and you get it by including 1 statement.
	You DO NOT GET this effect from the same IBM binary that
	runs at full speed. 

4) So, anyway, the point is: it's perfectly legal, (and the committee
carefully allowed for such things), but the effect is different:
if you just compile the same portable code that may run on many other
machines, and on those machines, will at least report exceptions
fairly precisely, in the binary you'd distribute, you will NOT get
that effect on the IBMs.  Clearly, the supercomputer world has proved
that there exists a market that wants max speed, perhaps with some
loss in error handling. People should understand what the difference
is.

Now, how about input from people who actually develop FP programs:
a) Do you use traps for SIGFPE, or not?
b) If SIGFPE doesn't do anything, and you have to change the source code
widely to check for traps, is this: no problem, slightly painful,
or excruciating?
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (11/02/90)

In article <42618@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>
>Now, how about input from people who actually develop FP programs:
>a) Do you use traps for SIGFPE, or not?

No. In my scientific codes, the correct answer for an underflow is zero, 
and I test for zero before I divide, if this could ever happen
in normal use. In very rare cases I have seen NaNs and Infs. These
imply to me that there is a bug in my code. So I debug it.

Doug McDonald

jonah@dgp.toronto.edu (Jeff Lee) (11/02/90)

mash@mips.COM (John Mashey) writes:
>Now, how about input from people who actually develop FP programs:
>a) Do you use traps for SIGFPE, or not?

I'm not a numerical analyst, but I did recently take a graduate course
on numerical software and the advice given by the professor was to
avoid traps and make explicit tests for +INF, -INF, NaN, and 0 where
appropriate.  Usually only one or two are appropriate in any given
case and explicitly checking makes sure that you *think* about the
possibilities for overflow/underflow, how to avoid it (if possible),
and how to retain accuracy.  The opinion on traps was that they were
generally more trouble than they were worth, especially if the trap
handler overhead was any significant amount.
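
The explicit-test style described here might look like this in C.  It is
only a sketch: classify_value and checked_divide are hypothetical names, and
the isnan/isinf macros are the C99 <math.h> ones.  Which tests are
appropriate depends on the computation.

```c
#include <math.h>

typedef enum { VAL_OK, VAL_ZERO, VAL_POS_INF, VAL_NEG_INF, VAL_NAN } val_class;

/* Sorts a value into the cases worth thinking about. */
val_class classify_value(double x)
{
    if (isnan(x)) return VAL_NAN;
    if (isinf(x)) return x > 0.0 ? VAL_POS_INF : VAL_NEG_INF;
    if (x == 0.0) return VAL_ZERO;       /* true for -0.0 as well */
    return VAL_OK;
}

/* Divides only when the quotient is usable.  Returns 1 and stores the
   quotient on success; returns 0 and leaves *quot alone otherwise.
   An underflow to zero is accepted as a valid answer. */
int checked_divide(double num, double den, double *quot)
{
    double q;

    if (classify_value(den) == VAL_ZERO || classify_value(den) == VAL_NAN ||
        classify_value(num) == VAL_NAN)
        return 0;
    q = num / den;
    if (classify_value(q) != VAL_OK && classify_value(q) != VAL_ZERO)
        return 0;                        /* overflow, or Inf involved */
    *quot = q;
    return 1;
}
```

Nothing here depends on traps or on SIGFPE, so it behaves the same whether
or not the machine can deliver a floating-point interrupt.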

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/02/90)

In article <42618@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
> 	1b
> 	"There are five types of exceptions that shall be signaled when
		   ****
> 	detected.  The signal entails setting a status flag, taking a
> 	trap, or possibly doing both. With each exception should be
> 	associated a trap under user control, as specified in Section 8.
> 	THE DEFAULT RESPONSE TO AN EXCEPTION SHALL BE TO PROCEED WITHOUT
> 	A TRAP....For each type of exception the implementation shall
> 	provide a status flag that shall be set on any occurrence of
> 	the corresponding exception when no corresponding trap occurs."
> 
> 	1c
> 	"A user should be able to request a trap on any of the five
> 	exceptions by specifying a handler for it......When an
					       **
> 	exception whose trap is disabled is signaled, it shall be
> 	handled in the manner specified in Section 7. [mash: i.e.,
> 	by setting flags]  WHEN AN EXCEPTION WHOSE TRAP IS ENABLED
> 	IS SIGNALED THE EXECUTION OF THE PROGRAM IN WHICH THE EXCEPTION
> 	OCCURRED SHALL BE SUSPENDED, THE TRAP HANDLER PREVIOUSLY SPECIFIED
> 	BY THE USER SHALL BE ACTIVATED, AND A RESULT, IF SPECIFIED IN
> 	SECTION 7, SHALL BE DELIVERED TO IT."  goes on to describe what
> 	trap handlers can do.

> 	b) On most, if not all IEEE-compliant machines, if you put
> 	a trap in for SIGFPE, you get feature 1c;

It's worth pointing out that SIGFPE does not meet the IEEE criteria.
SIGFPE is a blanket "some kind of floating point exception, or maybe
some kind of integer exception, _I_ know but I'm not going to tell _you_".
(That is, according to SVID 2.  I don't know what SVr4 does.  4.xBSD
does have the courtesy to tell you what kind of exception you got.)
The IEEE standard calls for *separate* enabling and disabling of each
of the five traps, with separate handlers of the user's choice for each.
I'm going to cite the SVID rather than 1003.1, and that's for two reasons.
First, my copy of 1003.1 is in another city and I don't remember exactly
what it says.  Second, the degree to which the RS/6000 behaviour matters has
something to do with the degree to which you have been able to rely on
SIGFPE in the past, and 1003.1 is not the past.  The SVID says that
	SIGFPE		floating-point exception
			[What does this mean?  Which things generate
			exceptions?  The SVID does not say.  You have
			never been given _any_ guarantee.]
when you supply a handler:
	"the receiving process is to execute the signal-catching function...
	The signal number sig [here SIGFPE] will be passed as the only
	argument to the signal-catching function.
	Additional arguments may be passed to the signal-catching
	function for hardware-generated signals."
	[The last two sentences seem contradictory.  I take it that if
	you have hardware floating point it may pass additional arguments
	to your signal handler, but a software emulation may only pass
	SIGFPE and nothing else.  Absurd, but that seems to be what it says.]
	"Upon return from the signal-catching function, the receiving
	process will resume execution at the point at which it was
	interrupted, except for implementation defined signals where this
	may not be true."
	[I.e. it's anyone's guess what return from a handler may do, but
	the system documentation should _say_.]
			
So UNIX code written to the SVID
    -- never had any guarantee about what floating-point operations
       could generate signals
    -- had no way of telling the difference between underflow and overflow
       or for that matter between overflow and INTEGER divide by 0
    -- had no way of finding out where the exception had occurred
    -- could not resume or bypass the operation, but had either to
       halt the program or longjmp() out

UNIX systems providing IEEE-conforming traps do exist, but differ.
-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.

akhiani@ricks.enet.dec.com (Homayoon Akhiani) (11/02/90)

Can anyone mail me the following article?  I cannot find it.

[IEEE floating point & various approaches;long]

thanks in advance,

akhiani@ricks.enet.dec.com

henry@zoo.toronto.edu (Henry Spencer) (11/03/90)

In article <1990Nov1.232508.18287@jarvis.csri.toronto.edu> jonah@dgp.toronto.edu (Jeff Lee) writes:
>...The opinion on traps was that they were
>generally more trouble than they were worth, especially if the trap
>handler overhead was any significant amount.

All the more so if you run into a situation like Mike O'Dell tells about
at Prisma, in which catching the trap drastically slows down your code
(because it demands that the trap occur at well-defined times, which a
heavily pipelined blazing-fast machine has real trouble with).
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

mash@mips.COM (John Mashey) (11/03/90)

In article <4186@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:

>It's worth pointing out that SIGFPE does not meet the IEEE criteria.
>SIGFPE is a blanket "some kind of floating point exception, or maybe
>some kind of integer exception, _I_ know but I'm not going to tell _you_".
>(That is, according to SVID 2.  I don't know what SVr4 does.  4.xBSD
>does have the courtesy to tell you what kind of exception you got.)
>The IEEE standard calls for *separate* enabling and disabling of each
>of the five traps, with separate handlers of the user's choice for each.
>I'm going to cite the SVID rather than 1003.1, and that's for two reasons.
>First, my copy of 1003.1 is in another city and I don't remember exactly
>what it says.  Second, the degree to which the RS/6000 behaviour matters has
>something to do with the degree to which you have been able to rely on
>SIGFPE in the past, and 1003.1 is not the past.  The SVID says that

1) SVID, to be honest, was fairly irrelevant to this, in that it
described an interface, but gave as a future direction IEEE
exception-handling.  (note, of course, that FP was not exactly a major
concern of that issue of the SVID, and I'd certainly guess that the bulk
of the UNIX-based FP computing has been done on BSD-derived OSs,
or merged variants.)

2) People who've built IEEE-based computers who worried about FP
have long ago included some approximation or other to full IEEE-signal
handlers.  I'm sure Sun did; we did, and I'd assume others did also.
Maybe somebody from Sun & HP would say what they do.  We of course let you
turn on/off specific traps, and when you get a SIGFPE, there's a field
to tell you which one it was.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/05/90)

In article <42677@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
> In article <4186@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:

> 1) SVID, to be honest, was fairly irrelevant to this,

Eh?  Oh gosh, this means I have been fantastically stupid.
You see, when I tried to write numeric code that would run under "UNIX",
I used the SVID as a portability guide.  

> in that it described an interface, but gave as a future direction IEEE
> exception-handling.

Darn it, so _that_ was the answer.  Instead of trying to write programs
that would use the facilities available *now* (actually, *then*) I should
have waited for the future (no IEEE signal handling in V.3).

BSD UNIX *does* provide adequate information to a SIGFPE handler,
for my former purposes, except that the documentation could do with
improving (what is the difference between FPE_FLTOVF_FAULT and
FPE_FLTOVF_TRAP, for example?).  But if your code is to run on a V.2
box, that really doesn't help, it merely adds the pain of Tantalus
to that of Sisyphus.

> 2) People who've built IEEE-based computers who worried about FP
> have long ago included some approximation or other to full IEEE-signal
> handlers.  I'm sure Sun did; we did, and I'd assume others did also.

(a) Not all.  (I suppose it depends on how worried they were.)
(b) Certainly Sun did.  Twice.  And Sun's software FP on the 3/50 never
generated no signals nohow (I don't know about 4.x). People who wanted
their code to be portable *had* to write it so that it would work
without IEEE exception handling, even if they only wanted their code to
work on IEEE + 4.xBSD UNIX boxes, even if they further restricted
portability to MC680x0-based machines, because there wasn't a
*standard*, not even a de-facto standard, interface.  The really
horrible thing was that Sun were one of a very small number of companies
that obeyed the letter of IEEE 754 law and gave you the default
behaviour required by that standard, so not only could you not rely on
*getting* interrupts, you couldn't rely on *not* getting them either.

I suppose the contrast is between companies/universities/research units
that had serious number-crunching requirements and picked a UNIX system
to suit and individuals who having a UNIX system ready to hand wanted to
get some number-crunching done and wanted it to work on whichever system
they'd be able to get time on next.

-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (11/06/90)

  While this discussion is interesting, I suspect that most of the FP
code in the world runs without exceptions, trapped, untrapped, defined
or anonymous. That's what they pay numerical analysts to ensure.

  I think the IEEE behavior will come in the future, but I don't see a
huge demand for it. Partially because the big CPU users are usually
running on some box with non-IEEE math anyway.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
      The Twin Peaks Halloween costume: stark naked in a body bag

khb@chiba.Eng.Sun.COM (Keith Bierman fpgroup) (11/06/90)

In article <42677@mips.mips.COM> mash@mips.COM (John Mashey) writes:
...
   2) People who've built IEEE-based computers who worried about FP
   have long ago included some approximation or other to full IEEE-signal
   handlers.  I'm sure Sun did; we did, and I'd assume others did also.
   Maybe somebody from Sun & HP would say what they do.  We of course let you
   turn on/off specific traps, and when you get a SIGFPE, there's a field
   to tell you which one it was.


This is a topic worthy of a long discussion, however I will cop out
and note that

	800-3555-10	Numerical Computation Guide

was our (speaking as member of the fpgroup, I did not actually touch
the text much) last best shot at it. The Reader's Digest version is

	istat=ieee_handler("set",ieee_exception_of_your_choice,handler)

The NCG is part of the "normal" languages release, so f77v1.3+, c1.0+,
etc. releases should have it tucked into the binder. It is in
"binder2" of the Fortran docs, for example.

--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043