[comp.unix.aix] Help catching floating point exceptions

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (05/28/91)

After perusing the info reader (man -k fp produces a nice list of man
pages which man can't find...), I am attempting to use fp_enable_all
to generate floating point traps. It isn't clear from the manual what
an FP trap is on an RS/6000. Nevertheless, I'm still praying that
there's a way to get my programs to die gracelessly when they hit a
problem.

The program below calls fp_enable_all() and then divides by zero. No
signal is generated, but an explicit test for divide by zero produces
a positive result.

a is 1.000000
b is 0.000000
floating point traps ARE enabled.
divide by zero trap IS enabled.
a/b is 1.000000
divide by zero error.

How do I get AIX to generate a signal? And why isn't this discussed in
the Fortran manual?

Advance thanks. Program below.

-- greg

------------------------------ cut here ------------------------------

#include <stdio.h>
#include <signal.h>
#include <fptrap.h>

double foo( double b );

void myfunc( int i )
{
  printf( "myfunc hit with argument %d\n", i );
}

int main(void)
{
  int i;
  fp_enable_all();
  fp_enable( TRP_DIV_BY_ZERO );
  for( i = 1 ; i < 64 ; i++ )   /* catch all signals */
    (void) signal( i, myfunc );

  (void)foo( (double) 0.);
  return 0;
}

double foo( double b )
{
  double a = 1.;

/*  raise( SIGFPE );*/ /* works as expected if you uncomment */

  printf( "a is %f\n", a );
  printf( "b is %f\n", b );

  if( fp_any_enable() )
    printf( "floating point traps ARE enabled.\n" );
  if( fp_is_enabled( TRP_DIV_BY_ZERO ) )
    printf( "divide by zero trap IS enabled.\n" );

  printf( "a/b is %f\n", a/b ); /* should generate exception */
  if( fp_divbyzero() )
    printf( "divide by zero error.\n" ); /* this sees the flag set */

  return( a/b ); /* should generate another exception */
}

sdl@adagio.austin.ibm.com (sdl) (05/29/91)

In article <1991May27.213751.24223@murdoch.acc.Virginia.EDU> gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) writes:

greg> After perusing the info reader (man -k fp produces a nice list of man
greg> pages which man can't find...), I am attempting to use fp_enable_all
greg> to generate floating point traps. It isn't clear from the manual what
greg> an FP trap is on an RS/6000. Nevertheless, I'm still praying that
greg> there's a way to get my programs to die gracelessly when they hit a
greg> problem.

greg> The program below calls fp_enable_all() and then divides by zero. No
greg> signal is generated, but an explicit test for divide by zero produces
greg> a positive result.

AIX 3.1 does not support taking a trap on floating point exceptions.

The Risc System/6000 hardware will not generate a floating point trap
unless the machine is put into serialized mode (changing the FE bit in
the machine status register (MSR) to 1).  The instruction to do this
is privileged, and in AIX 3.1 there is no kernel service to do this.

BTW, in order to generate a trap you must also enable the
specific floating point exceptions that you want to trap, which is
what the fp_enable() and fp_enable_all() routines do.  However, as
I said above, you have to change MSR(FE) as well.

We're aware that there is a need for floating point exception trapping.
Be aware, however, that if/when it is available, it will only work
when the machine is in serialized mode, and this means a performance
impact.  For "production", or "performance critical", or whatever you
call your programs that have to run fast, you can get much better
performance by checking the floating point status bits using the
routines provided for that purpose (such as fp_divbyzero() in the case
of zero division) at "appropriate" times.  Fortran provides an inline
instruction for checking the floating point status and control
register for this same purpose, but I don't remember what it is
called.

If you're dealing with such issues and don't have this manual, you
should get it: IBM RISC System/6000 POWERstation and POWERserver Hardware
Technical Reference General Information, SA23-2643.  The chapter on
the floating point processor has much more information than any of the
on-line pubs.

Usual disclaimer:  I write code, not make policy;
this is not an official IBM pronouncement, etc.
--
--------------------------------------------------------------------
Stephen Linam   PSP Austin   T/L: 793-3674  Bell-net: (512) 823-3674
IBM Internet: sdl@adagio.austin.ibm.com        VNET: LINAM at AUSTIN
From outside IBM:  sdl@glasnost.austin.ibm.com

gl8f@astsun7.astro.Virginia.EDU (Greg Lindahl) (05/29/91)

In article <SDL.91May28222608@adagio.austin.ibm.com> sdl@glasnost.austin.ibm.com writes:

>We're aware that there is need for floating point exception trapping.

That's good. If I had known in advance I would have strongly
recommended against purchasing these machines. Thanks for your help;
the on-line manuals don't make it nearly as clear as your posting.

AER7101@TECHNION.BITNET (Zvika Bar-Deroma) (05/30/91)

In article <1991May27.213751.24223@murdoch.acc.Virginia.EDU>,
gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) says:
>
>The program below calls fp_enable_all() and then divides by zero. No
>signal is generated, but an explicit test for divide by zero produces
>a positive result.
>
>a is 1.000000
>b is 0.000000
>floating point traps ARE enabled.
>divide by zero trap IS enabled.
>a/b is 1.000000
>divide by zero error.
>
>How do I get AIX to generate a signal? And why isn't this discussed in
>the Fortran manual?

I've been told by my IBM rep.  that as IEEE didn't require that division
by zero  be trapped and  signalled, then (some/many/most)  RISC machines
don't, and he thinks, that unless  there's such a requirement from IEEE,
they won't also in the foreseeable future.

I very strongly think, that a company  that is trying to give some added
value to  its products should provide  the users with it.  The least I'd
expect is some flag,  that will enable such traps, even  if they are not
from the h/w but from the compiler.  I realize that this mode would mean
much slower performance,  but we are talking about  debugging (other IBM
compilers trap division  by zero/overflow, etc.). There are  quite a few
people  here, willing  to pay  for  a WATFOR77  (WATFIV's successor  for
FORTRAN-77, from WATCOM) compiler for the rs/6k.

So - my suggestion - let's open DCR's; let's have IBM at least provide
a s/w option for the purpose discussed above!

/Zvika

Zvika Bar-Deroma                                  Phone: (+972)-4-292706
Faculty of Aerospace Engineering,                 Fax  : (+972)-4-231848
Technion
Haifa 32000
Israel

BITNET        :   AER7101@TECHNION
Internet      :   AER7101@TECHNION.TECHNION.AC.IL
UUCP          :   ...!uunet!pucc.princeton.edu!technion!aer7101

sdl@adagio.austin.ibm.com (sdl) (05/30/91)

In article <91149.150333AER7101@TECHNION.BITNET> AER7101@TECHNION.BITNET (Zvika Bar-Deroma) writes:


Zvika> I've been told by my IBM rep.  that as IEEE didn't require that division
Zvika> by zero  be trapped and  signalled, then (some/many/most)  RISC machines
Zvika> don't, and he thinks, that unless  there's such a requirement from IEEE,
Zvika> they won't also in the foreseeable future.

Your rep. is correct.  From IEEE 854-1987:  "There are five types of
exceptions that shall be signaled when detected.  The signal entails
setting a status flag, taking a trap, or possibly doing both".
Moreover, IEEE requires that if you implement IEEE trapping, that
"A trap handler should have the capabilities of a subroutine that can
return a value to be used in lieu of the exceptional operation's
results."  In the case of invalid operation and divide by zero
exceptions, the signal handler must be delivered the operand values
of the operation.  However, in the RISC System/6000's pipelined mode,
the floating point processor will have already destroyed the
operands by the time the branch processor can take the branch to
the trap handler.  Thus, it can only generate IEEE floating point
traps in synchronous execution mode.

Usual Disclaimer:  I write code, not make policy.

--
--------------------------------------------------------------------
Stephen Linam   PSP Austin   T/L: 793-3674  Bell-net: (512) 823-3674
IBM Internet: sdl@adagio.austin.ibm.com        VNET: LINAM at AUSTIN
From outside IBM:  sdl@glasnost.austin.ibm.com

csrdh@marlin.jcu.edu.au (Rowan Hughes) (05/31/91)

In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:
>In article <91149.150333AER7101@TECHNION.BITNET> AER7101@TECHNION.BITNET (Zvika Bar-Deroma) writes:
>Zvika> I've been told by my IBM rep.  that as IEEE didn't require that division
>Zvika> by zero  be trapped and  signalled, then (some/many/most)  RISC machines
>Zvika> don't, and he thinks, that unless  there's such a requirement from IEEE,
>Zvika> they won't also in the foreseeable future.
>Your rep. is correct.  From IEEE 854-1987:  "There are five types of
>exceptions that shall be signaled when detected.  The signal entails
>setting a status flag, taking a trap, or possibly doing both".
>Moreover, IEEE requires that if you implement IEEE trapping, that
>"A trap handler should have the capabilities of a subroutine that can

DEC have released their new f77v3.0 compiler for the DECStations
(=R3000) and it traps all floating exceptions; the job aborts
with a core dump. I've done some benchmarking between the new and old
compilers (with and without fpe) and the performance has dropped by
less than 1/2%.  Fpe is enabled ALL the time, irrespective of
compiler options. There is an option for the job to continue after
a fault, and the user is notified at the end of execution.
Integer overflow can be trapped also.

It can be done easily on RISC, with no loss in performance.

-- 
Rowan Hughes                                James Cook University
Marine Modelling Unit                       Townsville, Australia.
Dept. Civil and Systems Engineering         csrdh@marlin.jcu.edu.au

beddini@uxh.cso.uiuc.edu (Robert A Beddini) (06/05/91)

I requested info on this from IBM a year ago, and voiced my opinions about
the lack of FPP traps at that time to our local IBM tech rep. He, in turn,
requested further info internally.  The response at that time was along the
lines of "here's the way the FPP can be checked,... why do you need to trap?"

We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
calculations. However it is still not being used for major code development,
and its use in this capacity will be limited until IBM recognizes the need
for user conveniences (necessities) such as FPP traps as compiler
options/defaults.

While we are on the subject, (sub)program addresses and line #'s should also
be loaded into a register and a stack maintained as a compiler option. The
cost is far less than that of the simplest floating point operation. This
would enable line info from traps with traceback - another very helpful
feature which has been available on more user-oriented workstations. This
compiler option could also be used to help gmon pin down CPU-intensive
portions of a large subprogram.
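
A hand-rolled sketch of the kind of bookkeeping being asked for: a
small stack of (subprogram, line) pairs that instrumented code keeps up
to date and that can be printed when a fault is detected. Everything
here (trace_enter, trace_line, trace_leave, trace_print, TRACE_MAX) is
hypothetical, and it makes no claim about how a compiler would actually
implement the option.

```c
#include <stdio.h>

#define TRACE_MAX 64            /* deepest nesting that is recorded */

struct trace_entry {
    const char *func;           /* subprogram name */
    int line;                   /* most recent line seen in it */
};

struct trace_entry trace_stack[TRACE_MAX];
int trace_depth = 0;

/* Called on entry to a subprogram. */
void trace_enter(const char *func, int line)
{
    if (trace_depth < TRACE_MAX) {
        trace_stack[trace_depth].func = func;
        trace_stack[trace_depth].line = line;
    }
    trace_depth++;              /* keep counting so enter/leave balance */
}

/* Called at statements of interest to update the current line. */
void trace_line(int line)
{
    if (trace_depth > 0 && trace_depth <= TRACE_MAX)
        trace_stack[trace_depth - 1].line = line;
}

/* Called on exit from a subprogram. */
void trace_leave(void)
{
    if (trace_depth > 0)
        trace_depth--;
}

/* Print the traceback, innermost subprogram first. */
void trace_print(FILE *out)
{
    int i = trace_depth < TRACE_MAX ? trace_depth : TRACE_MAX;

    while (i-- > 0)
        fprintf(out, "  in %s, line %d\n",
                trace_stack[i].func, trace_stack[i].line);
}
```

Instrumented code would call trace_enter("name", __LINE__) at the top
of each routine and trace_leave() before returning. A flag check or
signal handler could then call trace_print(stderr), which is fine for
debugging although fprintf is not guaranteed safe inside a handler.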


It is regrettable that the subject of the need for FPP traps continues to be
discussed at length without IBM recognizing that this capability should be a
priority. Once a code under development begins to work properly, it is an
easy matter to enable all the compiler optimization options.

jsalter@ibmpa.awdpa.ibm.com (06/06/91)

In article <1991May31.071822.9206@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:
>DEC have released their new f77v3.0 compiler for the DECStations
>(=R3000) and it traps all floating exceptions; the job aborts
>with a core dump. I've done some benchmarking between the new and old
>compilers (with and without fpe) and the performance has dropped by
>less than 1/2%.  Fpe is enabled ALL the time, irrespective of
>compiler options. [...]
>It can be done easily on RISC, with no loss in performance.

True, it *can* be done on RISC architectures.  That's not the point.
The point is whether it should be done on the POWER architecture the
'6000s employ.  I don't believe the R3000 is super-scalar, thereby
limiting the effects of trapping.  Since the '6000 is, instructions are
scheduled out-of-order by the compilers to ensure the pipeline remains
full for as long as possible.

What trapping does is *serialize* the execution, forcing the instructions
to be in-order, thus greatly limiting the performance by, as has been noted
many times, an order-of-magnitude.  This is not a trivial performance hit.

The first-generation SPARC may be only 2-3 MFLOPS, but to bring the '6000
down to 2-3 MFLOPS really hurts.  I think everyone would agree that the
FP performance of the '6000 is one of its best points.  To turn trapping
on all the time would be marketing suicide.

jim/jsalter  IBM PSP, Palo Alto  T465/(415)855-4427  VNET: JSALTER at AUSVMQ
Internet: jsalter@slo.awdpa.ibm.com         UUCP: ..!uunet!ibmsupt!jsalter 
"IBM part #23521, aka Lt. Commander Data"    The stuff above is on my own.

jsalter@ibmpa.awdpa.ibm.com (06/06/91)

In article <1991Jun5.030043.3520@ux1.cso.uiuc.edu> beddini@uxh.cso.uiuc.edu (Robert A Beddini) writes:
>I requested info on this from IBM a year ago, and voiced my opinions about
>the lack of FPP traps at that time to our local IBM tech rep. He, in turn,
>requested further info internally.  The response at that time was along the
>lines of "here's the way the FPP can be checked,... why do you need to trap?"

He didn't talk to the right person.  We knew before release that IEEE-754
exception handling and trapping was necessary to truly make it in the
marketplace.

>We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
>calculations. However it is still not being used for major code development,
>and its use in this capacity will be limited until IBM recognizes the need
>for user conveniences (necessities) such as FPP traps as compiler
>options/defaults.

We've recognized this need.  We recognized this before release, as noted
above, but because of time-to-market pressures, we were not able to make
this functionality available.  If you'll notice, the <fptrap.h> and <FP.h>
and <fpxcp.h> header files in AIXv3 all point to the fact that we knew
exception handling and trapping was important.

>It is regrettable that the subject of the need for FPP traps continues to
>be discussed at length without IBM recognizing that this capability should
>be a priority.

This is false.  We've recognized it, and are working hard to implement it.
The <fptrap.h> file proves we recognized the need a long time ago.  The
implementation, of course, is not trivial.  But, it should be available in
the next major AIX release.

jim/jsalter  IBM PSP, Palo Alto  T465/(415)855-4427  VNET: JSALTER at AUSVMQ
Internet: jsalter@slo.awdpa.ibm.com         UUCP: ..!uunet!ibmsupt!jsalter 
"IBM part #23521, aka Lt. Commander Data"    The stuff above is on my own.

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (06/07/91)

In article <1991Jun6.032233.10629@ibmpa.awdpa.ibm.com> jsalter@slo.awdpa.ibm.com (Jim Salter) writes:

>What trapping does is *serialize* the execution, forcing the instructions
>to be in-order, thus greatly limiting the performance by, as has been noted
>many times, an order-of-magnitude.  This is not a trivial performance hit.

This is known as a design flaw. If it was too expensive to give a trap
on the "correct" instruction, then give me a trap that happens
sometime after the fault happens. That way I can run full-speed, and
have the code crash *near* the offending instruction. Or I could run
slower and be exact.

Crays have done this for years. If you can't give the best
environment, second-best is much better than nothing.

rcd@ico.isc.com (Dick Dunn) (06/07/91)

jsalter@ibmpa.awdpa.ibm.com responds to beddini@uxh.cso.uiuc.edu (Robert A
Beddini)...several exchanges with about the same tone, here's one:

Beddini:
> >We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
> >calculations. However it is still not being used for major code development,
> >and its use in this capacity will be limited until IBM recognizes the need
> >for user conveniences (necessities) such as FPP traps as compiler
> >options/defaults.

Salter:
> We've recognized this need.  We recognized this before release, as noted
> above, but because of time-to-market pressures, ...[etc]...
[cites evidence in header files that they know there's a problem]

> The <fptrap.h> file proves we recognized the need a long time ago.  The
> implementation, of course, is not trivial...

Looks to me like the issue in dispute is what it means to "recognize" a
problem.

Perhaps what Beddini is really saying is that he wants the problem "fixed",
rather than just "recognized"?

I can appreciate that the problem might be difficult to fix after the
fact...but it *is* a fairly fundamental need.  Seems odd that it would be
something that must be added on a year or so later, as opposed to being
designed in.
-- 
Dick Dunn     rcd@ico.isc.com -or- ico!rcd       Boulder, CO   (303)449-2870
   ...Simpler is better.

Zvika Bar-Deroma <AER7101@TECHNION.BITNET> (06/07/91)

In article <1991Jun6.033716.10920@ibmpa.awdpa.ibm.com>,
jsalter@ibmpa.awdpa.ibm.com says:
>
>
>
>This is false.  We've recognized it, and are working hard to implement it.
>The <fptrap.h> file proves we recognized the need a long time ago.  The
>implementation, of course, is not trivial.  But, it should be available in
>the next major AIX release.
One of the next major releases??? It is crucial, and considered by
people who purchased/want to buy it as a number cruncher (and that's
what it is, right?) as the single biggest thing they are missing,
at least at the s/w level (as an option to xlf)!

/Zvika

csrdh@marlin.jcu.edu.au (Rowan Hughes) (06/08/91)

In <1991Jun6.032233.10629@ibmpa.awdpa.ibm.com> jsalter@ibmpa.awdpa.ibm.com writes:

>In article <1991May31.071822.9206@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>>In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:
>>DEC have released their new f77v3.0 compiler for the DECStations
>>(=R3000) and it traps all floating exceptions; the job aborts
>>with a core dump. I've done some benchmarking between the new and old
>>compilers (with and without fpe) and the performance has dropped by
>>less than 1/2%.  Fpe is enabled ALL the time, irrespective of
>>compiler options. [...]

>True, it *can* be done on RISC architectures.  That's not the point.
>The point is whether it should be done on the POWER architecture the
>'6000s employ.  I don't believe the R3000 is super-scalar, thereby
>limiting the effects of trapping.  Since the '6000 is, instructions are
>scheduled out-of-order by the compilers to ensure the pipeline remains
>full for as long as possible.

Correct, the R3000 is not super-scalar.

There are still a few points missing. If fpe were used, would the
memory fetches HAVE to be serial, or would the fpe trap merely be
uncertain as to the location of the error on a super-scalar machine?
If it's the latter, then fpe should be on all the time. If it's the
former, then fpe should be allowable in debug mode, with the resulting
drop in performance.

I consider fpe traps to be essential in all my work. It should be a 
compiler option, even if the performance drop is significant. I'm
getting an IBM540 in a few days, so I hope you IBM'ers are working
on this right now !!! The DECheads have seen the light.

-- 
Rowan Hughes                                James Cook University
Marine Modelling Unit                       Townsville, Australia.
Dept. Civil and Systems Engineering         csrdh@marlin.jcu.edu.au