gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (05/28/91)
After perusing the info reader (man -k fp produces a nice list of man
pages which man can't find...), I am attempting to use fp_enable_all
to generate floating point traps. It isn't clear from the manual what
an FP trap is on an RS/6000. Nevertheless, I'm still praying that
there's a way to get my programs to die gracelessly when they hit a
problem.

The program below calls fp_enable_all() and then divides by zero. No
signal is generated, but an explicit test for divide by zero produces
a positive result.

a is 1.000000
b is 0.000000
floating point traps ARE enabled.
divide by zero trap IS enabled.
a/b is 1.000000
divide by zero error.

How do I get AIX to generate a signal? And why isn't this discussed in
the Fortran manual?

Advance thanks. Program below.

-- greg

------------------------------ cut here ------------------------------
#include <stdio.h>
#include <signal.h>
#include <fptrap.h>

double foo( double b );

void myfunc( int i )
{
    printf( "myfunc hit with argument %d\n", i );
}

main()
{
    int i;

    fp_enable_all();
    fp_enable( TRP_DIV_BY_ZERO );

    for( i = 1 ; i < 64 ; i++ ) /* catch all signals */
        (void) signal( i, myfunc );

    (void)foo( (double) 0.);
}

double foo( double b )
{
    double a = 1.;

    /* raise( SIGFPE );*/ /* works as expected if you uncomment */

    printf( "a is %f\n", a );
    printf( "b is %f\n", b );

    if( fp_any_enable() )
        printf( "floating point traps ARE enabled.\n" );
    if( fp_is_enabled( TRP_DIV_BY_ZERO ) )
        printf( "divide by zero trap IS enabled.\n" );

    printf( "a/b is %f\n", a/b ); /* should generate exception */

    if( fp_divbyzero() )
        printf( "divide by zero error.\n" ); /* this sees the flag set */

    return( a/b ); /* should generate another exception */
}
sdl@adagio.austin.ibm.com (sdl) (05/29/91)
In article <1991May27.213751.24223@murdoch.acc.Virginia.EDU> gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) writes:

greg> After perusing the info reader (man -k fp produces a nice list of man
greg> pages which man can't find...), I am attempting to use fp_enable_all
greg> to generate floating point traps. It isn't clear from the manual what
greg> an FP trap is on an RS/6000. Nevertheless, I'm still praying that
greg> there's a way to get my programs to die gracelessly when they hit a
greg> problem.

greg> The program below calls fp_enable_all() and then divides by zero. No
greg> signal is generated, but an explicit test for divide by zero produces
greg> a positive result.

AIX 3.1 does not support taking a trap on floating point exceptions.
The RISC System/6000 hardware will not generate a floating point trap
unless the machine is put into serialized mode (by changing the FE bit
in the machine state register (MSR) to 1). The instruction to do this
is privileged, and in AIX 3.1 there is no kernel service to do this.

BTW, in order to generate a trap you must also enable the specific
floating point exceptions that you want to trap, which is what the
fp_enable() and fp_enable_all() routines do. However, as I said above,
you have to change MSR(FE) as well.

We're aware that there is need for floating point exception trapping.
Be aware, however, that if/when it is available, it will only work when
the machine is in serialized mode, and this means a performance impact.
For "production", or "performance critical", or whatever you call your
programs that have to run fast, you can get much better performance by
checking the floating point status bits using the routines provided for
that purpose (such as fp_divbyzero() in the case of zero division) at
"appropriate" times.

Fortran provides an inline instruction for checking the floating point
status and control register for this same purpose, but I don't remember
what it is called.

If you're dealing with such issues and don't have this manual, you
should get it: IBM RISC System/6000 POWERstation and POWERserver
Hardware Technical Reference General Information, SA23-2643. The
chapter on the floating point processor has much more information than
any of the on-line pubs.

Usual disclaimer: I write code, not make policy; this is not an
official IBM pronouncement, etc.
--
--------------------------------------------------------------------
Stephen Linam PSP Austin T/L: 793-3674 Bell-net: (512) 823-3674
IBM Internet: sdl@adagio.austin.ibm.com VNET: LINAM at AUSTIN
From outside IBM: sdl@glasnost.austin.ibm.com
gl8f@astsun7.astro.Virginia.EDU (Greg Lindahl) (05/29/91)
In article <SDL.91May28222608@adagio.austin.ibm.com> sdl@glasnost.austin.ibm.com writes:

>We're aware that there is need for floating point exception trapping.

That's good. If I had known in advance I would have strongly
recommended against purchasing these machines.

Thanks for your help; the on-line manuals don't make it nearly as
clear as your posting.
AER7101@TECHNION.BITNET (Zvika Bar-Deroma) (05/30/91)
In article <1991May27.213751.24223@murdoch.acc.Virginia.EDU>, gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) says:
>
>The program below calls fp_enable_all() and then divides by zero. No
>signal is generated, but an explicit test for divide by zero produces
>a positive result.
>
>a is 1.000000
>b is 0.000000
>floating point traps ARE enabled.
>divide by zero trap IS enabled.
>a/b is 1.000000
>divide by zero error.
>
>How do I get AIX to generate a signal? And why isn't this discussed in
>the Fortran manual?

I've been told by my IBM rep. that as IEEE didn't require that division
by zero be trapped and signalled, then (some/many/most) RISC machines
don't, and he thinks, that unless there's such a requirement from IEEE,
they won't also in the foreseeable future.

I very strongly think that a company that is trying to give some added
value to its products should provide the users with it. The least I'd
expect is some flag that will enable such traps, even if they come not
from the h/w but from the compiler. I realize that this mode would mean
much slower performance, but we are talking about debugging (other IBM
compilers trap division by zero/overflow, etc.). There are quite a few
people here willing to pay for a WATFOR77 (WATFIV's successor for
FORTRAN-77, from WATCOM) compiler for the rs/6k.

So - my suggestion - let's open DCR's; let's have IBM at least provide
a s/w option for the purpose discussed above!

/Zvika

Zvika Bar-Deroma                     Phone: (+972)-4-292706
Faculty of Aerospace Engineering,    Fax  : (+972)-4-231848
Technion
Haifa 32000
Israel
BITNET   : AER7101@TECHNION
Internet : AER7101@TECHNION.TECHNION.AC.IL
UUCP     : ...!uunet!pucc.princeton.edu!technion!aer7101
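
The software option asked for in the post above could look roughly like the following in C. This is only a sketch of what a compiler-inserted check might expand to, not an existing xlc or xlf feature; checked_div, count_fpe, and fpe_seen are hypothetical names. The divisor is tested before the divide, and SIGFPE is raised from software, so no hardware trap support is needed:

```c
#include <signal.h>

/* Hypothetical flag: set by the handler so a caller can see that the
 * software "trap" fired. */
volatile sig_atomic_t fpe_seen = 0;

/* Hypothetical handler. A debugging build could instead leave SIGFPE
 * at its default disposition, which terminates the program. */
void count_fpe(int sig)
{
    (void)sig;
    fpe_seen = 1;
}

/* Hypothetical helper standing in for a compiler-inserted check. */
double checked_div(double a, double b)
{
    if (b == 0.0)
        raise(SIGFPE);   /* delivered before the divide is attempted */
    return a / b;        /* still performed if the handler returns */
}
```

With no handler installed, the first division by zero would kill the program at the call site, which is the behaviour wanted for debugging. The price is an extra compare and branch on every checked division, which is why this would only make sense as an optional debug mode.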
sdl@adagio.austin.ibm.com (sdl) (05/30/91)
In article <91149.150333AER7101@TECHNION.BITNET> AER7101@TECHNION.BITNET (Zvika Bar-Deroma) writes:
Zvika> I've been told by my IBM rep. that as IEEE didn't require that division
Zvika> by zero be trapped and signalled, then (some/many/most) RISC machines
Zvika> don't, and he thinks, that unless there's such a requirement from IEEE,
Zvika> they won't also in the foreseeable future.
Your rep. is correct. From IEEE 854-1987: "There are five types of
exceptions that shall be signaled when detected. The signal entails
setting a status flag, taking a trap, or possibly doing both".
Moreover, IEEE requires that if you implement IEEE trapping, that
"A trap handler should have the capabilities of a subroutine that can
return a value to be used in lieu of the exceptional operation's
results." In the case of invalid operation and divide by zero
exceptions, the signal handler must be delivered the operand values
of the operation. However, in the RISC System/6000's pipelined mode,
the floating point processor will have already destroyed the
operands by the time the branch processor can take the branch to
the trap handler. Thus, it can only generate IEEE floating point
traps in synchronous execution mode.
Usual Disclaimer: I write code, not make policy.
--
--------------------------------------------------------------------
Stephen Linam PSP Austin T/L: 793-3674 Bell-net: (512) 823-3674
IBM Internet: sdl@adagio.austin.ibm.com VNET: LINAM at AUSTIN
From outside IBM: sdl@glasnost.austin.ibm.com
csrdh@marlin.jcu.edu.au (Rowan Hughes) (05/31/91)
In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:

>In article <91149.150333AER7101@TECHNION.BITNET> AER7101@TECHNION.BITNET (Zvika Bar-Deroma) writes:
>Zvika> I've been told by my IBM rep. that as IEEE didn't require that division
>Zvika> by zero be trapped and signalled, then (some/many/most) RISC machines
>Zvika> don't, and he thinks, that unless there's such a requirement from IEEE,
>Zvika> they won't also in the foreseeable future.
>Your rep. is correct. From IEEE 854-1987: "There are five types of
>exceptions that shall be signaled when detected. The signal entails
>setting a status flag, taking a trap, or possibly doing both".
>Moreover, IEEE requires that if you implement IEEE trapping, that
>"A trap handler should have the capabilities of a subroutine that can

DEC have released their new f77v3.0 compiler for the DECStations
(=R3000) and it traps all floating exceptions; the job aborts
with a core dump. I've done some benchmarking between the new and old
compilers (with and without fpe) and the performance has dropped by
less than 1/2%. Fpe is enabled ALL the time, irrespective of
compiler options. There is an option for the job to continue after a
fault, and the user is notified at the end of execution. Integer
overflow can be trapped also.

It can be done easily on RISC, with no loss in performance.

--
Rowan Hughes                          James Cook University
Marine Modelling Unit                 Townsville, Australia.
Dept. Civil and Systems Engineering   csrdh@marlin.jcu.edu.au
beddini@uxh.cso.uiuc.edu (Robert A Beddini) (06/05/91)
I requested info on this from IBM a year ago, and voiced my opinions about
the lack of FPP traps at that time to our local IBM tech rep. He, in turn,
requested further info internally. The response at that time was along the
lines of "here's the way the FPP can be checked,... why do you need to trap?"

We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
calculations. However it is still not being used for major code development,
and its use in this capacity will be limited until IBM recognizes the need
for user conveniences (necessities) such as FPP traps as compiler
options/defaults.

While we are on the subject, (sub)program addresses and line #'s should
also be loaded into a register and a stack maintained as a compiler
option. The cost is far less than that of the simplest floating point
operation. This would enable line info from traps with traceback -
another very helpful feature which has been available on more
user-oriented workstations. This compiler option could also be used to
help gmon pin down CPU-intensive portions of a large subprogram.

It is regrettable that the subject of the need for FPP traps continues to
be discussed at length without IBM recognizing that this capability should
be a priority. Once a code under development begins to work properly, it
is an easy matter to enable all the compiler optimization options.
jsalter@ibmpa.awdpa.ibm.com (06/06/91)
In article <1991May31.071822.9206@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:
>DEC have released their new f77v3.0 compiler for the DECStations
>(=R3000) and it traps all floating exceptions; the job aborts
>with a core dump. I've done some benchmarking between the new and old
>compilers (with and without fpe) and the performance has dropped by
>less than 1/2%. Fpe is enabled ALL the time, irrespective of
>compiler options.
[...]
>It can be done easily on RISC, with no loss in performance.

True, it *can* be done on RISC architectures. That's not the point.
The point is whether it should be done on the POWER architecture the
'6000s employ. I don't believe the R3000 is super-scalar, thereby
limiting the effects of trapping. Since the '6000 is, instructions are
scheduled out-of-order by the compilers to ensure the pipeline remains
full for as long as possible.

What trapping does is *serialize* the execution, forcing the instructions
to be in-order, thus greatly limiting the performance by, as has been noted
many times, an order-of-magnitude. This is not a trivial performance hit.

The first-generation SPARC may be only 2-3 MFLOPS, but to bring the
'6000 down to 2-3 MFLOPS really hurts. I think everyone would agree
that the FP performance of the '6000 is one of its best points. To
turn trapping on all the time would be marketing suicide.

>Rowan Hughes                          James Cook University
>Marine Modelling Unit                 Townsville, Australia.
>Dept. Civil and Systems Engineering   csrdh@marlin.jcu.edu.au

jim/jsalter IBM PSP, Palo Alto T465/(415)855-4427 VNET: JSALTER at AUSVMQ
Internet: jsalter@slo.awdpa.ibm.com UUCP: ..!uunet!ibmsupt!jsalter
"IBM part #23521, aka Lt. Commander Data" The stuff above is on my own.
jsalter@ibmpa.awdpa.ibm.com (06/06/91)
In article <1991Jun5.030043.3520@ux1.cso.uiuc.edu> beddini@uxh.cso.uiuc.edu (Robert A Beddini) writes:
> I requested info on this from IBM a year ago, and voiced my opinions about
>the lack of FPP traps at that time to our local IBM tech rep. He, in turn,
>requested further info internally. The response at that time was along the
>lines of "here's the way the FPP can be checked,... why do you need to trap?"

He didn't talk to the right person. We knew before release that
IEEE-754 exception handling and trapping were necessary to truly make
it in the marketplace.

>We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
>calculations. However it is still not being used for major code development,
>and its use in this capacity will be limited until IBM recognizes the need
>for user conveniences (necessities) such as FPP traps as compiler
>options/defaults.

We've recognized this need. We recognized this before release, as noted
above, but because of time-to-market pressures, we were not able to make
this functionality available. If you'll notice, the <fptrap.h> and <FP.h>
and <fpxcp.h> header files in AIXv3 all point to the fact that we knew
exception handling and trapping was important.

>It is regrettable that the subject of the need for FPP traps continues to
>be discussed at length without IBM recognizing that this capability should
>be a priority.

This is false. We've recognized it, and are working hard to implement it.
The <fptrap.h> file proves we recognized the need a long time ago. The
implementation, of course, is not trivial. But, it should be available in
the next major AIX release.

jim/jsalter IBM PSP, Palo Alto T465/(415)855-4427 VNET: JSALTER at AUSVMQ
Internet: jsalter@slo.awdpa.ibm.com UUCP: ..!uunet!ibmsupt!jsalter
"IBM part #23521, aka Lt. Commander Data" The stuff above is on my own.
gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (06/07/91)
In article <1991Jun6.032233.10629@ibmpa.awdpa.ibm.com> jsalter@slo.awdpa.ibm.com (Jim Salter) writes:

>What trapping does is *serialize* the execution, forcing the instructions
>to be in-order, thus greatly limiting the performance by, as has been noted
>many times, an order-of-magnitude. This is not a trivial performance hit.

This is known as a design flaw. If it was too expensive to give a trap
on the "correct" instruction, then give me a trap that happens
sometime after the fault happens. That way I can run full-speed, and
have the code crash *near* the offending instruction. Or I could run
slower and be exact. Crays have done this for years.

If you can't give the best environment, second-best is much better
than nothing.
rcd@ico.isc.com (Dick Dunn) (06/07/91)
jsalter@ibmpa.awdpa.ibm.com responds to beddini@uxh.cso.uiuc.edu
(Robert A Beddini)...several exchanges with about the same tone,
here's one:

Beddini:
> >We eventually purchased a 530 and are pleased with its speed on *PRODUCTION*
> >calculations. However it is still not being used for major code development,
> >and its use in this capacity will be limited until IBM recognizes the need
> >for user conveniences (necessities) such as FPP traps as compiler
> >options/defaults.

Salter:
> We've recognized this need. We recognized this before release, as noted
> above, but because of time-to-market pressures, ...[etc]...
[cites evidence in header files that they know there's a problem]
> The <fptrap.h> file proves we recognized the need a long time ago. The
> implementation, of course, is not trivial...

Looks to me like the issue in dispute is what it means to "recognize" a
problem. Perhaps what Beddini is really saying is that he wants the
problem "fixed", rather than just "recognized"?

I can appreciate that the problem might be difficult to fix after the
fact...but it *is* a fairly fundamental need. Seems odd that it would be
something that must be added on a year or so later, as opposed to being
designed in.
--
Dick Dunn rcd@ico.isc.com -or- ico!rcd Boulder, CO (303)449-2870
...Simpler is better.
Zvika Bar-Deroma <AER7101@TECHNION.BITNET> (06/07/91)
In article <1991Jun6.033716.10920@ibmpa.awdpa.ibm.com>, jsalter@ibmpa.awdpa.ibm.com says:
>
>This is false. We've recognized it, and are working hard to implement it.
>The <fptrap.h> file proves we recognized the need a long time ago. The
>implementation, of course, is not trivial. But, it should be available in
>the next major AIX release.

One of the next major releases ???
It is crucial, and considered by people who purchased/want to buy it as a
number cruncher (and that's what it is, right?) as the single biggest
thing they are missing, at least at the s/w level (as an option to xlf)!

>
>jim/jsalter IBM PSP, Palo Alto T465/(415)855-4427 VNET: JSALTER at AUSVMQ
>Internet: jsalter@slo.awdpa.ibm.com UUCP: ..!uunet!ibmsupt!jsalter
>"IBM part #23521, aka Lt. Commander Data" The stuff above is on my own.

/Zvika
csrdh@marlin.jcu.edu.au (Rowan Hughes) (06/08/91)
In <1991Jun6.032233.10629@ibmpa.awdpa.ibm.com> jsalter@ibmpa.awdpa.ibm.com writes:

>In article <1991May31.071822.9206@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>>In <SDL.91May29181249@adagio.austin.ibm.com> sdl@adagio.austin.ibm.com (sdl) writes:
>>DEC have released their new f77v3.0 compiler for the DECStations
>>(=R3000) and it traps all floating exceptions; the job aborts
>>with a core dump. I've done some benchmarking between the new and old
>>compilers (with and without fpe) and the performance has dropped by
>>less than 1/2%. Fpe is enabled ALL the time, irrespective of
>>compiler options.
[...]
>True, it *can* be done on RISC architectures. That's not the point.
>The point is whether it should be done on the POWER architecture the
>'6000s employ. I don't believe the R3000 is super-scalar, thereby
>limiting the effects of trapping. Since the '6000 is, instructions are
>scheduled out-of-order by the compilers to ensure the pipeline remains
>full for as long as possible.

Correct, the R3000 is not super-scalar. There are still a few points
missing. If fpe were used, would the memory fetches HAVE to be serial,
or would the fpe trap be uncertain as to the location of the error in
super-scalar mode? If it's the latter case then fpe should be on all
the time. If it's the former case then fpe should be allowable in
debug mode, with the resulting drop in performance.

I consider fpe traps to be essential in all my work. It should be a
compiler option, even if the performance drop is significant. I'm
getting an IBM540 in a few days, so I hope you IBM'ers are working on
this right now !!! The DECheads have seen the light.

--
Rowan Hughes                          James Cook University
Marine Modelling Unit                 Townsville, Australia.
Dept. Civil and Systems Engineering   csrdh@marlin.jcu.edu.au