[comp.sys.sgi] strange floating point exception interrupt behavior

sdempsey@UCSD.EDU (Steve Dempsey) (07/26/90)

The following discussion pertains to a 4D/25TG running 3.2.1 and a
4D/340VGX running 3.3.

Recently I have been doing a performance analysis of a number cruncher
program that runs much more slowly on IRISes than one would expect.
I fired up gr_osview and ran the program, expecting to see lots of
system calls or swapping, and an indication of where the cpu time was being
wasted.  What I saw was something quite strange!  The cpu was spending
99% of its time in user mode, just like any decent number cruncher should.
The shock came from the interrupt rate, which went from a background level
of 200-400 per second up to ~20K per second (35K on the 340VGX!)

Ultimately, I discovered that the extra interrupts were occurring whenever
floating point operations resulted in underflow.  This behavior can be
demonstrated by compiling and running this code:

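    #include <math.h>   /* MINDOUBLE (some systems define it in <values.h>) */

    int main()
    {
        double x, y, z;
        int i;

        y = MINDOUBLE;
        z = 0.5;
        i = 10000000;
        while (i--)
            x = y * z;  /* every one of these multiplies underflows */
        return 0;
    }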

Both C and Fortran versions of this code produce the same results.
I tried similar tests, forcing overflows and divide-by-zero, but no extra
interrupts were found for these floating exceptions.

Can anybody explain what's so special about underflows, and why I get
interrupts even though the floating point exception interrupts are not enabled?
--------------------------------------------------------------------------------
Steve Dempsey						       (619) 534-0208
Dept. of Chemistry Computer Facility, 0314	   INTERNET:   sdempsey@ucsd.edu
University of California, San Diego		     BITNET:   sdempsey@ucsd
La Jolla, CA 92093-0314				       UUCP:   ucsd!sdempsey

bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (07/27/90)

In article <9007260022.AA03729@chem.chem.ucsd.edu>, sdempsey@UCSD.EDU (Steve Dempsey) writes:
> The shock came from the interrupt rate, which went from a background level
> of 200-400 per second up to ~20K per second (35K on the 340VGX!)
> 
> Ultimately, I discovered that the extra interrupts were occurring whenever
> floating point operations resulted in underflow.  This behavior can be
> demonstrated by compiling and running this code:
[deleted]

The MIPS R3010 floating point hardware does not handle the "exceptional"
conditions of IEEE floating point, including underflow.  Whenever an f.p.
operation would result in underflow, the chip generates an interrupt, and
the f.p. operation is done in software, correctly dealing with all the
obscure conventions of IEEE arithmetic.  This is one of the reasons that
the chip is (normally) so fast: all the silicon that would otherwise be
devoted to this stuff is removed and is instead invested in making the
normal case go faster.

Of course, this is also the reason why it is so slow in your particular
case.  The reason why underflow is particularly bad is that once you
get an IEEE denorm, subsequent operations on that denorm will also cause
interrupts, and so will operations on any denorms those produce.
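
To make the propagation concrete, here is a minimal sketch in standard C.
It assumes C99's fpclassify() and <float.h>, which postdate this thread, and
the helper name count_denorm_steps is made up for illustration.  It halves a
value repeatedly and counts how many results are denormal; on an R3010 each
of those would be another trip through the software path.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: halve `start` n times and count how many of the
 * results are IEEE denormals (FP_SUBNORMAL in C99 terms). */
int count_denorm_steps(double start, int n)
{
    volatile double v = start;  /* volatile: keep the loop from being folded away */
    int count = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        double r = v * 0.5;     /* result may be a denorm */
        if (fpclassify(r) == FP_SUBNORMAL)
            count++;
        v = r;                  /* the denorm feeds the next operation */
    }
    return count;
}
```

Starting from DBL_MIN (the smallest normal double), count_denorm_steps(DBL_MIN, 60)
returns 52 with default IEEE double arithmetic: every halving yields another
denorm until the value finally rounds to zero.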

You say that you have 3.3; if you are not too worried about exact IEEE
semantics for your f.p. operations, then you can use the "sigfpe(3C)"
package (or "fsigfpe(3F)" for the Fortran interface).  This allows you to
specify what you want done when these sorts of exceptions occur.  The
fast, simple approach is this: when an underflow (_UNDERFL) exception
occurs, use zero as the result value instead of computing the correct
denorm value (note that non-IEEE machines typically do just that).
You will still take an interrupt when a denorm value is *first* generated,
but by replacing it with zero, you prevent that denorm interrupt from
propagating into subsequent calculations.  This normally gets rid of
the vast majority of these interrupts.
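
The effect of that replacement can be sketched in portable C.  This is not
the sigfpe(3C) interface itself (see the man page for the real calls);
flush_denorm and count_flushed are hypothetical names, and fpclassify() is
C99.  The sketch applies the zero substitution by hand after each operation,
so a denorm never survives to feed the next one.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: return zero in place of a denormal value,
 * imitating a "replace underflow with zero" policy in software. */
double flush_denorm(double v)
{
    return (fpclassify(v) == FP_SUBNORMAL) ? 0.0 : v;
}

/* Halve `start` n times, flushing after each step; return how many
 * results were denormal before being flushed. */
int count_flushed(double start, int n)
{
    volatile double v = start;  /* volatile: keep the loop from being folded away */
    int flushed = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        double r = v * 0.5;
        if (fpclassify(r) == FP_SUBNORMAL)
            flushed++;          /* the one case that would still trap */
        v = flush_denorm(r);    /* zero, not a denorm, goes into the next step */
    }
    return flushed;
}
```

Starting from DBL_MIN, count_flushed(DBL_MIN, 60) returns 1: only the first
halving produces a denorm, and every later step operates on an ordinary zero.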

Sadly, if you *really* need the exact correct IEEE denormalized values,
you are stuck.  As I said, the R3010 does not have hardware support for
denorms, and so operations on denorms must be done in software.

--
Bron Campbell Nelson
bron@sgi.com  or possibly  ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.