[comp.lang.fortran] Sun Fortran v1.4 bugs

john@spectre.unm.edu (John Prentice) (05/29/91)

I am posting this for someone else.  Please respond directly to him.
-----------

OK folks, guess it's time to start enumerating the bugs in the new
release of SunFortran [ SunFortran v1.4 ].

From what I've seen so far the compiler is much better than 1.3.1 and
1.4beta.

That said, my first test was to compile the SLATEC Math library, and
there were several failures.  [ for those of you not familiar with
SLATEC, it currently stands at about 270000 lines of Fortran, and contains
a set of Quick Checks that amount to another 44000 lines of code.
It is used on everything from PCs to CRAYs ]

From what I've seen so far, there still seem to be optimizer problems,
so if you want the right answer (as opposed to the wrong answer
but much faster) you should NOT turn on the optimizer.
Pity, since some of the QC were speeded up anywhere from 2x to 5x by
saying -O or -fast.

So, to date I have seen THREE failures.  Two are related to COMPLEX ARITHMETIC,
and the third is not.  The two bugs involving COMPLEX cause the codes
to blow off the machine [ at least you know you have a problem ], the
other one only occurs with -fast and just gives the wrong answer...
I have not investigated it in detail yet, but it could be just due to the
(documented) fact that -fast handles underflow differently.

Examples of the two COMPLEX ARITHMETIC bugs follow.  The first will occur
when the code is compiled with either -O or -fast.  The other one only
with -fast.

*** BUG 1 ***

The following piece of code demos a problem with AIMAG.  In the actual
code AIMAG was not on a WRITE statement, but putting it there makes things
easier to demo.  Note that the code `does the right thing' for -g and
[no optimization] but dies for -O or -fast.  The problem seems related to
the fact that `C' is dimensioned, since if you un-dimension C (four places)
this test will run correctly for all optimization levels.

Environment: SS2, SUNOS4.1.1, SunFortran v1.4

--------------------------------------------
      program test
      complex c(1)
c
c
      open (unit=6, file='output', form='formatted')
c
      c(1) = (-2.0, -1.0)
      write (6, '(2e20.6)') c
c
      write (6, '(2e20.6)') real(c(1)), aimag(c(1))
      stop
      end
--------------------------------------------

clemens% f77 -g test.f
test.f:
 MAIN test:
clemens% a.out
clemens% cat output
       -0.200000E+01       -0.100000E+01
       -0.200000E+01       -0.100000E+01
--------------------------------------------
clemens% f77 test.f
test.f:
 MAIN test:
clemens% a.out
clemens% cat output
       -0.200000E+01       -0.100000E+01
       -0.200000E+01       -0.100000E+01
--------------------------------------------
clemens% f77 -O test.f
test.f:
 MAIN test:
clemens% a.out
*** Segmentation Violation = signal 11 code 3 
Traceback has been recorded in file:
         /home/reg/FORTRAN/./a.out.trace 
Note: Line numbers for system and library calls may be incorrect 
Abort (core dumped)
clemens% cat output
       -0.200000E+01       -0.100000E+01
       -0.200000E+01clemens% 
		    ^^^^^^^^^^^^^^^^^^^^ note missing output
--------------------------------------------
clemens% f77 -fast test.f
test.f:
 MAIN test:
clemens% a.out
*** Segmentation Violation = signal 11 code 3 
 Note: this program was linked with -fast or -fnonstd 
 and so may have produced nonstandard floating-point results. 
 Sun's implementation of IEEE arithmetic is discussed in 
 the Numerical Computation Guide.
Traceback has been recorded in file:
         /home/reg/FORTRAN/./a.out.trace 
Note: Line numbers for system and library calls may be incorrect 
Abort (core dumped)
clemens% cat output
       -0.200000E+01       -0.100000E+01
       -0.200000E+01clemens% 
		    ^^^^^^^^^^^^^^^^^^^^ note missing output
--------------------------------------------
cat a.out.trace
Note: Line numbers for system and library calls may be incorrect 
Begin traceback...
Called from [func: (null)], at 0xf76b2c4c, args=0xb 0x3 0xf7793a44 0xf78022a8
Called from [func: _MAIN_], at 0x2370, args=0xc0000000 0xf7fffc4c 0x6 0xf779ece4
Called from [func: (null)], at 0xf7740c38, args=0x0 0x2ab0c 0x1 0x2e000000
Called from [func: start], at 0x2064, args=0x0 0x10 0xf7fffd9c 0x28000
End traceback...
Note: Line numbers for system and library calls may be incorrect 
Begin traceback...
Called from [func: (null)], at 0xf76b2c4c, args=0xb 0x3 0xf7793a44 0xf78022a8
Called from [func: _MAIN_], at 0x23f4, args=0xc0000000 0xf7fffc4c 0x6 0xf779ece4
Called from [func: (null)], at 0xf7740c38, args=0x0 0x28a04 0x1 0x2e000000
Called from [func: start], at 0x2064, args=0x0 0x10 0xf7fffd9c 0x26000
End traceback...
--------------------------------------------
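
For reference, the un-dimensioned variant described above would look
something like the following sketch.  It is reconstructed from the
description (C made a scalar in all four places), not taken from an
actual test run, so treat it as illustrative only.

```fortran
      program test
c     Scalar variant of the Bug 1 test: c is no longer dimensioned.
c     Reconstructed from the description; not from the original run.
      complex c
c
      open (unit=6, file='output', form='formatted')
c
      c = (-2.0, -1.0)
      write (6, '(2e20.6)') c
c
      write (6, '(2e20.6)') real(c), aimag(c)
      stop
      end
```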

*** BUG 2 ***

Here is another bug in COMPLEX ARITHMETIC that appears to involve doing
a logical comparison to a pure imaginary number, and only occurs when the
compile is done with -fast.

Consider the following test problem

-----------------------------

      program test
      complex a, czero

      data a / (0., 1.e-6) /
      data czero / (0., 0.) /
c
c
      if (a.ne.czero) goto 100
      stop
  100 stop 100
      end

-----------------------------

This program will run to completion if the code is NOT compiled with -fast.
If compiled with -fast, the code blows off the machine with

*** Bus Error = signal 10 code 2

If you change the data statement for `a' to read:-

      data a / (1., 1.e-6) /

so that `a' is no longer PURE imaginary, it no longer dies... Don't know
if this is the whole story, but it's a start.

Environment: SS2, SUNOS4.1.1, SunFortran v1.4
--

				Reg.Clemens
				clemens@afwl.af.mil

khb@chiba.Eng.Sun.COM (Keith Bierman fpgroup) (05/30/91)

Thanks for bringing this to our attention, John. We are contacting the
original author directly, but one assumes that the rest of the readers
have some interest also.

>   That said, my first test was to compile the SLATEC Math library, and
>

We do not have this code in house. We do employ NAG, IMSL and a wide
variety of other codes in our validation suite (many of which cannot
be discussed as they belong to folks who wish to remain anonymous). As
is mentioned from time to time, we are always interested in acquiring
more codes which have _good_test_suites_. Please contact me directly if
you have code you'd like to submit. 

>   (documented) fact that -fast handles underflow differently.

The way -fast handles it, while non-IEEE, was strongly requested by a
large number of customers (and they've been able to get it with calls
to abrupt_underflow() for quite some time; see the Numerical
Computation Guide for details).

   *** BUG 1 ***

   The following piece of code demos a problem with AIMAG.  In the  actual

This is an optimizer bug (which will appear at opt levels -O2 through
-O4).  It has been fixed in the next release.  Here is what is going on:

It occurs when the function aimag is called with an array element as
the parameter (an erroneous dereference causes a segmentation
violation).  A workaround is to put the array element in a scalar and
then make the call.

 For example:

      program test
      complex c(1), a
c
c
      open (unit=6, file='output', form='formatted')
c
      c(1) = (-2.0, -1.0)
      write (6, '(2e20.6)') c
c
      a = c(1)
      write (6, '(2e20.6)') real(c(1)), aimag(a)
      stop
      end


This appears to be bug 1058033, which was just reported a few days
ago. Aside from changing the code, you may use the following compiler
switch to disable the offending bit of the optimizer

       -Qoption iropt -On,complex 

Put this after -fast -O4, and whathaveyou. This disables a bunch of
complex arithmetic transformations, which are quite effective. So, my
personal preference is to:

	a) change the code as above
	b) compile just the afflicted modules this way (make sure
	   they reside in a file by themselves)
	c) use the option mentioned above
	d) give up on optimization

(d) means tossing away factors of 2 in performance, so I am personally
loath to choose it.
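
To illustrate (b) together with (c), here is a minimal sketch of how an
afflicted routine might be isolated in its own file so that only that
file is built with the switch quoted above.  The file names cparts.f and
main.f and the routine name cparts are hypothetical; the flags are the
ones given in this post, and -c is assumed to act as the usual
compile-only option.

```fortran
c     File cparts.f (hypothetical name): only the routine that trips
c     the optimizer lives here, so the rest of the program keeps the
c     full set of complex arithmetic transformations.
c
c     Possible build, using the switch quoted in this post:
c       f77 -c -fast -O4 -Qoption iropt -On,complex cparts.f
c       f77 -fast -O4 main.f cparts.o
c
      subroutine cparts (c, n, re, im)
c     Split a complex array into its real and imaginary parts.
      integer n, i
      complex c(n)
      real re(n), im(n)
      do 10 i = 1, n
         re(i) = real(c(i))
         im(i) = aimag(c(i))
   10 continue
      return
      end
```

The point of keeping the routine in a file by itself is that the
optimizer switch applies per compilation, so everything else can still
be compiled with the faster settings.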


   *** BUG 2 ***

   Here is another bug in COMPLEX ARITHMETIC that appears to involve doing
   a logical comparison to a pure imaginary number, and only occurs when the
   compile is done with -fast.

This is a bug in the libm.il file. The quick fix is to compile with
-nolibmil (-fast implies libmil). A slightly harder workaround is to
edit your libmil file, deleting the offending function

	!int
	!_Fc_ne(x, y)
	...

Alternatively, wait a bit and contact your Answer Center and they
should be able to provide you with a proper fix. The bug id is
1060916; it is handy to have if you'd like to follow up with them.
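
A possible source-level alternative, offered only as an assumption and
not verified against SunFortran v1.4: comparing the real and imaginary
parts separately avoids the complex .ne. operator, and so presumably
never reaches the inlined _Fc_ne template.  A sketch based on the test
program from the report:

```fortran
      program test
      complex a, czero
c
      data a / (0., 1.e-6) /
      data czero / (0., 0.) /
c
c     Assumption: comparing the parts as REALs sidesteps the complex
c     .ne. operator.  Not verified against SunFortran v1.4.
      if (real(a).ne.real(czero) .or.
     +    aimag(a).ne.aimag(czero)) goto 100
      stop
  100 stop 100
      end
```

This is logically equivalent to a.ne.czero, since two complex values
differ exactly when at least one of their parts differs.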


>				   Reg.Clemens
>				   clemens@afwl.af.mil

I trust that the author will either track down the source of the
numerical problem, or will chat with me directly.
----

Thank you for bringing these problems to our attention. Feel free to
communicate directly with us (via the AC or email).

--
----------------------------------------------------------------
Keith H. Bierman    keith.bierman@Sun.COM| khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043