[comp.sys.apollo] Fortran 9.95 problem

derstad@CIM-VAX.HONEYWELL.COM ("DAVE ERSTAD") (11/24/88)

Here's one for the die-hard Fortran users out there.  Surely, someone
has run into this.

We're running a proprietary version of the UC Berkely SPICE program, 
a large Fortran program for analysis of integrated circuits.  The 
part we're having problems with is part of the public Domain code,
not our extensions, and has functioned properly on many previous 
releases of Aegis, the fortran compiler, etc., not to mention
virtually every other computer in the world.

With the 9.95 version of the compiler, we have problems.  Essentially,
there is a (actually many) segments of the code which look like

    A = something
    B = DSQRT(A*A + C*C)
    D = A + B - C
    IF D = 0 jump over some code

What is happening is that A will many times evaluate to identically
0.  B is then set to the same value as C without any problem.  However,
due to the way the optimized code works, the D evaluation is performed
with B in an 80 bit floating point register, but C comes from a 64
bit memory location.  (all variables are REAL*8).  Thus, D becomes 
non-zero.  Later, it gets used in an expression like
    E = 1.0D0 + D*something
    F = 1.0 / (1 - E)

which of course aborts with a divide by 0 error.

Now, I don't necessarily like the way this is coded, but it has worked
successfully at thousands of different sites on probably dozens of
different computer systems.  It devi
It definitely works if, with a given platform, DSQRT(C*C) is identically
C.  I think (but have not gone through and verified) that if it is
evaluated with a fixed precision throughout, whether it be 64 bits
or 80 bits, that it works as well.  Even if DSQRT(C*C) - C has 
round-off error, E becomes non-one and the code works.  What 
is causing the problem here is that DSQRT(C*C) - C is determined
to be non-zero based on 80 bit arithmetic, while the later
evaluation of E uses a 64 bit representation of DSQRT(C*C) - C
which rounds to zero.

I should mention that when I say the code "works" when E is non-one,
it may still yield screwy results.  However, the SPICE algorithms 
are capable of correcting for those screwy results.  Essentially,
SPICE makes some pretty strange guesses as to how the circuit is
operating, but is able to recover from them.

Anyone run into anything similar?  Any help would be appreciated.  So
far, the only definite workaround we know is to shut off cpu optimization;
however, then we lose much of the advantage of the hardware floating 
point support.

Dave Erstad
Honeywell SSED

ok@quintus.uucp (Richard A. O'Keefe) (11/24/88)

In article <8811231951.AA29470@umix.cc.umich.edu> derstad@CIM-VAX.HONEYWELL.COM ("DAVE ERSTAD") writes:
>    A = something
>    B = DSQRT(A*A + C*C)
>    D = A + B - C
>    IF D = 0 jump over some code

Just a word in defence of Apollo:  I ran into the same problem with a
PR1ME 400 years ago.  (Single-precision calculations were done in a
double-precision register, then truncated, except optimisation skipped
the truncation.)  A similar problem also exists on Suns, which is why
they have the "-fstore" option.  I think Apollo have a similar option.

That isn't a terribly brilliant way of computing B, but that's another story.

derstad@CIM-VAX.HONEYWELL.COM ("DAVE ERSTAD") (11/28/88)

> Yes, I think the option is -frnd

The -frnd option affects a different, but related problem.  frnd forces
rounding to occur before comparing two floating pointer numbers (Otherwise,
a 64 bit quantity might be compared to an 80 bit quantity).

In this particular instance, a 64 bit quantity is being subtracted from
an 80 bit quantity, then compared to zero.  I actually did try the -frnd
switch to see if there was any difference, and there wasn't.  When the
comparcomparison to zero occurs, there 
comparison to zero occurs, the quantity does get rounded, but it is still
non-zero (since the mantissa bits getting chopped off are in a different
place than if you round the operands).

Put another way,
   round(a - b) != round(a) - round(b)

I admit I don't like the way spice implements these conditional branches
either, but on the other hand how many sites run spice or a variant without
any problems?

krowitz@RICHTER.MIT.EDU (David Krowitz) (11/28/88)

I believe that there is a new switch to the FTN compiler (-frnd, I think)
which is supposed to force all variables to 64-bits before comparisons
are made. I think this may solve your problem.


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)