[comp.sys.apollo] ftn 10.8.p serious problems / tirade

system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (04/04/91)

We have found several serious problems in the recently released ftn
10.8.p (for DN10000's), especially when optimization is turned on.
The most serious one involves incorrect optimization of simple loops
with GOTO, IF and DO involved, which may in fact be an example of the
general bug described in the Release Notes (Section 4.1.3), if only I could
actually understand what that example means. The Notes say that this bug
affects only M680x0 systems, yet the problem programs actually work
properly on the m68k nodes, so this is probably something different.
Compiling with '-g' fixes this problem, but with slower running code,
provided you know that the program messed up in the first place.

We are also getting "backend failures" of the compiler itself on
simple routines that the ftn 10.7/patch compiler compiled with '-O'
without problem (and that compiler has a lot of "backend failure"
problems itself, and also seems incapable of correctly compiling
most routines using complex variables/arrays).
Compiling with '-g' fixes these problems, but with slower running code.

Our ab initio chemistry program, derived from Gaussian 8x, also
produces incorrect (but close) results - I haven't had the nerve to
recompile G88 itself. The "incorrect but close" errors are the most
disturbing, since without an external check, you would believe the
calculation was correct, which could easily lead to publication of
incorrect results in the scientific literature, and resulting
embarrassment when the paper must be retracted (and believe me the cause
of the retraction would be made VERY CLEAR). In this case, compiling
with '-g' corrects some of the wrong results BUT NOT ALL, which is also
very disturbing.
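
An external check of the kind mentioned above can be as simple as comparing
the numbers from a build under test against numbers from a build you trust.
What follows is a minimal sketch in Python, not the check actually used here:
the function name compare_results, the tolerances, and the sample values are
all hypothetical and would have to be chosen for the real calculation.

```python
import math


def compare_results(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return (index, reference, candidate) for every pair that disagrees.

    Two values agree when math.isclose() accepts them under the given
    relative and absolute tolerances.
    """
    if len(reference) != len(candidate):
        raise ValueError("result sets differ in length")
    mismatches = []
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        if not math.isclose(ref, cand, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((i, ref, cand))
    return mismatches


# Hypothetical values: one list from a trusted build, one from the
# build under test. The second entry is "incorrect but close".
trusted = [-76.0107465, -1.1266827, 0.0]
suspect = [-76.0107465, -1.1266791, 0.0]

for index, ref, cand in compare_results(trusted, suspect,
                                        rel_tol=1e-8, abs_tol=1e-12):
    print(f"value {index}: trusted {ref!r}, under test {cand!r}")
```

A check like this only flags disagreement; deciding which build is right
still needs an independent reference, since both compilers may be wrong.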

Since the compiler cannot produce useful, trustworthy code (or at least
code as trustworthy as that from ftn 10.7/patch, which is unfortunately
not much of a high-water mark itself), we have abandoned the ftn 10.8
compilers and have reverted to the 10.7/patch compiler, whose bugs (we
think) we know.

As an aside, a couple of questions:

1) why was a compiler released with a bug like Section 4.1.3
of the Notes? The code will fail to work properly at execution time,
but may not screw up enough for the program to abort - bugs like
"backend failures" are much more benign since no code is produced
and the user gets a specific notice that something is wrong.

2) where the **** are the DN10000 compiler beta testers? We are finding
problems within 24 hours of installing the compilers, but that seems to
be the story of our life with HP/Apollo :-(.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775

hanche@imf.unit.no (Harald Hanche-Olsen) (04/06/91)

In article <1991Apr3.230735.9578@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:

   2) where the **** are the DN10000 compiler beta testers?

Here is one.  I don't think it's fair to blame the state of the
compilers on the beta testers.  Their role is not to perform extensive
and comprehensive testing of the system, but to discover some of the
bugs that "typical" users find, as opposed to those found by the kind
of systematic testing that HPollo hopefully does in-house.  There is
no way a small handful of beta testers can try everything!  We
reported 25 bugs during the beta test: 11 f77 bugs and 14 cc bugs (we
use cc more than f77).  Maybe that indicates that the product was
really not ready for beta testing when we got it.  Still, my feeling
at the end of the beta test period was that the compiler seemed much
less buggy than the 10.7 version, at least for our purposes.

   We are finding
   problems within 24 hours of installing the compilers, but that seems to
   be the story of our life with HP/Apollo :-(.

Not an unknown feeling.  This is one reason why we volunteered to help
with beta testing.  Maybe you should try to volunteer yourself for
beta testing the next release?

- Harald Hanche-Olsen <hanche@imf.unit.no>
  Division of Mathematical Sciences
  The Norwegian Institute of Technology
  N-7034 Trondheim, NORWAY

system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (04/06/91)

In article <HANCHE.91Apr5200543@hufsa.imf.unit.no> hanche@imf.unit.no (Harald Hanche-Olsen) writes:
>In article <1991Apr3.230735.9578@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>   2) where the **** are the DN10000 compiler beta testers?
>
>Here is one.  I don't think it's fair to blame the state of the
>compilers on the beta testers.  Their role is not to perform extensive
>and comprehensive testing of the system ...

I am not necessarily blaming beta sites, but I would have expected that
beta sites would exercise a product heavily - in your case, using C more
than f77 means that you don't abuse the ftn compiler as much, which may be
giving HPollo a false impression of the robustness of the compiler.
Also, do you compile everything with '-O' or even '-W0,-opt,4' ('-O4')?
Since FORTRAN is usually used for number crunching, having programs/libraries
that are compiled without at least '-O' is rather pointless.
We have 3 compiler-breaker packages that I always try to compile with
a new compiler, including Gaussian 88 (which HPollo has in-house), and the
NCAR library (which HPollo could easily get).

>   We are finding
>   problems within 24 hours of installing the compilers, but that seems to
>   be the story of our life with HP/Apollo :-(.
>
>Not an unknown feeling.  This is one reason why we volunteered to help
>with beta testing.  Maybe you should try to volunteer yourself for
>beta testing the next release?

We actually had one version of the 10.8 ftn beta compiler which we
needed to compile Gaussian 88. I did try it on other packages,
but had to give up after finding 2 bad optimizer bugs, 1 of which is
remarkably similar to the major problem I described. We have volunteered
to beta test things like ftn, but there has been no response; I think
HPollo would rather that a large heavy object fell on our site :-).
I cannot beta test SR10.x.p (although we end up doing that anyway)
since that machine is our central system supporting almost all our users.

Mike.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775

crh@APOLLO.ENG.OHIO-STATE.EDU (Charlotte Hawley) (04/08/91)

In reply to Harald Hanche-Olsen's comment:

<  I don't think it's fair to blame the state of the
<  compilers on the beta testers.

I agree completely.  The state of the compiler is not their
fault, it's just their name that is incorrect.  They are
the "alpha" testers.  We, the consumers, are the "beta"
testers.