[comp.lang.fortran] CFT/CFT77 gotcha

dd@beta.lanl.gov (Dan Davison) (01/21/89)

I came across an interesting non-bug in CFT77 today.  A program
fragment:
	a = 6.0
	b = 3.0
	i = a / b

gave the result "2" in CFT on an X-MP/48 and "1" in CFT77 on the
same machine.

Apparently CFT did the equivalent of a NINT() after a force-to-
integer; CFT77 doesn't.  One consultant here said it had to do
with the loss of precision in the last bit, which makes sense.
Another consultant said the moral was don't use mixed-mode
arithmetic if you want accurate results.

At least I didn't write the code!  Now, is this a bug or a feature?

A look through the back of the CFT77 manual (Appendix B) indicates
that I'm going to become very familiar with NINT() and IDNINT()...

hirchert@uxe.cso.uiuc.edu (01/23/89)

dd@beta.lanl.gov writes
>I came across an interesting non-bug in CFT77 today.  A program
>fragment:
>	a = 6.0
>	b = 3.0
>	i = a / b
>
>gave the result "2" in CFT on an X-MP/48 and "1" in CFT77 on the
>same machine.
>...
>At least I didn't write the code!  Now, is this a bug or a feature?

The key property here is that CRAY computers have no divide instruction, only
a way of computing reciprocals, so a/b is computed as a*(1.0/b).  The method of
computing reciprocals is such that if it is not exact, it is smaller than the
true reciprocal.  Thus
        1.0/3.0 = .333...33
        6.0*(1.0/3.0)=1.99...98
With either compiler, if you had written
      c=a/b
      i=c
the result would have been 1 because REAL to INTEGER assignment is defined as
truncating the fractional part of the number, no matter how close it is to the
next number.
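
A minimal sketch of the effect described above, written in Python rather than
Fortran because IEEE hardware computes 6.0/3.0 exactly and will not reproduce
it. The 14-digit decimal arithmetic that chops every result is an assumption
chosen to mirror the ".333...33" illustration; the real hardware is binary and
its reciprocal algorithm is not modeled here. The names CHOP and recip_divide
are hypothetical.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Toy arithmetic (assumption): 14 significant decimal digits, and every
# inexact result is chopped, never rounded up.
CHOP = Context(prec=14, rounding=ROUND_DOWN)

def recip_divide(a, b):
    """Compute a/b the reciprocal way: a * (1/b), chopping after each step."""
    r = CHOP.divide(Decimal(1), Decimal(b))   # 1/b, slightly low if inexact
    return CHOP.multiply(Decimal(a), r)

q = recip_divide("6.0", "3.0")
print(q < 2)    # True: the quotient lands just below 2
print(int(q))   # 1: conversion to integer drops the fractional part
```

When the reciprocal is exact in this model (for example 8.0/2.0), the quotient
is exact too, which is why only some divisions show the problem.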

In the original CFT compiler, they recognize the special case of a REAL
division being assigned to an INTEGER and apply not NINT(), but "strong
rounding".  This consists of multiplying the result by 1.00...02 and then
truncating.  This extra fudge factor compensates for the extent to which the
computed reciprocal might be lower than a true reciprocal and gives the
expected answer in most cases.  In developing CFT77, CRAY recognized that this
extra multiply slowed down all divides in this kind of situation, not just
those that were expected to be exact, and that it could be confusing to get
different results depending on whether or not you assigned the intermediate
REAL quotient to a variable, so they chose not to do "strong rounding" in this
case and instead issue a warning message.  (Did you ignore this warning message,
dd?)
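
The "strong rounding" step can be sketched in the same kind of toy model. This
is an illustration under assumptions, not CFT's actual code sequence: the
arithmetic is 14-digit chopped decimal, and the fudge factor 1.0000000000002
is a stand-in for the "1.00...02" above, whose real value is not given here.
CHOP, FUDGE, recip_divide and strong_round are hypothetical names.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Toy arithmetic (assumption): 14 significant digits, results chopped.
CHOP = Context(prec=14, rounding=ROUND_DOWN)
# Stand-in for the "1.00...02" factor; the exact value is an assumption.
FUDGE = Decimal("1.0000000000002")

def recip_divide(a, b):
    """a/b computed as a * (1/b), chopping after each step."""
    return CHOP.multiply(Decimal(a), CHOP.divide(Decimal(1), Decimal(b)))

def strong_round(q):
    """Nudge the quotient upward by the fudge factor, then truncate."""
    return int(CHOP.multiply(q, FUDGE))

c = recip_divide("6.0", "3.0")                  # c = a/b
i = int(c)                                      # i = c: plain truncation
j = strong_round(recip_divide("6.0", "3.0"))    # i = a/b with strong rounding
print(i, j)   # 1 2
```

The two results differ only because the quotient passed through a REAL
variable in one case, which is the confusion CFT77 was trying to avoid.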

Incidentally, CRAY hardware also has no INTEGER divide instruction.  In this
case, both compilers convert to floating point, do the reciprocal approximation
and multiply, and then apply "strong rounding" before converting back to
INTEGER, since INTEGER arithmetic is expected to be exact.

Bug or feature?  I suppose you would have to say "feature", but really it is
an example of the fact that you should not make assumptions about the properties
of floating point arithmetic.

Kurt W. Hirchert     hirchert@ncsa.uiuc.edu
National Center for Supercomputing Applications

jot@victory.cray.com (Otto Tennant) (01/23/89)

In article <23252@beta.lanl.gov> dd@beta.lanl.gov (Dan Davison) writes:
>
>I came across an interesting non-bug in CFT77 today.  A program
>fragment:
>	a = 6.0
>	b = 3.0
>	i = a / b
>
>gave the result "2" in CFT on an X-MP/48 and "1" in CFT77 on the
>same machine.
>

CFT77 should have produced a warning message for the above fragment.

CFT uses a method called "strong rounding" to compensate for 
integer truncation in instances such as the above.  However, the
code is generated only when the compiler is certain that a real
quotient will be used in an integer context.  Thus, quoting from
an internal paper by Tim Peters, in the fragment

	Z = X/Y
	I = Z
	J = X/Y

I will not necessarily be equal to J.  

Strong rounding doesn't work all of the time.  In the paper, it is
noted that "I = 100 - 52./2." produces "73" with strong rounding.
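
One plausible reading of how that "73" arises, sketched in a toy model. The
mechanism is an assumption, since the paper's explanation is not quoted here:
if the fudge factor is applied to the quotient 52./2. rather than to the final
value, the quotient becomes slightly more than 26, the difference slightly
less than 74, and truncation gives 73. The 14-digit chopped decimal arithmetic
and the names CHOP, FUDGE and recip_divide are likewise hypothetical.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Toy arithmetic (assumption): 14 significant digits, results chopped.
CHOP = Context(prec=14, rounding=ROUND_DOWN)
# Stand-in for the strong-rounding factor; the exact value is an assumption.
FUDGE = Decimal("1.0000000000002")

def recip_divide(a, b):
    """a/b computed as a * (1/b), chopping after each step."""
    return CHOP.multiply(Decimal(a), CHOP.divide(Decimal(1), Decimal(b)))

quotient = recip_divide(52, 2)                  # exactly 26: 1/2 is exact
plain = int(CHOP.subtract(Decimal(100), quotient))

fudged = CHOP.multiply(quotient, FUDGE)         # slightly more than 26
strong = int(CHOP.subtract(Decimal(100), fudged))

print(plain, strong)   # 74 73
```

Here the division was already exact, so the correction had nothing to correct
and only pushed the result across an integer boundary in the wrong direction.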

It has been my experience that few, if any, programs when first 
run under CFT77 produce no warning messages of this sort.  Again
quoting, "NINT compiles to fast in-line code; its use should be
encouraged."
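
To show what NINT buys over plain truncation, here is a small sketch. The
helper nint is a hypothetical Python stand-in modeled on Fortran's NINT
(nearest integer, ties away from zero), and the sample values are the kind of
"just below an integer" results a chopped reciprocal divide might produce.

```python
from decimal import Decimal, ROUND_HALF_UP

def nint(x):
    """Nearest integer, ties away from zero (modeled on Fortran's NINT)."""
    return int(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))

q = Decimal("1.9999999999999")   # e.g. 6.0/3.0 after a slightly low reciprocal
print(int(q), nint(q))           # 1 2

d = Decimal("73.999999999995")   # a value just under 74
print(int(d), nint(d))           # 73 74
```

Unlike a one-sided fudge factor, rounding to nearest tolerates small errors in
either direction, as long as the true result is meant to be a whole number.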

Regardless of compiler, a program which produces a warning message
should be considered to be in error.  It is just too risky to
treat warnings casually.

Standard disclaimers.

mcdonald@uxe.cso.uiuc.edu (01/23/89)

>Bug or feature?  I suppose you would have to say "feature", but really it is
>an example of the fact that you should not make assumptions about the properties
>of floating point arithmetic.

That latter clause is a true statement, no doubt about it. But,
nevertheless, there is an IEEE standard for floating point formats
and operations. It is extremely specific about what a given result
must be, and reasonably close to the "principle of least astonishment".
Most of the computers in the world get quite close to this standard -
for in-range results they seem to be EXACTLY according to it.
I think it would behoove Cray (and that other big computer manufacturer
with the leading up to three zeros in the mantissa) to convert to
it (at least for in-range results).

jlg@lanl.gov (Jim Giles) (01/25/89)

From article <50500101@uxe.cso.uiuc.edu>, by mcdonald@uxe.cso.uiuc.edu:
> 
>>Bug or feature?  I suppose you would have to say "feature", but really it is
>>an example of the fact that you should not make assumptions about the properties
>>of floating point arithmetic.
> 
> That latter clause is a true statement, no doubt about it. But,
> nevertheless, there is an IEEE standard for floating point formats
> and operations. It is extremely specific about what a given result
> must be, and reasonably close to the "principle of least astonishment".
> [...]
> I think it would behoove Cray (and that other big computer manufacturer
> with the leading up to three zeros in the mantissa) to convert to
> it (at least for in-range results).

The IEEE standard was invented for _small_ computers.  To implement it on
vector-type architecture would cause multiply to run 20-30% longer.  Divide
is even worse - a staged divider for the IEEE standard would require twice
the time (or more), but that isn't the problem: such a divide unit would
occupy as much hardware as the entire rest of the CPU!  So, the question I
have as a Cray user is: is fixing this minor divide problem worth slowing
all my other programs by large amounts, increasing the cost of the machine
by 20%, and omitting other improvements that Cray _could_ have made instead?

Having said all this, I should point out that I am not violently opposed
to improving the arithmetic done by Cray and other big machines.  The time
wasted (both human and machine) correcting for inaccurate arithmetic is
substantial.  But the issue is not as simple as just saying 'it behooves
them to fix their machines'.  There are trade-offs to consider.  Cray
arithmetic behaves as it does because of _real_ limitations in the way
that hardware _can_ be designed.  You may decide that Cray made the wrong
compromise, but commercial success says otherwise.

Cross post further discussion to comp.arch.

dd@beta.lanl.gov (Dan Davison) (01/25/89)

In article <50500099@uxe.cso.uiuc.edu>, hirchert@uxe.cso.uiuc.edu writes:
> dd@beta.lanl.gov writes
> >I came across an interesting non-bug in CFT77 today.  A program
> >fragment:
> >	a = 6.0
> >	b = 3.0
> >	i = a / b
> >
> >gave the result "2" in CFT on an X-MP/48 and "1" in CFT77 on the
> >same machine.
> [...]  In developing CFT77, CRAY recognized that this
> extra multiply slowed down all divides in this kind of situation, not just
> those that were expected to be exact and that it could be confusing to get
> different results depending on whether or not you assigned the intermediate
> REAL quotient to a variable, so they chose not to do "strong rounding" in this
> case and instead issue a warning message.  (Did you ignore this warning message,
> dd?)

There was no warning issued!  The compiler didn't blink on it.

And thanks very much for an excellent discussion of what was going on.

[inews fodder]

dan davison/theoretical biology/t-10 ms k710/los alamos national laboratory
los alamos, nm 87545/dd@lanl.gov (arpa)/dd@lanl.uucp(new)/..cmcl2!lanl!dd
"Freedom is a heavy load, a great and strange burden for the spirit to
undertake.  It is not easy.  It is not a gift given, but a choice made,
and the choice may be a hard one." ...Le Guin, _The Farthest Shore_

smryan@garth.UUCP (s m ryan) (01/26/89)

>I think it would behoove Cray (and that other big computer manufacturer

It behooves Cray to sell computers. If it sells more boxes by providing
a faster, and sufficiently accurate, division, it will continue to do so.
-- 
So Loki left for elves of night,                                    -- s m ryan
who lived anigh the halls of plight.
He found a pike who swam alone                                 -- Andwari's Gem
and cast the drowning net of Ron.                                        -- 1/8