steve@cohort.UUCP (Steve Bird) (02/28/90)
Forwarded message:
>From steve Fri Nov 24 21:46:23 1989
Subject: C code
To: evan@telly (Evan Leibovitch)
Date: Fri, 24 Nov 89 21:46:23 EST
From: Steve Bird <steve@cohort.UUCP>
X-Mailer: ELM [version 2.2 PL7]

Here's another interesting piece.

/* floaterr.c--demonstrates round-off error */
# include <stdio.h>
main()
{
	float a,b;
	b = 2.0e20 + 1.0;
	a = b - 2.0e20;
	printf("%f \n",a);
}

When compiled the program returns the number 4008175468544.000000.
Now when the program is modified to read:

/* floaterr.c--demonstrates round-off error */
# include <stdio.h>
main()
{
	float a,b;
	b = 2.0e20 + 1.0;
	a = 2.0e20;
	printf("%f \n",b - a);
}

The program returns 0.000000.  Why?
hugh (D. Hugh Redelmeier) (03/01/90)
| From: Steve Bird <steve@cohort.UUCP>
|
| Here's another interesting piece.
|
| /* floaterr.c--demonstrates round-off error */
| # include <stdio.h>
| main()
| {
| 	float a,b;
| 	b = 2.0e20 + 1.0;
| 	a = b - 2.0e20;
| 	printf("%f \n",a);
| }
|
| When compiled the program returns the number 4008175468544.000000 .
| Now when the program is modified to read :
|
| /* floaterr.c--demonstrates round-off error */
| # include <stdio.h>
| main()
| {
| 	float a,b;
| 	b = 2.0e20 + 1.0;
| 	a = 2.0e20;
| 	printf("%f \n",b - a);
| }
|
| The program returns 0.000000 . Why ?

In one case, you are evaluating
	((double) (float) 2.0e20) - 2.0e20
In the other case, you are evaluating
	((double) (float) 2.0e20) - ((double) (float) 2.0e20)
The former represents the roundoff in representing 2.0e20 in single
precision.  The latter ought to be zero.

Here is a simpler version of your programs.  Note that the "+ 1.0"
has no effect.

/* floaterr.c--demonstrates round-off error */
# include <stdio.h>
int main()
{
	printf("%f\n", ((double) (float) 2.0e20) - 2.0e20);
		/* 4008175468544.000000 */
	printf("%f\n", ((double) (float) 2.0e20) - ((double) (float) 2.0e20));
		/* 0.000000 */
	return 0;
}

For questions like this, it is useful to know the compiler and machine.
It appears to use IEEE f.p., so I was able to reproduce it using RedC on
my Sun 3.

Hugh Redelmeier
{utcsri, yunexus, uunet!attcan, utzoo, hcr}!redvax!hugh
When all else fails: hugh@csri.toronto.edu  +1 416 482-8253
clewis@eci386.uucp (Chris Lewis) (03/02/90)
In article <9002280838.AA01415@cohort.uucp> Steve Bird <steve@cohort.UUCP> writes:

| 	float a,b;
| 	b = 2.0e20 + 1.0;
| 	a = b - 2.0e20;
| 	printf("%f \n",a);
|
| When compiled the program returns the number 4008175468544.000000 .
| Now when the program is modified to read :
|
| 	float a,b;
| 	b = 2.0e20 + 1.0;
| 	a = 2.0e20;
| 	printf("%f \n",b - a);
|
| The program returns 0.000000 . Why ?

Actually, it's truncation error ;-)

The numbers printed in the above cases frequently depend upon the precise
C compiler and processor you're running on.  The explanation I give is
"traditional" C, according to K&R (ANSI C provides for different behaviour
under certain circumstances):

When a float is passed to a function or used in an expression, the operand
is first coerced to a double.  E.g. the subtraction in the first fragment
has both arguments coerced to double, and then the result is forced into
a float.  Since floats are usually half the size of doubles, you lose
digits off the least significant end.  In the second fragment, the
subtraction is also done with both operands as doubles, but since the
result is used as a function argument, it is not truncated into a float:
it's passed as a double to printf's %f handler.

There are other factors coming into play - depending on your machine,
2e20 + 1 may actually *equal* 2e20, depending upon how many digits of
precision the variable that the result is stored in has.  (Which is what
I suspect your second example is trying to tell you.)

Frankly, given that "traditional" C does all floating point operations
and argument passing in doubles, I almost never use a float to store the
result of an FP operation, and only use floats in large arrays.  If you
use floats for the results of FP operations, you should understand the
algorithm well enough to know the magnitudes of the operands involved.
This sort of thing is still possible with doubles, but you can get away
with more.

If you make everything double, chances are it'll be faster (less coercing
required), and the extra space only matters in large arrays.
-- 
Chris Lewis, Elegant Communications Inc, {uunet!attcan,utzoo}!lsuc!eci386!clewis
Ferret mailing list: eci386!ferret-list, psroff mailing list: eci386!psroff-list
henry@utzoo.uucp (Henry Spencer) (03/02/90)
In article <9002280838.AA01415@cohort.uucp> Steve Bird <steve@cohort.UUCP> writes:

> 	float a,b;
> 	b = 2.0e20 + 1.0;
> 	a = b - 2.0e20;
> 	printf("%f \n",a);
>
> When compiled the program returns the number 4008175468544.000000 .
>
> 	a = 2.0e20;
> 	printf("%f \n",b - a);
>
> The program returns 0.000000 . Why ?

Single-precision floating-point, aka "float", typically has about 24 bits
of precision.  Numbers circa 2e20, represented in circa 24 bits, have
roundoff error circa 1e13, so 4e12 and 0 probably differ by 1 in the
least-significant bit.

The arithmetic in these two programs is not quite the same, because all
C floating-point arithmetic is done in "double", and you've got an extra
double->float->double conversion between subtraction and printing in the
first case.  Somehow, the code your compiler is generating is ending up
introducing different roundoff behavior.

Asking for a detailed explanation of a least-significant-bit difference
is pointless unless you're willing to examine compiler, generated code,
and hardware very closely; too many things can cause such differences.
"Floating point is an analog box anyway." -Hugh Redelmeier.  (Translation,
for the non-hardware types in the crowd: "if there's only one or two bits
wrong at the least-significant end, you got what you paid for".)
-- 
MSDOS, abbrev: Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.         | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
lamy@cs.utoronto.ca (Jean-Francois Lamy) (03/02/90)
It is interesting to note that one gets the same results from those two
pieces of code when running them on Sun 4s (SunOS 4.0.3), on an Iris 4D
(Irix 3.2 with the old MIPS compiler), and under gcc on the Sun 4s.
Could compliance with IEEE rules cause all 3 versions of the floating
point code to behave essentially the same, given that there is little
room for optimizations?

Possibly fishing in the wrong pond,

Jean-Francois Lamy               lamy@cs.utoronto.ca, uunet!cs.utoronto.ca!lamy
Department of Computer Science, University of Toronto, Canada M5S 1A4