[comp.lang.c] Floating Point Arithmetic

rst@cs.hull.ac.uk (Rob Turner) (10/26/90)

<henry@zoo.toronto.edu> (Henry Spencer) writes:

 >In general, you should use 'double' for all floating point arithmetic

I agree with this, although it took me a fair while to get over the
natural hurdle of always preferring to use float because float
arithmetic 'must be faster' than double. I am under the impression
that in K&R C (which I have used to do most of my C programming), all
floating point computation is performed in double precision mode
anyway, so the compiler ends up having to convert floats to doubles
before you do the sums, then translate back into floats afterwards.
Depending on the floating point format, these conversions take up
varying amounts of time. Similarly, float parameters are passed as
doubles. I believe that the situation has changed with ANSI C, and
none of these conversions are performed.

         Robert Turner      <rst@cs.hull.ac.uk>

         Department of Computer Science,
         University of Hull, England.

steve@taumet.com (Stephen Clamage) (10/27/90)

rst@cs.hull.ac.uk (Rob Turner) writes:
>[ ... ] it took me a fair while to get over the
>natural hurdle of always preferring to use float because float
>arithmetic 'must be faster' than double.

If FP arithmetic is done in software and the software does both float
and double arithmetic, float is generally faster.  With hardware
floating-point, there is usually no difference, since the hardware
converts to its internal form and back.

>I am under the impression
>that in K&R C (which I have used to do most of my C programming), all
>floating point computation is performed in double precision mode
>anyway, so the compiler ends up having to convert floats to doubles
>before you do the sums, then translate back into floats afterwards.
>Depending on the floating point format, these conversions take up
>varying amounts of time. Similarly, float parameters are passed as
>doubles.

True; and you cannot predict whether conversion to/from float is faster
or slower than conversion to/from double on an unknown system, nor
whether the conversion time is compensated for by the difference, if any,
in float vs double arithmetic speed.

>I believe that the situation has changed with ANSI C, and
>none of these conversions are performed.

Not quite.  ANSI C allows but does not require arithmetic on float types
to be performed without conversion to double.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)

In article <27095.9010261638@olympus.cs.hull.ac.uk> rst@cs.hull.ac.uk (Rob Turner) writes:
><henry@zoo.toronto.edu> (Henry Spencer) writes:
> >In general, you should use 'double' for all floating point arithmetic
>I agree with this, although it took me a fair while to get over the
>natural hurdle of always preferring to use float because float
>arithmetic 'must be faster' than double.

Interestingly, that is not always true, especially using IEEE FP chips.

>I believe that the situation has changed with ANSI C, and
>none of these conversions are performed.

Conversion to double is not REQUIRED but it is PERMITTED.

The real reason for avoiding floats is that unless you already know all
about this stuff, your computations may produce garbage due to the lack
of precision attained using single precision.

rang@cs.wisc.edu (Anton Rang) (11/09/90)

In article <14366@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>The real reason for avoiding floats is that unless you already know all
>about this stuff, your computations may produce garbage due to the lack
>of precision attained using single precision.

  Of course, it's just as easy to write code which will give you
double-precision garbage as it is to get single-precision garbage.
(Not a reason not to use 'double' anyway, of course.)

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+

gwyn@smoke.brl.mil (Doug Gwyn) (11/09/90)

In article <RANG.90Nov8132409@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>  Of course, it's just as easy to write code which will give you
>double-precision garbage as it is to get single-precision garbage.

I disagree.  Except for special, heavily iterative algorithms that would
probably not be attempted by the naive, generally double-precision
arithmetic produces usable answers more often than single-precision would.

rh@smds.UUCP (Richard Harter) (11/10/90)

In article <14406@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
> In article <RANG.90Nov8132409@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
> >  Of course, it's just as easy to write code which will give you
> >double-precision garbage as it is to get single-precision garbage.

> I disagree.  Except for special, heavily iterative algorithms that would
> probably not be attempted by the naive, generally double-precision
> arithmetic produces usable answers more often than single-precision would.

One can scarcely argue with this -- it is in the nature of things that
double precision will be more accurate than single precision!  However
it is somewhat misleading.  In the majority of situations, answers and
data with 3-4 places of precision are all that are required or meaningful.
The loss of precision is typically 3 places or less, so 32-bit single
precision (float on most machines today) is sufficient.  Situations where 32-bit
precision does not suffice are usually either numerically poorly conditioned
or inherently require high precision.  In these cases double precision
is a dangerous nostrum -- one should do one's numerical analysis homework.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

djh@xipe.osc.edu (David Heisterberg) (11/11/90)

In article <232@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:
>                                               Situations where 32 bit
>precision does not suffice are usually either numerically poorly conditioned
>or inherently require high precision.  In these cases double precision
>is a dangerous nostrum -- one should do one's numerical analysis homework.

There is also the case of theoretical work, such as quantum chemistry,
for which all data is known exactly: in atomic units hbar = 1.0, qe = -1.0,
me = 1.0, etc.  It's not uncommon to "outdo" others by calculating
energies that are less than 1 part in 10^6 lower than previous values.
For such comparisons to have meaning in the face of a few large matrix
diagonalizations, double precision is a must.
--
David J. Heisterberg		djh@osc.edu		And you all know
The Ohio Supercomputer Center	djh@ohstpy.bitnet	security Is mortals'
Columbus, Ohio  43212		ohstpy::djh		chiefest enemy.

rh@smds.UUCP (Richard Harter) (11/11/90)

In article <1143@sunc.osc.edu>, djh@xipe.osc.edu (David Heisterberg) writes:
> In article <232@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:
> >                                               Situations where 32 bit
> >precision does not suffice are usually either numerically poorly conditioned
> >or inherently require high precision.  In these cases double precision
> >is a dangerous nostrum -- one should do one's numerical analysis homework.

> There is also the case of theoretical work, such as quantum chemistry,
> for which all data is known exactly: in atomic units hbar = 1.0, qe = -1.0,
> me = 1.0, etc.  It's not uncommon to "out do" others by calculating
> energies that are less than 1 part in 10^6 lower than  previous values.
> For such comparisons to have meaning in the face of a few large matrix
> diagonalizations, double precision is a must.

Well, this is one of the categories I had in mind.  Certainly in theoretical
calculations where you want high final precision you need high intermediate
precision.  However you still need to know how much precision you need.  A 
simple-minded "I need more precision so I will use double precision" is
what I was referring to as a "dangerous nostrum".  If your computational
process is not eating precision then the precision you use is the precision
you get.  If the computational process is eating precision then you do not
know what the resulting precision is unless you've done your homework.


-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

seanf@sco.COM (Sean Fagan) (11/16/90)

In article <14366@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <27095.9010261638@olympus.cs.hull.ac.uk> rst@cs.hull.ac.uk (Rob Turner) writes:
>><henry@zoo.toronto.edu> (Henry Spencer) writes:
>> >In general, you should use 'double' for all floating point arithmetic
>>I agree with this, although it took me a fair while to get over the
>>natural hurdle of always preferring to use float because float
>>arithmetic 'must be faster' than double.
>Interestingly, that is not always true, especially using IEEE FP chips.

One of the more amusing tests I've run was on an AT&T 3B5, with no FPU (it
did, however, have an FP emulator).  Anyway, code using 'float' was *much*
slower than code using 'double', because all 'float' variables had to be
promoted to 'double', and that was *also* emulated, and took quite a bit of
time (not to mention demoting them back down to 'float').

-- 
-----------------+
Sean Eric Fagan  | "*Never* knock on Death's door:  ring the bell and 
seanf@sco.COM    |   run away!  Death hates that!"
uunet!sco!seanf  |     -- Dr. Mike Stratford (Matt Frewer, "Doctor, Doctor")
(408) 458-1422   | Any opinions expressed are my own, not my employers'.