[net.lang.c] Can C default to float? Are there float regs?

dove@mit-bug.UUCP (Web Dove) (09/12/85)

Many people in our group find it frustrating that C converts floating
arithmetic to double.  Converting float variables to double, doing the
calculation and converting back to float is usually so costly that it
is faster to do it in double.  Unfortunately, this wastes space.
Also, our machines (vax 750/4.2bsd) would be faster if the
computations were done in straight float.  Many people resort to
fortran/assembler to accomplish this, but it is unfortunate to need to use
two languages.
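
[Illustrative sketch, added in editing, not from the original poster: under
the traditional K&R rules a compiler widens every float operand to double,
does the arithmetic in double, and narrows the result back to float on the
assignment.]

#include <stdio.h>

main()
{
  /* Sketch only: the multiply below is computed as if it were
       c = (float)((double)a * (double)b);
     with a widening conversion on each operand and a narrowing
     conversion on the store back into c. */
  float a = 1.5, b = 2.5, c;

  c = a * b;
  printf("c = %f\n", c);
  return 0;
}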

I realize that this violates the standard for C, but has anyone ever
changed the compiler to accomplish this?

On a related note, it appears that register declarations for float variables 
have no effect on our compiler (they don't cause the variables to be stored
in registers).  It has been hypothesized that those who write the compiler
don't feel that making "register float" do something is worth the effort.

Is there anyone who has made "register float" work?  Is it impossible?

ken@turtlevax.UUCP (Ken Turkowski) (09/13/85)

In article <175@mit-bug.UUCP> dove@mit-bugs-bunny.UUCP (Web dove) writes:
>Many people in our group find it frustrating that C converts floating
>arithmetic to double.  Converting float variables to double, doing the
>calculation and converting back to float is usually so costly that it
>is faster to do it in double.  Unfortunately, this wastes space.
>Also, our machines (vax 750/4.2bsd) would be faster if the
>computations were done in straight float.  Many people resort to
>fortran/assembler to accomplish this, but it is unfortunate to need to use
>two languages.
>
>I realize that this violates the standard for C, but has anyone ever
>changed the compiler to accomplish this?

I've heard of some implementations that do this.  They have implemented
two separate compilers, which one can switch between with a flag:  one
for the "standard interpretation" of the C specification, and one that
accommodates "float" as a bonafide type with a complete set of
operations on them.

The 16-bit machines allowed for two types of fixed point arithmetic:
short (int) and long.  There is no reason why this could not also be
implemented for floating-point numbers.

>On a related note, it appears that register declarations for float variables 
>have no effect on our compiler (they don't cause the variables to be stored
>in registers).  It has been hypothesized that those who write the compiler
>don't feel that making "register float" do something is worth the effort.
>
>Is there anyone who has made "register float" work?  Is it impossible?

Similarly, this is not impossible.  I don't believe that the C spec
precludes this.  It's just that many machines do not have
floating-point registers.

On machines such as the 68000 that have separate address and data
register sets, the C compiler doesn't normally distinguish between the
two when allocating them; special enhancements need to be made to the
compiler in order for the allocation to be done appropriately.  A
similar enhancement needs to be made for floating-point registers.

-- 
Ken Turkowski @ CADLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.ARPA

henry@utzoo.UUCP (Henry Spencer) (09/15/85)

> Many people in our group find it frustrating that C converts floating
> arithmetic to double...
> I realize that this violates the standard for C, but has anyone ever
> changed the compiler to accomplish this?

Actually, it doesn't violate the draft ANSI C standard, which provides
for this as a legitimate, although implementation-dependent, practice.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

mjs@sfmag.UUCP (M.J.Shannon) (09/15/85)

> In article <175@mit-bug.UUCP> dove@mit-bugs-bunny.UUCP (Web dove) writes:
> >Many people in our group find it frustrating that C converts floating
> >arithmetic to double.  Converting float variables to double, doing the
> >calculation and converting back to float is usually so costly that it
> >is faster to do it in double.  Unfortunately, this wastes space.
> >Also, our machines (vax 750/4.2bsd) would be faster if the
> >computations were done in straight float.  Many people resort to
> >fortran/assembler to accomplish this, but it is unfortunate to need to use
> >two languages.
> >
> >I realize that this violates the standard for C, but has anyone ever
> >changed the compiler to accomplish this?
> 
> I've heard of some implementations that do this.  They have implemented
> two separate compilers, which one can switch between with a flag:  one
> for the "standard interpretation" of the C specification, and one that
> accommodates "float" as a bonafide type with a complete set of
> operations on them.
> 
> The 16-bit machines allowed for two types of fixed point arithmetic:
> short (int) and long.  There is no reason why this could not also be
> implemented for floating-point numbers.
> 
> >On a related note, it appears that register declarations for float variables 
> >have no effect on our compiler (they don't cause the variables to be stored
> >in registers).  It has been hypothesized that those who write the compiler
> >don't feel that making "register float" do something is worth the effort.
> >
> >Is there anyone who has made "register float" work?  Is it impossible?
> 
> Similarly, this is not impossible.  I don't believe that the C spec
> precludes this.  It's just that many machines do not have
> floating-point registers.
> 
> On machines such as the 68000 that have separate address and data
> register sets, the C compiler doesn't normally distinguish between the
> two when allocating them; special enhancements need to be made to the
> compiler in order for the allocation to be done appropriately.  A
> similar enhancement needs to be made for floating-point registers.

Also, for machines which do have floating point registers, getting the code
generator to produce the correct code to use them is less trivial than most
would believe.  It is for this reason that most VAX compilers no longer
implement "register double" in registers (though many did at one time or
another, there were a significant number of cases that the compiler got wrong,
and fixing these cases would have taken orders of magnitude more work than was
feasable).
-- 
	Marty Shannon
UUCP:	ihnp4!attunix!mjs
Phone:	+1 (201) 522 6063
Disclaimer: I speak for no one.

c20@nmtvax.UUCP (09/17/85)

> << first part of message here >>
>
> On a related note, it appears that register declarations for float variables 
> have no effect on our compiler (they don't cause the variables to be stored
> in registers).  It has been hypothesized that those who write the compiler
> don't feel that making "register float" do something is worth the effort.
> 
> Is there anyone who has made "register float" work?  Is it impossible?

New Mexico Tech's v7 C compiler for TOPS-20 supports register floats.
It also supports register doubles, but does so by treating a "double"
declaration as being synonymous with "float":

    "Single-precision floating point (float) and double-precision floating
     point (double) may be synonymous in some implementations."
                                                K & R, page 183, line 6

It's not impossible;  it's just more or less difficult, depending upon
what the underlying architecture provides the C compiler author in
the way of support for floating-point.

greg
-- 

Greg Titus                  ..!ucbvax!unmvax!nmtvax!c20     (uucp)
NM Tech Computer Center     ..!cmcl2!lanl!nmtvax!c20        (uucp)
Box W209 C/S                c20@nmt                         (CSnet)
Socorro, NM 87801           c20.nmt@csnet-relay             (arpa)
(505) 835-5735
======================================================================

brooks@lll-crg.UUCP (Eugene D. Brooks III) (09/18/85)

It is just a shame that the FP11 on the old PDP11 did not support both
single and double floats in its registers at the same time.  If it had,
this problem would not have been in C from the beginning!  It will of
course disappear eventually, albeit more slowly than the PDP11.

rubin@mtuxn.UUCP (M.RUBIN) (09/20/85)

Sun's C compiler has a flag "-fsingle" that causes floating point calculations
to be done in single precision.
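
[Editor's note: only the "-fsingle" flag itself comes from the poster; the
rest of the command line below, including the file names, is an invented
example.]

    cc -O -fsingle fft.c -o fft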

brooks@lll-crg.UUCP (Eugene D. Brooks III) (09/20/85)

In reply to the comment concerning the difficulty of doing double
register variables right with pcc: FOO on this!  The problem is no
different from getting longs done right on the PDP-11.  It has been
done at Caltech, for one, and that compiler, which does single
arithmetic in single precision and allocates both single and double
floats to registers properly, has been in use for many years.  Like any
compiler modification it has to be done right, but it is not that
difficult.

doug@escher.UUCP (Douglas J Freyburger) (09/22/85)

The discussion has been about single vs. double precision
floating point, as well as keeping them in registers.  It
was mentioned that there would be a big speed difference
on the VAX.

At site "cithep", the Caltech High Energy Physics
Department, they made a C compiler that used single
precision for "float".  At first, they got ALMOST NO SPEED
IMPROVEMENT.  After adding floating point immediate values
to the assembler produced (instead of storing constants
like strings and then referring to them by name), they got
a pretty good improvement: 10-20%.  I don't know if they
were running a 750 or a 780.  Still, for calculations that
involve only variables and no constants, the difference is
much smaller than you'd think.
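
[Editor's illustration, not cithep code: the constant in scale() below is
the sort of operand that can become a floating immediate, while dot() uses
only variables, so that particular optimization buys it nothing.]

/* Illustration only: 0.5 is a candidate for an immediate operand
   instead of a stored, named constant; the second routine has no
   floating constants for the compiler to treat that way. */
float scale(a, n)
float a[];
int n;
{
  int i;
  float sum = 0.0;

  for (i = 0; i < n; i++)
    sum += 0.5 * a[i];
  return sum;
}

float dot(a, b, n)
float a[], b[];
int n;
{
  int i;
  float sum = 0.0;

  for (i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}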

The difference is smaller than you would think, but then,
has anyone ever seen an instruction timing diagram for any
VAX model published by DEC?  No.  The large number of
models is one reason.  The complexity of cache-hit vs
cache-miss vs page-fault out of the physical memory pool vs
page-fault out of disk is another reason.  Still, I think
they should be able to publish the timings for cache-hits
and cache-misses on a per-model basis.  They DO have the
microcode sources available to count from.  I am guessing
that they put a lot more effort into optimizing the double
precision floating point, and they are hesitant to show
that in public.

Ever look at the assembler produced by BLISS-32 on a VMS
machine?  That BLISS compiler obviously knows some stuff
that the rest of us don't.  It uses indexed address
modes instead of ADDs, in-line multiplies instead of
POLYs, and other tricks that I don't even recognize, ones
that LOOK like they should be slower than the obvious way.
It seems that instructions people think are faster are
sometimes actually slower.
-- 

Doug Freyburger		DOUG@JPL-VLSI, DOUG@JPL-ROBOTICS,
JPL 171-235		...escher!doug, doug@aerospace,
Pasadena, CA 91109	etc.

Disclaimer: The opinions expressed above are far too
ridiculous to be associated with my employer.

Unix is a trademark of Bell Labs, VMS is a trademark of
DEC, and there are others that I'm probably forgetting to
mention.

dove@mit-bug.UUCP (Web Dove) (09/25/85)

In article <56@escher.UUCP> doug@escher.UUCP (Douglas J Freyburger) writes:
>The discussion has been about single vs. double precision
>floating point, as well as keeping them in registers.  It
>was mentioned that there would be a big speed difference
>on the VAX.
>
>At site "cithep", the Caltech High Energy Physics
>Department, they made a C compiler that used single
>precision for "float".  At first, they got ALMOST NO SPEED
>IMPROVEMENT.  After adding floating point immediate values
>to the assembler produced (instead of storing constants
>like strings and then referring to them by name), they got
>a pretty good improvement: 10-20%.  I don't know if they
>were running a 750 or a 780.  Still, for calculations that
>involve only variables and no constants, the difference is
>much smaller than you'd think.

Here is the simple test I used

#include <stdio.h>
main()
{
  register float x=1.1, y=1.0;
  register int i;
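  /* ten adds per pass, unrolled by hand: one million additions in all */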
  for(i=0; i<100000; i++)
    {
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
      x = x+y;
    }
}

750/FPA results using 4.2bsd compiler (with "-O")

9.6u 0.6s 0:11 91% 1+3k 1+2io 2pf+0w

750/FPA results using 4.3bsd compiler (with "-O -f" i.e. use single
precision float)

1.8u 0.1s 0:01 101% 1+3k 0+2io 2pf+0w

This is an artificially dramatic result (better than a factor of five in
user time); the following is more typical.


Here is a summary of the FFT times for a program written in C (no assembly
code).

It uses either single precision (with the 4.3bsd compiler, "-O -f") or double
precision (either the 4.3bsd or the 4.2bsd compiler; they are effectively the
same).

The times for single precision with the 4.2 compiler are longer than for
double (because of all the conversions), so it is pointless to list them as
an alternative.

Summary of FFT Times:

1024 complex FFTs (in milliseconds) and IFFTs - Single Precision

Vax750 w/FPA (Radix 4):  210 and 210
Vax750 w/FPA (Radix 2):  280 and 290

Vax785 w/FPA (Radix 4):  80 and 90
Vax785 w/FPA (Radix 2):  100 and 110

1024 complex FFTs (in milliseconds) and IFFTs - Double Precision

Vax750 w/FPA (Radix 4):  350 and 350
Vax750 w/FPA (Radix 2):  360 and 390

Vax785 w/FPA (Radix 4):  200 and 210
Vax785 w/FPA (Radix 2):  220 and 230


These are considerable improvements for both the 750 and the 785 (more than
the 20% quoted above), and for a program that has a typical balance of
control code and floating-point arithmetic (typical for signal processing).