dove@mit-bug.UUCP (Web Dove) (09/12/85)
Many people in our group find it frustrating that C converts floating
arithmetic to double.  Converting float variables to double, doing the
calculation and converting back to float is usually so costly that it
is faster to do it in double.  Unfortunately, this wastes space.
Also, our machines (vax 750/4.2bsd) would be faster if the
computations were done in straight float.  Many people resort to
fortran/assembler to accomplish this, but it is unfortunate to need to
use two languages.

I realize that this violates the standard for C, but has anyone ever
changed the compiler to accomplish this?

On a related note, it appears that register declarations for float
variables have no effect on our compiler (they don't cause the
variables to be stored in registers).  It has been hypothesized that
those who write the compiler don't feel that making "register float"
do something is worth the effort.

Is there anyone who has made "register float" work?  Is it impossible?
ken@turtlevax.UUCP (Ken Turkowski) (09/13/85)
In article <175@mit-bug.UUCP> dove@mit-bugs-bunny.UUCP (Web Dove) writes:
>Many people in our group find it frustrating that C converts floating
>arithmetic to double.  Converting float variables to double, doing the
>calculation and converting back to float is usually so costly that it
>is faster to do it in double.  Unfortunately, this wastes space.
>Also, our machines (vax 750/4.2bsd) would be faster if the
>computations were done in straight float.  Many people resort to
>fortran/assembler to accomplish this, but it is unfortunate to need to
>use two languages.
>
>I realize that this violates the standard for C, but has anyone ever
>changed the compiler to accomplish this?

I've heard of some implementations that do this.  They have implemented
two separate compilers, which one can switch between with a flag: one
for the "standard interpretation" of the C specification, and one that
accommodates "float" as a bona fide type with a complete set of
operations on it.

The 16-bit machines allowed for two types of fixed-point arithmetic:
short (int) and long.  There is no reason why this could not also be
implemented for floating-point numbers.

>On a related note, it appears that register declarations for float
>variables have no effect on our compiler (they don't cause the
>variables to be stored in registers).  It has been hypothesized that
>those who write the compiler don't feel that making "register float"
>do something is worth the effort.
>
>Is there anyone who has made "register float" work?  Is it impossible?

Similarly, this is not impossible.  I don't believe that the C spec
precludes it; it's just that many machines do not have floating-point
registers.

On machines such as the 68000 that have separate address and data
register sets, the C compiler doesn't normally distinguish between the
two when allocating them; special enhancements need to be made to the
compiler in order for the allocation to be done appropriately.  A
similar enhancement needs to be made for floating-point registers.
--
Ken Turkowski @ CADLINC, Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.ARPA
henry@utzoo.UUCP (Henry Spencer) (09/15/85)
> Many people in our group find it frustrating that C converts floating
> arithmetic to double...
> I realize that this violates the standard for C, but has anyone ever
> changed the compiler to accomplish this?

Actually, it doesn't violate the draft ANSI C standard, which provides
for this as a legitimate although implementation-dependent practice.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
mjs@sfmag.UUCP (M.J.Shannon) (09/15/85)
> In article <175@mit-bug.UUCP> dove@mit-bugs-bunny.UUCP (Web Dove) writes:
> >Many people in our group find it frustrating that C converts floating
> >arithmetic to double.  [...]
> >Is there anyone who has made "register float" work?  Is it impossible?
>
> Similarly, this is not impossible.  I don't believe that the C spec
> precludes it; it's just that many machines do not have floating-point
> registers.
>
> On machines such as the 68000 that have separate address and data
> register sets, the C compiler doesn't normally distinguish between the
> two when allocating them; special enhancements need to be made to the
> compiler in order for the allocation to be done appropriately.  A
> similar enhancement needs to be made for floating-point registers.

Also, for machines which do have floating-point registers, getting the
code generator to produce the correct code to use them is less trivial
than most would believe.  It is for this reason that most VAX compilers
no longer implement "register double" in registers (though many did at
one time or another, there were a significant number of cases that the
compiler got wrong, and fixing them would have taken orders of
magnitude more work than was feasible).
--
Marty Shannon
UUCP: ihnp4!attunix!mjs
Phone: +1 (201) 522 6063
Disclaimer: I speak for no one.
c20@nmtvax.UUCP (09/17/85)
> << first part of message here >>
>
> On a related note, it appears that register declarations for float
> variables have no effect on our compiler (they don't cause the
> variables to be stored in registers).  It has been hypothesized that
> those who write the compiler don't feel that making "register float"
> do something is worth the effort.
>
> Is there anyone who has made "register float" work?  Is it impossible?

New Mexico Tech's v7 C compiler for TOPS-20 supports register floats.
It also supports register doubles, but does so by treating a "double"
declaration as being synonymous with "float":

	"Single-precision floating point (float) and double-precision
	floating point (double) may be synonymous in some
	implementations."
					K & R, page 183, line 6

It's not impossible; it's just more or less difficult, depending upon
what the underlying architecture provides the C compiler author in the
way of support for floating point.

						greg
--
Greg Titus                 ..!ucbvax!unmvax!nmtvax!c20   (uucp)
NM Tech Computer Center    ..!cmcl2!lanl!nmtvax!c20      (uucp)
Box W209 C/S               c20@nmt                       (CSnet)
Socorro, NM 87801          c20.nmt@csnet-relay           (arpa)
(505) 835-5735
brooks@lll-crg.UUCP (Eugene D. Brooks III) (09/18/85)
It is just a shame that the FP11 on the old PDP-11 did not support both
single and double floats in its registers at the same time.  If it had,
this problem would not have been in C from the beginning!  It will of
course disappear eventually, albeit more slowly than the PDP-11.
rubin@mtuxn.UUCP (M.RUBIN) (09/20/85)
Sun's C compiler has a flag "-fsingle" that causes floating point calculations to be done in single precision.
brooks@lll-crg.UUCP (Eugene D. Brooks III) (09/20/85)
In reply to the comment concerning the difficulty of doing double
register variables right with pcc: FOO on this!  This problem is no
different from getting longs done right on the PDP-11.  It has been
done at Caltech, for one place, and that compiler, which does single
arithmetic in single precision and allocates both single and double
floats in registers properly, has been in use for many years.  Like any
compiler modification it has to be done right, but it is not that
difficult.
doug@escher.UUCP (Douglas J Freyburger) (09/22/85)
The discussion has been about single vs. double precision floating
point as well as keeping them in registers.  It was mentioned that
there would be a big speed difference on the VAX.

At site "cithep", the Caltech High Energy Physics Department, they made
a C compiler that used single precision for "float".  At first, they
got ALMOST NO SPEED IMPROVEMENT.  After adding floating-point immediate
values to the assembler produced (instead of storing constants like
strings and then referring to them by name), they got a pretty good
improvement: 10-20%.  I don't know if they were running a 750 or 780.
Still, for calculations that only involve variables and no constants,
the difference is much smaller than you'd think.

The difference is smaller than you would think, but then, has anyone
ever seen an instruction timing diagram for any VAX model published by
DEC?  No.  The large number of models is one reason.  The complexity of
cache-hit vs. cache-miss vs. page-fault out of the physical memory pool
vs. page-fault out of disk is another.  Still, I think they should be
able to publish the timings for cache hits and cache misses on a
per-model basis.  They DO have the microcode sources available to count
from.  I am guessing that they put a lot more effort into optimizing
double-precision floating point, and they are hesitant to show that in
public.

Ever look at the assembler produced by BLISS-32 on a VMS machine?  That
BLISS compiler obviously knows some stuff that the rest of us don't.
It uses indexed address modes instead of ADDs, in-line multiplies
instead of POLYs, and other tricks that I don't even recognize that
LOOK like they should be slower than the obvious way.  It seems that
instructions people think are faster are sometimes actually slower.
--
Doug Freyburger		DOUG@JPL-VLSI, DOUG@JPL-ROBOTICS,
JPL 171-235		...escher!doug, doug@aerospace,
Pasadena, CA 91109	etc.

Disclaimer: The opinions expressed above are far too ridiculous to be
associated with my employer.
Unix is a trademark of Bell Labs, VMS is a trademark of DEC, and there
are others that I'm probably forgetting to mention.
dove@mit-bug.UUCP (Web Dove) (09/25/85)
In article <56@escher.UUCP> doug@escher.UUCP (Douglas J Freyburger) writes:
>The discussion has been about single vs. double precision
>floating point as well as keeping them in registers.  It
>was mentioned that there would be a big speed difference
>on the VAX.
>
>At site "cithep", the Caltech High Energy Physics
>Department, they made a C compiler that used single
>precision for "float".  At first, they got ALMOST NO SPEED
>IMPROVEMENT.  After adding floating-point immediate values
>to the assembler produced (instead of storing constants
>like strings and then referring to them by name), they got
>a pretty good improvement: 10-20%.  I don't know if they
>were running a 750 or 780.  Still, for calculations that
>only involve variables and no constants, the difference is
>much smaller than you'd think.

Here is the simple test I used:

	#include <stdio.h>

	main()
	{
	    register float x = 1.1, y = 1.0;
	    register int i;

	    for (i = 0; i < 100000; i++) {
		x = x+y; x = x+y; x = x+y; x = x+y; x = x+y;
		x = x+y; x = x+y; x = x+y; x = x+y; x = x+y;
	    }
	}

750/FPA results using the 4.2bsd compiler (with "-O"):

	9.6u 0.6s 0:11 91% 1+3k 1+2io 2pf+0w

750/FPA results using the 4.3bsd compiler (with "-O -f", i.e. use
single-precision float):

	1.8u 0.1s 0:01 101% 1+3k 0+2io 2pf+0w

This is an artificially dramatic result.  The following is more
typical.  Here is a summary of the FFT times for a program written in C
(no assembler code).  It uses either single precision (with the 4.3bsd
compiler, "-O -f") or double precision (either the 4.3bsd or 4.2bsd
compiler; they are effectively the same).  The times for single
precision with the 4.2 compiler are longer than for double (because of
all the conversions), so it is pointless to list them as an
alternative.
Summary of FFT times: 1024-point complex FFTs and IFFTs, in
milliseconds.

Single precision:
	Vax750 w/FPA (Radix 4):  210 and 210
	Vax750 w/FPA (Radix 2):  280 and 290
	Vax785 w/FPA (Radix 4):   80 and  90
	Vax785 w/FPA (Radix 2):  100 and 110

Double precision:
	Vax750 w/FPA (Radix 4):  350 and 350
	Vax750 w/FPA (Radix 2):  360 and 390
	Vax785 w/FPA (Radix 4):  200 and 210
	Vax785 w/FPA (Radix 2):  220 and 230

These are considerable improvements for both the 750 and the 785 (more
than the 20% quoted above), and for a program that has a typical
balance of control code and floating arithmetic code (typical for
signal processing).