gudeman@cs.arizona.edu (David Gudeman) (07/09/90)
In article <2324@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
[about implementing frexp(x,&n) -- mantissa of a float]

>Now I submit it will take a real turkey machine language user to match the
>slowness of any program written in C or Fortran, USING ONLY THE OPERATIONS
>OF THOSE LANGUAGES.  Also, inlining, with n being placed in the register
>where it is to be used, is likely to cut the running time by more than
>half.

In C I would write

	#include <math.h>
	...
	frexp(x,&n)

If this doesn't produce code as fast as assembly language, then you don't
have an optimizing compiler.  I'd be surprised if the same doesn't hold for
FORTRAN.

Basically, your implication that C doesn't have an operation to directly
destructure floats is wrong.  True, the operation (frexp) is called a
library function instead of an operator, but the effect is the same, given
a smart enough compiler.  There are only so many operator symbols
available, and frexp is an obscure feature at best.

Also, the definition of frexp is portable.  It would work on a machine
that didn't use the traditional representation of floats.  That's because
C is intended to be a portable language.  If you don't care about
portability (or programming time, or maintenance time), then by all means
use assembler.  If you want a C-like language that is as machine dependent
as assembler, then by all means implement it, but don't expect that it
will hold much interest for anyone who doesn't program exclusively for
your machine.

If you want a machine-independent language that lets you include
machine-specific parts for performance improvement, then use C with the
asm() feature.  It is really quite meaningless to issue a challenge like
the one above to do something as efficiently as assembly language and then
rule out asm() as an option.  asm() is provided specifically to allow you
to do such things.
--
David Gudeman
Department of Computer Science
The University of Arizona		gudeman@cs.arizona.edu
Tucson, AZ 85721			noao!arizona!gudeman
mjs@hpfcso.HP.COM (Marc Sabatella) (07/10/90)
>Anyways, almost every comment I've seen on compilers vs hand
>optimization has been on a routine by routine case - and I still believe
>(though I could be wrong) that the optimal results from assembly code
>come from register usage across/through routine calls...

This has been my experience as well.  I once did a little study where I
went the other way - I took an assembly program written by someone else
and proceeded to translate it into the "natural" C equivalent (i.e., I
turned lots of spaghetti code into something resembling loop structures).
This was on an 8086, using Turbo C.

The C code was much worse.  Using the original assembly buffered I/O
routines rather than C's stdio helped a lot, although merely using setbuf
to enlarge my buffer size would have helped nearly as much.

Looking at the difference between the original and the generated code, it
seemed pretty obvious where the remaining opportunities were.  Arranging
to pass some parameters in registers (by using Turbo C's register
pseudo-variables), and keeping some globals in registers (by diddling the
assembly output), got much of the original performance back.  I also
diddled the assembly code to perform some trivial call sequence
optimizations like generalized tail recursion elimination.

Finally, I started profiling, and by replacing the top few CPU hogs with
their original versions (perhaps not coincidentally, these were the ones
with the most convoluted flow graphs - presumably the assembly programmer
had spent a lot of time tuning them), I was just about back where I
started.  I think most of the remaining difference could be attributed to
the assembly language programmer's ability to use the more arcane
bit-twiddling instructions of the 8086; on a more RISCy machine, this
advantage would probably be reduced.