[comp.lang.c] FORTRAN to C converter wanted. Also tweaking code.

braner@batcomputer.tn.cornell.edu (Moshe Braner) (10/03/88)

[Disclaimer: I have no affection for FORTRAN, but I have stubborn
customers to satisfy :-)]

Does anybody know about software to automatically convert FORTRAN code
to C?  If it turns out good C, all the better, but if it makes ugly C
code but does the conversion fast, one could think of it as a preprocessor,
part of the compiler.  The code is (of course) scientific, numerical.

I have done a fair amount of manual FORTRAN-->C translation of scientific
subroutines.  It is generally easy, with one caveat:

FORTRAN uses array indexing starting at 1, C normally starts at 0.  There
are various ways to get around that, but all easily lead to confusion and
bugs, especially when one deals with arrays of more than one dimension.
The best way is probably to shift to 0-based arrays everywhere, but in
numerical algorithms there are always hidden statements such as "a[2] = ..."
or even "for (i=n; i>3; i--) a[i] = ...".  This difficulty, I believe,
has prompted the authors of "Numerical Recipes in C" to stick to the
FORTRAN indexing, through various, somewhat awkward tricks.  They also went
the route of malloc()ing local arrays, a practice that has a severe
performance penalty if used in library procedures that are called often
and don't do very much.

I have frequently tweaked code written by scientists, resulting
in speedup factors of 5 or better.  Besides pointer arithmetic tricks
(not always ugly: e.g., using pointers to rows, as in "p=&a[i][0];
... p[j]=..."), the standard things to do include:
Taking things out of loops if they don't need to be there.
Using int ops whenever possible instead of FP.
Using FP types in FP expressions rather than forcing int-->float
    conversions each time around the loop.
Using shifts for integer division by powers of 2 (safe for
    non-negative values).
Using masking instead of modulo ops (n&0x07 is the same as n%8 for
    non-negative n, but faster).
In C, using double instead of float (grrrrrr!), since K&R compilers
    convert float to double in expressions anyway.
Using temp variables to avoid recalculations, especially to avoid
    function calls.  Saving a lot of info in memory, even large arrays,
    frequently allows significant speedups if the algorithm is well
    thought through.  Unfortunately, many programmers still think in
    terms of reducing memory use even when they have a lot more RAM than
    they need.
Speeding up I/O by using unformatted (binary) files, through the
    lowest-level OS calls available, to read and write large chunks
    of data at a time.  This can do wonders.

Some of these things are redundant or ineffectual on some compilers
or machines, but they never hurt...

And yes, the 90/10 rule DOES frequently hold!

- Moshe Braner

Cornell Theory Center, 265 Olin Hall,
Cornell University, Ithaca, NY 14853
(607) 255-9401	(Work)
<braner@tcgould.tn.cornell.edu>		(INTERNET)
<braner@crnlthry> or <braner@crnlcam>	(BITNET)

--------------------------------
Why use AL if you can do it through a shell script?  It's 2000 times faster?
Well, just wait 10 years and the CPU will be that much faster... :-)

wgh@Grumpy.UUCP (William G. Hutchison) (10/05/88)

In article <6441@batcomputer.tn.cornell.edu>, braner@batcomputer.tn.cornell.edu (Moshe Braner) writes:
> Does anybody know about software to automatically convert FORTRAN code
> to C?  If it turns out good C, all the better, but if it makes ugly C
> code but does the conversion fast, one could think of it as a preprocessor,
> part of the compiler.  The code is (of course) scientific, numerical.

 Your timing is good: I have been collecting info about commercial
FORTRAN->C converters for our portation centers.  I am planning to post soon,
and I will do it sooner if there is interest. (I know about 3 now:
Rapitech, COBALT BLUE, and PROMULA.FORTRAN).

 How about some lateral thinking:
  some of the big hassles of translating FORTRAN->C are
   (1) origin-1 indexing
   (2) subroutine argument pass-by-reference
   (3) variables declared by default in-line (initial ijklmn) (this is easily
       solved by a 2-pass conversion, but why do all that work?)
   (4) equivalencing of variables
        ...
Items (1), (2), and (3) can be solved more conveniently in C++ than in C:
   (1) by declaring your own array class (like class Vec in Stroustrup's
       C++ book)
   (2) by using C++ pass-by-reference
   (3) by making use of the fact that declarations may be sprinkled 
       through the executable part of the C++ program
   (4) equivalence is still a nuisance, but who said there was an ultimate
       programming language, anyway?

-- 
Bill Hutchison, DP Consultant	rutgers!cbmvax!burdvax!Grumpy!wgh
Unisys UNIX Portation Center	"What one fool can do, another can!"
P.O. Box 500, M.S. B121		Ancient Simian Proverb, quoted by
Blue Bell, PA 19424		Sylvanus P. Thompson, in _Calculus Made Easy_

henry@utzoo.uucp (Henry Spencer) (10/09/88)

In article <372@Grumpy.UUCP> wgh@Grumpy.UUCP (William G. Hutchison) writes:
>  some of the big hassles of translating FORTRAN->C are
>   (1) origin-1 indexing
>   (2) subroutine argument pass-by-reference
>   (3) variables declared by default in-line (initial ijklmn) (this is easily
>       solved by a 2-pass conversion, but why do all that work?)
>   (4) equivalencing of variables

I studied the problem at one point and concluded that all of these things
fell under the "nuisance" heading, not the "big problem" heading.  (One
caution:  this was a fairly brief study.)  The big hassles in translating
Fortran to C that I found were the complex, baroque formatted I/O and the
long list of non-trivial library functions that have to be provided.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu