[comp.lang.fortran] Highly Optimizable Subset of C

shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) (11/25/90)

In article <1990Nov23.181209.26366@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1990Nov22.051446.1871@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>> 3) There is a large body of well tested mathematical packages available
>>    for FORTRAN, that are not yet available in C.  For example the
>>    IMSL package.  However, this situation is improving for C.
>
>As others have mentioned, given f2c, this is a non-issue.  They are all
>available in C now.  (Sometimes they run faster that way, too...!)

And sometimes they run a little bit slower, but they seem to run at
*approximately* the same speed.  This raises the following questions.

The difficulty of optimizing C comes from C features (pointers) absent
in Fortran.  It has been observed that C programs translated from Fortran
using f2c run about as fast as the Fortran versions, which seems to imply that
(1) such translations do not use the problematic C features, and (2) if
the problematic C features are avoided, C compilers optimize about as well
as Fortran compilers;  in fact, much of the optimization goes on at the 
intermediate code level, doesn't it?

Now, many proposals have been made to improve C optimization:  the use
of "noalias", #pragmas, and so on.  But the above observations would seem to
imply that if the programmer simply restricts him/herself to a Fortran-like
"highly optimizable subset" of C, then he/she can expect Fortran-like
performance out of any reasonably good C compiler.
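
As a rough illustration only, one guess at what such Fortran-like C might
look like is sketched below.  The function name matvec and the row-major
flat-array layout are assumptions made for the example, not anything
prescribed by f2c or by a particular compiler.  Sizes are passed by value,
arrays are accessed only through subscripts in counted loops, and the running
sum lives in a local variable that no pointer can refer to.  The array
arguments themselves may still overlap, and this style does nothing about
that.

```c
/* y = A*x for an n-by-n matrix stored row-major in a flat array.
   All names here are illustrative. */
void matvec(int n, const double a[], const double x[], double y[])
{
    int i, j;

    for (i = 0; i < n; i++) {
        double sum = 0.0;              /* local accumulator: nothing can alias it */
        for (j = 0; j < n; j++)
            sum += a[i * n + j] * x[j];
        y[i] = sum;                    /* a single store per row */
    }
}
```

How much a given compiler gains from this style is exactly what would have
to be measured.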

Now the questions are:  
	(1) How true is this?  
	(2) Just what is this highly optimizable subset of C? 
	(3) Whose compilers do best at this?  
Someone who could write guidelines for (2) and perform measurements of (3) 
would be performing a great service to the community.

Just as programmers writing Fortran for vector machines have learned how
to write code so as to optimize automatic vectorization by vectorizing
Fortran compilers, so, similarly, programmers writing C for numerical
applications can learn to write code so as to make it easy for a C compiler 
to optimize it.  Now, some might ask, "Why bother?"  I.e., "If you're going to 
restrict yourself to a Fortran-like subset of C, why not just use Fortran?"
The answer, of course, is that only the numerical part of the code -- and 
most likely only a portion of the numerical part -- need be coded in 
this manner.  The rest can take full advantage of C's extra features.  And 
one need not be concerned with the non-portability of Fortran calls
from C routines, and vice-versa.

	-P.
************************f*u*cn*rd*ths*u*cn*gt*a*gd*jb**************************
Peter S. Shenkin, Department of Chemistry, Barnard College, New York, NY  10027
(212)854-1418  shenkin@cunixf.cc.columbia.edu(Internet)  shenkin@cunixf(Bitnet)
***"In scenic New York... where the third world is only a subway ride away."***

gwyn@smoke.brl.mil (Doug Gwyn) (11/25/90)

In article <1990Nov24.201731.3442@cunixf.cc.columbia.edu> shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
>... the above observations would seem to imply that if the programmer
>simply restricts him/herself to a Fortran-like "highly optimizable subset"
>of C, then he/she can expect Fortran-like performance out of any reasonably
>good C compiler.

It doesn't matter whether that is true or not; such crippled programming
would negate much of the advantage of using C in the first place.  Use
the right tool for the job and stop worrying about code optimization!

henry@zoo.toronto.edu (Henry Spencer) (11/25/90)

In article <1990Nov24.201731.3442@cunixf.cc.columbia.edu> shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
>The difficulty of optimizing C comes from C features (pointers) absent
>in Fortran.  It has been observed that C programs translated from Fortran
>using f2c run about as fast as the Fortran versions, which seems to imply that
>(1) such translations do not use the problematic C features, and (2) if
>the problematic C features are avoided, C compilers optimize about as well
>as Fortran compilers...

Actually, I think it is more a reflection of the low quality of the compilers
most of us use.  I don't think f2c makes any attempt to avoid trouble, given
that things like passing pointers to arrays are among the problematic areas,
and this is everywhere in any array-using program.  A really good Fortran
compiler, told to shoot for the Moon on optimization, should consistently
outdo a similar C compiler working from an f2c translation.
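
A small sketch of why pointer-to-array arguments are troublesome (the
function add_one is made up for the illustration): in C the two arrays may
legally overlap, and when they do, each iteration depends on the one before
it.  Fortran 77 lets the compiler assume that a dummy array being assigned to
does not overlap another dummy argument; a C compiler has to prove it or
give up on reordering the loop.

```c
/* Adds 1.0 to each of the n elements of src, storing the results in dst.
   Nothing in the language says dst and src are distinct arrays. */
void add_one(int n, double *dst, const double *src)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] = src[i] + 1.0;   /* may read a value stored one iteration ago */
}
```

Called on two separate arrays of zeros, this fills dst with 1.0.  Called as
add_one(4, a + 1, a) on a five-element array of zeros, it leaves a holding
0, 1, 2, 3, 4, because every load sees the previous store.  Any reordering or
vectorization of the loop has to preserve that second behaviour too.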
-- 
"I'm not sure it's possible            | Henry Spencer at U of Toronto Zoology
to explain how X works."               |  henry@zoo.toronto.edu   utzoo!henry

john@ghostwheel.unm.edu (John Prentice) (11/25/90)

Newsgroups: comp.lang.fortran
Subject: Re: Highly Optimizable Subset of C (was: Fortran vs. C for numerical work)
References: <1990Nov22.051446.1871@ccu.umanitoba.ca> <1990Nov23.181209.26366@zoo.toronto.edu> <1990Nov24.201731.3442@cunixf.cc.columbia.edu> <14568@smoke.brl.mil>
Organization: Amparo Corporation, Albuquerque, NM

In article <14568@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>
>It doesn't matter whether that is true or not; such crippled programming
>would negate much of the advantage of using C in the first place.  Use
>the right tool for the job and stop worrying about code optimization!

That is all well and good if your code only takes seconds to run.  Try
ignoring optimization in a code that runs hundreds of hours!

Here is a question for all the C people out there.  The following is a
simple Fortran code to compute the roots of a complex quadratic equation.
It is written in ANSI Fortran 77 and will run on any Fortran compiler that
conforms to the Fortran 77 standard.

      program root
c
c        solve a*z**2 + b*z + c = 0 for complex a,b,c,z
c
      real zero,one,two,three,four,five
      parameter (zero=0.0,one=1.0,two=2.0,three=3.0,four=4.0,
     *           five=5.0)
      complex a,b,c,root1,root2,disc,sqrt1
c
c        hard-wire in a,b,c to make it simple
c
      a=(one,zero)
      b=(-three,two)
      c=(five,-one)
c
c        calculate the discriminant
c
      disc=b**2-cmplx(four)*a*c
c
c        calculate the principal square root (real part >= 0)
c        of the discriminant
c
      sqrt1=sqrt(disc)
c
c         now calculate the roots
c
      root1=(-b+sqrt1)/(cmplx(two)*a)
      root2=(-b-sqrt1)/(cmplx(two)*a)
c
c        print out result
c
      write (*,'('' the roots are: '',1p2e15.5/16x,1p2e15.5)') root1,
     *                                                         root2
c
      end

I now challenge C programmers to write an equivalent C code, using only
ANSI C features so that it will run using any ANSI C compiler.  I
am willing to bet that the Fortran code is much smaller and simpler.
I could easily have made this code even simpler by eliminating all
the cmplx() function calls which are not actually necessary.  Now,
this is a typical, if somewhat trivial, example of what people use
Fortran for.  So what is missing here that C provides me?  This is
exactly my point: Fortran works just fine for most things scientists
do, at least as far as numerical computation goes.  It isn't until
you stray away from numerical computation that C becomes useful
(bit manipulation, for example, is much better done in C than
Fortran).  In fact, I still would maintain that Fortran is an
easier language to learn and use for numerical computation, as my
example is intended to demonstrate.

John Prentice
Amparo Corporation
Albuquerque, NM

john@unmfys.unm.edu

paco@letaba.rice.edu (Paul Havlak) (11/26/90)

In article <1990Nov24.201731.3442@cunixf.cc.columbia.edu>,
shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
|> 
|> The difficulty of optimizing C comes from C features (pointers) absent
|> in Fortran.  It has been observed that C programs translated from Fortran
|> using f2c run about as fast as the Fortran versions, which seems to imply that
|> (1) such translations do not use the problematic C features, and (2) if
|> the problematic C features are avoided, C compilers optimize about as well
|> as Fortran compilers;  in fact, much of the optimization goes on at the 
|> intermediate code level, doesn't it?
|> 
|> Now, many proposals have been made to improve C optimization:  the use
|> of "noalias", #pragmas, and so on.  But the above observations would seem to
|> imply that if the programmer simply restricts him/herself to a Fortran-like
|> "highly optimizable subset" of C, then he/she can expect Fortran-like
|> performance out of any reasonably good C compiler.
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I don't think this is true; it may seem to be true only because of shared
components between C and Fortran compilers from the same vendors.

On many systems where the C and Fortran compilers produce comparably good 
code (for programs written in the same style in both languages), they are
essentially the same compiler.  Either:

    1.  both have poor analysis and great peephole optimizations
        (the Fortran compiler has been adapted from the C compiler), or

    2.  both have great analysis and loop-level optimization
        (the C compiler has been adapted from the Fortran compiler,
         or they have been developed together).

(1) is the case for most "scalar" Unix systems.  (2) is the case for most 
vector and parallel Unix systems (Convex and Stardent, at least).  In the 
case of (2), much of the optimization is done at the source-level, or else
in an intermediate language that still admits loops and array subscripting.
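
To make "loops and array subscripting" concrete, here is the same dot product
written two ways, as a sketch with made-up function names.  The first form
hands the optimizer an explicit trip count and subscript pattern; the second
computes the same thing, but the loop structure has to be recovered from the
pointer arithmetic before any loop-level analysis can start.  Whether a
particular compiler treats the two equally is not something this example
settles.

```c
/* Subscripted, counted loop: trip count and access pattern are explicit. */
double dot_subscript(int n, const double x[], const double y[])
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Pointer-walking loop: same arithmetic, but the induction variable and
   trip count are implicit in the pointer comparisons and increments. */
double dot_pointer(int n, const double *x, const double *y)
{
    const double *end = x + n;
    double sum = 0.0;

    while (x < end)
        sum += *x++ * *y++;
    return sum;
}
```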

Since the same compiler technology can be, should be, and often is applied
to the intersection of Fortran and C, I think the issue of which compilers 
are better is moot.

Peter's second question is a good topic for further investigation:

|>	(2) Just what is this highly optimizable subset of C? 

Hopefully, it includes some (but surely not all) elements of (C - Fortran). 
Compiler researchers (like me) are trying to enlarge the optimizable subset,
but it would be interesting to learn what current commercial compilers can
deal with.

--Paul