[comp.lang.fortran] Fortran vs. C for numerical work

bglenden@mandrill.cv.nrao.edu (Brian Glendenning) (11/21/90)

It is often stated that Fortran is better than C for numerical work. I
am interested in compiling both the standard list of such arguments
and what any counterarguments might be.

Please email me directly. I will post a summary to the net when the
replies have stopped trickling in.

(I am interested in things like pointer aliasing, optimizer
assumptions, inlining ability etc., not language wars).

Thank you

Brian
--
       Brian Glendenning - National Radio Astronomy Observatory
bglenden@nrao.edu          bglenden@nrao.bitnet          (804) 296-0286

ghe@comphy.physics.orst.edu (Guangliang He) (11/22/90)

In article <BGLENDEN.90Nov21003342@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>
>It is often stated that Fortran is better than C for numerical work.
>[Some deleted text here]

It may not be true any more. A friend of mine brought in a little Fortran
program (two big DO loops with some intrinsic function calculations in
the loop) and a C translation of it. We compiled both programs on an
IBM RISC System 6000/530 with xlc and xlf. To my surprise, the
executable from C is faster than the executable from Fortran by a few percent.

>Brian
>--
>       Brian Glendenning - National Radio Astronomy Observatory
>bglenden@nrao.edu          bglenden@nrao.bitnet          (804) 296-0286

Guangliang He
ghe@physics.orst.edu

paco@rice.edu (Paul Havlak) (11/22/90)

In article <21884@orstcs.CS.ORST.EDU>, ghe@comphy.physics.orst.edu
(Guangliang He) writes:
|> In article <BGLENDEN.90Nov21003342@mandrill.cv.nrao.edu>
bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
|> >
|> >It is often stated that Fortran is better than C for numerical work.
|> >[Some deleted text here]
|> 
|> It may not be true any more. A friend of mine brought in a little Fortran
|> program (two big DO loops with some intrinsic function calculations in
|> the loop) and a C translation of it. We compiled both programs on an
|> IBM RISC System 6000/530 with xlc and xlf. To my surprise, the
|> executable from C is faster than the executable from Fortran by a few percent.

Presumably the Fortran-to-C translation preserved the array structure and 
indexing found in the original Fortran program.  A good compiler can optimize
Fortran, no matter what language it's written in.

But watch out if you use C in its full generality.  All but the simplest
pointers will confuse a compiler and reduce its ability to optimize.
Heap-allocated dynamic data structures will reduce data locality and
increase page faults.
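
For example, in a sketch like the following (the names are illustrative),
a C compiler must assume that each store through out can change what in
points at, so it cannot safely keep in[] values in registers or vectorize
the loop:

/* Hypothetical sketch: out and in may alias, so a store to
 * out[i] could modify an element that in[] will read later. */
void scale(double *out, double *in, double k, int n)
{
	int i;
	for (i = 0; i < n; i++)
		out[i] = k * in[i];
}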

To paraphrase Jack Schwartz:
"We don't know what the numerical programming language of the year 2000 will
be called, but it will look like Fortran."  (Well, at least the loops will.)

-----------
Paul Havlak
These are the opinions of a single grad student, 
working on compiler analysis of scientific programs.

salomon@ccu.umanitoba.ca (Dan Salomon) (11/22/90)

In article <BGLENDEN.90Nov21003342@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>
>It is often stated that Fortran is better than C for numerical work. I
>am interested in compiling both the standard list of such arguments
>and what any counterarguments might be.

Here are the reasons that FORTRAN has not been replaced by C:

 1) C is definitely for wizards, not beginners or casual programmers.
    Usually people who are heavily into numerical work are not hacker
    types.  They are mathematicians, scientists, or engineers.  They
    want to do calculations, not tricky pointer manipulations.
    FORTRAN's constructs are more obvious to use, while even simple
    programs in C tend to be filled with tricks.  Even the fundamental
    operation of reading input is tricky in C, as shown by the recent
    postings on scanf, gets, and fgets.

 2) FORTRAN is dangerous to use, but not as dangerous as C.  For
    instance, most FORTRAN compilers have subscript checking as an
    option, while I have never encountered a C compiler with this
    feature.  The ANSI standard for function prototypes will
    give C an edge over FORTRAN in catching parameter mismatch errors,
    but that improvement is relatively recent and not yet enforced.

 3) There is a large body of well-tested mathematical packages available
    for FORTRAN that are not yet available in C.  For example, the
    IMSL package.  However, this situation is improving for C.

 4) FORTRAN still gives the option of using single-precision floating-point
    calculations for speed and space optimizations, whereas C forces
    some calculations into double precision.

 5) Optimizers are a non-issue, since FORTRAN optimizers can match
    C optimizers on numerical expressions.


The reasons that C should replace FORTRAN for numerical work:

 1) C allows recursive functions, whereas portable FORTRAN doesn't.
    Recursive functions can often solve a problem more clearly
    than iterative methods, even if they are usually less efficient.

 2) FORTRAN has no dynamic array allocation.  Although C has dynamically
    allocated arrays, they are not trivial to describe or allocate.
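
    For illustration, the usual C idiom looks something like this (a
    minimal sketch; error handling and deallocation omitted):

	#include <stdlib.h>

	/* Sketch: allocate an m x n array of double as an array of
	 * row pointers, so elements can be written a[i][j]. */
	double **alloc2d(int m, int n)
	{
	    int i;
	    double **a = malloc(m * sizeof *a);
	    if (a == NULL)
		return NULL;
	    for (i = 0; i < m; i++)
		a[i] = malloc(n * sizeof **a);  /* one row at a time */
	    return a;
	}
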
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

rosenkra@convex.com (William Rosencranz) (11/22/90)

In article <21884@orstcs.CS.ORST.EDU> ghe@comphy.PHYSICS.ORST.EDU.UUCP (Guangliang He) writes:
>It may not be true any more. A friend of mine brought in a little Fortran
>program (two big DO loops with some intrinsic function calculations in
>the loop) and a C translation of it. We compiled both programs on an
>IBM RISC System 6000/530 with xlc and xlf. To my surprise, the
>executable from C is faster than the executable from Fortran by a few percent.

this says nothing about the *language*, only the *compilers*. actually, it
may not be that surprising when you consider that the 6000 runs unix and unix
needs a good C compiler. IBM may have spent more time on the C compiler than
on the fortran compiler, figuring that more people may use C on the box than
fortran. believe it or not, i have seen similar behavior on crays (cft77 vs
scc [actually vc at the time], though also by only a few percent).

this does say that this particular code seems well suited to C on the 6000
today. it implies that C is not a bad language for numerical work, if
performance is the criterion.

-bill
rosenkra@convex.com
--
Bill Rosenkranz            |UUCP: {uunet,texsun}!convex!c1yankee!rosenkra
Convex Computer Corp.      |ARPA: rosenkra%c1yankee@convex.com

avery@netcom.UUCP (Avery Colter) (11/22/90)

paco@rice.edu (Paul Havlak) writes:

>To paraphrase Jack Schwartz:
>"We don't know what the numerical programming language of the year 2000 will
>be called, but it will look like Fortran."  (Well, at least the loops will.)

Well sure, from what I can tell, the primary structures of C, Pascal,
and Fortran look pretty similar. After learning Fortran, C is coming
along pretty naturally. The pointers and structures and unions are
interesting new things, kinda like "Fortran with some bells and whistles".

-- 
Avery Ray Colter    {apple|claris}!netcom!avery  {decwrl|mips|sgi}!btr!elfcat
(415) 839-4567   "I feel love has got to come on and I want it:
                  Something big and lovely!"         - The B-52s, "Channel Z"

wvenable@spam.ua.oz.au (Bill Venables) (11/22/90)

paco@rice.edu (Paul Havlak) writes:
>   To paraphrase Jack Schwartz: 
>   "We don't know what the numerical programming language of the year 2000
>   will be called, but it will look like Fortran."
>

Actually this is an inversion rather than a paraphrase.  I recall it being
exactly the other way round:

 "We don't know what the numerical programming language of the year 2000
  will look like, but it will be called Fortran."

which seems all too distressingly plausible!

(Take that any way you like... :-)
--
  Bill Venables, Dept. of Statistics,  | Email: venables@spam.adelaide.edu.au
  Univ. of Adelaide,  South Australia. | Phone:                +61 8 228 5412

john@ghostwheel.unm.edu (John Prentice) (11/23/90)

In article <1990Nov22.051446.1871@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>
>The reasons that C should replace FORTRAN for numerical work:
>
> 1) C allows recursive functions, whereas portable FORTRAN doesn't.
>    Recursive functions can often solve a problem more clearly
>    than iterative methods, even if they are usually less efficient.
>
> 2) FORTRAN has no dynamic array allocation.  Although C has dynamically
>    allocated arrays, they are not trivial to describe or allocate.
>-- 
>
It should be mentioned, however, that the proposed Fortran 90 standard
does have allocatable arrays, and most current-generation Fortran compilers
either already allow for this or can be trivially linked to C to do it.
There are also Fortran compilers that support recursion available now, and
(if I remember right) recursion is a feature of Fortran 90, should we live
so long as to actually see the standard adopted.

John Prentice
Amparo Corporation

john@ghostwheel.unm.edu (John Prentice) (11/23/90)

Another interesting point is that in studies done at Cray Research, they
found it took SIGNIFICANTLY longer for their programmers to learn C and the
number of errors generated in coding in C (as opposed to Fortran) was much
higher.  Anyone who has programmed in C should be familiar with that problem.
It is not a particularly straightforward language.  

I would also raise the point that neither Fortran nor C is really all that
great as a scientific language.  They are both old languages which lack a lot
of the features one would like in a modern language, particularly in a
world where the future looks increasingly to be in parallelism.  I laughingly
agree that the scientific language of the future will be "called Fortran", but
I don't know that I necessarily believe it.  There is a whole generation
of programmers (and scientists) coming on line who don't particularly
pledge allegiance to Fortran.  Also, the traditional argument for never
throwing anything away in Fortran (i.e., there are billions of dollars
worth of old Fortran code around, which I admit is true) will, I expect,
cease to be that significant in the future as we move away from serial
machines and as we concede that there is a finite lifetime to codes, even
ones written in Fortran.  This, by the way, is written from the perspective of
a computational physicist who has authored two hydrodynamic codes, both of
which are on the order of 100,000 lines of code.

John Prentice
Amparo Corporation
Albuquerque, NM

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/23/90)

In article <1990Nov22.051446.1871@ccu.umanitoba.ca>, salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>     The ANSI standard for function prototypes will
>     give C an edge over FORTRAN in catching parameter mismatch errors,
>     but that improvement is relatively recent and not yet enforced.

There are several checkers around for Fortran:  several Fortran "lint"
programs (a perennial topic in this newsgroup), PFORT, and something in
ToolPack.

>  3) There is a large body of well-tested mathematical packages available
>     for FORTRAN that are not yet available in C.

Given the existence of f2c, any math package available _in_ Fortran is
effectively available in C, and on UNIX and VMS at least, it isn't hard
to call from C anything that could have been called from Fortran.

>  4) FORTRAN still gives the option of using single-precision floating-point
>     calculations for speed and space optimizations, whereas C forces
>     some calculations into double precision.

This is not true of ANSI C, and many vendors provided something like
Sun's "-fsingle" as an option for years before that.  It is also worth
noting that on a number of machines, single-precision calculations are
not faster than double precision.
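
For instance (a small sketch), an ANSI compiler may evaluate this entirely
in single precision, where a K&R compiler would promote both operands to
double:

float scale(float x, float y)
{
	return x * y;	/* may be a single-precision multiply under ANSI C */
}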

>  1) C allows recursive functions, whereas portable FORTRAN doesn't.
>     Recursive functions can often solve a problem more clearly
>     than iterative methods, even if they are usually less efficient.

Solved in Fortran Extended.

>  2) FORTRAN has no dynamic array allocation.  Although C has dynamically
>     allocated arrays, they are not trivial to describe or allocate.

Solved in Fortran Extended.  Some vendors have provided pointers of some
sort for several years, and it is easy to fake on some systems.
-- 
I am not now and never have been a member of Mensa.		-- Ariadne.

gwyn@smoke.brl.mil (Doug Gwyn) (11/24/90)

In article <1990Nov22.051446.1871@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>The reasons that C should replace FORTRAN for numerical work:

3)  C has decent support for nontrivial data structures, while they are
sufficiently painful to emulate in Fortran that few Fortran programmers
even try.

Most really interesting algorithms are associated with interesting data
structures.
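
For example (a hypothetical sketch), a linked node for a sparse-matrix
entry is one declaration in C, but takes several parallel arrays and
hand-managed indices to emulate in Fortran 77:

/* Sketch: one entry in a linked sparse-matrix representation. */
struct entry {
	int row, col;
	double value;
	struct entry *next_in_row;
};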

henry@zoo.toronto.edu (Henry Spencer) (11/24/90)

In article <1990Nov22.051446.1871@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>    ...Even the fundamental
>    operation of reading input is tricky in C, as shown by the recent
>    postings on scanf, gets, and fgets.

Actually, Fortran has much the same problems in this area:  the facilities
for formatted input make little provision for clean error recovery.  This
doesn't show up very much because the stereotypical use of Fortran is for
batch jobs, not interaction.

> 2) FORTRAN is dangerous to use, but not as dangerous as C.  For
>    instance, most FORTRAN compilers have subscript checking as an
>    option, while I have never encountered a C compiler with this
>    feature.  The ANSI standard for function prototypes will
>    give C an edge over FORTRAN in catching parameter mismatch errors,
>    but that improvement is relatively recent and not yet enforced.

One might ask what compilers you are using.  C compilers have trouble doing
subscript checking because of the complexity of C pointers, but debugging
compilers/interpreters which do this checking *do* exist.  And there are
already many C compilers which implement prototypes.

> 3) There is a large body of well-tested mathematical packages available
>    for FORTRAN that are not yet available in C.  For example, the
>    IMSL package.  However, this situation is improving for C.

As others have mentioned, given f2c, this is a non-issue.  They are all
available in C now.  (Sometimes they run faster that way, too...!)

> 4) FORTRAN still gives the option of using single-precision floating-point
>    calculations for speed and space optimizations, whereas C forces
>    some calculations into double precision.

Not any more.
-- 
"I'm not sure it's possible            | Henry Spencer at U of Toronto Zoology
to explain how X works."               |  henry@zoo.toronto.edu   utzoo!henry

rh@smds.UUCP (Richard Harter) (11/24/90)

In article <21884@orstcs.CS.ORST.EDU>, ghe@comphy.physics.orst.edu (Guangliang He) writes:
> In article <BGLENDEN.90Nov21003342@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:

> It may not be true any more. A friend of mine brought in a little Fortran
> program (two big DO loops with some intrinsic function calculations in
> the loop) and a C translation of it. We compiled both programs on an
> IBM RISC System 6000/530 with xlc and xlf. To my surprise, the
> executable from C is faster than the executable from Fortran by a few percent.

This probably has nothing to do with the merits of C versus Fortran and
has everything to do with the merits of the compilers involved.  In the
UNIX world C compilers are often optimized to a gnat's posterior, whereas
Fortran compilers are often relatively primitive.  The converse is true
in environments where Fortran is big and C is just another minor language.

Fundamentally, Fortran compilers can be faster because the Fortran language
specification forbids aliasing (but makes the user responsible for
ensuring that it is not present), whereas C has to deal with it.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

steve@taumet.com (Stephen Clamage) (11/25/90)

ghe@comphy.physics.orst.edu (Guangliang He) writes:

>In article <BGLENDEN.90Nov21003342@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>>
>>It is often stated that Fortran is better than C for numerical work...

>It may not be true any more. A friend of mine brought in a little Fortran
>program ...
>the executable from C is faster than the executable from Fortran by a few percent

What we have here is an example of one program compiled by one FORTRAN
compiler and a translation of that program compiled by one C compiler.
Comparison of the execution speeds of the two programs on one machine
cannot lead to any valid conclusions about the relative utility of the
two languages for numerical work.

FORTRAN has a large body of standard libraries for numerical work whose
operation and reliability have been well tested for many years.  The same
cannot be said for C, although some multi-language environments allow
such FORTRAN libraries to be called from C programs.

The utility of a language must be judged by more criteria than just
execution speed of one sample program.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/26/90)

On Nov 25 18:55:29 EST 1990 in Article <34828>, (Dan Salomon)
<salomon@ccu.umanitoba.ca> writes:

>Here are the reasons that FORTRAN has not been replaced by C:
>
> 1) C is definitely for wizards, not beginners or casual
>    programmers.
>    Usually people who are heavily into numerical work are not
>    hacker
>    types.  They are mathematicians, scientists, or engineers.
  
I agree! The group described happens to represent most of the
clients I deal with in Radar and Antenna applications. As in the
original derivation of the name, these people are FOR-mula TRAN-
slators! 
>
>..(text deleted)...
>
>The reasons that C should replace FORTRAN for numerical work:
>
>..(text deleted)...
>
  3) FORTRAN ANSI standards take entirely too long to pass through
committees, and there is little or no effort made to 'purge' old-style
WATFOR/WATFIV methods from 'current' teaching texts! I wouldn't
be surprised to see references to 'card decks' in 'current' FORTRAN
books into the year 3000 (programmers live forever!)...

Sorry about the digression!
-- 
John L. Bradberry        |Georgia Tech Research Inst|uucp:..!prism!gt4512c
Scientific Concepts Inc. |Microwaves and Antenna Lab|Int : gt4512c@prism
2359 Windy Hill Rd. 201-J|404 528-5325 (GTRI)       |GTRI:jbrad@msd.gatech.
Marietta, Ga. 30067      |404 438-4181 (SCI)        |'...is this thing on..?'   

kjartan@zig.inria.fr (Kjartan Emilson) (11/26/90)

This discussion seems to pop up fairly regularly on this
group.  The reason seems to be that some people in science are
suddenly confronted with the reality of workstations, based mainly
on UNIX and as such heavily biased towards the C language.  They ask:
What is this?  I have programmed in fortran, just like my professor did,
just like his/her professor did, etc., etc.  Why should I switch to
another language?

Then there are other people, with a more technical background, who come
up with the following arguments:  1. Fortran has been around longer than
C, so fortran compilers are more efficient.  2. Most of the scientific
libraries in circulation are in fortran.

I am working in an institute whose main research areas are dynamical
systems, both from a mathematical point of view and from the point of
view of physics.  The systems studied are mostly derived from fluid
dynamics, and as such are numerically very expensive.

Graduate students here get extensive training in computational
techniques, and recently we decided to convert totally to the use
of C and C++ as our main programming languages.  Our reasons for
this are the following:

1. Computers are getting more and more powerful, and the importance
of speed is decreasing; i.e., today on a Sun Sparcstation or an IBM
RISC System/6000 you can get computational power that some years ago
was only available on larger mainframes, using batch.

This idea of batch jobs, compared to interaction on a workstation, is
very important when we say that the importance of speed is decreasing.
When you run a job in batch on a CRAY-2, you will get your result on
a time scale ranging from a few hours to a few days, even though the
actual job takes only a few minutes to run.  If instead you write a
program which runs *interactively* on a workstation on your desktop,
you actually see what is happening and you can change the parameters
of the system accordingly, immediately.

2. Another reason is the concept of abstraction.  A language such as
C++ (and C, if you are careful) offers the possibility of creating
more abstract data types and operations, which are better adapted to
the theory in which you work.  This makes a bridge between the theorists
and the pure, hard numerics.  There was a time when there was a
clear distinction between those who formulated the problems and those
who actually implemented them on the computers.  This distinction is
very bad and can lead to many kinds of friction.  With the evolution
of computer science, the aim should be that anybody who wants to use
the power of a computer to solve a problem should be able to do so
without being an expert in the byte ordering of a given machine, or
god knows what.

The curriculum we offer our graduate students is the following:

	1. Study of the C language
	2. Study of the X11, Xt and Athena widgets environment
	3. Study of some object-oriented language, such as C++

Some people did not believe that it was possible to teach pure, hard
physicists these programming languages, but the result is quite the
opposite!  More than that, they enjoy it...

In short:  Fortran was a language designed to do calculations in
batch on some large mainframe, and is not adapted to being on your desktop.

The C language was not actually designed to be on your desktop either,
but the evolution has been that it is there, that it works well with
UNIX, and that it works well with X11, the standard in networked
graphics and interactive applications.

Computers are changing, and languages must change also.


		-Kjartan


------------------------------------------------------------------
Kjartan Pierre Emilsson
Institut Non-lineaire de Nice-Sophia Antipolis
Parc Valrose
Nice
France			e-mail: kjartan@zig.inria.fr


hp@vmars.tuwien.ac.at (Peter Holzer) (11/27/90)

paco@rice.edu (Paul Havlak) writes:

>But watch out if you use C in its full generality.  All but the simplest
>pointers will confuse a compiler and reduce its ability to optimize.

Even simple pointers can confuse a compiler on brain-dead
architectures.  I had a little program that did something with
two arrays:

item * a, * b;
int i;
/* init a, b */
for (i = 0; i < N; i ++) {
	a [i] = f (a [i], b [i]);
}

After I optimized it to:

item * a, * b, * pa, * pb;
/* init a, b */
for (pa = a + N, pb = b + N;
     pa >= a; 	/* I know that is not portable, but it works
	     	 * with this compiler if sizeof (item) <= 8
		 */
     pa --, pb --) {
	* pa = f (* pa, * pb);
}

it ran 80% slower, because on this architecture (80286) far
pointers don't fit into registers whereas the integer i does,
and indexing is relatively fast. 

To make things more complicated: On the same computer program 2
is faster than program 1 if both are compiled with near
pointers.

Moral: Write readable code.  Usually the compiler knows
more about the architecture of the target computer than the
programmer (especially if the program has to be compiled on lots
of different computers) and can therefore optimize better.

--
|    _  | Peter J. Holzer                       | Think of it   |
| |_|_) | Technical University Vienna           | as evolution  |
| | |   | Dept. for Real-Time Systems           | in action!    |
| __/   | hp@vmars.tuwien.ac.at                 |     Tony Rand |

gt4512c@prism.gatech.EDU (BRADBERRY,JOHN L) (11/27/90)

In Article 4291 (Kjartan Emilson <kjartan@zig.inria.fr>) writes:

>
>Graduate students here get extensive training in computational
>techniques, and recently we decided to convert totally to the use
>of C and C++ as our main programming languages...

Good for you! It sounds like a good fit for your particular research
area. However, I would question anyone who would adopt exclusive 
usage of ANY form of C for numerical work! If you adopt the philosophy 
that C and FORTRAN represent software TOOLS, C in any form represents a 
'metric' tool not easily accepted (or practical) for all jobs - NO 
LANGUAGE IS!

>
>This idea of batch jobs, compared to interaction on a workstation, is
>very important... even though the actual job takes only a few minutes
>to run.  If instead you write a program which runs *interactively*
>on a workstation on your desktop, you actually see what is happening
>and you can change the parameters of the system accordingly, immediately.
>

Make a change in C 'immediately'? Yes! But are you talking about productive
code-and-debug cycles being done quickly by 'non-computer types'???
Even the most experienced C programmer on any platform might question this.

>
>2. Another reason is the concept of abstraction.  A language such as C++
>(and C, if you are careful) offers the possibility of creating more abstract
>data types and operations, which are better adapted to the theory in which
>you work....
>

Yes! Yes! Yes!...

>Some people did not believe that it was possible to teach pure, hard
>physicists these programming languages, but the result is quite the
>opposite!  More than that, they enjoy it...
>

I love what C and C++ may ALLOW you to do, but enjoy it (Ughhhhhh)...

>In short:  Fortran was a language designed to do calculations in batch
>on some large mainframe, and is not adapted to being on your desktop.
>

Says who??? There are SEVERAL PC-based compilers and tools that are
every bit as good as or better than their mainframe counterparts!

For the most part I agree with some of your points, but I wonder how many
colleges and universities are creating theoretical scientists who also
happen to be C wizards ????
-- 
John L. Bradberry        |Georgia Tech Research Inst|uucp:..!prism!gt4512c
Scientific Concepts Inc. |Microwaves and Antenna Lab|Int : gt4512c@prism
2359 Windy Hill Rd. 201-J|404 528-5325 (GTRI)       |GTRI:jbrad@msd.gatech.
Marietta, Ga. 30067      |404 438-4181 (SCI)        |'...is this thing on..?'   

tang@venus.ycc.yale.edu (11/27/90)

Kjartan wrote:

> 1. Computers are getting more and more powerful, and the importance
> of speed is decreasing; i.e., today on a Sun Sparcstation or an IBM
> RISC System/6000 you can get computational power that some years ago
> was only available on larger mainframes, using batch.
> 
> This idea of batch jobs, compared to interaction on a workstation, is
> very important when we say that the importance of speed is decreasing.
> When you run a job in batch on a CRAY-2, you will get your result on
> a time scale ranging from a few hours to a few days, even though the
> actual job takes only a few minutes to run.  If instead you write a
> program which runs *interactively* on a workstation on your desktop,
> you actually see what is happening and you can change the parameters
> of the system accordingly, immediately.
> 
      Excuse me if my exposure to numerical computation is limited.  It
seems that the demand for speed is never satisfied.  It may be true that
some of yesterday's problems could be solved quite readily on
platforms such as Sparcstations or even PCs.  However, researchers never
cease to ask for more.  Typical examples are higher resolution (more
grid points, etc.) and better physical representations, which inevitably
require more computation.  Those who are devoted to pushing the
frontiers of numerical computation are more willing to spend
effort on the physics than on the language, since a language is
only a tool; it is incapable of yielding new ideas, but physics is a
"living species".  I agree completely that for teaching college
students, interactive programs are "better" than programs which run in
batch mode, but what difference does it make if a program takes a few
hours of run time on a Cray?  You would be bored to death just sitting
in front of a terminal.

> 2. Another reason is the concept of abstraction.  A language such as
> C++ (and C, if you are careful) offers the possibility of creating
> more abstract data types and operations, which are better adapted to
> the theory in which you work.  This makes a bridge between the theorists
> and the pure, hard numerics.  There was a time when there was a
> clear distinction between those who formulated the problems and those
> who actually implemented them on the computers.  This distinction is
> very bad and can lead to many kinds of friction.  With the evolution
> of computer science, the aim should be that anybody who wants to use
> the power of a computer to solve a problem should be able to do so
> without being an expert in the byte ordering of a given machine, or
> god knows what.
> 
      The way I perceive languages is that all languages are
more or less the same.  They differ in the way they alter the thinking
of a programmer.  The reasoning I offer is that a program must be
compiled into machine code, at which point the identity of the original
high-level language is lost.  An array in Fortran or a pointer in C is
translated into machine code.  In fact, I witnessed a case in which a
piece of Fortran code was so well written that it was just as good as
its assembler equivalent, even though it is commonly assumed that code
written in assembler is faster.  The goodness or badness of a
language is therefore compiler dependent.  But I would like to make one
more comment.  As pointed out by other defenders of Fortran, Fortran
does not have as steep an initial learning curve as C, as far as
numerical computation is concerned.  It seems that many C programmers
have suffered so much that they want the rest of the world to share it.

> 
> In short:  Fortran was a language designed to do calculations in
> batch on some large mainframe, and is not adapted to being on your desktop.
> 
> The C language was not actually designed to be on your desktop either,
> but the evolution has been that it is there, that it works well with
> UNIX, and that it works well with X11, the standard in networked
> graphics and interactive applications.
> 
> Computers are changing, and languages must change also.
> 
      It is horrible to imagine that one day in the future UNIX might no
longer be the favored operating system.  What language would C programmers
be using then?

Tang

bglenden@mandrill.cv.nrao.edu (Brian Glendenning) (11/27/90)

A few days ago I posted a note asking:

>It is often stated that Fortran is better than C for numerical work. I
>am interested in compiling both the standard list of such arguments
>and what any counterarguments might be.

Here is my summary of the responses I received. If anyone wants to
read the raw responses please email me and I will be happy to forward
them (160k+!). Many thanks to all the respondents who so generously
answered my query.

1. Pointer aliasing.
     SUBROUTINE FOO(A,B)           void foo(a,b)
     REAL A(*), B(*)               float *a, *b;

The Fortran standard requires that A and B be unaliased. In C a and b may
well be aliased, and there is no portable way to say that they are unaliased.

Compilers on serious vector machines (at least) will have ways of
declaring pointers unaliased. The programmer can make a mistake doing
this, but of course the programmer can also really pass aliased arrays
in Fortran as well. Although I understand that "noalias" is hated by C
purists, I wish that it had made it into the ANSI standard. (Maybe I
just don't understand the arguments against it.)
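
To make the problem concrete, here is a hypothetical call that is legal C
but has no legal Fortran counterpart:

extern void foo(float *a, float *b);

void demo(void)
{
	float x[100];
	foo(x, x + 1);	/* a and b overlap: legal in C, forbidden in Fortran */
}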

2. C has no conformant arrays, i.e. you can't do the equivalent of:

      SUBROUTINE FOO(A, M, N)
      REAL A(M,N)

In C you either have to do your own indexing *(a + j*m +i) or have
pointers to pointers *(*(a + i) + j). You can in either case
use a macro expansion ARRAY(a,i,j) to take some of the sting out of
the syntax.
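
For instance, a sketch of the macro workaround, assuming column-major
storage as in Fortran (the extra m argument for the leading dimension is
illustrative):

#define ARRAY(a, m, i, j)  ((a)[(j) * (m) + (i)])

void foo(float *a, int m, int n)
{
	int i, j;
	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			ARRAY(a, m, i, j) = 0.0f;	/* like A(I,J) = 0.0 */
}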

3. In fortran functions like sin, cos, ** are intrinsic.

I think that ANSI C has a method by which compilers may make sin, cos
etc intrinsic, but I don't remember how it works. Maybe a wizardly
followup could answer this question.

A ** builtin _is_ handy.

4. Fortran has complex variables.

If you need to do a lot of complex arithmetic this might be a show-stopper
unless you have a good source of C complex arithmetic routines. Even then
it is not going to be as convenient as in Fortran.
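
A sketch of what the hand-rolled routines look like (the type and function
names are illustrative); compare Fortran's built-in X*Y on COMPLEX operands:

typedef struct { double re, im; } dcomplex;

dcomplex cmul(dcomplex x, dcomplex y)
{
	dcomplex z;
	z.re = x.re * y.re - x.im * y.im;
	z.im = x.re * y.im + x.im * y.re;
	return z;
}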

5. There are many numerical libraries written for Fortran.

This is likely not a fundamental problem on any modern system
scientific programmers would use (e.g. either use f2c to convert them or
link in the Fortran), but it does impose either some programmer-time
overhead in the translation or make the linking process (at least) a
bit non-portable.

6. C can ignore the placement of parentheses
7. "C has too many system dependent aspects (e.g. round up or down when
   dividing negative integers)."

Both of these need to be understood by a scientific programmer so they
can work around them.

8. C does everything in double.

Not (necessarily) with ANSI C.

======

I will not go into the reasons why C was claimed to be better than
Fortran for numerical work (basically better data typing, control
structures, dynamic memory, etc.).

_MY_ summary-summary is as follows:

I conclude that for scientific programming there are no overwhelming
reasons not to use C unless you do a lot of complex arithmetic.

Personally I don't consider the "pointer aliasing defeats optimizers"
to be too serious. Anyone who cares about speed is going to profile
their code, and at that time it shouldn't be too difficult to tell the
compiler what is not aliased in the "hot spot" routines.

Whether or not the switch to C is worthwhile will depend on whether
the above quirks in C outweigh the benefits of having "more modern"
data typing and control structures. Fortran is probably more portable (*)
and will run faster without tweaking. On the other hand Fortran may be
harder to maintain, and it is a poor fit to algorithms that are best
expressed with types more involved than n-dimensional arrays.

(*) I realize that it is not that hard to write portable C. I think
it's fair to say that it's easier to write portable Fortran for
numeric work, though.

When and if we have Fortran 9x available the story may be different
(but it's getting pretty late in the day for F9x to come riding over
the horizon to save us).

Brian

--
       Brian Glendenning - National Radio Astronomy Observatory
bglenden@nrao.edu          bglenden@nrao.bitnet          (804) 296-0286

olstad@uf.msc.umn.edu (Ken Olstad) (11/27/90)

paco@rice.edu (Paul Havlak) writes:
>   To paraphrase Jack Schwartz: 
>   "We don't know what the numerical programming language of the year 2000
>   will be called, but it will look like Fortran."
>

wvenable@spam.ua.oz.au (Bill Venables) writes:
>
> "We don't know what the numerical programming language of the year 2000
>  will look like, but it will be called Fortran."
>

I've always heard it closer to the latter, but attributed to Seymour
Cray.  Does anybody here really know where this came from?

-Ken
olstad@msc.edu

dik@cwi.nl (Dik T. Winter) (11/27/90)

In article <649.275140d0@venus.ycc.yale.edu> tang@venus.ycc.yale.edu writes:
 > Kjartan wrote that 
 > > you write a program which runs *interactively* on a workstation on
 > > your desktop, you actually see what is happening and you can change
 > > the parameters of the system accordingly, immediately.
 > > 
 >                                               However, researchers never
 > cease to ask for more.  Typical examples are higher resolution (more
 > grid points, etc.) and better physical representations, which inevitably
 > require more computation.
As I see it, Kjartan adequately describes the researcher's viewpoint (for some
suitable definition of researcher) while Tang describes the viewpoint of the
user who is running production programs (like aerospace industries, chemical
industries, etc.).  Kjartan should keep in mind that industry (where his
students will eventually go) has other requirements, and, whether they want
to or not, they have to learn Fortran.  Large-scale computers, which typically
speak mostly Fortran, are sold to aerospace, chemical and automotive industries
(and also financial companies in Japan, so I hear).  There are exceptions of
course, like all the (European) national institutes that deal with CERN, and
many national institutes in the US, that are more in the production field as
far as computing is concerned than in the research field.  Their research is
about the results, much less about the algorithms.  I know that at least for
the Dutch national aerospace research laboratory, doing their programs
interactively on a workstation is out of the question.  Who is prepared to
sit for many days at a workstation when a nearby NEC SX/2 will do the job in
just a few hours?  If the turnaround time on supercomputers makes their use
infeasible, it just means that access to them is inadequate.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

hp@vmars.tuwien.ac.at (Peter Holzer) (11/27/90)

bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:

>A few days ago I posted a note asking:

>>It is often stated that Fortran is better than C for numerical work. I
>>am interested in compiling both the standard list of such arguments
>>and what any counterarguments might be.

>Here is my summary of the responses I received. 

A few remarks to the summary:

>2. C has no conformant arrays, i.e. you can't do the equivalent of:

>      SUBROUTINE FOO(A, M, N)
>      REAL A(M,N)

>In C you either have to do your own indexing *(a + j*m +i) or have
>pointers to pointers *(*(a + i) + j). You can in either case
		      ^^^^^^^^^^^^^^^ You would write that as 
a [i][j] ordinarily, which is a rather nice syntax :-)

>use a macro expansion ARRAY(a,i,j) to take some of the sting out of
>the syntax.

>3. In fortran functions like sin, cos, ** are intrinsic.

>I think that ANSI C has a method by which compilers may make sin, cos
>etc intrinsic, but I don't remember how it works. Maybe a wizardly
>followup could answer this question.

No special method. An ANSI C compiler may treat any function
defined in the standard as intrinsic if the header file that
defines the function has been included.
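
A small sketch of what that permission means in practice:

#include <math.h>

/* Because <math.h> is included, an ANSI compiler is free to expand
 * this sin() call inline (e.g. to a hardware instruction) instead
 * of emitting a library call. */
double f(double x)
{
	return sin(x);
}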

>A ** builtin _is_ handy.

>4. Fortran has complex variables.

>If you need to do a lot of complex arithmetic this might be a show
>stopper unless you have a good source of C complex arithmetic
>routines. Even then it is not going to be as convenient as in Fortran.

Would be nice sometimes. But other special types are also handy
sometimes (e.g. very long integers), and you cannot have
everything. If you want C++ you know where to find it.

>6. C can ignore the placement of parentheses

Not anymore. The standard says that the compiler may regroup
expressions only if it does not change the result.

--
|    _  | Peter J. Holzer                       | Think of it   |
| |_|_) | Technical University Vienna           | as evolution  |
| | |   | Dept. for Real-Time Systems           | in action!    |
| __/   | hp@vmars.tuwien.ac.at                 |     Tony Rand |

henry@zoo.toronto.edu (Henry Spencer) (11/28/90)

In article <BGLENDEN.90Nov26162335@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>3. In fortran functions like sin, cos, ** are intrinsic.
>
>I think that ANSI C has a method by which compilers may make sin, cos
>etc intrinsic, but I don't remember how it works...

It's really very simple:  they are allowed to be intrinsic, essentially.
There is no complexity or mystery.  C and Fortran are no longer different
in this regard, except insofar as the Fortran libraries are larger.

>6. C can ignore the placement of parentheses

Not any more.  This too is an obsolete argument.

>7. "C has too many system dependent aspects (e.g. round up or down when
>   dividing negative integers)."

A lot of the purportedly "system dependent aspects" also exist in Fortran.
This particular one doesn't, but that is a concession to efficiency in an
area that rarely matters to programmers.  What it is, in fact, is permission
to the hardware to do things the wrong way because Fortran wants it that way!
People who use this argument are missing an important point:  C may have
system-dependent aspects, but well-crafted C programs do not.  Those who
believe that Fortran programs are automatically system-independent have
not tried to port very many amateur-written Fortran programs.  (Programs
written by competent professionals avoid these problems regardless of the
choice of language.)
-- 
"I'm not sure it's possible            | Henry Spencer at U of Toronto Zoology
to explain how X works."               |  henry@zoo.toronto.edu   utzoo!henry

khb@chiba.Eng.Sun.COM (chiba) (11/28/90)

In article <1990Nov27.175023.26039@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

   >
   >I think that ANSI C has a method by which compilers may make sin, cos
   >etc intrinsic, but I don't remember how it works...

   It's really very simple:  they are allowed to be intrinsic, essentially.
   There is no complexity or mystery.  C and Fortran are no longer different
   in this regard, except insofar as the Fortran libraries are larger.

But in ANSI C one is still stuck with errno, which makes computing
things out of order and/or at the same time far more entertaining
and challenging.


--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043

ds@juniper09.cray.com (David Sielaff) (11/28/90)

In article <1990Nov27.175023.26039@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <BGLENDEN.90Nov26162335@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>>3. In fortran functions like sin, cos, ** are intrinsic.
>>
>>I think that ANSI C has a method by which compilers may make sin, cos
>>etc intrinsic, but I don't remember how it works...
>
>It's really very simple:  they are allowed to be intrinsic, essentially.
>There is no complexity or mystery.  C and Fortran are no longer different
>in this regard, except insofar as the Fortran libraries are larger.
>

At the risk of beating a dead horse, the only gotcha here is that
if `static double sin(double);' is seen in a compilation unit, the
compiler had better wait to find the definition in the compilation
unit, and not treat calls to sin() as an intrinsic (after it has seen
the declaration, anyway).  This is a situation which does not arise
in FORTRAN.

Dave Sielaff
Cray Research, Inc.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/28/90)

Summary: Each of the disadvantages that Brian mentions for C as compared
to Fortran should disappear within a few years without any further
standardization. In the same areas, Fortran has many big disadvantages,
each of which will be quite difficult to fix.

In article <BGLENDEN.90Nov26162335@mandrill.cv.nrao.edu> bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
> 1. Pointer aliasing.

If you allow interprocedural analysis, the compiler can detect the
``aliasing signature'' of every call to a function, and generate
separate versions that run as fast as possible for each signature.
If, in fact, no aliasing is present, the compiler will generate just one
version. So this is only a temporary disadvantage---when C compilers get
smarter, they'll be able to generate code just as fast as Fortran's (at
the expense of somewhat slower compilation).

In my experience, pointer aliasing in C is an advantage. I don't have
to, e.g., write two versions of a large-number addition routine, one
where the output overlaps the input and one where it doesn't. The price
I pay for this is that most current compilers only generate the slower
version (so in speed-critical applications I'm now forced to write two
versions, one using some unportable assertion like ``noalias''). But a
smarter compiler could do the job. The extra compile time is worth the
programming time saved.

This disadvantage of Fortran (that it doesn't allow aliasing at all) is
much more difficult to fix. Even if a compiler appears that allows
aliasing, code taking advantage of it won't be portable. So nobody'll
bother.

Part of the flame war in comp.lang.misc is over my contention that the
compiler can do a respectable job of aliasing detection *without*
interprocedural analysis---by generating a small, static set of
signatures and working with those. But this is a side issue.

> 2. C has no conformant arrays, i.e. you can't do the equivalent of:
  [ ... ]
> In C you either have to do your own indexing *(a + j*m +i) or have
> pointers to pointers *(*(a + i) + j).

I agree that this is a problem. However, the double-pointer solution
usually allows faster access than the standard method of storing arrays,
doesn't waste much memory, allows more flexibility, gets around the
memory management problems of some small architectures, and lets you use
a[i][j] for *(*(a + i) + j), all within current C. It is extremely
difficult to do this efficiently in Fortran, and it will continue to be.

In some applications, though, dynamically sized flat multidimensional
arrays may be better than multiple-pointer arrays. At least one very
popular current compiler, namely gcc, lets you declare arrays the way
you want.

> 3. In fortran functions like sin, cos, ** are intrinsic.

Fortran and ANSI C treat intrinsics the same way.

> 4. Fortran has complex variables.

Given the number of inlining compilers, this is at most an issue of what
syntax you prefer. Many programmers don't like infix notation for
operations that are generally simulated by the compiler rather than
executed directly by the machine. For them, Fortran is at a
disadvantage here. This is hardly a major issue, though.

> 5. There are many numerical libraries written for Fortran.

Which, given f2c, is no longer an issue.

> 6. C can ignore the placement of parentheses

ANSI clamps down on this.

> 7. "C has too many system dependent aspects (e.g. round up or down when
>    dividing negative integers)."

Huh? Fortran has system-dependent aspects too.

Fortran does not have standard I/O powerful enough to, e.g., respectably
checkpoint a computation. This is a huge disadvantage. It will not be
fixed except by further standardization.
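
A sketch of the kind of checkpointing C's stdio makes straightforward
(the function and names are mine, not a standard interface):

#include <stdio.h>

/* Dump a state array as one binary write; returns 0 on success. */
int checkpoint(const char *path, const double *state, size_t n)
{
	FILE *fp = fopen(path, "wb");
	if (fp == NULL)
		return -1;
	if (fwrite(state, sizeof *state, n, fp) != n) {
		fclose(fp);
		return -1;
	}
	return fclose(fp);
}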

> 8. C does everything in double.

ANSI fixes this cleanly.

> (*) I realize that it is not that hard to write portable C. I think
> it's fair to say that it's easier to write portable Fortran for
> numeric work, though.

In my experience with numerical code, it has been extremely easy to
write portable C. Numerical programs use input and output too.

> When and if we have fortran9? available the story may be different
> (but it's getting pretty late in the day for F9x to come riding over
> the horizon to save us).

Fortran 8X: the teenage mutant ninja offspring of Modula-2 and Ada. A
few years ago I was in a room full of Fortran programmers listening to a
presentation about the language. They hated almost everything about it.
Now it's Fortran 9X and still somewhere over the horizon.

C is a de facto standard. ANSI C is a standard. So is Fortran 77. The
new Fortran is not. C, ANSI C, and Fortran 77 are the language variants
that will be widely used over the next several years, and those are the
languages we should be comparing.

---Dan

wsb@boise.Eng.Sun.COM (Walt Brainerd) (11/28/90)

In article <3017@uc.msc.umn.edu>, olstad@uf.msc.umn.edu (Ken Olstad) writes:
> > "We don't know what the numerical programming language of the year 2000
> >  will look like, but it will be called Fortran."
> 
> Cray.  Does anybody here really know where this came from?
> 

This has been attributed to many, many people (all very
authoritatively!).  The most common one I hear is Tony Hoare,
but a friend of mine claims I said something like this long,
long ago at an X3J3 meeting.  So long ago that I can't claim
or deny it 8^).  I think it's hopeless to figure out who
first said something like this.
--
Walt Brainerd        Sun Microsystems, Inc.
wsb@eng.sun.com      MS MTV 5-40
                     Mountain View, CA 94043
                     415/336-5991

ttw@lanl.gov (Tony Warnock) (11/28/90)

Re:
>> 2. C has no conformant arrays, i.e. you can't do the equivalent of:
>  [ ... ]
>> In C you either have to do your own indexing *(a + j*m +i) or have
>> pointers to pointers *(*(a + i) + j).
 
Dan Bernstein writes:
 
>I agree that this is a problem. However, the double-pointer solution
>usually allows faster access than the standard method of storing arrays,
>doesn't waste much memory, allows more flexibility, gets around the
>memory management problems of some small architectures, and lets you use
>a[i][j] for *(*(a + i) + j), all within current C. It is extremely
>difficult to do this efficiently in Fortran, and it will continue to be.
 
    What is it that Dan thinks is difficult? He uses only "this",
    so it is not clear what he has in mind. Fortran has allowed
    one to write a(i,j,k,l,m,n,p) for years. If done inside a
    loop where one of the subscripts is varying, there is only a
    single addition (if that) to do the indexing. If the subscripts
    are chosen at random, how does having a pointer help? The
    necessary offset must still be gotten somehow.
 

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/28/90)

In article <7097@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
> >I agree that this is a problem. However, the double-pointer solution
> >usually allows faster access than the standard method of storing arrays,
> >doesn't waste much memory, allows more flexibility, gets around the
> >memory management problems of some small architectures, and lets you use
> >a[i][j] for *(*(a + i) + j), all within current C. It is extremely
> >difficult to do this efficiently in Fortran, and it will continue to be.
>     What is it that Dan thinks is difficult? He uses only "this"
>     so it is not clear what he has in mind. Fortran has allowed
>     one to write a(i,j,k,l,m,n,p) for years.

Sorry. I said ``The double-pointer solution allows X, Y, Z, W, and A,
all within current C. It is extremely difficult to do this efficiently
in Fortran, and it will continue to be.'' By ``this'' I was referring to
the double-pointer solution, not to any of its particular features.

> If the subscripts
>     are chosen at random, how does having a pointer help? The
>     necessary offset must still be gotten somehow.

Huh? A double-pointer array, as we were discussing, is a
single-dimensional array of pointers to single-dimensional arrays. To
access a random element of the array takes two additions and two memory
references. In contrast, to access a random element of a flat array
takes two additions, a multiplication, and a memory reference. On most
widely used machines, a multiplication is quite a bit slower than a
memory reference, particularly a cached memory reference. That's why
double-pointer arrays are better than flat arrays for so many
applications.
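
To put the two layouts side by side (a sketch; M and N are arbitrary):

#define M 100
#define N 100

double  flat[M * N];	/* one block: element (i,j) lives at i*N + j */
double *rows[M];	/* M pointers, each to a row of N doubles    */

double get_flat(int i, int j) { return flat[i * N + j]; } /* adds, mul, one load */
double get_rows(int i, int j) { return rows[i][j]; }      /* adds, two loads     */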

Fortran can't deal with a double-pointer array efficiently because it
doesn't have pointers. Simulating pointers efficiently is what I was
calling difficult. Do you disagree?

---Dan

gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) (11/28/90)

In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>Huh? A double-pointer array, as we were discussing, is a
>single-dimensional array of pointers to single-dimensional arrays. To
>access a random element of the array takes two additions and two memory
>references. In contrast, to access a random element of a flat array
>takes two additions, a multiplication, and a memory reference. On most
>widely used machines, a multiplication is quite a bit slower than a
>memory reference, particularly a cached memory reference. That's why
>double-pointer arrays are better than flat arrays for so many
>applications.

Except that inside loops, the multiplication is strength-reduced to an
addition. That's why flat arrays are faster than double-pointer arrays
for so many applications. The totally flexible language would hide
this little detail from the programmer and would pick the right one.
If I have to choose, I'd use flat arrays in my numerical programs.
If I have to use C, a few macros give me what I really want.
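
Roughly, the strength reduction looks like this (a sketch of what the
optimizer does internally, not code one would write by hand):

/* Source form:
 *
 *	for (j = 0; j < n; j++)
 *		s += a[i*n + j];
 *
 * After strength reduction the i*n multiply is hoisted out of the
 * loop and the index becomes a stepping pointer: */
double sum_row(double *a, int i, int n)
{
	double s = 0.0;
	double *p = a + i * n;		/* multiply done once */
	double *end = p + n;
	while (p < end)
		s += *p++;		/* only additions remain per iteration */
	return s;
}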

burley@pogo.ai.mit.edu (Craig Burley) (11/28/90)

In article <1990Nov28.065242.10154@murdoch.acc.Virginia.EDU> gl8f@astsun.astro.Virginia.EDU (Greg Lindahl) writes:

   In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

   >Huh? A double-pointer array, as we were discussing, is a
   >single-dimensional array of pointers to single-dimensional arrays. To
   >access a random element of the array takes two additions and two memory
   >references. In contrast, to access a random element of a flat array
   >takes two additions, a multiplication, and a memory reference. On most
   >widely used machines, a multiplication is quite a bit slower than a
   >memory reference, particularly a cached memory reference. That's why
   >double-pointer arrays are better than flat arrays for so many
   >applications.

   Except that inside loops, the multiplication is strength-reduced to an
   addition. That's why flat arrays are faster than double-pointer arrays
   for so many applications. The totally flexible language would hide
   this little detail from the programmer and would pick the right one.
   If I have to choose, I'd use flat arrays in my numerical programs.
   If I have to use C, a few macros give me what I really want.

Hmm, could you explain to me how to strength-reduce multiplication to addition
for random references to array elements within loops, as in

      SUBROUTINE FOO(A)
      REAL A(37,103)
      INTEGER I,J
      DO I=1,100
         DO J=1,100
            A(SOMEFUNC(I),OTHERFUNC(J)) = 0.
         END DO
      END DO
      END

where SOMEFUNC and OTHERFUNC may be external functions or statement functions
with expressions sufficient to ensure that there is no linearity between their
inputs (I and J) and their outputs (used as array indices)?

Such an optimization would be highly useful, I think, in GNU Fortran!  So do
let me know.  I thought each reference to an element in A would require a
computation involving at least one multiplication.
--

James Craig Burley, Software Craftsperson    burley@ai.mit.edu

patrick@convex.COM (Patrick F. McGehearty) (11/29/90)

In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
...stuff deleted in interest of brevity...
>Huh? A double-pointer array, as we were discussing, is a
>single-dimensional array of pointers to single-dimensional arrays. To
>access a random element of the array takes two additions and two memory
>references. In contrast, to access a random element of a flat array
>takes two additions, a multiplication, and a memory reference. On most
>widely used machines, a multiplication is quite a bit slower than a
>memory reference, particularly a cached memory reference. That's why
>double-pointer arrays are better than flat arrays for so many
>applications.
>
>Fortran can't deal with a double-pointer array efficiently because it
>doesn't have pointers. Simulating pointers efficiently is what I was
>calling difficult. Do you disagree?
>
>---Dan

Actually, simulating pointers in Fortran is not very hard, just a bit
ugly.  First declare an array SPACE for all data that might be pointed
to.  Then access it as you please.  For example: VAL = SPACE(IPTR(I)).
I'm not claiming it's a wonderful approach, just doable.
I did it 17 years ago in FTN66 because that was the only efficient
compiler for the machine I was using.

While a single access to a random element of a flat array in memory
takes two additions, one multiply, and a memory reference, a successive
access on a normal loop iteration only takes one addition and a memory
reference.  That is, if we save the address of a(i,j) in a register, then
computing the address of either a(i+1,j) or a(i,j+1) only takes
a single addition, assuming a rectangular array.  The double-pointer
array still takes two memory accesses and an addition.
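
(A sketch of that strength reduction in C, assuming a flat m-by-n
row-major array; the function name scale_row is invented:)

  /* Walk one row of a flat array: the i*n multiply happens once,
     outside the loop; each successive element costs one add. */
  void scale_row(double *a, long n, long i, double y)
  {
      double *p = a + i * n;   /* address of row i: one multiply */
      long j;

      for (j = 0; j < n; j++)
          *p++ *= y;           /* successive access: add only */
  }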

Also, on leading edge machines, a multiplication is as fast or faster
than a memory reference, especially if you miss the cache.  As
killer micros continue to increase their clock rates, this phenomenon
will spread.  However, double-pointer arrays still will be used
for sparse data.

ttw@lanl.gov (Tony Warnock) (11/29/90)

Dan Bernstein answers [correctly]:
 
>Sorry. I said ``The double-pointer solution allows X, Y, Z, W, and A,
>all within current C. It is extremely difficult to do this efficiently
>in Fortran, and it will continue to be.'' By ``this'' I was referring to
>the double-pointer solution, not to any of its particular features.
>
>> If the subscripts
>>     are chosen at random, how does having a pointer help: the
>>     necessary offset must still be gotten somehow.
>
>Huh? A double-pointer array, as we were discussing, is a
>single-dimensional array of pointers to single-dimensional arrays. To
>access a random element of the array takes two additions and two memory
>references. In contrast, to access a random element of a flat array
>takes two additions, a multiplication, and a memory reference. On most
>widely used machines, a multiplication is quite a bit slower than a
>memory reference, particularly a cached memory reference. That's why
>double-pointer arrays are better than flat arrays for so many
>applications.
 
 
    Thanks, I didn't get the idea from your first posting.
 
    With respect to speed, almost all machines that I have used during
    the last 25 or so years have had faster multiplications than
    memory accesses. (I have been doing mostly scientific stuff.) For
    most scientific stuff, I think that the scales are tipped in favor
    of array-hood because of the rarity of accessing an arbitrary
    element. Most computations access an array along one of its
    dimensions, holding the others constant. In this case, there is
    only one addition and one memory access whatever the
    dimensionality of the array. There is also no storage overhead
    associated with keeping arrays of pointers. For multi-dimensional
    problems, this overhead could be quite large. Again, for
    scientific problems, there is usually no leftover room, as the
    entire memory will be taken up by arrays. It doesn't matter how
    much memory is available, my problems will easily fill it and
    still be at too coarse a granularity to be nice.
 
    Anyway, Dan's answer points out the performance differences in the
    array versus pointer access stuff. Personally I just don't use
    pointers much because my problems don't call for them. If I had to
    access multi-dimensional arrays in random fashion very often, the
    pointer solution might be acceptable.
 
    On a slightly different issue: often it is necessary to do row or
    column accesses (or whatever you call them in 3 or more
    dimensions) in the same code. How does one set up a pointer array
    for allowing easy access along each dimension (for example in a
    typical 5-dimensional array)? I use 5 dimensions as typical
    because a physical grid usually is indexed by x,y,z, and time
    indices and also by variable type. That is each point in x,y,z,t
    space has an array of variables present (on bad days it has arrays
    of matrices.)

tong@convex.csd.uwm.edu (Shuk Y Tong) (11/29/90)

In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <7097@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>> >I agree that this is a problem. However, the double-pointer solution
>> >a[i][j] for *(*(a + i) + j), all within current C. It is extremely
>> > difficult to do in Fortran.

>>     Fortran has allowed
>>     one to write a(i,j,k,l,m,n,p) for years.
>
>Huh? A double-pointer array, as we were discussing, is a
>single-dimensional array of pointers to single-dimensional arrays. To
>access a random element of the array takes two additions and two memory
>references. In contrast, to access a random element of a flat array
>takes two additions, a multiplication, and a memory reference. On most
>widely used machines, a multiplication is quite a bit slower than a
>memory reference, particularly a cached memory reference. That's why
>double-pointer arrays are better than flat arrays for so many
>applications.
>
>Fortran can't deal with a double-pointer array efficiently because it
>doesn't have pointers. Simulating pointers efficiently is what I was
>calling difficult. Do you disagree?

Yes. First of all, arrays are the most common data structure in
scientific computing. By not providing a generic array type (i.e.,
one whose declaration in a function can carry variable dimensions),
C basically says it is not for numerical work. As for
two-dim arrays in C, they have to be simulated by a pointer array,
which is clumsy (and clumsier still for 6-dim arrays) and
runs much SLOWER on vector machines. The reason
is simply that the compiler is unable to tell where each element a[i][j]
points to. It might be possible to put in tons of directives to tell
the compiler what is going on, but why bother when you can do
it by using a(i,j) alone? As to accessing a random element a(i,j),
it is true that a(i,j) is slower than a[i][j], but
the point is that in scientific programming, accessing a
random element in a random place is RARE. Even if it were not
rare, it wouldn't matter, because the CPU hog in scientific
computing is LOOPS with ARRAYS in them. When a(i,j) is within a loop,
its address usually can be gotten by an addition (a constant stride
or an increment, depending on which index runs faster).

Before ANSI C compilers became widely available, it was a nightmare
to do numerical work in C: arbitrary evaluation order, mindless
promotion, too many problems to name. ANSI C is much better in this
regard, but still does not quite have the (numerical) capabilities
of Fortran. Some of the common functions, like sqrt etc., are
generated inline by a Fortran compiler, but there is no way of doing
that in C. If complex math is desired, there is no end of trouble in
C. A few dozen functions (and the compiler had better have inlining
capability, which is not presently true of most compilers) or macros
need to be written. I am almost sure people would rather write
c=c1/c2, c=r/c1 than CDIV(c,c1,c2), RCDIV(c,r,c1).

The only possible candidate among the languages that could replace
Fortran is C++ (and I doubt that too), but not C, because C itself
will be dead once a true C++ compiler is written.

jlg@lanl.gov (Jim Giles) (11/29/90)

From article <BURLEY.90Nov28084920@pogo.ai.mit.edu>, by burley@pogo.ai.mit.edu (Craig Burley):
< [...]
< Hmm, could you explain to me how to strength-reduce multiplication to addition
< for random references to array elements within loops, as in
<
<       SUBROUTINE FOO(A)
<       REAL A(37,103)
<       INTEGER I,J
<       DO I=1,100
<          DO J=1,100
<             A(SOMEFUNC(I),OTHERFUNC(J)) = 0.
<          END DO
<       END DO
<       END
<
< where SOMEFUNC and OTHERFUNC may be external functions or statement functions
< with expressions sufficient to ensure that there is no linearity between their
< inputs (I and J) and their outputs (used as array indices)?

I don't understand what relevance this example has to the topic under
discussion.  The fact that this can't be strength reduced is
irrelevant to the question of whether (so-called) flat arrays are
more efficient than multiple indirections (pointer-to-pointer
arrays).  In this example, the flat array requires (in addition to
the function references) a multiply and two adds in the inner loop
(maybe another add as well, if the array is not zero based and not
static).  The pointer-to-pointer array version would require two
adds and an extra memory reference - memory is almost always slower
than int multiplies.

In the most common case (where the indices are _not_ sent to
functions), the strength reduced reference of a flat array costs a
single increment instruction in the inner loop; the pointer-to-pointer
implementation requires the same add (to get to the next element in
the array) AND it _may_ have to do an additional memory reference and
another add (if the inner loop is running 'against the grain' of the
array - with the outer pointer striding fastest: the flat array works
just as efficiently regardless of the order of the loops).

Finally, if the array has more than two dimensions, the number of
additional memory references in the inner loop _may_ go up with the
number of dimensions.  The flat array either gets strength reduced
to a single stride or (in the case you give) has as many extra mults
as the pointer-to-pointer version has extra memory traffic.  So, in
the _usual_ case (where the array is indexed directly by the loop
counters), pointer-to-pointer just breaks even IF the loops are nested
in the same order as the array - otherwise you always lose.  In your
_unusual_ case, the pointer-to-pointer version is still slower except
on those rare machines where memory references are cheaper than mults.
And this still ignores the extra memory required by the arrays of
pointers!

J. Giles

dik@cwi.nl (Dik T. Winter) (11/29/90)

All that stuff.  A basic problem still is that the major numerical libraries
(IMSL, NAG) are in fortran.  Some suggested f2c.  While pretty good it does
not help if you do not have the source (and IMSL and NAG are not distributed
with source).  And really, a very large amount of numerical programming depends
on those libraries.  Some suggested calling the Fortran routines from C; may
I respectfully ask how to call the following Fortran routine from C on a Sun
without writing intermediate Fortran code?
	REAL FUNCTION F()
	F = 1.0
	RETURN
	END

Dan Bernstein suggested multiple compilations in order to allow for all
possible aliasing problems.  But how to do that if you do not have the source?
How to do program-wide flow analysis if you do not have the source, but
only the libraries?

A large body of numerical software still depends on Fortran.  There are many
reasons for this:
1.  They use common libraries, written in fortran.  (I know that NAG is
    considering a library in C; they have been considering it for a long time!)
2.  There are still systems that do not have C (surprise!).
3.  There are systems that do have C, but where C is much less efficient than
    Fortran (Alliant, CDC Cyber, Cray ...).
4.  And of course the always recurring question about arrays.  Fortran is
    half adequate, C is not adequate.  (There are a few languages with full
    support for arrays; I know Algol 60, Algol 68 and Ada; there are probably
    a few more, but Pascal as defined by Jensen and Wirth is not amongst them.)

Oh, well..
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/29/90)

Here's how to set up a multidimensional array if you care about speed.

Let x be an array of foo[m][n]...[p]. Store it as a contiguous set of
m*n*...*p elements, with the [i][j]...[k] element at position
base + i + m*(j + n*(...(k)...)). This is done the same in C, Fortran, and
machine language.

To move along a row or column, you need to add or subtract multiples of
successive products of dimensions. So store m, m*n, and so on. This is
done the same in C, Fortran, and machine language.

Now here's where pointers start becoming useful and C begins to shine.

To find a random element of the array (certainly not as common as moving
by step, but still a large portion of the time on vector machines), you
need to calculate the position from the base of the array as above. Say
you need to calculate the corner of the hyperplane with fixed i and j,
i.e., base + i + m * j. Store base + m * j in an array of pointers
indexed by j. Then finding the location of the corner takes one
addition, one memory access, and another addition, where the original
took two additions and a multiplication. On most machines available
today, a memory access (particularly a cached memory access, as this is
almost certain to be) is faster than a multiplication.
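
(A sketch of such a pointer table in C; make_planes and the variable
names are invented, and the 2-D case stands in for the general one:)

  /* Pointers to the start of each row of a contiguous m-by-n block:
     finding row i is then a load instead of an i*n multiply, while
     sweeps can still run over the flat storage underneath. */
  #include <stdlib.h>

  double **make_planes(double *base, long m, long n)
  {
      double **b = malloc(m * sizeof(double *));
      long i;

      for (i = 0; i < m; i++)
          b[i] = base + i * n;   /* b[i][j] == base[i*n + j] */
      return b;
  }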

You can store that extra array of pointers in C or in machine language.
But you can't do it in Fortran. The best you can do is store (e.g.) m*j
in an array. But then you have an extra addition, and I haven't seen any
Fortran compiler smart enough to replace your array of integer offsets
with the corresponding array of pointers. So you lose speed. (Jim, do
any of your compilers manage this trick?)

In article <7928@uwm.edu> tong@convex.csd.uwm.edu (Shuk Y Tong) writes:
> In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >Fortran can't deal with a double-pointer array efficiently because it
> >doesn't have pointers. Simulating pointers efficiently is what I was
> >calling difficult. Do you disagree?
> Yes. First of all, arrays are the most common data structure in
> scientific computing. By not providing a generic array type (i.e.,
> one whose declaration in a function can carry variable dimensions),
> C basically says it is not for numerical work.

That is a religious argument. How do you expect anyone to respond to it?
It's based on faith, not reason, so it can't be argued with.

> As for
> two-dim arrays in C, they have to be simulated by a pointer array
> which is clumsy (and clumsier still for 6-dim arrays) and
> runs much SLOWER on vector machines.

Would you like to run some benchmarks on a Convex?

Just because I use pointers doesn't mean I have to give up the
advantages of contiguous arrays.

> The reason
> is simply that the compiler is unable to tell where each element a[i][j]
> points to. It might be possible to put in tons of directives to tell
> the compiler what is going on, but why bother when you can do
> it by using a(i,j) alone?

The Convex compiler can understand a linear expression in several
variables. It can reduce around it. In fact, I haven't seen a single
optimizing compiler that doesn't know at least a little bit about this
optimization (now commonly known as ``strength reduction'').

I don't have to ``put tons of directives to tell the compiler what is
going on''; I don't have to put even a single directive. The compiler
knows what is going on. Upon what do you base your claim?

> the CPU hog in scientific
> computing is LOOPS with ARRAYS in them. When a(i,j) is within a loop,
> its address usually can be gotten by an addition (a constant stride 
> or an increment, depending on which index runs faster).

That's quite correct. There's nothing in using pointers that prohibits
you from doing exactly this.

If, in fact, moving through an array by step were the only operation,
there wouldn't be any point in having the extra pointer tables. But you
have to locate a position in an array before you zoom through a row or
column. Since zooming is twenty times faster on a Cray-style vector
machine, locating random spots suddenly becomes a noticeable percentage
of a program's run time. That's what you've failed to notice.

> Some of the common functions, like sqrt etc., are generated inline
> by a Fortran compiler, but there is no way of doing that in C.

This issue has been quite thoroughly addressed by previous articles in
this thread.

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/29/90)

In article <7200@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>     With respect to speed, almost all machines that I have used during
>     the last 25 or so years have had faster multiplications than
>     memory accesses.

Hmmm. What machines do you use? In my experience (mostly mainframes and
supers, some micros, and only recently a bit with minis) local memory
access is up to several times as fast as integer multiplication. (I
don't like this situation; converting to floating point just to multiply
quickly on a Cray seems rather silly.)

> Most computations access an array along one of its
>     dimensions, holding the others constant.

Yes, but just because you use pointers doesn't mean you have to give up
the advantages of flat arrays.

> There is also no storage overhead
>     associated with keeping arrays of pointers. For multi-dimensional
>     problems, this overhead could be quite large.

I assume you meant ``there is storage overhead...'' That's true, but
it's really not a problem. If you have a 5 by 5 by 2 by 3 by 15 array,
can you begrudge space for thirty pointers so that you save 5% of your
computation time? Thought so. Even if you set up pointers within
pointers for every dimension, you can do it so that the pointer space
is 1/N of the array space, where N is the widest single dimension.

>     Anyway, Dan's answer points out the performance differences in the
>     array versus pointer access stuff. Personally I just don't user
>     pointers much because my problems don't call for them. If I had to
>     access multi-dimensional arrays in random fashion very often, the
>     pointer solution might be acceptable.

But you really do access random spots in arrays; it's a rare problem
that always starts from the top left corner of every matrix. Take a
typical pivot-based algorithm, for instance: you're dealing with
essentially random rows at each step.

> How does one set up a pointer array
>     for allowing easy access along each dimension (for example in a
>     typical 5-dimensional array)?

You keep pointers to the top of each 4-dimensional hyperplane in the
array. You can get some of the benefit of this from storing integer
indexes, but you still lose at least an addition per array sweep, more
for higher-dimensional arrays if you store more pointers. Since the
sweeps run 20x faster on (e.g.) a Convex, the extra computation outside
each sweep becomes noticeable.

---Dan

sef@kithrup.COM (Sean Eric Fagan) (11/29/90)

In article <2392:Nov2902:59:0590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <7200@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>>     With respect to speed, almost all machines that I have used during
>>     the last 25 or so years have had faster multiplications than
>>     memory accesses.
>Hmmm. What machines do you use? In my experience (mostly mainframes and
>supers, some micros, and only recently a bit with minis) local memory
>access is up to several times as fast as integer multiplication. 

On both Cybers and Crays, multiplication can easily take fewer cycles than
accessing memory.  (No cache, remember?)  But most machines aren't like
that, I believe.

>(I
>don't like this situation; converting to floating point just to multiply
>quickly on a Cray seems rather silly.)

Uhm... you don't have to, I don't think.  A Cyber had only one type of
multiply instruction, but if the exponent were 0, it did an integer
multiplication.  I believe Crays do the same thing.

-- 
-----------------+
Sean Eric Fagan  | "That's my weakness:  vulnerable poultry."
sef@kithrup.COM  | 
-----------------+  Any opinions expressed are mine, shared with none.

tmb@bambleweenie57.ai.mit.edu (Thomas M. Breuel) (11/29/90)

tong@convex.csd.uwm.edu writes:
> the CPU hog in scientific computing is LOOPS with ARRAYS in them.

Not surprisingly--anything else is very cumbersome to express in
FORTRAN--and if FORTRAN is all a scientific programmer is familiar
with, he'll easily dismiss as "too complicated" good algorithms that
are naturally and simply expressed using highly linked data
structures, closures, lists, etc. To add insult to injury, computer
manufacturers like Cray have built whole machines based on the idea
that all people want to do is loop sequentially through arrays.

Hopefully, once scientific programmers switch to more expressive
programming languages (like FORTRAN 9x, C, C++, Modula-3, Scheme, ML,
Haskell), scientific programming will be liberated from such limited
paradigms. Arrays and loops are very important, but they are not a
panacea.

					Thomas

dik@cwi.nl (Dik T. Winter) (11/29/90)

In article <1990Nov29.040910.7400@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
 > In article <2392:Nov2902:59:0590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
 > >(I
 > >don't like this situation; converting to floating point just to multiply
 > >quickly on a Cray seems rather silly.)
 > 
 > Uhm... you don't have to, I don't think.  A Cyber had only one type of
 > multiply instruction, but if the exponent were 0, it did an integer
 > multiplication.  I believe Crays do the same thing.
 > 
No.  The Cybers indeed give the lower half of the product of two integers
(and the other multiply instruction gives the upper part, although that is
not documented).  The Cray returns the upper part if the two exponents are
zero.  But the Cray has a 24x24->24 bit integer multiply and as a compiler
option you can use 24 bit integers.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

jlg@lanl.gov (Jim Giles) (11/30/90)

From article <1990Nov29.040910.7400@kithrup.COM>, by sef@kithrup.COM (Sean Eric Fagan):
> [...]
> Uhm... you don't have to, I don't think.  A Cyber had only one type of
> multiply instruction, but if the exponent were 0, it did an integer
> multiplication.  I believe Crays do the same thing.

No.  The Crays have an integer multiply unit for addresses.  This mult
takes 4 clocks.  Memory access costs 14 or 17 clocks depending on the
model of machine you have.

J. Giles

khb@chiba.Eng.Sun.COM (Keith Bierman fpgroup) (11/30/90)

In article <2619@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:

...
   on those libraries.  Some suggested calling the Fortran routines from C; may
   I respectfully ask how to call the following Fortran routine from C on a Sun
   without writing intermediate Fortran code?

	   REAL FUNCTION F()
	   F = 1.0
	   RETURN
	   END


See page 163 of the Sun Fortran User's Guide (f77v1.3).
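
(For readers without that manual, a minimal sketch of the usual Unix
f77 linkage convention -- the trailing underscore and the float return
type are assumptions that vary by compiler, which is exactly Dik's
point:)

  /* Hypothetical sketch; check your compiler's manual before
     trusting the name mangling or the return type. */
  #include <stdio.h>

  extern float f_(void);   /* the Fortran REAL FUNCTION F() */

  int main(void)
  {
      printf("%f\n", f_());   /* should print 1.000000 */
      return 0;
  }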
--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043

ttw@lanl.gov (Tony Warnock) (11/30/90)

Dan Bernstein asks:
 
RE: >In article <7200@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
    >>     With respect to speed, almost all machines that I have
    >>     used during the last 25 or so years have had faster
    >>     multiplications than memory accesses.
 
>Hmmm. What machines do you use? In my experience (mostly mainframes and
>supers, some micros, and only recently a bit with minis) local memory
>access is up to several times as fast as integer multiplication. (I
>don't like this situation; converting to floating point just to multiply
>quickly on a Cray seems rather silly.)
 
 
      Model        Multiplication Time     Memory Latency
 
      YMP          5  clock periods         18 clock periods
      XMP          4  clock periods         14 clock periods
      CRAY-1       6  clock periods         11 clock periods
 
      Compaq       25 clock periods         4  clock periods
      Zenith      120 clock periods        30  clock periods
 
    The times on the PC-clones are approximate depending on the type
    of variables being accessed and the sizes of the indices.
 
    Most of my work has been on CRAY type computers. It is always a
    win to avoid memory accesses. On the PC-type machines, memory
    access is faster, but in the particular cases that I have been
    programming, one does many more constant-stride accesses than random
    accesses, usually in a ratio of many thousand to one. For an LU
    decomposition with partial pivoting, one does roughly N/3 constant
    stride memory accesses for each "random" access. For small N, say
    100 by 100 size matrices or so, one would do about 30
    strength-reduced operations for each memory access. For medium
    (1000 by 1000) problems, the ratio is about 300 and for large
    (10000 by 10000) it is about 3000.

ttw@lanl.gov (Tony Warnock) (11/30/90)

    In one of the smaller problems that I had to run, I found that I
    needed two large arrays declared as follows:
 
    COMPLEX a(3,3,4,12,12,12,32), b(3,4,12,12,12,32)
 
    These are BIG suckers for a small (CRAY-1) machine. Actually the
    first array was dimensioned as a(3,2,4,12,12,12,32) using a
    standard trick of the underlying physics.
 
    It was necessary to access the arrays letting each of the last
    four dimensions be the innermost loop. (I used more programming
    tricks to eliminate memory-bank conflicts.) Every few hundred
    steps, some of the a's and b's were treated specially as single
    3x3 (realized as 3x2) and 4x3 complex matrices. There is not much
    room left over for the pointer arrays to get at these matrices.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/30/90)

In article <7318@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> The Crays have an integer multiply unit for addresses.  This mult
> takes 4 clocks.

But isn't that only for the 24-bit integer? If you want to multiply full
words you have to (internally) convert to floating point, multiply, and
convert back.

I have dozens of machines that can handle a 16MB computation; I'm not
going to bother with a Cray for those. The biggest advantage of the Cray
line (particularly the Cray-2) is its huge address space.

So what's the actual time for multiplying integers?

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/30/90)

Several of you have been missing the crucial point.

Say there's a 300 to 1 ratio of steps through a matrix to random jumps.
On a Convex or Cray or similar vector computer, those 300 steps will run
20 times faster. Suddenly it's just a 15-1 ratio, and a slow instruction
outside the loop begins to compete in total runtime with a fast
floating-point multiplication inside the loop.

Anyone who doesn't think shaving a day or two off a two-week computation
is worthwhile shouldn't be talking about efficiency.

In article <7339@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>       Model        Multiplication Time     Memory Latency
>       YMP          5  clock periods         18 clock periods
>       XMP          4  clock periods         14 clock periods
>       CRAY-1       6  clock periods         11 clock periods

Um, I don't believe those numbers. Floating-point multiplications and
24-bit multiplications might run that fast, but 32-bit multiplications?
Do all your matrices really fit in 16MB?

>       Compaq       25 clock periods         4  clock periods

Well, that is a little extreme; I was talking about real computers.

> For an LU
>     decompositon with partial pivoting, one does rougly N/3 constant
>     stride memory accesses for each "random" access. For small N, say
>     100 by 100 size matrices or so, one would do about 30
>     strength-reduced operations for each memory access. For medium
>     (1000 by 1000) problems, the ratio is about 300 and for large
>     (10000 by 10000) it is about 3000.

And divide those ratios by 20 for vectorization. 1.5, 15, and 150. Hmmm.

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/30/90)

Here's a real example, which interested people can probably use for
benchmarks to settle this argument. But if you just care about the
theoretical point, skip to ``extra''.

Consider the following array code:

  for (j = 0;j < N;j++)
    a[i][j][k] *= y;

Here i and k were computed in some non-vectorizable way. i ranges from 0
through M - 1, and k ranges from 0 through P - 1.

With flat arrays in C (or Fortran) you might say instead

  for (j = 0;j < N;j++)
    a[(i * N + j) * P + k] *= y;

Any sane vectorizing compiler (e.g., Convex's) will recognize this as
the same code as above, even if the programmer sweeps through the array
with a pointer instead of j. It will convert both examples into vector
code like ``add (i * N * P + k) * sizeof(double) to the base of a, then
do a vector multiply by y starting there, stride P * sizeof(double),
ending N * P * sizeof(double).''

In C you can go a bit further. Set up an array of pointers to a[i*N*P]
for each i. Say double *(b[M]), with b[i] == &(a[i * N * P]). Now
a[i*N*P + j*P + k] is the same as b[i][j*P + k], so you can write

  for (j = 0;j < N;j++)
    b[i][j * P + k] *= y;

Now instead of adding (i * N * P + k) * sizeof(double) to the base of a,
the compiler will add i * sizeof(ptr) to the base of b, access that
value, and add k * sizeof(double) to that.

What's the difference? In both cases there's an addition of k, an
addition to an array base, and a multiplication by sizeof(double). In
the first case i is multiplied by N * P. In the second case i is
multiplied by the size of a pointer, and there's a memory access.

N * P can probably be precomputed and stored in a register. But there's
just as good a chance that i * sizeof(ptr) can be precomputed and stored
in a register, and in fact that it can be used in place of the register
holding i. (If neither of these is true, shifting i by 2 bits is faster
than multiplying two integer variables.)

So by using the extra pointers, b[i], we do one memory access instead of
a multiplication.

Is this an advantage? Certainly, on any machine where integer
multiplication is slower than memory accesses (including most RISC
architectures). Note that there are only a few extra pointers, and they
will almost certainly be kept in any memory cache available.

Does this advantage matter? Yes. It's true that most array accesses in
numerical programs are by step---like the N steps of j in the above
example. But even if N is large, a vector machine will run those steps
something like twenty times faster. If N is 500, N/20 is just 25. Many
of these imposing inner loops take less than 80% of a program's time on
such computers. Why ignore optimizations in the remaining 20% or more?

But in Fortran you can't do this optimization, because you can't store
that array of pointers. You can precompute the offsets, but you'll be
wasting at least an addition on each reference. If I were a religious
man like some others on USENET, I might conclude that Fortran is not
meant for numerical work.

In article <109378@convex.convex.com> patrick@convex.COM (Patrick F. McGehearty) writes:
> In article <17680:Nov2806:04:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >Fortran can't deal with a double-pointer array efficiently because it
> >doesn't have pointers. Simulating pointers efficiently is what I was
> >calling difficult. Do you disagree?
> Actually, simulating pointers in Fortran is not very hard, just a bit
> ugly.  First declare an array SPACE for all data that might be pointed
> to.  Then access it as you please.  For example: VAL = SPACE(IPTR(I)) 

But this is not efficient. Jim Giles has been claiming (incorrectly)
that a sufficiently smart compiler can always avoid the extra addition;
even if he were right, no current compiler can do the job in real
examples like this one.

In C, you can declare IPTR as pointers. You store elements in it as
pointers. You don't think about the addition because it's not there.
In Fortran this is impossible.
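
(The contrast, sketched in C with invented names; space and iptr
mirror Patrick's Fortran, ptr is the pointer version:)

  /* Index array: load iptr[i], add it to the base, then load. */
  double by_index(double *space, long *iptr, long i)
  {
      return space[iptr[i]];
  }

  /* Pointer array: load ptr[i], then load -- no extra addition. */
  double by_pointer(double **ptr, long i)
  {
      return *ptr[i];
  }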

> That is, if we save the address of a(i,j) in a register, then
> computing the address of either a(i+1,j) or a(i,j+1) only takes
> a single addition, assuming a rectangular array.  The double-pointer
> array still takes two memory accesses and an addition.

This is yet another common misconception: that using pointers forces you
to throw away the advantages of contiguous arrays. Why are you making
such an assumption?

Using pointers to the top of each hyperplane in a multidimensional array
does not force you to use those pointers just to move sequentially
through the array.

> Also, on leading edge machines, a multiplication is as fast or faster
> than a memory reference, especially if you miss the cache.  As
> killer micros continue to increase their clock rates, this phenomenon
> will spread.

What machines are you talking about? On leading-edge RISC architectures,
a multiplication is several times slower than a memory load, and in
practice you will never miss the b[] cache. And on current micros
multiplication is particularly slow.

---Dan

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (11/30/90)

In article <7339@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>
>
> 
>      Model        Multiplication Time     Memory Latency
> 
>      YMP          5  clock periods         18 clock periods
>      XMP          4  clock periods         14 clock periods
>      CRAY-1       6  clock periods         11 clock periods
> 
>      Compaq       25 clock periods         4  clock periods
>      Zenith      120 clock periods        30  clock periods
> 
>    The times on the PC-clones are approximate depending on the type
>    of variables being accessed and the sizes of the indices.
> 
I don't know what kind of Compaq or Zenith you are using, but 
on a 25 or 33 MHz 386 or 486 machine with a 3167 or 4167 floating
point unit the memory latency and FPU multiplication time are
roughly equal. The manuals of the compilers I use say that the
memory accesses slow computations down by up to a factor of two compared
to things on the FPU stack.

Doug McDonald

ttw@lanl.gov (Tony Warnock) (11/30/90)

>From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
>
>In article <7318@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> The Crays have an integer multiply unit for addresses.  This mult
>> takes 4 clocks.
>
>But isn't that only for the 24-bit integer? If you want to multiply full
>words you have to (internally) convert to floating point, multiply, and
>convert back.
>
>I have dozens of machines that can handle a 16MB computation; I'm not
>going to bother with a Cray for those. The biggest advantage of the Cray
>line (particularly the Cray-2) is its huge address space.
>
>So what's the actual time for multiplying integers?
 
>---Dan
 
    The time for multiplying 32-bit integers on the YMP is 5 clock
    periods. Normally YMP addresses are interpreted as 64-bit words,
    not as bytes. On the previous models of CRAYs, 24 bits are used to
    address 16 Mwords, not Mbytes. (This saves 3 wires per address data
    path.) As most work on CRAYs is done on words (numerical) or
    packed-character strings, multiplication of longer integers is not
    provided for in the hardware.
 
    Personally I would like to have long integer support. The CRAY
    architecture supports a somewhat strange multiplication method
    which will yield a 48-bit product if the input words have total
    length less than 48 bits. That is, one can multiply two 24-bit
    quantities, a 16-bit and a 32-bit quantity, a 13-bit and a 35-bit
    quantity, or shorter things. This operation takes two shifts and
    one multiply. The shifts may be overlapped so the time is 3 clocks
    for the two shifts and 7 clocks for the multiply if the shifts are
    known; or 4 clocks for the shifts and 7 clocks for the multiply if
    the shifts are variable. It's a bit of a pain to program, but the
    compiler does it for us. Another form of integer multiplication is
    used sometimes: the integers are converted to floating, then
    multiplied, and the result converted back to integer. This method
    fails if an intermediate value exceeds 46 bits of significance.
    The time is 2 clocks for producing a "magic" constant, 3 clocks
    each for two integer adds (reduces to 4 total because of
    pipelining), 6 clocks each for two floating adds (reduces to 6
    because of pipelining overlap with the integer add), 7 clocks for
    the floating multiply, 6 clocks for another floating add, and 6
    clocks for another integer multiply. Total is 29 clocks if no
    other operations may be pipelined with these operations. If the
    quantities being multiplied are addresses, some of the above is
    eliminated, bringing the result down to 20 clocks. Still this is
    not as good as the floating point performance. All of the above
    may be vectorized which would result in 3 clocks per result in
    vector mode.
 

jrbd@craycos.com (James Davies) (11/30/90)

In article <7928@uwm.edu> tong@convex.csd.uwm.edu (Shuk Y Tong) writes:

 (C is inefficient and clumsy for numerical computation)

>The only possible candidate among the languages that could replace
>Fortran is C++ (and I doubt that too), but not C, because C itself
>will be dead once a true C++ compiler is written.

Up to this point I substantially agreed with all of your points (in
particular, the array-of-pointers representation of 2D arrays only
allows vector operations along one dimension).
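
(To make the parenthetical concrete -- a sketch with invented names,
assuming the usual C idiom where each row is malloc'd separately:)

  /* Row sweep: one pointer load, then constant stride 1 -- this is
     the dimension a vectorizer can handle. */
  void row_sweep(double **a, long n, long i, double y)
  {
      long j;

      for (j = 0; j < n; j++)
          a[i][j] *= y;        /* a[i] loaded once, reused */
  }

  /* Column sweep: every element needs a pointer load first, so
     there is no constant stride for the vector unit to use. */
  void col_sweep(double **a, long m, long j, double y)
  {
      long i;

      for (i = 0; i < m; i++)
          a[i][j] *= y;        /* gather through a[i] each time */
  }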

However, I think that it will take far more than C++ to kill off C at
this point.  Consider the tenacity of Fortran and Cobol in the
face of a decades-long wave of new languages...C has joined the inner
circle of languages that get widespread, extensive use, and thus is
probably as immortal as the other two languages now...

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/01/90)

In article <1990Nov30.145649.17688@ux1.cso.uiuc.edu> mcdonald@aries.scs.uiuc.edu (Doug McDonald) writes:
> I don't know what kind of Compaq or Zenith you are using, but 
> on a 25 or 33 MHz 386 or 486 machine with a 3167 or 4167 floating
> point unit the memory latency and FPU multiplication time are
> roughly equal.

We are not talking about floating point.

---Dan

gl8f@astsun7.astro.Virginia.EDU (Greg Lindahl) (12/01/90)

In article <10444:Nov3008:15:4590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>Now instead of adding (i * N * P + k) * sizeof(double) to the base of a,
>the compiler will add i * sizeof(ptr) to the base of b, access that
>value, and add k * sizeof(double) to that.

However, the first expression is a constant. So you only have to do
the multiplication once. One multiplication is often faster than N
memory references.

>But in Fortran you can't do this optimization, because you can't store
>that array of pointers.

The *compiler* can, though. The compiler could build an array of
pointers and use it inside loops which do "random" accesses. I
suggested this yesterday, I guess you missed it ;-)

mroussel@alchemy.chem.utoronto.ca (Marc Roussel) (12/01/90)

     I think I'm pretty typical of Fortran users.  I know Fortran and a
smattering of other languages.  I use Fortran mostly because

     a) I can write code quickly in this language.
     b) Compilers exist for it on any machine I'm ever likely to use.

     I don't want to use C.  From what little I've been exposed to it, I
don't like it.  C has a nasty syntax which I don't have time to learn.
Now everybody who's been trying to convince scientific programmers like
me to learn C, go away!  Maybe you have the time to waste, but I don't.
     Every algorithm I've ever used is expressible in Fortran.  (I've
even written algorithms that create trees in Fortran using no
extensions other than recursion... That's right, no pointers, just good
old arrays.)  If ever I run across a problem that I can't code in
Fortran, then I'll consider other languages.  When the time comes, I may
even ask some of you what language you think is appropriate.  Until
then, I don't want your silly-ass opinion.  If you want to compare
languages, do it on comp.lang.misc where someone cares (notice the
followup-to line).
     Look, if someone out there can suggest a computer language that's
easy to learn and code in and that has the sort of widespread base that
Fortran does, I'll listen.  C just isn't for scientific programmers
like me so it's no use trying to convince me (and probably 90% of the
rest of the readership of this group) otherwise.  No one sensible would
say that Fortran is the best language for everything, but it's a more
than adequate language for most scientific computing.
     While I'm at it, I sincerely hope that some cleaner language like
Turing wipes C off the face of this planet.  I've about had it with all
this "my language is better than yours" garbage from the C folk and can
wish nothing for them other than extinction.

				Marc R. Roussel
                                mroussel@alchemy.chem.utoronto.ca

hwr@pilhuhn.uucp (Heiko W.Rupp) (12/01/90)

Organization: Not an Organization

In article <7339@lanl.gov>, Tony Warnock writes:

>
>
>      Model        Multiplication Time     Memory Latency
>
>      YMP          5  clock periods         18 clock periods
>      XMP          4  clock periods         14 clock periods
>      CRAY-1       6  clock periods         11 clock periods
>
>      Compaq       25 clock periods         4  clock periods
>      Zenith      120 clock periods        30  clock periods

These data also depend on clock speed!

But has anyone seen a self-vectorizing compiler for C, as there are
many for Fortran?

-Heiko


--
 O|O  Heiko W.Rupp             hwr%pilhuhn@bagsend.ka.sub.org
  |   Gerwigstr.5            | There is someone in my head, but it's not me
  U   FRG-7500 Karlsruhe 1   |                       - Pink Floyd
      Voice : + 49 721 693642| Do You know where Your towel is ???

salomon@ccu.umanitoba.ca (Dan Salomon) (12/01/90)

In article <9458:Nov2721:51:5590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>
>> 5. There are many numerical libraries written for Fortran.
>
>Which, given f2c, is no longer an issue.

Does f2c handle conformant arrays properly?
If so, is the code that it generates maintainable?

I remember once trying to translate a FORTRAN fast-fourier analysis
procedure into PL/I.  Sounds like a piece of cake, right?  The problem
was that the FORTRAN procedure accepted arrays with not only any size
of dimensions, but also with any number of dimensions.  I devised a way of
doing it, but my method crashed the version of the PL/I compiler that
IBM was distributing at the time (circa 1972), got stuck in the operating
system, and the O/S had to be brought down by the operators to get it
deleted.

-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

cdm@gem-hy.Berkeley.EDU (Dale Cook) (12/01/90)

In article <2630@charon.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes:
|> In article <1990Nov29.040910.7400@kithrup.COM> sef@kithrup.COM (Sean
Eric Fagan) writes:
|>  > In article <2392:Nov2902:59:0590@kramden.acf.nyu.edu>
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
|>  > >(I
|>  > >don't like this situation; converting to floating point just to multiply
|>  > >quickly on a Cray seems rather silly.)
|>  > 
|>  > Uhm... you don't have to, I don't think.  A Cyber had only one type of
|>  > multiply instruction, but if the exponent were 0, it did an integer
|>  > multiplication.  I believe Cray's do the same thing.
|>  > 
|> No.  The Cybers give indeed the lower half of the product of two integers
|> (and the other multiply instruction gives the upper part, although that is
|> not documented).  The Cray returns the upper part if the two exponents are
|> zero.  But the Cray has a 24x24->24 bit integer multiply and as a compiler
|> option you can use 24 bit integers.
|> --

Not on our Cray XMP system.  One has the option of 46 or 64 bit integers.

--- Dale Cook    cdm@inel.gov


ds@juniper09.cray.com (David Sielaff) (12/01/90)

In article <6690:Nov3006:15:3890@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Several of you have been missing the crucial point.
>
>Say there's a 300 to 1 ratio of steps through a matrix to random jumps.
>On a Convex or Cray or similar vector computer, those 300 steps will run
>20 times faster. Suddenly it's just a 15-1 ratio, and a slow instruction
>outside the loop begins to compete in total runtime with a fast
>floating-point multiplication inside the loop.
>
>Anyone who doesn't think shaving a day or two off a two-week computation
>is worthwhile shouldn't be talking about efficiency.
>
>In article <7339@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>>       Model        Multiplication Time     Memory Latency
>>       YMP          5  clock periods         18 clock periods
>>       XMP          4  clock periods         14 clock periods
>>       CRAY-1       6  clock periods         11 clock periods
>
>Um, I don't believe those numbers. Floating-point multiplications and
>24-bit multiplications might run that fast, but 32-bit multiplications?
>Do all your matrices really fit in 16MB?

On late-model X-MP's and all Y-MP's, those times are correct for
32 bit integer multiplications.  The change (from 24 to 32 bit
multiplies) corresponds to when the address space on the
Cray 1/X-MP/Y-MP line was bumped up from 24 bits to 32 bits (it was
always 32 bits on a Cray-2).

But this certainly seems to be getting an awfully long way from C ;-)

Dave Sielaff
Cray Research, Inc.

henry@zoo.toronto.edu (Henry Spencer) (12/02/90)

In article <1990Nov30.183032.5420@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>>> 5. There are many numerical libraries written for Fortran.
>>Which, given f2c, is no longer an issue.
>Does f2c handle conformant arrays properly?

I don't recall the fine points, but f2c is a full Fortran 77 "compiler";
if conformant arrays are legal, portable F77, f2c does them.

>If so, is the code that it generates maintainable?

f2c deliberately does not try to generate maintainable code.  That is hard,
and nobody has yet produced a program that can do it without human tinkering
with the output.  In one sense, f2c really ought to be cc2fc -- its primary
mission is to be a pre-pass to turn a C compiler into a Fortran compiler.
Code maintenance is still better done on the Fortran.
-- 
"The average pointer, statistically,    |Henry Spencer at U of Toronto Zoology
points somewhere in X." -Hugh Redelmeier| henry@zoo.toronto.edu   utzoo!henry

john@newave.UUCP (John A. Weeks III) (12/02/90)

In a somewhat long discussion...
> > > It is often stated that Fortran is better than C for numerical work.

> > It may not be true any more. A friend of mine brought a little fortran
> > program (It is two big do loops with some instrinsic function calculation
> > in the loop.) and the C translation of the fortran program. We compiled two
> > program on a IBM RISC System 6000/530 with xlc and xlf. To my surprise, the
> > excutable from C is faster than the excutable from Fortran by a few percent.

I hope that numerical quality is not always measured by speed.  If my
recent experience means anything, I think that Fortran implementations
generally have more well behaved numerical libraries, documented behavior,
and better error handling.  And since FORTRAN is noted for numerical work,
someone usually tests the floating point stuff before the compiler is
shipped.  I have encountered C compilers that made me wonder if + and - were
even tested.  Of course, I remember things that don't work longer than
I remember things that do work flawlessly 8-).

The speed that FORTRAN is noted for can be found on the big IBM boat
anchors.  COBOL and FORTRAN rule on those machines for speed (discounting
assembler) with C being something like 10 times slower when performing
similar tasks (based on benchmarks with the IBM & SAS compilers).  C style
I/O is especially slow on IBM mainframes because those machines work best
with record I/O and C programmers tend to think in streams of characters.

-john-

-- 
===============================================================================
John A. Weeks III               (612) 942-6969               john@newave.mn.org
NeWave Communications                 ...uunet!rosevax!tcnet!wd0gol!newave!john
===============================================================================

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (12/03/90)

In article <1990Nov30.183032.5420@ccu.umanitoba.ca>, salomon@ccu.umanitoba.ca (Dan Salomon) writes:
> I remember once trying to translate a FORTRAN fast-fourier analysis
> procedure into PL/I.  Sounds like a piece of cake, right?  The problem
> was that the FORTRAN procedure accepted arrays with not only any size
> of dimensions, but also with any number of dimensions.

I don't recall that being legal in F77.  To be sure, I've met compilers
that didn't bother to check, and I think there was a special case for
DATA statements, but argument passing?
-- 
I am not now and never have been a member of Mensa.		-- Ariadne.

tang@venus.ycc.yale.edu (12/03/90)

From article <7552@lanl.gov>, J. Giles writes:
>> [...]
>> The essential reason why I get repelled fortran, dusty decks 
>> in fortran and the mindset of traditional (read: went to graduate
>> school before 1975) fortran programmers is the terrible
>> look-and-feel.  [...]
>
>You are right in changing the subject line.  This has little to do
>with the Fortran vs. C debate.  Especially since the look-and-feel
>of C is much worse for the problem domain appropriate to Fortran.
>
>> [...]           Beautiful (read: efficient on my time)
>> programming comes with a rich appreciation of how algorithms +
>> data structures makes programs, in careful attention to the way
>> modules interact, in fine-tuning the scope and visibility of data
>> across modules, in building really reuseable modules with sterile
>> interfaces, etc.  [...]
>
>You are thinking of programming as only a self-directed activity.
>If _you_ are the only end user, then your time is the only human
>efficiency consideration.  Most programmers are concerned with
>providing a useful capability for a large set of end users (including
>themselves).  The effect of a slow program on those end users is
>too complicated to be simply measured by the additional CPU time
>cost.  If a program takes an hour to run, I will use it differently
>(and less flexibly) than if it takes only a few minutes to run.
>Fast calculations are changing the way people work in all sorts
>of problem domains.  Fast simulations, for example, allow designers
>to iterate several ideas through the code (basing each new try on
>the results of the previous one) and are improving the design of
>everything from aircraft to computers.
>
>If there were a widely available language that allowed the programmer
>to do all the things you mention above and _still_ generated fast
>code, there would be no question about switching to it.  But the
>economics of computing still values speed over programmer time
>for most of the interesting problem domains.  The increase in
>raw speed of computers is not likely to change that much since
>this only expands the domain of viable problems to a larger
>set.  There are few problems for which code is already as fast
>as anyone will ever need.  End-user expectations rise as fast
>as the hardware improves.
>
>> [...]                                      I've seen
>> single-file-programs in fortran written two years ago (not the
>> dark ages of the 60s) of 10000 lines!  (not more than a few
>> hundred lines of comments, obviously).  That is disastrous, and
>> that is the essence of the problem to me.
>
>This is ambiguous.  A single file may contain any number of
>independent Fortran procedures (so your apparent assumption
>that the program in question is not modular is faulty).
>Further, you didn't tell us what the program _does_, so
>we can't determine whether 10000 lines is overly large or
>remarkably compact.  Complex problems require large programs:
>there is no "Royal road" to good programs.
>
>But, let us assume that you mean that the program was really a single
>procedure.  This still doesn't mean that it was disastrous.  It may
>have been carefully crafted using all the techniques that you support
>and _then_ all the procedure calls were "inlined" for speed.  This is
>not a bad technique - it will only disappear when "inlining" becomes
>a widely available feature of programming languages.  There is no
>a priori reason that Fortran could not be extended to do this (it is
>one of the things that Fortran Extended's 'internal' procedures should
>be able to do).
>
>In any case, I am in complete agreement that language designers
>should try to include features that promote more efficient coding
>and maintenance.  But so far, no language is widely available
>which does so AND is as efficient as Fortran-like languages.

This posting is the best argument presented so far in this debate, I
believe.  I have included the whole posting so that one does not have
to go back to find out how the argument goes.

Tang

john@ghostwheel.unm.edu (John Prentice) (12/04/90)

In article <4421@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <1990Nov30.183032.5420@ccu.umanitoba.ca>, salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>> I remember once trying to translate a FORTRAN fast-fourier analysis
>> procedure into PL/I.  Sounds like a piece of cake, right?  The problem
>> was that the FORTRAN procedure accepted arrays with not only any size
>> of dimensions, but also with any number of dimensions.
>
>I don't recall that being legal in F77.  To be sure, I've met compilers
>that didn't bother to check, and I think there was a special case for
>DATA statements, but argument passing?
>-- 

Assuming I am reading the original posting correctly, there is nothing
to preclude this in Fortran.  Specifically, imagine the following code:

      program ralph
      real*8 a
      pointer (ipnt,a)
      integer imax,jmax,kmax,factor,nlen
c
c       memory will be allocated to array a of length imax*jmax*kmax
c
      imax=20
      jmax=40
      kmax=60
c
c        factor=8 if dynamic memory allocation wants values in bytes,
c        otherwise factor=1.  nlen is the number of whatever unit the
c        dynamic memory allocation routine wants
c
      factor=8
      nlen=factor*imax*jmax*kmax
      ... a call to malloc to allocate nlen words/bytes, returning the address in ipnt
      call fred (a,imax,jmax,kmax)
      end
      subroutine fred (john,imax,jmax,kmax)
      real*8 john(imax,jmax,kmax)
      .... do whatever he wants with the array john
      end

I am here using Cray/Sun type Fortran pointers along with a call to malloc
(either directly or using the Cray or Sun Fortran interfaces) to allocate
nlen bytes or words (as appropriate), to which the pointee array a
then refers.  I then call the subroutine and reference the array
john as a multidimensional array.  Since Fortran calls by reference, not
by value, you can do this.  This particular example is how I do most of my
memory management nowadays in Fortran.  It is not ANSI Fortran, but it is
something which every compiler I work on has in one form or another.
(I have not yet found one that doesn't, at least on workstations.  A
compiler may lack the pointer capability, in which case other avenues
are required, but almost all Fortran compilers give you some capability
of this sort.  Fortran Extended has this built in: the ALLOCATE
facility.)
In general I would not recommend changing the dimension of an array when
you make a subroutine call, but there are times that it is either necessary
(like the above case) or desirable for performance reasons (to avoid the
Cray limitation on only vectorizing the inner loop of a do-loop).

John Prentice
john@unmfys.unm.edu

salomon@ccu.umanitoba.ca (Dan Salomon) (12/04/90)

In article <1990Dec1.232408.13365@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>             ...  In one sense, f2c really ought to be cc2fc -- its primary
>mission is to be a pre-pass to turn a C compiler into a Fortran compiler.
>Code maintenance is still better done on the Fortran.

If you have to maintain the numerical libraries in FORTRAN, then you
cannot really say that you are doing your numerical work in C.
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/04/90)

In article <1990Dec4.011220.9302@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
> If you have to maintain the numerical libraries in FORTRAN, then you
> cannot really say that you are doing your numerical work in C.

One of the great advantages of the classical Fortran numerical libraries
is that they are so reliable that the code never has to be maintained. A
library is a library is a library.

---Dan

john@ghostwheel.unm.edu (John Prentice) (12/05/90)

In article <26434:Dec404:42:4990@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>One of the great advantages of the classical Fortran numerical libraries
>is that they are so reliable that the code never has to be maintained. A
>library is a library is a library.

I hate to do it, but I have to at least qualify this point. I have a good friend
who is in charge of maintaining the SLATEC and IMSL libraries at the Air
Force Weapons Laboratory (Cray 2, IBM 3070, and a bunch of workstation 
systems).  SLATEC is a joint project by Sandia, Livermore, Los Alamos, and
the Air Force Weapons Laboratory and is a large mathematical library,
somewhat like IMSL but free.  The software there is easily as good as
what I have seen in other libraries like IMSL or NAG.  However, I am
digressing.  His experience with these well-worn and well-tested libraries
is that they quite often will not compile on new machines, and will often
fail the quick checks until someone goes in and makes minor changes to the
code.  The changes are usually minor; more often than not it is just a
question of changing some floating-point test for small numbers, etc.
However, there have also been cases where the answers are just plain
wrong.  So, on a system where a library has been resident for a long
period, there is a good chance it is reliable (though that is not an
absolute certainty).  However, from what I have seen and heard of these
libraries, they are not easily transported to new systems, and
unfortunately in science new systems are always happening.  Perhaps
someone involved with SLATEC, IMSL, NAG, etc. could comment on all this.

So I basically agree with Dan's comment, but matters are perhaps not
quite as simple as his comment suggests.
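
[Editorial note: the "floating point test for small numbers" that
typically needs retuning is the sort of probe sketched below.  This is an
illustrative C version, not code from any of the libraries named above;
on machines that keep intermediates in extended-precision registers even
the volatile qualifier may not save it -- exactly the portability trap
John describes.]

  #include <stdio.h>

  /* Classic probe for the machine epsilon of double arithmetic:
     halve u until 1 + u rounds back to exactly 1. */
  int main(void)
  {
      volatile double u = 1.0, v = 2.0;

      while (v > 1.0) {
          u *= 0.5;
          v = 1.0 + u;
      }
      printf("machine epsilon is about %g\n", 2.0 * u);
      return 0;
  }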

John Prentice
Amparo Corporation
john@unmfys.unm.edu

gwyn@smoke.brl.mil (Doug Gwyn) (12/05/90)

In article <184a59d8.ARN0ebd@pilhuhn.uucp> hwr%pilhuhn@bagsend.ka.sub.org writes:
>But has anyone seen a self-vectorizing compiler for C, as there are many
>for Fortran?

There are vectorizing C compilers, particularly on large machines, but
if you're interested in comparisons you need to appreciate that Fortran
has essentially only one form of data structuring, the array, while in
C arrays are much less commonly used, other more appropriate data
structures taking their place.  Thus, while vectorization is important
for many Fortran applications, the same optimization is of much less
importance in C.  There are numerous other forms of optimization that
can be (and often are) applied in the course of generating code from C
programs.  As others have mentioned, the semantics of pointers raises
more severe aliasing concerns than apply to Fortran, so in some cases
C code must be less highly optimized than corresponding Fortran code.
This is traded off against more freedom for the C programmer, since it
is the Fortran programmer's responsibility to not alias function
arguments whereas C permits aliasing (which can at times be very useful).
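
[Editorial note: a minimal C sketch of the aliasing problem Doug
describes.  The function is hypothetical, but the point is general: the
call is legal whether or not x and y overlap, so the compiler must assume
they might.]

  /* A vectorizer cannot safely run these iterations in parallel: if
     x and y overlap (say the call is scale(a + 1, a, n)), each load
     picks up a value stored on the previous iteration.  A Fortran
     compiler may assume its dummy arguments are unaliased; a C
     compiler may not. */
  void scale(double *x, double *y, int n)
  {
      int i;

      for (i = 0; i < n; i++)
          x[i] = 2.0 * y[i];
  }

[The Numerical C Extensions Group, mentioned later in this thread, was
working on this very problem; its proposals eventually led to the
restrict qualifier of C99.]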

Anyway, discussions about code optimization should have little to do
with selection of a programming language.

salomon@ccu.umanitoba.ca (Dan Salomon) (12/05/90)

In article <4421@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>In article <1990Nov30.183032.5420@ccu.umanitoba.ca>, salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>> I remember once trying to translate a FORTRAN fast-fourier analysis
>> procedure into PL/I.  Sounds like a piece of cake, right?  The problem
>> was that the FORTRAN procedure accepted arrays with not only any size
>> of dimensions, but also with any number of dimensions.
>
>I don't recall that being legal in F77.  To be sure, I've met compilers
>that didn't bother to check, and I think there was a special case for
>DATA statements, but argument passing?

The FORTRAN subroutine had 3 formal parameters:
   1) a one dimensional array containing the data
   2) a parameter N specifying how many dimensions were
      in the actual data array.
   3) a vector of size N specifying the size of each
      of the dimensions of the actual data array.
The calling routine could pass an array of any number of dimensions
as the actual parameter, and the subroutine did its own subscripting
calculations.  Actually, the subroutine just stepped through the data
sequentially doing its thing, and needed only to know when it crossed
dimension boundaries.  PL/I has syntax for conformant arrays, but none
for passing an N-dimensional array and receiving it as a one-dimensional
array.
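
[Editorial note: the interface Dan describes translates quite directly
into C.  A sketch with invented names -- a flat data vector, a dimension
count, and a vector of dimension sizes, walked sequentially while
watching for boundaries of the fastest-varying dimension:]

  /* Visit every element of an n-dimensional array stored flat, in
     storage order.  The only structural fact needed is when a run of
     the first (fastest-varying) dimension ends. */
  long walk(double *data, int ndim, const int *dims)
  {
      long total = 1, i, boundaries = 0;
      int d;

      for (d = 0; d < ndim; d++)
          total *= dims[d];
      for (i = 0; i < total; i++) {
          if (i > 0 && i % dims[0] == 0)
              boundaries++;         /* crossed into a new innermost run */
          data[i] *= 2.0;           /* stand-in for the real work */
      }
      return boundaries;
  }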

I realized later that this anecdote has very little relevance to the
discussion at hand or to the C language, so let's forget that part of
the posting, OK?

-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) (12/05/90)

In article <1990Dec4.190148.4026@ariel.unm.edu> john@ghostwheel.unm.edu (John Prentice) writes:
>In article <26434:Dec404:42:4990@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>One of the great advantages of the classical Fortran numerical libraries
>>is that they are so reliable that the code never has to be maintained. A
>>library is a library is a library.
>
>I hate to do it, but I have to at least qualify this point. I have a good friend
>who is in charge of maintaining the SLATEC and IMSL libraries at the Air
>Force Weapons Laboratory....
>.  His experience with these well worn and tested libraries is
>that they quite often will not compile on new machines and will often
>fail the quick checks until someone goes in and makes minor changes to the
>code.  Now, the changes are usually minor, more often then not it is
>just a question of changing some floating point test for small numbers, etc...
>However, there have also been cases where the answers are just plain
>wrong....

I have had the same experience with the Harwell subroutines....

	-P.
************************f*u*cn*rd*ths*u*cn*gt*a*gd*jb**************************
Peter S. Shenkin, Department of Chemistry, Barnard College, New York, NY  10027
(212)854-1418  shenkin@cunixf.cc.columbia.edu(Internet)  shenkin@cunixf(Bitnet)
***"In scenic New York... where the third world is only a subway ride away."***

dik@cwi.nl (Dik T. Winter) (12/05/90)

In article <1990Dec4.190148.4026@ariel.unm.edu> john@ghostwheel.unm.edu (John Prentice) writes:
 >                          However, from what I have seen and heard of
 > these libraries, they are not easily transported to new systems and
 > unfortunately in science, new systems are always happening.  Perhaps
 > someone involved with SLATEC, IMSL, NAG, etc... could comment on all this.

I am not involved with those, but I have some experience porting stuff.
My stuff consists of two layers, both available in Fortran and in C.  I
have ported (part of) it to some 30 platforms.  Everything goes through
the C preprocessor because of machine peculiarities (as some parts are
written in assembler this is necessary).  However, the C preprocessor is
also used to work around compiler bugs.  There were 19 platforms that had
a Fortran compiler when I used the system; on 6 of them I needed
conditionals because of bugs.  I am now porting the highest layer, in C.
I have ported that to two platforms so far, and on one of those I needed
to work around a compiler bug.  So much for porting an (in principle)
perfectly portable package.  And this package contains only a fraction of
the code that is in the libraries mentioned above.  So what is involved:

1.  Many compilers have bugs that you may encounter; especially if the code
    is large.  (My favourite is the 68000 compiler that generates VAX
    assembler for some legal constructs.  But I have also seen compilers
    generate non-existing, apparently legal, instructions.)
2.  Do not rely on properties that seem intuitive.  See the ongoing
    debate on pointers in C (especially NULL).  But also do not expect the
    construct  'sqrt(1-sin(x))'  to be valid, because it will trap on some
    machines; see the sketch after this list.  (To quash any arguments:
    within your precision constraints it is perfectly possible that
    sin(x) > 1.)
3.  Especially if you have a large body of code, be prepared to customize it
    to every platform you may encounter.  Allow the use of preprocessors to
    do this (C preprocessor, m4, your own special built macro processor, etc.).
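
[Editorial note: a C sketch of the sqrt(1-sin(x)) hazard from point 2
above.  The clamp is the usual defensive fix, on the assumption that a
slightly negative argument is rounding noise rather than a logic error.]

  #include <math.h>

  /* sin(x) may round to a value a hair above 1.0, making 1.0 - sin(x)
     slightly negative; sqrt of that traps on some machines.  Clamping
     at zero keeps the expression portable. */
  double safe_term(double x)
  {
      double t = 1.0 - sin(x);

      return sqrt(t > 0.0 ? t : 0.0);
  }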

I know that NAG uses 3 (and the last count I heard was over 80 platforms).
They use generic sources and a preprocessor that customizes to the platform
required.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

dik@cwi.nl (Dik T. Winter) (12/05/90)

In article <14651@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
 > Anyway, discussions about code optimization should have little to do
 > with selection of a programming language.

In a way: yes.  But the speed of the resulting code may be important.  You
mentioned the aliasing problem that prevents vectorization of code.  That *is*
an important aspect, because it can have a tremendous impact on the speed of
the algorithm (as much as two orders of magnitude).  Consider where the
weather forecasting bureau in Reading (UK) would be if they had programmed
their Cray using pointers in C: with their current model they would barely
manage to deliver yesterday's forecast by tomorrow.  (Some might argue that
that would be no worse than it is now.)  Their job is largely numerical, in
large part research (to get the correct model), and it is properly done in
Fortran.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

scs@adam.mit.edu (Steve Summit) (12/05/90)

Are you all still arguing about C and Fortran?  The discussion
has gone from responding to a few inappropriate criticisms of C
(which was reasonable) to trying to downplay examples in which
Fortran might well be superior (which is silly) to discussing
multiplication vs. memory access time (which belongs in
comp.arch, if anywhere).

In article <1990Dec4.011220.9302@ccu.umanitoba.ca> salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>In article <1990Dec1.232408.13365@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>>Code maintenance is still better done on the Fortran.
>
>If you have to maintain the numerical libraries in FORTRAN, then you
>cannot really say that you are doing your numerical work in C.

Henry isn't saying you should do your numerical work in C, nor am I.
If your data structures are (as has recently been asserted) all
arrays, and you don't mind a few of Fortran's other weaknesses,
use it in good health.  C is better than a lot of people think it
is at numerical work, but it certainly isn't perfect, and C
apologists don't need to get up in arms when someone proposes an
example which Fortran can probably handle better.  Numerical work
has never been C's claim to fame, anyway.

                                            Steve Summit
                                            scs@adam.mit.edu

dik@cwi.nl (Dik T. Winter) (12/05/90)

(Yes, I can also play with Followup-To.  Apparently Doug Gwyn does not want
a discussion about the merits of C versus Fortran in the C group.  This is
the second time such a thing has happened to one of my followups, but I gave
an answer to the following remark; you can find it in comp.lang.fortran.)

In article <14651@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
 > Anyway, discussions about code optimization should have little to do
 > with selection of a programming language.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

gl8f@astsun8.astro.Virginia.EDU (Greg Lindahl) (12/05/90)

In article <14651@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:

>Anyway, discussions about code optimization should have little to do
>with selection of a programming language.

Great. I'll use my Cray allocation running interpreted Smalltalk.

:-)

sarima@tdatirv.UUCP (Stanley Friesen) (12/06/90)

In article <2392:Nov2902:59:0590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <7200@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>> There is also no storage overhead
>>     associated with keeping arrays of pointers. For multi-dimensional
>>     problems, this overhead could be quite large.
 
>... That's true, but
>it's really not a problem. If you have a 5 by 5 by 2 by 3 by 15 array,
>can you begrudge space for thirty pointers so that you save 5% of your
>run time?

Except for one thing: in scientific computation typical dimensions would be
more like 1000 by 1000 by 1000 by 1000 by 50, which requires a great deal
more than a mere thirty pointers.  [In biological work the arrays may be
even larger, though in that case there are usually fewer dimensions.]

Really, I have done some *small* studies with hundreds of rows/columns.
[In fact my data set was so small that I failed to produce usable results;
I actually needed something like 5 times as many 'rows'.]
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

john@ghostwheel.unm.edu (John Prentice) (12/06/90)

In article <2642@charon.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
>...  Everything goes through the C preprocessor
>because of machine pecularities (as some parts are written in assembler this
>is necessary).  However, the C preprocessor is also used to avoid compiler
>bugs.

   and after some more comments ...

>3.  Especially if you have a large body of code, be prepared to customize it
>    to every platform you may encounter.  Allow the use of preprocessors to
>    do this (C preprocessor, m4, your own special built macro processor, etc.).
>

For what it is worth, I would second this suggestion.  We used to use a
home-grown preprocessor called SAIL, which we wrote back in 1972 before
such things were generally available.  Now we use the C preprocessor for
this.  The C preprocessor is pretty limited in capabilities, but it is
usually sufficient.  The big problem with the C preprocessor is that damn
near every one I have used is a little different!  For example, the one
that Borland sends out with Turbo C and Turbo C++ left-justifies every
line of the processed code (dumb!).  The Decus C preprocessor issued by
the C User's Group does the same thing.  Then you find that the Cray C
preprocessor can't handle all the preprocessor constructs listed in
version 2 of K&R (i.e., the ANSI version).  The Sun versions seem to work
pretty much okay.  All this can be a real pain unless you limit yourself
to really simple uses.  We have not had to use the C preprocessor to work
around compiler bugs, however; so far we use it mostly to separate
machine-dependent code (for example, array constructs for the CM2 versus
loops on every other computer).

John Prentice
john@unmfys.unm.edu

rob@mowgli.eng.ohio-state.edu (Rob Carriere) (12/06/90)

Would everybody please remove the word `(SUMMARY)' from the subject line?
Thanks.

SR
"We now rejoin the following discussion, already in progress"
---

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (12/06/90)

In article <1990Dec3.191100.22176@ariel.unm.edu>, john@ghostwheel.unm.edu (John Prentice) writes:
> In article <4421@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> >In article <1990Nov30.183032.5420@ccu.umanitoba.ca>, salomon@ccu.umanitoba.ca (Dan Salomon) writes:
> >> The problem
> >> was that the FORTRAN procedure accepted arrays with not only any size
> >> of dimensions, but also with any number of dimensions.

> >I don't recall that being legal in F77.  To be sure, I've met compilers
> >that didn't bother to check, and I think there was a special case for
> >DATA statements, but argument passing?

> Assuming I am reading the original posting correctly, there is nothing
> to preclude this in Fortran.  Specifically, imagine the following code:
>       real*8 a
>       pointer (a,ipnt)		<========================

My question was specifically about F77, by which I meant the 1978 ANSI
Fortran standard.  "pointer" is not part of that language.

I have seen some disgusting stuff in the Harwell library where the
programmers did

	REAL X(N)
	CALL FOO(X(1), X(3), ...)

and FOO used the first argument as a pointer to the beginning of an array,
and the *difference* between the second argument and the first as the
"stride" between array elements.  That meant that you could do

	CALL FOO(M(I,1), M(I,2), ...)
	CALL FOO(M(1,I), M(2,I), ...)

FOO, of course, had been coded in assembler.  I had the extremely unpleasant
task of trying to port a program using this stuff to a B6700, where there
wasn't any assembler, nor was there any hardware support for subtracting
pointers.  The question stands:  is passing a scalar (D=0) or an array
taking D subscripts to a subprogram expecting an array taking E.ne.D
subscripts legal in standard Fortran 77?
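
[Editorial note: for what it is worth, the Harwell stride trick is
expressible in portable C, because subtracting two pointers into the same
array is well defined there.  A sketch with invented names:]

  /* foo recovers the stride between successive elements by subtracting
     its first two pointer arguments, then walks the array with that
     stride -- the C analogue of the CALL FOO(M(I,1), M(I,2), ...)
     idiom described above. */
  double foo(const double *first, const double *second, int n)
  {
      long stride = second - first;   /* stride in elements, not bytes */
      double sum = 0.0;
      int j;

      for (j = 0; j < n; j++)
          sum += first[j * stride];
      return sum;
  }

[With double m[N][LDA], the call foo(&m[0][i], &m[1][i], N) sums the i-th
column, while foo(&m[i][0], &m[i][1], LDA) sums the i-th row.]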

-- 
The Marxists have merely _interpreted_ Marxism in various ways;
the point, however, is to _change_ it.		-- R. Hochhuth.

john@ghostwheel.unm.edu (John Prentice) (12/06/90)

In article <4450@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>> Assuming I am reading the original posting correctly, there is nothing
>> to preclude this in Fortran.  Specifically, imagine the following code:
>>       real*8 a
>>       pointer (a,ipnt)		<========================
>
>My question was specifically about F77, by which I meant the 1978 ANSI
>Fortran standard.  "pointer" is not part of that language.
>

   and then later...

> ...The question stands:  is passing a scalar (D=0) or an array
>taking D subscripts to a subprogram expecting an array taking E.ne.D
>subscripts legal in standard Fortran 77?
>

Sorry I confused you with pointers.  Now you have confused me with that
last paragraph.  To give a simple answer, though: if I followed your
original question, it can be condensed to whether the following is legal
in Fortran 77:
      program test
      real a(25)
      call ralph (a)
      end
      subroutine ralph (b)
      real b(5,5)
         .
         .
         .
      end
and the answer is yes, it is perfectly legal in ANSI Fortran 77.
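
[Editorial note: C permits essentially the same reshape.  A sketch -- a
flat block viewed through a pointer-to-array type, though here the row
length must be a compile-time constant; the names are invented for
illustration.]

  /* View a flat 25-element array as a 5x5 matrix -- the C analogue of
     passing a(25) to a subroutine that declares b(5,5). */
  static double a[25];

  static void ralph(double (*b)[5])
  {
      b[2][3] = 1.0;                /* flat offset 2*5 + 3 */
  }

  int main(void)
  {
      ralph((double (*)[5]) a);     /* reinterpret a as 5 rows of 5 */
      return 0;
  }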

By the way, my use of pointers was not even relevant to your question;
I just happened to be playing with that form of dynamic memory allocation
at the time.  In any case, the pointers did not in any way affect how the
subroutine was called or how it used the array.  All the pointers did was
allow me to dimension the array in the main routine on the fly.  It was a
poor choice of example for answering a simple question.  My apologies.

John Prentice
john@unmfys.unm.edu

dik@cwi.nl (Dik T. Winter) (12/07/90)

In article <4450@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
 >            The question stands:  is passing a scalar (D=0) or an array
 > taking D subscripts to a subprogram expecting an array taking E.ne.D
 > subscripts legal in standard Fortran 77?

Perfectly legal.  Note however that the first (pass a scalar when an array
is expected) is only valid if the scalar is an array element.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

tang@venus.ycc.yale.edu (12/07/90)

Marc Roussel writes
>     I am in graduate school in a very active research group.  Almost all
>of us use Fortran in this department, except for system administration
>and X-window programming.  I think that your comments underestimate the
>strength of Fortran.  Physicists, chemists and other natural scientists have
>different needs and interests than computer scientists.  Fortran suits
>many of these needs and will continue to be learned by natural
>scientists although I concede that computer scientists can now safely
>ignore it.  Fortran has returned to the niche for which it was originally
>intended.  Similarly, Cobol is no longer used to write OS's, but will
>probably continue to be used in business programming as it provides a
>natural interface in this environment.  (I know a lot less about Cobol
>than I do about Fortran usage... If Cobol is really dying, as the CS
>types are constantly assuring me, then you may ignore the last sentence
>of this paragraph.)

Absolutely!  

However, since languages are part of CS, if there were enough CSists
around to convince the people who have not yet made up their minds, we
might see a demise of Fortran.

There is a growing trend towards using more elaborate data structures.
For example, discrete-particle simulations are frequently used in numerical
statistical mechanics and in studying the evolution of solid structures in
a random growth environment (diffusion is really a random process).  In
just these two cases, sorting and random access of elements are likely to
be the most time-consuming tasks, and Fortran may not be the appropriate
language (correct me if I am wrong).

IMHO, these capabilities should be added to the language (Fortran 90 may
have them; I do not know much about F90) so that there is no need to
switch to a different language.  I would like to make another point, on
the format of the program text.  I like the format of F77: I know exactly
where one line begins and ends, without having to search over the page
for a semicolon.  Multiple-line statements are not a problem in F77,
since you can always put a continuation character in the 6th column.

Tang Wong

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/07/90)

In article <72@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
> Except for one thing. In scientific computation typical dimensions would be
> more like 1000 by 1000 by 1000 by 1000 by 50, which requires a great deal more
> than a mere thirty pointers.

Fine. In your 50-terabyte computation, can you begrudge space for 2050
pointers so that you save 5% of your run time? I think so.

---Dan

dodson@convex.COM (Dave Dodson) (12/07/90)

In article <5760:Dec618:20:2290@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <72@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
>> Except for one thing. In scientific computation typical dimensions would be
>> more like 1000 by 1000 by 1000 by 1000 by 50, which requires a great deal
>> more than a mere thirty pointers.
>
>Fine. In your 50-terabyte computation, can you begrudge space for 2050
>pointers so that you save 5% of your run time? I think so.

Wouldn't that be 50+ million pointers?  If the Fortran array is dimensioned
(1000,1000,1000,50) then the C array would be [50][1000][1000][1000], so
you would have 50 pointers pointing into 50,000 pointers pointing into
50,000,000 pointers pointing to the beginnings of linear arrays of length
1000.

----------------------------------------------------------------------

Dave Dodson		                             dodson@convex.COM
Convex Computer Corporation      Richardson, Texas      (214) 497-4234

tps@chem.ucsd.edu (Tom Stockfisch) (12/07/90)

In article <72@tdatirv.UUCP> sarima@tdatirv.UUCP (Stanley Friesen) writes:
>In article <2392:Nov2902:59:0590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>In article <7200@lanl.gov> ttw@lanl.gov (Tony Warnock) writes:
>>> There is also no storage overhead
>>>     associated with keeping arrays of pointers. For multi-dimensional
>>>     problems, this overhead could be quite large.

>>... That's true, but it's really not a problem.
>>If you have a 5 by 5 by 2 by 3 by 15 array, can you begrudge space
>>for thirty pointers so that you save 5% of your run time?

>Except for one thing. In scientific computation typical dimensions would be
>more like 1000 by 1000 by 1000 by 1000 by 50, which requires a great deal more
>than a mere thirty pointers.

Let's see: assuming 4-byte single-precision elements,
one of your "typical" objects is... 200 trillion bytes!

Yow, you do some _serious_ computing!

Assuming a 4-byte pointer, the pointer overhead is still only 2%.

Oops, 32 bits can't address 200 trillion bytes.
Must be 64-bit pointers.

OK, so you have a 4% pointer overhead.



Disclaimer:    :-)
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu

userAKDU@mts.ucs.UAlberta.CA (Al Dunbar) (12/09/90)

In article <671.275e3c4a@venus.ycc.yale.edu>, tang@venus.ycc.yale.edu writes:
>Marc Roussel writes
<<<deletions>>>
>
>However, since languages are part of CS, if there were enough CSists
>around to convince the people who have not yet made up their minds, we
>might see a demise of Fortran.
>
>There is a growing trend towards using more elaborate data structures.
>For example, discrete-particle simulations are frequently used in numerical
>statistical mechanics and in studying the evolution of solid structures in
>a random growth environment (diffusion is really a random process).  In
>just these two cases, sorting and random access of elements are likely to
>be the most time-consuming tasks, and Fortran may not be the appropriate
>language (correct me if I am wrong).
 
You *are* wrong, but you will have to correct yourself :-)
 
>
>IMHO, these capabilities should be added to the language (Fortran 90 may
>have them; I do not know much about F90) so that there is no need to
>switch to a different language
 
Generalized data structures *are* in Fortran 90.  In addition,
Fortran will continue to have more primary data types than
C (e.g., LOGICAL and COMPLEX).
 
-------------------+-------------------------------------------
Al Dunbar          |
Edmonton, Alberta  |  "this mind left intentionally blank"
CANADA             |          - Manuel Writer
-------------------+-------------------------------------------

mat@mole-end.UUCP (Mark A Terribile) (12/09/90)

> > ... To access a random element of the array takes two additions and two
> >memory references. In contrast, to access a random element of a flat array
> >takes two additions, a multiplication, and a memory reference. On most
> >widely used machines, a multiplication is quite a bit slower than a
> >memory reference, particularly a cached memory reference.  ...

>     With respect to speed, almost all machines that I have used during
>     the last 25 or so years have had faster multiplications than
>     memory accesses. (I have been doing mostly scientific stuff.) ...

Please note that superscalar machines may change this again!  If a superscalar
machine has fewer multiplication resources than instruction pipelines, the
memory lookup may once again win, depending upon how much other stuff is done
to the datum being looked up.  Superscalar may not be a factor for the very
fastest machines (immersion cooled, GaAs, ECL, ballistic semiconductor,
whatever) but it will probably become more important for engineering
workstations and for the second tier of fast machines (mini-supers, whatever).

But then, I could be wrong ...

Say, whatever happened to the Numerical C Extensions Group?
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

jerry@violet.berkeley.edu (Jerry Berkman) (12/28/90)

In article <10444:Nov3008:15:4590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Here's a real example, which interested people can probably use for
>benchmarks to settle this argument. But if you just care about the
>theoretical point, skip to ``extra''.
>
>Consider the following array code:
>
>  for (j = 0;j < N;j++)
>    a[i][j][k] *= y;
>
>Here i and k were computed in some non-vectorizable way. i ranges from 0
>through M - 1, and k ranges from 0 through P - 1.
>
>With flat arrays in C (or Fortran) you might say instead
>
>  for (j = 0;j < N;j++)
>    a[(i * N + j) * P + k] *= y;
>
	...
>
>In C you can go a bit further. Set up an array of pointers to a[i*N*P]
>for each i. Say double *(b[M]), with b[i] == &(a[i * N * P]). Now
>a[i*N*P + j*P + k] is the same as b[i][j*P + k], so you can write
>
>  for (j = 0;j < N;j++)
>    b[i][j * P + k] *= y;
>
>Now instead of adding (i * N * P + k) * sizeof(double) to the base of a,
>the compiler will add i * sizeof(ptr) to the base of b, access that
>value, and add k * sizeof(double) to that.
   ...
>So by using the extra pointers, b[i], we do one memory access instead of
>a multiplication.

Why do this?  The original is looping through a sequence of values
P words apart in memory, i.e. it's a vector with a stride of P.
On a vector machine, once the initial address and stride are computed,
there is no more work in address computation.

Consider the following loops:

	parameter (lenj=1000)
	dimension a1(30,lenj,30),a2(30,lenj,30),b(lenj),c(lenj),d(lenj)
c	i, k, and the scale factors stand in for values computed in some
c	non-vectorizable way before the loops; the array contents are
c	irrelevant for the timing
	i = 15
	k = 15
	afact = 1.5
	dfact = 2.5

	do j=1,lenj
		c(j) = b(j)
	end do

        do j=1,lenj
		a1(i,j,k) = a1(i,j,k)*afact
	end do

	do j=1,lenj
		d(j) = d(j) * dfact
	end do

        do j=1,lenj
		a2(i,j,k) = a1(i,j,k)*afact
	end do

On our Cray X-MP, they all take about the same amount of time:
1214, 1401, 1378, and 1281 machine cycles respectively.
That's about 1.2-1.4 cycles per iteration of the loop.
Adding an array of pointers in memory complicates the program,
without significant gain.  It might speed up the initialization
of the loop, but that's not where most of the time is spent.
And most programs don't have simple loops with only a one-line
loop body.

>
>---Dan

	- Jerry Berkman, U.C. Berkeley, jerry@violet.berkeley.edu