[comp.lang.misc] NOT Educating FORTRAN programmers to use C

karl@haddock.ima.isc.com (Karl Heuer) (01/12/90)

In article <14186@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>...at least in principle, Fortran should be faster at character manipulation
>than C.  [null-terminated strings are less efficient]

It depends on what operations you're trying to perform.  Some are faster with
counted strings, others with null-terminated strings.
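
A minimal C sketch of that trade-off, for illustration only. The
counted_str type and all four helper names are hypothetical, not taken from
any library; the sketch just shows which representation has to do more work
for two common operations.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: explicit length plus a pointer to the bytes. */
struct counted_str {
    size_t len;
    const char *data;
};

/* Length of a counted string: one field read, independent of the contents. */
size_t counted_length(struct counted_str s)
{
    return s.len;
}

/* Length of a null-terminated string: strlen scans up to the terminator. */
size_t terminated_length(const char *s)
{
    return strlen(s);
}

/* Suffix of a null-terminated string starting at offset k (k <= length):
   a single pointer adjustment, with no length to keep in sync. */
const char *terminated_suffix(const char *s, size_t k)
{
    return s + k;
}

/* The same suffix of a counted string: pointer and count both change. */
struct counted_str counted_suffix(struct counted_str s, size_t k)
{
    struct counted_str t = { s.len - k, s.data + k };
    return t;
}
```

Taking the length favors the counted form; stepping through or chopping the
front off a string favors the terminated form, since only one value is carried.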

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint

karl@haddock.ima.isc.com (Karl Heuer) (01/13/90)

In article <14191@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>From article <649@chem.ucsd.EDU>, by tps@chem.ucsd.edu (Tom Stockfisch):
>>[Because the F77 standard allows a double precision array to be equivalenced
>>to an arbitrary point in a real array, double precision objects might not be
>>aligned and hence Fortran cannot make use of the double-word fetch.]
>
>...The only way for optimization to be inhibited would be if he equivalenced
>two different double word objects to an array of singles - AND the two
>equivalences were an odd number of words apart!  In this case, most
>implementations I've seen will still do the fast loads/stores on the
>properly aligned data and will issue a warning about the other.

And if the misaligned double is passed (by reference) as an argument to a
separately-compiled function, does it still work?  If so, it seems that the
compiler is forced to assume possible misalignment of doubles received as
formal parameters.  This is not terribly efficient.

>>A reasonable C implementation on this machine would simply require
>>type "double" to be aligned on 8 byte boundary.
>
>Which just means that C CAN'T do something that Fortran CAN.

Much like the way that Fortran CAN'T support a function that can be passed the
same object in two distinct arguments, because it wants to be able to assume
the lack of aliasing.

>And, if C DID allow doubles on odd words, it would face the same optimization
>problems - worse: because the compiler can't necessarily tell if a
>pointer-to-double will be properly aligned until run-time.

You're hypothesizing a feature and then assuming that it would be badly
implemented.  If I were implementing a language extension to allow misaligned
objects (not necessarily restricted to |double|), I would do so with a new
type qualifier |unaligned|.  Given |int unaligned x|, |&x| would have the
type |int unaligned *|, which could not be safely stored in a plain |int *|.
Fetching would be done the "hard way" only if the object has the |unaligned|
qualifier.  If my earlier conjecture about Fortran's assumptions is correct,
this C extension would be more efficient than Fortran.
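
Standard C has no |unaligned| qualifier, so the following is only a sketch of
what the "hard way" fetch could look like next to an ordinary one. The
function names are hypothetical; memcpy is used because it places no
alignment requirement on its arguments.

```c
#include <string.h>

/* Load a double from an address that may not be suitably aligned.
   This is roughly the code a compiler would have to generate for an
   object carrying the hypothetical |unaligned| qualifier. */
double load_unaligned_double(const void *p)
{
    double d;
    memcpy(&d, p, sizeof d);   /* byte-wise copy, valid at any address */
    return d;
}

/* Ordinary load through a plain pointer: the compiler may assume the
   pointer is properly aligned and use the fast fetch. */
double load_aligned_double(const double *p)
{
    return *p;
}
```

Only code that asks for the first form pays for it; everything declared
without the qualifier keeps the second.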

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint
Followups to comp.lang.misc.

johnl@esegue.segue.boston.ma.us (John R. Levine) (01/13/90)

On the last machine I saw that had problems with misaligned doubles, a
360/91, the Fortran compiler generated code assuming that there were no
alignment problems and the runtime system caught the alignment fault and
fixed it up, slowly, allowing the 99.9% of programs without alignment
problems to run at full speed and permitting the rest at least to run.

When IBM went to the 370 series they permitted misaligned operands, and that
particular problem went away.  In an article in the CACM a few years ago one
of the designers of the 370 said that in practice very few 370 programs use
misaligned operands and in retrospect the performance loss wasn't worth it,
they should have stuck with alignment.

This suggests that while something like explicit unaligned pointers would
work, a better solution is to write your programs so they don't have
alignment problems in the first place.  It's not very hard; most of the
misaligned data I've seen is either due to badly written I/O routines or
overenthusiastic bit squishing.  I have never understood why so many unix
programs (particularly those written in Berkeley, California) have alignment
problems, it must be all that VBD.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

mjs@hpfcso.HP.COM (Marc Sabatella) (01/17/90)

>Actually enough errno nonsense and stuff was added so that SVID, ANSI,
>POSIX are all modestly different .... and none as optimizable as
>Fortran.
>...
>On any given machine, with equal efforts
>given to the optimizer technology ... Fortran wins.

I disagree with this statement, at least when taken at face value.

It is true all the different standards have their own conception of which
functions are "builtin".  This is also true (on a vendor-by-vendor basis) of
Fortran.  If you limited Fortran math optimizations to deal only with those
functions actually specified by Fortran 77, you'd probably be on a par with C.
And if you allow optimization of vendor specific intrinsics, on the assumption
that anyone writing Fortran code on your system *knows* these are intrinsics
and *cannot* redefine them, you could do the same for C.  The difference is
in our perceptions, and is not inherent in the languages.

In any case, once you leave the realm of intrinsics, Fortran is left in the
dust.  With all parameters passed by reference (non-standard extensions aside)
you simply can't do as much as you can with C.  You have to assume such
atrocities as passing in one element of an array or common block might actually
modify any succeeding elements.  And if you want to deal with perceptions,
consider the fact that Fortran code is often "spaghetti" and difficult to
optimize (although it is just as possible to write bad code in C).

--------------
Marc Sabatella (marc%hpfcrt@hplabs.hp.com)
Disclaimers:
	2 + 2 = 3, for suitably small values of 2
	Bill and Dave may not always agree with me

johnl@esegue.segue.boston.ma.us (John R. Levine) (01/17/90)

In article <8960003@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
>It is true all the different standards have their own conception of which
>functions are "builtin".  This is also true (on a vendor-by-vendor basis) of
>Fortran.  If you limited Fortran math optimizations to deal only with those
>functions actually specified by Fortran 77, you'd probably be on a par with
>C.

The Fortran 77 standard allows an implementation to have as many intrinsics
as desired.  Functions can be expanded in-line, moved out of loops, and
otherwise optimized, unless you specifically declare them external to tell it
to use one that you wrote in preference to any intrinsic.

>You have to assume such atrocities as passing in one element of an array or
>common block might actually modify any succeeding elements.

Fortran very specifically prohibits invisible aliasing among arguments and
common, the optimizer is allowed to make the most optimistic assumptions in
this case.  This is the place where Fortran wins biggest over C, and was the
motivation for the "noalias" hack proposed for ANSI C.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

mjs@hpfcso.HP.COM (Marc Sabatella) (01/18/90)

>The Fortran 77 standard allows an implementation to have as many intrinsics
>as desired.  Functions can be expanded in-line, moved out of loops, and
>otherwise optimized, unless you specifically declare them external to tell it
>to use one that you wrote in preference to any intrinsic.

And similarly, XPG3 defines oodles of C functions which are illegal to
redefine, and an optimizer can do the same things with those that it can with
Fortran intrinsics.  And user defined static functions can be expanded in-line,
as can global ones (provided the original definition stays intact for other
compilation units).  I still fail to see the win here.

>Fortran very specifically prohibits invisible aliasing among arguments and
>common, the optimizer is allowed to make the most optimistic assumptions in
>this case.

Show me the optimizer that makes these assumptions, and I'll show you one that
breaks the code of every customer we have, and I am not sure I believe it, in
any case.  Do you mean it is even illegal to pass the same argument in two
different positions of a formal parameter list?  For instance:

subroutine foo (x, y)

...

call foo (a, a)

res@cbnewsc.ATT.COM (Rich Strebendt) (01/19/90)

In article <1990Jan17.042608.447@esegue.segue.boston.ma.us>, johnl@esegue.segue.boston.ma.us (John R. Levine) writes:
> In article <8960003@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
> >If you limited Fortran math optimizations to deal only with those
> >functions actually specified by Fortran 77, you'd probably be on a par with
> >C.
> 
> The Fortran 77 standard allows an implementation to have as many intrinsics
> as desired.  Functions can be expanded in-line, moved out of loops, and
> otherwise optimized, unless you specifically declare them external to tell it
> to use one that you wrote in preference to any intrinsic.
> 
> Fortran very specifically prohibits invisible aliasing among arguments and
> common, the optimizer is allowed to make the most optimistic assumptions in
> this case.  

It seems that this discussion has gotten to the "my optimizer is
bigger than your optimizer" stage.  Allow me to address this topic
from another perspective.

I prefer to write code in C (though I have also spent many years
writing in FORTRAN) because it allows me to better describe my INTENT
to the compiler and let it generate code that needs LESS optimization.
For example, I expect the compiler to generate cleaner code for a
structure copy than for an element-by-element copy of that structure,
perhaps using block move capabilities in the hardware.  Similarly,
using the x += y construct instead of X=X+Y will be a winner for the
compiler (especially if x is a multidimensional array) and will
require no pattern matching abilities in the optimizer to detect --
the optimizer will not even see the second construct in the compiler's
output.  There are a number of other examples that could be given
(such as incrementing a pointer rather than incrementing an index for
an array then recomputing the address of the array element), but
perhaps these will suffice to make my point.

In considering the final result, one should not ignore information
that the programmer can supply as clues to the compiler and optimizer
to obtain a better result.  It should also be kept in mind that,
while it is fun to figure out how to optimize very exotic constructs,
the most frequently occurring construct in any programming language is
equivalent to the construct x += 1

					Rich Strebendt
					...!att!ihlpb!res

johnl@esegue.segue.boston.ma.us (John R. Levine) (01/19/90)

In article <8960004@hpfcso.HP.COM> mjs@hpfcso.HP.COM (Marc Sabatella) writes:
>Do you mean it is even illegal to pass the same argument in two
>different positions of a formal parameter list?  For instance:
>
>subroutine foo (x, y)
>call foo (a, a)

That's exactly what I mean, if foo changes x or y.  Quoting from my trusty
X3.9-1978 (don't leave home without it) section 15.9.3.6, "If a subprogram
reference causes a dummy argument in the referenced subprogram to become
associated with another dummy argument in the referenced subprogram, neither
dummy argument may become defined during execution of that subprogram."  They
then give exactly your example as something not to do.  The next paragraph
makes the same rule about arguments and common blocks.

In fact, this is just about the least portable Fortran you can write.  Some
Fortrans indirectly address scalar parameters, others copy scalar arguments
in and out.  Since there is no way to predict in what order arguments are
copied back, it is completely undefined which of x or y gets stored last.
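
For contrast, a small C sketch of the same situation (the function name
add_twice is made up for illustration). In C the aliased call is legal, so
the compiler has to reload *y after the first store; under the Fortran rule
quoted above it could keep the value in a register.

```c
/* Adds *y into *x twice.  With distinct objects the result is x + 2*y. */
void add_twice(int *x, const int *y)
{
    *x += *y;   /* if x and y point at the same object, this changes *y too */
    *x += *y;   /* so this second add sees the already-updated value */
}
```

Called as add_twice(&a, &b) with a = b = 1 this leaves a equal to 3; called
as add_twice(&a, &a) with a = 1 it leaves a equal to 4.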
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

mjs@hpfcso.HP.COM (Marc Sabatella) (01/20/90)

>Show me the optimizer that makes these assumptions, and I'll show you one that
>breaks the code of every customer we have, and I am not sure I believe it, in
>any case.  Do you mean it is even illegal to pass the same argument in two
>different positions of a formal parameter list?  For instance:
>
>subroutine foo (x, y)
>
>...
>
>call foo (a, a)

Before the whole world jumps on my case, I now know this *is* illegal.
That's what I get for deciding what legal Fortran is based on only a textbook
and code our customers have ported from the Vax :-(
It apparently is a big enough concern that we had to provide an option to
turn off assumptions about no aliasing, and many of our customers find themselves
forced to use it.

dave@PRC.Unisys.COM (David Lee Matuszek) (01/20/90)

In article <12950@cbnewsc.ATT.COM> res@cbnewsc.ATT.COM (Rich Strebendt) writes:

>  It should also be kept in mind that,
>while it is fun to figure out how to optimize very exotic constructs,
>the most frequently occurring construct in any programming language is
>equivalent to the construct x += 1

Er, not quite *any* programming language.  I have been programming
professionally in Prolog for the last five years, and I need to do
this sort of thing maybe once or twice a year.

BTW, here's how to do it:

	increment(Variable) :-
		retract(value(Variable, OldValue)),
		NewValue is OldValue + 1,
		assert(value(Variable, NewValue)).

	..., increment(x), ...

but as this makes two modifications to the database and is therefore
pretty expensive (not to mention poor style), you don't want to do it
too often.

Moral:  "Toto, I don't think we're in Kansas anymore."

-- Dave Matuszek (dave@prc.unisys.com)
-- Unisys Corp. / Paoli Research Center / PO Box 517 / Paoli PA  19301
-- Any resemblance between my opinions and those of my employer is improbable.
  << Those who fail to learn from Unix are doomed to repeat it. >>

jlg@lambda.UUCP (Jim Giles) (01/20/90)

From article <12950@cbnewsc.ATT.COM>, by res@cbnewsc.ATT.COM (Rich Strebendt):
> [...]
> It seems that this discussion has gotten to the "my optimizer is
> bigger than your optimizer" stage.  [...]

Not at all.  So far, the discussion has been restricted to things that
could be optimized in one language and not in the other.  Fortran permits
optimization of procedure arguments that C must assume are aliased - no
optimizer, no matter how sophisticated, can eliminate this difference.
Fortran allows intrinsic functions to be optimized, C hasn't any intrinsics
(I know, the new standard permits such - show me an available implementation
before you mention it again).

On the other hand, C can optimize procedure calls (or even 'inline' them)
provided that the procedure and the call are in the same scope (file).
I don't know of any C implementation that actually does this, but it _is_
permitted by the existing (de facto) language specification.  (Note that
Fortran 90 also allows this for internal procedures and for procedure
calls within MODULES.  Just like C and intrinsics, this is not a fair
comparison until such time as implementations exist which actually perform
the optimization.)

> [... C is better because it ...] allows me to better describe my INTENT
> to the compiler and let it generate code that needs LESS optimization.

Actually, C is worse at this in some ways and only average in others.
For example, When I wish to pass an array to a procedure, Fortran (and
Pascal, Modula2, ADA, ...) lets me do so.  In C, the procedure always
sees a pointer instead of an array.  Another example: when writing
numerical programs you often need to force the order of evaluation
of an expression to avoid over/underflow or to eliminate cancellation
in subtractions.  Fortran (Pascal, Modula2, ADA, ...) lets me do so
with parentheses - C forces me to introduce temporary variables and
_hope_ that the optimizer doesn't actually do a store/load.

> [...]                                                   Similarly,
> using the x += y construct instead of X=X+Y will be a winner for the
> compiler (especially if x is a multidimensional array) and will
> require no pattern matching abilities in the optimizer to detect --

This is a particularly bad example for you to use.  The optimization
that you want the compiler to do without is called "common subexpression
elimination".  It is important that your compiler know how to do this
optimization even if you do have a '+=' operator.  I wouldn't ever buy
a compiler which didn't have this capability.  A compiler without this
capability is somewhat behind the state-of-the-art anyway (by about 30
years).

Another reason this is a bad example is the nature of the operator itself.
It is obviously very easy to implement.  This kind of thing was known
before C was invented.  Certainly language designers have known about it
in C for the last 15-18 years.  YET: there has been no stampede among
language designers to put these operators into their languages.  Nor has
there been a groundswell of user support (neither to the standards
committees nor to the vendors) for these operators to be added to existing
languages.  In fact, aside from C users, there seems to be no interest
in these operators at all.

One last reason that this is a bad example is that these operators work
on the expression level and not on the statement level.  This is known
(from several language design experiments) to be harmful to users'
productivity.

> [...]
> (such as incrementing a pointer rather than incrementing an index for
> an array then recomputing the address of the array element) [...]

Once again, the optimization you refer to has a name: "strength reduction".
Once again, I wouldn't _have_ a compiler which didn't do this.  A compiler
which fails to do this is, once again, slightly behind the times (with
30 years again being a fair estimate).  In fact, the use of arrays (which
are _NOT_ aliased to each other) is a direct win over using pointers
(which must be _assumed_ aliased even when they're not).
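
As an illustration of what strength reduction does, here is a hedged C sketch
with two hypothetical functions: the indexed loop as a programmer would write
it, and the pointer-stepping loop an optimizer could derive from it (or that a
programmer might write by hand).

```c
#include <stddef.h>

/* Indexed form: each access is conceptually base + i * sizeof(double). */
double sum_indexed(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Strength-reduced form: the per-iteration multiply is replaced by
   adding a constant step to a running pointer. */
double sum_pointer(const double *a, size_t n)
{
    double s = 0.0;
    for (const double *p = a, *end = a + n; p != end; p++)
        s += *p;
    return s;
}
```

Both functions compute the same sum; the difference is only in who performs
the transformation.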

A further problem with this is that pointer incrementing on multi-dimensional
arrays is particularly error prone.  Unless your inner loop is the one which
steps over the first/last (depending on language) index of the arrays, your
stride for the pointer may not be one (1).  If not (as is often the case
in numerical programs), _you_ are responsible for using the correct strides,
bounds, and corrections in each level of loop nesting.  This is further
aggravated if you are only computing something for a corner or region of
the array.
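
A sketch of the bookkeeping this involves, assuming a row-major matrix stored
in one flat block (the function name and parameter layout are illustrative
only). Summing a rectangular region with pointers means carrying two strides
and taking care not to step a pointer past the end of the storage.

```c
#include <stddef.h>

/* Sum rows [r0, r1) and columns [c0, c1) of a row-major matrix that has
   ncols columns per row, using pointer stepping only. */
double sum_region(const double *m, size_t ncols,
                  size_t r0, size_t r1, size_t c0, size_t c1)
{
    double s = 0.0;
    const double *row = m + r0 * ncols + c0;  /* first element of the region */

    for (size_t r = r0; r < r1; r++) {
        const double *p = row;
        for (size_t c = c0; c < c1; c++)
            s += *p++;                        /* stride 1 along a row */
        if (r + 1 < r1)
            row += ncols;                     /* stride ncols between rows;
                                                 skipped after the last row so
                                                 the pointer stays in range */
    }
    return s;
}
```

With an indexed loop nest the compiler derives the same strides itself, and
the region bounds appear only once, in the loop headers.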

J. Giles

gudeman@cs.arizona.edu (David Gudeman) (01/20/90)

In article  <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>For example, When I wish to pass an array to a procedure, Fortran (and
>Pascal, Modula2, ADA, ...) lets me do so.  In C, the procedure always
>sees a pointer instead of an array.

But in C there is no difference between a pointer and an array except
that the array is a constant.  And you can declare the argument to be
an array if you want to.  I really don't see what point you are trying
to make...

>  Another example: when writing
>numerical programs you often need to force the order of evaluation
>of an expression to avoid over/underflow or to eliminate cancellation
>in subtractions.  Fortran (Pascal, Modula2, ADA, ...) lets me do so
>with parentheses...

So here is an example where C can do more optimization and Fortran has
conveniences for numerical work.  It's a trade-off, just like those
cases where it is Fortran that can do more optimization and C that is
more convenient.

>>[about +=, *= etc. operators]
>...In fact, aside from C users, there seems to be no interest
>in these operators at all.

Icon has these sorts of operators.  Common Lisp has a special macro
(define-modify-macro) for constructing these sorts of operators.  So
there is at least _some_ interest in them.  I expect that of those
language designers who aren't interested in them, most have never
programmed much in a language that uses them.  They are very
convenient.

>One last reason that this is a bad example is that these operators work
>on the expression level and not on the statement level.  This is known
>(from several language design experiments) to be harmful to users'
>productivity.

Be more careful with your use of phrases such as "known to be".  There
are a _lot_ of programmers and language designers who claim the
opposite.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

karl@haddock.ima.isc.com (Karl Heuer) (01/20/90)

In article <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>Fortran allows intrinsic functions to be optimized, C hasn't any intrinsics
>(I know, the new standard permits such - show me an available implementation
>before you mention it again).

Actually, the only difference here between the new standard and the old is the
degree of rigor.  If you grant that language specs talk about results rather
than implementation details, then even pre-ANSI C allows the compiler to
inline the library functions (despite the statements in K&R about what one
particular compiler happens to do).

Anyway, to answer your challenge, gcc can inline the intrinsics.

>On the other hand, C can optimize procedure calls (or even 'inline' them)
>provided that the procedure and the call are in the same scope (file).
>I don't know of any C implementation that actually does this...

Both GNU's and AT&T's compilers will do this.

>Another example: when writing numerical programs you often need to force the
>order of evaluation of an expression to avoid over/underflow or to eliminate
>cancellation in subtractions.  Fortran (Pascal, Modula2, ADA, ...) lets me do
>so with parentheses - C forces me to introduce temporary variables and _hope_
>that the optimizer doesn't actually do a store/load.

"Fixed in ANSI C."  Also, you're claiming that all implementations of Fortran,
Pascal, etc. inhibit optimization around parentheses.  I'm not convinced that
this has been true in practice.
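
A small sketch of why the grouping matters, assuming IEEE double arithmetic
evaluated in double precision with no value-changing options (fast-math and
the like) in effect. The two function names are hypothetical. Under ANSI C
the parentheses below fix the grouping, so the compiler may not turn one
function into the other.

```c
/* Grouped as (a + b) - c: the sum is formed first and may overflow. */
double sum_then_subtract(double a, double b, double c)
{
    return (a + b) - c;
}

/* Grouped as a + (b - c): the difference is formed first, which keeps
   the intermediate small when b and c are close. */
double subtract_then_sum(double a, double b, double c)
{
    return a + (b - c);
}
```

With all three arguments equal to 1e308, the first grouping overflows to
infinity while the second returns 1e308.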

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint

cet1@cl.cam.ac.uk (C.E. Thompson) (01/21/90)

In article <17036@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>>>[about +=, *= etc. operators]
>>...In fact, aside from C users, there seems to be no interest
>>in these operators at all.
>
>Icon has these sorts of operators.  Common Lisp has a special macro
>(define-modify-macro) for constructing these sorts of operators.  So
>there is at least _some_ interest in them.  I expect that of those
>language designers who aren't interested in them, most have never
>programmed much in a language that uses them.  They are very
>convenient.

Algol68 has assigning operators, e.g.  a +:= b, and they seem to be quite
popular with users of that language. It was probably the influence of 
Algol68, rather than C, which caused them to be added to the BCPL language:
all modern {:-)} BCPLs have <lhs> <operator>:= <rhs>.

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk

ruede@boulder.Colorado.EDU (Ulrich Ruede) (01/21/90)

In the FORTRAN vs. C wars I have not seen some of the following arguments.

I am presently working on multilevel finite element codes using
unstructured meshes. The basic data structure is a graph. In C I represent a
node of the mesh (graph) by a structure like (actually it's more complicated):

struct node{
     Real u,      	/* approximate solution */
          f;      	/* right hand side */
     Real x, y;   	/* coordinates of the node */
     Edge_Ptr c;  	/* edges starting from this node */
     unsigned char num_edges;
                  	/* number of edges 1 Byte number < 255 */
     Node_Ptr coarse,	/* to coarser grid node under this one */
              fine; 	/* to finer grid node above this one */
     Node_Ptr next; 	/* link to next node on same level */
     Flags flags;
};

My meshes are created dynamically based on error estimation techniques.
The flexibility of the method requires even the space
for the pointers to be allocated dynamically.

I do not have a FORTRAN-Version of the code to really compare, but I
claim that C gives me the following advantages:

- The memory allocation problems become much easier, I do not have to
  waste statically allocated storage (like assuming that each node has
  a maximum number of edges). Programming a similarly storage efficient
  scheme in FORTRAN would be much more painful.

- I can save storage by using 8 Bit numbers, where appropriate.

- Machines with memory hierarchies (virtual memory or Cache) will work
  more efficiently. When processing the meshes (say for a relaxation pass)
  I usually access several components of a node at the same time;
  this data will typically reside on the same (or a few adjacent) memory
  pages.  A straightforward FORTRAN implementation would use separate
  arrays for each member of the structure, so that several pages must
  be swapped in for accessing one single node. Simulating the C memory
  structure in FORTRAN would require use of EQUIVALENCE and be difficult
  to make portable.

- I am using preprocessing for making the code more flexible and readable.
  A  FOR_ALL_NODES/END_ALL_NODES macro pair is easier to implement for C
  than for FORTRAN, (where I'd be bothered to create unique label numbers,
  unique temporary variable names for loop-indices and would have to be
  careful about FORTRAN-format restrictions).
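
A much-reduced sketch of the last two points, not the actual code: the node
structure is cut down to three members, and the macro bodies and the function
residual_sum are assumptions made for illustration. It shows one way a
FOR_ALL_NODES/END_ALL_NODES pair could be written, and a pass that reads
several members of each node, which sit next to each other in memory.

```c
#include <stddef.h>

/* Simplified stand-in for the node structure: per-node data kept together. */
struct node {
    double u;            /* approximate solution */
    double f;            /* right hand side */
    struct node *next;   /* link to next node on same level */
};

/* Hypothetical macro pair hiding the list traversal.  The loop variable
   is declared inside its own block, so no unique names are needed. */
#define FOR_ALL_NODES(n, head) \
    { struct node *n; for (n = (head); n != NULL; n = n->next) {
#define END_ALL_NODES }}

/* One pass over a level that touches both u and f of every node. */
double residual_sum(struct node *head)
{
    double s = 0.0;
    FOR_ALL_NODES(n, head)
        s += n->f - n->u;
    END_ALL_NODES
    return s;
}
```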

In summary, I believe that C gives me a chance to write better
structured and more efficient code. FORTRAN does have some efficiency
advantages in an array-dominated world, where memory structures and access to
the memory is regular.

By the way, the authors of NUMERICAL RECIPES claim that aliased
arguments are common coding practice in FORTRAN and use them systematically.

Ulrich Ruede
(ruede@boulder.colorado.edu)

faustus@yew.Berkeley.EDU (Wayne Christopher) (01/22/90)

> Fortran permits
> optimization of procedure arguments that C must assume are aliased - no
> optimizer, no matter how sophisticated, can eliminate this difference.

What's wrong with compiling a "safe" version of the function that
assumes the arguments are aliased and a "fast" version that assumes
they are not, and deciding which one to use at runtime by comparing
the arguments?  I don't know if any compilers do this, but I've heard
the suggestion before.
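
A hand-written sketch of that idea, with hypothetical function names. It uses
the restrict qualifier from the later C99 standard to express the "fast"
assumption, and compares addresses as integers to decide at run time which
version to call.

```c
#include <stddef.h>
#include <stdint.h>

/* "Fast" version: restrict promises that dst and src do not overlap. */
static void scale_fast(double *restrict dst, const double *restrict src,
                       size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = 2.0 * src[i];
}

/* "Safe" version: no promise, so a store to dst may change a later src. */
static void scale_safe(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = 2.0 * src[i];
}

/* Dispatcher: check whether the two address ranges overlap. */
void scale(double *dst, const double *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    size_t bytes = n * sizeof(double);
    int overlap = (d < s + bytes) && (s < d + bytes);

    if (overlap)
        scale_safe(dst, src, n);
    else
        scale_fast(dst, src, n);
}
```

The cost is one comparison per call plus two copies of the code; whether that
pays off depends on how much the no-alias assumption gains inside the loop.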

	Wayne

tneff@bfmny0.UU.NET (Tom Neff) (01/23/90)

In article <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>Fortran allows intrinsic functions to be optimized, C hasn't any intrinsics
>(I know, the new standard permits such - show me an available implementation
>before you mention it again).

Microsoft and Intel have both supported inline (intrinsic) string functions
and such for about two years in their draft-ANSI compilers.

I am actually on the pro-FORTRAN side of this, but let's not cite the
wrong reasons.
-- 
If the human mind were simple enough to understand,  =))  Tom Neff
we'd be too simple to understand it. -- Pat Bahn     ((=  tneff@bfmny0.UU.NET

seanf@sco.COM (Sean Fagan) (01/23/90)

In article <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>Fortran allows intrinsic functions to be optimized, C hasn't any intrinsics
>(I know, the new standard permits such - show me an available implementation
>before you mention it again).

Microsoft C 5.0 and later, and MetaWare HighC (I don't know which versions, but at
least after 1.5x).  MSC has a bunch of str* and most of the mem* inlined, and
HighC has oodles and oodles of them (cos and sin, for example, if called
with the same argument, it will translate into a single fsincos instruction
[if you're using a '387]).  These are available *now*.

>On the other hand, C can optimize procedure calls (or even 'inline' them)
>provided that the procedure and the call are in the same scope (file).
>I don't know of any C implementation that actually does this, but it _is_
>permitted by the existing (de facto) language specification.  

gcc does.  MetaWare's latest compiler ("globally optimizing," but I don't
know what they mean by that) might, I'm not sure.

>Fortran (Pascal, Modula2, ADA, ...) lets me do so
>with parentheses - C forces me to introduce temporary variables and
>_hope_ that the optimizer doesn't actually do a store/load.

Read your standards.  ANSI-C (a reality, now) has requirements for
order-of-evaluation, much to my chagrin.  Since most companies have been
developing their compilers in parallel to X3J11, there are quite a few
compilers available now to do this (MSC has it, I think, and I'm almost
positive HighC does).

>This is a particularly bad example for you to use.  The optimization
>that you want the compiler to do without is called "common subexpression
>elimination".  It is important that your compiler know how to do this
>optimization even if you do have a '+=' operator.  I wouldn't ever buy
>a compiler which didn't have this capability.  A compiler without this
>capability is somewhat behind the state-of-the-art anyway (by about 30
>years).

	x[i] = x[i] + foo();

vs.

	x[i] += foo();

where one or more of x and i are global.  Oops.  If your FORTRAN compiler
optimizes that away, you quite probably have some problems.
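
A variation on the same theme, sketched in C with made-up names: here the
side effect sits in the index expression rather than in foo(), which makes the
difference easy to count. In C the compound form evaluates its left operand
once; the spelled-out form evaluates the index expression twice, so the two
are not interchangeable.

```c
static int calls;   /* counts how often next_index() has run */

/* Index function with a side effect, standing in for any index
   expression that depends on global state. */
static int next_index(void)
{
    return calls++ % 4;
}

/* The index expression appears once, so it is evaluated once. */
int bump_compound(int *x)
{
    calls = 0;
    x[next_index()] += 1;
    return calls;
}

/* The index expression appears twice, so it is evaluated twice, and the
   element read need not be the element written. */
int bump_spelled_out(int *x)
{
    calls = 0;
    x[next_index()] = x[next_index()] + 1;
    return calls;
}
```

Both functions expect an array of at least four ints and return the number of
times the index expression was evaluated.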

Some of Jim's points have merit.  However, he does not really seem to know
what's available in C compilers today, nor what the C standard (no longer a
draft!) says, which kinda hurts his arguments.

-- 
Sean Eric Fagan  | "Time has little to do with infinity and jelly donuts."
seanf@sco.COM    |    -- Thomas Magnum (Tom Selleck), _Magnum, P.I._
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

jlg@lambda.UUCP (Jim Giles) (01/23/90)

From article <12576@burdvax.PRC.Unisys.COM>, by dave@PRC.Unisys.COM (David Lee Matuszek):
> In article <12950@cbnewsc.ATT.COM> res@cbnewsc.ATT.COM (Rich Strebendt) writes:
... 
->  It should also be kept in mind that,
->while it is fun to figure out how to optimize very exotic constructs,
->the most frequently occurring construct in any programming language is
->equivalent to the construct x += 1
> 
> Er, not quite *any* programming language.  I have been programming
> professionally in Prolog for the last five years, and I need to do
> this sort of thing maybe once or twice a year.

And, in functional languages you actually CAN'T do it.  Assignment itself
is illegal.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/23/90)

From article <17036@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman):
> [...]
> But in C there is no difference between a pointer and an array except
> that the array is a constant.  And you can declare the argument to be
> an array if you want to.  I really don't see what point you are trying
> to make...

You just made it.  In C there is _NO_DIFFERENCE_ between a pointer and
an array.  In these other languages, there _IS_ a difference!  In a 
procedure in C, any two non-local pointers must be assumed _aliased_
to each other (and to all other non-local objects) because they _might_
be.  That is part of the legitimate functionality of pointers: dynamic
aliasing.  In all these other languages, this kind of aliasing is
illegal for array arguments and/or array globals.

Also, as a less important point, C doesn't let me declare array arguments
to procedures if they are multidimensional and the array bounds in each
dimension may change from call to call.  This means that I have to do
my own index calculations or define a macro to do them for me (which
may not optimize well since the compiler doesn't _know_ that array indexing
may be special cased).
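
A sketch of the macro workaround described here, assuming a row-major matrix
passed as a flat pointer plus its run-time size. The macro name AT and the
function trace are illustrative choices.

```c
#include <stddef.h>

/* Hypothetical indexing macro for a row-major matrix with ncols columns.
   To the compiler this is ordinary integer arithmetic on a pointer, not
   a two-dimensional array reference. */
#define AT(m, ncols, i, j) ((m)[(size_t)(i) * (ncols) + (j)])

/* Trace of an n x n matrix whose size is known only at run time. */
double trace(const double *m, size_t n)
{
    double t = 0.0;
    for (size_t i = 0; i < n; i++)
        t += AT(m, n, i, i);
    return t;
}
```

The caller and the callee must agree on the layout by convention; nothing in
the parameter list records that m is really n by n.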

> [...]
> So here is an example where C can do more optimization and Fortran has
> conveniences for numerical work.  It's a trade-off, just like those
> cases where it is Fortran that can do more optimization and C that is
> more convenient.

This is not true.  There is _NO_ case that I'm aware of where parentheses
are _REQUIRED_ in Fortran (ADA, Pascal, etc.) when they are not also
_REQUIRED_ in C - and with the _SAME_ consequences with respect to
optimization.  It is with _OPTIONAL_ parentheses that the languages
differ.  In C, the compiler ignores them; in all the other languages
the compiler must evaluate in parenthesis order.  So, the trade-off you
mention is under direct user control in all languages but C.

> [... bizarre assignment operators ...]
>                                             I expect that of those
> language designers who aren't interested in them, most have never
> programmed much in a language that uses them.  They are very
> convenient.

Or, perhaps these designers have actually read the literature of
programming language design.  There _have_ been experiments which
sought to discover the effect of such operators on user productivity.
No measurable difference was found in any such study I'm aware of.
Would you care to enlighten me with a counter-example?

> [... assignment as an expression-level operator or a statement-level one
> 
> Be more careful with your use of phrases such as "known to be".  There
> are a _lot_ of programmers and language designers who claim the
> opposite.

I _AM_ careful with the phrase "known to be".  I use it to mean that
more than one experiment has been done (in this case: on user
productivity with expression/statement-level assignment as the
variable) and _ALL_ such experiments have shown the same thing
(namely, that productivity is better with assignment limited to
a statement-level operation).  "A _lot_ of programmers and language
designers" can "claim" anything they want - as long as they represent
such claims as personal opinion, it's a free country.  But, if you
expect the majority of designers to take you seriously, you'd better
have something more convincing than personal opinion.  Now, if you are
aware of a study on this subject that I've missed, I'd again be
pleased if you would forward news of it.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/23/90)

From article <15706@haddock.ima.isc.com>, by karl@haddock.ima.isc.com (Karl Heuer):
> "Fixed in ANSI C."  Also, you're claiming that all implementations of Fortran,
> Pascal, etc. inhibit optimization around parentheses.  I'm not convinced that
> this has been true in practice.

If any implementation reorders expressions by ignoring parentheses, it is
not standard conforming.  The vendor of such a compiler might find business
falling off - or even lawsuits (if they claim conformance).  Certainly
they should expect complaints.  In any case, being non-standard makes the
premise of your statement invalid: if they aren't standard (at least
as far as anyone knows), then they aren't implementations of the respective
language.

As for expression ordering being "Fixed in ANSI C", I'd like to know
_HOW_.  I don't have a final copy of the standard, so I'm a little
behind about what it finally contained.  There was some talk of munging
up the plus (+) operator for expression ordering.  I _HOPE_ they didn't
settle on this particular solution.  Parentheses are the best solution
I've seen implemented (or proposed) in any language so far.  Further,
most programmers (non-C) expect parentheses to have this function.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/23/90)

From article <15913@boulder.Colorado.EDU>, by ruede@boulder.Colorado.EDU (Ulrich Ruede):
> In summary, I believe that C gives me a chance to write better
> structured and more efficient code. FORTRAN does have some efficiency
> advantages in an array-dominated world, where memory structures and access to
> the memory is regular.

Yes, C has some advantages over Fortran.  The ones you mentioned:

Dynamic memory
struct types
small integers (or even char)
macros

Of these, the first two are provided (with better syntax and sometimes
better semantics) by Pascal, Modula2, ADA, etc..  Macros are better
provided as a separate preprocessor (which could even be written
to be independent of the target language).  Small integers and other
data types are also available in these other languages (but, due to 
strict type checking, are often difficult to use).

I suspect that you would really be happier with Fortran 90.  It has
dynamic memory (but NOT implemented using explicit pointers - so no
aliasing slow-down).  It has structs (called 'derived types' - same
thing really).  It has several different 'kinds' of INTEGERs, CHARACTERs,
REALs, etc..  In short, it has everything on your list but macros -
but they are mostly used in C to make the awful syntax more palatable.

> By the way, the authors of NUMERICAL RECIPES claim that aliased
> arguments are common coding practice in FORTRAN and use them systematically.

Aliasing is perfectly legal in Fortran, so long as none of the aliased
variables are assigned to.  I don't remember seeing any example in
NUMERICAL RECIPES that violated this rule.  The rule exists because
Fortran programs are required to have the same meaning in an
implementation that uses call-by-reference as they do in an implementation
which uses call-by-value/result.  Programs meeting this constraint
may be optimized without fear of aliasing problems.

J. Giles

karl@haddock.ima.isc.com (Karl Heuer) (01/23/90)

In article <14204@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>As for expression ordering being "Fixed in ANSI C", I'd like to know _HOW_.
>I _HOPE_ they didn't settle on [the unary plus hack].

Fortunately, they did not.  It was decided that honoring parentheses would be
the best way.  Although there were originally some Committee members who
opposed this because it inhibits useful optimizations, I hear that they backed
down when it was pointed out that most such are still covered by the as-if
rule.  (E.g. |(i+1)-1| for integral |i| can still be reduced, since it gets
the same answer under all conditions under which the original is legal.)
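
A small sketch of where the grouping matters and where it does not, assuming
IEEE 754 double arithmetic.  The function names are hypothetical.

```c
#include <assert.h>

/* Same operands, different grouping.  In floating point the two can
 * differ, so a compiler that honors parentheses may not swap them. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

/* For integral i, (i+1)-1 equals i whenever i+1 does not overflow,
 * so reducing it to i is permitted under the as-if rule. */
int roundtrip(int i) { return (i + 1) - 1; }

void check_grouping(void)
{
    /* 1.0 is far below the spacing of doubles near 1e20, so it is
     * lost when it is added to -1e20 first. */
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
    assert(roundtrip(41) == 41);
}
```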

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint

ruede@boulder.Colorado.EDU (Ulrich Ruede) (01/23/90)

In article <14205@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>
>Yes, C has some advantages over Fortran.  The ones you mentioned:
>
>Dynamic memory
>struct types
>small integers (or even char)
>macros

Let me add *block structure*.

>              .... Macros are better
>provided as a separate preprocessor (which could even be written
>to be independent of the target language).

In fact, I am using m4, which is cryptic but quite powerful (its SysV
version). I'm using m4 also with FORTRAN to alleviate some of its
shortcomings and there is even a tech. report on this issue

	C. R. Jaensch, U. Ruede, K. Schnepper: "Macro Expansion, a Tool
	for the Systematic Development of Scientific Software," 
	Technische Universitaet Muenchen, Nov. 1988.

>In short, it [FORTRAN-90] has everything on your list but macros -
>but they are mostly used in C to make the awful syntax more palatable.

No, I am trying to do some more than cosmetics. I am using constructs
like
	FOR_ALL_NODES_OF(this_mesh, each_node)
		each_node->component= 0;
	END_ALL_NODES

where *this_mesh* identifies the mesh, and *each_node* will be the
*loop index* (a pointer to the nodes). This technique allows me
to write most of my algorithms independent of the actual storage technique.

My program has evolved through several steps. In the
early versions I still used fixed two-dimensional arrays for storing the
nodes, now it is a linked list. In the old version the macro expanded
to a double loop, in the new one it represents a while loop through the
list of pointers. The macros help me to keep the information how to access the
nodes of a mesh in one central place, instead of spreading it throughout
the program. Changes in the storage structure become comparatively simple
to make, tuning becomes easier, etc.
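
One way such a macro pair could be written for the linked-list version is
sketched below.  This is an assumption for illustration, not the actual code:
the struct layouts, the field names first and next, and the helper functions
are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical storage: a mesh is a singly linked list of nodes. */
struct node { double component; struct node *next; };
struct mesh { struct node *first; };

/* The macro pair hides how nodes are reached.  Switching to another
 * storage scheme would mean changing only these two definitions. */
#define FOR_ALL_NODES_OF(mesh, n) \
    { struct node *n; \
      for (n = (mesh)->first; n != NULL; n = n->next) {
#define END_ALL_NODES } }

void clear_components(struct mesh *this_mesh)
{
    assert(this_mesh != NULL);
    FOR_ALL_NODES_OF(this_mesh, each_node)
        each_node->component = 0;
    END_ALL_NODES
}

size_t count_nodes(struct mesh *this_mesh)
{
    size_t count = 0;

    assert(this_mesh != NULL);
    FOR_ALL_NODES_OF(this_mesh, each_node)
        count++;
    END_ALL_NODES
    return count;
}
```

In this sketch the macro declares the loop pointer itself; the original may
well have required the caller to declare it.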

At present I know no language that allows me to do this directly with
reasonable performance and nice syntax.

>> By the way, the authors of NUMERICAL RECIPES claim that aliased
>> arguments are common coding practice in FORTRAN and use them systematically.
>
>Aliasing is perfectly legal in Fortran, so long as none of the aliased
>variables are assigned to.  I don't remember seeing any example in
>NUMERICAL RECIPES that violated this rule.

I don't have a copy available, but they use illegal aliasing in the
ODE-codes.

Ulrich Ruede

firth@sei.cmu.edu (Robert Firth) (01/23/90)

In article <14203@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:

>You just made it.  In C there is _NO_DIFFERENCE_ between a pointer and
>an array.  In these other languages, there _IS_ a difference!  In a 
>procedure in C, any two non-local pointers must be assumed _aliased_
>to each other (and to all other non-local objects) because they _might_
>be.  That is part of the legitimate functionality of pointers: dynamic
>aliasing.

Type-safe dynamic aliasing, as provided in Pascal, Modula-2, Ada &c
is indeed a most useful and legitimate part of the functionality of
pointers.

It is also a virtually useless and illegitimate part of the functionality
of arrays.

For this reason, a language that conflates pointers and arrays is
seriously misdesigned.  On this one, I vote for Fortran.

tneff@bfmny0.UU.NET (Tom Neff) (01/23/90)

I think the one merit that everyone who's used FORTRAN will concede is
that it usually discourages you from using a complicated solution where
a simple one will suffice.  Also, the relative risks of letting
DUNDERHEADED programmers (and there's one in every shop) loose with it
are lower.

This newsgroup's opinion may be skewed by the disproportionate number of
systems-type programmers reading it.  There's no question that C was
designed optimally for some systems type tasks that FORTRAN handles more
clumsily, and I think everyone understands this.

But if we descend to the grody world of the Mere Application Programmer
for a moment -- a world more populous than ours I might add -- it's very
frequently the case that what's being asked of the programmer is
essentially a FORTRAN application.  People forget that FORTRAN was
optimized too -- for scientific and business APPLICATION programming.

C is the first systems language to be accepted everywhere.  Because
systems folks call the shots in most new-fashioned 80's shops, and
because they control the all powerful buzzword kingdom of the popular
magazines, it was inevitable that people would try and use C for
*everything*.  I cringed at that trend when it surfaced four or five
years ago, and still do.

One language is not enough for all purposes.  You can do anything in C
*if you're clever enough*, and there are all sorts of really clever
contributors to Usenet who enjoy proving this; but *clever* programmers
are just an interesting minority out there.

The next decade will, I suspect, richly illustrate the risks of forcing
DUMB programmers to do everything in C because the clever experts said
that was the way to go.  I don't think any company has verifiably gone
under because they *couldn't understand* their own code anymore.  The way
is now opened.
-- 
"NASA Awards Acronym Generation       :(%( :  Tom Neff
System (AGS) Contract For Space       : )%):  tneff%bfmny@UUNET.UU.NET
Station Freedom" - release 1989-9891  :(%( :  ...!uunet!bfmny0!tneff

sakkinen@tukki.jyu.fi (Markku Sakkinen) (01/24/90)

In article <4540@scolex.sco.COM> seanf@sco.COM (Sean Fagan) writes:
>In article <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>> ... [a lot deleted]
 [about order of evaluation]
>>Fortran (pascal, Modula2, ADA, ...) lets me do so
>>with parentheses - C forces me to introduce temporary variables and
>>_hope_ that the optimizer doesn't actually do a store/load.
>
>Read your standards.  ANSI-C (a reality, now) has requirements for
>order-of-evaluation, much to my chagrin. [...]
                      ^^^^^^^^^^^^^^^^^^
I hope the standard has left at least one other anti-mathematic trick
still intact for your enjoyment, namely that a leading zero changes
the meaning of a numeric literal (to octal).

> ...
>>[...]  The optimization
>>that you want the compiler to do without is called "common subexpression
>>elimination".  It is important that your compiler know how to do this
>>optimization even if you do have a '+=' operator.  I wouldn't ever buy
>>a compiler which didn't have this capability.  A compiler without this
>>capability is somewhat behind the state-of-the-art anyway (by about 30
>>years).
>
>	x[i] = x[i] + foo();
>
>vs.
>
>	x[i] += foo();
>
>where one or more of x and i are global.  Oops.  If your FORTRAN compiler
>optimizes that away, you quite probably have some problems.

I would certainly be puzzled by a Fortran compiler that accepted
and even optimised C code :-) (It is easy enough to imagine the
corresponding Fortran statement.)
The Fortran standard does not specify the evaluation order between the
left and right hand sides of an assignment, nor between the operands
of a binary operator. Therefore, the optimisation you suggest does in fact
correspond to one possible evaluation order - since indexing an array
by a simple variable cannot cause any side effects.

If the function foo causes side effects that make the total effect
of the statement dependent on the actual order of evaluation,
then the statement is incorrect (isn't it, language lawyers?).
Anyway, the problem does not come from the optimisation.
(The Fortran standard is principally permissive: some compiler builder
might well try to make even a large class of slightly incorrect programs
to behave "as the programmer probably intended".)

I think that the modifying assignment operators of Algol 68 and C are
convenient, especially when the expression that you then need not write
twice is complicated and error-prone. However, they are not very general.
They cannot simplify expressions like
   x[i] = y / x[i]
   x[i] = e * x[i] + z
At least in the Mode language designed
and implemented by Juha Vihavainen (University of Helsinki) one can refer
to the left-hand side of assignment on the right thus:
   x(i) := y / *
   x(i) := e * * + z
Lexically this happens to be yet another overloading of '*',
something else could have been chosen.

I think this idea would fit even more naturally
into a language with good old assignment _statements_ instead of assignment
_expressions_: expressions can be nested and the referent of the l.h.s.
reference less clear: e.g.
   x(i) := (y(j) := * + z) + fun (*);

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland

mjs@hpfcso.HP.COM (Marc Sabatella) (01/24/90)

>I suspect that you would really be happier with Fortran 90.  It has
>dynamic memory (but NOT implemented using explicit pointers - so no
>aliasing slow-down).

Now wait a minute.  Last I heard, Fortran 90 has something very akin to
a pointer, only worse - you can have a pointer to arbitrary and even sparse
array slices.  For instance, given an array indexed from 1 to 20 of reals, it
is possible to have a pointer to an array of 5 reals, and get it to "point"
at the "array" consisting of a[2], a[6], a[7], a[17], and a[19], where the
latter indices are determined at run time.  Now they may require that you not
modify this pointer and the original array itself in the same scope, but
dereferencing these types of pointers has got to be a logistic nightmare worse
than aliasing.

brnstnd@stealth.acf.nyu.edu (01/24/90)

In article <14203@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
> From article <17036@megaron.cs.arizona.edu>, by gudeman@cs.arizona.edu (David Gudeman):
> You just made it.  In C there is _NO_DIFFERENCE_ between a pointer and
> an array.  In these other languages, there _IS_ a difference!

The only problem with aliasing is parallel optimization. C compilers for
vector machines provide directives to indicate that arrays x and y don't
overlap. Fortran doesn't even have a standard way to specify extensions.

> Also, as a less important point, C doesn't let me declare array arguments
> to procedures if they are multidimensional and the array bounds in each
> dimension may change from call to call.

Hopefully the next ANSI C will allow foo(m,a) int m; char a[][m]. As is,
it's some extra work for the programmer and, as you point out, probably
some missed optimizations.

  [ parentheses ]

If you really want to force evaluation order, use a temporary variable.
C's as-if optimization is logical and powerful; if a compiler provides
as many inline mathematical functions as Fortran, it can optimize them
just as well.

  [ assignment as an expression ]

I've found C's expressions to be useful syntactic devices. Sometimes
I've wished for even more: something analogous to a = b the same way
that a++ is analogous to ++a. I find it difficult to believe that the
presence of these operators decreases productivity.

The only thing about C that bothers me is its control structures. Any
language that doesn't terminate all its control structures should throw
out its syntax and commit suicide (pun not intended). There aren't any
named loops or multilevel breaks. It's not worth leaving C for that
non-Fortran called Fortran 8X (90, whatever), but it's a pain.

---Dan

jlg@lambda.UUCP (Jim Giles) (01/24/90)

From article <4540@scolex.sco.COM>, by seanf@sco.COM (Sean Fagan):
> [...]
> 	x[i] = x[i] + foo();
> vs.
> 	x[i] += foo();
> where one or more of x and i are global.  Oops.  If your FORTRAN compiler
> optimizes that away, you quite probably have some problems.

If the function evaluation changes either x or i in the above, the Fortran
version of "x[i] = x[i] + foo()" would be illegal!  This is not perfect
either and the Fortran version would have to introduce an intermediate
variable to store the function result.  However, if any of your C programs
rely on the above distinction, it is likely to be a source of errors
anyway.  Most advocates of "structured programming" recommend against
functions with side-effects at all.  As a practical matter, such functions
require more careful treatment regardless of language.
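
A sketch in C of the intermediate-variable approach mentioned above, assuming
a function foo() that changes the global index as a side effect.  All names
here are hypothetical, and the two helpers simply make each possible ordering
explicit instead of leaving it to the compiler.

```c
#include <assert.h>

int i = 0;            /* global index, as in the quoted example */
int x[2] = {1, 2};    /* global array */

/* Hypothetical function with a side effect on the global index. */
int foo(void)
{
    i = 1;
    return 10;
}

/* Call first: the temporary holds the result, and the subscript is
 * evaluated afterwards, so the element selected uses the updated i. */
void add_foo_call_first(void)
{
    int t = foo();

    assert(i >= 0 && i < 2);
    x[i] += t;
}

/* Index first: the element is selected before foo() can change i. */
void add_foo_index_first(void)
{
    int *p;

    assert(i >= 0 && i < 2);
    p = &x[i];
    *p += foo();
}
```

Starting from i == 0 and x == {1, 2}, the first helper updates x[1] and the
second updates x[0], which is the distinction a program should not leave
implicit.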

This is not to say that I agree with the Fortran restriction.  I think
the above assignment should be legal and the user should be allowed to
control whether optimization is performed or not by declaring whether
the function has side-effects or not.  Neither C nor Fortran currently
has this feature.

> [...]
> Some of Jim's points have merit.  However, he does not really seem to know
> what's available in C compilers today,

True, I don't know the whole C market.  I only know what the C compilers
available to ME are capable of doing.  Further, most C supporters in this
discussion don't seem to know what available Fortran implementations do
(I haven't been limited to 6 char identifiers for over a decade).

> [...]                               nor what the C standard (no longer a
> draft!) says, which kinda hurts his arguments.

Just because C has been approved doesn't mean general availability.
C supporters don't know what Fortran 90 offers either - why should they?
Fortran 90 is on the verge of approval (within a few months most likely)
but it won't be widely available for a while yet.  I don't claim that
your ignorance of Fortran 90 hurts your argument.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/24/90)

From article <15999@boulder.Colorado.EDU>, by ruede@boulder.Colorado.EDU (Ulrich Ruede):
> In article <14205@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>> [...]
>>Dynamic memory
>>struct types
>>small integers (or even char)
>>macros
> 
> Let me add *block structure*.

Fortran 90 has that too.  Better than C's almost unreadably invisible
curly braces.  Fortran 90 lets you label blocks too, so that EXIT
and CYCLE statements can be multi-level.

> [...]
> No, I am trying to do some more than cosmetics. I am using constructs
> like
> 	FOR_ALL_NODES_OF(this_mesh, each_node)
> 		each_node->component= 0;
> 	END_ALL_NODES
> 
> where *this_mesh* identifies the mesh, and *each_node* will be the
> *loop index* (a pointer to the nodes). This technique allows me
> to write most of my algorithms independent of the actual storage technique.

So, what you _really_ want is something like FIDIL.  FIDIL is a
scientific programming language which lets you do things like the
above directly.  Your mesh doesn't have to be strictly rectangular
either.  The domain of the mesh can be pretty much any shape.  FIDIL
was designed for finite differencing with domain decomposition, but
it has applications outside that field (like, apparently, yours).

> [... NUMERICAL RECIPES ...]
> I don't have a copy available, but they use illegal aliasing in the
> ODE-codes.

If so, they can't expect their codes to work on the vast majority of
Fortran implementations.  I'll have to go take a look at my copy of
the book.  I don't think that such a large problem could really have
slipped by during the publishing of such a book.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/24/90)

From article <1735@gannet.cl.cam.ac.uk>, by cet1@cl.cam.ac.uk (C.E. Thompson):
> [...]
> Algol68 has assigning operators, e.g.  a +:= b, and they seem to be quite
> popular with users of that language. It was probably the influence of 
> Algol68, rather than C, which caused them to be added to the BCPL language:
> all modern {:-)} BCPLs have <lhs> <operator>:= <rhs>.

Especially since BCPL was a precursor of C.  This only strengthens my
previous argument though.  I can't believe that _any_ language designer
of recent vintage is ignorant of ALGOL68.  Yet, as I said, there hasn't
been a stampede toward such operators in newly designed languages.

This is not to say that I personally oppose such operators.  I don't use
them.  I understand what they do.  I'm designing a language which _doesn't_
have them - but, since users can define their own operators (and overload
existing ones), all C's operators can be implemented if the user wants.
I am somewhat more opposed to assignment operators working at the
expression level - but even here I allow users to define such operators.

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/24/90)

From article <9738@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu:
> [...]
> The only problem with aliasing is parallel optimization. C compilers for
> vector machines provide directives to indicate that arrays x and y don't
> overlap. Fortran doesn't even have a standard way to specify extensions.

The above argument is not true.  Consider the following code sequence
in C:

   z = complicated_expression;
   *a= 0;
   *b= complicated_expression;

Where the "complicated_expression" is the same for both 'z' and '*b'.
The optimization you would like to use _ON_ANY_MACHINE_ is to compute
the expression, store it into both 'z' and '*b' and then store zero
into '*a'.  Or, store the zero in '*a' first - or _while_ you're computing
the expression.  Any one of these would work correctly - and be FASTER -
provided that you KNOW that '*a' is not aliased to 'z', '*b', or part
of the expression.  If you don't know that, then you have to assume the
worst and generate code accordingly - including computing the expression
twice!!  Furthermore, aliasing affects the scheduling of cache, registers,
and possibly even stack for similar reasons.  I don't know of _ANY_ machine
which cannot benefit from the knowledge of whether things are aliased or
not.
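
A concrete version of the sequence above, as a sketch.  The expression
*x * *x + 1 merely stands in for "complicated_expression", and the function
name and parameter x are hypothetical.

```c
#include <assert.h>

/* Because a might point at the same object as x, the compiler must
 * assume the store through a can change *x, and so it has to evaluate
 * the expression a second time for *b. */
void store_both(int *z, int *a, int *b, const int *x)
{
    assert(z && a && b && x);
    *z = *x * *x + 1;   /* complicated_expression */
    *a = 0;
    *b = *x * *x + 1;   /* same text, possibly a different value */
}
```

With four distinct objects and *x == 3, both z and *b receive 10.  If the
caller passes the same address for a and x, z still receives 10 but *b
receives 1, so reusing the first result would be wrong in that case.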

>   [ parentheses ]
> 
> If you really want to force evaluation order, use a temporary variable.
> C's as-if optimization is logical and powerful; [...]

This assumes that your compiler is smart enough not to really reserve space
for the temporary variable - or, at the very least, it doesn't actually
generate the unnecessary store/load.  

>   [ assignment as an expression ]
> 
> I've found C's expressions to be useful syntactic devices. Sometimes
> I've wished for even more: something analogous to a = b the same way
> that a++ is analogous to ++a. I find it difficult to believe that the
> presence of these operators decreases productivity.

Some of the people doing studies of such operators were expecting different
results as well.  The problem is that evidence doesn't wait upon your
difficulty in believing.  

J. Giles

johnl@esegue.segue.boston.ma.us (John R. Levine) (01/24/90)

In article <4540@scolex.sco.COM> seanf@sco.COM (Sean Fagan) writes:
>	x[i] = x[i] + foo();
>vs.
>	x[i] += foo();
>
>where one or more of x and i are global.  Oops.  If your FORTRAN compiler
>optimizes that away, you quite probably have some problems.

If you write code that depends on the order of evaluation of foo() and x[i],
you certainly have problems.  Quoting again from F77, section 6.6, Evaluation
of Expressions, "The execution of a function reference in a statement may not
alter the value of any other entity within the statement in which the function
appears.  The execution of a function reference in a statement may not alter
the value of any entity in common that affects the value of any other function
reference in that statement."

In a later paragraph it also says that it need not evaluate part or all of
an expression if it can figure out the value some other way.  This explicitly
permits short circuit (C style) evaluation of logical expressions and
implicitly allows arbitrarily aggressive common subexpression elimination.

People who express strong opinions about what a Fortran compiler should and
shouldn't do would be well advised to have at least a passing familiarity
with what the standard says Fortran really is.  I express no opinions on F90
yet, I'm still chewing through my copy of the draft.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

brnstnd@stealth.acf.nyu.edu (01/24/90)

Normal Fortran has only two advantages over C, namely a wider set of
standard builtins and a larger support base. Fortran 90 loses the
support base (why did X3J3 have to change the comparison names?) and
adds just two other advantages: named loops and the multilevel break. 
Other articles list many of the advantages of C over Fortran.

In article <14206@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
> From article <4540@scolex.sco.COM>, by seanf@sco.COM (Sean Fagan):
> > 	x[i] = x[i] + foo();
> > 	x[i] += foo();

If foo() changes the value of i, both of these statements are illegal in
Fortran and produce undefined results in C. This issue is moot.

> Just because C has been approved doesn't mean general availability.
> C supporters don't know what Fortran 90 offers either - why should they?

I favor C over Fortran, and I know what Fortran 90 offers. (Back in 1987
I listened to a talk on the Fortran 8X draft, and my two questions were
``Won't compilers for such a huge language be rather inefficient?'' and
``Is it really Fortran?'' The answers are ``yes'' and ``no'' respectively.)
Basically, Fortran 90 is Modula-2, Ada, Pascal, and bits of C and Fortran.
The latest draft is slightly more cleaned up but even less streamlined.
(There's a cute bumper sticker running around: BAN X3J3.)

In contrast, ANSI C feels only slightly different from K&R C. It is a
compact, efficient language.

As for aliasing, I don't see any sequential optimizations hurt by it
that shouldn't be done by hand in any case.

---Dan

seanf@sco.COM (Sean Fagan) (01/24/90)

In article <14203@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>This is not true.  There is _NO_ case that I'm aware of where parentheses
>are _REQUIRED_ in Fortran (ADA, Pascal, etc.) when they are not also
>_REQUIRED_ in C - and with the _SAME_ consequences with respect to
>optimization.  It is with _OPTIONAL_ parentheses that the languages
>differ.  In C, the compiler ignores them; in all the other languages
>the compiler must evaluate in parenthesis order.  So, the trade-off you
>mention is under direct user control in all languages but C.

1.  Given
	a + b + c - e - f;

I'm happy to let the compiler rearrange things as much as possible to
generate fast code.  In FORTRAN, by your own admission, you can't do that.

2.  Given

	((((a + b) + c ) -e ) - f);

The compiler is free to rewrite this, *IFF* the result would be the same!
If you're using shorts, and overflows are ignored, then, probably, it can do
what it wants to.  If, however, the result is *not* the same, then the
compiler cannot do this, and, if it does, it's a bug.

-- 
Sean Eric Fagan  | "Time has little to do with infinity and jelly donuts."
seanf@sco.COM    |    -- Thomas Magnum (Tom Selleck), _Magnum, P.I._
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

faustus@yew.Berkeley.EDU (Wayne Christopher) (01/25/90)

I haven't used FORTRAN myself, but from these discussions it seems
to me that there are a lot of things you are not supposed to do to
make things nice for the compiler, like create aliases and allow foo()
in "x[i] + foo()" to alter x or i.  Do FORTRAN compilers check these
constraints?  If not, aren't they a great source of hard-to-find bugs,
that go away when you turn off optimization?

	Wayne

gudeman@cs.arizona.edu (David Gudeman) (01/25/90)

In article  <2806@tukki.jyu.fi> sakkinen@tukki.jyu.fi (Markku Sakkinen) writes:
>In article <4540@scolex.sco.COM> seanf@sco.COM (Sean Fagan) writes:
>>In article <14199@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>>> ... [a lot deleted]
> [about order of evaluation]
>>>Fortran (pascal, Modula2, ADA, ...) lets me do so
>>>with parentheses - C forces me to introduce temporary variables and
>>>_hope_ that the optimizer doesn't actually do a store/load.
>>
>>Read your standards.  ANSI-C (a reality, now) has requirements for
>>order-of-evaluation, much to my chagrin. [...]
>                      ^^^^^^^^^^^^^^^^^^
>I hope the standard has left at least one other anti-mathematic trick
>still intact for your enjoyment, namely that a leading zero changes
>the meaning of a numeric literal (to octal).

Just a note for the Fortran partisans, extra parens occur all the time
in C when the programmer has no intention of specifying the order of
evaluation.   In particular, you should always parenthesize macros:

#define next(x) ((x) + 1)
...
  foo = next(x) + 100

In this example, one would hope that the optimizer would ignore the
parens and compile the last expression as

  foo = x + 101
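
A short sketch of why the parentheses in such a macro are there for textual
safety rather than to request an evaluation order.  NEXT_UNSAFE is a
hypothetical unparenthesized variant added only for comparison.

```c
#include <assert.h>

#define NEXT_UNSAFE(x) x + 1      /* no protective parentheses */
#define NEXT(x) ((x) + 1)         /* parenthesized, as recommended */

void check_next_macros(void)
{
    /* Macro expansion is textual, so precedence decides the result. */
    assert(2 * NEXT_UNSAFE(3) == 7);   /* expands to 2 * 3 + 1 */
    assert(2 * NEXT(3) == 8);          /* expands to 2 * ((3) + 1) */
    assert(NEXT(1) + 100 == 102);      /* same value as 1 + 101 */
}
```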

P.S. I agree that the form of octals is a pretty bad misfeature...
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

jlg@lambda.UUCP (Jim Giles) (01/25/90)

From article <8960006@hpfcso.HP.COM>, by mjs@hpfcso.HP.COM (Marc Sabatella):
->I suspect that you would really be happier with Fortran 90.  It has
->dynamic memory (but NOT implemented using explicit pointers - so no
->aliasing slow-down).
> 
> Now wait a minute.  Last I heard, Fortran 90 has something very akin to
> a pointer, only worse - [...]

True!  And it is _REALLY_WORSE_!  Fortunately, they have also retained
the ALLOCATABLE attribute from the previous draft of the standard.  So,
you _can_ allocate memory without pointers (and, in fact, I can't find
anything I'd use pointers for except recursive data structures).

> [...]                 you can have a pointer to arbitrary and even sparse
> array slices.  [...]

Unfortunately, the most important type of array slices aren't available
for use with pointers.  For example, I can't select out the diagonal
of an array as a vector.  All the other uses of pointers to array sections
are more efficiently implemented by just using explicit subscript triples.
The fact is, they took out RANGE and IDENTIFY to simplify the language
and then added these pointers - which are _more_ complicated than
RANGE and IDENTIFY were, less optimizable, and less powerful.  The
present pointer facility was the main reason I strongly opposed the
present draft of the standard.

> [...]                                                                but
> dereferencing these types of pointers has got to be a logistical nightmare worse
> than aliasing.

Agreed!

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/25/90)

From article <21403@pasteur.Berkeley.EDU>, by faustus@yew.Berkeley.EDU (Wayne Christopher):
> [... aliasing ...]                  Do FORTRAN compilers check these
> constraints?  If not, aren't they a great source of hard-to-find bugs,
> that go away when you turn off optimization?

No, many Fortran compilers don't check these things (although the cost would
be less than an array bounds check, and many compilers _do_ check array
bounds - oh well).  And, YES, it is a source of hard-to-find bugs.  But
aliasing itself is a source of hard-to-find bugs!  I see more C code with
these kinds of errors than Fortran code.  C supports and encourages aliasing
while it's illegal in Fortran (and users quickly learn to avoid it).
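
A small sketch of the kind of aliasing bug being discussed, written in C where the call is legal. The routine add_twice and the two helpers are hypothetical; the routine is written as if its two arguments were distinct objects, and quietly gives a different answer when they are not.

```c
#include <assert.h>

/* Hypothetical routine: adds *src into *dst twice.
   Written assuming dst and src refer to different objects. */
void add_twice(int *dst, const int *src)
{
    assert(dst != 0 && src != 0);
    *dst += *src;
    *dst += *src;   /* re-reads *src, which has changed if src == dst */
}

/* Distinct objects: 1 + 10 + 10. */
int add_twice_distinct(void)
{
    int d = 1, s = 10;
    add_twice(&d, &s);
    return d;
}

/* Same object passed twice: 10 -> 20 -> 40, not 30. */
int add_twice_aliased(void)
{
    int v = 10;
    add_twice(&v, &v);
    return v;
}
```

In C the aliased call is well defined, so the compiler has to preserve it; the equivalent Fortran call is simply not permitted by the standard, whether or not a given compiler diagnoses it.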

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/25/90)

From article <4561@scolex.sco.COM>, by seanf@sco.COM (Sean Fagan):
> [...]
> 1.  Given
> 	a + b + c - e - f;
> 
> I'm happy to let the compiler rearrange things as much as possible to
> generate fast code.  In FORTRAN, by your own admission, you can't do that.

I didn't admit any such thing.  Fortran is allowed to optimize the expression
in any _mathematically_ equivalent way so long as it doesn't violate
parentheses.  There are no parentheses in this expression at all - so the
compiler can do the operations in any order it likes.

> [...]
> 2.  Given
> 
> 	((((a + b) + c ) -e ) - f);
> 
> The compiler is free to rewrite this, *IFF* the result would be the same!
> If you're using shorts, and overflows are ignored, then, probably, it can do
> what it wants to.  If, however, the result is *not* the same, then the
> compiler cannot do this, and, if it does, it's a bug.

In this case, a Fortran compiler must generate code which is _computationally_
equivalent to the code given.  So, if over/underflow or roundoff (for floating
point) is possible, the compiler must generate code for the expression in
exactly the order given.  This is, in fact, the very purpose of the feature.
Fortran (and most other languages) gives you the flexibility to control the
optimization of expressions, C doesn't (or, didn't - apparently ANSI C
finally joined the modern world in this respect).
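
To make the roundoff point concrete, here is a minimal sketch in C, assuming IEEE double arithmetic. The function names are hypothetical, and the volatile temporaries are only there to force each partial sum to be rounded to double before it is used.

```c
#include <assert.h>

/* (a + b) + c, with the grouping forced. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* a + (b + c), with the grouping forced. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Call from main() to run the checks. */
void demo_grouping(void)
{
    /* 1e20 and -1e20 cancel first, so the 1.0 survives. */
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    /* 1.0 is absorbed into -1e20 first, so it is lost. */
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```

The two groupings are mathematically equal but computationally different, which is exactly the case where the parentheses have to be honored.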

J. Giles

jlg@lambda.UUCP (Jim Giles) (01/25/90)

From article <11962@stealth.acf.nyu.edu>, by brnstnd@stealth.acf.nyu.edu:
> Normal Fortran has only two advantages over C, namely a wider set of
> standard builtins and a larger support base. Fortran 90 loses the
> support base (why did X3J3 have to change the comparison names?) [...]

Huh?  Fortran 90 is _supposed_ to be backward compatible to Fortran 77.
I know of a couple of really obscure contradictions (which the committee
is presently fixing).  I don't know _any_ compatibility problem that
relates to the syntactic form of comparisons.

> [...]                                                           and
> adds just two other advantages: named loops and the multilevel break. 

And - array syntax, non-aliasing dynamic memory, user defined operators,
overloaded procedures (with type checking of arg types required), etc.

C++ has _some_ of these.  No ordinary C (ANSI or not) has any of them.
ANSI C leaves the door open slightly with function prototypes, but
all function definitions with the same function name must return the
same type!

> Other articles list many of the advantages of C over Fortran.

Maybe you should have listed them again.  So far, the only one I can
think of is dynamic memory.  This advantage is offset by the inefficient
code generated because pointers (even to dynamic memory) must be assumed
to be aliased.  It is further offset by the fact that most Fortran
environments already provide a (non-standard, I know) form of dynamic
memory.  And, before you say that it's useless if it's not portable,
I could use the same argument about the future possibility of 'noalias'
pragmas in C - and, by the time _those_ are available, the new Fortran
standard will be providing portable dynamic memory constructs which are
_automatically_ non-aliasing.
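
As a sketch of what a no-alias promise looks like in C source: the 'restrict' qualifier shown here was standardized later, in C99, so it was not available when this thread was written. The function names are hypothetical. Without the qualifier the compiler must allow for y overlapping x; with it, the caller takes on the promise that they do not.

```c
#include <assert.h>
#include <stddef.h>

/* No promise: a store to y[i] might change some x[j]. */
void axpy_may_alias(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* C99 restrict: the caller promises x and y do not overlap. */
void axpy_noalias(size_t n, double a,
                  const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* With non-overlapping arguments both versions agree. */
void demo_axpy(void)
{
    double x[3]  = { 1.0, 2.0, 3.0 };
    double y1[3] = { 10.0, 20.0, 30.0 };
    double y2[3] = { 10.0, 20.0, 30.0 };

    axpy_may_alias(3, 2.0, x, y1);
    axpy_noalias(3, 2.0, x, y2);
    assert(y1[0] == 12.0 && y1[1] == 24.0 && y1[2] == 36.0);
    assert(y2[0] == 12.0 && y2[1] == 24.0 && y2[2] == 36.0);
}
```

The promise is unchecked: calling axpy_noalias with overlapping arrays is undefined behavior, much as an aliased call is non-conforming in Fortran.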


> [...]
> I favor C over Fortran, and I know what Fortran 90 offers. (Back in 1987
> I listened to a talk on the Fortran 8X draft, and my two questions were
> ``Won't compilers for such a huge language be rather inefficient?'' and
> ``Is it really Fortran?'' The answers are ``yes'' and ``no'' respectively.)

Whoever gave the talk was misleading you.  In terms of difficulty of
implementation, Fortran 90 is not much worse than C:

      Derived types are the same as C's 'structs'.
      C has 'unions', Fortran 90 doesn't (darn) - so Fortran is simpler here.
      The new KIND attributes are no worse than C's plethora of types.
      The new flow control (CASE, DO WHILE, CYCLE, EXIT, etc.) are about
            the same as C.  Fortran is actually simpler since 'blocks'
            aren't separate scopes as they are in C.
      Fortran 90 has interface blocks, which serve a function similar to
            ANSI C's function prototypes.  Both would cost about the same
            to implement but Fortran allows procedures to be overloaded
            and C doesn't.
      Fortran 90 allows overloaded and user-defined operators.  C doesn't
            (not even ANSI C).  The cost of implementing such operators
            is about the same as overloaded procedures (and is done the
            same way: with interface blocks).  The only additional work
            is to modify the parser to recognize user defined operators
            (parsers are the most automated part of compiler construction;
            a simple grammar change is sufficient to change the operator
            into the corresponding function call - which is possibly
            'inlined' if the function is defined in the same MODULE
            as the call).
      Fortran 90's MODULEs introduce a scoping mechanism that is basically
            identical to C's use of 'files'.  The implementation cost should
            be about the same.
      Fortran 90's ALLOCATABLE attribute gives Fortran the same capability
            as C's malloc()/free().  It is about the same difficulty to
            implement except that ALLOCATABLE objects are _known_ not to
            be aliased - so they optimize better.
      Fortran 90's pointers are abominable.  Used as simple pointers (in
            linked lists and such) these pointers are simpler than those
            in C (no pointer arithmetic).  Pointers to array slices are
            terrible: they add _NO_ functionality to the "whole array"
            syntax, but they introduce the possibility of aliasing which
            wouldn't be there without them.  Nevertheless, there is a
            simple (but inefficient) way to implement these.  Since I
            don't believe they _CAN_ be made efficient, this simple
            implementation is all they deserve.
      Etc.

In fact, the only part of Fortran 90 that is intrinsically _different_
from some corresponding part of ANSI C is the "whole array" functionality.
The reason is that C _doesn't_have_ anything of the sort.  The question
is, is it really all that hard to implement?

Well, yes and no.  There are very simple implementations which might provide
all the performance you need on a scalar machine, but which would be very
inefficient on a parallel machine.  On the other hand, the vendors of
parallel hardware have spent almost a decade finding ways to optimize
array operations - this might even simplify their task since they will
no longer have to analyse nested loops just to find what array operations
are really being performed.  I don't think this particular feature will
be a major stumbling block.
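
For comparison, a sketch of what the missing "whole array" operation looks like when spelled out in C. The function name vec_add is hypothetical. Fortran 90 array syntax writes the same operation as the single statement A = B + C; in C the compiler sees only a loop over three pointers and has to work out the array operation for itself.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise a = b + c over n elements. */
void vec_add(size_t n, double *a, const double *b, const double *c)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Call from main() to run the checks. */
void demo_vec_add(void)
{
    double b[3] = { 1.0, 2.0, 3.0 };
    double c[3] = { 10.0, 20.0, 30.0 };
    double a[3] = { 0.0, 0.0, 0.0 };

    vec_add(3, a, b, c);
    assert(a[0] == 11.0 && a[1] == 22.0 && a[2] == 33.0);
}
```
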

> [...]
> As for aliasing, I don't see any sequential optimizations hurt by it
> that shouldn't be done by hand in any case.

Boy, I'm impressed.  You actually do strength reduction, common expression
elimination, elimination of loop invariants, register and cache scheduling,
etc. ALL BY HAND?  Wait a minute - you'd have to use assembler exclusively
in order to do all that!
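
A sketch of one of the optimizations named above, loop-invariant hoisting, and of why possible aliasing stops a C compiler from doing it automatically. The function names are hypothetical. In scale_naive the load of *factor cannot be moved out of the loop, because a store to out[i] might change it; scale_hoisted is the "by hand" version.

```c
#include <assert.h>
#include <stddef.h>

/* *factor is re-read every iteration: out[i] might be the same object. */
void scale_naive(size_t n, double *out, const double *in,
                 const double *factor)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * *factor;
}

/* Invariant hoisted by hand: correct only if out never overlaps factor. */
void scale_hoisted(size_t n, double *out, const double *in,
                   const double *factor)
{
    const double f = *factor;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * f;
}

/* With factor pointing into the output array the two versions differ. */
void demo_hoisting(void)
{
    double v[3] = { 2.0, 3.0, 4.0 };
    double w[3] = { 2.0, 3.0, 4.0 };

    scale_naive(3, v, v, &v[0]);    /* factor becomes 4.0 after i == 0 */
    scale_hoisted(3, w, w, &w[0]);  /* factor stays 2.0 throughout */
    assert(v[0] == 4.0 && v[1] == 12.0 && v[2] == 16.0);
    assert(w[0] == 4.0 && w[1] == 6.0 && w[2] == 8.0);
}
```

Because the aliased call is legal C and the two versions give different answers for it, the compiler cannot turn the first into the second on its own.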

J. Giles

gudeman@cs.arizona.edu (David Gudeman) (02/03/90)

In article  <KHB.90Jan29185013@chiba.kbierman@sun.com> khb@chiba.kbierman@sun.com (Keith Bierman - SPD Advanced Languages) writes:
>
>*sigh* Do mathfolk forget what
>              T  -1
>	A = (A A)
>
>means a week after writing it down ? It is absurd to believe that what
>has worked for a thousand years of mathematics won't work for
>programming language.	

This is not a good analogy.  First, mathematical writings are usually
much shorter than programs.  Second, mathematical writings usually are
accompanied by more English explanation than notation.  Third,
mathematical notation usually has a much longer time of consideration
behind it than does code.  Fourth, mathematicians usually either write
all their own notation or independently and exhaustively review any
notation written by a co-author.

I think that if a mathematician wrote two thousand pages of dense
notation with less than, say 20% of it English text, then yes, he
would forget what some of the notation stands for.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

pcg@aber-cs.UUCP (Piercarlo Grandi) (02/04/90)

In article <14219@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:

an amusing article, which I will comment upon later, but I
cannot resist this little quick pun:

    From article <PCG.90Jan29131938@rupert.cs.aber.ac.uk>, by
    pcg@rupert.cs.aber.ac.uk (Piercarlo Grandi):
    
    > From my armchair I often see the "communication" approach give
    > better performance than the "obey cleverly" one, both at the
    > macro (Big-O wise) and micro (constant factor wise) scale.
    
    An armchair programmer - what next?

A bedtime programmer! For a period of time my machine (a 386
with Unix and goodies on it) was on the desk next to my bed,
and I did write quite a few lines of code while lying in it. My
girlfriend was a bit jealous, but she did actually manage to
win me over... :->.

	Disclaimer: I am not really this nerdish. Just a
	temporary situation. Now my machine is in my study!
	I know of people that are much worse :->.
-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

jeff@aiai.ed.ac.uk (Jeff Dalton) (03/02/90)

In article <14203@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>You just made it.  In C there is _NO_DIFFERENCE_ between a pointer and
>an array. 

Can't assign to something declared as an array.
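
A short sketch of the distinction, using hypothetical names (table, cursor, retarget). An array is storage; a pointer is a variable holding an address. They differ in size and in whether they can be assigned to.

```c
#include <assert.h>
#include <stddef.h>

int table[100];          /* an array object: storage for 100 ints */
int *cursor = table;     /* a pointer object: holds one address   */

size_t size_of_array(void)   { return sizeof table; }
size_t size_of_pointer(void) { return sizeof cursor; }

void retarget(void)
{
    cursor = table + 10;    /* fine: a pointer can be reassigned */
    /* table = cursor; */   /* would not compile: arrays are not assignable */
}

/* Call from main() to run the checks. */
void demo_array_vs_pointer(void)
{
    assert(size_of_array() == 100 * sizeof(int));
    assert(size_of_pointer() == sizeof(int *));
    retarget();
    assert(cursor == &table[10]);
}
```
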

seanf@sco.COM (Sean Fagan) (03/05/90)

In article <1895@skye.ed.ac.uk> jeff@aiai.UUCP (Jeff Dalton) writes:
>In article <14203@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>>You just made it.  In C there is _NO_DIFFERENCE_ between a pointer and
>>an array. 
>
>Can't assign to something declared as an array.

Not again?!  In C, there is no difference twixt and 'tween the two, provided
it gets passed as an argument to a function.  With, however:

	int a[100];
	int b[100];
	void foo(int x, int y, int z) {
		/* do whatever with a and b */
	}

the compiler does *not* have to worry about aliasing.  If a function is
called, it *does* have to worry about a and b changing (unless they are
static, and no pointers were passed out, which a compiler *can* be aware of
fairly easily).  Voila:  no aliasing problems.

With pointers, however, it is a bit different.

-- 
Sean Eric Fagan  | "Time has little to do with infinity and jelly donuts."
seanf@sco.COM    |    -- Thomas Magnum (Tom Selleck), _Magnum, P.I._
(408) 458-1422   | Any opinions expressed are my own, not my employers'.

jlg@lambda.UUCP (Jim Giles) (03/06/90)

From article <5055@scolex.sco.COM>, by seanf@sco.COM (Sean Fagan):
> [...]
> 	int a[100];
> 	int b[100];
> 	void foo(int x, int y, int z) {
> 		/* do whatever with a and b */
> 	}
> 
> the compiler does *not* have to worry about aliasing.  [...]
> [...]            Voila:  no aliasing problems.

Please try to read the thread before making rash claims.  This thread of
the newsgroup had _already_ discarded global arrays as being insufficiently
general.  No one doubts that global arrays are not an aliasing problem in
C.  No one said they were.  No one wants them to be.  No one considers
making all their arrays global (or working only with local arrays, etc.) to 
be an adequate array mechanism.  The only sufficiently general array
manipulation capability involves passing arrays (of different size and
of multiple dimension) as arguments to procedures.  In this respect, C
has an unavoidable aliasing problem.  In this respect, array manipulation
is _missing_ from C.
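
A sketch of the general case being described: a routine that must accept matrices of any size. The names trace and scale_matrix are hypothetical, and the row-major flat layout is an assumption made for the example. The dimensions travel as separate arguments, the subscripting is done by hand, and nothing in the signature of scale_matrix tells the compiler that out and m do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of the diagonal of an n-by-n matrix stored row-major in m. */
double trace(size_t n, const double *m)
{
    double t = 0.0;
    for (size_t i = 0; i < n; i++)
        t += m[i * n + i];
    return t;
}

/* out = k * m for a rows-by-cols matrix; out and m may alias. */
void scale_matrix(size_t rows, size_t cols, double *out,
                  const double *m, double k)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            out[i * cols + j] = k * m[i * cols + j];
}

/* Call from main() to run the checks. */
void demo_matrix_args(void)
{
    double a[4] = { 1.0, 2.0,
                    3.0, 4.0 };
    double b[9] = { 1.0, 2.0, 3.0,
                    4.0, 5.0, 6.0,
                    7.0, 8.0, 9.0 };

    assert(trace(2, a) == 5.0);     /* 1 + 4 */
    assert(trace(3, b) == 15.0);    /* 1 + 5 + 9 */
    scale_matrix(2, 2, a, a, 2.0);  /* in place: legal, fully aliased */
    assert(trace(2, a) == 10.0);    /* 2 + 8 */
}
```
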

[Yes, I know.  I'm still beating this issue to death.  But, as I said:
every time I think the issue is resolved - another invalid C claim is
made.  I've only got thirty years until retirement.  I wonder if this
issue will be dead even then.]

J. Giles