[comp.lang.fortran] function side effects

jlg@lanl.gov (Jim Giles) (09/20/88)

From article <3983@h.cc.purdue.edu>, by ags@h.cc.purdue.edu (Dave Seaman):
> In article <3701@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>  [ After I quoted the standard to prove he was wrong about side effects]
> 
>>of Section 6.6.2, "Order of Evaluation of Functions":
>>
>>      In a statement that contains more than one function
>>      reference, the value provided by each function reference
>>      must be independent of the order chosen by the processor for
>>      evaluation of the function references.

How did your quote prove that I was wrong about side effects? I said that
a particular example of code quoted in a previous article was not allowed
to have function calls which contained side effects (actually I said that
the compiler was free to assume that there were no side effects).  My quote
from the standard _exactly_ proves what I said.  The fact that some other
part of the standard explicitly allows a particular type of side effect
doesn't make it legal in the particular case under consideration.

> You said that it was not possible to implement a random number generator
> in standard Fortran.

I didn't either.  I just pointed out that the standard has never contained
a random number generator and that maybe the constraints (in 6.6 of the
standard) were the reason why.

> You said that allowing side effects is "an extension to the standard."

Side effects other than the one you found (which I admitted that I'd
forgotten) _are_ extensions to the standard.  The standard doesn't
explicitly prohibit side effects anywhere except in section 6.6.  But
they _are_ definitely prohibited in those contexts covered in 6.6.  Some
(probably most) compilers allow functions with side effects even in
those contexts covered by section 6.6 (they allow it by simply not checking,
and the resulting code produces bad results).

> Let me point out two things about the statement you quoted from the
> standard.  If the intent was to rule out side effects altogether, then why
> is that first clause there?  Why is the restriction so narrowly worded
> that it applies only when there are multiple function calls in a
> statement?  [...]

And why is it that I am not allowed to assume that the issue in question
is the use of multiple function calls in one statement?  That _WAS_ the
example under consideration!! 

>             And why does it only rule out one particular category of side
> effect (the ones that would affect the functions' returned values)?  It
> does not even guarantee that it is safe to evaluate the subscript only
> once in my example which started this discussion.  The function may
> increment a variable in COMMON each time it is called.  [...]

Read the rest of section 6.6 - these things are indeed illegal if they
affect any other part of the statement that makes the call.  That _was_
the context of this discussion, wasn't it?

> Isolated special cases do not count.  It is the general rule that I am
> concerned with.

Well, here it is - the admission.  I always suspected that you had
different rules of validity for my arguments than for yours.  The issue
that started this discussion was an isolated special case of the use of 
'+=' and _you_ originated it.  Now, all _my_ arguments are supposed to 
be invalid because I constrained my discussion to that particular case.

---- End of flaming ----

Now.  The real question is not 'when and where can Fortran functions have
side effects and when can't they?'  The question should be: 'how can 
Fortran be improved so that optimization can be done without making
incorrect assumptions about function side effects?'  I still recommend
that functions should be declared to have side effects if they do.
That way the compiler would be free to apply optimizations to those
that don't have side effects _and_ the compiler could actually detect
the illegal (or unsafe) uses of functions which _do_ have side effects.
In Fortran 8x style syntax:

      side effect, external :: func1, func2

      ...

      ABC = func1(4)+func1(5)   !would be unsafe to optimize because 
                                !the order of execution might be important

      XYZ(gfunc(2))=gfunc(2)    !would be safe to optimize (call gfunc only
                                !once), gfunc has no side effects

J. Giles
Los Alamos 

ags@h.cc.purdue.edu (Dave Seaman) (09/20/88)

In article <3745@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>How did your quote prove that I was wrong about side effects? I said that
>a particular example of code quoted in a previous article was not allowed
>to have function calls which contained side effects (actually I said that
>the compiler was free to assume that there were no side effects).  My quote
>from the standard _exactly_ proves what I said.  The fact that some other
>part of the standard explicitly allows a particular type of side effect
>doesn't make it legal in the particular case under consideration.

I have been discussing both the general and the special case.  So have
you, though you apparently don't want to admit it.  Here is what you said
originally:

>No, it doesn't cause the subscript to be evaluated twice.  In Fortran
>functions are not allowed to have side effects.  

And here is what you are saying now (from the very same posting in 
which you took me to task for discussing the general case):

>Side effects other than the one you found (which I admitted that I'd
>forgotten) _are_ extensions to the standard.  

Therefore you have been discussing the general case yourself, 
whether you admit it or not.  Moreover, your revised statement is 
still incorrect.  Here are two consecutive statements from section 
6.6.  The first is one you have referred to yourself, in the 
mistaken belief that it refutes something I said in my previous 
posting (more on that shortly).  At the moment I need the first 
statement to establish the context for the second, which is one 
you apparently overlooked entirely.  The statement numbers in 
square brackets are mine.  

[1] The execution of a function reference in a statement may 
not alter the value of any entity in common (8.3) that 
affects the value of any other function reference in that 
statement.  [2] However, execution of a function reference in 
the expression e of a logical IF statement (11.5) is 
permitted to affect entities in the statement st that is 
executed when the value of the expression e is true.

In statement [2] we find a second type of side effect that is 
explicitly allowed by the standard.  What's next?  Do you intend 
to change your statement again to say that all side effects except 
the TWO cases which I have cited are forbidden?  I am prepared to 
argue that even that statement is incorrect, but right now I want 
to turn my attention to your charge that I have been ignoring the 
special case which began this discussion.

You have correctly pointed out that because of statement [1] that 
I just quoted, the function call in my example is not permitted to 
return different values if it is called twice in the same 
statement with the same argument.  In my previous posting, 
however, I described another kind of side effect that would not 
violate this prohibition.  Here is an example.

	COMMON // E
	REAL A(10)
	E = 0.0
	DO 10 I=1,10
  	  A(I) = 0.0
10	CONTINUE
	DO 20 I=1,10
	  A(INVERT(I)) = A(INVERT(I)) + 1
20	CONTINUE
	PRINT *, E
	END
	FUNCTION INVERT(N)
	COMMON // E
	E = E + 1
	INVERT = 11 - I
	END

You will notice that the function INVERT has a side effect on the 
COMMON variable E.  This does not violate statement [1], because E 
does not affect the value returned by INVERT. I will be the first 
to agree that this is lousy code.  I am even prepared to agree 
that it might be illegal, provided you can find a statement in the 
standard to support that conclusion.  I have not yet found one.  
If you can't find one either, then perhaps that points out another 
area that needs to be corrected in the next standard.  You said 
you are primarily interested in improving Fortran.  I share your 
concern.

-- 
Dave Seaman	  					
ags@j.cc.purdue.edu

ok@quintus.uucp (Richard A. O'Keefe) (09/20/88)

In article <3745@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>...  I just pointed out that the standard has never contained
>a random number generator and that maybe the constraints (in 6.6 of the
>standard) were the reason why.

There is no reason why the random number generator has to take the form
of a _function_.  The standard could have specified something like
	CALL RNSEED(I)		/* initialise */
	CALL RNUNIF(X)		/* get next random number */
Perhaps someone associated with the F77 standard is reading this newsgroup
and could tell us the real reason.  I suspect that the answer may have more
to do with the standard not specifying integer arithmetic (e.g. not wanting
to specify a generator which requires 32-bit twos complement with no
overflow detection).  Or they may not have wanted to commit the standard to
a generator which might later prove unsatisfactory.

To return briefly to the "update assignment" notation: I suggest that a
notation like <lhs> <op>:= <rhs> is advantageous to both writer and reader.
To the writer:
	if you don't type the <lhs> twice, you've missed a chance to
	get it wrong.
To the reader:
	it is *obvious* that the lhs is being updated without having to
	decode a complicated expression.
Never mind whether the <lhs> is evaluated once or twice (in a well-written
program it really shouldn't matter): you only have to _write_ it once and
you only have to _read_ it once.

scf@statware.UUCP (Steve Fullerton) (09/20/88)

In article <3745@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>
>Now.  The real question is not 'when and where can Fortran functions have
>side effects and when can't they?'  The question should be: 'how can 
>Fortran be improved so that optimization can be done without making
>incorrect assumptions about function side effects?'  I still recommend
>that functions should be declared to have side effects if they do.
>That way the compiler would be free to apply optimizations to those
>that don't have side effects _and_ the compiler could actually detect
>the illegal (or unsafe) uses of functions which _do_ have side effects.
>In Fortran 8x style syntax:
>
>      side effect, external :: func1, func2
>
>      ...

The Fortran compiler for the HP9000 Series 800 machines has many compiler
optimization directives that allow the specification of side effects and
more.

$OPTIMIZE ASSUME_NO_PARAMETER_OVERLAPS assumes that no actual parameters
      passed to a procedure overlap each other.

$OPTIMIZE ASSUME_NO_SIDE_EFFECTS assumes that the current procedure changes
      only local variables.  It does not change any variables in COMMON, nor
      does it change parameters.

$OPTIMIZE ASSUME_PARM_TYPES_MATCHED assumes that all of the actual parameters
      passed were the type expected by this subroutine.

$OPTIMIZE ASSUME_NO_EXTERNAL_PARMS assumes that none of the parameters passed
      to the current procedure are from an external space, that is, different
      from the user's own data space.  Parameters can come from another space
      if they come from operating system space or if they are in a space
      shared by other users.

$OPTIMIZE ASSUME_NO_SHARED_COMMON_PARMS assumes that none of the parameters
      passed to the current procedure are from a shared common block.

These directives can be sprinkled throughout the code, which makes a mess,
not to mention highly non-portable code; however, certain ones that you
always want in effect can be placed into a file which the compiler will
include at the beginning of each file (+Q file).

They also have different optimization levels, level 1 (optimization within
each basic block), level 2 minimum (optimizes within each procedure with no
assumptions on interactions of procedures, all ASSUME settings OFF),
level 2 normal (optimizes within each procedure with ASSUME_NO_SIDE_EFFECTS
turned OFF, all others ON), level 2 maximum (all assumptions set to ON).

The main problem I have encountered with optimization is that while users
want as much optimization as possible, the higher levels of optimization
absolutely require well tested code; e.g., no assumptions of local variables
being zero, etc.  Many programmers find that code that has always `worked'
now fails under a high level of optimization and the easiest thing to do is
to blame the optimizer.

Just my 2 cents worth---I have used a lot of different compilers, from
micro's to mainframes and the unfortunate fact of life is that I now write
code that I know will work when ported to all of these systems, rather than
following the standards or literature.  When a particular compiler has problems
with a certain coding construct, that construct is removed from my code for all
systems.  When porting new revisions of a 130,000-line program every year to
several systems, it has to be this way.

-- 
Steve Fullerton                        Statware, Inc.
scf%statware.uucp@cs.orst.edu          260 SW Madison Ave, Suite 109
orstcs!statware!scf                    Corvallis, OR  97333
                                       503/753-5382

ags@h.cc.purdue.edu (Dave Seaman) (09/20/88)

In article <3987@h.cc.purdue.edu> I write:
>	FUNCTION INVERT(N)
>	COMMON // E
>	E = E + 1
>	INVERT = 11 - I
>	END

Sorry, that last assignment should be INVERT = 11 - N.  The point is that
INVERT has a side effect on E, but the value of E does not affect the value
returned by INVERT.

-- 
Dave Seaman	  					
ags@j.cc.purdue.edu

jlg@lanl.gov (Jim Giles) (09/21/88)

From article <3987@h.cc.purdue.edu>, by ags@h.cc.purdue.edu (Dave Seaman):
> [...]                             What's next?  Do you intend 
> to change your statement again to say that all side effects except 
> the TWO cases which I have cited are forbidden? [...]

No, I am going to state once again that the specific example you cited
to start this discussion is not allowed to have side effects.  I am
prepared to say that section 6.6 outlaws all but the most trivial side
effects (and the standard committee _intended_ to outlaw _all_ of them)
in the kind of statement under discussion.  You are the one who keeps
introducing into the discussion things which weren't relevant to the
original issue.  If you want to talk about side effects in general - fine.
Read section 1.4 of the standard document.  Even if side effects _were_
completely outlawed, section 1.4 permits a standard conforming compiler
to provide them as an extension.  So, I have no problem with side effects
anywhere except where they conflict with explicit prohibitions of the
standard - I never have.

> 	  A(INVERT(I)) = A(INVERT(I)) + 1

I have talked to members of the Fortran standards committee.  It is intended
that the following optimization of the above statement be permissible:

      TEMP = INVERT(I)
      A(TEMP) = A(TEMP) + 1

This is intended to have exactly the same meaning as the line above.
It is always possible that you have come across a type of side effect
which the committee forgot to outlaw when considering this issue.  In
that event, I would bet that the committee would agree that the
optimization given here is still valid. (After all, what leads you to
believe that a function is called twice just because its name appears
twice in a statement?  The standard doesn't _require_ that, you know.
The only thing required by the standard is that it compute the statement
correctly - the above optimization still _does_ that.)

----------------------------------------------------------

Still, you haven't addressed the REAL issue that _I'm_ interested in -
should side effects be allowed where Fortran currently doesn't allow
them?  If so, how can the language be modified to make such an allowance
without throwing out the ability to optimize expressions?  This has a
bearing on the original subject of this discussion - should Fortran
(or any language) have those extra assignment operators?  If functions
aren't allowed side effects, the extra assignments are not very interesting.

J. Giles
Los Alamos

ok@quintus.uucp (Richard A. O'Keefe) (09/21/88)

In article <3821@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>Still, you haven't addressed the REAL issue that _I'm_ interested in -
>should side effects be allowed where Fortran currently doesn't allow
>them?  If so, how can the language be modified to make such an allowance
>without throwing out the ability to optimize expressions?  This has a
>bearing on the original subject of this discussion - should Fortran
>(or any language) have those extra assignment operators?  If functions
>aren't allowed side effects, the extra assignments are not very interesting.

I imagine that Fortran functions are allowed to change things in order that
they may be passed suitably dimensioned working storage, and that if/when
functions were/are allowed to have dynamically dimensioned local arrays,
function side effects could/can be dispensed with entirely.

This has essentially no bearing on the question of whether a language should
have assignment operators:  they are, like long identifiers, technically a
luxury, but in practice a useful aid to reading and writing code.

Note that C could dispense with update assignment:  for each type T and
operation O for which Lhs O= Rhs is currently defined, we could define
	T T_O_asg(T *Lhs, T Rhs) { return *Lhs = *Lhs O Rhs; }
e.g.	long long_plus_asg(long *Lhs, long Rhs) { return *Lhs = *Lhs + Rhs; }
and then use
	T_O_asg(&(Lhs), Rhs)
instead of Lhs O= Rhs, e.g.
	(void) long_plus_asg(&i, 2);
instead of i += 2;
This would take care of the "side effects occur only once" point, and with
C++'s inline functions would presumably result in the same code being
generated.  It's (now) a "readability" issue, not a "power" issue.