[comp.lang.fortran] FORTRAN array aliasing -- potential problems?

mckenney@ROBALO.NYU.EDU (Alan McKenney) (08/01/89)

[USENET posting via mail, since our posting software doesn't work]


Subject: FORTRAN array aliasing -- potential problems?
Newsgroups: comp.lang.fortran,comp.arch
Keywords: FORTRAN argument array aliasing
Followup-To: comp.lang.fortrancomp.lang.fortran
Reply-To: mckenney@robalo.nyu.edu

    I am considering a situation in which I would call a FORTRAN
subroutine which has several dummy argument arrays using the
same actual array.  Example:

	SUBROUTINE SUB1( N, A, LDA, B, LDB, C, LDC, .... )

    .... coding which sets and then uses A, then sets and uses B
	without referencing A again, then sets and uses C without
	referencing A or B again.  ...
	END
   ....
	DIMENSION WORK(LDW,M)
   ....
	CALL SUB1( M, WORK, LDW, WORK, LDW, WORK, LDW, ... )

In other words, if A, B, and C were not arguments, I could EQUIVALENCE
them without breaking things.  As SUB1 stands, if I am interested
in the final values of A, B, and C, I call SUB1 with distinct arrays,
if not, I call it with the same array.

    The call to SUB1 violates the FORTRAN standard, since A, B, and C
actually refer to the same array, and are modified.  In most
implementations that I know of, A, B, and C are effectively
EQUIVALENCEd to WORK, so it is as if I had used A instead of B and C.
The code I have described works on such a system.

    The FORTRAN on the DECsystem-10 used to copy scalar arguments
into local storage on entry and copy them back on exit; I can conceive
of systems which would do that for arrays as well, so WORK would be set
to either A, B, or C (it being undefined which, but it would be exactly
one of them.)  My code would also work on such a system.

    My question then: are there FORTRAN implementations, either
existing or which one could make an argument for using, under which
such code would break?


Alan McKenney        E-mail:  mckenney@robalo.nyu.edu          (INTERNET)
Courant Institute,NYU         ...!uunet!cmcl2!robalo!mckenney  (UUCP)

ags@mentor.cc.purdue.edu (Dave Seaman) (08/01/89)

In article <8907311910.AA15243@robalo.nyu.edu> mckenney@ROBALO.NYU.EDU (Alan McKenney) writes:

  [Description of subroutine call involving aliased arrays]

>    The call to SUB1 violates the FORTRAN standard, since A, B, and C
>actually refer to the same array, and are modified.  In most
>implementations that I know of, A, B, and C are effectively
>EQUIVALENCEd to WORK, so it is as if I had used A instead of B and C.
>The code I have described works on such a system.
>
>    The FORTRAN on the DECsystem-10 used to copy scalar arguments
>into local storage on entry and copy them back on exit; I can conceive
>of systems which would do that for arrays as well, so WORK would be set
>to either A, B, or C (it being undefined which, but it would be exactly
>one of them.)  My code would also work on such a system.
>
>    My question then: are there FORTRAN implementations, either
>existing or which one could make an argument for using, under which
>such code would break?

Several routines in the standard eigensystem package known as EISPACK make
similar assumptions about associated arrays, and some of the routines fail on
a CDC Cyber 205 under FTN200 (and presumably also on ETA machines running
ftn77, which is derived from FTN200).

The problem is that some of the routines contain associated output arrays
called E and E2, where the documentation says that if you don't care about E2,
then you can use the same array for both.  The code uses DO loops of the form,

	DO 200 I=1,N
	  E2(I) = ...something...
	  E(I)  = ...something else...
    200 CONTINUE

The code explicitly depends on the E values being stored after the E2 values.
The trouble is that an optimizing compiler has every right to assume that E
and E2 are not associated, and therefore that the values may be stored into E
and E2 in whichever order is convenient for maximum efficiency.

Getting back to the question that was asked, it appears to be true in this case
that the resulting values are either all E2 values or all E values, not a
mixture of the two.  Even that may change if the program contains conditional
stores, or DO loops with nonunit strides.

-- 
Dave Seaman	  					
ags@seaman.cc.purdue.edu

johnl@esegue.uucp (John Levine) (08/01/89)

In article <8907311910.AA15243@robalo.nyu.edu> mckenney@ROBALO.NYU.EDU (Alan McKenney) writes:
>    I am considering a situation in which I would call a FORTRAN
>subroutine which has several dummy argument arrays using the
>same actual array.  Example:
>
>	SUBROUTINE SUB1( N, A, LDA, B, LDB, C, LDC, .... )
>
>    .... coding which sets and then uses A, then sets and uses B
>	without referencing A again, then sets and uses C without
>	referencing A or B again.  ...

It's easy to imagine a situation where using the same array for A, B, and C
would get you into trouble, even though they are used in disjoint parts of
the program.  A modern compiler that uses graph coloring for register
assignment tends to get everything into registers that it can.  On a machine
with lots of registers, at the end of the routine it might well have values
of A(I) and B(J) lying around that it hadn't yet stored, and would then store
them away, clobbering C(I) and C(J) that you expect.  Fortran compiler
writers know and depend on the no aliasing rule and you'll certainly be
bitten sooner or later.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869
{ima|lotus}!esegue!johnl, johnl@ima.isc.com, Levine@YALE.something
Massachusetts has 64 licensed drivers who are over 100 years old.  -The Globe

khb%chiba@Sun.COM (chiba) (08/01/89)

In article <8907311910.AA15243@robalo.nyu.edu> mckenney@ROBALO.NYU.EDU (Alan McKenney) writes:
>

>    I am considering a situation in which I would call a FORTRAN
>subroutine which has several dummy argument arrays using the
>same actual array.  Example:

>.... details deleted...


>    My question then: are there FORTRAN implementations, either
>existing or which one could make an argument for using, under which
>such code would break?

I used to support a collection of codes for kalman filtering. After
porting it to at least 30 different vendors machines we were
convienced that

	call rank1(a,a,...)

worked just fine, standard notwithstanding.

Then we got to the CDC7600. With no optimization, it worked (but one
also automagically got tons of debugging stuff put in). With even
minimal optimization, it broke. turned out that between code motion
and instruction scheduling, about 100 lines of code were rewritten by
the compiler so that line 100 was executed just after line 1. This
broke the code, thanks to the implicit aliasing.

As features from the 66xx, Cydra5 and Trace/xxx and other machines
which really benefit from code motion+instruction scheduling become
popular (e.g. SPARC in current and proposed incarnations) I suspect
that over time your code will begin to break more often.

Stick to the standard. If you must violate it, violate it with
something that smacks of good programming (e.g. malloc calls, record
variables, etc.).

cheers



Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |	Marketing Technical Specialist    ! kbierman@sun.com
I Voted for Bill &    |   Languages and Performance Tools. 
Opus  (* strange as it may seem, I do more engineering now     *)

mike@hpfcdc.HP.COM (Mike McNelly) (08/01/89)

As most of the responses have indicated, the real problem with using
parameters to "equivalence" arrays as you've done is in optimization.
Most optimizers assume that formal arguments have distinct memory
locations so that they can be manipulated as distinct entities.  If that
is not the case, you will run into problems with most of the
conventional optimization techniques.

Some compilers have mechanisms to alter this assumption at the expense
of limiting optimization.  I'm familiar with HP 9000 machines which have
this capability.

Mike McNelly
mike%hpfcla@hplabs.hp.com