[comp.lang.fortran] Fortran 8x: pointers

hirchert@uxe.cso.uiuc.edu (07/09/89)

There has been a lot of traffic in this newsgroup about Fortran 8x pointers.
I think it may be time to offer my $0.02.


The first thing that strikes most readers is that the Fortran 8x approach is
quite different from what they may have encountered in C or Pascal.  Instead of
pointers being a different data type with an explicit notation to get from
the pointer value to the object it points to (the "pointee"), pointers in
Fortran 8x are an attribute of the pointee.  No special notation is necessary
to access the pointee, and special statements and procedures are used to access
the pointer behind the pointee.  Consider the following:

In FORTRAN 77, a PARAMETER statement binds a name directly to a value; an
ordinary variable declaration binds a name to a memory location that can be
bound to a value (1 level indirect), and the appearance of a name in a dummy
argument list has the effect in most implementations of binding the name to a
memory location that is bound to a memory location that can be bound to a value
(2 levels indirect).  The symbolic constant, the variable, and the dummy
may all be of the same type and are used in exactly the same way when a value
is needed in an expression.  Similarly, the left hand side of an assignment
statement uses the same syntax for an ordinary variable and a dummy variable
in getting to an object that is 1 level indirect to rebind to a new value.

The Fortran 8x approach to pointers is a logical extension of this situation.
Ordinary objects with the POINTER attribute are 2 levels indirect and dummy
arguments with the POINTER attribute are 3 levels indirect.  Syntactic
uniformity is retained in the existing contexts.  In the new context, the
pointer assignment statement, the right hand side is uniformly interpreted at
1 level indirect and the left hand side at 2 levels indirect.
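The distinction between the two kinds of assignment can be illustrated with
a minimal sketch.  It assumes the syntax as later standardized in Fortran 90
(the 8x draft may differ in detail), free source form, and the made-up names
X, Y, and P.

```fortran
      PROGRAM LEVELS
      IMPLICIT NONE
      REAL, TARGET  :: X, Y   ! ordinary variables that may be pointed to
      REAL, POINTER :: P      ! the name P denotes the pointee, not an address

      X = 1.0
      Y = 10.0

      P => X          ! pointer assignment: rebinds P to the location of X
      P = P + 1.0     ! ordinary assignment: changes the pointee, so X is 2.0

      P => Y          ! rebinding P leaves X untouched
      P = P + 1.0     ! Y is now 11.0

      PRINT *, X, Y
      END PROGRAM LEVELS
```

Only the two "=>" statements touch the pointer itself; every other use of P
reads or writes whatever P currently points to, with the same syntax as an
ordinary variable.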

In other words, one of the reasons Fortran 8x pointers look the way they do is
because the committee chose to be consistent with FORTRAN rather than other
languages that already had pointers.  It also offers a number of other
benefits, like keeping simple the notation used to deal with the actual data
and flagging the manipulation of the pointers (the usual source of problems
in pointer algorithms gone wrong), but I consider these benefits secondary to
the consistency issue.

If you feel that pointers as a data type and explicit dereferencing are
necessary to your programming wellbeing, you can do that with the Fortran 8x
facility.  First, create your pointer datatype as a derived type.
      TYPE PTRTO_REAL; REAL,POINTER::VAL; END TYPE
Then you can declare pointers of that type.
      TYPE(PTRTO_REAL)::P,Q
Ordinary derived-type assignment will do your pointer assignments.
      P=Q
Component selection is your explicit dereferencing operator.
      P%VAL=P%VAL+1
Unfortunately, .EQ. is no longer automatically defined for derived types and
objects with the POINTER attribute are no longer automatically in the
disassociated state, so you will have to define .EQ. yourself (using
the ASSOCIATED intrinsic) if you need it, and explicitly NULLIFY a variable
to serve as your NIL pointer, if you need that.  (I plan on trying to get
the committee to modify those decisions, so this style of usage would be more
convenient.)
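One way the user-defined .EQ. and the NIL pointer might look is sketched
below.  This is an illustration only: it assumes Fortran 90 syntax and free
source form, and the names PTR_TYPES, SAME_TARGET, DEMO, and NIL are invented
for the example.

```fortran
      MODULE PTR_TYPES
      IMPLICIT NONE
      TYPE PTRTO_REAL
        REAL, POINTER :: VAL
      END TYPE
      INTERFACE OPERATOR (.EQ.)
        MODULE PROCEDURE SAME_TARGET
      END INTERFACE
      CONTAINS
      ! Two pointers compare equal when both are disassociated or both
      ! point to the same target.  Neither may be in the undefined state.
      FUNCTION SAME_TARGET(A, B) RESULT(SAME)
        TYPE(PTRTO_REAL), INTENT(IN) :: A, B
        LOGICAL :: SAME
        IF (ASSOCIATED(A%VAL)) THEN
          SAME = ASSOCIATED(A%VAL, B%VAL)
        ELSE
          SAME = .NOT. ASSOCIATED(B%VAL)
        END IF
      END FUNCTION
      END MODULE PTR_TYPES

      PROGRAM DEMO
      USE PTR_TYPES
      IMPLICIT NONE
      REAL, TARGET :: X
      TYPE(PTRTO_REAL) :: P, Q, NIL
      NULLIFY(NIL%VAL)       ! NIL must be disassociated explicitly
      X = 1.0
      P%VAL => X
      Q = P                  ! derived-type assignment copies the pointer
      P%VAL = P%VAL + 1      ! explicit dereference through the component
      PRINT *, P .EQ. Q, P .EQ. NIL, X
      END PROGRAM DEMO
```

Here P and Q compare equal because they share the target X, while neither
equals NIL.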


Given that Fortran 8x uses a different syntactic style for pointers, an
obvious question is whether it is as expressive.

For algorithms not involving pointers to pointers or arrays of pointers,
it is possible to directly transcribe pointer algorithms in the Fortran 8x
notation.  In addition, many algorithms using pointers to pointers or arrays
of pointers can be massaged into algorithms that do not.  The one problem with
doing such transcriptions is that good pointer names may not always be good
pointee names (from a readability standpoint).

If you absolutely _must_ have pointers to pointers or arrays of pointers,
the creation of user-defined pointer types (illustrated above) will allow
you to have them.
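As a rough sketch of that last point, the wrapper type from the earlier
example can be used to build both an array of pointers and a pointer to a
pointer.  Fortran 90 syntax and free source form are assumed, and the names
TABLE and PP are hypothetical.

```fortran
      PROGRAM PTRARR
      IMPLICIT NONE
      TYPE PTRTO_REAL
        REAL, POINTER :: VAL
      END TYPE
      REAL, TARGET :: A, B, C
      TYPE(PTRTO_REAL), TARGET  :: TABLE(3)   ! an array of pointers
      TYPE(PTRTO_REAL), POINTER :: PP         ! a pointer to a pointer

      A = 1.0; B = 2.0; C = 3.0
      TABLE(1)%VAL => A
      TABLE(2)%VAL => B
      TABLE(3)%VAL => C

      PP => TABLE(2)            ! PP now designates the pointer TABLE(2)
      PP%VAL = PP%VAL * 10.0    ! two dereferences: B becomes 20.0
      PP%VAL => C               ! retarget TABLE(2) itself through PP

      PRINT *, B, TABLE(2)%VAL
      END PROGRAM PTRARR
```

After the last pointer assignment TABLE(2) points to C, so the final
statement prints the updated B followed by the value of C.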

Thus, any algorithm expressible using "conventional" pointers should also
be expressible in the Fortran 8x pointer facility.


The issue of optimization has come up.

Because ordinary variables that are to be pointed to must be declared with the
TARGET attribute, the presence of pointers should have _no_ effect on the
optimization of programs that do not use them (e.g. existing FORTRAN 77
programs).

The opportunities to optimize the pointers themselves are somewhat more
limited, although not as badly as some people have claimed.  Global pointers
offer many of the same optimization problems as global variables, but local
pointers can be optimized fairly intelligently.  There are cases where giving
more information to the processor would allow it to optimize more effectively.
I intend to lobby for declarations which give the programmer the option of
providing this information.


How do Fortran 8x pointers compare with SET RANGE and IDENTIFY?

The pointer facility was a political replacement for these two features, not
a technical one.  SET RANGE has frequently been misunderstood, as it has
nothing to do with setting up pointers or dope vectors.  Instead, it provides
a means for the programmer to specify the values to be used when array bounds
were omitted.  It could be implemented by a source to source translation.
It probably would have been dropped anyway because of recurring problems in
describing the scope and lifetime of these settable defaults.  IDENTIFY and
pointers are quite similar.  Pointers are more powerful in their ability
to implement recursive (and other linked) data structures.  IDENTIFY was more
powerful in its ability to map skew sections of arrays (e.g. the diagonal of
a square matrix).  Pointers would cover this latter functionality if there were
a stand-alone notation for skew sections.  I intend to lobby for such a
notation.  Because of its limitations, there were some cases where IDENTIFY
could be more fully optimized.  I believe most, if not all, of these cases
would be covered by the optimization declarations I mentioned above.
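For illustration, the diagonal case can be reached through pointers today,
but only by going through a rank-1 view of the storage.  The sketch below is
not the stand-alone skew-section notation discussed above: it assumes the
pointer rank remapping that arrived in a later standard (Fortran 2003), free
source form, and the made-up names FLAT, M, and D.

```fortran
      PROGRAM DIAG
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 3
      REAL, TARGET  :: FLAT(N*N)      ! the storage, viewed as rank 1
      REAL, POINTER :: M(:,:), D(:)
      INTEGER :: I

      FLAT = 0.0
      M(1:N,1:N) => FLAT        ! rank remapping: an N x N view (Fortran 2003)
      D => FLAT(1:N*N:N+1)      ! every (N+1)th element, i.e. M(I,I)

      D = 1.0                   ! M is now the identity matrix
      DO I = 1, N
        PRINT *, M(I,:)
      END DO
      END PROGRAM DIAG
```

Because Fortran stores arrays in column-major order, M(I,I) sits at position
1 + (I-1)*(N+1) of FLAT, which is exactly the strided section D.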


Should pointers be only addresses or something more?  Most Fortran 8x pointers
would be nothing but addresses.  They would contain extra information only
when there is additional structure to the pointee.  In such cases, it is much
less error prone to have the processor maintain this information with the
address than to have the programmer maintain it separately.  I would, however,
support an additional declaration that asserts that a pointer will point
only to contiguous objects.  Although being able to point to discontiguous
objects is occasionally useful, there will be many programs that will have
no need to pay this extra cost.  (I would even be willing to have this
"contiguous" assertion be the default, as long as there was an option to get
the discontiguous option.)
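A small sketch of a pointer that needs more than an address follows, assuming
Fortran 90 syntax and free source form; the names A and P are illustrative.

```fortran
      PROGRAM STRIDE
      IMPLICIT NONE
      REAL, TARGET  :: A(10)
      REAL, POINTER :: P(:)

      A = 1.0
      P => A(2:10:2)    ! a discontiguous target: every second element of A
      P = 0.0           ! zeroes A(2), A(4), ..., A(10) only

      ! The processor keeps the extent and stride alongside the address.
      PRINT *, SIZE(P), LBOUND(P,1)
      PRINT *, A
      END PROGRAM STRIDE
```

A pointer restricted to contiguous targets could drop the stride, which is
the cost the proposed assertion is meant to avoid.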


Finally, there has been some suggestion that recursive types and aliasing
should be separate facilities.  I have worked on programs where I really needed
one facility to do both, and attempting to use separate facilities resulted
in kludgy code.

Also, despite implications to the contrary, X3J3 _did_ look at the possibility
of directly putting recursive types into the language.  They just didn't
work well relative to Fortran assignment and comparison semantics.  (Among the
problem areas were mutually recursive types, and the creation of circularly
linked lists and other organizations which have no "leaf nodes".)
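As an illustration of one facility serving both purposes, here is a minimal
sketch of a circular list built with pointer components.  It assumes Fortran
90 syntax and free source form; the names NODE, KEY, NEXT, HEAD, and CUR are
invented for the example.

```fortran
      PROGRAM RING
      IMPLICIT NONE
      TYPE NODE
        INTEGER :: KEY
        TYPE(NODE), POINTER :: NEXT   ! the recursive link
      END TYPE
      TYPE(NODE), POINTER :: HEAD, CUR
      INTEGER :: I

      ALLOCATE(HEAD)
      HEAD%KEY = 1
      HEAD%NEXT => HEAD          ! a one-node ring; there is no leaf node

      DO I = 2, 3
        ALLOCATE(CUR)
        CUR%KEY = I
        CUR%NEXT => HEAD%NEXT    ! splice the new node in after HEAD
        HEAD%NEXT => CUR
      END DO

      CUR => HEAD                ! CUR is an alias used to walk the ring
      DO I = 1, 4
        PRINT *, CUR%KEY
        CUR => CUR%NEXT
      END DO
      END PROGRAM RING
```

The NEXT component plays the role of the recursive type, and CUR is plain
aliasing; both are the same POINTER attribute.  The fourth value printed is
the first key again, since the walk wraps around.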


I guess that's all for now.  Flame if you will this old gray head; I'm
leaving for the WG5 and X3J3 meetings in a few hours, and your postings
will probably expire before I get back.


Kurt W. Hirchert     hirchert@ncsa.uiuc.edu
National Center for Supercomputing Applications

bill@ssd.harris.com (Bill Leonard) (07/10/89)

> In FORTRAN 77, a PARAMETER statement binds a name directly to a value; an
> ordinary variable declaration binds a name to a memory location that can be
> bound to a value (1 level indirect), and the appearance of a name in a dummy
> argument list has the effect in most implementations of binding the name to a
> memory location that is bound to a memory location that can be bound to a value
> (2 levels indirect).  The symbolic constant, the variable, and the dummy
> may all be of the same type and are used in exactly the same way when a value
> is needed in an expression.  Similarly, the left hand side of an assignment
> statement uses the same syntax for an ordinary variable and a dummy variable
> in getting to an object that is 1 level indirect to rebind to a new value.
> 
> The Fortran 8x approach to pointers is a logical extension of this situation.
> Ordinary objects with the POINTER attribute are 2 levels indirect and dummy
> arguments with the POINTER attribute are 3 levels indirect.  Syntactic
> uniformity is retained in the existing contexts.  In the new context, the
> pointer assignment statement, the right hand side is uniformly interpreted at
> 1 level indirect and the left hand side at 2 levels indirect.
> 
> In other words, one of the reasons Fortran 8x pointers look the way they do is
> because the committee chose to be consistent with FORTRAN rather than other
> languages that already had pointers.  It also offers a number of other
> benefits, like keeping simple the notation used to deal with the actual data
> and flagging the manipulation of the pointers (the usual source of problems
> in pointer algorithms gone wrong), but I consider these benefits secondary to
> the consistency issue.

I find this whole argument specious.  First of all, the discussion of dummy
arguments is based on how (most) processors implement them -- it isn't
something mandated by the standard.  Secondly, "being consistent with
FORTRAN" makes no sense; you might as well argue that structures should be
just like COMPLEX (i.e., using intrinsics to get at the parts), because
that is "consistent with FORTRAN".  The obvious retort is that it just
isn't what people expect when they think of structures; likewise, what
FORTRAN/8x has just isn't what people expect when they think of pointers.

> Thus, any algorithm expressible using "conventional" pointers should also
> be expressible in the Fortran 8x pointer facility.

And any input/output algorithm expressible in FORTRAN is also expressible
using only calls to a low-level "get character" routine -- so why have I/O?
You know why -- to improve efficiency and readability.  Many people seem
to have missed the point of the examples using one pointer versus two
for linked-list manipulation: it is much more efficient!  To a programmer
experienced in manipulating linked lists, requiring him to use two pointers
is similar to asking a numerical analyst to use back-substitution to solve
a system of linear equations.

> Pointers would cover this latter functionality if there were
> a stand-alone notation for skew sections.  I intend to lobby for such a
> notation.  Because of its limitations, there were some cases where IDENTIFY
> could be more fully optimized.  I believe most, if not all, of these cases
> would be covered by the optimization declarations I mentioned above.

God help us.  If you think Algol 60 is a great language, I'm sure you'll
love this.  For those of us who still have nightmares of implementing
"call by name", this is even worse: arbitrary functions to be executed
upon referencing a pointer!

--
Bill Leonard
Harris Computer Systems Division
2101 W. Cypress Creek Road
Fort Lauderdale, FL  33309
bill@ssd.harris.com or hcx1!bill@uunet.uu.net