hirchert@uxe.cso.uiuc.edu (07/09/89)
There has been a lot of traffic in this newsgroup about Fortran 8x pointers. I think it may be time to offer my $0.02.

The first thing which strikes most readers is that the Fortran 8x approach is quite different from what they may have encountered in C or Pascal. Instead of pointers being a different data type with an explicit notation to get from the pointer value to the object it points to (the "pointee"), pointers in Fortran 8x are an attribute of the pointee, no special notation is necessary to access the pointee, and special statements and procedures are used to access the pointer behind the pointee.

Consider the following: In FORTRAN 77, a PARAMETER statement binds a name directly to a value; an ordinary variable declaration binds a name to a memory location that can be bound to a value (1 level indirect), and the appearance of a name in a dummy argument list has the effect in most implementations of binding the name to a memory location that is bound to a memory location that can be bound to a value (2 levels indirect). The symbolic constant, the variable, and the dummy may all be of the same type and are used in exactly the same way when a value is needed in an expression. Similarly, the left hand side of an assignment statement uses the same syntax for an ordinary variable and a dummy variable in getting to an object that is 1 level indirect to rebind to a new value.

The Fortran 8x approach to pointers is a logical extension of this situation. Ordinary objects with the POINTER attribute are 2 levels indirect and dummy arguments with the POINTER attribute are 3 levels indirect. Syntactic uniformity is retained in the existing contexts. In the new context, the pointer assignment statement, the right hand side is uniformly interpreted at 1 level indirect and the left hand side at 2 levels indirect.
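The uniformity described above can be sketched in a few lines. This is only an illustration, written in the syntax later standardized as Fortran 90, which may differ in detail from the 8x draft under discussion; the names c, v, p, total and bump are hypothetical.

```fortran
program indirection_levels
  implicit none
  real, parameter :: c = 1.0   ! name bound directly to a value
  real, target    :: v         ! name bound to a storage location (1 level)
  real, pointer   :: p         ! name bound to a location that designates a target (2 levels)
  real            :: total

  v = 2.0
  p => v                       ! pointer assignment: rebinds p itself, not the pointee

  ! In an expression, the constant, the variable and the pointer all use the same syntax.
  total = c + v + p

  call bump(v)                 ! dummy argument associated with an ordinary variable
  call bump(p)                 ! p is dereferenced automatically, so v is updated again
  p = p + c                    ! ordinary assignment stores through p into v

  print *, total, v
contains
  subroutine bump(x)
    real, intent(inout) :: x   ! the dummy is used exactly like a local variable
    x = x + 1.0
  end subroutine bump
end program indirection_levels
```

Only the `p => v` statement touches the pointer itself; every other line reads or writes the pointee with ordinary notation.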
In other words, one of the reasons Fortran 8x pointers look the way they do is because the committee chose to be consistent with FORTRAN rather than other languages that already had pointers. It also offers a number of other benefits, like keeping simple the notation used to deal with the actual data and flagging the manipulation of the pointers (the usual source of problems in pointer algorithms gone wrong), but I consider these benefits secondary to the consistency issue.

If you feel that pointers as a data type and explicit dereferencing are necessary to your programming wellbeing, you can do that with the Fortran 8x facility. First, create your pointer datatype as a derived type.

      TYPE PTRTO_REAL; REAL,POINTER::VAL; END TYPE

Then you can declare pointers of that type.

      TYPE(PTRTO_REAL)::P,Q

Ordinary derived-type assignment will do your pointer assignments.

      P=Q

Component selection is your explicit dereferencing operator.

      P%VAL=P%VAL+1

Unfortunately, .EQ. is no longer automatically defined for derived types and objects with the POINTER attribute are no longer automatically in the disassociated state, so you will have to define .EQ. yourself (using the ASSOCIATED intrinsic) if you need it, and explicitly NULLIFY a variable to serve as your NIL pointer, if you need that. (I plan on trying to get the committee to modify those decisions, so this style of usage would be more convenient.)

Given that Fortran 8x uses a different syntactic style for pointers, an obvious question is whether it is as expressive. For algorithms not involving pointers to pointers or arrays of pointers, it is possible to directly transcribe pointer algorithms in the Fortran 8x notation. In addition, many algorithms using pointers to pointers or arrays of pointers can be massaged into algorithms that do not. The one problem with doing such transcriptions is that good pointer names may not always be good pointee names (from a readability standpoint).
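The derived-type wrapper described above, including a user-defined .EQ. and an explicitly nullified NIL value, might look roughly like the following. It is a sketch in Fortran 90 syntax, which may not match the 8x draft exactly; PTRTO_REAL and VAL come from the post, while same_target, nil, table and the module and program names are assumptions made for illustration.

```fortran
module ptr_to_real_mod
  implicit none
  type ptrto_real
    real, pointer :: val          ! the wrapped pointer; %val acts as the dereference
  end type ptrto_real

  interface operator(.eq.)        ! .EQ. is not predefined for derived types
    module procedure same_target
  end interface
contains
  logical function same_target(a, b)
    type(ptrto_real), intent(in) :: a, b
    if (associated(a%val) .and. associated(b%val)) then
      same_target = associated(a%val, b%val)   ! both point at the same object
    else
      ! equal only if both are disassociated (both "NIL")
      same_target = .not. associated(a%val) .and. .not. associated(b%val)
    end if
  end function same_target
end module ptr_to_real_mod

program wrapper_demo
  use ptr_to_real_mod
  implicit none
  real, target     :: x, y
  type(ptrto_real) :: p, q, nil
  type(ptrto_real) :: table(2)    ! an array of pointers, via the wrapper type

  x = 1.0
  y = 10.0
  nullify(nil%val)                ! pointers do not start out disassociated
  q%val => x
  p = q                           ! derived-type assignment copies the association
  p%val = p%val + 1               ! explicit dereference: updates x
  table(1) = p
  table(2)%val => y
  table(2)%val = table(1)%val + table(2)%val   ! updates y through the array

  print *, x, y
  print *, p .eq. q, table(2) .eq. p, p .eq. nil
end program wrapper_demo
```

The same wrapper could be given a pointer component of its own type to obtain pointers to pointers; the array `table` shows the arrays-of-pointers case.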
If you absolutely _must_ have pointers to pointers or arrays of pointers, the creation of user-defined pointer types (illustrated above) will allow you to have them. Thus, any algorithm expressible using "conventional" pointers should also be expressible in the Fortran 8x pointer facility.

The issue of optimization has come up. Because ordinary variables that are to be pointed to must be declared with the TARGET attribute, the presence of pointers should have _no_ effect on the optimization of programs that do not use them (e.g. existing FORTRAN 77 programs). The opportunities to optimize the pointers themselves are somewhat more limited, although not as badly as some people have claimed. Global pointers offer many of the same optimization problems as global variables, but local pointers can be optimized fairly intelligently. There are cases where giving more information to the processor would allow it to optimize more effectively. I intend to lobby for declarations which give the programmer the option of providing this information.

How do Fortran 8x pointers compare with SET RANGE and IDENTIFY? The pointer facility was a political replacement for these two features, not a technical one.

SET RANGE has frequently been misunderstood, as it has nothing to do with setting up pointers or dope vectors. Instead, it provides a means for the programmer to specify the values to be used when array bounds are omitted. It could be implemented by a source to source translation. It probably would have been dropped anyway because of recurring problems in describing the scope and lifetime of these settable defaults.

IDENTIFY and pointers are quite similar. Pointers are more powerful in their ability to implement recursive (and other linked) data structures. IDENTIFY was more powerful in its ability to map skew sections of arrays (e.g. the diagonal of a square matrix). Pointers would cover this latter functionality if there were a stand-alone notation for skew sections.
I intend to lobby for such a notation. Because of its limitations, there were some cases where IDENTIFY could be more fully optimized. I believe most, if not all, of these cases would be covered by the optimization declarations I mentioned above.

Should pointers be only addresses or something more? Most Fortran 8x pointers would be nothing but addresses. They would contain extra information only when there is additional structure to the pointee. In such cases, it is much less error prone to have the processor maintain this information with the address than to have the programmer maintain it separately. I would, however, support an additional declaration that asserts that a pointer will point only to contiguous objects. Although being able to point to discontiguous objects is occasionally useful, there will be many programs that will have no need to pay this extra cost. (I would even be willing to have this "contiguous" assertion be the default, as long as there was a way to request the discontiguous behavior.)

Finally, there has been some suggestion that recursive types and aliasing should be separate facilities. I have worked on programs where I really needed only one facility to do both, and attempting to use separate facilities resulted in kludgy code. Also, despite implications to the contrary, X3J3 _did_ look at the possibility of directly putting recursive types into the language. They just didn't work well relative to Fortran assignment and comparison semantics. (Among the problem areas were mutually recursive types, and the creation of circularly linked lists and other organizations which have no "leaf nodes".)

I guess that's all for now. Flame if you will this old gray head; I'm leaving for the WG5 and X3J3 meetings in a few hours, and your postings will probably expire before I get back.

Kurt W. Hirchert     hirchert@ncsa.uiuc.edu
National Center for Supercomputing Applications
bill@ssd.harris.com (Bill Leonard) (07/10/89)
> In FORTRAN 77, a PARAMETER statement binds a name directly to a value; an
> ordinary variable declaration binds a name to a memory location that can be
> bound to a value (1 level indirect), and the appearance of a name in a dummy
> argument list has the effect in most implementations of binding the name to a
> memory location that is bound to a memory location that can be bound to a value
> (2 levels indirect). The symbolic constant, the variable, and the dummy
> may all be of the same type and are used in exactly the same way when a value
> is needed in an expression. Similarly, the left hand side of an assignment
> statement uses the same syntax for an ordinary variable and a dummy variable
> in getting to an object that is 1 level indirect to rebind to a new value.
>
> The Fortran 8x approach to pointers is a logical extension of this situation.
> Ordinary objects with the POINTER attribute are 2 levels indirect and dummy
> arguments with the POINTER attribute are 3 levels indirect. Syntactic
> uniformity is retained in the existing contexts. In the new context, the
> pointer assignment statement, the right hand side is uniformly interpreted at
> 1 level indirect and the left hand side at 2 levels indirect.
>
> In other words, one of the reasons Fortran 8x pointers look the way they do is
> because the committee chose to be consistent with FORTRAN rather than other
> languages that already had pointers. It also offers a number of other
> benefits, like keeping simple the notation used to deal with the actual data
> and flagging the manipulation of the pointers (the usual source of problems
> in pointer algorithms gone wrong), but I consider these benefits secondary to
> the consistency issue.

I find this whole argument specious. First of all, the discussion of dummy arguments is based on how (most) processors implement them -- it isn't something mandated by the standard.
Secondly, "being consistent with FORTRAN" makes no sense; you might as well argue that structures should be just like COMPLEX (i.e., using intrinsics to get at the parts), because that is "consistent with FORTRAN". The obvious retort is that it just isn't what people expect when they think of structures; likewise, what FORTRAN/8x has just isn't what people expect when they think of pointers.

> Thus, any algorithm expressible using "conventional" pointers should also
> be expressible in the Fortran 8x pointer facility.

And any input/output algorithm expressible in FORTRAN is also expressible using only calls to a low-level "get character" routine -- so why have I/O? You know why -- to improve efficiency and readability. Many people seem to have missed the point of the examples using one pointer versus two for linked-list manipulation: it is much more efficient! For a programmer experienced in manipulating linked lists, being required to use two pointers is similar to asking a numerical analyst to use back-substitution to solve a system of linear equations.

> Pointers would cover this latter functionality if there were
> a stand-alone notation for skew sections. I intend to lobby for such a
> notation. Because of its limitations, there were some cases where IDENTIFY
> could be more fully optimized. I believe most, if not all, of these cases
> would be covered by the optimization declarations I mentioned above.

God help us. If you think Algol 60 is a great language, I'm sure you'll love this. For those of us who still have nightmares of implementing "call by name", this is even worse: arbitrary functions to be executed upon referencing a pointer!

--
Bill Leonard
Harris Computer Systems Division
2101 W. Cypress Creek Road
Fort Lauderdale, FL 33309
bill@ssd.harris.com or hcx1!bill@uunet.uu.net