rowan@convexc.UUCP (02/02/88)
Summary of Major Additions Proposed in FORTRAN 8x

The summary that follows is the Overview section of the Foreword to the
FORTRAN 8x draft:

Array Operations:

Computation involving large arrays is an important part of engineering
and scientific computing. Arrays may be used as entities in FORTRAN 8x.
Operations for processing whole arrays and subarrays (array sections)
are included in the language for two principal reasons: (1) these
features provide a more concise and higher level language that will
allow programmers more quickly and reliably to develop and maintain
scientific/engineering applications, and (2) these features can
significantly facilitate optimization of array operations on many
computer architectures.

The FORTRAN 77 arithmetic, logical, and character operations and
intrinsic functions are extended to operate on array-valued operands.
These include whole, partial, and masked array assignment, array-valued
constants and expressions, and facilities to define user-supplied
array-valued functions. New intrinsic functions are provided to
manipulate and construct arrays, to perform gather/scatter operations,
and to support extended computational capabilities involving arrays.

Numerical Computation:

Scientific computation is one of the principal application domains of
FORTRAN, and a guiding objective for all of the technical work is to
strengthen FORTRAN as a vehicle for implementing scientific software.
Though nonnumeric computations are increasing dramatically in scientific
applications, numeric computation remains dominant. Accordingly, the
additions include portable control over numeric precision specification,
inquiry as to the characteristics of numeric information representation,
and improved control of the performance of numerical programs (for
example, improved argument range reduction and scaling).
Derived Data Types:

"Derived data type" is the term given to that set of features in this
standard that allows the programmer to define arbitrary data structures
and operations on them. Data structures are user-defined aggregations of
intrinsic and derived data types. Intrinsic operations on structured
objects include assignment, input/output, and use as procedure
arguments. With no additional derived-type operations defined by the
user, the derived data type facility is a simple data structuring
mechanism. With additional operation definitions, derived data types
provide an effective implementation mechanism for data abstractions.

Procedure definitions may be used to define operations on intrinsic or
derived data types and nonintrinsic assignments for intrinsic and
derived types. These procedures are essentially the same as external
procedures, except that they also can be used to define infix operators.

Modular Definitions:

In FORTRAN 77, there is no way to define a global data area in only one
place and have all the program units in an application use that
definition. In addition, the ENTRY statement is awkward and restrictive
for implementing a related set of procedures possibly involving common
data objects. Finally, there is no means in FORTRAN 77 by which
procedure definitions, especially interface information, may be made
known locally to a program unit. These and other deficiencies are
remedied by a new type of program unit that may contain any combination
of data object declarations, derived data type definitions, procedure
definitions, and procedure interface information. This program unit,
called a MODULE, may be considered as a generalization and replacement
for the block data program unit. A module may be accessed by any program
unit, thereby making the module contents available to the program unit.
Thus, modules provide improved facilities for defining global data
areas, procedure packages, and encapsulated data abstractions.
Language Evolution:

With the addition of new facilities, certain old features become
redundant and may eventually be phased out of the language as use
declines. For example, the numeric facilities alluded to above provide
the functionality of double precision; with the new array facilities,
nonconformable argument association (such as associating an array
element with a dummy array) is unnecessary (and in fact is not useful as
an array operation); and BLOCK DATA program units are redundant and
inferior to modules.

As part of the evolution of the language, categories of language
features (deleted, obsolescent, and deprecated) are provided which allow
unused features of the language to be removed from future standards.

*** End of summary from the draft proposed standard ***

The following comments are my opinions from the first draft of my letter
to X3.

Array Operations:

Array operations allow the user to write simpler code in some cases.
There are also cases where the code may be more complicated to write, or
may not be possible to write, using the array operation notation.

Examples:

      A = B * C     ! is what people think of as array operations

      DO I = 1,100,2
         DO J = 1,100,2
            A(I,J) = B(I,J) * C(I,J)
            D(I,J) = E(I,J) * A(I,J)
         ENDDO
      ENDDO

becomes:

      A(1:100:2,1:100:2) = B(1:100:2,1:100:2) * C(1:100:2,1:100:2)
      D(1:100:2,1:100:2) = E(1:100:2,1:100:2) * A(1:100:2,1:100:2)

The DO loop contains 72 characters and the array notation contains 112
characters. If you used identified arrays to define dynamic aliases for
the 5 arrays above, then the statements would be simplified, but you'd
have 5 IDENTIFY statements and 5 new names to add to the program.

      DO I = 2,100,2
         A(I-1) = B(I) + C(I/2) * D(102-I)
      ENDDO

becomes:

      A(1:100:2) = B(2:100:2) + C( :50) * D(100:2:-2)

This is easier to read and understand in the DO loop form. The array
notation saves 2 characters.

      DO I = 2,100
         A(I) = A(I-1) + B(I)
      ENDDO

cannot be represented in array notation:

      A(2:100) = A(1:99) + B(2:100)   ! answer is different from the DO loop

The loop is a recurrence: each A(I) uses the A(I-1) computed on the
previous iteration, whereas the array assignment evaluates the whole
right-hand side from the old values of A before storing anything. Thus,
not all array computations can be represented in array notation.

Other issues include:

   1.  Array sections are not allowed as subscripts.
   2.  Implied array temporaries may increase execution time.
   3.  Dope vectors will be required with function invocations involving
       array arguments, increasing call overhead.

Numerical Computation:

      REAL (PRECISION=10, RANGE=50) TEMPERATURE
      D = DIGITS (TEMPERATURE)
      E = EFFECTIVE_PRECISION (TEMPERATURE)

This allows the program to determine attributes of the underlying
implementation and allows support for more than 2 underlying REAL data
types. It aids in moving programs to machines with a different word
size.

Issues include:

   1.  May greatly increase the number of specific intrinsic functions.
   2.  "If no method exists that satisfies the specified precision and
       exponent range, the results are processor dependent." Ada
       compilers are required to flag such situations as an error.

Derived Data Types:

      TYPE PERSON
         INTEGER AGE
         CHARACTER (LEN=50) NAME
         REAL, ARRAY (2,2) :: RATES
      END TYPE PERSON

      TYPE PERSON CHAIRMAN
      CHAIRMAN%AGE = 50

Provides a record data structure for FORTRAN (which is different from
the record data structures currently available in VMS FORTRAN).

Issues:

   1.  Overloaded operators can make programs harder to read and
       understand.
   2.  Generic user functions can cause an explosion in the object code
       if the function has many arguments.
   3.  The compiler cannot optimize operations on derived data types in
       the same fashion as with intrinsic data types. For example, if
       the BIT data type is defined in the language, the compiler can
       generate optimized code to deal with this data type. You can do a
       BIT data type using derived data types, but the compiler will not
       have the same amount of information available for optimization,
       since the derived data type is a generalized function.

Modular Definitions:

      MODULE POOL1
         INTEGER X(1000)
         REAL Y(100,100)
      END MODULE

      USE POOL1

Allows for better modularization of programs.
Interface errors between modules will be caught at compile/link time.

Issues:

   1.  The dependent compilation model has not been tested in the
       FORTRAN arena and is not like the Ada dependent compilation.
   2.  Increases compilation complexity and requires changes in other
       areas of the system software such as the linkers and loaders.
   3.  Will cause compilers to be slower. Some argue that faster
       machines will overcome this, but if I have a FORTRAN 77 and a
       FORTRAN 8x compiler on the same machine, the FORTRAN 8x compiler
       will have to, by definition of the standard, do more work to
       compile the same program.
   4.  The INCLUDE statement, which performs the function of allowing a
       single definition of common code, is NOT a part of this proposed
       standard.

Everyone has the right to form their own opinion about the proposed
FORTRAN 8x standard. Remember that if you are going to give your opinion
in writing to the committee, letters must be received in Washington, DC
by February 23, 1988. The address is:

      X3 Secretariat
      ATTN: Gwendy Phillips
      Computer and Business Equipment Manufacturers Association
      Suite 500
      311 First Street, NW
      Washington, DC 20001-2178

Steve Rowan
Convex Computer Corp.
{allegra,ihnp4,uiucdcs}!convex!rowan
(214)952-0332
hirchert@uxe.cso.uiuc.edu (02/05/88)
Steve Rowan (rowan@convexc) provides a summary of the major new features
of Fortran 8x and then comments on them. I won't respond to all of his
comments (I even agree with some of them!), but I will respond to his
comments in the area that is most interesting to me:

> Modular Definitions:
>
>       MODULE POOL1
>          INTEGER X(1000)
>          REAL Y(100,100)
>       END MODULE
>
>       USE POOL1
>
> Allows for better modularization of programs. Interface errors between
> modules will be caught at compile/link time.
>
> Issues:
>
> 1. The dependent compilation model has not been tested in the FORTRAN
>    arena and is not like the Ada dependent compilation.

No, it's more like the dependent compilation model I have seen in many
assemblers: write out a binary symbol table and retrieve it later.

> 2. Increases compilation complexity and requires changes in other
>    areas of the system software such as the linkers and loaders.

The compiler needs to be able to
   a. write out the symbol table of a MODULE (presumably in binary)
   b. use the Fortran name of the MODULE to determine where to write the
      symbol table (and thus where to retrieve it later)
   c. read the binary symbol table written earlier
   d. merge the symbol table read with the active symbol table in the
      current program unit (non-trivial because of the rename options,
      but not terribly difficult, either)
With those added capabilities, the rest of handling MODULE/USE is
essentially identical to the handling of entities declared either
locally or in COMMON. This strikes me as only a marginal increase in
compilation complexity, especially when compared with compilation tasks
such as data flow optimization or vectorization.

There is no reason for the linkers or loaders to have any special
knowledge of MODULEs. (Many linkers and loaders will have to be modified
to handle the switch from 6 character names to 31 character names, and
there are useful things that these processors _could_ do with special
knowledge of MODULEs, but such knowledge is not required!)
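To make steps a through d above a little more concrete, here is a rough
sketch in Python rather than Fortran. It is only an illustration under
stated assumptions, not how any actual compiler does it: the function
names (module_path, write_module, use_module), the ".symtab" file suffix,
the use of pickle as the "binary" format, and the dictionary layout of a
symbol table are all hypothetical. Renames are given here as a mapping
from the module's name for an entity to the local name.

```python
import os
import pickle
import tempfile

# All names below are hypothetical; nothing here comes from the draft
# standard or from a real compiler.

def module_path(directory, module_name):
    """Step b: derive the file location from the Fortran module name.
    Fortran names are case-insensitive, so the name is folded to lower case."""
    return os.path.join(directory, module_name.lower() + ".symtab")

def write_module(directory, module_name, symbols):
    """Step a: write the module's symbol table out in binary form."""
    with open(module_path(directory, module_name), "wb") as f:
        pickle.dump(symbols, f)

def use_module(directory, module_name, active, renames=None):
    """Steps c and d: read the saved table and merge it into the active
    symbol table of the program unit being compiled."""
    with open(module_path(directory, module_name), "rb") as f:
        imported = pickle.load(f)
    renames = renames or {}
    for name, info in imported.items():
        local = renames.get(name, name)   # apply a rename if one was given
        if local in active:
            raise ValueError("name clash on " + local)
        active[local] = info
    return active

with tempfile.TemporaryDirectory() as d:
    # "Compile" MODULE POOL1 from the example quoted above.
    write_module(d, "POOL1", {"X": ("INTEGER", (1000,)),
                              "Y": ("REAL", (100, 100))})
    # "Compile" a program unit that already declares I, then does
    # USE POOL1 with X renamed locally to COUNTS.
    table = use_module(d, "POOL1", {"I": ("INTEGER", ())},
                       renames={"X": "COUNTS"})
```

After the merge, the imported entities sit in the same table as the
locally declared ones, which is the sense in which the rest of the
handling can proceed as for local or COMMON entities. The sketch treats
any name clash as an error, which is a simplifying assumption.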
> 3. Will cause compilers to be slower. Some argue that faster machines
>    will overcome this, but if I have a FORTRAN 77 and a FORTRAN 8x
>    compiler on the same machine, the FORTRAN 8x compiler will have to,
>    by definition of the standard, do more work to compile the same
>    program.

I would expect that reading useful information in binary would be
_faster_ than reinterpreting the text that defines it. Why do you expect
it to be slower? (I would agree that there are features in Fortran 8x
that might make it slower to compile than FORTRAN 77; this just isn't
one of them.)

> 4. The INCLUDE statement, which performs the function of allowing a
>    single definition of common code, is NOT a part of this proposed
>    standard.

True. USE is expected to be faster (see above), more flexible (USE of
selected symbols and symbol renaming), more portable (Fortran name used
rather than processor-dependent file name), and more reliable (the
meaning of the symbols imported can't be changed by other declarations
in the importing program unit) than INCLUDE.

>Everyone has the right to form their own opinion about the proposed FORTRAN
>8x standard. Remember that if you are going to give your opinion in
>writing to the committee, that letters must be received in Washington DC
>by February 23, 1988. The address is:
>
>      X3 Secretariat
>      ATTN: Gwendy Phillips
>      Computer and Business Equipment Manufacturers Association
>      Suite 500
>      311 First Street, NW
>      Washington, DC 20001-2178

Amen.

>Steve Rowan
>Convex Computer Corp.
>{allegra,ihnp4,uiucdcs}!convex!rowan
>(214)952-0332

Kurt W. Hirchert
National Center for Supercomputing Applications