[comp.lang.fortran] Summary of Additions to Fortran 8x

rowan@convexc.UUCP (02/02/88)

        Summary of Major Additions Proposed in FORTRAN 8x

The summary that follows is the Overview section of the Foreword to the
FORTRAN 8x draft:

Array Operations:

Computation involving large arrays is an important part of engineering and 
scientific computing.  Arrays may be used as entities in FORTRAN 8x.
Operations for processing whole arrays and subarrays (array sections) are 
included in the language for two principal reasons:  (1) these features 
provide a more concise and higher level language that will allow programmers 
to develop and maintain scientific/engineering applications more quickly and
reliably, and (2) these features can significantly facilitate 
optimization of array operations on many computer architectures.

The FORTRAN 77 arithmetic, logical, and character operations and intrinsic 
functions are extended to operate on array-valued operands.  These include
whole, partial, and masked array assignment, array-valued constants and 
expressions, and facilities to define user-supplied array-valued functions.
New intrinsic functions are provided to manipulate and construct arrays,
to perform gather/scatter operations, and to support extended computational 
capabilities involving arrays. 

  
Numerical Computation:

Scientific computation is one of the principal application domains of FORTRAN,
and a guiding objective for all of the technical work is to strengthen FORTRAN 
as a vehicle for implementing scientific software.  Though nonnumeric 
computations are increasing dramatically in scientific applications, numeric
computation remains dominant.  Accordingly, the additions include portable
control over numeric precision specification, inquiry as to the characteristics
of numeric information representation, and improved control of the 
performance of numerical programs (for example, improved argument range
reduction and scaling).


Derived Data Types:

"Derived data type" is the term given to that set of features in this 
standard that allows the programmer to define arbitrary data structures and 
operations on them.  Data structures are user-defined aggregations of 
intrinsic and derived data type.  Intrinsic operations on structured objects
include assignment, input/output, and use as procedure arguments.  With no    
additional derived-type operations defined by the user, the derived
data type facility is a simple data structuring mechanism.  With additional
operation definitions, derived data types provide an effective 
implementation mechanism for data abstractions.

Procedure definitions may be used to define operations on intrinsic or 
derived data types and nonintrinsic assignments for intrinsic and derived
types.  These procedures are essentially the same as external procedures,
except that they also can be used to define infix operators.


Modular Definitions:

In FORTRAN 77, there is no way to define a global data area in only one
place and have all the program units in an application use that definition.
In addition, the ENTRY statement is awkward and restrictive for 
implementing a related set of procedures possibly involving common data
objects.  Finally, there is no means in FORTRAN 77 by which procedure
definitions, especially interface information, may be made known locally to 
a program unit.  These and other deficiencies are remedied by a new type
of program unit that may contain any combination of data object declarations,
derived data type definitions, procedure definitions, and procedure 
interface information.  This program unit, called a MODULE, may be considered
as a generalization and replacement for the block data program unit.  A
module may be accessed by any program unit, thereby making the module
contents available to the program unit.  Thus, modules provide improved
facilities for defining global data areas, procedure packages, and  
encapsulated data abstractions.


Language Evolution:

With the addition of new facilities, certain old features become redundant
and may eventually be phased out of the language as use declines.  For  
example, the numeric facilities alluded to above provide the functionality
of double precision; with the new array facilities, nonconformable 
argument association (such as associating an array element with a dummy
array) is unnecessary (and in fact is not useful as an array operation); 
and BLOCK DATA program units are redundant and inferior to modules.

As part of the evolution of the language, categories of language features
(deleted, obsolescent, and deprecated) are provided which allow unused
features of the language to be removed from future standards. 

  
***  End of summary from the draft proposed standard  ***


The following comments are my opinions from the first draft of my letter
to X3.


Array Operations:

Array operations allow the user to write simpler code in some cases.  
There are also cases where the code may be more complicated to write
or may not be possible to write using the array operation notation.

Examples:

  A = B * C        ! is what people think of as array operations

  DO I = 1,100,2
  DO J = 1,100,2
     A(I,J) = B(I,J) * C(I,J)
     D(I,J) = E(I,J) * A(I,J)
  ENDDO
  ENDDO

    becomes:

  A(1:100:2,1:100:2) = B(1:100:2,1:100:2) * C(1:100:2,1:100:2)
  D(1:100:2,1:100:2) = E(1:100:2,1:100:2) * A(1:100:2,1:100:2)

    The DO loop contains 72 characters and the array notation contains
     112 characters.  If you used identified arrays to define dynamic
     aliases for the 5 arrays above, then the statements would be 
     simplified, but you'd have 5 IDENTIFY statements and 5 new names
     to add to the program. 
  
  
  DO I = 2,100,2
      A(I-1) = B(I) + C(I/2) * D(102-I)
  ENDDO
  
     becomes:

  A(1:100:2) = B(2:100:2) + C( :50) * D(100:2:-2)  

     This is easier to read and understand in the DO loop form.  The 
      array form saves only 2 characters.


  DO I = 2,100
     A(I) = A(I-1) + B(I)
  ENDDO

     Cannot be represented in array notation.  

  A(2:100) = A(1:99) + B(2:100)  ! answer is different to the DO loop 
    
     Thus, not all array constructs can be represented in array notation.
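
     The difference can be checked with a small program.  This is only a
     sketch, written in the Fortran 90 syntax that was later standardized
     rather than the 8x draft syntax; the program name, the array C, and
     the size N = 5 are made up for illustration.

```fortran
PROGRAM RECUR
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5      ! small size, chosen for illustration
  INTEGER :: A(N), C(N), B(N), I

  B = 1
  A = 0
  C = 0
  A(1) = 1
  C(1) = 1

  ! DO loop form: each iteration sees the A(I-1) stored by the previous
  ! iteration, so A becomes a running sum: 1 2 3 4 5
  DO I = 2, N
     A(I) = A(I-1) + B(I)
  END DO

  ! Array form: the whole right-hand side is evaluated from the old
  ! values of C before any element is stored, so C becomes: 1 2 1 1 1
  C(2:N) = C(1:N-1) + B(2:N)

  PRINT *, 'DO loop:   ', A
  PRINT *, 'Array form:', C
END PROGRAM RECUR
```

     The array assignment is defined as if the right-hand side were fully
     evaluated first, which is why the recurrence carried by the loop is lost.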

  Other issues include:

   1.  Array sections are not allowed as subscripts.
   2.  Implied array temporaries may increase execution time.
   3.  Dope vectors will be required with function invocations involving 
         array arguments increasing call overhead. 


Numerical Computation:

   REAL (PRECISION=10, RANGE=50) TEMPERATURE
   D = DIGITS (TEMPERATURE)
   E = EFFECTIVE_PRECISION (TEMPERATURE)

  Allows the program to determine attributes about the underlying
  implementation and allows support for more than 2 underlying REAL
  data types.   Aids in moving programs to machines with different 
  word size.  

  Issues include:

  1.  May greatly increase the number of specific intrinsic functions
  2.  "If no method exists that satisfies the specified precision and
       exponent range, the results are processor dependent."  Ada 
       compilers are required to flag such situations as an error.
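
  For comparison, a sketch of the same request in the Fortran 90 syntax
  that was later standardized (not the draft syntax shown above).  It
  assumes the SELECTED_REAL_KIND, PRECISION, RANGE, and DIGITS intrinsics
  of Fortran 90; the names KINDS and RK are made up for illustration.

```fortran
PROGRAM KINDS
  IMPLICIT NONE
  ! Ask for at least 10 decimal digits and a decimal exponent range
  ! of at least 50.  RK is a hypothetical name for the kind value.
  INTEGER, PARAMETER :: RK = SELECTED_REAL_KIND(10, 50)
  REAL(KIND=RK) :: TEMPERATURE

  TEMPERATURE = 0

  ! Inquiry functions report what the processor actually provides.
  PRINT *, 'kind value:     ', RK
  PRINT *, 'decimal digits: ', PRECISION(TEMPERATURE)
  PRINT *, 'exponent range: ', RANGE(TEMPERATURE)
  PRINT *, 'radix digits:   ', DIGITS(TEMPERATURE)
END PROGRAM KINDS
```

  In that later form, SELECTED_REAL_KIND returns a negative value when no
  representation meets the request, so the declaration using it is rejected
  at compile time rather than left processor dependent.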


Derived Data Types:

   TYPE PERSON
      INTEGER AGE
      CHARACTER (LEN=50) NAME
      REAL, ARRAY (2,2) :: RATES
   END TYPE PERSON
   TYPE PERSON CHAIRMAN 

   CHAIRMAN%AGE = 50


   Provides a record data structure for FORTRAN (which is different from
   the record data structures currently available in VMS FORTRAN.)

   Issues:

   1.  Overloaded operators can make programs harder to read and understand.
   2.  Generic user functions can cause an explosion in the object code
	 if the function has many arguments. 
   3.  Compiler cannot optimize operations on derived data types in the 
	 same fashion as with intrinsic data types.  For example, if the
	 BIT data type is defined in the language, the compiler can generate
	 optimized code to deal with this data type.  You can do a BIT
	 data type using derived data types, but the compiler will not 
         have the same amount of information available for optimization
	 since the derived data type is a generalized function.
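
   As an illustration of issue 1, here is a sketch of a user-defined infix
   operator on the PERSON type.  It is written in the Fortran 90 syntax that
   was later standardized, which differs from the draft syntax above; the
   names PERSON_MOD, OLDER, .OLDER., DEMO, and SECRETARY are hypothetical.

```fortran
MODULE PERSON_MOD
  IMPLICIT NONE

  TYPE PERSON
     INTEGER :: AGE
     CHARACTER(LEN=50) :: NAME
     REAL, DIMENSION(2,2) :: RATES
  END TYPE PERSON

  ! Bind the function OLDER to a user-defined infix operator.
  INTERFACE OPERATOR(.OLDER.)
     MODULE PROCEDURE OLDER
  END INTERFACE

CONTAINS

  LOGICAL FUNCTION OLDER(P, Q)
    TYPE(PERSON), INTENT(IN) :: P, Q
    OLDER = P%AGE > Q%AGE
  END FUNCTION OLDER

END MODULE PERSON_MOD

PROGRAM DEMO
  USE PERSON_MOD
  IMPLICIT NONE
  TYPE(PERSON) :: CHAIRMAN, SECRETARY

  CHAIRMAN%AGE = 50
  CHAIRMAN%NAME = 'A'
  CHAIRMAN%RATES = 0.0

  SECRETARY = CHAIRMAN          ! intrinsic assignment of a whole structure
  SECRETARY%AGE = 40

  ! The reader must know that .OLDER. resolves to the function OLDER.
  PRINT *, CHAIRMAN .OLDER. SECRETARY
END PROGRAM DEMO
```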


 Modular Definitions:

   MODULE POOL1
      INTEGER X(1000)
      REAL Y(100,100)
   END MODULE


   USE POOL1


     Allows for better modularization of programs.  Interface errors between
     modules will be caught at compile/link time.

   Issues:

    1.  The dependent compilation model has not been tested in the FORTRAN
	 arena and is not like the Ada dependent compilation.
    2.  Increases compilation complexity and requires changes in other
	 areas of the system software such as the linkers and loaders.
    3.  Will cause compilers to be slower.  Some argue that faster machines
	 will overcome this, but if I have a FORTRAN 77 and a FORTRAN 8x
	 compiler on the same machine, the FORTRAN 8x compiler will have to,
	 by definition of the standard, do more work to compile the same   
	 program.  
    4.  The INCLUDE statement, which performs the function of allowing a 
	 single definition of common code, is NOT a part of this proposed
	 standard.  


Everyone has the right to form their own opinion about the proposed FORTRAN
8x standard.  Remember that if you are going to give your opinion in 
writing to the committee, letters must be received in Washington DC
by February 23, 1988.  The address is:

          X3 Secretariat
	  ATTN:  Gwendy Phillips
	  Computer and Business Equipment Manufacturers Association
	  Suite 500
	  311 First Street, NW
	  Washington, DC   20001-2178




Steve Rowan
Convex Computer Corp.
{allegra,ihnp4,uiucdcs}!convex!rowan
(214)952-0332

hirchert@uxe.cso.uiuc.edu (02/05/88)

Steve Rowan (rowan@convexc) provides a summary of the major new features of
Fortran 8x and then comments on them.  I won't respond to all of his comments
(I even agree with some of them!), but I will respond to his comments in the
area that is most interesting to me:
> Modular Definitions:
>
>   MODULE POOL1
>      INTEGER X(1000)
>      REAL Y(100,100)
>   END MODULE
>
>
>   USE POOL1
>
>
>     Allows for better modularization of programs.  Interface errors between
>     modules will be caught at compile/link time.
>
>   Issues:
>
>    1.  The dependent compilation model has not been tested in the FORTRAN
>	 arena and is not like the Ada dependent compilation.
No, it's more like the dependent compilation model I have seen in many
assemblers - write out a binary symbol table and retrieve it later.
>    2.  Increases compilation complexity and requires changes in other
>	 areas of the system software such as the linkers and loaders.
The compiler needs to be able to
a. write out the symbol table of a MODULE (presumably in binary)
b. use the Fortran name of the MODULE to determine where to write the symbol
   table (and thus where to retrieve it later)
c. read the binary symbol table written earlier
d. merge the symbol table read with the active symbol table in the current
   program unit (non-trivial because of the rename options, but not terribly
   difficult, either)
With those added capabilities, the rest of handling MODULE/USE is essentially
identical to the handling of entities declared either locally or in COMMON.
This strikes me as only a marginal increase in compilation complexity,
especially when compared with compilation tasks such as data flow optimization
or vectorization.  There is no reason for the linkers or loaders to have any
special knowledge of MODULEs.  (Many linkers and loaders will have to be modified
to handle the switch from 6 character names to 31 character names, and there
are useful things that these processors _could_ do with special knowledge of
MODULEs, but such knowledge is not required!)
>    3.  Will cause compilers to be slower.  Some argue that faster machines
>	 will overcome this, but if I have a FORTRAN 77 and a FORTRAN 8x
>	 compiler on the same machine, the FORTRAN 8x compiler will have to,
>	 by definition of the standard, do more work to compile the same   
>	 program.  
I would expect that reading useful information in binary would be _faster_
than reinterpreting the text that defines it.  Why do you expect it to be
slower?  (I would agree that there are features in Fortran 8x that might make
it slower to compile than FORTRAN 77; this just isn't one of them.)
>    4.  The INCLUDE statement, which performs the function of allowing a 
>	 single definition of common code, is NOT a part of this proposed
>	 standard.  
True.  USE is expected to be faster (see above), more flexible (USE of selected
symbols and symbol renaming), more portable (Fortran name used rather than
processor-dependent file name), and more reliable (the meaning of the symbols
imported can't be changed by other declarations in the importing program unit)
than INCLUDE.
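
A sketch of selective USE with renaming, applied to the POOL1 example.  It is
written in the Fortran 90 syntax that was later standardized, which may differ
from the draft's rename options; the names MAIN and GRID are made up for
illustration.

```fortran
MODULE POOL1
  IMPLICIT NONE
  INTEGER :: X(1000)
  REAL :: Y(100,100)
END MODULE POOL1

PROGRAM MAIN
  ! Import only Y, under the local name GRID.  X is not visible here,
  ! so a local X could be declared without conflict.
  USE POOL1, ONLY: GRID => Y
  IMPLICIT NONE

  GRID = 0.0
  GRID(1,1) = 2.5
  PRINT *, SUM(GRID)
END PROGRAM MAIN
```

The importing program unit names the module, not a file, and cannot
redeclare GRID, which is the portability and reliability point made above.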
>
>Everyone has the right to form their own opinion about the proposed FORTRAN
>8x standard.  Remember that if you are going to give your opinion in 
>writing to the committee, letters must be received in Washington DC
>by February 23, 1988.  The address is:
>
>          X3 Secretariat
>	  ATTN:  Gwendy Phillips
>	  Computer and Business Equipment Manufacturers Association
>	  Suite 500
>	  311 First Street, NW
>	  Washington, DC   20001-2178
Amen.
>
>
>
>
>Steve Rowan
>Convex Computer Corp.
>{allegra,ihnp4,uiucdcs}!convex!rowan
>(214)952-0332
Kurt W. Hirchert          National Center for Supercomputing Applications