[comp.lang.fortran] Fortran 77 Style Guide

levine@ics.uci.edu (David Levine) (12/03/89)

Point-by-point justifications are omitted from the guide.  Unless
discussion of such details would be of general interest, please
e-mail questions to me.

David L. Levine, Dept. of ICS             Internet: levine@ics.uci.edu
University of California, Irvine                BITNET: levine@ucivmsa
Irvine, CA 92717                            UUCP: ucbvax!ucivax!levine


  O
   \
o---\----
     \
----------------------------------------------------------------
                         Fortran 77 Coding Guidelines

David L. Levine                                                    1 Dec 1989
levine@ics.uci.edu                                               714-640-8662


I. Introduction

    The following guidelines are designed to encourage consistent coding
across projects and programmers.  Many arbitrary low-level decisions are made
during coding.  While many of these have no effect on the machine code, they
do affect the appearance of the code.  Consistent practices also enhance
productivity and reusability.  Project requirements, when applicable, take
precedence.

    The goals of the guidelines are, in decreasing order of importance:

    1) understandability -- conveys the purpose of computations to the reader
    2) transportability -- between compilers on assorted modern operating
       systems
    3) maintainability -- can be readily enhanced
    4) efficiency -- execution speed


II. General
    1) Adhere to strict FORTRAN 77 as closely as possible, with the following
       exceptions.
       a)  In addition to the standard character set (where the notation [0-9]
           indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
                space $ ' ( ) * +  , - . / : =  [0-9] [A-Z]
           the following characters may be used:
                ! # & [a-z]
       b)  Identifier names may be up to 31 characters long (the first 6
           remain significant).
       c)  INCLUDE statements may be used.
       d)  INTEGER*4 may be used when INTEGER defaults to 2 bytes.
    2) Avoid compiler directives in code, especially if an equivalent
       compiler option is available.  If a compiler directive must be used,
       comment its effect and target machine, operating system, and compiler,
       with revision numbers.
    3) Use a consistent set of compiler options, or maintain a makefile or
       command procedure to enable reconstruction.  Suggested compiler
       options are listed in Appendix II.


III. Project Organization
    1) Use a standardized comment header at the top of every subprogram (see
       example in Appendix III).
    2) Group related subprograms into a module.  The module is assigned a
       unique name and is typically stored in one or more files in a single
       subdirectory on the file system.  If used, a single object library
       contains the code for all the module components.
    3) The first two letters of the routine and file name correspond to the
       module and the remaining four uniquely identify the routine. 
       Additional characters may be used to create a more sensible name.


IV. Program Units
    1) Begin a main program with a PROGRAM statement.
    2) Order arguments as follows:  inputs, outputs, control, external names.
    3) FUNCTIONs must not have side effects.
    4) Reference external functions in an EXTERNAL statement.
    5) Do not use alternate returns.  Every subprogram should have single
       entry and exit points.


V. Statement format
    1) Use the standard source format (columns 1-5 for label, 6 for
       continuation, and 7-72 for statement).
    2) Indicate a non-blank comment with an * in column 1.
    3) Indicate a continuation with an & in column 6.
    4) Do not split a name across lines.
    5) Do not write more than one statement per line.


VI. Statement labels
    1) Assign a label only if necessary.
    2) Assign labels in ascending order.
    3) Assign a separate sequence of labels to FORMAT labels that are grouped
       at the end of a subprogram.
    4) Right-adjust labels.
    5) Do not use the label field of a continuation line.


VII. Capitalization
    1) Set keywords in all caps.
    2) Set symbolic names of constants (parameters) in all lower case.
    3) Set all other identifiers in initial caps with embedded words initial
       capped.


VIII. Spacing
    1) Do not use tabs.
    2) Write keywords without embedded spaces, but surround keywords with
       spaces.
    3) Do not put space between an array name and its index, Array(I).
    4) Put one space between a subprogram name and its argument list, e.g.,
       Subr (Arg1, Arg2).
    5) Except in argument lists, put a space after an open parenthesis and
       before a close parenthesis.
    6) Put one space between arguments.
    7) Use spacing in equations to reveal operators and reinforce precedence.
    8) Use indentation to reinforce control flow; each indent is 3 columns.
    9) Use whitespace to enhance readability.


IX. Identifier Selection
    1) Start with letter [A-Z], follow with only letters or digits.
    2) Limit to 31 characters; distinguish with first 6.
    3) Choose an identifier to represent the entity being modeled.
    4) Explain the significance of each variable and array in comments.
    5) Do not use a keyword as an identifier.
    6) Do not use a subprogram name as a COMMON block name.
    7) Do not abbreviate .TRUE. or .FALSE.
    8) Delimit character strings with apostrophes.


X. Constants
    1) Use PARAMETERs to symbolically name all compile-time constants.
    2) Use only constant expressions to define PARAMETERs.


XI. Typing
    1) Use the following data types:
           CHARACTER[*n]
           COMPLEX
           DOUBLE PRECISION
           INTEGER*4
           INTEGER
           LOGICAL
           REAL
    2) Use DOUBLE PRECISION instead of REAL*8.
    3) Use INTEGER for integers that are always in the range of two-byte
       integers [-32768,32767]; then use a compiler option to select two or
       four byte storage.
    4) Declare all variables.  Use a compiler option to ensure declaration. 
       If no such compiler option is available, setting the implicit type of
       all variables to a type that is not used in the program often snags
       undeclared variables.  An example is:  IMPLICIT COMPLEX*16 (A-Z).
    5) Arrange identifiers in type declarations logically based on the
       entities which they describe.  Subprogram arguments may be declared in
       order of appearance in the argument list.  If there is no other
       obvious order, arrange alphabetically.
    6) Do not compare arithmetic expressions of different types; type convert
       explicitly.


XII. Operators
    1) Do not use .EQ. and .NE. between floating point expressions.
    2) Use .GE. or .LE., as appropriate, instead of .EQ. when checking for a
       threshold crossing.
    3) Compare unequal length character strings with LGE, LGT, LLE, and LLT.
    4) Use only the logical operators .AND., .OR., .EQV., .NEQV., and .NOT.,
       and only use them on LOGICAL operands.
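
    As a minimal sketch of rules 1 and 2, the program below compares a
    floating point sum against a tolerance rather than testing it with .EQ.
    The tolerance value and all names (FpCmp, tol, Total) are assumptions
    chosen for illustration; a suitable tolerance depends on the application.

```fortran
      PROGRAM FpCmp
*     Illustrative sketch only; tol and Total are hypothetical names.
      REAL tol
      PARAMETER ( tol = 1.0E-5 )
      REAL    Total
      INTEGER I

      Total = 0.0
      DO 10 I = 1, 10
         Total = Total + 0.1
   10 CONTINUE

*     Total .EQ. 1.0 may be false because 0.1 is not exactly
*     representable; compare against a tolerance instead.
      IF ( ABS (Total - 1.0) .LE. tol ) THEN
         WRITE (*, *) 'Total is 1.0 to within tolerance'
      ELSE
         WRITE (*, *) 'Total differs from 1.0'
      ENDIF

      END
```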


XIII. Expressions
    1) Surround low precedence operators with space.
    2) Split an expression across lines after an operator.
    3) Indent continuation lines.
    4) Consider the types of operands and their effects on the values of
       subexpressions, e.g., 8 / 3 * 3.0 is 6.0, not 8.0.
    5) Be careful with "exact" values of floating point expressions, e.g.,
       assigning 30.0 / 0.1 to an integer may define it as 299!
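
    The two cautions in rules 4 and 5 can be tried with a short program such
    as the following sketch.  The names (ExDemo, Ratio, Chop, Near) are
    hypothetical, and the values printed for Chop may differ between
    machines, which is the point of rule 5.

```fortran
      PROGRAM ExDemo
*     Illustrative sketch; all names here are hypothetical.
      REAL    Ratio
      INTEGER Chop, Near

*     8 / 3 is evaluated as integer division before the conversion
*     to REAL, so the two lines below print different values.
      WRITE (*, *) 8 / 3 * 3.0
      WRITE (*, *) 8.0 / 3.0 * 3.0

*     INT truncates toward zero; NINT rounds to the nearest integer,
*     which protects against a quotient that falls just below 300.
      Ratio = 30.0 / 0.1
      Chop = INT (Ratio)
      Near = NINT (Ratio)
      WRITE (*, *) Chop, Near

      END
```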
       

XIV. Arrays
    1) Declare array dimensions in the type declaration rather than in a
       separate DIMENSION statement.
    2) Use only INTEGER subscript expressions.
    3) Preferably operate on arrays such that the first indices vary fastest
       and the last vary slowest.
    4) Specify all subscripts in any array reference.
    5) Do not exceed the bounds of declared array dimensions.
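
    Rule 3 follows from the column-major storage of Fortran arrays.  A
    minimal sketch, with hypothetical names and sizes (ArDemo, Grid, nrows,
    ncols), places the first subscript in the innermost loop:

```fortran
      PROGRAM ArDemo
*     Illustrative sketch; names and dimensions are hypothetical.
      INTEGER nrows, ncols
      PARAMETER ( nrows = 3, ncols = 4 )
      REAL    Grid(nrows,ncols)
      INTEGER Row, Col

*     The inner loop runs over the first subscript, so consecutive
*     iterations touch adjacent storage locations.
      DO 20 Col = 1, ncols
         DO 10 Row = 1, nrows
            Grid(Row,Col) = REAL (Row + 10 * Col)
   10    CONTINUE
   20 CONTINUE

      WRITE (*, *) Grid(nrows,ncols)
      END
```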


XV. Control structures
    1) Use GOTO carefully.  See Appendix I for loop constructs.  Comments at
       the target of a GOTO listing possible 'come froms' are very helpful.
    2) Terminate or begin every loop with a distinct CONTINUE.
    3) Do not jump into the middle of a loop or conditional.
    4) Use STOP only for abnormal termination, and include the reason in the
       character string message.


XVI. Arguments
    1) Match the actual arguments in the caller to the formal arguments of
       the callee in both number and type.
    2) Do not repeat an actual argument in any call.
    3) All arguments to an intrinsic function must be of the same type.
    4) Do not pass a constant as an actual argument unless it is to an IN
       formal (see Appendix III for definition).


XVII. COMMON blocks
    1) Only place data in COMMON blocks if necessary.
    2) Place COMMON block definitions in INCLUDE files.
    3) SAVE all COMMON blocks.
    4) Do not mix CHARACTER and non-character types in a COMMON block.
    5) Do not pass as an argument any variable referenced in a COMMON block
       in both the calling and called subprograms.
    6) Initialize COMMON blocks only in BLOCK DATA subprograms.
    7) Compile each BLOCK DATA subprogram with another program unit in
       which it is referred to with an EXTERNAL statement.
    8) Use EQUIVALENCE with care, and only to economize on storage; avoid
       aliasing, and where it is unavoidable, comment it well.
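
    Rules 3, 6, and 7 might be combined as in the sketch below.  All names
    (CmDemo, TeCfg, TeInit, MaxIt, Toler) and the initial values are
    hypothetical.  Following rule 2, the declarations for the COMMON block
    would normally live in an INCLUDE file; they are written out here only
    to keep the example self-contained.

```fortran
      PROGRAM CmDemo
*     Illustrative sketch; all names are hypothetical.  In a real
*     project the four lines declaring /TeCfg/ would be INCLUDEd.
      INTEGER MaxIt
      REAL    Toler
      COMMON /TeCfg/ MaxIt, Toler
      SAVE   /TeCfg/
*     Naming the BLOCK DATA unit EXTERNAL helps ensure it is linked.
      EXTERNAL TeInit

      WRITE (*, *) MaxIt, Toler
      END

      BLOCK DATA TeInit
*     The only place where /TeCfg/ is initialized.
      INTEGER MaxIt
      REAL    Toler
      COMMON /TeCfg/ MaxIt, Toler
      SAVE   /TeCfg/
      DATA MaxIt / 50 /, Toler / 1.0E-6 /
      END
```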


XVIII. Input/Output
    1) Use error recovery options END=, ERR=, and IOSTAT=, and handle all
       such conditions gracefully.
    2) Position a FORMAT statement immediately following its reference. 
       Position FORMAT statements that are used more than once at the end of
       the subprogram.
    3) Use implied DO rather than DO loops.
    4) OPEN all files with STATUS = 'UNKNOWN' unless otherwise required.
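
    One way to apply rules 1 and 4 is sketched below.  The unit number,
    the file name 'data.txt', the one-value-per-record layout, and all
    identifiers are assumptions made for illustration.  In FORTRAN 77 an
    IOSTAT value of zero means success, a positive value an error, and a
    negative value end of file.

```fortran
      PROGRAM IoDemo
*     Illustrative sketch; unit, file name, and names are assumed.
      INTEGER inunit
      PARAMETER ( inunit = 10 )
      INTEGER Ios, NumRd
      REAL    Value

      OPEN (UNIT = inunit, FILE = 'data.txt', STATUS = 'UNKNOWN',
     &      IOSTAT = Ios)
      IF ( Ios .NE. 0 ) STOP 'IoDemo: cannot open data.txt'

      NumRd = 0
*     while loop: read one value per record until IOSTAT is nonzero
   10 CONTINUE
         READ (inunit, *, IOSTAT = Ios) Value
         IF ( Ios .NE. 0 ) GOTO 20
         NumRd = NumRd + 1
         GOTO 10
   20 CONTINUE
*     come from: READ above, on end of file or error

      IF ( Ios .GT. 0 ) STOP 'IoDemo: read error on data.txt'
      CLOSE (inunit)
      WRITE (*, *) NumRd, ' values read'
      END
```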


                                 BIBLIOGRAPHY

    The contributions of the following individuals are gratefully
acknowledged.  Many laudable suggestions were unilaterally discarded solely in
the interest of keeping this guide as brief as possible.  A style guide by its
very nature is subjective.  The primary intent of this guide is to draw attention
to some of the intricacies of coding in the interest of encouraging
consistency.


Bierman, Keith, personal correspondence, <8911010025.AA06492@chiba.sun.com>
(31 Oct 1989) and <8911160932.AA05443@chiba.sun.com> (16 November 1989).

Caffin, R. N., "More on Fortran Coding Conventions," Fortran Forum 3:3 (ACM,
December 1984).

Calhoun, Myron A., personal correspondence,
<8911061456.AA28516@harris.cis.ksu.edu> (6 November 1989).

Cox, Robert W., personal correspondence,
<8910311522.AA03009@ilmarin.c3.lanl.gov> (31 Oct 1989).

Liesenfeld, Ulrich, personal correspondence,
<1955:uli@analyt.chemie.uni-bochum.dbp.de> (16 November 1989).

Metcalf, Michael, FORTRAN Optimization (New York:  Academic Press, Inc.,
1982).

Metcalf, Michael, "FORTRAN 77 Coding Conventions," ForTec Forum 2:4 (ACM,
December 1983).

Miller, Geoff, "Bureau of Transport Economics Computer Users Guide, Attachment
1 - FORTRAN Programming Standards" (Canberra:  Computer Centre, Australian
Defence Force Academy, 8 August 1986).

Montgomery, Peter, personal correspondence,
<8910311752.AA20351@sonia.math.ucla.edu> (31 Oct 1989).




                         APPENDIX I.  Loop Constructs

    1) iterative

      DO 10 I = 1, iterations
         . . .
  10  CONTINUE

       Do not modify the loop variable.


    2) while

  10  CONTINUE
         . . .
         IF ( condition ) GOTO 20
         . . .
         GOTO 10
  20  CONTINUE


    3) do-while or repeat-until (all statements in loop execute at least
       once)

  10  CONTINUE
         . . .
      IF ( condition ) GOTO 10
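
    As a concrete instance of constructs 2 and 3, the following sketch
    fills in the skeletons with a simple halving and doubling computation.
    The program name, variables, and arithmetic are hypothetical and serve
    only to show where the exit test sits in each form.

```fortran
      PROGRAM LpDemo
*     Illustrative sketch; N, Steps, and the arithmetic are
*     hypothetical.
      INTEGER N, Steps

*     while: the test comes first, so the body may run zero times
      N = 27
      Steps = 0
   10 CONTINUE
         IF ( N .LE. 1 ) GOTO 20
         N = N / 2
         Steps = Steps + 1
         GOTO 10
   20 CONTINUE
*     come from: exit test of the while loop above
      WRITE (*, *) 'halvings:', Steps

*     repeat-until: the body always runs at least once
      N = 1
   30 CONTINUE
         N = N * 2
      IF ( N .LT. 100 ) GOTO 30
      WRITE (*, *) 'first power of 2 >= 100:', N

      END
```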




                   APPENDIX II.  Suggested Compiler Options

    1) VAX FORTRAN:
       /check=(bounds,overflow)/g_float/standard/warnings=all

    2) MS fl:
       /4I2 /4Yd /Ox           (/G2 for 80286/386 instructions)

    3) Sun f77:

           (fp is floating point hardware option, e.g., 68881
            j is optimization level)


       Sun3 f77 version 1.2 and earlier:
       -ansi -ffp -u -Oj foo.f /usr/lib/fp/libm.il /usr/lib/libm.il -lm

       Sun3 f77 version 1.3:
       -ansi -fast -u -O3 foo.f -lm

       Sun4 f77 prior to version 1.2:
       -ansi -u -Oj foo.f /usr/lib/f77/libm.il -lm

       Sun4 f77 version 1.2:
       -ansi -u -Oj foo.f /usr/lib/f77/libm.il /usr/lib/f77/libm.il -lm

       (and use -dalign if DOUBLE PRECISION is used; may cause core dump if
       code is not all double word aligned)

       Sun4 f77 version 1.3:
       -ansi -fast -u -O3 foo.f -lm

    4) Convex fc:
       -ep i -Oj               (i is number of processors and j is 
                               optimization level)




                             APPENDIX III. Example
notes on example:

    1) The dividing lines end in column 72, to serve as a visual aid when
       working with editors that do not display cursor position.

    2) The 'Units' field is useful in physical applications; other parameter
       properties may be of interest in other applications.

    3) Arguments and common block entities that are referenced in the
       subprogram are listed in the prologue.  Each may be classified as IN,
       OUT, or INOUT mode, following the Ada practice as shown below.  A
       variable is defined if its value is changed, such as by assignment.  A
       variable is used if its value, on entry to the subprogram, is
       referenced.

           mode         define           use 
           ----         ------           --- 
           IN           not allowed      allowed
           OUT          allowed          not allowed
           INOUT        allowed          allowed



      SUBROUTINE TEUpCase (String)
*
* ================== Prologue ==========================================
*
* Purpose:
*    Convert a string to all upper case characters.
*
* History:
*    Version   Programmer         Date       Description
*    -------   ----------         ----       -----------
*    1.0       D. Levine          3/16/89    created
*
* IN args/commons         Units      Description
* ---------------         -----      -----------
*
* OUT args/commons        Units      Description
* ----------------        -----      -----------
*
* INOUT args/commons      Units      Description
* ------------------      -----      -----------
* String                  n/a        string to be converted
*
* Processing:
*    Convert each lower case character of the String to upper case by
* adding the difference of the base upper and lower case characters,
* 'A' and 'a', respectively.
*
* Special requirements:
*    Assumes that [a..z] is mapped onto consecutive integers, and
* [A..Z] is mapped onto consecutive integers.
*
* ------------------ Include files -------------------------------------
* ------------------ Constant declarations -----------------------------
* ------------------ Argument declarations -----------------------------

      CHARACTER*(*) String

* ------------------ Global/External declarations ----------------------
* ------------------ Local declarations --------------------------------

      CHARACTER C
      INTEGER   Upper2Lower, Pos

* ------------------ Code ----------------------------------------------

      Upper2Lower = ICHAR ('A') - ICHAR ('a')

      DO 10 Pos = 1, LEN (String)
         C = String(Pos:Pos)
         IF ( LGE (C,'a')  .AND.  LLE (C,'z') ) THEN
            String(Pos:Pos) = CHAR (ICHAR (C) + Upper2Lower)
         ENDIF
  10  CONTINUE

      RETURN
      END

levine@ics.uci.edu (David Levine) (12/03/89)

Here is the f77 style guide that I have put together.  Many thanks to
those who provided suggestions.  As mentioned in the text, I did not
use them all for various reasons.

johna@runxtsa.runx.oz.au (John Arndt) (03/17/90)

In the latter part of 1989 someone in the United States broadcast an
article titled "Fortran 77 Style Guide". I copied the article to my
word processor, cleared it and saved the empty text file - as well as
backing this void up. Would someone with a copy of the Style Guide
mind broadcasting it again?
Thanks in advance.

John Arndt.   ----------------
             
              ----------------

levine@crimee.ics.uci.edu (David Levine) (08/03/90)

--------
Here is the f77 style guide that I have put together.  Many thanks to
those who provided suggestions.  As mentioned in the text, I did not
use them all for various reasons.

Point-by-point justifications are omitted from the guide.  Unless
discussion of such details would be of general interest, please
e-mail questions to me.

David L. Levine, Dept. of ICS             Internet: levine@ics.uci.edu
University of California, Irvine                BITNET: levine@ucivmsa
Irvine, CA 92717                            UUCP: ucbvax!ucivax!levine


  O
   \
o---\----
     \
----------------------------------------------------------------
                         Fortran 77 Coding Guidelines

David L. Levine                                                   11 Dec 1989
levine@ics.uci.edu                                               714-640-8662


I. Introduction

    The following guidelines are designed to encourage consistent coding
across projects and programmers.  Many arbitrary low-level decisions are made
during coding.  While many of these have no effect on the machine code, they
do affect the appearance of the code.  Consistent practices also enhance
productivity and reusability.  Project requirements, when applicable, take
precedence.

    The goals of the guidelines are, in decreasing order of importance:

    1) understandability -- conveys the purpose of computations to the reader
    2) transportability -- between compilers on assorted modern operating
       systems
    3) maintainability -- can be readily enhanced
    4) efficiency -- execution speed


II. General
    1) Adhere to strict FORTRAN 77 as closely as possible, with the following
       exceptions.
       a)  In addition to the standard character set (where the notation [0-9]
           indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
                space $ ' ( ) * +  , - . / : =  [0-9] [A-Z]
           the following characters may be used:
                ! # & [a-z]
       b)  Identifier names may be up to 31 characters long (the first 6
           remain significant).
       c)  INCLUDE statements may be used.
       d)  INTEGER*4 may be used when INTEGER defaults to 2 bytes.
    2) Avoid compiler directives in code, especially if an equivalent
       compiler option is available.  If a compiler directive must be used,
       comment its effect and target machine, operating system, and compiler,
       with revision numbers.
    3) Use a consistent set of compiler options, or maintain a makefile or
       command procedure to enable reconstruction.  Suggested compiler
       options are listed in Appendix II.


III. Project Organization
    1) Use a standardized comment header at the top of every subprogram (see
       example in Appendix III).
    2) Group related subprograms into a module.  The module is assigned a
       unique name and is typically stored in one or more files in a single
       subdirectory on the file system.  If used, a single object library
       contains the code for all the module components.
    3) The first two letters of the routine and file name correspond to the
       module and the remaining four uniquely identify the routine.
       Additional characters may be used to create a more sensible name.


IV. Program Units
    1) Begin a main program with a PROGRAM statement.
    2) Order arguments as follows:  inputs, outputs, control, external names.
    3) FUNCTIONs must not have side effects.
    4) Reference external functions in an EXTERNAL statement.
    5) Do not use alternate returns.  Every subprogram should have single
       entry and exit points.


V. Statement format
    1) Use the standard source format (columns 1-5 for label, 6 for
       continuation, and 7-72 for statement).
    2) Indicate a non-blank comment with an * in column 1.
    3) Indicate a continuation with an & in column 6.
    4) Do not split a name across lines.
    5) Do not write more than one statement per line.


VI. Statement labels
    1) Assign a label only if necessary.
    2) Assign labels in ascending order.
    3) Assign a separate sequence of labels to FORMAT labels that are grouped
       at the end of a subprogram.
    4) Right-adjust labels.
    5) Do not use the label field of a continuation line.


VII. Capitalization
    1) Set keywords in all caps.
    2) Set symbolic names of constants (parameters) in all lower case.
    3) Set all other identifiers in initial caps with embedded words initial
       capped.


VIII. Spacing
    1) Do not use tabs.
    2) Write keywords without embedded spaces, but surround keywords with
       spaces.
    3) Do not put space between an array name and its index, Array(I).
    4) Put one space between a subprogram name and its argument list, e.g.,
       Subr (Arg1, Arg2).
    5) Except in argument lists, put a space after an open parenthesis and
       before a close parenthesis.
    6) Put one space between arguments.
    7) Use spacing in equations to reveal operators and reinforce precedence.
    8) Use indentation to reinforce control flow; each indent is 3 columns.
    9) Use whitespace to enhance readability.


IX. Identifier Selection
    1) Start with letter [A-Z], follow with only letters or digits.
    2) Limit to 31 characters; distinguish with first 6.
    3) Choose an identifier to represent the entity being modeled.
    4) Explain the significance of each variable and array in comments.
    5) Do not use a keyword as an identifier.
    6) Do not use a subprogram name as a COMMON block name.
    7) Do not abbreviate .TRUE. or .FALSE.
    8) Delimit character strings with apostrophes.


X. Constants
    1) Use PARAMETERs to symbolically name all compile-time constants.
    2) Use only constant expressions to define PARAMETERs.


XI. Typing
    1) Use the following data types:
           CHARACTER[*n]
           COMPLEX
           DOUBLE PRECISION
           INTEGER*4
           INTEGER
           LOGICAL
           REAL
    2) Use DOUBLE PRECISION instead of REAL*8.
    3) Use INTEGER for integers that are always in the range of two-byte
       integers [-32768,32767]; then use a compiler option to select two or
       four byte storage.
    4) Declare all variables.  Use a compiler option to ensure declaration.
       If no such compiler option is available, setting the implicit type of
       all variables to a type that is not used in the program often snags
       undeclared variables.  An example is:  IMPLICIT COMPLEX*16 (A-Z).
    5) Arrange identifiers in type declarations logically based on the
       entities which they describe.  Subprogram arguments may be declared in
       order of appearance in the argument list.  If there is no other
       obvious order, arrange alphabetically.
    6) Do not compare arithmetic expressions of different types; type convert
       explicitly.


XII. Operators
    1) Do not use .EQ. and .NE. between floating point expressions.
    2) Use .GE. or .LE., as appropriate, instead of .EQ. when checking for a
       threshold crossing.
    3) Compare unequal length character strings with LGE, LGT, LLE, and LLT.
    4) Use only the logical operators .AND., .OR., .EQV., .NEQV., and .NOT.,
       and only use them on LOGICAL operands.


XIII. Expressions
    1) Surround low precedence operators with space.
    2) Split an expression across lines after an operator.
    3) Indent continuation lines.
    4) Consider the types of operands and their effects on the values of
       subexpressions, e.g., 8 / 3 * 3.0 is 6.0, not 8.0.
    5) Be careful with "exact" values of floating point expressions, e.g.,
       assigning 30.0 / 0.1 to an integer may define it as 299!


XIV. Arrays
    1) Declare array dimensions in the type declaration rather than in a
       separate DIMENSION statement.
    2) Use only INTEGER subscript expressions.
    3) Preferably operate on arrays such that the first indices vary fastest
       and the last vary slowest.
    4) Specify all subscripts in any array reference.
    5) Do not exceed the bounds of declared array dimensions.


XV. Control structures
    1) Use GOTO carefully.  See Appendix I for loop constructs.  Comments at
       the target of a GOTO listing possible 'come froms' are very helpful.
    2) Terminate or begin every loop with a distinct CONTINUE.
    3) Do not jump into the middle of a loop or conditional.
    4) Use STOP only for abnormal termination, and include the reason in the
       character string message.


XVI. Arguments
    1) Match the actual arguments in the caller to the formal arguments of
       the callee in both number and type.
    2) Do not repeat an actual argument in any call.
    3) All arguments to an intrinsic function must be of the same type.
    4) Do not pass a constant as an actual argument unless it is to an IN
       formal (see Appendix III for definition).


XVII. COMMON blocks
    1) Only place data in COMMON blocks if necessary.
    2) Place COMMON block definitions in INCLUDE files.
    3) SAVE all COMMON blocks.
    4) Do not mix CHARACTER and non-character types in a COMMON block.
    5) Do not pass as an argument any variable referenced in a COMMON block
       in both the calling and called subprograms.
    6) Initialize COMMON blocks only in BLOCK DATA subprograms.
    7) Compile each BLOCK DATA subprogram with another program unit in
       which it is referred to with an EXTERNAL statement.
    8) Use EQUIVALENCE with care, and only to economize on storage; avoid
       aliasing, and where it is unavoidable, comment it well.


XVIII. Input/Output
    1) Use error recovery options END=, ERR=, and IOSTAT=, and handle all
       such conditions gracefully.
    2) Position a FORMAT statement immediately following its reference.
       Position FORMAT statements that are used more than once at the end of
       the subprogram.
    3) Use implied DO rather than DO loops.
    4) OPEN all files with STATUS = 'UNKNOWN' unless otherwise required.


                                 BIBLIOGRAPHY

    The contributions of the following individuals are gratefully
acknowledged.  Many laudable suggestions were unilaterally discarded solely in
the interest of keeping this guide as brief as possible.  A style guide by its
very nature is subjective.  The primary intent of this guide is to draw attention
to some of the intricacies of coding in the interest of encouraging
consistency.


Bierman, Keith, personal correspondence, <8911010025.AA06492@chiba.sun.com>
(31 Oct 1989) and <8911160932.AA05443@chiba.sun.com> (16 November 1989).

Caffin, R. N., "More on Fortran Coding Conventions," Fortran Forum 3:3 (ACM,
December 1984).

Calhoun, Myron A., personal correspondence,
<8911061456.AA28516@harris.cis.ksu.edu> (6 November 1989).

Cox, Robert W., personal correspondence,
<8910311522.AA03009@ilmarin.c3.lanl.gov> (31 Oct 1989).

Liesenfeld, Ulrich, personal correspondence,
<1955:uli@analyt.chemie.uni-bochum.dbp.de> (16 November 1989).

Metcalf, Michael, FORTRAN Optimization (New York:  Academic Press, Inc.,
1982).

Metcalf, Michael, "FORTRAN 77 Coding Conventions," ForTec Forum 2:4 (ACM,
December 1983).

Miller, Geoff, "Bureau of Transport Economics Computer Users Guide, Attachment
1 - FORTRAN Programming Standards" (Canberra:  Computer Centre, Australian
Defence Force Academy, 8 August 1986).

Montgomery, Peter, personal correspondence,
<8910311752.AA20351@sonia.math.ucla.edu> (31 Oct 1989).

Watson, Ian, personal correspondence,
<8912061409.AA05518@Kodak.COM> (6 Dec 1989).




                         APPENDIX I.  Loop Constructs

    1) iterative

      DO 10 I = 1, iterations
         . . .
  10  CONTINUE

       Do not modify the loop variable.


    2) while

  10  CONTINUE
         . . .
         IF ( condition ) GOTO 20
         . . .
         GOTO 10
  20  CONTINUE


    3) do-while or repeat-until (all statements in loop execute at least
       once)

  10  CONTINUE
         . . .
      IF ( condition ) GOTO 10




                   APPENDIX II.  Suggested Compiler Options

    1) VAX FORTRAN:
       /check=(bounds,overflow)/g_float/standard/warnings=all

    2) MS fl:
       /4I2 /4Yd /Ox           (/G2 for 80286/386 instructions)

    3) Sun f77:

           (fp is floating point hardware option, e.g., 68881
            j is optimization level)


       Sun3 f77 version 1.2 and earlier:
       -ansi -ffp -u -Oj foo.f /usr/lib/fp/libm.il /usr/lib/libm.il -lm

       Sun3 f77 version 1.3:
       -ansi -fast -u -O3 foo.f -lm

       Sun4 f77 prior to version 1.2:
       -ansi -u -Oj foo.f /usr/lib/f77/libm.il -lm

       Sun4 f77 version 1.2:
       -ansi -u -Oj foo.f /usr/lib/f77/libm.il /usr/lib/f77/libm.il -lm

       (and use -dalign if DOUBLE PRECISION is used; may cause core dump if
       code is not all double word aligned)

       Sun4 f77 version 1.3:
       -ansi -fast -u -O3 foo.f -lm

    4) Convex fc:
       -ep i -Oj               (i is number of processors and j is
                               optimization level)




                             APPENDIX III. Example
notes on example:

    1) The dividing lines end in column 72, to serve as a visual aid when
       working with editors that do not display cursor position.

    2) The 'Units' field is useful in physical applications; other parameter
       properties may be of interest in other applications.

    3) Arguments and common block entities that are referenced in the
       subprogram are listed in the prologue.  Each may be classified as IN,
       OUT, or INOUT mode, following the Ada practice as shown below.  A
       variable is defined if its value is changed, such as by assignment.  A
       variable is used if its value, on entry to the subprogram, is
       referenced.

           mode         define           use
           ----         ------           ---
           IN           not allowed      allowed
           OUT          allowed          not allowed
           INOUT        allowed          allowed



      SUBROUTINE TEUpCase (String)
*
* ================== Prologue ==========================================
*
* Purpose:
*    Convert a string to all upper case characters.
*
* History:
*    Version   Programmer         Date       Description
*    -------   ----------         ----       -----------
*    1.0       D. Levine          03/16/89   created
*    1.1       D. Levine          12/11/89   changed to INDEX into
*                                            constant string from adding
*                                            'A' - 'a' to each char.
*
* IN args/commons         Units      Description
* ---------------         -----      -----------
*
* OUT args/commons        Units      Description
* ----------------        -----      -----------
*
* INOUT args/commons      Units      Description
* ------------------      -----      -----------
* String                  N/A        string to be converted
*
* Processing:
*    For each character in String, check to see if it is lower case by
* finding its INDEX in the string [a..z].  Then, replace with the
* character at the same position in the string [A..Z].
*
* Special requirements:
*    none
*
* ------------------ Include files -------------------------------------
* ------------------ Constant declarations -----------------------------
* ------------------ Argument declarations -----------------------------

      CHARACTER*(*) String

* ------------------ Global/External declarations ----------------------
* ------------------ Local declarations --------------------------------

      INTEGER Pos, I

      CHARACTER*26 lowers, uppers
      DATA lowers / 'abcdefghijklmnopqrstuvwxyz' /,
     &     uppers / 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' /

* ------------------ Code ----------------------------------------------

      DO 100 I = 1, LEN (String)

         Pos = INDEX (lowers, String(I:I))
         IF ( Pos .NE. 0 ) String(I:I) = uppers(Pos:Pos)

  100 CONTINUE

      RETURN
      END

jlg@lanl.gov (Jim Giles) (08/03/90)

This is not a disagreement with the style guide David Levine gave
but merely a commentary on it.  I say this because I will only
discuss those parts of his submission that I disagree with. As
a result, I might be mistaken for a dissenter.  In fact, as will
be seen, I don't discuss most of what he says (and I therefore
agree with those parts).

From article <26B89BE1.4349@ics.uci.edu>, by levine@crimee.ics.uci.edu (David Levine):
> [...]
>        a)  In addition to the standard character set (where the notation [0-9]
>            indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
>                 space $ ' ( ) * +  , - . / : =  [0-9] [A-Z]
>            the following characters may be used:
>                 ! # & [a-z]

Believe it or not, there are still implementations without these characters
(or, at least, compilers which still don't recognize them).  You can use
these as long as you have access to some tool to remove or replace them
when you need to (this may be never - so maybe you're safe).

>        b)  Identifier names may be up to 31 characters long (the first 6
>            remain significant).

_NEVER_ use more characters in an identifier than the compiler considers
significant.  _NEVER_ buy a compiler which allows an identifier to
contain insignificant characters.  Yes, the 6 character limit of standard
Fortran is irritating.  Tricking yourself with identifiers that _look_
different to you, but might not be to the compiler is _not_ a good
solution.

>        c)  INCLUDE statements may be used.

Well, just don't count on the _compiler_ to do it.  Many compilers don't.
Any half-way acceptable text editor ought to have sufficient macro definition
capability to expand the INCLUDE.

>        d)  INTEGER*4 may be used when INTEGER defaults to 2 bytes.

Not available everywhere.  Just like the lower case letters, you should
only use this if you have an automatic tool to remove it - otherwise
you have to do it by hand.

>     2) Avoid compiler directives in code, especially if an equivalent
>        compiler option is available.

If the directives in code do the same thing that command line directives
do, there's no reason for the vendor to supply both.  You can't make
rigid rules about this issue without assuming things that you don't
know about what the directives might do and what the priorities of the
programmers are.

>     3) The first two letters of the routine and file name correspond to the
>        module and the remaining four uniquely identify the routine.
>        Additional characters may be used to create a more sensible name.

See above for comments on insignificant identifier characters.

> VI. Statement labels
>     1) Assign a label only if necessary.
>     2) Assign labels in ascending order.

My only complaint here is wording.  In Fortran, the word "ASSIGN" has a
specific meaning which isn't intended here.  Replace "assign" with "declare"
in all the stuff above, and I'll agree with it.

>     3) Assign a separate sequence of labels to FORMAT labels that are grouped
>        at the end of a subprogram.

Same comment about "assign" plus the following: whenever possible, put
the format into the I/O statement which uses it.  Use labeled formats
only when several I/O statements share the same (long) format.
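
A minimal sketch of a format placed in the I/O statement itself, as a
character constant (the program and variable names are made up for the
illustration):

```fortran
      PROGRAM FmtIn
      INTEGER NItems

      NItems = 42
*     Format given as a character constant in the WRITE itself.
      WRITE (*, '(1X, A, I5)') 'Items read:', NItems
      END
```

No label is declared, and the layout is visible right where it is used.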

> VII. Capitalization
>     1) Set keywords in all caps.
>     2) Set symbolic names of constants (parameters) in all lower case.
>     3) Set all other identifiers in initial caps with embedded words initial
>        capped.

Way too rigid.  In a language like C, where case is significant, such
rigid rules are important in order to share code among users (or across
time).  In Fortran, case should be used freely (with comments) for
emphasis and documentation.

> VIII. Spacing
>      [...]
>     3) Do not put space between an array name and its index, Array(I).
>     4) Put one space between a subprogram name and its argument list, e.g.,
>        Subr (Arg1, Arg2).
>     5) Except in argument lists, put a space after an open parenthesis and
>        before a close parenthesis.
>     6) Put one space between arguments.
>      [...]
>     8) Use indentation to reinforce control flow; each indent is 3 columns.
>      [...]

Again too rigid.  Spaces should be used to enhance legibility, not to
conform to some rigid style rules.  All these things, but particularly
how many columns to indent, should be up to the individual programmer.
Any other decision doesn't allow for the different constraints of
different applications and users.  As long as a program uses space
in a reasonably consistent way, it doesn't matter what the particular
rules are.

>     2) Limit to 31 characters; distinguish with first 6.

Again, _NEVER_ use an identifier with more characters than are significant.

>     1) Use PARAMETERs to symbolically name all compile-time constants.

A little too rigid.  If the constant is something that's likely to
change in future versions of the code, name it.  If the constant
is hard to remember (or long and prone to typing errors), then name
it.  Otherwise don't name it.  I've seen people who've named 1 (one),
and then used that everywhere - this doesn't enhance legibility or
maintainability at all.

> XI. Typing
>            [...]
>            INTEGER*4

Not universally available.

>        [...]
>     4) Declare all variables.  Use a compiler option to ensure declaration.
>        If no such compiler option is available, setting the implicit type of
>        all variables to a type that is not used in the program often snags
>        undeclared variables.  An example is:  IMPLICIT COMPLEX*16 A-Z.

A better 'snag' is IMPLICIT LOGICAL A-Z.  Complex has the disadvantage
that arithmetic (the most common variable use) is still legal on it.
Logical is better because it is legal in fewer contexts.
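
A sketch of how the LOGICAL snag is meant to work, assuming a compiler
that rejects mixed LOGICAL/REAL arithmetic (the names are hypothetical;
the standard spelling of the statement puts the letter range in
parentheses):

```fortran
      PROGRAM Snag
      IMPLICIT LOGICAL (A-Z)
      REAL Total, Delta

      Total = 1.0
      Delta = 0.5
*     A misspelling such as
*         Total = Total + Detla
*     would make Detla an undeclared LOGICAL, so the addition would
*     mix REAL and LOGICAL operands and should be diagnosed.
      Total = Total + Delta
      WRITE (*, '(1X, A, F4.1)') 'Total =', Total
      END
```

The program as written compiles; uncommenting the misspelled line is
what should trigger the diagnostic.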

> XII. Operators
>        [...]
>     3) Compare unequal length character strings with LGE, LGT, LLE, and LLT.

Compare _all_ character strings this way.  Otherwise you get a non-portable
lexicographic ordering from some machines.
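
As a rough illustration, the comparison below gives the same answer on
any machine because LLT is defined in terms of the ASCII collating
sequence, whereas .LT. follows whatever sequence the processor uses
(the names are hypothetical):

```fortran
      PROGRAM CmpStr
      CHARACTER*8 Name1, Name2

      Name1 = 'apple'
      Name2 = 'Banana'
*     .LT. follows the processor's native collating sequence;
*     LLT always follows the ASCII sequence.
      IF ( LLT (Name2, Name1) ) THEN
         WRITE (*, '(1X, 3A)') Name2, ' sorts before ', Name1
      ELSE
         WRITE (*, '(1X, 3A)') Name1, ' sorts before ', Name2
      END IF
      END
```

In ASCII the upper case letters precede the lower case ones, so the
first branch is taken; with .LT. on an EBCDIC machine the order of the
two names could come out reversed.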

> XIII. Expressions
>        [...]
>     2) Split an expression across lines after an operator.

Obviously, only if the line _needs_ splitting :-).

> XIV. Arrays
>     1) Declare array dimensions in the type declaration rather than in a
>        separate DIMENSION statement.

In fact, _never_ use the DIMENSION statement.  Means the same thing,
I just like it stated as clearly as possible.

>     3) Preferably operate on arrays such that the first indices vary fastest
>        and the last vary slowest.

An important issue in bygone days and still an optimization issue
on scalar machines with small caches.  This isn't a disagreement,
but I'd like to point out that there are other orders which might
be desirable.  For example, on a vector machine, you want the index
with the longest dimension to be on the inner loop (longer vectors).
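
For reference, the loop nesting that the quoted rule asks for looks
roughly like this (a sketch with made-up names and sizes; Fortran
stores arrays with the first index varying fastest):

```fortran
      PROGRAM ColMaj
      INTEGER nrows, ncols
      PARAMETER (nrows = 4, ncols = 3)
      REAL Grid(nrows, ncols), Total
      INTEGER I, J

      Total = 0.0
*     Outer loop over the last index, inner loop over the first, so
*     consecutive iterations touch adjacent storage locations.
      DO 20 J = 1, ncols
         DO 10 I = 1, nrows
            Grid(I, J) = REAL (I + J)
            Total = Total + Grid(I, J)
   10    CONTINUE
   20 CONTINUE

      WRITE (*, '(1X, A, F6.1)') 'Total =', Total
      END
```

On a vector machine the choice might instead be driven by which
dimension is longest, as noted above.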

> XVII. COMMON blocks
>        [...]
>     3) SAVE all COMMON blocks.

I've requested the Fortran committee for years to make SAVE the default
for COMMON.

>     6) Initialize COMMON blocks only in BLOCK DATA subprograms.

In spite of the fact that the standard calls for this, I am not sure
that I agree.  If MODULES existed, the obvious preferred place to
initialize globals would be in the MODULE.  But, for COMMON, I think
BLOCK DATA was always a clumsy solution.  Anyway, _wherever_ you
do it, you _should_ initialize _all_ common data.

>     7) Compile BLOCK DATA subprograms with another program unit in which it
>        is referred to with an EXTERNAL statement.

This is why I'm not sure I agree with using BLOCK DATA at all.

> XVIII. Input/Output
>     2) Position a FORMAT statement immediately following its reference.

Position formats _in_ the I/O statement that refer to them if possible.

>                          APPENDIX I.  Loop Constructs
>     2) while
>   10  CONTINUE
>          . . .
>          IF ( condition ) GOTO 20
>          . . .
>          GOTO 10
>   20  CONTINUE

If it's _really_ a while construct, there should _NEVER_ be any code
between the beginning of the loop and the test.  This more strongly
resembles an infinite loop with an exit condition buried in the middle.
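
A sketch of a while loop in the stricter sense, with the test as the
first executable statement after the loop label (the names and the
computation are arbitrary):

```fortran
      PROGRAM WhilEx
      INTEGER N, Steps

      N = 100
      Steps = 0
*     while ( N .GT. 1 ): the test is the first statement in the loop
   10 CONTINUE
      IF ( N .LE. 1 ) GOTO 20
         N = N / 2
         Steps = Steps + 1
         GOTO 10
   20 CONTINUE

      WRITE (*, '(1X, A, I3)') 'Steps =', Steps
      END
```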

J. Giles

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (08/03/90)

In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>
> [...(quoting David Levine's style guide) ...]
>
>>     1) Use PARAMETERs to symbolically name all compile-time constants.
> 
> A little too rigid.  If the constant is something that's likely to
> change in future versions of the code, name it.  If the constant
> is hard to remember (or long and prone to typing errors), then name
> it.  Otherwise don't name it.  I've seen people who've named 1 (one),
> and then used that everywhere - this doesn't enhance legibility or
> maintainability at all.

There is not much need to declare a parameter for integer 1, but there
is an advantage in declaring parameters for fractional floating-point
numbers, since the precision can then be changed by an IMPLICIT
statement rather than searching/replacing through the whole code.
Unfortunately, X3J3 seems to have ignored my requests that something
intelligent be done about this in Fortran-90....
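
A minimal sketch of the idea, assuming the code relies on implicit
typing (which the guide itself discourages); the constant names are
hypothetical:

```fortran
      PROGRAM Prec
*     Change this one statement to IMPLICIT REAL (A-H, O-Z) to
*     switch every implicitly typed name, including the named
*     constants, to single precision.
      IMPLICIT DOUBLE PRECISION (A-H, O-Z)
      PARAMETER (third = 1.0D0 / 3.0D0, tenth = 1.0D-1)

      X = 3.0D0
      Y = third * X + tenth
      WRITE (*, *) 'Y =', Y
      END
```

The defining expressions are written in double precision and are
converted to the type of the named constant, so nothing else needs
editing when the IMPLICIT statement changes.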

On another subject, the spacing and capitalization rules can (almost)
all be handled by the TOOLPACK utility 'pol'.  The only thing that I
can think of that it cannot do is capitalize the first letter of each
word in identifiers which are not keywords or variables (I guess that
means subroutine and function names).  TOOLPACK currently limits the
user to 6-character identifiers, but also provides an interactive
naming facility.  I don't know if it would be difficult to hack
31-character names into the code....
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@vax1.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET

julian@cernvax.UUCP (julian bunn) (08/03/90)

David Levine has recently posted an interesting set of Fortran 77
coding guidelines, itemized in 18 parts. In this posting I describe
which of Levine's guidelines may be automatically checked using
Floppy, a Fortran coding convention checker, the source of which
was posted to comp.sources.misc some weeks ago. I also describe
which guidelines are most commonly agreed to here at CERN, which
may also be checked by Floppy, and which do not appear in Levine's
list.


----------------------------------------------------------------------
The guideline headings and numbers below are as they appeared in David
Levine's posting.

IV. Program Units
    3) FUNCTIONs must not have side effects.
Floppy checks for I/O in functions.
    4) Reference external functions in an EXTERNAL statement.
Floppy checks this.
    5) Do not use alternate returns.  Every subprogram should have single
       entry and exit points.
Floppy checks for alternate returns.

V. Statement format
    2) Indicate a non-blank comment with an * in column 1.
Floppy will check for comment lines not beginning with "C"

VI. Statement labels
    2) Assign labels in ascending order.
Floppy will optionally re-order all statement labels in ascending order,
with fixed start and step size.
    3) Assign a separate sequence of labels to FORMAT labels that are grouped
       at the end of a subprogram.
Floppy will optionally do this, too.
    4) Right-adjust labels.
Floppy checks that labels do not begin in column 1.

VIII. Spacing
    2) Write keywords without embedded spaces, but surround keywords with
       spaces.
Floppy checks for embedded blanks, not only in keywords, but also in
variable names, etc.
    8) Use indentation to reinforce control flow; each indent is 3 columns.
Floppy will optionally re-indent the source, the indent step for each level
being user-defined, between 1 and 5.

IX. Identifier Selection
    5) Do not use a keyword as an identifier.
Floppy checks this.
    6) Do not use a subprogram name as a COMMON block name.
Floppy checks this.

XI. Typing
    1) Use the following data types:
           CHARACTER[*n]
           COMPLEX
           DOUBLE PRECISION
           INTEGER*4
Floppy warns against INTEGER*4!
           INTEGER
           LOGICAL
           REAL
    2) Use DOUBLE PRECISION instead of REAL*8.
Floppy warns against REAL*8.
    6) Do not compare arithmetic expressions of different types; type convert
       explicitly.
Floppy checks for mixed mode expressions, e.g. A = B/I.

XII. Operators
    3) Compare unequal length character strings with LGE, LGT, LLE, and LLT.
Floppy checks this.

XV. Control structures
    4) Use STOP only for abnormal termination, and include the reason in the
       character string message.
Floppy checks that a STOP statement is immediately preceded by a WRITE.

XVIII. Input/Output
    2) Position a FORMAT statement immediately following its reference.
       Position FORMAT statements that are used more than once at the end of
       the subprogram.
Floppy will optionally move all FORMAT statements to the end of the module.
-----------------------------------------------------------------------------

Other guidelines in common use at CERN. (The numbers at the left refer
to the rule number in Floppy.)

IV. Program Units
  1   Avoid comment lines after end of module
  2   End all program modules with the END statement
  11  Avoid comment lines before module declaration
  12  Module names should not be the same as intrinsic names
  13  First statement in a module should be declaration
  14  Module should begin with at least 3 comment lines
  29  Avoid the use of ENTRY in FUNCTIONs
  36  Module names should all be different

V. Statement format
  40  Separate Statement Functions by comment lines
  41  No names in Statement Function definitions elsewhere

VI. Statement labels
  27  Statement labels should not begin in column 1

IX. Identifier Selection
  9   Integer variables should begin with I to N
  6   Variable names should be 6 or fewer characters long

XIII. Expressions
  16  No comment lines between continuation lines

XV. Control structures
  26  Avoid the use of PAUSE statements

XVI. Arguments
  38  Length of passed CHARACTER variables should be *
  44  Passed arguments should be dimensioned * in module

XVII. COMMON blocks
  3   Declared COMMON blocks must be used in the module
  4   COMPLEX and DOUBLEPRECISION variables at end of COMMON
  5   COMMON block definitions should not change between modules
  18  Avoid multiple COMMON definitions per line
  19  Do not dimension COMMON variables outside COMMON
  32  COMMON block names should not equal variable names

XVIII. Input/Output
  22  Avoid the use of PRINT statements (use WRITE)
  24  Avoid WRITE(* construction
  30  Avoid using I/O in FUNCTIONs

desj@ccrwest.UUCP (David desJardins) (08/04/90)

In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>>     6) Initialize COMMON blocks only in BLOCK DATA subprograms.
>
>[...]  Anyway, _wherever_ you
>do it, you _should_ initialize _all_ common data.

   I find it hard to believe that you really mean this (especially as
you are writing from LANL).  On Crays at least, and probably on a lot
of other machines, initialized storage is included, byte for byte, in
the executable (a.out) file.  In fact, if any item in the common block
is initialized, the whole block is included.
   This makes it *extremely* undesirable to write code like

	INTEGER ARRAY(10**8)
	LOGICAL FLAG
	COMMON /FOO/ ARRAY, FLAG
	DATA FLAG/.FALSE./

much less to initialize the array itself.  (The above is a natural
thing to write if you want to initialize a table the first time it is
used, but not subsequently.)  Doing this causes you to discover
rapidly just how much (or how little) disk space is available to you
:-).

   For that matter, I disagree with the whole premise that arrays
should be initialized.  If the initialization is not necessary for the
functioning of the program, it seems likely to mislead the reader, who
will probably think that the initialized value has some particular
purpose.  And, as noted above, it is probably better to initialize
large arrays in code rather than with DATA statements anyway.
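
One way to get first-call initialization without a DATA statement on
the COMMON block is sketched below.  The flag is kept as a SAVEd local
rather than as a member of the block; the names and the table size are
hypothetical, and whether the loader then leaves the block out of the
executable depends on the system.

```fortran
      PROGRAM TblDem
      INTEGER tblsiz
      PARAMETER (tblsiz = 1000)
      INTEGER Table(tblsiz)
      COMMON /Foo/ Table
      SAVE /Foo/

      CALL TblIni
      CALL TblIni
      WRITE (*, '(1X, A, I5)') 'Table(10) =', Table(10)
      END

      SUBROUTINE TblIni
*     Fill /Foo/ on the first call only.  The flag is a SAVEd local,
*     not a member of the COMMON block, so no DATA statement touches
*     the block itself.
      INTEGER tblsiz
      PARAMETER (tblsiz = 1000)
      INTEGER Table(tblsiz)
      COMMON /Foo/ Table
      SAVE /Foo/
      INTEGER I
      LOGICAL First
      SAVE First
      DATA First / .TRUE. /

      IF ( First ) THEN
         DO 10 I = 1, tblsiz
            Table(I) = I * I
   10    CONTINUE
         First = .FALSE.
      END IF

      RETURN
      END
```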

   -- David desJardins

jlg@lanl.gov (Jim Giles) (08/04/90)

From article <348@ccrwest.UUCP>, by desj@ccrwest.UUCP (David desJardins):
> In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>>[...]  Anyway, _wherever_ you
>>do it, you _should_ initialize _all_ common data.
> 
>    I find it hard to believe that you really mean this (especially as
> you are writing from LANL).  On Crays at least, and probably on a lot
> of other machines, initialized storage is included, byte for byte, in
> the executable (a.out) file.  In fact, if any item in the common block
> is initialized, the whole block is included.

Yes, I know.  It's one of the things that I've been asked to look
into.  I do local site analysis on SEGLDR, the loader on UNICOS.
Still, I don't see why you don't want the arrays initialized.  There
_are_ other mechanisms than DATA statements, you know.  For example,
there is a SEGLDR directive that is _supposed_ to initialize all memory
that isn't otherwise initialized to a number you can supply at load time -
uninitialized data are still compressed out of the executable and the
initialization is done by the startup routine (unfortunately, it's not
implemented yet - another thing to "look into" - which means _I_ probably
have to fix it.)

> [...]                   If the initialization is not necessary for the
> functioning of the program, it seems likely to mislead the reader, who
> will probably think that the initialized value has some particular
> purpose.

It _does_ have a purpose.  It is to prevent accidental use of meaningless
data.  I initialize stuff to NANs or (on the Cray) to infinity.  Integers
are a bit harder.  Still, I want accidental use of meaningless data to
stand out as clearly as possible.

> [...]      And, as noted above, it is probably better to initialize
> large arrays in code rather than with DATA statements anyway.

Yes, that's a way too.  (It is, in fact what the SEGLDR directive is
supposed to do.)  Like I say, you should _always_ initialize your data.

J. Giles

burley@world.std.com (James C Burley) (08/05/90)

I'm not too enamoured with the idea of encoding FORMAT strings as constants
in I/O statements.  Separate FORMAT statements, as ugly as they are, offer
two advantages:

1)  The compiler is forced to check the contents of the format for syntactic
    correctness, at least to parse hollerith and character data.  It is not
    so required for an in-line format, which it need only ensure is a valid
    character constant.  So programmer errors are more likely to be detected
    at compile time with a separate statement.

2)  Some systems are likely to optimize a FORMAT statement by compiling it
    into an intermediate representation at compile time, but postpone doing
    anything about a constant FORMAT specifier in an I/O statement until
    a given execution of that I/O statement.  So formatted I/O is likely to
    take less CPU time with separate FORMAT statements.

Note that a system that translates constant FORMATs in I/O statements also
must effectively check it for legitimacy, so if #2 does not apply to a given
system, #1 probably doesn't either, and vice versa.

Also note that I intend to make GNU Fortran check and compile FORMAT constants
just like FORMAT statements, if I can, because aside from these two
considerations, I think separate FORMAT statements are pretty silly anyway.
Even if you want to use a FORMAT specifier for more than one statement, I'd
prefer people use the more general PARAMETER mechanism to achieve that.
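
A sketch of the PARAMETER approach, using a character named constant
as the format specifier (the names are made up for the example):

```fortran
      PROGRAM FmtPar
      CHARACTER*(*) fmtcnt
      PARAMETER (fmtcnt = '(1X, A, I5)')
      INTEGER NRead, NKept

      NRead = 120
      NKept = 97
*     Both statements share the one named format.
      WRITE (*, fmtcnt) 'Records read:', NRead
      WRITE (*, fmtcnt) 'Records kept:', NKept
      END
```

A character expression is a standard format specifier, so this needs
no extension; whether it is parsed at compile time or at run time is
the system-dependent part discussed above.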

But if your target system(s) suffers from either or both of the above
ailments, then it might be best to stick with separate FORMAT statements
everywhere you can.

James Craig Burley, Software Craftsperson    burley@world.std.com

burley@world.std.com (James C Burley) (08/05/90)

XVI. Arguments
  38  Length of passed CHARACTER variables should be *
  44  Passed arguments should be dimensioned * in module

I'd be interested in knowing the reasoning behind these two.  If you KNOW
you want to pass, say, a 6-character string, why burden the code generation
(and thus the run-time speed) of the receiving procedure with having to
retrieve the actual length from the caller?  Furthermore, a checking compiler
might be able to figure out that a passed variable's length disagrees with
the receiver's declared length more easily and more certainly than figure
out that an arbitrary reference to a *-length dummy (such as FOO(6:6), the
sixth character) is invalid because the actual argument is not long enough.

Similar question about 44, since a checking compiler can't easily check array
bounds once the array is passed into a procedure that declares it as an
assumed-size array.  Also, based on personal experience, if you try and port
your application to a host/attached-processor combination that allows you to
split routines between host (typically I/O-bound routines) and attached
processor (typically compute-bound routines), it won't know how much data
of an assumed-size array to pass.  So instead of

      REAL FUNCTION SUM(ARRAY,SIZE)
      REAL ARRAY(*)
      INTEGER SIZE

You say

      REAL FUNCTION SUM(ARRAY,SIZE)
      INTEGER SIZE
      REAL ARRAY(SIZE)

Which either generates the exact same code for SUM or, for a checking
compiler, adds bounds checking to references to ARRAY within SUM, but also
which provides size information for ARRAY usable in passing the contents
of dummy arguments across processor boundaries.
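
Filled out into something that compiles, the second form might look
like the sketch below.  The function is renamed ArrSum here (a
hypothetical name) only to stay clear of intrinsic names in later
Fortran dialects.

```fortran
      PROGRAM SumDem
      INTEGER npts
      PARAMETER (npts = 5)
      REAL Values(npts), ArrSum
      EXTERNAL ArrSum
      INTEGER I

      DO 10 I = 1, npts
         Values(I) = REAL (I)
   10 CONTINUE
      WRITE (*, '(1X, A, F6.1)') 'Sum =', ArrSum (Values, npts)
      END

      REAL FUNCTION ArrSum (Array, N)
*     Adjustable array: the declared bound comes from the dummy N,
*     so a checking compiler knows the extent of Array.
      INTEGER N
      REAL Array(N)
      INTEGER I

      ArrSum = 0.0
      DO 10 I = 1, N
         ArrSum = ArrSum + Array(I)
   10 CONTINUE

      RETURN
      END
```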

Other than some of these minor questions, I think all these postings about
coding styles look pretty good, though some of the things (like capitalization)
I'd rather leave to automated tools than breaking my fingers trying to obey!!

James Craig Burley, Software Craftsperson    burley@world.std.com

ghm@ccadfa.adfa.oz.au (Geoff Miller) (08/06/90)

jlg@lanl.gov (Jim Giles) writes:

>This is not a disagreement with the style guide David Levine gave
>but merely a commentary on it.  I say this because I will only
>discuss those parts of his submission that I disagree with. As
>a result, I might be mistaken for a dissenter.  In fact, as will
>be seen, I don't discuss most of what he says (and I therefore
>agree with those parts).

>>     3) Assign a separate sequence of labels to FORMAT labels that are grouped
>>        at the end of a subprogram.

>Same comment about "assign" plus the following: whenever possible, put
>the format into the I/O statement which uses it.  Use labeled formats
>only when several I/O statements share the same (long) format.

I may not have been the only person to contribute this recommendation to David,
but I did contribute it so I'll defend it.  If you can guarantee that every
FORMAT statement will appear immediately under (or in) the PRINT statement
that uses it, fine.  As soon as you start using the same FORMAT statement in
different places, you either have to hunt through the code or locate the
FORMAT statement separately, probably at the end.  I feel that it is more 
consistent to have a simple rule which puts them all in one place.

>> XI. Typing
>>        [...]
>>     4) Declare all variables.  Use a compiler option to ensure declaration.
>>        If no such compiler option is available, setting the implicit type of
>>        all variables to a type that is not used in the program often snags
>>        undeclared variables.  An example is:  IMPLICIT COMPLEX*16 A-Z.

>A better 'snag' is IMPLICIT LOGICAL A-Z.  Complex has the disadvantage
>that arithmetic (the most common variable use) is still legal on it.
>Logical is better because it is legal in fewer contexts.

IMPLICIT NULL is even better if your compiler allows it.  If you use 
IMPLICIT LOGICAL A-Z you may decide not to use logical variables at all,
which frankly I never found much loss.

Geoff Miller  (ghm@cc.adfa.oz.au)
Computer Centre, Australian Defence Force Academy

mac@harris.cis.ksu.edu (Myron A. Calhoun) (09/10/90)

In article <1794@ccadfa.adfa.oz.au> you write:

>>>  3) Assign a separate sequence of labels to FORMAT labels that are grouped
>>>     at the end of a subprogram.

>>Same comment about "assign" plus the following: whenever possible, put
>>the format into the I/O statement which uses it.  Use labeled formats
>>only when several I/O statements share the same (long) format.

>I may not have been the only person to contribute this recommendation to David,
>but I did contribute it so I'll defend it.  If you can guarantee that every
>FORMAT statement will appear immediately under (or in) the PRINT statement
>that uses it, fine.  As soon as you start using the same FORMAT statement in
>different places, you either have to hunt through the code or locate the
>FORMAT statement separately, probably at the end.  I feel that it is more 
>consistent to have a simple rule which puts them all in one place.

Unfortunately, re-used FORMATs cause programs to "break" during maintenance.  
The scenario is that someone modifies a FORMAT and ONE of its associated
I/O statements but doesn't notice there are others.

Voila!  Instant "broken" program.

It is EXTREMELY EASY to "guarantee that every FORMAT statement will
appear immediately under (or in) the [I/O] statement that uses it".
Just duplicate it.  Shouldn't take more than a few keystrokes per
FORMAT statement on any reasonable editor; maybe a bit more work if
one is still using cards!

My credentials?  I routinely am called upon to modify programs written
more than 20 years ago.  I've learned a lot in 20 years, and one thing
I've learned is to NOT reuse FORMAT statements.
--Myron.
--
# Myron A. Calhoun, Ph.D. E.E.; Associate Professor   (913) 539-4448 home
# INTERNET: mac@harris.cis.ksu.edu   (129.130.10.2)         532-6350 work
# UUCP: ...{rutgers, texbell}!ksuvax1!harry!mac             532-7004 fax
# AT&T Mail:  attmail!ksuvax1!mac

FC138001@ysub.ysu.edu (Phil Munro) (09/11/90)

(Myron A. Calhoun) says:
>
>
>Unfortunately, re-used FORMATs cause programs to "break" during maintenance.
>...
>It is EXTREMELY EASY to "guarantee that every FORMAT statement will
>appear immediately under (or in) the [I/O] statement that uses it".
>Just duplicate it.  Shouldn't take more than a few keystrokes per
>FORMAT statement on any reasonable editor; maybe a bit more work if
>one is still using cards!
>
  I think duplicated FORMAT statements mean added memory allocations
when the program is compiled.  Is this not right?

  It seems just as easy to use "any reasonable editor" to find every
use of a FORMAT as to duplicate them and waste memory.  On the other
hand, except for the machine-code memory problem, it is an appealing
idea to put WRITEs and FORMATs together.  --Phil

mac@harris.cis.ksu.edu (Myron A. Calhoun) (09/12/90)

In article <90254.120334FC138001@ysub.ysu.edu> Phil Munro <FC138001@ysub.ysu.edu> writes:
>  I think duplicated FORMAT statements mean added memory allocations
>when the program is compiled.  Is this not right?

>  It seems just as easy to use "any reasonable editor" to find every
>use of a FORMAT as to duplicate them and waste memory.  On the other
>hand, except for the machine-code memory problem, it is an appealing
>idea to put WRITEs and FORMATs together.  --Phil

OK, let me make some unreasonable assumptions:

Assume some program has 50 different FORMAT statements which appear
an average of 3 times EACH (is this UNreasonable enough?)  The
** average ** FORMAT statement probably fits in well under 100 bytes.
  50 * (3 - 1) * 100 = 10,000 extra bytes.

On a 64K machine (does anyone still use one of these?), an extra 10K might
be rather important, but on modern computers with modern-sized memories,
10K seems rather piddling to me.

Perhaps a stronger argument for "one I/O statement, one FORMAT" is the
modern idea that code modules should be readable in one forward top-to-
bottom pass.  Re-used FORMAT statements require flipping back and forth
through the code and thus violate the "one pass top-to-bottom" convention.
But duplicating the (almost always a) few FORMAT statements follows this
convention.

Besides, if you have a "zillion" FORMAT statements that are re-useable,
why not put them in a separate I/O routine and just have one copy of the
I/O statements, too!
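
A rough sketch of that last suggestion, with a hypothetical routine
PutCnt that owns both the WRITE and its FORMAT:

```fortran
      PROGRAM RptDem
      CALL PutCnt ('Records read:', 120)
      CALL PutCnt ('Records kept:', 97)
      END

      SUBROUTINE PutCnt (Label, NItems)
*     The only place this report line's layout is defined.
      CHARACTER*(*) Label
      INTEGER NItems

      WRITE (*, 100) Label, NItems
  100 FORMAT (1X, A, I6)

      RETURN
      END
```

Every caller gets the same layout, and the FORMAT sits directly under
the single WRITE that uses it.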
--Myron.
--
# Myron A. Calhoun, Ph.D. E.E.; Associate Professor   (913) 539-4448 home
# INTERNET: mac@harris.cis.ksu.edu   (129.130.10.2)         532-6350 work
# UUCP: ...{rutgers, texbell}!ksuvax1!harry!mac             532-7004 fax
# AT&T Mail:  attmail!ksuvax1!mac

djo7613@hardy.u.washington.edu (Dick O'Connor) (09/13/90)

In article <1990Sep11.191119.22682@maverick.ksu.ksu.edu> mac@harris.cis.ksu.edu (Myron A. Calhoun) writes:
>...
>Perhaps a stronger argument for "one I/O statement, one FORMAT" is the
>modern idea that code modules should be readable in one forward top-to-
>bottom pass.  Re-used FORMAT statements require flipping back and forth
>through the code and thus violate the "one pass top-to-bottom" convention.
>But duplicating the (almost always a) few FORMAT statements follows this
>convention.

What's the effect on executable size and execution time if every WRITE is
followed by a FORMAT, with all but the first one commented out?  This
is to avoid all that flipping back and forth, of course.

Curious...
"Moby" Dick O'Connor                         djo7613@u.washington.edu 
Washington Department of Fisheries           *I brake for salmonids*

Jeff Boyd <BOYDJ@QUCDN.QueensU.CA> (09/13/90)

If you wanted to manage your FORMATs a little more carefully, use
PARAMETERs, e.g.

       PARAMETER (UNIT1=10,FMT1=1)

     1 FORMAT ( ... )

and later

       WRITE (UNIT1,FMT1)  var_list

providing your Fortran allows a named constant in the FMT option. I've
worked on some that didn't, but it's handy sometimes for managing odd
problems.

maine@elxsi.dfrf.nasa.gov (Richard Maine) (09/13/90)

On 10 Sep 90 14:16:39 GMT, mac@harris.cis.ksu.edu (Myron A. Calhoun) said:

Myron> Unfortunately, re-used FORMATs cause programs to "break" during
Myron> maintenance.  The scenario is that someone modifies a FORMAT
Myron> and ONE of its associated I/O statements but doesn't notice
Myron> there are others.

Myron> Voila!  Instant "broken" program.

Hmmm. I'd been tempted to make a very similar point, but on the opposite
side.  The scenario is that two or more I/O statements each have separate
but equal (where have I heard that phrase :-)) format statements.  The
time comes to change the format (in some way that doesn't require explicit
change in the I/O statement).  You change one format, but miss the other(s).
Voila, "broken" program.  Perhaps it can't read its own files back in
correctly.

Myron> ...one thing I've learned is to NOT reuse FORMAT statements.

I partly agree, but with some small quibbles.

If two format statements are inherently required to be the same (a
prime example being the format statements used to write and read the
same data), then I think there should be only one format statement
(where practical.  If the read and write are in separate subroutines,
this may not be practical, though I have at least one case where I put
several formats in an include file to ensure that they were the same
in the read and write routines).

I view this as a case of the general programming principle that
you should avoid hidden dependencies between separate areas of
code.  If two pieces of code look identical and must remain
identical, perhaps there should be only one piece of code, called
as a subroutine or whatever.

By the same token, I agree that it is detrimental to maintainability
to combine 2 completely unrelated format statements that coincidentally
happen to look the same in the current version of the code.  That
would be just as bad an idea as replacing every instance of the
literal constant 10 with a PARAMETER named TEN.  (Might well be
a good idea to use parameters instead of the literal constants,
but they should be named something more functionally obvious and
those with different functions should have different names).

Myron> ....  I've learned a lot in 20 years

...not that I haven't made all of the mistakes alluded to above in
my 20 years (sigh).  Seems like there ought to be a better way to learn
than by making all the mistakes myself :-(.

--

Richard Maine
maine@elxsi.dfrf.nasa.gov [130.134.64.6]

jlg@lanl.gov (Jim Giles) (09/14/90)

From article <MAINE.90Sep13084658@altair.dfrf.nasa.gov>, by maine@elxsi.dfrf.nasa.gov (Richard Maine):
> On 10 Sep 90 14:16:39 GMT, mac@harris.cis.ksu.edu (Myron A. Calhoun) said:
> 
> Myron> Unfortunately, re-used FORMATs cause programs to "break" during
> Myron> maintenance.  The scenario is that someone modifies a FORMAT
> Myron> and ONE of its associated I/O statements but doesn't notice
> Myron> there are others.
> 
> Myron> Voila!  Instant "broken" program.
> 
> Hmmm. I'd been tempted to make a very similar point, but on the opposite
> side.   [...]

I agree with both sides.  The reason I made the recommendation which
started this thread was to address this very issue.  So I'll make the
recommendation again and point out the relevance:

      The format specification for an I/O statement should be included
      _within_ the I/O statement itself, unless the same format is used
      by several different I/O statements.

So, if a format only applies to one I/O statement, it is _within_ that
statement.  If it applies to more than one, it is specified separately.
So, the "Myron" problem cannot arise because if someone modifies a FORMAT
statement and only ONE of the associated I/O requests, he has violated
the convention: a _separate_ FORMAT _always_ applies to more than one I/O
request.  The user should have looked for all of them.
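
[A minimal sketch of this convention, not part of the original post.
The program name, label 200, and the values are made up for
illustration:]

      PROGRAM CONVEN
C     A format used by exactly one statement is written in-line.
      WRITE (*, '(1X, A, I4)') 'Count =', 3
C     A format shared by several statements gets its own FORMAT
C     statement, so its presence signals that other users exist.
      WRITE (*, 200) 'first ', 1.0
      WRITE (*, 200) 'second', 2.0
  200 FORMAT (1X, A6, F8.2)
      END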

J. Giles

misner@cod.NOSC.MIL (John Misner) (09/14/90)

In article <62893@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>
>      The format specification for an I/O statement should be included
>      _within_ the I/O statement itself, unless the same format is used
>      by several different I/O statements.
>
>So, if a format only applies to one I/O statement, it is _within_ that
>statement.  If it applies to more than one, it is specified separately.
>
>J. Giles

The problem with this is that some (e.g. UNIVAC and VAX/VMS) compilers
do only minimal checks on the in-line FORMAT - possibly limited to making
sure the format is a character variable or, if it is a constant or a
character PARAMETER, an additional check that it begins and ends with
'(' and ')'.  The run-time I/O then is instructed to do a complete parsing
of the format *EACH TIME* the statement is executed.  Apart from not
finding out until run time whether the format is valid, this adds
considerable processing (and therefore time) to the execution of the
code, even if there are no multiple executions involved.

I think (though I don't usually follow my own advice) that all format
statements should, for maintainability, be grouped together between
the STOP or RETURN statement and the END statement.  Most editors also
allow you to find out what line you are editing, hop to the nearest
END<endline>, and then return to the line you were editing.  My use
of in-line FORMATs is restricted to quick-and-dirty debug I/O.

J. Misner

burley@world.std.com (James C Burley) (09/14/90)

In article <90256.122601BOYDJ@QUCDN.BITNET> BOYDJ@QUCDN.QueensU.CA (Jeff Boyd) writes:

   If you want to manage your FORMATs a little more carefully, use
   PARAMETERs, e.g.

	  PARAMETER (UNIT1=10,FMT1=1)

	1 FORMAT ( ... )

   and later

	  WRITE (UNIT1,FMT1)  var_list

   providing your Fortran allows a named constant in the FMT option. I've
   worked on some that didn't, but it's handy sometimes for managing odd
   problems.

I don't think the example you give is valid -- but the wording "allows a
named constant" at the end suggests you knew what you wanted to do, just
got confused typing it.

A format specifier may not be a named constant (PARAMETER name).

It may be an integer variable to which one has ASSIGNed the desired FORMAT
statement (I don't recommend this for this situation).

It may also be a character array name or character expression, including a
named constant, which is what I think you meant.

(It may also be *, but that's list-directed formatting).

So your example perhaps should have been something like this:

      INTEGER UNIT1
      CHARACTER*(*) FORMAT1
      PARAMETER (UNIT1=10,FORMAT1='(...)')

      WRITE(UNIT1,FORMAT1) var_list

I.e. the FORMAT itself is within the PARAMETER statement.

As others have pointed out (and I did in an earlier discussion on this
same topic), the advantage of using a named constant for a FORMAT may be
outweighed by two disadvantages: your compiler may not syntax-check the
named-constant FORMAT to the same extent it does a FORMAT statement, and
the implementation may have to reinterpret it each time it is referenced
in an I/O statement at run time, rather than once at compile time.  Not
all compilers/systems suffer both (or even either) disadvantage, but it
can be a performance issue on some.

James Craig Burley, Software Craftsperson    burley@world.std.com

burley@world.std.com (James C Burley) (09/14/90)

Oops, I meant "A FORMAT specifier may not be an INTEGER named constant
(PARAMETER name)."  Forgot the "INTEGER" qualifier.  As I said later, it may
of course be a CHARACTER named constant.
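
[A complete version of the corrected example, offered as a sketch rather
than as part of the original post.  The program name, the constant name
FMT1, and the output list are made up for illustration.  Unit 6 is
assumed to be preconnected to standard output, which is common but not
required by the standard:]

      PROGRAM NAMFMT
      INTEGER UNIT1
      CHARACTER*(*) FMT1
C     The format text itself is the value of the CHARACTER
C     named constant; no FORMAT statement is involved.
      PARAMETER (UNIT1 = 6, FMT1 = '(1X, A, I5)')
      WRITE (UNIT1, FMT1) 'Value:', 123
      END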