levine@ics.uci.edu (David Levine) (12/03/89)
Point-by-point justifications are omitted from the guide. Unless
discussion of such details would be of general interest, please e-mail
questions to me.

David L. Levine, Dept. of ICS           Internet: levine@ics.uci.edu
University of California, Irvine        BITNET:   levine@ucivmsa
Irvine, CA 92717                        UUCP:     ucbvax!ucivax!levine

----------------------------------------------------------------

Fortran 77 Coding Guidelines

David L. Levine
1 Dec 1989
levine@ics.uci.edu
714-640-8662

I. Introduction

   The following guidelines are designed to encourage consistent coding
   across projects and programmers. Many arbitrary low-level decisions
   are made during coding. While many of these have no effect on the
   machine code, they do affect the appearance of the code. And,
   consistent practices enhance productivity and reusability. Project
   requirements, when applicable, take precedence.

   The goals of the guidelines are, in decreasing order of importance:
   1) understandability -- conveys the purpose of computations to the
      reader
   2) transportability -- between compilers on assorted modern operating
      systems
   3) maintainability -- can be readily enhanced
   4) efficiency -- execution speed

II. General

   1) Adhere to strict FORTRAN 77 as closely as possible, with the
      following exceptions.
      a) In addition to the standard character set (where the notation
         [0-9] indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
            space $ ' ( ) * + , - . / : = [0-9] [A-Z]
         the following characters may be used:
            ! # & [a-z]
      b) Identifier names may be up to 31 characters long (the first 6
         remain significant).
      c) INCLUDE statements may be used.
      d) INTEGER*4 may be used when INTEGER defaults to 2 bytes.
   2) Avoid compiler directives in code, especially if an equivalent
      compiler option is available. If a compiler directive must be
      used, comment its effect and target machine, operating system, and
      compiler, with revision numbers.
   3) Use a consistent set of compiler options, or maintain a makefile or
      command procedure to enable reconstruction. Suggested compiler
      options are listed in Appendix II.

III. Project Organization

   1) Use a standardized comment header at the top of every subprogram
      (see example in Appendix III).
   2) Group related subprograms into a module. The module is assigned a
      unique name and is typically stored in one or more files in a
      single subdirectory on the file system. If used, a single object
      library contains the code for all the module components.
   3) The first two letters of the routine and file name correspond to
      the module and the remaining four uniquely identify the routine.
      Additional characters may be used to create a more sensible name.

IV. Program Units

   1) Begin a main program with a PROGRAM statement.
   2) Order arguments as follows: inputs, outputs, control, external
      names.
   3) FUNCTIONs must not have side effects.
   4) Reference external functions in an EXTERNAL statement.
   5) Do not use alternate returns. Every subprogram should have single
      entry and exit points.

V. Statement format

   1) Use the standard source format (columns 1-5 for label, 6 for
      continuation, and 7-72 for statement).
   2) Indicate a non-blank comment with an * in column 1.
   3) Indicate a continuation with an & in column 6.
   4) Do not split a name across lines.
   5) Do not write more than one statement per line.

VI. Statement labels

   1) Assign a label only if necessary.
   2) Assign labels in ascending order.
   3) Assign a separate sequence of labels to FORMAT labels that are
      grouped at the end of a subprogram.
   4) Right-adjust labels.
   5) Do not use the label field of a continuation line.

VII. Capitalization

   1) Set keywords in all caps.
   2) Set symbolic names of constants (parameters) in all lower case.
   3) Set all other identifiers in initial caps with embedded words
      initial capped.

VIII. Spacing

   1) Do not use tabs.
   2) Write keywords without embedded spaces, but surround keywords with
      spaces.
   3) Do not put space between an array name and its index, Array(I).
   4) Put one space between a subprogram name and its argument list,
      e.g., Subr (Arg1, Arg2).
   5) Except in argument lists, put a space after an open parenthesis and
      before a close parenthesis.
   6) Put one space between arguments.
   7) Use spacing in equations to reveal operators and reinforce
      precedence.
   8) Use indentation to reinforce control flow; each indent is 3
      columns.
   9) Use whitespace to enhance readability.

IX. Identifier Selection

   1) Start with letter [A-Z], follow with only letters or digits.
   2) Limit to 31 characters; distinguish with first 6.
   3) Choose an identifier to represent the entity being modeled.
   4) Explain the significance of each variable and array in comments.
   5) Do not use a keyword as an identifier.
   6) Do not use a subprogram name as a COMMON block name.
   7) Do not abbreviate .TRUE. or .FALSE.
   8) Delimit character strings with apostrophes.

X. Constants

   1) Use PARAMETERs to symbolically name all compile-time constants.
   2) Use only constant expressions to define PARAMETERs.

XI. Typing

   1) Use the following data types:
         CHARACTER[*n]
         COMPLEX
         DOUBLE PRECISION
         INTEGER*4
         INTEGER
         LOGICAL
         REAL
   2) Use DOUBLE PRECISION instead of REAL*8.
   3) Use INTEGER for integers that are always in the range of two-byte
      integers [-32768,32767]; then use a compiler option to select two
      or four byte storage.
   4) Declare all variables. Use a compiler option to ensure declaration.
      If no such compiler option is available, setting the implicit type
      of all variables to a type that is not used in the program often
      snags undeclared variables. An example is:
      IMPLICIT COMPLEX*16 (A-Z).
   5) Arrange identifiers in type declarations logically based on the
      entities which they describe. Subprogram arguments may be declared
      in order of appearance in the argument list. If there is no other
      obvious order, arrange alphabetically.
   6) Do not compare arithmetic expressions of different types; type
      convert explicitly.

XII. Operators

   1) Do not use .EQ. and .NE. between floating point expressions.
   2) Use .GE. or .LE., as appropriate, instead of .EQ. when checking for
      a threshold crossing.
   3) Compare unequal length character strings with LGE, LGT, LLE, and
      LLT.
   4) Use only the logical operators .AND., .OR., .EQV., .NEQV., and
      .NOT., and only use them on LOGICAL operands.

XIII. Expressions

   1) Surround low precedence operators with space.
   2) Split an expression across lines after an operator.
   3) Indent continuation lines.
   4) Consider the types of operands and their effects on the values of
      subexpressions, e.g., 8 / 3 * 3.0 is 6.0, not 8.0.
   5) Be careful with "exact" values of floating point expressions, e.g.,
      assigning 30.0 / 0.1 to an integer may define it as 299!

XIV. Arrays

   1) Declare array dimensions in the type declaration rather than in a
      separate DIMENSION statement.
   2) Use only INTEGER subscript expressions.
   3) Preferably operate on arrays such that the first indices vary
      fastest and the last vary slowest.
   4) Specify all subscripts in any array reference.
   5) Do not exceed the bounds of declared array dimensions.

XV. Control structures

   1) Use GOTO carefully. See Appendix I for loop constructs. Comments at
      the target of a GOTO listing possible 'come froms' are very
      helpful.
   2) Terminate or begin every loop with a distinct CONTINUE.
   3) Do not jump into the middle of a loop or conditional.
   4) Use STOP only for abnormal termination, and include the reason in
      the character string message.

XVI. Arguments

   1) Match the actual arguments in the caller to the formal arguments of
      the callee in both number and type.
   2) Do not repeat an actual argument in any call.
   3) All arguments to an intrinsic function must be of the same type.
   4) Do not pass a constant as an actual argument unless it is to an IN
      formal (see Appendix III for definition).

XVII. COMMON blocks

   1) Only place data in COMMON blocks if necessary.
   2) Place COMMON block definitions in INCLUDE files.
   3) SAVE all COMMON blocks.
   4) Do not mix CHARACTER and non-character types in a COMMON block.
   5) Do not pass as an argument any variable referenced in a COMMON
      block in both the calling and called subprograms.
   6) Initialize COMMON blocks only in BLOCK DATA subprograms.
   7) Compile each BLOCK DATA subprogram with another program unit in
      which it is referred to with an EXTERNAL statement.
   8) Use EQUIVALENCE with care, and only to economize on storage; avoid
      aliasing, or use it only if well commented.

XVIII. Input/Output

   1) Use error recovery options END=, ERR=, and IOSTAT=, and handle all
      such conditions gracefully.
   2) Position a FORMAT statement immediately following its reference.
      Position FORMAT statements that are used more than once at the end
      of the subprogram.
   3) Use implied DO rather than DO loops.
   4) OPEN all files with STATUS = 'UNKNOWN' unless otherwise required.

BIBLIOGRAPHY

   The contributions of the following individuals are gratefully
   acknowledged. Many laudable suggestions were unilaterally discarded
   solely in the interest of keeping this guide as brief as possible. A
   style guide by its very nature is subjective. The primary intent of
   this guide is to draw attention to some of the intricacies of coding
   in the interest of encouraging consistency.

   Bierman, Keith, personal correspondence,
      <8911010025.AA06492@chiba.sun.com> (31 Oct 1989) and
      <8911160932.AA05443@chiba.sun.com> (16 November 1989).
   Caffin, R. N., "More on Fortran Coding Conventions," Fortran Forum
      3:3 (ACM, December 1984).
   Calhoun, Myron A., personal correspondence,
      <8911061456.AA28516@harris.cis.ksu.edu> (6 November 1989).
   Cox, Robert W., personal correspondence,
      <8910311522.AA03009@ilmarin.c3.lanl.gov> (31 Oct 1989).
   Liesenfeld, Ulrich, personal correspondence,
      <1955:uli@analyt.chemie.uni-bochum.dbp.de> (16 November 1989).
   Metcalf, Michael, FORTRAN Optimization (New York: Academic Press,
      Inc., 1982).
   Metcalf, Michael, "FORTRAN 77 Coding Conventions," ForTec Forum 2:4
      (ACM, December 1983).
   Miller, Geoff, "Bureau of Transport Economics Computer Users Guide,
      Attachment 1 - FORTRAN Programming Standards" (Canberra: Computer
      Centre, Australian Defence Force Academy, 8 August 1986).
   Montgomery, Peter, personal correspondence,
      <8910311752.AA20351@sonia.math.ucla.edu> (31 Oct 1989).

APPENDIX I. Loop Constructs

   1) iterative

      DO 10 I = 1, iterations
         . . .
   10 CONTINUE

      Do not modify the loop variable.

   2) while

   10 CONTINUE
         . . .
         IF ( condition ) GOTO 20
         . . .
      GOTO 10
   20 CONTINUE

   3) do-while or repeat-until (all statements in loop execute at least
      once)

   10 CONTINUE
         . . .
      IF ( condition ) GOTO 10

APPENDIX II. Suggested Compiler Options

   1) VAX FORTRAN: /check=(bounds,overflow)/g_float/standard/warnings=all
   2) MS fl: /4I2 /4Yd /Ox (/G2 for 80286/386 instructions)
   3) Sun f77: (fp is floating point hardware option, e.g., 68881;
      j is optimization level)
      Sun3 f77 version 1.2 and earlier:
         -ansi -ffp -u -Oj foo.f /usr/lib/fp/libm.il /usr/lib/libm.il -lm
      Sun3 f77 version 1.3:
         -ansi -fast -u -O3 foo.f -lm
      Sun4 f77 prior to version 1.2:
         -ansi -u -Oj foo.f /usr/lib/f77/libm.il -lm
      Sun4 f77 version 1.2:
         -ansi -u -Oj foo.f /usr/lib/f77/libm.il /usr/lib/f77/libm.il -lm
         (and use -dalign if DOUBLE PRECISION is used; may cause core
         dump if code is not all double word aligned)
      Sun4 f77 version 1.3:
         -ansi -fast -u -O3 foo.f -lm
   4) Convex fc: -ep i -Oj
      (i is number of processors and j is optimization level)

APPENDIX III. Example

   notes on example:
   1) The dividing lines end in column 72, to serve as a visual aid when
      working with editors that do not display cursor position.
   2) The 'Units' field is useful in physical applications; other
      parameter properties may be of interest in other applications.
   3) Arguments and common block entities that are referenced in the
      subprogram are listed in the prologue. Each may be classified as
      IN, OUT, or INOUT mode, following the Ada practice as shown below.
      A variable is defined if its value is changed, such as by
      assignment.
      A variable is used if its value, on entry to the subprogram, is
      referenced.

         mode     define        use
         ----     ------        ---
         IN       not allowed   allowed
         OUT      allowed       not allowed
         INOUT    allowed       allowed

      SUBROUTINE TEUpCase (String)
*
* ================== Prologue ==========================================
*
* Purpose:
*    Convert a string to all upper case characters.
*
* History:
*    Version   Programmer   Date      Description
*    -------   ----------   ----      -----------
*    1.0       D. Levine    3/16/89   created
*
* IN args/commons      Units   Description
* ---------------      -----   -----------
*
* OUT args/commons     Units   Description
* ----------------     -----   -----------
*
* INOUT args/commons   Units   Description
* ------------------   -----   -----------
* String               n/a     string to be converted
*
* Processing:
*    Convert each lower case character of the String to upper case by
*    adding the difference of the base upper and lower case characters,
*    'A' and 'a', respectively.
*
* Special requirements:
*    Assumes that [a..z] is mapped onto consecutive integers, and
*    [A..Z] is mapped onto consecutive integers.
*
* ------------------ Include files -------------------------------------
* ------------------ Constant declarations -----------------------------
* ------------------ Argument declarations -----------------------------
      CHARACTER*(*) String
* ------------------ Global/External declarations ----------------------
* ------------------ Local declarations --------------------------------
      CHARACTER C
      INTEGER Upper2Lower, Pos
* ------------------ Code ----------------------------------------------

      Upper2Lower = ICHAR ('A') - ICHAR ('a')

      DO 10 Pos = 1, LEN (String)
         C = String(Pos:Pos)
         IF ( LGE (C,'a') .AND. LLE (C,'z') ) THEN
            String(Pos:Pos) = CHAR (ICHAR (C) + Upper2Lower)
         ENDIF
   10 CONTINUE

      RETURN
      END
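Guidelines XIII.4 and XIII.5 above are easy to try on whatever compiler is at hand. The following is only a minimal sketch and is not part of the guide; the program name ExprDemo and its variable names are made up for illustration. Mixed comes out as 6.0 because the integer division is done first. The truncated count is deliberately not predicted here, since it depends on the machine's floating point arithmetic.

```fortran
      PROGRAM ExprDemo
*
*     Hypothetical demonstration of guidelines XIII.4 and XIII.5.
*     The program and variable names are illustrative only.
*
      INTEGER Count
      REAL    Mixed, Ratio

*     8 / 3 is INTEGER division and yields 2; the 2 is then
*     converted to REAL and multiplied by 3.0, giving 6.0.
      Mixed = 8 / 3 * 3.0

*     0.1 has no exact binary representation, so the quotient
*     may fall just below 300.0 on some machines.
      Ratio = 30.0 / 0.1

*     INT truncates toward zero; NINT rounds to nearest.
      Count = INT (Ratio)

      WRITE (*,*) 'Mixed     = ', Mixed
      WRITE (*,*) 'Truncated = ', Count
      WRITE (*,*) 'Rounded   = ', NINT (Ratio)

      END
```

Using NINT rather than relying on truncation is one way to avoid the 299 surprise when the value is known to be a whole number.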
levine@ics.uci.edu (David Levine) (12/03/89)
Here is the f77 style guide that I have put together. Many thanks to those who provided suggestions. As mentioned in the text, I did not use them all for various reasons.
johna@runxtsa.runx.oz.au (John Arndt) (03/17/90)
In the latter part of 1989 someone in the United States broadcast an
article titled "Fortran 77 Style Guide". I copied the article to my
word processor, cleared it and saved the empty text file - as well as
backing this void up. Would someone with a copy of the Style Guide mind
broadcasting it again? Thanks in advance.

John Arndt.
levine@crimee.ics.uci.edu (David Levine) (08/03/90)
Here is the f77 style guide that I have put together. Many thanks to
those who provided suggestions. As mentioned in the text, I did not use
them all for various reasons.

Point-by-point justifications are omitted from the guide. Unless
discussion of such details would be of general interest, please e-mail
questions to me.

David L. Levine, Dept. of ICS           Internet: levine@ics.uci.edu
University of California, Irvine        BITNET:   levine@ucivmsa
Irvine, CA 92717                        UUCP:     ucbvax!ucivax!levine

----------------------------------------------------------------

Fortran 77 Coding Guidelines

David L. Levine
11 Dec 1989
levine@ics.uci.edu
714-640-8662

I. Introduction

   The following guidelines are designed to encourage consistent coding
   across projects and programmers. Many arbitrary low-level decisions
   are made during coding. While many of these have no effect on the
   machine code, they do affect the appearance of the code. And,
   consistent practices enhance productivity and reusability. Project
   requirements, when applicable, take precedence.

   The goals of the guidelines are, in decreasing order of importance:
   1) understandability -- conveys the purpose of computations to the
      reader
   2) transportability -- between compilers on assorted modern operating
      systems
   3) maintainability -- can be readily enhanced
   4) efficiency -- execution speed

II. General

   1) Adhere to strict FORTRAN 77 as closely as possible, with the
      following exceptions.
      a) In addition to the standard character set (where the notation
         [0-9] indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
            space $ ' ( ) * + , - . / : = [0-9] [A-Z]
         the following characters may be used:
            ! # & [a-z]
      b) Identifier names may be up to 31 characters long (the first 6
         remain significant).
      c) INCLUDE statements may be used.
      d) INTEGER*4 may be used when INTEGER defaults to 2 bytes.
   2) Avoid compiler directives in code, especially if an equivalent
      compiler option is available.
      If a compiler directive must be used, comment its effect and
      target machine, operating system, and compiler, with revision
      numbers.
   3) Use a consistent set of compiler options, or maintain a makefile or
      command procedure to enable reconstruction. Suggested compiler
      options are listed in Appendix II.

III. Project Organization

   1) Use a standardized comment header at the top of every subprogram
      (see example in Appendix III).
   2) Group related subprograms into a module. The module is assigned a
      unique name and is typically stored in one or more files in a
      single subdirectory on the file system. If used, a single object
      library contains the code for all the module components.
   3) The first two letters of the routine and file name correspond to
      the module and the remaining four uniquely identify the routine.
      Additional characters may be used to create a more sensible name.

IV. Program Units

   1) Begin a main program with a PROGRAM statement.
   2) Order arguments as follows: inputs, outputs, control, external
      names.
   3) FUNCTIONs must not have side effects.
   4) Reference external functions in an EXTERNAL statement.
   5) Do not use alternate returns. Every subprogram should have single
      entry and exit points.

V. Statement format

   1) Use the standard source format (columns 1-5 for label, 6 for
      continuation, and 7-72 for statement).
   2) Indicate a non-blank comment with an * in column 1.
   3) Indicate a continuation with an & in column 6.
   4) Do not split a name across lines.
   5) Do not write more than one statement per line.

VI. Statement labels

   1) Assign a label only if necessary.
   2) Assign labels in ascending order.
   3) Assign a separate sequence of labels to FORMAT labels that are
      grouped at the end of a subprogram.
   4) Right-adjust labels.
   5) Do not use the label field of a continuation line.

VII. Capitalization

   1) Set keywords in all caps.
   2) Set symbolic names of constants (parameters) in all lower case.
   3) Set all other identifiers in initial caps with embedded words
      initial capped.

VIII. Spacing

   1) Do not use tabs.
   2) Write keywords without embedded spaces, but surround keywords with
      spaces.
   3) Do not put space between an array name and its index, Array(I).
   4) Put one space between a subprogram name and its argument list,
      e.g., Subr (Arg1, Arg2).
   5) Except in argument lists, put a space after an open parenthesis and
      before a close parenthesis.
   6) Put one space between arguments.
   7) Use spacing in equations to reveal operators and reinforce
      precedence.
   8) Use indentation to reinforce control flow; each indent is 3
      columns.
   9) Use whitespace to enhance readability.

IX. Identifier Selection

   1) Start with letter [A-Z], follow with only letters or digits.
   2) Limit to 31 characters; distinguish with first 6.
   3) Choose an identifier to represent the entity being modeled.
   4) Explain the significance of each variable and array in comments.
   5) Do not use a keyword as an identifier.
   6) Do not use a subprogram name as a COMMON block name.
   7) Do not abbreviate .TRUE. or .FALSE.
   8) Delimit character strings with apostrophes.

X. Constants

   1) Use PARAMETERs to symbolically name all compile-time constants.
   2) Use only constant expressions to define PARAMETERs.

XI. Typing

   1) Use the following data types:
         CHARACTER[*n]
         COMPLEX
         DOUBLE PRECISION
         INTEGER*4
         INTEGER
         LOGICAL
         REAL
   2) Use DOUBLE PRECISION instead of REAL*8.
   3) Use INTEGER for integers that are always in the range of two-byte
      integers [-32768,32767]; then use a compiler option to select two
      or four byte storage.
   4) Declare all variables. Use a compiler option to ensure declaration.
      If no such compiler option is available, setting the implicit type
      of all variables to a type that is not used in the program often
      snags undeclared variables. An example is:
      IMPLICIT COMPLEX*16 (A-Z).
   5) Arrange identifiers in type declarations logically based on the
      entities which they describe. Subprogram arguments may be declared
      in order of appearance in the argument list.
      If there is no other obvious order, arrange alphabetically.
   6) Do not compare arithmetic expressions of different types; type
      convert explicitly.

XII. Operators

   1) Do not use .EQ. and .NE. between floating point expressions.
   2) Use .GE. or .LE., as appropriate, instead of .EQ. when checking for
      a threshold crossing.
   3) Compare unequal length character strings with LGE, LGT, LLE, and
      LLT.
   4) Use only the logical operators .AND., .OR., .EQV., .NEQV., and
      .NOT., and only use them on LOGICAL operands.

XIII. Expressions

   1) Surround low precedence operators with space.
   2) Split an expression across lines after an operator.
   3) Indent continuation lines.
   4) Consider the types of operands and their effects on the values of
      subexpressions, e.g., 8 / 3 * 3.0 is 6.0, not 8.0.
   5) Be careful with "exact" values of floating point expressions, e.g.,
      assigning 30.0 / 0.1 to an integer may define it as 299!

XIV. Arrays

   1) Declare array dimensions in the type declaration rather than in a
      separate DIMENSION statement.
   2) Use only INTEGER subscript expressions.
   3) Preferably operate on arrays such that the first indices vary
      fastest and the last vary slowest.
   4) Specify all subscripts in any array reference.
   5) Do not exceed the bounds of declared array dimensions.

XV. Control structures

   1) Use GOTO carefully. See Appendix I for loop constructs. Comments at
      the target of a GOTO listing possible 'come froms' are very
      helpful.
   2) Terminate or begin every loop with a distinct CONTINUE.
   3) Do not jump into the middle of a loop or conditional.
   4) Use STOP only for abnormal termination, and include the reason in
      the character string message.

XVI. Arguments

   1) Match the actual arguments in the caller to the formal arguments of
      the callee in both number and type.
   2) Do not repeat an actual argument in any call.
   3) All arguments to an intrinsic function must be of the same type.
   4) Do not pass a constant as an actual argument unless it is to an IN
      formal (see Appendix III for definition).

XVII. COMMON blocks

   1) Only place data in COMMON blocks if necessary.
   2) Place COMMON block definitions in INCLUDE files.
   3) SAVE all COMMON blocks.
   4) Do not mix CHARACTER and non-character types in a COMMON block.
   5) Do not pass as an argument any variable referenced in a COMMON
      block in both the calling and called subprograms.
   6) Initialize COMMON blocks only in BLOCK DATA subprograms.
   7) Compile each BLOCK DATA subprogram with another program unit in
      which it is referred to with an EXTERNAL statement.
   8) Use EQUIVALENCE with care, and only to economize on storage; avoid
      aliasing, or use it only if well commented.

XVIII. Input/Output

   1) Use error recovery options END=, ERR=, and IOSTAT=, and handle all
      such conditions gracefully.
   2) Position a FORMAT statement immediately following its reference.
      Position FORMAT statements that are used more than once at the end
      of the subprogram.
   3) Use implied DO rather than DO loops.
   4) OPEN all files with STATUS = 'UNKNOWN' unless otherwise required.

BIBLIOGRAPHY

   The contributions of the following individuals are gratefully
   acknowledged. Many laudable suggestions were unilaterally discarded
   solely in the interest of keeping this guide as brief as possible. A
   style guide by its very nature is subjective. The primary intent of
   this guide is to draw attention to some of the intricacies of coding
   in the interest of encouraging consistency.

   Bierman, Keith, personal correspondence,
      <8911010025.AA06492@chiba.sun.com> (31 Oct 1989) and
      <8911160932.AA05443@chiba.sun.com> (16 November 1989).
   Caffin, R. N., "More on Fortran Coding Conventions," Fortran Forum
      3:3 (ACM, December 1984).
   Calhoun, Myron A., personal correspondence,
      <8911061456.AA28516@harris.cis.ksu.edu> (6 November 1989).
   Cox, Robert W., personal correspondence,
      <8910311522.AA03009@ilmarin.c3.lanl.gov> (31 Oct 1989).
   Liesenfeld, Ulrich, personal correspondence,
      <1955:uli@analyt.chemie.uni-bochum.dbp.de> (16 November 1989).
   Metcalf, Michael, FORTRAN Optimization (New York: Academic Press,
      Inc., 1982).
   Metcalf, Michael, "FORTRAN 77 Coding Conventions," ForTec Forum 2:4
      (ACM, December 1983).
   Miller, Geoff, "Bureau of Transport Economics Computer Users Guide,
      Attachment 1 - FORTRAN Programming Standards" (Canberra: Computer
      Centre, Australian Defence Force Academy, 8 August 1986).
   Montgomery, Peter, personal correspondence,
      <8910311752.AA20351@sonia.math.ucla.edu> (31 Oct 1989).
   Watson, Ian, personal correspondence,
      <8912061409.AA05518@Kodak.COM> (6 Dec 1989).

APPENDIX I. Loop Constructs

   1) iterative

      DO 10 I = 1, iterations
         . . .
   10 CONTINUE

      Do not modify the loop variable.

   2) while

   10 CONTINUE
         . . .
         IF ( condition ) GOTO 20
         . . .
      GOTO 10
   20 CONTINUE

   3) do-while or repeat-until (all statements in loop execute at least
      once)

   10 CONTINUE
         . . .
      IF ( condition ) GOTO 10

APPENDIX II. Suggested Compiler Options

   1) VAX FORTRAN: /check=(bounds,overflow)/g_float/standard/warnings=all
   2) MS fl: /4I2 /4Yd /Ox (/G2 for 80286/386 instructions)
   3) Sun f77: (fp is floating point hardware option, e.g., 68881;
      j is optimization level)
      Sun3 f77 version 1.2 and earlier:
         -ansi -ffp -u -Oj foo.f /usr/lib/fp/libm.il /usr/lib/libm.il -lm
      Sun3 f77 version 1.3:
         -ansi -fast -u -O3 foo.f -lm
      Sun4 f77 prior to version 1.2:
         -ansi -u -Oj foo.f /usr/lib/f77/libm.il -lm
      Sun4 f77 version 1.2:
         -ansi -u -Oj foo.f /usr/lib/f77/libm.il /usr/lib/f77/libm.il -lm
         (and use -dalign if DOUBLE PRECISION is used; may cause core
         dump if code is not all double word aligned)
      Sun4 f77 version 1.3:
         -ansi -fast -u -O3 foo.f -lm
   4) Convex fc: -ep i -Oj
      (i is number of processors and j is optimization level)

APPENDIX III. Example

   notes on example:
   1) The dividing lines end in column 72, to serve as a visual aid when
      working with editors that do not display cursor position.
   2) The 'Units' field is useful in physical applications; other
      parameter properties may be of interest in other applications.
   3) Arguments and common block entities that are referenced in the
      subprogram are listed in the prologue. Each may be classified as
      IN, OUT, or INOUT mode, following the Ada practice as shown below.
      A variable is defined if its value is changed, such as by
      assignment. A variable is used if its value, on entry to the
      subprogram, is referenced.

         mode     define        use
         ----     ------        ---
         IN       not allowed   allowed
         OUT      allowed       not allowed
         INOUT    allowed       allowed

      SUBROUTINE TEUpCase (String)
*
* ================== Prologue ==========================================
*
* Purpose:
*    Convert a string to all upper case characters.
*
* History:
*    Version   Programmer   Date       Description
*    -------   ----------   ----       -----------
*    1.0       D. Levine    03/16/89   created
*    1.1       D. Levine    12/11/89   changed to INDEX into
*                                      constant string from adding
*                                      'A' - 'a' to each char.
*
* IN args/commons      Units   Description
* ---------------      -----   -----------
*
* OUT args/commons     Units   Description
* ----------------     -----   -----------
*
* INOUT args/commons   Units   Description
* ------------------   -----   -----------
* String               N/A     string to be converted
*
* Processing:
*    For each character in String, check to see if it is lower case by
*    finding its INDEX in the string [a..z]. Then, replace with the
*    character at the same position in the string [A..Z].
*
* Special requirements:
*    none
*
* ------------------ Include files -------------------------------------
* ------------------ Constant declarations -----------------------------
* ------------------ Argument declarations -----------------------------
      CHARACTER*(*) String
* ------------------ Global/External declarations ----------------------
* ------------------ Local declarations --------------------------------
      INTEGER Pos, I
      CHARACTER*26 lowers, uppers
      DATA lowers / 'abcdefghijklmnopqrstuvwxyz' /,
     &     uppers / 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' /
* ------------------ Code ----------------------------------------------

      DO 100 I = 1, LEN (String)
         Pos = INDEX (lowers, String(I:I))
         IF ( Pos .NE. 0 ) String(I:I) = uppers(Pos:Pos)
  100 CONTINUE

      RETURN
      END
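A short driver shows how the routine above might be called. This is a sketch under assumptions: the program name TEDemo and the sample text are invented here, and the body of TEUpCase is repeated in compact form, without its prologue, only so that the block compiles on its own.

```fortran
      PROGRAM TEDemo
*
*     Hypothetical driver for TEUpCase; not part of the guide.
*
      CHARACTER*20 Text

      Text = 'Fortran 77 style'
      CALL TEUpCase (Text)
*     Blanks and digits have no entry in the lower case table,
*     so they are left unchanged.
      WRITE (*,*) Text

      END

      SUBROUTINE TEUpCase (String)
*     Version 1.1 of the example above, prologue omitted.
      CHARACTER*(*) String
      INTEGER Pos, I
      CHARACTER*26 lowers, uppers
      DATA lowers / 'abcdefghijklmnopqrstuvwxyz' /,
     &     uppers / 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' /

      DO 100 I = 1, LEN (String)
         Pos = INDEX (lowers, String(I:I))
         IF ( Pos .NE. 0 ) String(I:I) = uppers(Pos:Pos)
  100 CONTINUE

      RETURN
      END
```

The driver should print FORTRAN 77 STYLE followed by trailing blanks. Text is passed as a variable rather than a character constant because String is an INOUT formal (guideline XVI.4).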
jlg@lanl.gov (Jim Giles) (08/03/90)
This is not a disagreement with the style guide David Levine gave but
merely a commentary on it. I say this because I will only discuss those
parts of his submission that I disagree with. As a result, I might be
mistaken for a dissenter. In fact, as will be seen, I don't discuss most
of what he says (and I therefore agree with those parts).

From article <26B89BE1.4349@ics.uci.edu>, by levine@crimee.ics.uci.edu (David Levine):
> [...]
> a) In addition to the standard character set (where the notation [0-9]
>    indicates the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9)
>       space $ ' ( ) * + , - . / : = [0-9] [A-Z]
>    the following characters may be used:
>       ! # & [a-z]

Believe it or not, there are still implementations without these
characters (or, at least, compilers which still don't recognize them).
You can use these as long as you have access to some tool to remove or
replace them when you need to (this may be never - so maybe you're
safe).

> b) Identifier names may be up to 31 characters long (the first 6
>    remain significant).

_NEVER_ use more characters in an identifier than the compiler considers
significant. _NEVER_ buy a compiler which allows an identifier to
contain insignificant characters. Yes, the 6 character limit of standard
Fortran is irritating. Tricking yourself with identifiers that _look_
different to you, but might not be to the compiler, is _not_ a good
solution.

> c) INCLUDE statements may be used.

Well, just don't count on the _compiler_ to do it. Many compilers don't.
Any half-way acceptable text editor ought to have sufficient macro
definition capability to expand the INCLUDE.

> d) INTEGER*4 may be used when INTEGER defaults to 2 bytes.

Not available everywhere. Just like the lower case letters, you should
only use this if you have an automatic tool to remove them - otherwise
you've got to do it by hand.

> 2) Avoid compiler directives in code, especially if an equivalent
>    compiler option is available.

If the directives in code do the same thing that command line directives
do, there's no reason for the vendor to supply both. You can't make
rigid rules about this issue without assuming things that you don't know
about what the directives might do and what the priorities of the
programmers are.

> 3) The first two letters of the routine and file name correspond to the
>    module and the remaining four uniquely identify the routine.
>    Additional characters may be used to create a more sensible name.

See above for comments on insignificant identifier characters.

> VI. Statement labels
> 1) Assign a label only if necessary.
> 2) Assign labels in ascending order.

My only complaint here is wording. In Fortran, the word "ASSIGN" has a
specific meaning which isn't intended here. Replace "assign" with
"declare" in all the stuff above, and I'll agree with it.

> 3) Assign a separate sequence of labels to FORMAT labels that are grouped
>    at the end of a subprogram.

Same comment about "assign" plus the following: whenever possible, put
the format into the I/O statement which uses it. Use labeled formats
only when several I/O statements share the same (long) format.

> VII. Capitalization
> 1) Set keywords in all caps.
> 2) Set symbolic names of constants (parameters) in all lower case.
> 3) Set all other identifiers in initial caps with embedded words initial
>    capped.

Way too rigid. In a language like C, where case is significant, such
rigid rules are important in order to share code among users (or across
time). In Fortran, case should be used freely (with comments) for
emphasis and documentation.

> VIII. Spacing
> [...]
> 3) Do not put space between and array name and its index, Array(I).
> 4) Put one space between a subprogram name and its argument list, e.g.,
>    Subr (Arg1, Arg2).
> 5) Except in argument lists, put a space after an open parenthesis and
>    before a close parenthesis.
> 6) Put one space between arguments.
> [...]
> 8) Use indentation to reinforce control flow; each indent is 3 columns.
> [...]

Again too rigid. Spaces should be used to enhance legibility, not
to conform to some rigid style rules. All these things, but particularly
how many columns to indent, should be up to the individual programmer.
Any other decision doesn't allow for the different constraints of
different applications and users. As long as a program uses space in
a reasonably consistent way, it doesn't matter what the particular
rules are.

> 2) Limit to 31 characters; distinguish with first 6.

Again, _NEVER_ use an identifier with more characters than are
significant.

> 1) Use PARAMETERs to symbolically name all compile-time constants.

A little too rigid. If the constant is something that's likely to
change in future versions of the code, name it. If the constant
is hard to remember (or long and prone to typing errors), then name
it. Otherwise don't name it. I've seen people who've named 1 (one),
and then used that everywhere - this doesn't enhance legibility or
maintainability at all.

> XI. Typing
> [...]
> INTEGER*4

Not universally available.

> [...]
> 4) Declare all variables. Use a compiler option to ensure declaration.
> If no such compiler option is available, setting the implicit type of
> all variables to a type that is not used in the program often snags
> undeclared variables. An example is: IMPLICIT COMPLEX*16 A-Z.

A better 'snag' is IMPLICIT LOGICAL A-Z. Complex has the disadvantage
that arithmetic (the most common variable use) is still legal on it.
Logical is better because it is legal in fewer contexts.

> XII. Operators
> [...]
> 3) Compare unequal length character strings with LGE, LGT, LLE, and LLT.

Compare _all_ character strings this way. Otherwise you get a
non-portable lexicographic ordering from some machines.

> XIII. Expressions
> [...]
> 2) Split an expression across lines after an operator.

Obviously, only if the line _needs_ splitting :-).

> XIV. Arrays
> 1) Declare array dimensions in the type declaration rather than in a
> separate DIMENSION statement.

In fact, _never_ use the DIMENSION statement. Means the same thing,
I just like it stated as clearly as possible.

> 3) Preferably operate on arrays such that the first indices vary fastest
> and the last vary slowest.

An important issue in bygone days and still an optimization issue on
scalar machines with small caches. This isn't a disagreement, but I'd
like to point out that there are other orders which might be desirable.
For example, on a vector machine, you want the index with the longest
dimension to be on the inner loop (longer vectors).

> XVII. COMMON blocks
> [...]
> 3) SAVE all COMMON blocks.

I've requested the Fortran committee for years to make SAVE the default
for COMMON.

> 6) Initialize COMMON blocks only in BLOCK DATA subprograms.

In spite of the fact that the standard calls for this, I am not sure
that I agree. If MODULES existed, the obvious preferred place to
initialize globals would be in the MODULE. But, for COMMON, I think
BLOCK DATA was always a clumsy solution. Anyway, _wherever_ you
do it, you _should_ initialize _all_ common data.

> 7) Compile BLOCK DATA subprograms with another program unit in which it
> is referred to with an EXTERNAL statement.

This is why I'm not sure I agree with using BLOCK DATA at all.

> XVIII. Input/Output
> 2) Position a FORMAT statement immediately following its reference.

Position formats _in_ the I/O statement that refer to them if possible.

> APPENDIX I. Loop Constructs
> 2) while
> 10 CONTINUE
> . . .
> IF ( condition ) GOTO 20
> . . .
> GOTO 10
> 20 CONTINUE

If it's _really_ a while construct, there should _NEVER_ be any code
between the beginning of the loop and the test. This more strongly
resembles an infinite loop with an exit condition buried in the middle.

J. Giles
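
A minimal sketch of the while-loop remark above, assuming a simple
counting loop (the program name WHILE1 and its variables are
hypothetical, not from Levine's appendix). The exit test is the first
executable statement after the loop label, so no code runs between the
top of the loop and the test. The WRITE carries its format inside the
I/O statement, as the comment on section XVIII suggests.

```fortran
      PROGRAM WHILE1
*     Sum the integers 1 through 5 with a "while" loop whose test
*     comes immediately after the loop label.
      INTEGER I, TOTAL
      I = 1
      TOTAL = 0
   10 CONTINUE
      IF (I .GT. 5) GOTO 20
         TOTAL = TOTAL + I
         I = I + 1
      GOTO 10
   20 CONTINUE
*     The format is a character constant in the WRITE itself.
      WRITE (*, '(A, I3)') ' TOTAL =', TOTAL
      END
```

The loop body runs five times and the reported total should be 15
(1+2+3+4+5). Moving any statement between label 10 and the IF would
turn this into the "exit in the middle" form criticized above.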
mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (08/03/90)
In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>
> [...(quoting David Levine's style guide) ...]
>
>> 1) Use PARAMETERs to symbolically name all compile-time constants.
>
> A little too rigid. If the constant is something that's likely to
> change in future versions of the code, name it. If the constant
> is hard to remember (or long and prone to typing errors), then name
> it. Otherwise don't name it. I've seen people who've named 1 (one),
> and then used that everywhere - this doesn't enhance legibility or
> maintainability at all.

There is not much need to declare a parameter for integer 1, but there
is an advantage in declaring parameters for fractional floating-point
numbers, since the precision can then be changed by an IMPLICIT
statement rather than searching/replacing through the whole code.
Unfortunately, X3J3 seems to have ignored my requests that something
intelligent be done about this in Fortran-90....

On another subject, the spacing and capitalization rules can (almost)
all be handled by the TOOLPACK utility 'pol'. The only thing that I can
think of that it cannot do is capitalize the first letter of each word
in identifiers which are not keywords or variables (I guess that means
subroutine and function names). TOOLPACK currently limits the user to
6-character identifiers, but also provides an interactive naming
facility. I don't know if it would be difficult to hack 31-character
names into the code....
--
John D. McCalpin mccalpin@perelandra.cms.udel.edu
Assistant Professor mccalpin@vax1.udel.edu
College of Marine Studies, U. Del. J.MCCALPIN/OMNET
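
A small sketch of the IMPLICIT idea above, under the assumption that
the named constants are implicitly typed (the names HALF, TENTH, X and
Y are hypothetical). Because HALF and TENTH take their type from the
IMPLICIT statement, editing that one line changes the precision of
every use. Writing the values with a D exponent is a choice made here
so that no digits are lost whichever type is in force.

```fortran
      PROGRAM PREC
*     HALF, TENTH, X and Y are all typed by the IMPLICIT statement.
*     Change DOUBLE PRECISION to REAL below to switch precision.
      IMPLICIT DOUBLE PRECISION (A-H, O-Z)
      PARAMETER (HALF = 0.5D0, TENTH = 0.1D0)
      X = 3.0D0
      Y = HALF*X + TENTH
      WRITE (*, '(A, F12.8)') ' Y =', Y
      END
```

With literal constants such as 0.1 scattered through the code, the
same change would mean finding and editing each one.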
julian@cernvax.UUCP (julian bunn) (08/03/90)
David Levine has recently posted an interesting set of Fortran 77 coding guidelines, itemized in 18 parts. In this posting I describe which of Levine's guidelines may be automatically checked using Floppy, a Fortran coding convention checker, the source of which was posted to comp.sources.misc some weeks ago. I also describe which guidelines are most commonly agreed to here at CERN, which may also be checked by Floppy, and which do not appear in Levine's list. ---------------------------------------------------------------------- The guideline headings and numbers below are as they appeared in David Levine's posting. IV. Program Units 3) FUNCTIONs must not have side effects. Floppy checks for I/O in functions. 4) Reference external functions in an EXTERNAL statement. Floppy checks this. 5) Do not use alternate returns. Every subprogram should have single entry and exit points. Floppy checks for alternate returns. V. Statement format 2) Indicate a non-blank comment with an * in column 1. Floppy will check for comment lines not beginning with "C" VI. Statement labels 2) Assign labels in ascending order. Floppy will optionally re-order all statement labels in ascending order, with fixed start and step size. 3) Assign a separate sequence of labels to FORMAT labels that are grouped at the end of a subprogram. Floppy will optionally do this, too. 4) Right-adjust labels. Floppy checks that labels do not begin in column 1. VIII. Spacing 2) Write keywords without embedded spaces, but surround keywords with spaces. Floppy checks for embedded blanks, not only in keywords, but also in variable names etc.. 8) Use indentation to reinforce control flow; each indent is 3 columns. Floppy will optionally re-indent the source, the indent step for each level being user-defined, between 1 and 5. IX. Identifier Selection 5) Do not use a keyword as an identifier. Floppy checks this. 6) Do not use a subprogram name as a COMMON block name. Floppy checks this. XI. 
Typing 1) Use the following data types: CHARACTER[*n] COMPLEX DOUBLE PRECISION INTEGER*4 Floppy warns against INTEGER*4! INTEGER LOGICAL REAL 2) Use DOUBLE PRECISION instead of REAL*8. Floppy warns against REAL*8. 6) Do not compare arithmetic expressions of different types; type convert explicitly. Floppy checks for mixed mode expressions, e.g. A = B/I. XII. Operators 3) Compare unequal length character strings with LGE, LGT, LLE, and LLT. Floppy checks this. XV. Control structures 4) Use STOP only for abnormal termination, and include the reason in the character string message. Floppy checks that a STOP statement is immediately preceded by a WRITE. XVIII. Input/Output 2) Position a FORMAT statement immediately following its reference. Position FORMAT statements that are used more than once at the end of the subprogram. Floppy will optionally move all FORMAT statement to the end of the module. ----------------------------------------------------------------------------- Other guidelines in common use at CERN. (The numbers at the left refer to the rule number in Floppy.) IV. Program Units 1 Avoid comment lines after end of module 2 End all program modules with the END statement 11 Avoid comment lines before module declaration 12 Module names should not be the same as intrinsic names 13 First statement in a module should be declaration 14 Module should begin with at least 3 comment lines 29 Avoid the use of ENTRY in FUNCTIONs 36 Module names should all be different V. Statement format 40 Separate Statement Functions by comment lines 41 No names in Statement Function definitions elsewhere VI. Statement labels 27 Statement labels should not begin in column 1 IX. Identifier Selection 9 Integer variables should begin with I to N 6 Variable names should be 6 or fewer characters long XIII. Expressions 16 No comment lines between continuation lines XV. Control structures 26 Avoid the use of PAUSE statements XVI. 
Arguments 38 Length of passed CHARACTER variables should be * 44 Passed arguments should be dimensioned * in module XVII. COMMON blocks 3 Declared COMMON blocks must be used in the module 4 COMPLEX and DOUBLEPRECISION variables at end of COMMON 5 COMMON block definitions should not change between modules 18 Avoid multiple COMMON definitions per line 19 Do not dimension COMMON variables outside COMMON 32 COMMON block names should not equal variable names XVIII. Input/Output 22 Avoid the use of PRINT statements (use WRITE) 24 Avoid WRITE(* construction 30 Avoid using I/O in FUNCTIONs
desj@ccrwest.UUCP (David desJardins) (08/04/90)
In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> 6) Initialize COMMON blocks only in BLOCK DATA subprograms.
>
>[...] Anyway, _wherever_ you
>do it, you _should_ initialize _all_ common data.

I find it hard to believe that you really mean this (especially as
you are writing from LANL). On Crays at least, and probably on a lot
of other machines, initialized storage is included, byte for byte, in
the executable (a.out) file. In fact, if any item in the common block
is initialized, the whole block is included.

This makes it *extremely* undesirable to write code like

      INTEGER ARRAY(10**8)
      LOGICAL FLAG
      COMMON /FOO/ ARRAY, FLAG
      DATA FLAG/.FALSE./

much less to initialize the array itself. (The above is a natural
thing to write if you want to initialize a table the first time it is
used, but not subsequently.) Doing this causes you to discover rapidly
just how much (or how little) disk space is available to you :-).

For that matter, I disagree with the whole premise that arrays should
be initialized. If the initialization is not necessary for the
functioning of the program, it seems likely to mislead the reader, who
will probably think that the initialized value has some particular
purpose. And, as noted above, it is probably better to initialize
large arrays in code rather than with DATA statements anyway.

-- David desJardins
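
One way the "initialize in code" suggestion might look, as a sketch
only (the names DEMO, TABSET and FIRST and the table size are
hypothetical). The first-time flag is a local SAVEd variable rather
than a member of /FOO/, so the DATA statement initializes one LOGICAL
and leaves the COMMON block without any DATA initialization. Whether
that keeps the block out of the executable depends on the loader.

```fortran
      PROGRAM DEMO
      INTEGER N
      PARAMETER (N = 100000)
      INTEGER ARRAY(N)
      COMMON /FOO/ ARRAY
      SAVE /FOO/
      CALL TABSET
      ARRAY(1) = 7
*     The second call must not refill the table.
      CALL TABSET
      WRITE (*, '(A, I2)') ' ARRAY(1) =', ARRAY(1)
      END

      SUBROUTINE TABSET
*     Fill the table in /FOO/ on the first call only.
      INTEGER N
      PARAMETER (N = 100000)
      INTEGER ARRAY(N)
      COMMON /FOO/ ARRAY
      SAVE /FOO/
      INTEGER I
      LOGICAL FIRST
      SAVE FIRST
      DATA FIRST /.TRUE./
      IF (FIRST) THEN
         DO 10 I = 1, N
            ARRAY(I) = 0
   10    CONTINUE
         FIRST = .FALSE.
      END IF
      END
```

The value stored between the two calls survives the second call,
which is the "first time it is used, but not subsequently" behaviour
described above.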
jlg@lanl.gov (Jim Giles) (08/04/90)
From article <348@ccrwest.UUCP>, by desj@ccrwest.UUCP (David desJardins):
> In article <59012@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>>[...] Anyway, _wherever_ you
>>do it, you _should_ initialize _all_ common data.
>
> I find it hard to believe that you really mean this (especially as
> you are writing from LANL). On Crays at least, and probably on a lot
> of other machines, initialized storage is included, byte for byte, in
> the executable (a.out) file. In fact, if any item in the common block
> is initialized, the whole block is included.

Yes, I know. It's one of the things that I've been asked to look into.
I do local site analysis on SEGLDR, the loader on UNICOS. Still, I don't
see why you don't want the arrays initialized. There _are_ other
mechanisms than DATA statements, you know. For example, there is a
SEGLDR directive that is _supposed_ to initialize all memory that isn't
otherwise initialized to a number you can supply at load time -
uninitialized data are still compressed out of the executable and the
initialization is done by the startup routine (unfortunately, it's not
implemented yet - another thing to "look into" - which means _I_
probably have to fix it.)

> [...] If the initialization is not necessary for the
> functioning of the program, it seems likely to mislead the reader, who
> will probably think that the initialized value has some particular
> purpose.

It _does_ have a purpose. It is to prevent accidental use of
meaningless data. I initialize stuff to NANs or (on the Cray) to
infinity. Integers are a bit harder. Still, I want accidental use of
meaningless data to stand out as clearly as possible.

> [...] And, as noted above, it is probably better to initialize
> large arrays in code rather than with DATA statements anyway.

Yes, that's a way too. (It is, in fact what the SEGLDR directive is
supposed to do.) Like I say, you should _always_ initialize your data.

J. Giles
burley@world.std.com (James C Burley) (08/05/90)
I'm not too enamoured with the idea of encoding FORMAT strings as constants in I/O statements. Separate FORMAT statements, as ugly as they are, offer two advantages: 1) The compiler is forced to check the contents of the format for syntactic correctness, at least to parse hollerith and character data. It is not so required for an in-line format, which it need only ensure is a valid character constant. So programmer errors are more likely to be detected at compile time with a separate statement. 2) Some systems are likely to optimize a FORMAT statement by compiling it into an intermediate representation at compile time, but postpone doing anything about a constant FORMAT specifier in an I/O statement until a given execution of that I/O statement. So formatted I/O is likely to take less CPU time with separate FORMAT statements. Note that a system that translates constant FORMATs in I/O statements also must effectively check it for legitimacy, so if #2 does not apply to a given system, #1 probably doesn't also, and vice versa. Also note that I intend to make GNU Fortran check and compile FORMAT constants just like FORMAT statements, if I can, because aside from these two considerations, I think separate FORMAT statements are pretty silly anyway. Even if you want to use a FORMAT specifier for more than one statement, I'd prefer people use the more general PARAMETER mechanism to achieve that. But if your target system(s) suffers from either or both of the above ailments, then it might be best to stick with separate FORMAT statements everywhere you can. James Craig Burley, Software Craftsperson burley@world.std.com
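
For concreteness, a minimal sketch of the two forms being compared
(the program name FMTDEM and the values are hypothetical). Both WRITE
statements describe the same record layout; the difference discussed
above is only in when a given compiler parses the format.

```fortran
      PROGRAM FMTDEM
      INTEGER K
      REAL X
      K = 42
      X = 2.5
*     Separate FORMAT statement: parsed by the compiler.
      WRITE (*, 100) K, X
  100 FORMAT (1X, I5, F8.3)
*     The same format as a character constant in the WRITE; some
*     systems may only examine its contents at run time.
      WRITE (*, '(1X, I5, F8.3)') K, X
      END
```

Both statements should print identical lines. A mistake such as an
unbalanced parenthesis would be a compile-time error in the FORMAT
statement, but could go unnoticed until execution in the second form
on a system that does not check character-constant formats.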
burley@world.std.com (James C Burley) (08/05/90)
> XVI. Arguments
> 38 Length of passed CHARACTER variables should be *
> 44 Passed arguments should be dimensioned * in module

I'd be interested in knowing the reasoning behind these two. If you
KNOW you want to pass, say, a 6-character string, why burden the code
generation (and thus the run-time speed) of the receiving procedure
with having to retrieve the actual length from the caller?
Furthermore, a checking compiler might be able to figure out that a
passed variable's length disagrees with the receiver's declared length
more easily and more certainly than figure out that an arbitrary
reference to a *-length dummy (such as FOO(6:6), the sixth character)
is invalid because the actual argument is not long enough.

Similar question about 44, since a checking compiler can't easily
check array bounds once the array is passed into a procedure that
declares it as an assumed-size array. Also, based on personal
experience, if you try and port your application to a
host/attached-processor combination that allows you to split routines
between host (typically I/O-bound routines) and attached processor
(typically compute-bound routines), it won't know how much data of an
assumed-size array to pass. So instead of

      REAL FUNCTION SUM(ARRAY,SIZE)
      REAL ARRAY(*)
      INTEGER SIZE

you say

      REAL FUNCTION SUM(ARRAY,SIZE)
      INTEGER SIZE
      REAL ARRAY(SIZE)

which either generates the exact same code for SUM or, for a checking
compiler, adds bounds checking to references to ARRAY within SUM, but
which also provides size information for ARRAY usable in passing the
contents of dummy arguments across processor boundaries.

Other than some of these minor questions, I think all these postings
about coding styles look pretty good, though some of the things (like
capitalization) I'd rather leave to automated tools than breaking my
fingers trying to obey!!

James Craig Burley, Software Craftsperson burley@world.std.com
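
A complete version of the second form, as a sketch: the function is
renamed ARRSUM here (a hypothetical name, chosen to avoid clashing
with the SUM intrinsic of later Fortran standards), and the driver
program SUMDEM is likewise an assumption added to make the example
self-contained.

```fortran
      PROGRAM SUMDEM
      REAL ARRSUM
      EXTERNAL ARRSUM
      REAL A(4), TOTAL
      DATA A /1.0, 2.0, 3.0, 4.0/
      TOTAL = ARRSUM(A, 4)
      WRITE (*, '(A, F6.1)') ' SUM =', TOTAL
      END

      REAL FUNCTION ARRSUM(ARRAY, SIZE)
*     The dummy array is dimensioned by the SIZE argument, so a
*     checking compiler knows its bounds inside this function.
      INTEGER SIZE
      REAL ARRAY(SIZE)
      INTEGER I
      ARRSUM = 0.0
      DO 10 I = 1, SIZE
         ARRSUM = ARRSUM + ARRAY(I)
   10 CONTINUE
      END
```

With bounds checking enabled, a reference such as ARRAY(SIZE+1) inside
ARRSUM can be caught, which is not possible when the dummy is declared
ARRAY(*).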
ghm@ccadfa.adfa.oz.au (Geoff Miller) (08/06/90)
jlg@lanl.gov (Jim Giles) writes: >This is not a disagreement with the style guide David Levine gave >but merely a commentary on it. I say this because I will only >discuss those parts of his submission that I disagree with. As >a result, I might be mistaken for a dissenter. In fact, as will >be seen, I don't discuss most of what he says (and I therefore >agree with those parts). >> 3) Assign a separate sequence of labels to FORMAT labels that are grouped >> at the end of a subprogram. >Same comment about "assign" plus the following: whenever possible, put >the format into the I/O statement which uses it. Use labeled formats >only when several I/O statements share the same (long) format. I may not have been the only person to contribute this recommendation to David, but I did contribute it so I'll defend it. If you can guarantee that every FORMAT statement will appear immediately under (or in) the PRINT statement that uses it, fine. As soon as you start using the same FORMAT statement in different places, you either have to hunt through the code or locate the FORMAT statement separately, probably at the end. I feel that it is more consistent to have a simple rule which puts them all in one place. >> XI. Typing >> [...] >> 4) Declare all variables. Use a compiler option to ensure declaration. >> If no such compiler option is available, setting the implicit type of >> all variables to a type that is not used in the program often snags >> undeclared variables. An example is: IMPLICIT COMPLEX*16 A-Z. >A better 'snag' is IMPLICIT LOGICAL A-Z. Complex has the disadvantage >that arithmetic (the most common variable use) is still legal on it. >Logical is better because it is legal in fewer contexts. IMPLICIT NULL is even better if your compiler allows it. If you use IMPLICIT LOGICAL A-Z you may decide not to use logical variables at all, which frankly I never found much loss. Geoff Miller (ghm@cc.adfa.oz.au) Computer Centre, Australian Defence Force Academy
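
A short sketch of the compiler-supported alternative. It assumes the
spelling IMPLICIT NONE, which is the form I know to be widely accepted
as a FORTRAN 77 extension and which is standard in Fortran 90; a given
compiler may spell it differently. The names DECL and NITEMS are
hypothetical.

```fortran
      PROGRAM DECL
*     IMPLICIT NONE is an extension to strict FORTRAN 77.
      IMPLICIT NONE
      INTEGER NITEMS
      NITEMS = 3
*     Misspelling NITEMS (say, as NITEM) in the statement below
*     would be rejected at compile time, because the misspelled
*     name has no declaration.
      WRITE (*, '(A, I2)') ' NITEMS =', NITEMS
      END
```

Unlike the IMPLICIT LOGICAL A-Z trick, this leaves LOGICAL variables
free for ordinary use.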
mac@harris.cis.ksu.edu (Myron A. Calhoun) (09/10/90)
In article <1794@ccadfa.adfa.oz.au> you write: >>> 3) Assign a separate sequence of labels to FORMAT labels that are grouped >>> at the end of a subprogram. >>Same comment about "assign" plus the following: whenever possible, put >>the format into the I/O statement which uses it. Use labeled formats >>only when several I/O statements share the same (long) format. >I may not have been the only person to contribute this recommendation to David, >but I did contribute it so I'll defend it. If you can guarantee that every >FORMAT statement will appear immediately under (or in) the PRINT statement >that uses it, fine. As soon as you start using the same FORMAT statement in >different places, you either have to hunt through the code or locate the >FORMAT statement separately, probably at the end. I feel that it is more >consistent to have a simple rule which puts them all in one place. Unfortunately, re-used FORMATs cause programs to "break" during maintenance. The scenario is that someone modifies a FORMAT and ONE of its associated I/O statements but doesn't notice there are others. Voila! Instant "broken" program. It is EXTREMELY EASY to "guarantee that every FORMAT statement will appear immediately under (or in) the [I/O] statement that uses it". Just duplicate it. Shouldn't take more than a few keystrokes per FORMAT statement on any reasonable editor; maybe a bit more work if one is still using cards! My credentials? I routinely am called upon to modify programs written more than 20 years ago. I've learned a lot in 20 years, and one thing I've learned is to NOT reuse FORMAT statements. --Myron. -- # Myron A. Calhoun, Ph.D. E.E.; Associate Professor (913) 539-4448 home # INTERNET: mac@harris.cis.ksu.edu (129.130.10.2) 532-6350 work # UUCP: ...{rutgers, texbell}!ksuvax1!harry!mac 532-7004 fax # AT&T Mail: attmail!ksuvax1!mac
FC138001@ysub.ysu.edu (Phil Munro) (09/11/90)
(Myron A. Calhoun) says: > > >Unfortunately, re-used FORMATs cause programs to "break" during maintenance. >... >It is EXTREMELY EASY to "guarantee that every FORMAT statement will >appear immediately under (or in) the [I/O] statement that uses it". >Just duplicate it. Shouldn't take more than a few keystrokes per >FORMAT statement on any reasonable editor; maybe a bit more work if >one is still using cards! > I think duplicated FORMAT statements mean added memory allocations when the program is compiled. Is this not right? It seems just as easy to use "any reasonable editor" to find every use of a FORMAT as to duplicate them and waste memory. On the other hand, except for the machine-code memory problem, it is an appealing idea to put WRITEs and FORMATs together. --Phil
mac@harris.cis.ksu.edu (Myron A. Calhoun) (09/12/90)
In article <90254.120334FC138001@ysub.ysu.edu> Phil Munro <FC138001@ysub.ysu.edu> writes:
> I think duplicated FORMAT statements mean added memory allocations
>when the program is compiled. Is this not right?
> It seems just as easy to use "any reasonable editor" to find every
>use of a FORMAT as to duplicate them and waste memory. On the other
>hand, except for the machine-code memory problem, it is an appealing
>idea to put WRITEs and FORMATs together. --Phil

OK, let me make some unreasonable assumptions: Assume some program
has 50 different FORMAT statements which appear an average of 3 times
EACH (is this UNreasonable enough?) The ** average ** FORMAT statement
probably fits in well under 100 bytes. 50 * (3 - 1) * 100 = 10,000
extra bytes. On a 64K machine (does anyone still use one of these?),
an extra 10K might be rather important, but on modern computers with
modern-sized memories, 10K seems rather piddling to me.

Perhaps a stronger argument for "one I/O statement, one FORMAT" is the
modern idea that code modules should be readable in one forward top-to-
bottom pass. Re-used FORMAT statements require flipping back and forth
through the code and thus violate the "one pass top-to-bottom" convention.
But duplicating the (almost always a) few FORMAT statements follows this
convention.

Besides, if you have a "zillion" FORMAT statements that are re-useable,
why not put them in a separate I/O routine and just have one copy of
the I/O statements, too!
--Myron.
--
# Myron A. Calhoun, Ph.D. E.E.; Associate Professor (913) 539-4448 home
# INTERNET: mac@harris.cis.ksu.edu (129.130.10.2) 532-6350 work
# UUCP: ...{rutgers, texbell}!ksuvax1!harry!mac 532-7004 fax
# AT&T Mail: attmail!ksuvax1!mac
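
The "separate I/O routine" idea in the last paragraph might look like
the following sketch (the names RPT and PUTVAL and the record layout
are hypothetical). The WRITE and its FORMAT appear exactly once, and
every caller goes through the subroutine.

```fortran
      PROGRAM RPT
      CALL PUTVAL('ALPHA', 1.5)
      CALL PUTVAL('BETA ', 22.25)
      END

      SUBROUTINE PUTVAL(NAME, VALUE)
*     The only place in the program where this layout is written.
      CHARACTER*(*) NAME
      REAL VALUE
      WRITE (*, 100) NAME, VALUE
  100 FORMAT (1X, A, ' = ', F10.3)
      END
```

A change to the layout is then made in one place, and the FORMAT still
sits directly under the only statement that uses it.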
djo7613@hardy.u.washington.edu (Dick O'Connor) (09/13/90)
In article <1990Sep11.191119.22682@maverick.ksu.ksu.edu> mac@harris.cis.ksu.edu (Myron A. Calhoun) writes: >... >Perhaps a stronger argument for "one I/O statement, one FORMAT" is the >modern idea that code modules should be readable in one forward top-to- >bottom pass. Re-used FORMAT statements require flipping back and forth >through the code and thus violate the "one pass top-to-bottom" convention. >But duplicating the (almost always a) few FORMAT statements follows this >convention. What's the effect on executable size and execution time if every WRITE is followed by a FORMAT, with all but the first one commented out? This is to avoid all that flipping back and forth, of course. Curious... "Moby" Dick O'Connor djo7613@u.washington.edu Washington Department of Fisheries *I brake for salmonids*
Jeff Boyd <BOYDJ@QUCDN.QueensU.CA> (09/13/90)
If you wanted to manage your FORMATs a little more carefully, use
PARAMETERs, e.g.

      PARAMETER (UNIT1=10,FMT1=1)
    1 FORMAT ( ... )

and later

      WRITE (UNIT1,FMT1) var_list

providing your Fortran allows a named constant in the FMT option. I've
worked on some that didn't, but it's handy sometimes for managing odd
problems.
maine@elxsi.dfrf.nasa.gov (Richard Maine) (09/13/90)
On 10 Sep 90 14:16:39 GMT, mac@harris.cis.ksu.edu (Myron A. Calhoun) said:

Myron> Unfortunately, re-used FORMATs cause programs to "break" during
Myron> maintenance. The scenario is that someone modifies a FORMAT
Myron> and ONE of its associated I/O statements but doesn't notice
Myron> there are others.

Myron> Voila! Instant "broken" program.

Hmmm. I'd been tempted to make a very similar point, but on the
opposite side. The scenario is that two or more I/O statements each
have separate but equal (where have I heard that phrase :-)) format
statements. The time comes to change the format (in some way that
doesn't require explicit change in the I/O statement). You change one
format, but miss the other(s). Voila, "broken" program. Perhaps it
can't read its own files back in correctly.

Myron> ...one thing I've learned is to NOT reuse FORMAT statements.

I partly agree, but with some small quibbles. If two format statements
are inherently required to be the same (a prime example being the
format statements used to write and read the same data), then I think
there should be only one format statement, where practical. (If the
read and write are in separate subroutines, this may not be practical,
though I have at least one case where I put several formats in an
include file to ensure that they were the same in the read and write
routines.) I view this as a case of the general programming principle
that you should avoid hidden dependencies between separate areas of
code. If two pieces of code look identical and must remain identical,
perhaps there should be only one piece of code, called as a subroutine
or whatever.

By the same token, I agree that it is detrimental to maintainability
to combine 2 completely unrelated format statements that coincidentally
happen to look the same in the current version of the code. That would
be just as bad an idea as replacing every instance of the literal
constant 10 with a PARAMETER named TEN. (Might well be a good idea to
use parameters instead of the literal constants, but they should be
named something more functionally obvious and those with different
functions should have different names).

Myron> .... I've learned a lot in 20 years

...not that I haven't made all of the mistakes alluded to above in my
20 years (sigh). Seems like there ought to be a better way to learn
than by making all the mistakes myself :-(.
--
Richard Maine maine@elxsi.dfrf.nasa.gov [130.134.64.6]
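
A sketch of the write/read case, assuming the shared format is held in
a CHARACTER named constant (RECFMT, RDWR and the unit number 10 are
hypothetical). In a larger program the declaration and PARAMETER lines
could live in an include file so that separate read and write routines
pick up the same text.

```fortran
      PROGRAM RDWR
*     One format, used for both writing and reading the record.
      CHARACTER*(*) RECFMT
      PARAMETER (RECFMT = '(I6, F10.3)')
      INTEGER KOUT, KIN
      REAL XOUT, XIN
      KOUT = 17
      XOUT = 3.25
      OPEN (UNIT=10, STATUS='SCRATCH', FORM='FORMATTED')
      WRITE (10, RECFMT) KOUT, XOUT
      REWIND (10)
      READ (10, RECFMT) KIN, XIN
      CLOSE (10)
      WRITE (*, '(1X, I6, F10.3)') KIN, XIN
      END
```

Because the READ and the WRITE name the same constant, changing a
field width cannot leave the two out of step.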
jlg@lanl.gov (Jim Giles) (09/14/90)
From article <MAINE.90Sep13084658@altair.dfrf.nasa.gov>, by maine@elxsi.dfrf.nasa.gov (Richard Maine): > On 10 Sep 90 14:16:39 GMT, mac@harris.cis.ksu.edu (Myron A. Calhoun) said: > > Myron> Unfortunately, re-used FORMATs cause programs to "break" during > Myron> maintenance. The scenario is that someone modifies a FORMAT > Myron> and ONE of its associated I/O statements but doesn't notice > Myron> there are others. > > Myron> Voila! Instant "broken" program. > > Hmmm. I'd been tempted to make a very simillar point, but on the opposite > side. [...] I agree with both sides. The reason I made the recommendation which started this thread was to address this very issue. So I'll make the recommendation again and point out the relevance: The format specification for an I/O statement should be included _within_ the I/O statement itself, unless the same format is used by several different I/O statements. So, if a format only applies to one I/O statement, it is _within_ that statement. If it applies to more than one, it is specified separately. So, the "Myron" problem cannot arise because if someone modifies a FORMAT statement and only ONE of the associated I/O requests, he has violated the convention: a _separate_ FORMAT _always_ applies to more than one I/O request. The user should have looked for all of them. J. Giles
misner@cod.NOSC.MIL (John Misner) (09/14/90)
In article <62893@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>
> The format specification for an I/O statement should be included
> _within_ the I/O statement itself, unless the same format is used
> by several different I/O statements.
>
>So, if a format only applies to one I/O statement, it is _within_ that
>statement. If it applies to more than one, it is specified separately.
>
>J. Giles

The problem with this is that some (e.g. UNIVAC and VAX/VMS) compilers
do only 0 level checks on the in-line FORMAT - possibly limited to
making sure the format is a character variable or, if it is a constant
or a character PARAMETER, an additional check that it begins and ends
with '(' and ')'. The run-time I/O then is instructed to do a complete
parsing of the format *EACH TIME* the statement is executed. Apart
from not finding out until run time whether the format is valid, this
adds considerable processing (and therefore time) to the execution of
the code, even if there are no multiple executions involved.

I think (though I don't usually follow my own advice) that all format
statements should, for maintainability, be grouped together between
the STOP or RETURN statement and the END statement. Most editors also
allow you to find out what line you are editing, hop to the nearest
END<endline>, and then return to the line you were editing.

My use of in-line FORMATs is restricted to quick-and-dirty debug I/O.

J. Misner
burley@world.std.com (James C Burley) (09/14/90)
In article <90256.122601BOYDJ@QUCDN.BITNET> BOYDJ@QUCDN.QueensU.CA (Jeff Boyd) writes:
> If you wanted to manage your FORMATs a little more carefully, use
> PARAMETERs, e.g.
>       PARAMETER (UNIT1=10,FMT1=1)
>     1 FORMAT ( ... )
> and later
>       WRITE (UNIT1,FMT1) var_list
> providing your Fortran allows a named constant in the FMT option. I've
> worked on some that didn't, but it's handy sometimes for managing odd
> problems.
I don't think the example you give is valid -- but the wording "allows a
named constant" at the end suggests you knew what you wanted to do, just
got confused typing it.
A format specifier may not be a named constant (PARAMETER name).
It may be an integer variable to which one has ASSIGNed the desired FORMAT
statement (I don't recommend this for this situation).
It may also be a character array name or character expression, including a
named constant, which is what I think you meant.
(It may also be *, but that's list-directed formatting).
So your example perhaps should have been something like this:
      INTEGER UNIT1
      CHARACTER*(*) FORMAT1
      PARAMETER (UNIT1=10,FORMAT1='(...)')
      WRITE(UNIT1,FORMAT1) var_list
I.e. the FORMAT itself is within the PARAMETER statement.
As others have pointed out (and I did in an earlier discussion on this
same topic), the advantage of using a named constant for a FORMAT may be
outweighed by the disadvantage of having your compiler not syntax-check
the FORMAT for correctness to the same extent it does a FORMAT statement and/or
the implementation of the program require reinterpretation of the named-
constant FORMAT each time it is referenced in an I/O statement at run-time
rather than once during compile time. Not all compilers/systems suffer both
(or even either) disadvantage, but it can be a performance issue on some.
James Craig Burley, Software Craftsperson burley@world.std.com
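
A runnable sketch of the character-valued format specifiers listed
above, with the elided format filled in by an invented one for
illustration. The names PFMT, FORMT1 and VARFMT are hypothetical, and
unit 6 is assumed to be connected to standard output, which is
conventional but not guaranteed.

```fortran
      PROGRAM PFMT
      INTEGER UNIT1
      CHARACTER*(*) FORMT1
      PARAMETER (UNIT1 = 6, FORMT1 = '(1X, A, I4)')
      CHARACTER*20 VARFMT
*     CHARACTER named constant as the format specifier.
      WRITE (UNIT1, FORMT1) 'N =', 12
*     CHARACTER variable as the format specifier; trailing blanks
*     after the closing parenthesis are harmless.
      VARFMT = '(1X, A, F6.2)'
      WRITE (UNIT1, VARFMT) 'X =', 1.5
*     Asterisk: list-directed formatting.
      WRITE (UNIT1, *) 'DONE'
      END
```

The first two forms are the ones subject to the run-time parsing cost
described above on systems that do not compile character formats.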
burley@world.std.com (James C Burley) (09/14/90)
Oops, I meant "A FORMAT specifier may not be an INTEGER named constant (PARAMETER name)." Forgot the "INTEGER" qualifier. As I said later, it may of course be a CHARACTER named constant.