cdb@hpclcdb.HP.COM (Carl Burch) (07/07/88)
On systems like UN*X and MS-DOS with byte-stream file systems, the Fortran
I/O library has to impose a data file format to support Fortran's
record-oriented file model.  On HP-UX, the formats take the following forms:

Sequential Formatted file format:
    ASCII files delimited with the newline character (ASCII 10 decimal).

Sequential Unformatted file format:
    Binary data preceded and followed by four bytes holding the record
    length (in bytes).  The "green word" at the end is necessary to
    BACKSPACE the file correctly.

Direct Formatted file format:
    ASCII fixed-length records not physically separated.  Unwritten bytes
    in the record are padded with blanks.

Direct Unformatted file format:
    Binary fixed-length records not physically separated.  Unwritten bytes
    in the record are padded with zero bytes (ASCII NULs).

Bell's f77(1) compiler uses this scheme as well.

On MS-DOS, (at least my copy of) Microsoft Fortran uses the above formats,
except that Direct Unformatted files are also padded with blanks and the
Sequential Unformatted format uses only a one-byte "green word" to hold the
length of each record.  In the latter case, there is an escape value
indicating that the record that follows is filled to the maximum length
(256?) and that further records follow.

Given this much similarity, I wonder if we may have a de facto standard
evolving here.  If we could do something about the data format in binary
files (e.g., the IEEE floating point format), it might be possible to use
systems like NFS considerably more transparently than currently possible.

I'd like examples of other byte-stream file systems' Fortran compilers'
solutions to this problem.  Are they as similar as those around my shop?

Carl Burch
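
As an illustration of the Sequential Unformatted layout described above
(a count before and after each record, with the trailing count making
BACKSPACE possible), here is a minimal Python sketch.  It is not taken from
any Fortran runtime: the function names are hypothetical, and the
little-endian byte order of the 4-byte count is an assumption, since the
post does not say how the count is stored.

```python
import io
import struct

# Assumption: the 4-byte count is an unsigned little-endian integer.
# The post gives only the field width, not the byte order.
LEN = struct.Struct("<I")

def write_record(f, payload):
    """Append one record as: count, data bytes, count."""
    marker = LEN.pack(len(payload))
    f.write(marker + payload + marker)

def read_record(f):
    """Read the record at the current position; return None at end of file."""
    head = f.read(LEN.size)
    if len(head) < LEN.size:
        return None
    (n,) = LEN.unpack(head)
    payload = f.read(n)
    (tail,) = LEN.unpack(f.read(LEN.size))
    if tail != n:
        raise ValueError("leading and trailing counts disagree")
    return payload

def backspace(f):
    """Step back one record by reading the count stored just before the
    current position, which is the trailing count of the previous record."""
    pos = f.tell()
    if pos == 0:
        return  # already at the start of the file
    f.seek(pos - LEN.size)
    (n,) = LEN.unpack(f.read(LEN.size))
    f.seek(pos - n - 2 * LEN.size)

f = io.BytesIO()
write_record(f, b"first")
write_record(f, b"second record")
backspace(f)
print(read_record(f))  # b'second record'
```

Without the trailing count, backspace() would have to rescan the file from
the beginning to find where the previous record starts.
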
corbett@beatnix.UUCP (Bob Corbett) (07/12/88)
In article <6690019@hpclcdb.HP.COM> cdb@hpclcdb.HP.COM (Carl Burch) writes:
>
> Given this much similarity, I wonder if we may have a de facto standard
>evolving here. If we could do something about the data format in binary
>files (e.g., the IEEE floating point format), it might be possible to use
>systems like NFS considerably more transparently than currently possible.
>
> I'd like examples of other byte-stream file systems' Fortran compilers'
>solutions to this problem. Are they as similar as those around my shop?
>
> Carl Burch

I wish that FORTRAN file formats were as similar as Mr. Burch has so far
found them to be.  The fact that AT&T's f77 and HP-UX's f77 use the same
file formats comes as no surprise, since HP's implementation was derived
from AT&T's.  The AT&T file formats probably are the de facto standard for
UNIX FORTRAN implementations.

However, there are irritating variations even among UNIX FORTRANs.  One
such variation is that the size of the count field for sequential
unformatted records differs from machine to machine.  For example, f77 on
the PDP-11 uses a 16-bit count field, while f77 on the VAX uses 32-bit
fields.

Other annoying differences arise for sequential formatted files.  One
variation I have seen is to use the escape character (ASCII 27) to escape
characters.  In particular, an escape followed by a new-line character is
treated as a new-line character within the record rather than as an escape
followed by the end of record.  An escape followed by an escape is treated
as a single escape.  I believe the reason for this convention is that
people used A format to write binary data.  However, a case can be made
that it should be possible to write any character in the processor
character set under an A format.

Another variation is to use a control character to denote end of file.  I
used to believe that this variation arose to avoid having to truncate files
on close (a painful operation on System V based UNIX systems).
However, I now believe it is done to emulate VMS FORTRAN.

Operating systems other than UNIX use a wide variety of FORTRAN file
formats.

VMS FORTRAN features four major file formats: fixed, variable, segmented,
and stream.  The file format to be used can be specified in the OPEN
statement at the time the file is created.  An existing file's format is
known to the OS.  A fixed-length record file consists of fixed-length
records with no physical separators.  In a variable-length record file,
each data record is preceded by a byte count.  The byte count is two bytes
long for disk files and four bytes long for tapes.  A variable-length
record file opened for relative access is stored on disk as a fixed-length
record file.  Segmented record files are basically variable-length record
files plus control information that allows a single logical record to be
stored as one or more variable-length records.  Stream-type files use
characters to indicate end of record, and come in three varieties.  One
form uses a carriage return followed by a line feed to terminate records.
The other two use either carriage return alone or line feed alone to
terminate records.  Sequential formatted files and sequential unformatted
segmented files can contain embedded end-of-file records.  An end-of-file
record consists of a one-byte record containing a SUB character (ASCII 26).
Sequential formatted files may use any of the four file formats.  The
default format for sequential formatted files is variable.

CDC FORTRAN for the CYBER 170 series uses four file formats: Z, W, U, and
S.  There are four additional file formats (F, R, D, and T) that are not
commonly used.  CDC files consist of 60-bit words.  A Z file indicates end
of record by 12 zero bits in the low-order part of a word.  A record may
have to be padded.  If a record ends with blanks, those blanks may be
trimmed.  A W file precedes each data record with a control word which
contains the length of the data record.
U and S files are record manager files.  The record manager stores the
location and length of the data records apart from the data records
themselves.

I have a description of IBM's file formats, but it is too complex to be
worth describing in detail.  Suffice it to say that the lengths of
variable-length records of all types are indicated by counts.

Robert Paul Corbett
ucbvax!sun!elxsi!corbett
uunet!elxsi!corbett
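
The escape convention mentioned earlier for sequential formatted files
(escape followed by new-line means a new-line inside the record, escape
followed by escape means a single escape) can be sketched in Python as
follows.  The function names are hypothetical.  How a real runtime treats
an escape before any other character, or data after the last new-line, is
not stated in the post; the choices below are assumptions made only for
the sake of a runnable example.

```python
ESC = 0x1B  # ASCII 27, the escape character
NL = 0x0A   # ASCII 10, the new-line record terminator

def encode_record(data):
    """Return the bytes of one record as they would appear in the file."""
    out = bytearray()
    for b in data:
        if b in (ESC, NL):
            out.append(ESC)   # protect an embedded escape or new-line
        out.append(b)
    out.append(NL)            # unescaped new-line ends the record
    return bytes(out)

def decode_records(stream):
    """Split file contents back into the original records."""
    records = []
    current = bytearray()
    i = 0
    while i < len(stream):
        b = stream[i]
        nxt = stream[i + 1] if i + 1 < len(stream) else None
        if b == ESC and nxt in (ESC, NL):
            current.append(nxt)   # escaped character is record data
            i += 2
            continue
        if b == NL:
            records.append(bytes(current))
            current = bytearray()
        else:
            # Assumption: an escape before any other character is kept
            # as ordinary data.
            current.append(b)
        i += 1
    # Assumption: bytes after the last new-line are ignored in this sketch.
    return records

raw = b"a\nb\x1bc"
stored = encode_record(raw)
print(stored)                  # b'a\x1b\nb\x1b\x1bc\n'
print(decode_records(stored))  # [b'a\nb\x1bc']
```

With such a convention, binary data written under an A format can contain
byte value 10 without being mistaken for an end of record.
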