[net.lang] Slow Fortran I/O

jlg@lanl-a.UUCP (06/28/84)

>I have no complaint with FORTRAN's number-crunching abilities, but when it
>comes to I/O the language is a big loss.  Just recently, on our RT-11 V5.1
>system, we replaced one output subroutine in a graphics library which did its
>I/O with a FORTRAN WRITE with the equivalent subroutine using RT-11 Syslib I/O.
>This resulted in a speedup of something like 600%.  The net result was that the
>Prof. in charge bitched us out for not having done it 2 years ago, since it
>only took 5 minutes to make the change.

This isn't Fortran's fault, it's the I/O library!  I'm not surprised either
since the VMS I/O library is VERY slow.  There is no good reason for Fortran
I/O to be any slower than direct system calls (except for a small amount
of overhead related to library table maintenance).  For some reason, DEC
chose to put a whole extra level of complexity between Fortran and the
system (called 'Record Manager' on VMS) which is as slow as molasses.

DON'T BLAME THE LANGUAGE FOR AN IMPLEMENTATION DEFECT!!

steven@mcvax.UUCP (Steven Pemberton) (07/03/84)

> This isn't Fortran's fault, it's the I/O library! There is no good reason
> for Fortran I/O to be any slower than direct system calls (except for a
> small amount of overhead related to library table maintenance).

I'm not sure about this. Surely interpreting formats at runtime slows I/O a
lot. It could be possible to compile the formats, but Fortran also allows
you to read the format at run-time, so you still need your format
interpreter around. Does any Fortran compiler compile formats?

Steven Pemberton, CWI, Amsterdam. steven@mcvax

guy@rlgvax.UUCP (Guy Harris) (07/06/84)

> I'm not sure about this. Surely interpreting formats at runtime slows I/O a
> lot. It could be possible to compile the formats, but Fortran also allows
> you to read the format at run-time, so you still need your format
> interpreter around. Does any Fortran compiler compile formats?

Yes - the RT-11 Fortran compiler, about which the complaint of slow Fortran
I/O was originally made.  I suspect the VMS compiler also does so.  There
seem to be two philosophies about this - the compilers which directly
interpret the string at run time (of which there are several, including
Stu Feldman's UNIX "f77" compiler, upon which the Berkeley compiler is
based) and the compilers which compile it first (of which there are several,
including DEC's compilers).  In the latter case, you can include the format
compiler as part of the Fortran I/O library, only referencing it if the
format is actually specified at run-time.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy
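
A minimal sketch of the "compile it first" philosophy described above, in Python rather than Fortran. The names compile_format and execute, and the tiny descriptor subset handled (rIw, rFw.d, nX), are assumptions for illustration only; this is not how the DEC or UNIX libraries were actually written.

```python
import re

# Hypothetical subset of Fortran edit descriptors: rIw, rFw.d and nX.
_DESC = re.compile(r"(\d*)([IF])(\d+)(?:\.(\d+))?|(\d*)X", re.IGNORECASE)


def compile_format(fmt):
    """Translate a format such as '(I5,2F8.3,1X,I3)' into a flat list of ops."""
    body = fmt.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("format must be enclosed in parentheses")
    ops = []
    for item in body[1:-1].split(","):
        m = _DESC.fullmatch(item.strip())
        if m is None:
            raise ValueError(f"unsupported descriptor: {item!r}")
        if m.group(2) is None:
            # nX: skip n columns (n defaults to 1 in this sketch)
            ops.append(("X", int(m.group(5) or 1), 0))
            continue
        repeat = int(m.group(1) or 1)
        op = (m.group(2).upper(), int(m.group(3)), int(m.group(4) or 0))
        ops.extend([op] * repeat)
    return ops


def execute(ops, values):
    """Build one output record from precompiled ops; no parsing happens here."""
    vals = iter(values)
    out = []
    for kind, width, decimals in ops:
        if kind == "X":
            out.append(" " * width)
        elif kind == "I":
            out.append(f"{next(vals):{width}d}")
        else:  # "F"
            # Unlike Fortran, Python widens the field on overflow
            # instead of filling it with asterisks.
            out.append(f"{next(vals):{width}.{decimals}f}")
    return "".join(out)


ops = compile_format("(I5,2F8.3,1X,I3)")  # parsed once
for row in ([42, 1.5, -2.25, 7], [7, 0.0, 3.0, 1]):
    print(execute(ops, row))              # reused for every record
```

A library that interprets the string directly would in effect redo the work of compile_format on every WRITE; keeping the parser as a separate routine is what lets it be pulled in only when a format is supplied at run time.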

rpw3@fortune.UUCP (07/06/84)

fortune!rpw3    Jul  5 21:44:00 1984

"Does any FORTRAN compiler compile formats?"

Yes, the old DECsystem-10 compiler FORTRAN-10 did (TOPS-10 O/S). (Note: NOT
the even older "F40".) Formats were compiled into a table somewhat like a
channel command list, and were re-compiled if written into. (Since FORTRAN-10
had some data-flow analysis, that wasn't too hard, I think.)

Rob Warnock

UUCP:	{ihnp4,ucbvax!amd}!fortune!redwood!rpw3
DDD:	(415)369-7437
USPS:	Suite 203, 4012 Farm Hill Blvd, Redwood City, CA  94061

david@bragvax.UUCP (David DiGiacomo) (07/06/84)

"Does any FORTRAN compiler compile formats?"

The DEC PDP-11 Fortran compilers do; this is particularly interesting in
the Fortran-IV threaded code mode.  Fortran I/O is still sluggish,
though.

cbspt002@abnjh.UUCP (Marc E. Kenig ) (07/08/84)

<3Hbug>

  So far as I know, the VMS FORTRAN-77 compiler only checks syntax of the
FORMAT statement's data descriptors.  The format string is interpreted at
runtime.  VMS also allows variable repeat and accuracy qualifiers in format
statements, which of course requires interpreting. (You put a program variable
in angle brackets in place of the constant, as in:
                    FORMAT(X,TR<VAR1*2>,<IVAR2>F<IVAR3>.3)
which evaluates the current values of VAR1, IVAR2, and IVAR3, CHECKS THEM FOR
LEGALITY (>=1 and w&d in F, E, and G formats sane, etc), and interprets that
as the format to be printed).  This handy feature eliminates the need for
building format specifications via internal files (ENCODE for you F-IV, V, and
F66 folks). (If C had a similar feature, it would nuke the need for lots of
sprintf's I'd bet!).  This feature also undoubtedly contributes to the 
rotten I/O speed.
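
A rough Python analogue of the <IVAR2>F<IVAR3>.3 part of the example above, to show where the run-time cost comes from: the repeat count and field width are ordinary variables that must be evaluated and checked on every call. The function name write_fields and the exact legality rules are assumptions for this sketch, not the VMS implementation.

```python
def write_fields(values, repeat, width, decimals=3):
    """Format `repeat` values as F<width>.<decimals>, checking legality first."""
    # The checks below stand in for the "CHECKS THEM FOR LEGALITY" step;
    # the real rules used by VMS are not given in the post.
    if repeat < 1 or width < 1 or not 0 <= decimals < width:
        raise ValueError("illegal repeat count or field width")
    if len(values) < repeat:
        raise ValueError("not enough values for the repeat count")
    # Width and precision are substituted into the format spec at run time.
    return "".join(f"{v:{width}.{decimals}f}" for v in values[:repeat])


ivar2, ivar3 = 2, 7  # hypothetical current values of IVAR2 and IVAR3
print(write_fields([1.0, 2.5], ivar2, ivar3))
```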
   Performance is also problematic because of the historical basis of FORTRAN
I/O on line printers and tapes.  The output record is usually built in a buffer 
and output all at once.  Remember, in FORTRAN you can tab backwards on the
output record, for instance. Also, in trying to be all things to all people,
FORTRAN has lots more output descriptors than C or any other language besides
perhaps PL/I, and the syntax is a bit tricky (especially with repeats that span
records, etc).  This can't help much either. Not to mention the fact that
FORTRAN is handcuffed into being compatible with past versions.  With G
formatting, why bother keeping relics like E and F around? (ANSI, that's why).
More to the point, try this for a shock:

                    READ(5,100)   ! where unit 5 is tty I/O
            100     FORMAT(50H                                                  )
                    WRITE(5,100)
Not only does it still compile, but it still works just like it did from
FORTRAN I. Can any language historians out there explain what it was for?

For a project once, I wrote some C/UNIX-like FORTRAN I/O routines on a DEC-10.
Of course they were faster. (No I don't still have the code, it was MACRO-10
anyway...CVTDBO and all that!:-)

M. Kenig                 "The statements herein are not to be taken as excuses
...abnjh!cbspt002         for bad I/O performance. No flames please!"

ggs@ulysses.UUCP (Griff Smith) (07/08/84)

I don't know how VMS processes formats, but the Fortran-10 system
(TOPS-10/TOPS-20) did run-time compilation as of 1981 and still does
for all I know.  Formats were not compiled by the Fortran compiler,
they were compiled by the run-time system.  The address of the format
was used to search through a table of compiled formats.  An attribute
field in the call to the run-time system indicated whether the format
was a constant (safe to re-use the compiled format) or an array
(re-compile every time).
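
The lookup described above might be sketched like this, in Python rather than MACRO-10. The class name FormatCache, the use of an integer as the "address", and the stand-in compiler are all assumptions made for illustration; only the overall scheme (table keyed by address, constant versus array attribute) comes from the description.

```python
class FormatCache:
    """Table of compiled formats keyed by the address of the format text."""

    def __init__(self, compiler):
        self._compiler = compiler   # any callable: format text -> compiled form
        self._table = {}
        self.compilations = 0       # counter, only here to make the effect visible

    def lookup(self, address, text, is_constant):
        # Constant formats cannot change, so a cached entry is safe to reuse.
        if is_constant and address in self._table:
            return self._table[address]
        compiled = self._compiler(text)
        self.compilations += 1
        if is_constant:
            self._table[address] = compiled
        # Formats held in arrays are never cached: recompile every time.
        return compiled


# Stand-in "compiler" for the demo: split the descriptor list into a tuple.
cache = FormatCache(lambda text: tuple(text.strip("()").split(",")))
for _ in range(3):
    cache.lookup(0o1000, "(I5,F8.3)", is_constant=True)
for _ in range(2):
    cache.lookup(0o2000, "(2I3)", is_constant=False)
print(cache.compilations)
```

With three uses of the constant format and two of the array format, the counter ends at 3: one compilation for the constant and one per use for the array.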

Surprisingly, the UNIX(TM) Fortran run-time system almost does the same
thing.  Unless I am badly misreading the code, a format is compiled
into an internal data structure before it is used to direct data
translation.  The compiled format could be re-used if someone would
find a cheap way to determine that the original hasn't changed since
the last compilation.

Based on some experience I have had writing I/O routines in tight
assembly language, a carefully written general-purpose number
conversion will take about 60/MIPS microseconds per character (MIPS =
speed of processor).  This includes the usual paranoia, but leaves no
room for fancy features such as variable format fields.  The UNIX stdio
package on my VAX is within 50 percent of this value; the UNIX Fortran
system misses it by a factor of four.  The best Fortran systems that I
have seen have missed it by no worse than 50 percent.  There are ways
to beat the guideline value by a factor of 6, but the techniques used
don't adapt easily to a general-purpose I/O system.
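
To put numbers on the guideline, here is a small worked calculation in Python. The 1-MIPS machine and the 80-character line are assumptions picked for round figures, and "misses it by a factor of four" is read as four times the guideline cost.

```python
def us_per_char(mips, slowdown=1.0):
    """Guideline conversion cost per character, in microseconds."""
    if mips <= 0:
        raise ValueError("processor speed must be positive")
    return 60.0 / mips * slowdown


LINE = 80  # characters in a hypothetical output line
for label, slowdown in (("guideline", 1.0), ("factor of four", 4.0)):
    cost = us_per_char(1.0, slowdown)
    print(f"{label}: {cost:.0f} us/char, {cost * LINE / 1000:.1f} ms per line")
```

Under those assumptions the guideline is 60 microseconds per character (4.8 ms per line), and a system four times slower spends 240 microseconds per character (19.2 ms per line).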

Trademarks: UNIX - AT&T Bell Laboratories; VAX, VMS - Digital Equipment Corp.
-- 

Griff Smith	AT&T Bell Laboratories, Murray Hill
Phone:		(201) 582-7736
Internet:	ggs@ulysses.uucp
UUCP:		ulysses!ggs

zben@umcp-cs.UUCP (07/08/84)

One implementation for the Univac 1100 I saw had the format string compiled
(sort of, actually more like P-code-ified) the first time the format was used.
Some kind of marker (like a word of zeroes) was placed in the first word of the
storage used by the format, as a flag not to compile it again.  The big win of
this approach was that even if the format were dynamically built it would be
compiled the first time it was used.  If it was subsequently changed to be
something else, the flag was blown away, and it got compiled again.  I don't
remember what it did to ensure that the compiled format was no longer than
the core allocated by the Fortran compiler though...

I suppose one could always malloc/getmain/mcore$ a buffer for the compiled
format and put a *pointer* to it in the *second* word of the core area...
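
A sketch of the flag-in-the-first-word idea, combined with the pointer-in-the-second-word suggestion, in Python. A list stands in for the words of core holding the format; use_format and COMPILED_FLAG are hypothetical names, and the stand-in compiler is for the demo only.

```python
COMPILED_FLAG = 0  # a "word of zeroes"; no format text can begin with it


def use_format(storage, compiler):
    """Return the compiled format, compiling in place on first use.

    `storage` is a list of words: either pieces of format text, or
    [COMPILED_FLAG, compiled] once the format has been used.
    """
    if storage and storage[0] == COMPILED_FLAG:
        return storage[1]                   # flag intact: reuse
    compiled = compiler("".join(storage))   # flag absent: (re)compile
    storage[:] = [COMPILED_FLAG, compiled]  # flag in word 1, "pointer" in word 2
    return compiled


def split_compiler(text):
    # Stand-in compiler: just split the descriptor list.
    return tuple(text.strip("()").split(","))


core = ["(I5,", "F8.3", ")"]
use_format(core, split_compiler)   # compiles and overwrites the text
use_format(core, split_compiler)   # sees the flag and reuses the result
core[:] = ["(2I3", ")"]            # program builds a new format: flag is gone
print(use_format(core, split_compiler))
```

Because storing a new format necessarily overwrites the first word, the program invalidates the cached version without the library having to compare anything.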

-- 
Ben Cranston   ...seismo!umcp-cs!zben      zben@umd2.ARPA

darryl@ism780.UUCP (07/23/84)

ism780!darryl    Jul  9 06:36:00 1984

***** ism780:net.lang / abnjh!cbspt002 /  3:53 pm  Jul  8, 1984
>More to the point, try this for a shock:
>
>                    READ(5,100)   ! where unit 5 is tty I/O
>            100     FORMAT(50H                                                  )
>                    WRITE(5,100)
>Not only does it still compile, but it still works just like it did from
>FORTRAN I. Can any language historians out there explain what it was for?

Sure.  Back in Fortran II, you probably wrote a stress analysis
program (didn't everyone? :-) or some such general purpose program,
and you wanted to put headers on the output listing:

	"STRESS ANALYSIS FOR SMITH JONES CO. PROJECT 37"

But, alas, there was no character input to get such a title, and
recompiling for each run was out of the question.  So you read
the title into a format statement and then used the same format
to write headings on each page.  Doesn't Fortran 77 finally
outlaw this crap?  I thought that Hollerith data was right out.
(Of course, your compiler may support this so that older programs
will port easily.)  Characters?  I don't need no stinkin' characters!
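
The trick can be modelled in a few lines of Python. The class HollerithFormat is hypothetical and only mimics the behaviour described above: a READ through an nH field replaces the n characters stored in the format itself, and a later WRITE through the same format prints them.

```python
class HollerithFormat:
    """Model of FORMAT(nH...) used for both READ and WRITE."""

    def __init__(self, width):
        self.text = " " * width  # nH followed by n blanks

    def read(self, line):
        # Input characters overwrite the literal, truncated or blank-padded
        # to the field width.
        width = len(self.text)
        self.text = line[:width].ljust(width)

    def write(self):
        return self.text


fmt = HollerithFormat(50)
fmt.read("STRESS ANALYSIS FOR SMITH JONES CO. PROJECT 37")
print(fmt.write())  # the same "format" now carries the page heading
```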

	    --Darryl Richman
	    ...!cca!ima!ism780!darryl