[comp.lang.fortran] Question on unformatted i/o - SOLUTION

ej@gauss.Princeton.EDU (Eric Jackson) (03/12/90)

In article <14409@phoenix.Princeton.EDU> ej@acm.Princeton.EDU I wrote:
>When an unformatted write is done...two control words are written at the
>beginning and end of the write with the number of bytes written...
>...Does anyone know how to suppress the writing of these control words?

My thanks to those who replied to this posting. I am posting the
solution since it may be helpful to others and it raises another
interesting issue.

As pointed out by John Fisher and Dan Whipple, writing without record
separators can be accomplished using direct access i/o. This raises a
further portability problem, however: a record length must be given in
the open statement, and whether that length is in words or bytes
depends on the implementation (bytes on a Sun, words on an SGI).
Anyway, there is a code fragment at the end of this post illustrating
the solution on a Silicon Graphics machine.
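
One way around the words-versus-bytes question is to let the compiler
report the record length in its own units. The sketch below assumes a
compiler that accepts the Fortran 90 forms INQUIRE(IOLENGTH=...) and
STATUS='REPLACE'; neither is part of Fortran 77, so treat this as an
illustration rather than something every machine mentioned here will
accept. The names rlen and lrec are made up for the example.

```fortran
      program rlen
c     Sketch only: INQUIRE(IOLENGTH=) is Fortran 90, not Fortran 77.
      integer i, k, lrec
      real arr1(100)
c     lrec comes back in whatever units this compiler uses for recl
      inquire(iolength=lrec) arr1
      open(unit=12,form='unformatted',access='direct',
     $     recl=lrec,status='replace',file='fname')
      do 1 i=1,100
         arr1(i) = float(i)
 1    continue
c     ten records, each exactly one copy of arr1
      do 2 k=1,10
         write(12,rec=k) arr1
 2    continue
      close(12)
      end
```

Since the record length is never written as a literal, the same source
should not need editing when moving between a compiler that counts
bytes and one that counts words.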

The reason I was interested in this is that we have developed a set of
programs for data analysis and visualization on the Silicon Graphics;
for the most part these programs require a binary input file. Since
the floating point representations on Suns, SGIs and the Titan are the
same, there is no need for conversion filters to use files produced by
any of them, other than the record separator problem, which up to now
I've solved by using a filter in C that throws them out.

This all does raise a more general issue in the case when machines do
not share a single representation. Indeed, I actually do very little
computation on our local machines --- for the most part I work on
remote supercomputers like the Crays and the (soon-to-disappear-and-
about-time-too) ETA10 (sorry, I couldn't resist), while using the
local machines for post-analysis and visualization. I often need to
output large datasets for local analysis (large <= many Gigabytes) and
using ascii leads to enormous amounts of wasted space (and network
bandwidth). One person wrote me about a machine-independent data
format implementation he developed, which I will be looking into (I am
checking whether he is willing to have his address posted for
inquiries). My question for the net is: what other solutions to this
problem have been developed? Our own requirements are that the routines
be implementable on a wide variety of machines (Suns, SGIs, Crays,
Connection Machine, Intel Hypercube, etc.) in both C and Fortran, and
that they be very fast. They should also probably include, or be usable
with, a data format library like netCDF to allow self-describing
datasets. Any thoughts or suggestions?


Eric Jackson
ej@acm.princeton.edu
Princeton University

***********************************************************************
CODE FRAGMENT:

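      program dwrite
c     Write ten 100-element records with no record separators.
      integer i, k
      real arr1(100)
c     recl is in words on the SGI (100 reals); where the compiler
c     counts bytes (e.g. a Sun) use recl=400, assuming 4-byte reals.
      open(unit=12,form='unformatted',access='direct',
     $     recl=100,status='new',file='fname')
      do 1 i=1,100
         arr1(i) = float(i)
 1    continue
c     now write it out 10 times
      do 2 k=1,10
         write(12,rec=k)(arr1(i),i=1,100)
 2    continue
      close(12)
      end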

***********************************************************************
EOT

calvin@dinkum.sgi.com (Calvin H. Vu) (03/14/90)

In article <14439@phoenix.Princeton.EDU> ej@gauss.Princeton.EDU (Eric Jackson) writes:

>In article <14409@phoenix.Princeton.EDU> ej@acm.Princeton.EDU I wrote:
>>When an unformatted write is done...two control words are written at the
>>beginning and end of the write with the number of bytes written...
>>...Does anyone know how to suppress the writing of these control words?

>As pointed out by John Fisher and Dan Whipple, writing without record
>separators can be accomplished using direct access i/o. A further
>portability problem does arise here, however: a record length must be
>given in the open statement, but whether the length is given in words
>or bytes depends on the implementation (bytes on a sun, words on sgi).
>Anyway, there is a code fragment at the end of this post illustrating
>the solution on a Silicon Graphics.

    Unless all of your records are of comparable size, this solution can
    be just as bad as using ascii data, since every record has to be
    zero-padded to the length of the longest record.

    On SGI 3000 series machines you can do exactly what you want
    by opening the file with FORM='BINARY'.  On the SGI 4D machines,
    in the next release of the compiler, you can do the same thing by
    opening the file as an unformatted direct access file with record
    length equal to 1 _byte_, e.g.:
	OPEN(999, file='binarysystemfile', access='direct',recl=1,
	form='unformatted')
    and then doing sequential I/O on the file.  In this way, the file will
    be treated like an ordinary system file (a byte string in which each
    byte is addressable).  A READ or WRITE request on such a file will
    consume bytes until satisfied, rather than restricting itself to a
    single record.

    One correction to your statement above: on SGI 4D machines we changed
    the record length interpretation from bytes to words in the 3.2 release
    to conform to the ANSI standard.  However, if you prefer to have your
    record length interpreted as bytes, you can do that by compiling your
    program with the -old_rl option.
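
    The byte-string behaviour described above has a rough standard
    counterpart in later Fortran: ACCESS='STREAM' from Fortran 2003.
    The sketch below assumes a compiler with that feature; it is not
    the SGI recl=1 extension itself, and the program name strm and the
    array arr2 are made up for the example.

```fortran
      program strm
c     Sketch only: ACCESS='STREAM' is Fortran 2003, assumed available.
      integer i
      real arr1(100), arr2(50)
      do 1 i=1,100
         arr1(i) = float(i)
 1    continue
      open(unit=12,form='unformatted',access='stream',
     $     status='replace',file='binarysystemfile')
c     two writes of different lengths: no markers, no padding
      write(12) arr1
      write(12) (arr1(i),i=1,50)
      close(12)
c     a read consumes bytes until its list is satisfied
      open(unit=12,form='unformatted',access='stream',
     $     status='old',file='binarysystemfile')
      read(12) arr2
      close(12)
      end
```

    Because nothing is padded, records of very different sizes cost
    only the bytes they actually contain.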



>Eric Jackson
>ej@acm.princeton.edu
>Princeton University

- calvin
--------------------------------------------------------------------------
Calvin H. Vu			   | "We are each of us angels with
Silicon Graphics Computer Systems  | only one wing.  And we can only
calvin@sgi.com   (415) 962-3679	   | fly embracing each other."