ej@gauss.Princeton.EDU (Eric Jackson) (03/12/90)
In article <14409@phoenix.Princeton.EDU> ej@acm.Princeton.EDU I wrote:
>When an unformatted write is done...two control words are written at the
>beginning and end of the write with the number of bytes written...
>...Does anyone know how to suppress the writing of these control words?

My thanks to those who replied to this posting.  I am posting the
solution since it may be helpful to others and it raises another
interesting issue.

As pointed out by John Fisher and Dan Whipple, writing without record
separators can be accomplished using direct access i/o.  A further
portability problem does arise here, however: a record length must be
given in the open statement, but whether the length is given in words
or bytes depends on the implementation (bytes on a sun, words on sgi).
Anyway, there is a code fragment at the end of this post illustrating
the solution on a Silicon Graphics.

The reason I was interested in this is that we have developed a set of
programs for data analysis and visualization on the Silicon Graphics;
for the most part these programs require a binary input file.  Since
the floating point representations on Suns, SGIs and the Titan are the
same, there is no need for conversion filters to use files produced by
any of them, other than the record separator problem, which up to now
I've solved by using a filter in C that throws them out.

This all does raise a more general issue in the case when machines do
not share a single representation.  Indeed, I actually do very little
computation on our local machines --- for the most part I work on
remote supercomputers like the Crays and the
(soon-to-disappear-and-about-time-too) ETA10 (sorry, I couldn't
resist), while using the local machines for post-analysis and
visualization.  I often need to output large datasets for local
analysis (large <= many Gigabytes) and using ascii leads to enormous
amounts of wasted space (and network bandwidth).

One person wrote me about a machine-independent data format
implementation that he developed that I will be looking into (I am
checking whether he is willing to have his address posted for
inquiries).

My question for the net is what other solutions to this problem have
been developed?  Our own requirements are that the routines be
implementable on a wide variety of machines (suns, SGIs, Crays,
Connection Machine, Intel Hypercube, etc) in both C and Fortran, and
that they be very fast.  They should also probably include or be usable
with a data format library like netCDF to allow self-describing
datasets.  Any thoughts or suggestions?

Eric Jackson
ej@acm.princeton.edu
Princeton University

***********************************************************************
CODE FRAGMENT:

      open(unit=12,form='unformatted',access='direct',
     $     recl=100,status='new',file='fname')
      do 1 i=1,100
         arr1(i) = float(i)
 1    continue
c     now write it out 10 times
      do 2 k=1,10
         write(12,rec=k)(arr1(i),i=1,100)
 2    continue
***********************************************************************
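
[Editor's sketch] The post mentions a C filter that throws out the
record separators but does not show it. The following is a minimal
sketch of what such a filter might look like, not the author's code.
It assumes each control word is a 4-byte byte count in the host's byte
order, written once before and once after each record. Both the word
size and the byte order are assumptions that vary by compiler. The
name strip_record_markers is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Copy the payload of each Fortran sequential unformatted record from
 * `in` to `out`, dropping the leading and trailing control words.
 * ASSUMPTION: each control word is a 4-byte count in host byte order.
 * Returns the number of records copied, or -1 on malformed input or a
 * write error. */
long strip_record_markers(FILE *in, FILE *out)
{
    uint32_t head, tail;
    char buf[4096];
    long nrec = 0;

    assert(in != NULL && out != NULL);

    /* Stop when no further complete leading control word can be read. */
    while (fread(&head, sizeof head, 1, in) == 1) {
        uint32_t left = head;

        /* Copy the payload through a fixed-size buffer. */
        while (left > 0) {
            size_t want = left < sizeof buf ? left : sizeof buf;
            size_t got = fread(buf, 1, want, in);
            if (got != want)
                return -1;                 /* truncated payload */
            if (fwrite(buf, 1, got, out) != got)
                return -1;                 /* write error */
            left -= (uint32_t)got;
        }

        /* The trailing control word must repeat the leading one. */
        if (fread(&tail, sizeof tail, 1, in) != 1 || tail != head)
            return -1;
        nrec++;
    }
    return nrec;
}
```

Called as strip_record_markers(stdin, stdout) from a small main, this
would act as a pipe filter. A file written on a machine with a
different byte order or control word size would need the count
converted first.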
calvin@dinkum.sgi.com (Calvin H. Vu) (03/14/90)
In article <14439@phoenix.Princeton.EDU> ej@gauss.Princeton.EDU (Eric Jackson) writes:
>In article <14409@phoenix.Princeton.EDU> ej@acm.Princeton.EDU I wrote:
>>When an unformatted write is done...two control words are written at the
>>beginning and end of the write with the number of bytes written...
>>...Does anyone know how to suppress the writing of these control words?
>As pointed out by John Fisher and Dan Whipple, writing without record
>separators can be accomplished using direct access i/o.  A further
>portability problem does arise here, however: a record length must be
>given in the open statement, but whether the length is given in words
>or bytes depends on the implementation (bytes on a sun, words on sgi).
>Anyway, there is a code fragment at the end of this post illustrating
>the solution on a Silicon Graphics.

Unless all of your records are of comparable sizes this solution can be
just as bad as using ascii data, since all records have to be
zero-padded to the length of the longest record.

On SGI 3000 series machines you can do exactly what you want by opening
the file with FORM='BINARY'.  On the SGI 4D machines, in the next
release of the compiler, you can do the same thing by opening the file
as an unformatted direct access file with record length equal to
1 _byte_, e.g.:

      OPEN(999, file='binarysystemfile', access='direct',recl=1,
     $     form='unformatted')

and then do sequential I/O on the file.  In this way, the file will be
treated like an ordinary system file (as a byte string in which each
byte is addressable).  A READ or WRITE request on such files will
consume bytes until satisfied, rather than restricting itself to a
single record.

One correction on your statement above: on SGI 4D machines we change
the record length interpretation from byte to word in the 3.2 release
to conform to the ANSI standard.  However, if you prefer to have your
record length interpreted as bytes you can do that by compiling your
program with the -old_rl option.

>Eric Jackson
>ej@acm.princeton.edu
>Princeton University

- calvin
--------------------------------------------------------------------------
Calvin H. Vu                       | "We are each of us angels with
Silicon Graphics Computer Systems  |  only one wing.  And we can only
calvin@sgi.com   (415) 962-3679    |  fly embracing each other."
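
[Editor's sketch] The two points above, padding to the longest record
and RECL being counted in bytes or in words, come down to simple
arithmetic. The C helpers below are an illustration only and belong to
no compiler or library. Their names are hypothetical, and the 4-byte
word used in the usage note is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by a direct-access file holding these records:
 * every record takes the space of the longest one. */
size_t padded_bytes(const size_t *rec_bytes, size_t nrec)
{
    size_t longest = 0;
    for (size_t i = 0; i < nrec; i++)
        if (rec_bytes[i] > longest)
            longest = rec_bytes[i];
    return longest * nrec;
}

/* Bytes of actual data, i.e. the size of a byte-stream file with no
 * padding and no control words. */
size_t payload_bytes(const size_t *rec_bytes, size_t nrec)
{
    size_t total = 0;
    for (size_t i = 0; i < nrec; i++)
        total += rec_bytes[i];
    return total;
}

/* RECL value for a record of rec_bytes bytes when the compiler counts
 * RECL in units of unit_bytes (1 for bytes; 4 for words, ASSUMING a
 * 4-byte word). Rounds up to a whole number of units. */
size_t recl_for(size_t rec_bytes, size_t unit_bytes)
{
    assert(unit_bytes > 0);
    return (rec_bytes + unit_bytes - 1) / unit_bytes;
}
```

For records of 400, 40 and 4000 bytes, padded_bytes gives 12000 while
payload_bytes gives 4440. For a record of 100 4-byte reals,
recl_for(400, 4) is 100 and recl_for(400, 1) is 400, which matches
recl=100 in the fragment written for a compiler that counts in words.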