[comp.lang.fortran] SGI comments?

willis@cs.athabascau.ca (Tony Willis) (03/02/91)

My organization is considering the purchase of an SGI 340/380
combo for heavy duty number crunching. We're pretty pleased with
what we've seen and heard so far but I have one nagging
doubt. Most of our heavy-duty applications are parallelizable
to a large degree (at least conceptually!) and the benchmarks
run for us by SGI show that we can indeed get really great
throughput in the compute intensive parts of our programs.
However, I note that the FORTRAN compiler itself will not
handle parallel I/O. This means that when we have 4 or 5 users
reading/writing big 1024 x 1024 real*4 images to/from disk
the 380 essentially operates in a linear mode, i.e. if one
job takes 2 minutes, 4 jobs running simultaneously take
about 8 or more minutes.

Now the local SGI sales rep did send me a copy of the latest
SGI FORTRAN manual. I noted in the RELEASE notes that it
said that the SGI C compiler could do I/O in parallel and
that FORTRAN users could do this by making calls to a
C subroutine. So my question is: is this really possible?

Most of my FORTRAN I/O comes from reading/writing
direct access files more or less like

	DO 300 I = 1, 1024
		WRITE(UNIT=12,REC=I)(DATA(J,I),J=1,1024)
300	CONTINUE

Could I replace this sort of thing by calls to C and
speed up the I/O, particularly in multi-user mode? (Come on
you SGI guys in Mountain View - help make a sale!!)

Also I'd be interested in any general comments from SGI users
about their satisfaction or otherwise with SGI products.

Please e-mail to me directly at

twillis@drao.nrc.ca

Thanks,
Tony Willis

bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (03/05/91)

In article <958@vax.cs.athabascau.ca>, willis@cs.athabascau.ca (Tony Willis) writes:
> My organization is considering the purchase of an SGI 340/380
> combo for heavy duty number crunching. We're pretty pleased with
> what we've seen and heard so far but I have one nagging
> doubt. Most of our heavy-duty applications are parallelizable
> to a large degree (at least conceptually!) and the benchmarks
> run for us by SGI show that we can indeed get really great
> throughput in the compute intensive parts of our programs.
> However, I note that the FORTRAN compiler itself will not
> handle parallel I/O. This means that when we have 4 or 5 users
> reading/writing big 1024 x 1024 real*4 images to/from disk
> the 380 essentially operates in a linear mode, i.e. if one
> job takes 2 minutes, 4 jobs running simultaneously take
> about 8 or more minutes.

If I read this right, then no.  If your users are running *separate* jobs,
then all those jobs will run just fine in parallel.  What the FORTRAN manual
is trying to say is that if you have a *single* job that uses multiple
threads in parallel, the FORTRAN I/O library does not allow I/O parallelism
within the *same* job (in general, only one thread of the job can
do I/O).  Multiple independent jobs do not have this problem.  (Of course,
if you're I/O bound and don't have separate disks/controllers you may not
get good speedup because your jobs may linearize waiting for the disk,
but any system has that problem.)


> Now the local SGI sales rep did send me a copy of the latest
> SGI FORTRAN manual. I noted in the RELEASE notes that it
> said that the SGI C compiler could do I/O in parallel and
> that FORTRAN users could do this by making calls to a
> C subroutine. So my question is: is this really possible?
> 
> Most of my FORTRAN I/O comes from reading/writing
> direct access files more or less like
> 
> 	DO 300 I = 1, 1024
> 		WRITE(UNIT=12,REC=I)(DATA(J,I),J=1,1024)
> 300	CONTINUE
> 
> Could I replace this sort of thing by calls to C and
> speed up the I/O, particularly in multi-user mode? (Come on
> you SGI guys in Mountain View - help make a sale!!)
> 

The meaning of what I said above is that you can NOT do

C$DOACROSS
 	DO 300 I = 1, 1024
 		WRITE(UNIT=12,REC=I)(DATA(J,I),J=1,1024)
 300	CONTINUE

and expect it to work correctly (it won't!).  You CAN have 2 or more
separate jobs doing this at the same time.

The C I/O library is semaphored to allow multiple threads to use it.
If you want to call this from a Fortran program, you need to add the
statement:
	call usconfig(CONF_STHREADIOON)
since in FORTRAN the semaphoring is off by default.
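
For readers who want to see what "several threads of one job doing I/O"
looks like in code, here is a minimal sketch in present-day POSIX C. It is
not the IRIX usconfig mechanism described above: it assumes POSIX threads
and pwrite(), which takes an explicit file offset so the threads never
share a file position. The names write_range and parallel_write, the
thread count, and the array sizes are all made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* ask for pwrite() and POSIX threads */
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sizes: 1024 records of 1024 real*4 values, 4 threads. */
enum { NREC = 1024, NELEM = 1024, NTHREADS = 4 };

struct job {
    int fd;             /* descriptor shared by all threads */
    const float *data;  /* whole image, stored like DATA(J,I) */
    int first, last;    /* 0-based record range [first, last) */
    int ok;             /* set to 0 if any write fails */
};

/* Thread body: write this thread's records at their own offsets. */
static void *write_range(void *arg)
{
    struct job *j = arg;
    const size_t reclen = NELEM * sizeof(float);

    j->ok = 1;
    for (int r = j->first; r < j->last; r++) {
        const float *rec = j->data + (size_t)r * NELEM;
        /* A short write is treated as a failure in this sketch. */
        if (pwrite(j->fd, rec, reclen, (off_t)r * (off_t)reclen)
                != (ssize_t)reclen)
            j->ok = 0;
    }
    return NULL;
}

/* Split the records evenly over NTHREADS threads.  Returns 0 or -1. */
static int parallel_write(const char *path, const float *data)
{
    assert(data != NULL);
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    pthread_t tid[NTHREADS];
    struct job jobs[NTHREADS];
    int started = 0, ok = 1;

    for (int t = 0; t < NTHREADS; t++) {
        jobs[t] = (struct job){ fd, data, t * NREC / NTHREADS,
                                (t + 1) * NREC / NTHREADS, 0 };
        if (pthread_create(&tid[t], NULL, write_range, &jobs[t]) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
        ok = ok && jobs[t].ok;
    }
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Whether this is any faster than a single writer depends on the disks and
controllers underneath; the sketch only shows that the writes can be
issued concurrently without corrupting each other.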
 
The use of direct access files complicates interfacing to C a bit.
You'd need to know the format of records on the disk, which sadly
I do not.  I'll ask my co-worker who knows this stuff to comment.

--
Bron Campbell Nelson
bron@sgi.com  or possibly  ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.

bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (03/05/91)

In article <88675@sgi.sgi.com>, bron@bronze.wpd.sgi.com (Bron Campbell Nelson) writes:
> In article <958@vax.cs.athabascau.ca>, willis@cs.athabascau.ca (Tony Willis) writes:
> > Most of my FORTRAN I/O comes from reading/writing
> > direct access files more or less like
> > 
> > 	DO 300 I = 1, 1024
> > 		WRITE(UNIT=12,REC=I)(DATA(J,I),J=1,1024)
> > 300	CONTINUE
> > 
> > Could I replace this sort of thing by calls to C and
> > speed up the I/O, particularly in multi-user mode? (Come on
> > you SGI guys in Mountain View - help make a sale!!)
> > 
> 
> The use of direct access files complicates interfacing to C a bit.
> You'd need to know the format of records on the disk, which sadly
> I do not.  I'll ask my co-worker who knows this stuff to comment.
> 

I talked to the man with the answers, and he tells me that direct
access files are in fact laid out in the simple straightforward
manner: each record has RECORD-LENGTH bytes in it, and record number
one begins at byte zero in the file.  If I understand the above loop
correctly, we could interface directly to the "write" system call
and replace the loop with:
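	istatus = write(fd, data, 1024*recordLength)
where fd is the UNIX file descriptor for the file (not the FORTRAN unit
number 12) and recordLength is the record length in bytes.  That makes
one call for all 1024 records, which should speed up this I/O operation
by a LOT.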
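
To make the layout concrete, here is a minimal C sketch of the idea. It
assumes the layout described above (fixed-length records, record one at
byte zero) and the 1024 x 1024 real*4 image from the original question, so
each record is 4096 bytes and the whole image is 1024 such records. The
names record_offset and write_image are invented for illustration, and the
glue needed to call this from FORTRAN (name mangling, passing arguments by
reference) depends on the compiler and is not shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum { NX = 1024, NY = 1024 };               /* image size from the post */
#define RECLEN ((size_t)NX * sizeof(float))  /* bytes in one record */

/* Byte offset of 1-based record `rec`: record 1 begins at byte zero. */
static off_t record_offset(long rec)
{
    assert(rec >= 1);
    return (off_t)(rec - 1) * (off_t)RECLEN;
}

/*
 * Write all NY records in one go.  FORTRAN stores DATA(J,I) with J
 * varying fastest, so the array in memory is already records 1..NY
 * back to back.  Returns 0 on success, -1 on error.
 */
static int write_image(const char *path, const float *data)
{
    /* write() needs a file descriptor, not a FORTRAN unit number. */
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    const char *p = (const char *)data;
    size_t left = (size_t)NY * RECLEN;

    /* write() may transfer less than requested, so loop until done. */
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return close(fd);
}
```

With this layout, record I of the direct access file sits at
record_offset(I), so a single record could also be rewritten in place by
seeking there first.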


--
Bron Campbell Nelson
bron@sgi.com  or possibly  ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.