garry@batcomputer.tn.cornell.edu (Garry Wiegand) (12/18/86)
I have an application which entails computing and churning out vast
quantities of data and, for speed, I'd like to have the I/O happening
in parallel with the computing. After reading the BSD and SysV manuals,
I'm puzzled: does the system give us *any way* to do asynchronous output?

I've thought of writing a (presumably buffered) pipe to "cat" and thence to
my device. Is there anything else?

thanks -

[apologies if an incorrect posting on this (I just cancelled it) sneaks through.]

garry wiegand (garry%cadif-oak@cu-arpa.cs.cornell.edu)
ggs@ulysses.homer.nj.att.com (Griff Smith) (12/18/86)
In article <1858@batcomputer.tn.cornell.edu>, garry@batcomputer.UUCP writes:
> I have an application which entails computing and churning out vast
> quantities of data and, for speed, I'd like to have the I/O happening
> in parallel with the computing. After reading the BSD and SysV manuals,
> I'm puzzled: does the system give us *any way* to do asynchronous output?
>
> I've thought of writing a (presumably buffered) pipe to "cat" and thence to
> my device. Is there anything else?
>
> garry wiegand (garry%cadif-oak@cu-arpa.cs.cornell.edu)

It depends on who else is using the computer, where you are putting your
data, etc. If you are writing to a disk file, don't worry; the disk system
is buffered internally. If you are writing to a mag tape and you are the
only one on the system, you might want to fork a tape-writing process. If
you are using System V, you could even use shared memory to pass data to
the process without writing or copying it. If shared memory isn't
available, try a pipe.

Remember, though, that pipes aren't free. The time it takes for one process
to write the pipe and another to read it may be larger than the time spent
waiting for the tape to spin. If the system is heavily loaded, complicated
multi-process buffering schemes may slow you down; the system will just run
something simpler instead. Your best strategy is usually to minimize your
own CPU requirements so you make the best use of the fraction of the CPU
allocated to you.

Of course, if you have the machine to yourself, the only thing that matters
is elapsed time, not CPU time. In that case, you might consider running
something else useful to soak up the idle time. You still won't get the
primary job done any faster, but the total productivity of the system
should improve.

After an experience I had about 7 years ago, I am not very enthusiastic
about asynchronous I/O in time-sharing systems. We were using TOPS-10,
which had a buffering scheme for asynchronous I/O.
After having many problems in the kernel caused by cache maintenance errors during asynchronous I/O, we defeated the asynchronous feature. The system itself was still highly asynchronous, but not within processes. Users didn't know the difference but the system did: improved system stability, better system throughput, faster disk performance, less disk fragmentation.
philip@axis.UUCP (Philip Peake) (12/20/86)
In article <1858@batcomputer.tn.cornell.edu> garry%cadif-oak@cu-arpa.cs.cornell.edu writes:
>I have an application which entails computing and churning out vast
>quantities of data and, for speed, I'd like to have the I/O happening
>in parallel with the computing. After reading the BSD and SysV manuals,
>I'm puzzled: does the system give us *any way* to do asynchronous output?
>
>I've thought of writing a (presumably buffered) pipe to "cat" and thence to
>my device. Is there anything else?

You don't seem to have read your manuals too well. ALL (normal) I/O
activity under UNIX is asynchronous. When you do a write(), the data is
copied from the data area of your program into a buffer (or clist
structure) in the kernel data space. The write() then returns. The buffered
data is output either by a DMA transfer or by interrupt-driven routines
within the kernel, depending upon the device to which you are writing.

Philip
mangler@cit-vax.Caltech.Edu (System Mangler) (12/20/86)
In article <1858@batcomputer.tn.cornell.edu>, garry@batcomputer.UUCP writes:
> I have an application which entails computing and churning out vast
> quantities of data and, for speed, I'd like to have the I/O happening
> in parallel with the computing.

Many computations go through a "read the data, crunch it, output it" cycle
in which the crunching of one block is independent of the crunching of
another. If that's your case, use three processes. At any given time, one
is reading, another is crunching what it just read, and one is writing.
You'll get a fairly continuous flow of data, won't have to pass around
large volumes of data (i.e. low overhead), and get a large share of the
CPU.

One curious gotcha is that throughput will be substantially worse if you
run it "nice --20". Explain that, BSD scheduler wizards! (I noticed this on
4.3bsd dump, which works in precisely this fashion.)

Don Speck   speck@vlsi.caltech.edu   {seismo,rutgers,ames}!cit-vax!speck