[comp.arch] Async system interface

bruce@cs.su.oz (Bruce Janson) (02/06/91)

In article <13772@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>.. The I/O call (below, RW
>stands for _either_ the read or the write system call) should be
>something like the following:
>
>RW(fd,nw,buf,da,aflag,handle)
>..
>da - logical disk address (relative to beginning of file) for I/O
>   transfer.  lseek() is redundant and in this model would disappear.
>   If da is -1, transfer is sequential from previous request.
>..
>handle - this is a pointer to an I/O completion routine in the user's
>   process.  This routine is called when the I/O requested has been
>   done.  If this pointer is null, no user function is called. (This
>   routine is called even if synchronous I/O was requested.)
>..

Jim,
    Above, you have outlined part of an async I/O interface.  But there
are some issues that I am still not clear about that perhaps you might
like to clear up.
    When you say "transfer is sequential from previous request" do you
mean the previously issued request or the previously completed request?
More generally, in what context does the I/O get done?
In what context does the I/O completion function run?  Are the machine
operations of either the I/O copy or the completion function interleaved with
those of the rest of the program or is the "rest of the program"
suspended while the I/O and its completion routine run?
Are the contents of the area of memory pointed to by "buf" updated
atomically with respect to the process?
Is the I/O completion routine called with any arguments?
Can the I/O completion routine call sleep()?  Where does a longjmp()
take me from within such an I/O completion routine?  What does a setjmp()
do when executed from within such an I/O completion routine?  How about calls
to the signal() family of routines and can the I/O completion routine
be interrupted by an incoming signal?  What exits when I call exit()
from within an I/O completion routine?  What happens when the main
program calls exit() while there are still I/O requests outstanding?
What happens if my process calls close() on an fd that is still
associated with outstanding async I/O?
How many concurrent outstanding I/O requests can my process have at
the one time and can I prioritise and/or schedule them?
Can a process enquire as to the status of a particular
outstanding I/O request and if so, what naming scheme should it use
to refer to the I/O request of interest?  How do I terminate an
outstanding I/O request prematurely?  Can an I/O completion routine
make further async I/O calls itself?
    It is possible to devise answers for all of these questions.
And async I/O does exist in various operating systems so answers must
have been devised.  However, after considering these questions I might
conclude that async I/O would need to bring with it some very
significant advantages to offset what appears to be a more complex model.

Cheers,
bruce.

Bruce Janson
Basser Department of Computer Science
University of Sydney
Sydney, N.S.W., 2006
AUSTRALIA

Internet:	bruce@cs.su.oz.au
Telephone:	+61-2-692-3272
Fax:		+61-2-692-3838

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (02/07/91)

In article <13772@lanl.gov> jlg@lanl.gov (Jim Giles) writes:

| The fact that UNIX is badly designed, doesn't rule out the possibility
| of other bad systems.  You should never be forced to do a wait system
| call if what you want is synchronous I/O.  

The solutions are either a WAIT call, a separate SIO call for blocking
i/o, or a flag to be supplied on each and every i/o system call. The
systems I used all had the first, some had the second. Using a flag adds
size to the calling program, CPU overhead to set and test the flag, and
appears at first glance to have no advantages over the others. The
separate blocking i/o system call is probably lowest overhead.
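
A minimal sketch of the three styles listed above, assuming a toy request
record in which a memcpy() stands in for the device transfer.  Every name
here (io_req, start_io, io_wait, sio_read, rw, IO_SYNC, IO_ASYNC) is
hypothetical and belongs to no real system; the point is only to show where
the extra flag test sits in the third style.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request record; memcpy() stands in for the device transfer. */
typedef struct {
    const char *src;   /* data the simulated device will deliver */
    char *dst;         /* caller's buffer */
    size_t len;        /* bytes requested */
    int done;          /* 0 while pending, 1 once complete */
} io_req;

/* Asynchronous primitive: record the request and return immediately. */
void start_io(io_req *r, const char *src, char *dst, size_t len)
{
    assert(r != NULL);
    r->src = src;
    r->dst = dst;
    r->len = len;
    r->done = 0;
}

/* Style 1: a separate WAIT call.  In this toy model "waiting" simply
   performs the copy that a real device would have done in the background. */
void io_wait(io_req *r)
{
    assert(r != NULL);
    if (!r->done) {
        memcpy(r->dst, r->src, r->len);
        r->done = 1;
    }
}

/* Style 2: a separate blocking call (the "SIO" call). */
void sio_read(const char *src, char *dst, size_t len)
{
    io_req r;
    start_io(&r, src, dst, len);
    io_wait(&r);
}

/* Style 3: one call, with a flag supplied on every request. */
enum { IO_SYNC = 0, IO_ASYNC = 1 };

void rw(io_req *r, const char *src, char *dst, size_t len, int aflag)
{
    start_io(r, src, dst, len);
    if (aflag == IO_SYNC)      /* the per-call flag test */
        io_wait(r);
}
```

With IO_ASYNC the caller must still call io_wait() before touching the
buffer; with IO_SYNC or sio_read() the data is there on return.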
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  "I'll come home in one of two ways, the big parade or in a body bag.
   I prefer the former but I'll take the latter" -Sgt Marco Rodrigez

barmar@think.com (Barry Margolin) (02/07/91)

In article <3181@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>In article <13772@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>| The fact that UNIX is badly designed, doesn't rule out the possibility
>| of other bad systems.  You should never be forced to do a wait system
>| call if what you want is synchronous I/O.  
>The solutions are either a WAIT call, a separate SIO call for blocking
>i/o, or a flag to be supplied on each and every i/o system call.

Another common solution is a settable mode.

On the system I'm most familiar with (Multics), all I/O system calls are
asynchronous (the only time a process ever blocks in the kernel is when it
takes a page fault during a system call).  However, the user-ring I/O
library, which implements the device-independent equivalent to Unix's
read() and write(), normally hides the asynchrony, by internally performing
the wait.  Applications can get an asynchronous interface by using an ioctl
to set non-blocking mode (for devices where it makes sense -- there is no
asynchronous interface to the file system, because it is implemented using
files mapped into virtual memory, and page faults are implemented
synchronously within a process).  The blocking mechanism is based on
general interprocess communications channels, and a process can either
perform a wait call on a channel or request that a signal be generated when
data is written to the channel (this is similar to Unix's select() and
SIGIO, but more general because it can be used independently of I/O).
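
As a rough Unix analogue of the settable mode described above (not the
Multics interface itself), the sketch below puts a descriptor into
non-blocking mode with fcntl() and then shows a library-level read that
hides the asynchrony by waiting internally with poll().  The function names
set_nonblocking and hidden_wait_read are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Library-level read that hides the asynchrony: if the non-blocking read
   reports that no data is ready, wait for readability and try again. */
ssize_t hidden_wait_read(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                      /* data, or 0 at end of file */
        if (errno == EINTR)
            continue;                      /* interrupted: just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                     /* a real error */

        struct pollfd p = { .fd = fd, .events = POLLIN };
        if (poll(&p, 1, -1) == -1 && errno != EINTR)
            return -1;
    }
}
```

An application that wants the asynchronous interface calls read() on the
non-blocking descriptor directly and handles EAGAIN itself; one that does
not care calls hidden_wait_read() and never sees it.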

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

Bruce.Hoult@bbs.actrix.gen.nz (02/07/91)

Jim Giles writes:
>You should never be forced to do a wait system
>call if what you want is synchronous I/O.  The I/O call (below, RW
>stands for _either_ the read or the write system call) should be
>something like the following:
> 
>RW(fd,nw,buf,da,aflag,handle)

[rest of description deleted]

Once again, that is almost *exactly* the way the Mac does it.  The main
difference is that the Mac splits out your "da" parameter into two
parameters: an offset (+ve, -ve or zero), and a positioning mode which can
be "from start of file", "from where the last transfer finished", or "from
end of file".  Also, your "aflag" parameter is split into two: an actual
async flag, and a result code that acts in almost exactly the way you
indicate -- it is +ve during the i/o, zero on success, and has a -ve error
number on failure.
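
A small sketch of that split, assuming a simplified parameter block.  The
type and field names (param_block, pos_mode, resolve_position, io_pending)
are invented for illustration and are not the Mac's actual names; only the
three positioning modes and the sign convention of the result code come
from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Positioning modes as described: start of file, where the last
   transfer finished (the "mark"), or end of file. */
typedef enum { POS_FROM_START, POS_FROM_MARK, POS_FROM_EOF } pos_mode;

typedef struct {
    pos_mode mode;        /* how to interpret offset */
    long offset;          /* may be +ve, -ve or zero */
    int async;            /* nonzero: return before the transfer finishes */
    volatile int result;  /* >0 while in progress, 0 on success, <0 error */
} param_block;

/* Turn (mode, offset) into an absolute file position.
   Returns -1 if the position falls outside 0..eof. */
long resolve_position(const param_block *pb, long mark, long eof)
{
    long base = 0;

    assert(pb != NULL);
    switch (pb->mode) {
    case POS_FROM_START: base = 0;    break;
    case POS_FROM_MARK:  base = mark; break;
    case POS_FROM_EOF:   base = eof;  break;
    }

    long pos = base + pb->offset;
    return (pos < 0 || pos > eof) ? -1 : pos;
}

/* A caller can poll the result code instead of supplying a handler. */
int io_pending(const param_block *pb)
{
    assert(pb != NULL);
    return pb->result > 0;
}
```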
-- 
Bruce.Hoult@bbs.actrix.gen.nz   Twisted pair: +64 4 772 116
BIX: brucehoult                 Last Resort:  PO Box 4145 Wellington, NZ
"And they shall beat their swords into plowshares, for if you hit a man
with a plowshare, he's going to know he's been hit."

gsarff@meph.UUCP (Gary Sarff) (02/12/91)

In article <1995@cluster.cs.su.oz.au>, bruce@cs.su.oz (Bruce Janson) writes:
>In article <13772@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>>.. The I/O call (below, RW
>>stands for _either_ the read or the write system call) should be
>>something like the following:
>>
>>RW(fd,nw,buf,da,aflag,handle)
>>..
>>da - logical disk address (relative to beginning of file) for I/O
>>   transfer.  lseek() is redundant and in this model would disappear.
>>   If da is -1, transfer is sequential from previous request.
>>..
>
>Jim,
>    Above, you have outlined part of an async I/O interface.  But there
>are some issues that I am still not clear about that perhaps you might
>like to clear up.

I am now keeper of the sources of a multiuser/multitasking OS at work that has
async I/O, so, as Bruce says at the end of his posting, some of
these questions have been answered in different ways at different
places and times.  Some of his questions, all of which are valid, I cannot
answer because we do not have callback functions when I/O is complete.
I'll take a shot at the rest.

>    When you say "transfer is sequential from previous request" do you
>mean the previously issued request or the previously completed request?

Sequential from the last completed request.
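
The difference between the two readings can be shown with a toy model,
assuming a per-file record that tracks both the end of the last issued
request and the end of the last completed one.  All names (file_marks,
seq_policy, issue, complete) are hypothetical, and the model says nothing
about how the real system handles overlapping requests.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long issued_end;     /* end offset of the most recently issued request */
    long completed_end;  /* end offset of the most recently completed request */
} file_marks;

enum seq_policy { SEQ_FROM_ISSUED, SEQ_FROM_COMPLETED };

/* Issue a request of nbytes at disk address da; da == -1 means
   "sequential from the previous request" under the given policy.
   Returns the offset the request actually starts at. */
long issue(file_marks *f, long da, long nbytes, enum seq_policy policy)
{
    assert(f != NULL && nbytes >= 0);
    if (da == -1)
        da = (policy == SEQ_FROM_ISSUED) ? f->issued_end : f->completed_end;
    f->issued_end = da + nbytes;
    return da;
}

/* Called when the request that started at 'start' finishes. */
void complete(file_marks *f, long start, long nbytes)
{
    assert(f != NULL);
    f->completed_end = start + nbytes;
}
```

In this model, two sequential requests issued back to back before the first
completes land at different offsets under SEQ_FROM_ISSUED but at the same
offset under SEQ_FROM_COMPLETED, so a caller relying on the latter has to
wait for each completion or pass explicit addresses.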

>More generally, in what context does the I/O get done?

The I/O request is, of course, queued to the driver in the requesting process'
context (its thread of execution runs through the driver).

>In what context does the I/O completion function run?  Are the machine
>operations of either the I/O copy or the completion function interleaved with
>those of the rest of the program or is the "rest of the program"
>suspended while the I/O and its completion routine run?

The rest of the program continues to execute, or remains in an executable
state, while the I/O is happening; it is not suspended, as I understand
your question.

>Are the contents of the area of memory pointed to by "buf" updated
>atomically with respect to the process?

Yes.  In fact, I will mention that we place some restrictions on async I/O
requests that we do not place on synchronous I/O.  To wit, the I/O
request must begin on a sector boundary, and the destination buffer must be
wholly contained within one 4K page of memory, i.e., it cannot cross a page
boundary.  These requests are limited in this fashion mainly for simplicity
of implementation, and secondly for efficiency's sake.  The buffer alignment
condition can be easily satisfied because the OS has a system call that will
allocate any number of 4K pages into a process' logical address space at
whatever address the process requests (as long, of course, as there is not
some memory already allocated there).  All this is of course because we DMA
the disk request.  The physical addresses of the requesting process' buffer
pages are resolved from their logical addresses at the time the request is
made, and those pages are marked as unswappable until the I/O completes or
the timeout expires.  Unlike UNIX, where a process makes an I/O request and
then sets an alarm if needed, we incorporate a timeout value (in hundredths of a
second) into the I/O request itself.
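
The two restrictions can be expressed as a simple pre-check, sketched here
under some assumptions: the 4K page size comes from the description above,
but the 512-byte sector size, the request layout, and the names
(fastread_req, fastread_ok) are guesses made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IO_PAGE_SIZE   4096u  /* 4K pages, as stated above */
#define IO_SECTOR_SIZE 512u   /* assumption: the sector size is not given */

/* Hypothetical request layout. */
typedef struct {
    long file_offset;      /* where in the file the transfer begins */
    const void *buf;       /* destination buffer */
    size_t len;            /* bytes to transfer */
    unsigned timeout_cs;   /* timeout in hundredths of a second */
} fastread_req;

/* Returns 1 if the request satisfies both restrictions, 0 otherwise. */
int fastread_ok(const fastread_req *r)
{
    assert(r != NULL);

    if (r->len == 0 || r->len > IO_PAGE_SIZE)
        return 0;

    /* The transfer must begin on a sector boundary. */
    if (r->file_offset < 0 || r->file_offset % IO_SECTOR_SIZE != 0)
        return 0;

    /* The buffer must lie wholly within one page: its first and last
       bytes must fall in the same 4K page. */
    uintptr_t first = (uintptr_t)r->buf;
    uintptr_t last  = first + r->len - 1;
    return first / IO_PAGE_SIZE == last / IO_PAGE_SIZE;
}
```

A page-aligned buffer of up to 4096 bytes always passes the second test,
which is why a page-granular allocation call makes the condition easy to
meet.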

>Is the I/O completion routine called with any arguments?
>Can the I/O completion routine call sleep()?  Where does a longjmp()
>take me from within such an I/O completion routine?  What does a setjmp()
>do when executed from within such an I/O completion routine?  How about calls
>to the signal() family of routines and can the I/O completion routine
>be interrupted by an incoming signal?  What exits when I call exit()
>from within an I/O completion routine?  What happens when the main
>program calls exit() while there are still I/O requests outstanding?

As I said, I cannot answer most of the above because we don't do user I/O
completion routines.  The I/O completion itself happens in an interrupt
routine, and the process that was interrupted can of course be any arbitrary
process on the system.  What happens in this routine also answers the last question in
the above paragraph.  When a program dies, of its own free will or because
another process or the OS kills it, all of its memory pages are deallocated
and returned to the system free memory list.  The pages that have Async
I/O's pending were marked as such in the memory control structures (a bit
was set), and those pages are not deallocated.  They are placed on a pending
deallocation list.  The interrupt routine that handles the I/O completion
clears this bit.  Periodically the OS scans the list and pages that have 
their bit cleared can be returned to the OS free memory list.  The list will
also be scavenged by the OS before it contemplates swapping of any process'
address space because of memory shortage.
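
The page handling just described can be sketched as three small routines,
assuming a flat array of page descriptors.  The structure and function
names (page, release_pages, io_complete, scavenge) are invented for this
sketch; the real control structures are not described in that detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor. */
typedef struct {
    int io_pending;       /* the "bit": set while async I/O targets the page */
    int on_pending_list;  /* page is waiting on the pending deallocation list */
    int is_free;          /* page is on the system free memory list */
} page;

/* Process death: free idle pages at once, defer pages with I/O pending. */
void release_pages(page *pages, size_t n)
{
    assert(pages != NULL);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].io_pending)
            pages[i].on_pending_list = 1;
        else
            pages[i].is_free = 1;
    }
}

/* I/O completion (interrupt level): all it does here is clear the bit. */
void io_complete(page *p)
{
    assert(p != NULL);
    p->io_pending = 0;
}

/* Periodic scan: move pages whose bit has been cleared to the free list.
   Returns how many pages were reclaimed on this pass. */
size_t scavenge(page *pages, size_t n)
{
    size_t reclaimed = 0;

    assert(pages != NULL);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].on_pending_list && !pages[i].io_pending) {
            pages[i].on_pending_list = 0;
            pages[i].is_free = 1;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

The point of the split is that the interrupt routine never frees memory
itself, so a DMA transfer can never land in a page that has already been
handed to another process.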

>What happens if my process calls close() on an fd that is still
>associated with outstanding async I/O?

The same thing as the previous paragraph regarding processes dying.

>How many concurrent outstanding I/O requests can my process have at
>the one time and can I prioritise and/or schedule them?

No limit except that imposed by your memory and the space the control
structures take up, which is only 20-30 bytes each; with many megabytes of
memory, that is a lot of I/O requests.

>Can a process enquire as to the status of a particular
>outstanding I/O request and if so, what naming scheme should it use
>to refer to the I/O request of interest?

Yes, you enquire by asking about the file handle (we call it a LUN, Logical
Unit Number) and the offset into that file that was given in the original
request.
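
A sketch of such a naming scheme, assuming a simple table of outstanding
requests keyed by (LUN, offset).  The table layout, the status values, and
the names (io_entry, io_status, io_query) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values for an outstanding request. */
typedef enum { IO_UNKNOWN = -1, IO_PENDING, IO_DONE, IO_TIMED_OUT } io_status;

typedef struct {
    int lun;           /* Logical Unit Number (file handle) */
    long offset;       /* file offset given in the original request */
    io_status status;
} io_entry;

/* Look a request up by the (LUN, offset) pair it was issued with.
   Returns IO_UNKNOWN if no such request is recorded. */
io_status io_query(const io_entry *table, size_t n, int lun, long offset)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].lun == lun && table[i].offset == offset)
            return table[i].status;
    }
    return IO_UNKNOWN;
}
```

In this sketch two requests on the same LUN at the same offset would be
indistinguishable; nothing above says how the real system treats that case.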

>How do I terminate an
>outstanding I/O request prematurely?

We currently do not implement this feature.

>Can an I/O completion routine
>make further async I/O calls itself?

No completion routines, but it is possible that the interrupt routine that
handles completed I/O could make additional requests, and often does, as this
is one part of the way we implement read-aheads.

>    It is possible to devise answers for all of these questions.
>And async I/O does exist in various operating systems so answers must
>have been devised.  However, after considering these questions I might
>conclude that async I/O would need to bring with it some very
>significant advantages to offset what appears to be a more complex model.

Our more limited model does not seem to impose significant disadvantages, or
unduly burden the system, in my view.  But, I am the last source keeper, not
the original designer, and though I have talked with him often, I do not know
all the historical reasons he may have had, behind all the different
features.  I know we use it in some tasks: process activation from the
image file on disk, for instance, is done using these async I/Os (we call
them fastreads), and some utilities and other application programs use it
too.  For instance, a BACKUP program, upon entering a new subdirectory, will
read the directory in one sweep, generate open and fastread requests for all the
files that match its file specifications, and then start backing up the first
file to tape.  By the time it gets to subsequent files, it is hoped that some
of them have already been read into memory buffers and can now also be backed
up to tape. It helps keep the tape at streaming speed, at any rate.