glenn@ll-xn.ARPA (Glenn Adams) (11/21/85)
> As more and more mainframe systems are moving to UNIX, I am
> very interested in finding out how asynch I/O is being implemented
> on these systems.

This was one of the first complaints I had about UNIX after having used operating systems which imposed fewer constraints on how user processes performed synchronization on I/O completion. There are a few things worth contemplating here: firstly, why there is no asynchronous I/O mechanism in UNIX, and secondly, how such a mechanism might be implemented.

Before getting down to details, it should be pointed out that there are other, conceptually cleaner methods for performing overlapped I/O, namely using multiple processes, each of which has no more than one outstanding (blocking) I/O request. This form of overlapped I/O results in a conceptually straightforward implementation but is costly in terms of efficiency. This is especially true due to the hard boundaries maintained between process address spaces and the lack of a shared memory mechanism (non-SYSV). In addition, the overhead from context switching contributes to an overall inefficiency. Moreover, the current mechanisms in UNIX for interprocess communication, e.g., pipes, sockets, or files, all result in the copying of data to and from the kernel address space as it is being transferred to the destination process. This introduces more inefficiencies.

There is, however, the select() system call in 4.[23]BSD UNIX, which allows a timed blocking poll of multiple potentially outstanding I/O activities. This is many times more efficient than the previous busy-wait polling method, which used the FIONREAD ioctl() and was in any case usable only for a limited number of I/O activities, e.g., read(). Given these various mechanisms for performing multiple I/O activities, most applications have chosen to make do with them rather than address the more difficult task of implementing a more general kernel-based asynchronous I/O mechanism.
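As a rough illustration of the timed blocking poll that select() provides, the sketch below waits on several descriptors at once. The helper name wait_readable() and its return conventions are assumptions made for this example, and it is written against present-day POSIX headers rather than the 4.2BSD ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until one of fds[0..n-1] is readable or
 * timeout_ms milliseconds elapse.  Returns the index of the first
 * readable descriptor, -1 on timeout, or -2 if select() fails.
 */
int wait_readable(const int fds[], int n, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int i, ready, maxfd = -1;

    assert(n > 0);

    /* Build the set of descriptors to watch and find the largest one. */
    FD_ZERO(&readfds);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* One system call covers all descriptors; no busy-wait loop needed. */
    ready = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            return i;
    return -1;
}
```

A caller would pass the read ends of its pipes, sockets, or terminals and then issue an ordinary read() on whichever descriptor is reported ready. The I/O itself is still synchronous; select() only tells the process where a read() would not block.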
My efforts originated while implementing an I/O intensive signal processing application under RSX-11M/S using Whitesmiths C. My first job was to throw out the junk Whitesmiths called a standard I/O library and make it more similar to UNIX V7. I actually used the 4.2BSD stdio with the addition of most V7 system calls, which were mapped to RSX Executive Calls of one sort or another. Since, for this application, I had strong need of the efficiency of asynchronous I/O, I needed some UNIX-like mechanism for implementing it.

What I ended up with is as follows: prior to performing an I/O operation, e.g., read(), write(), or ioctl(), an fcntl() call is performed with a command argument of F_ASYNC and an argument which points to an Asynchronous Control Block. This argument structure contains the address of the asynchronous I/O handler and an optional argument to be passed to the handler. The optional argument is used to communicate application-specific information to the handler about the subsequent I/O activity. The handler is invoked upon I/O completion as follows:

    (*handler)(status, opt-argument);

Thus the status code indicating the success or failure of the I/O activity is communicated along with the optional argument specified in the Asynchronous Control Block.

It may be argued that a cleaner mechanism could be implemented, especially since this calls for two stages, i.e., arming and execution phases. However, I felt that it was better to do it this way than to add another argument to all I/O related system calls, or even worse, to add yet more system calls. From the application programmer's perspective, this mechanism is quite simple to use and builds upon existing system calls. The semantics of handler invocation are quite simple and result in a clean interface with minimal global data communication. Furthermore, since this mechanism enables asynchronous notification on a per-descriptor basis, it is possible to have outstanding I/O on multiple descriptors.
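A minimal user-space sketch of the arming and completion phases might look like the following. The names struct acb, arm_async(), complete_io() and record_status() are hypothetical; arm_async() merely stands in for the fcntl(fd, F_ASYNC, ...) call, and as a simplification the table holds only one armed block per descriptor. None of this is the original RSX code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FD 64   /* arbitrary table size for this sketch */

/* Assumed layout of the Asynchronous Control Block: a completion
 * handler plus the optional argument handed back to it. */
struct acb {
    void (*handler)(int status, void *arg);
    void *arg;
};

/* One armed control block per descriptor; a NULL handler means "not armed". */
static struct acb armed[MAX_FD];

/* Arming phase: stands in for fcntl(fd, F_ASYNC, &acb). */
int arm_async(int fd, const struct acb *acb)
{
    if (fd < 0 || fd >= MAX_FD || acb == NULL || acb->handler == NULL)
        return -1;
    armed[fd] = *acb;
    return 0;
}

/* Completion phase: what the I/O layer would do when the request on fd
 * finishes.  The block is disarmed before the call so that the handler
 * is free to re-arm the descriptor for the next request. */
int complete_io(int fd, int status)
{
    struct acb done;

    if (fd < 0 || fd >= MAX_FD || armed[fd].handler == NULL)
        return -1;
    done = armed[fd];
    armed[fd].handler = NULL;
    armed[fd].arg = NULL;
    (*done.handler)(status, done.arg);
    return 0;
}

/* Example handler: stores the status in the int that arg points to. */
void record_status(int status, void *arg)
{
    assert(arg != NULL);
    *(int *)arg = status;
}
```

Arming descriptors 3 and 4 with different optional arguments and then completing descriptor 4 invokes only the second handler, which is the per-descriptor behaviour described above.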
Further still, since an optional argument is specified on a per-request basis, i.e., the optional argument in the Asynchronous Control Block, it is possible to have multiple outstanding I/O requests on a single descriptor and use this optional argument to identify the request. For the application for which I implemented this mechanism, it was necessary to have overlapping I/O on multiple devices and to have multiple outstanding requests enqueued to a single device. The latter was necessary to reduce I/O turnaround latency on devices with very small data overrun periods, e.g., an unbuffered A/D converter. I haven't mentioned a few details here, such as the obvious need for blocking out sections of critical code from incurring asynchronous entry.

Now that I have had some success with this particular interface mechanism for performing asynchronous I/O, I am considering how it might best be implemented in the 4.[23]BSD environment. I haven't scoped the problem enough at this point to be able to state how difficult this will be. One problem that I see already is the fact that different device drivers use different mechanisms to perform synchronization. Some use iowait(), and others call sleep() directly. If a single mechanism were used, e.g., iowait(), then the task would be much easier. Those drivers that use iowait() could be easily converted to enable asynchronous notification, since a hook could be placed in iowait() to allow the process to continue and then to notify the user process when the driver calls iodone(). However, the other drivers would be much more difficult since they don't necessarily follow this strict protocol, i.e., calling iowait() and then iodone(). The actual notification could come via the psignal() mechanism, with a special signal (SIGIO?) being used to get things going.
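The need to keep asynchronous entry out of critical sections, with notification arriving as a signal, can be sketched with the POSIX signal-mask calls. This is an illustration only: SIGUSR1 stands in for whatever signal a kernel implementation would actually use, raise() simulates the I/O completion, and all function and variable names are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Number of completion notifications handled so far. */
volatile sig_atomic_t completions = 0;

/* Value of `completions` as observed inside the critical section. */
int seen_inside = -1;

/* Signal handler standing in for the asynchronous I/O handler. */
static void on_completion(int signo)
{
    (void)signo;
    completions++;
}

/* Install the handler for the stand-in signal; returns 0 on success. */
int install_completion_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_completion;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Run critical() with SIGUSR1 blocked.  A notification that arrives in
 * the meantime stays pending and is delivered once the old mask is
 * restored.  Returns 0 on success, -1 on failure. */
int run_critical(void (*critical)(void))
{
    sigset_t block, old;

    assert(critical != NULL);
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;
    critical();
    return sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Demo critical section: a completion "arrives" part-way through, and
 * the section records that the handler has not yet been entered. */
void demo_critical(void)
{
    raise(SIGUSR1);
    seen_inside = completions;
}
```

Calling run_critical(demo_critical) after installing the handler shows the handler being held off until the critical section has finished.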
I'm not sure when or if I will get an opportunity to try implementing these ideas in the UNIX kernel; however, I thought it might be interesting to discuss the ideas that I've had on the subject in case you or others are interested in actually doing the implementation.

--
Glenn Adams
MIT Lincoln Laboratory

ARPA:  glenn@LL-XN.ARPA
CSNET: glenn%ll-xn.arpa@csnet-relay
UUCP:  ...!seismo!ll-xn!glenn
       ...!ihnp4!houem!ll-xn!glenn

paul@unisoft.UUCP (n) (11/30/85)
<oog>

I too would like to see some form of async I/O under Unix; after all, VMS and even RT11 and the Macintosh!! have it. The main reason I see against it is that it would massively complicate Unix's very simple user interface (read, when you are done reading do something else ...). On the other hand, you could avoid the problems with some utilities that have to use more than 1 process to do some very simple things (cu, for example, runs 2 processes, one each way; typing a character results in 2 process switches!!!!).

The major problem in implementation under Unix is the concept of IO context. With Unix, an IO's context (u_base, u_count, u_offset) is in a process's 'udot'. If you are going to be doing multiple IO's, you need to be able to send this information to the device with the queued IO (you also have to send the process ID, the completion routine's virtual address and the parameter you are going to give to the completion routine). There is also an amazing amount of internal synchronisation work that needs to be done in the kernel. And while you are doing that, you may as well make it orthogonal so that all the system calls can have completion routines.

Rats!! It seemed so easy!! Still it would be nice to see it done .... sigh ....

Paul Campbell
..!ucbvax!unisoft!paul

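One way to picture the per-request context described in this post is a record that travels with each queued I/O instead of living in the process's u-area. The sketch below is hypothetical throughout: struct io_request, struct io_queue and the ioq_*() functions are invented names, not anything from a real UNIX kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumed per-request record: the transfer state that normally sits in
 * the u-area, plus what is needed to notify the requester. */
struct io_request {
    char   *base;                            /* user buffer (cf. u_base)    */
    size_t  count;                           /* bytes to move (cf. u_count) */
    off_t   offset;                          /* file offset (cf. u_offset)  */
    pid_t   pid;                             /* requesting process          */
    void  (*done)(int status, void *param);  /* completion routine          */
    void   *param;                           /* its parameter               */
    struct io_request *next;                 /* link in the device queue    */
};

/* FIFO of requests outstanding on one device. */
struct io_queue {
    struct io_request *head;
    struct io_request *tail;
};

/* Append a request to the tail of the queue. */
void ioq_put(struct io_queue *q, struct io_request *r)
{
    assert(q != NULL && r != NULL);
    r->next = NULL;
    if (q->tail != NULL)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
}

/* Complete the oldest request: unlink it, then call its completion
 * routine with the given status.  Returns the completed request, or
 * NULL if the queue was empty. */
struct io_request *ioq_complete(struct io_queue *q, int status)
{
    struct io_request *r;

    assert(q != NULL);
    r = q->head;
    if (r == NULL)
        return NULL;
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    r->next = NULL;
    if (r->done != NULL)
        r->done(status, r->param);
    return r;
}

/* Example completion routine: stores the status where param points. */
void store_status(int status, void *param)
{
    *(int *)param = status;
}
```

Because each record carries its own buffer, count and offset, several requests from the same process can sit on one device queue without overwriting one another's state.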
jsdy@hadron.UUCP (Joseph S. D. Yao) (12/01/85)
I'm afraid this isn't a great response, but: I've seen one or two implementations of asynchronous I/O. One was done by Steve Holmgren et al at U Ill Champaign-Urbana (U-C?) maybe half a dozen years ago. Obviously, it is not based on any current system, nor has it been propagated much, especially since it was part of a version that completely changed UNIX.

I vaguely remember an interface, and I don't remember whether this is from Steve's implementation or not. Basically, one opens an fd and posts a signal routine for that fd. Asread()'s and aswrite()'s etc. return an object which is compared to (something, I think retrieved by another system call) at interrupt time. Not too far from simple.

--
Joe Yao
hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

wls@astrovax.UUCP (William L. Sebok) (12/03/85)
In article <604@unisoft.UUCP> paul@unisoft.UUCP (n) writes:
> The major problem in implementation under Unix is the concept of IO
>context, with Unix an IO's context (u_base, u_count, u_offset) are in a
>processes 'udot' if you are going to be doing multiple IO's you need to be
>able to send this information to the device with the queued IO (you also have
>to send the process ID, the completion routine's virtual address and the
>parameter you are going to give to the completion routine).

I noticed that in the transition from the 4.1 BSD to the 4.2 BSD kernel, references in the i/o drivers to u.u_base, u.u_count, and u.u_offset were replaced by references to members of a uio structure. This was the only change (that I remember) that I had to make in some locally written drivers I ported. I wondered then, and still wonder, whether this change means that support of asynchronous i/o is in the works in BSD.

--
Bill Sebok
Princeton University, Astrophysics
{allegra,akgua,cbosgd,decvax,ihnp4,noao,philabs,princeton,vax135}!astrovax!wls