[comp.unix.questions] async I/O

gwyn@smoke.BRL.MIL (Doug Gwyn) (01/16/90)

In article <M+21-K2xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>> -Let's add 4 new system calls: aread, awrite, await, and status.
>> -This doesn't break any existing programs, nor does it do any injury to
>> -the design goals of UNIX.
>> Yes, it does!  Go back and read what I said in my previous message.
>I'm sorry, but other than the assertion that asynchronous I/O is itself
>a violation of the design goals I don't see anything that would lead to
>this conclusion. And I don't see that that, in and of itself, is a
>problem in the face of all the poor asynchronous I/O models that are
>spreading like gangrene through the UNIX world. At the worst it's a matter
>of choosing the lesser of two necessary evils.
>I mean you seem to agree that *allowing* asynchronous I/O is a good idea.
>Can you come up with a cleaner model? I'd love to see it.

One thing that was appreciated in the computer science research community
during the 1970s was that forcing applications to explicitly deal with
asynchronism had been causing numerous reliability problems.  Thus when
Tony Hoare published his paper on "Communicating Sequential Processes",
it was acknowledged as a positive step toward shifting programming back
into the relatively well-understood domain of sequential operations, with
asynchronity managed by the underlying run-time support.  (This was an
improvement over Per Brinch-Hansen's "monitor" concept, which still had
the details of asynchronity too visible.)

UNIX loosely followed the CSP notion, wherein individual processes are
strictly sequential but can communicate with concurrent processes to
achieve controlled asynchronity.  The UNIX kernel manages the actual
asynchronous operations and converts them into the per-process sequential
I/O model.
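
A minimal sketch of that model in C, assuming a POSIX system with pipe()
and fork(). Each process below is strictly sequential and uses only
blocking read() and write(); the kernel supplies the concurrency. The
function name sum_via_child is invented for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child process produces 1..n on a pipe while the
 * parent consumes and sums them. Neither side contains any asynchronous
 * code; each just blocks in read() or write() until the kernel is ready. */
int sum_via_child(int n)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                      /* child: sequential producer */
        close(fds[0]);
        for (int i = 1; i <= n; i++)
            if (write(fds[1], &i, sizeof i) != (ssize_t)sizeof i)
                _exit(1);
        _exit(0);
    }

    close(fds[1]);                       /* parent: sequential consumer */
    int sum = 0, value;
    while (read(fds[0], &value, sizeof value) == (ssize_t)sizeof value)
        sum += value;                    /* read() returns 0 at end of file */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return sum;
}
```

Calling sum_via_child(10) runs producer and consumer concurrently, yet
neither one ever tests for readiness or handles a completion event.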

Rob Pike has shown in an article in a recent issue of Computing Systems
how the CSP model can be applied to graphical windowing environments,
with the result of dramatically simplifying the design of applications
in such environments.

peter@ficc.uu.net (Peter da Silva) (01/17/90)

In article <11956@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
> One thing that was appreciated in the computer science research community
> during the 1970s was that forcing applications to explicitly deal with
> asynchronism had been causing numerous reliability problems.

First, a nitpick. By 1974 UNIX was basically in its current form, and its
design goals must have been established earlier than that. But that's just
a nitpick.

Secondly, I'm not suggesting that applications be forced to explicitly
deal with asynchronism. I just believe that since the real world is
asynchronous you should be able to deal with it.

Also, the event-loop construct has considerable success in the real world
for dealing with asynchronous events. I've worked in the process control
industry for the past 10 years, and UNIX has effectively zero penetration
simply because it doesn't allow for processes to handle asynchronous events.

> UNIX loosely followed the CSP notion, wherein individual processes are
> strictly sequential but can communicate with concurrent processes to
> achieve controlled asynchronity.  The UNIX kernel manages the actual
> asynchronous operations and converts them into the per-process sequential
> I/O model.

Unfortunately, UNIX doesn't support a sufficiently fine-grained process
structure to allow this to be generally used. Systems like Mach do, but
they do it by pretty much abandoning the UNIX model.

Or you can implement a finer-grained process structure within a UNIX
process, but to do that effectively you need asynchronous I/O.

> Rob Pike has shown in an article in a recent issue of Computing Systems
> how the CSP model can be applied to graphical windowing environments,
> with the result of dramatically simplifying the design of applications
> in such environments.

I'm sure it can. But not under UNIX as it exists, and not under any
extension of UNIX that I've seen that still remains close to the source.
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

brnstnd@stealth.acf.nyu.edu (01/17/90)

I very much agree with Peter. The basic I/O calls should be asynchronous:
aread(), awrite(), and astatus(). aschedwait() and asyncwait() should wait
for scheduling and synchronization respectively; both should only be
special cases of a single await() call, with different semantics for
different devices and file types. Then my multitee program would be easy
to deal with, along with a host of related problems.

In article <CU318Y5xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> Secondly, I'm not suggesting that applications be forced to explicitly
> deal with asynchronism.

Exactly. read() and write() would be short library routines.

> I just believe that since the real world is
> asynchronous you should be able to deal with it.

Yup, and select() is only half a solution. (select() and poll() would be
forms of the more logically named await().)

---Dan

gwyn@smoke.BRL.MIL (Doug Gwyn) (01/17/90)

In article <CU318Y5xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>First, a nitpick. By 1974 UNIX was basically in its current form, and its
>design goals must have been established earlier than that. But that's just
>a nitpick.

Yes, Ken Thompson was thinking about these issues too, as far back as
1969 for sure and probably well before that.

>Secondly, I'm not suggesting that applications be forced to explicitly
>deal with asynchronism. I just believe that since the real world is
>asynchronous you should be able to deal with it.

I would rather have it under control than have to deal with it ad lib.

>Also, the event-loop construct has considerable success in the real world
>for dealing with asynchronous events.

Ha!  Practically everybody I know who has had to program event loops
thinks "there has to be a better way".  The fundamental problem with
event loops is that they force the application to maintain state
information merely to properly schedule the application's actions.
This (tedious and error-prone) bookkeeping is unnecessary when using
better methods for handling asynchronism.
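
To make the bookkeeping concrete, here is a small C sketch of an
event-loop handler that counts lines arriving on a descriptor. The struct
and function names are hypothetical. Because the handler must return to
the loop after every partial read, the unfinished line cannot live in a
local variable; it has to be saved in a state record between events.

```c
#include <stddef.h>
#include <unistd.h>

/* State the application must carry between events, only because control
 * returns to the event loop after every partial read. */
struct line_state {
    char   partial[256];   /* bytes of the line seen so far */
    size_t len;            /* how many of them are valid */
    int    lines;          /* complete lines handled so far */
};

/* Hypothetical handler, called by an event loop whenever fd is readable.
 * It consumes whatever is available and returns to the loop; an
 * unfinished line has to be parked in *st until the next event. */
int on_readable(int fd, struct line_state *st)
{
    char chunk[64];
    ssize_t n = read(fd, chunk, sizeof chunk);
    if (n <= 0)
        return (int)n;                   /* 0 on end of file, -1 on error */

    for (ssize_t i = 0; i < n; i++) {
        if (chunk[i] == '\n') {
            st->lines++;                 /* a whole line is now in partial[] */
            st->len = 0;
        } else if (st->len < sizeof st->partial) {
            st->partial[st->len++] = chunk[i];   /* overlong lines truncated */
        }
    }
    return (int)n;
}
```

A sequential process would instead call a blocking read-line routine in a
plain loop, and the partial line would simply sit on its stack.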

>Unfortunately, UNIX doesn't support a sufficiently fine-grained process
>structure to allow this [CSP] to be generally used.

Actually, it does pretty well, but in most implementations its IPC needs
improvement.  Also, there is no reasonable programming language for
exploiting this approach other than the shell language, which is too
limited and difficult to use in this area.

>I'm sure it can. But not under UNIX as it exists, and not under any
>extension of UNIX that I've seen that still remains close to the source.

The issue was the best way to extend UNIX to give applications better
control over asynchronism.  I made suggestions for better methods than
forcing processes to deal with awrite() etc.

peter@ficc.uu.net (Peter da Silva) (01/18/90)

In article <11968@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
> In article <CU318Y5xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> >I just believe that since the real world is
> >asynchronous you should be able to deal with it.

> I would rather have it under control than have to deal with it ad lib.

> >Also, the event-loop construct has considerable success in the real world
> >for dealing with asynchronous events.

> Ha!  Practically everybody I know who has had to program event loops
> thinks "there has to be a better way".

You sound like me (see my occasional diatribes against X in comp.windows.*).
However, with a conventional programming language it's the only way to do it,
unless you go all the way to UNIX processes... and that's too slow.

> >Unfortunately, UNIX doesn't support a sufficiently fine-grained process
> >structure to allow this [CSP] to be generally used.

> Actually, it does pretty well, but in most implementations its IPC needs
> improvement.

Yes, that's an understatement. Replacing all System V's shm_* calls with
something like map_fd() (from Mach) would help.

But context switch overhead is still too high for realtime work.

> Also, there is no reasonable programming language for
> exploiting this approach other than the shell language, which is too
> limited and difficult to use in this area.

Multithreaded applications are difficult in many languages, even when the
operating system is up to snuff. This is a language problem...

> The issue was the best way to extend UNIX to give applications better
> control over asynchronism.  I made suggestions for better methods than
> forcing processes to deal with awrite() etc.

First, you keep telling me I'm *forcing* processes to deal with awrite().
I'm not. I'm saying it should be an option.

Secondly, you can implement threads on top of asynchronous I/O calls. I've
done this for Forth under RSX-11. You have to have an explicit context
switch routine, but that simplifies the programming immensely anyway. You
just include checks for completed I/O in the swtch() routine. I laid out
the outline for such a routine in comp.lang.c some months ago, and at least
one person has turned it into a real concurrent "library" for C.
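
What follows is not the routine from that posting, only a loose sketch of
the same idea in C. It assumes the ucontext functions (getcontext,
makecontext, swapcontext) as found in glibc, and uses poll() with a zero
timeout as the "has the I/O completed?" check. All task and function
names are invented. Here the check lives in the scheduler that swtch()
returns to, and the scheduler simply spins when nothing is ready.

```c
#include <poll.h>
#include <ucontext.h>
#include <unistd.h>

#define MAX_TASKS  4
#define STACK_SIZE (64 * 1024)

struct task {
    ucontext_t ctx;
    void (*fn)(void);
    int wait_fd;                 /* fd awaited for reading, or -1 if runnable */
    int done;
    char stack[STACK_SIZE];
};

static struct task tasks[MAX_TASKS];
static ucontext_t sched_ctx;
static int ntasks, current;

static void trampoline(void)
{
    tasks[current].fn();
    tasks[current].done = 1;     /* returning resumes sched_ctx via uc_link */
}

int spawn(void (*fn)(void))
{
    if (ntasks == MAX_TASKS || getcontext(&tasks[ntasks].ctx) < 0)
        return -1;
    struct task *t = &tasks[ntasks];
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = sizeof t->stack;
    t->ctx.uc_link = &sched_ctx;
    t->fn = fn; t->wait_fd = -1; t->done = 0;
    makecontext(&t->ctx, trampoline, 0);
    return ntasks++;
}

/* The explicit context switch: park this task until wait_fd is readable. */
void swtch(int wait_fd)
{
    tasks[current].wait_fd = wait_fd;
    swapcontext(&tasks[current].ctx, &sched_ctx);
}

/* Scheduler: resume every task that is runnable or whose awaited
 * descriptor has become readable; stop when all tasks have finished. */
void run_all(void)
{
    int live;
    do {
        live = 0;
        for (int i = 0; i < ntasks; i++) {
            struct task *t = &tasks[i];
            if (t->done)
                continue;
            live++;
            if (t->wait_fd >= 0) {
                struct pollfd p = { .fd = t->wait_fd, .events = POLLIN };
                if (poll(&p, 1, 0) <= 0)
                    continue;            /* not ready yet; try the next task */
                t->wait_fd = -1;
            }
            current = i;
            swapcontext(&sched_ctx, &t->ctx);
        }
    } while (live > 0);
}

/* Demo: a reader task waits for a byte that a writer task sends later. */
static int demo_pipe[2];
static char demo_got;

static void reader(void)
{
    swtch(demo_pipe[0]);                 /* yield until the pipe has data */
    ssize_t n = read(demo_pipe[0], &demo_got, 1);
    (void)n;
}

static void writer(void)
{
    ssize_t n = write(demo_pipe[1], "x", 1);
    (void)n;
}

char demo(void)
{
    if (pipe(demo_pipe) < 0)
        return 0;
    spawn(reader);
    spawn(writer);
    run_all();
    return demo_got;
}
```

In demo() the reader starts first, yields in swtch(), and is resumed only
after the writer has put a byte in the pipe. Each task body reads as
ordinary sequential code.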

*If* UNIX supported await() and friends, then you could efficiently
implement a concurrent programming language. In fact, you could use C
plus a set of small routines to switch to a new context.

But it doesn't. Pity. Your serve.

barmar@think.com (Barry Margolin) (01/18/90)

In article <20718@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>I very much agree with Peter. The basic I/O calls should be asynchronous:
...
>Exactly. read() and write() would be short library routines.

Watch out, this is how Multics does it.  Remember, Unix is supposed to be a
castrated Multics :-)

On Multics, the only system call that causes the process to block is
hcs_$block, which is similar to select().  All I/O system calls are
asynchronous (file access is done using memory mapping and paging, so it
isn't included).  These are hidden away in library routines (called I/O
modules) which implement device-independent I/O (similar to Unix read(),
write(), etc.).  Since the underlying mechanism is asynchronous, I/O
modules can provide synchronous and asynchronous modes.

When doing asynchronous writes, the I/O module returns the count of the
number of characters written.  The caller can then advance his buffer
pointer that many characters into his output buffer, wait for the device to
be ready to accept more data, and then try to write the rest of the buffer;
this is iterated until the entire buffer is taken.  The terminal driver
also provides an all-or-nothing interface, for use by applications that
write escape sequences (to guarantee that process interrupts don't cause
partial escape sequences to be written); this is just like the normal
interface, but acts as if the kernel's buffer is full unless it has enough
room for the entire string being written (even a normal write call can
return "0 characters written", if other processes fill up the kernel
buffers before this process gets around to making the write call).

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

hunt@dg-rtp.dg.com (Greg Hunt) (01/19/90)

In article <11956@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <M+21-K2xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> >> -Let's add 4 new system calls: aread, awrite, await, and status.
> >> -This doesn't break any existing programs, nor does it do any injury to
> >> -the design goals of UNIX.

[some comments removed for brevity]

> One thing that was appreciated in the computer science research community
> during the 1970s was that forcing applications to explicitly deal with
> asynchronism had been causing numerous reliability problems.  Thus when
> Tony Hoare published his paper on "Communicating Sequential Processes",
> it was acknowledged as a positive step toward shifting programming back
> into the relatively well-understood domain of sequential operations, with
> asynchronity managed by the underlying run-time support.  (This was an
> improvement over Per Brinch-Hansen's "monitor" concept, which still had
> the details of asynchronity too visible.)
> 
> UNIX loosely followed the CSP notion, wherein individual processes are
> strictly sequential but can communicate with concurrent processes to
> achieve controlled asynchronity.  The UNIX kernel manages the actual
> asynchronous operations and converts them into the per-process sequential
> I/O model.

[some comments removed for brevity]

I agree with the original poster - UNIX needs asynchronous reads and
writes (and lots of other commercial-grade features).

While the statement about the computer science research may be quite
true for the average client application, it is IMHO not true for systems
software, like database and transaction processing servers.  UNIX is
sadly lacking critical features for proper support of these, and other,
types of system software.  Such features have been available for years
and years in commercial-quality proprietary systems.  I am constantly
amazed that they do not exist under UNIX, and further amazed that many
folks don't seem to think that they are needed.  The research needs to
look at real world problems, and how they are solved, in addition to
solving pure research-oriented problems.

IMHO, this problem is part of the reason that there are so many flavors
of UNIX.  Each vendor is adding extensions for those things that should
be there, but aren't.  It's a way to make money, but really messes up
being able to write sophisticated software easily in a portable fashion.

I'll get off my soapbox now.

--
Greg Hunt                   Internet: hunt@dg-rtp.dg.com
Data Management Development UUCP:     {world}!mcnc!rti!dg-rtp!hunt
Data General Corporation
Research Triangle Park, NC  Standard Disclaimer Applies