[comp.unix.wizards] The 4.3 BSD awrite

brnstnd@stealth.acf.nyu.edu (02/08/90)

According to the 4.3 BSD siginterrupt() documentation, after a program
executes siginterrupt(SIGALRM,1), interrupted I/O calls will return the
number of bytes actually read. Then the (untested) code below should
provide a true awrite(). Any comments?

---Dan

#include <sys/time.h>
#include <signal.h>
extern int errno;

nothing() { }

int awrite(fd,buf,num)
int fd;
char *buf;
int num;
{
 struct itimerval it1;
 struct itimerval it2;
 struct itimerval it3;
 int w;
 int saveerrno;
 int (*fun)();

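 /* Turn the real-time timer off, saving the caller's setting in it2. */
 it1.it_interval.tv_sec = it1.it_value.tv_sec = 0;
 it1.it_interval.tv_usec = it1.it_value.tv_usec = 0;
 (void) setitimer(ITIMER_REAL,&it1,&it2);
 /* Install a do-nothing SIGALRM handler, saving the caller's in fun. */
 fun = signal(SIGALRM,nothing);
 /* Tick every 10 ms so that a blocked write() gets interrupted. */
 it1.it_interval.tv_sec = it1.it_value.tv_sec = 0;
 it1.it_interval.tv_usec = it1.it_value.tv_usec = 10000;
 (void) setitimer(ITIMER_REAL,&it1,&it3);
 w = write(fd,buf,num);
 saveerrno = errno;
 /* Stop the tick (it3 holds the zeroed timer), then restore the
    caller's handler and timer.  The restores may clobber errno. */
 (void) setitimer(ITIMER_REAL,&it3,&it1);
 (void) signal(SIGALRM,fun);
 (void) setitimer(ITIMER_REAL,&it2,&it3);
 errno = saveerrno;
 return(w);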
}

lm@snafu.Sun.COM (Larry McVoy) (02/08/90)

In article <1055.18:35:28@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>According to the 4.3 BSD siginterrupt() documentation, after a program
>executes siginterrupt(SIGALRM,1), interrupted I/O calls will return the
>number of bytes actually read. Then the (untested) code below should
>provide a true awrite(). Any comments?
>
>---Dan
>
>[code that schedules an alarm and then starts a write deleted]

Funny you should ask.  The answer is, "no, this won't work the way you think".
Why?  Hmm.  Think of the kernel as a processor with system calls being
instructions and signals being interrupts.  In general, interrupts do not
interrupt an instruction.  That is mostly true inside the kernel as well (*).
I believe that your code wanted to get the write rolling and then return
after it was started (this is what almost every Unix system does anyway unless
you have opened the file with O_SYNC, so this is a moot point, but...).  The
only way (*) that the signal would cause an early return is if the signal
arrived while the write() was waiting for some resource (was sleep()ing, in
the kernel sense).  Once the ``instruction'' gets rolling you can't interrupt it.

(*) An interesting exception.  Someone at Xerox pointed this out to me: if
you have a buffer, say two pages worth, with the first page R/W and the
2nd page RDONLY, and you hand that to read(), you get a SIGSEGV halfway 
through the read() and a return value of -1.  This is a bug in most current
implementations of Unix (including SunOS).  So there is a way that signals
can interrupt an ``instruction'' but you have to work hard and we'll fix it
eventually....
---
What I say is my opinion.  I am not paid to speak for Sun, I'm paid to hack.
    Besides, I frequently read news when I'm drjhgunghc, err, um, drunk.
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

brnstnd@stealth.acf.nyu.edu (02/09/90)

Larry, why don't you at least test the code before asserting that it
doesn't work? I've now run my async library through a rather thorough
series of tests. It works perfectly. aread() and awrite() don't block.

Your theoretical error is the assertion that a system call can't be
interrupted by a signal (except in an obscure situation, where you
consider the behavior to be in error). But a blocked system call can
be interrupted. That's why siginterrupt() exists.

Try compiling this code and feeding it into various situations. Try it
with the interrupt flag changed from 1 to 0. Try it with write() instead
of awrite(). Try similar tests with reading. You'll become a believer too.

#include <signal.h>
#include <stdio.h>
#include "async.h"

main()
{
 char buf[100000];

 siginterrupt(SIGALRM,1); /* this should be in the async library, I guess */
 fprintf(stderr,"%d\n",awrite(1,buf,sizeof(buf)));
}

Here are a few tests I tried on a Sun with the above program:

kramden% ./astest > /dev/null
100000
kramden% ./astest | cat | wc
4096
       0       0    4096
kramden% ./astest | sleep 2
4096
kramden% (sleep 1;./astest) | cat|wc
4096
       0       0    4096
kramden% !!
( sleep 1 ; ./astest ) | cat | wc
8192
       0       0    8192
kramden% !!
( sleep 1 ; ./astest ) | cat | wc
8192
       0       0    8192
kramden% !!
( sleep 1 ; ./astest ) | cat | wc
8192
       0       0    8192

Notice the effects of various caching mechanisms: after the first time,
cat starts much more quickly. Obviously kramden's pipes hold 4096 bytes.
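
The pipe size is inferred above from the short counts. A more direct
measurement can be sketched as follows, assuming a POSIX system where
O_NONBLOCK is available on pipes. The helper name pipe_capacity is
hypothetical, and the result depends on the system (it need not be 4096).

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: number of bytes an empty pipe accepts before a
   write would block, or -1 on error. */
long pipe_capacity(void)
{
    int p[2];
    char chunk[512];  /* 512 <= PIPE_BUF, so each write is all-or-nothing */
    long total = 0;
    ssize_t w;
    int flags;

    if (pipe(p) == -1)
        return -1;
    flags = fcntl(p[1], F_GETFL, 0);
    if (flags == -1 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    memset(chunk, 0, sizeof chunk);
    /* Keep writing until the pipe refuses more data. */
    while ((w = write(p[1], chunk, sizeof chunk)) > 0)
        total += w;
    if (w == -1 && errno != EAGAIN && errno != EWOULDBLOCK)
        total = -1;   /* stopped for some reason other than a full pipe */
    close(p[0]);
    close(p[1]);
    return total;
}
```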

kramden% ./astest | cat > ascat
4096
kramden% ls -l ascat
-rw-------  1 brnstnd      4096 Feb  8 18:06 ascat
kramden% (cat ascat; ./astest; sleep 10; ./astest) | (sleep 5; cat) | wc
-1
8192
       0       0   12288
kramden% (cat /etc/termcap;./astest) | (sleep 2; cat) | wc
1730
    2724   10772  135168
kramden% wc /etc/termcap
    2724   10772  133438 /etc/termcap

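If those last two lines don't convince you I don't know what will.
(135168 = 133438 + 1730 = 33 * 4096: the 1730 bytes that awrite() reported
bring the total to a whole number of 4096-byte pipe buffers.)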

One comment on your comments:

> (this is what almost every Unix system does anyway unless
> you have opened the file with O_SYNC, so this is a moot point, but...).

What about pipes, ttys, and sockets? None of this is moot.

My multitee program works beautifully. More evidence for the superiority
of BSD over System V...

---Dan

lm@snafu.Sun.COM (Larry McVoy) (02/10/90)

In article <6068:00:23:14@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
>Larry, why don't you at least test the code before asserting that it
>doesn't work? I've now run my async library through a rather thorough
>series of tests. It works perfectly. aread() and awrite() don't block.
>
>Your theoretical error is the assertion that a system call can't be
>interrupted by a signal (except in an obscure situation, where you
>consider the behavior to be in error). But a blocked system call can
>be interrupted. That's why siginterrupt() exists.

First of all, you're right.  I was thinking about restarting the system
call, which only happens if the call hasn't moved any data yet.  As you
pointed out, partially completed system calls (basically read and write)
will allow themselves to be interrupted partway through.  (If you remember
the analogy in the last message, this is like allowing "blockmove R0,R1,R3"
to be interrupted, whereas you wouldn't allow "inc R0" to be interrupted.)

So let's consider your stuff again.  You claim that you've implemented
awrite().  So what's awrite()?  Well, in my mind, awrite(fd, buf, n)
should start the I/O and return "n" (it would be nice if you could get
status, but you can't with writes either...).  The I/O should all go out
unless there is some sort of error condition (no mem, bad fd, whatever).

That is not at all what you have.  You've got a very perverse way to do 
non-blocking writes.  If that's what you want why not implement it like
so:

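#include <fcntl.h>

awrite(fd, buf, n)
    void *buf;
    unsigned n;
{
    int flags = fcntl(fd, F_GETFL, 0);

    /* add O_NONBLOCK without clobbering the other status flags */
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return (-1);
    return (write(fd, buf, n));
}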

Well, I suppose that you could argue that this won't work on disk files
(and it won't, at least not in the SunOS code I looked at).  But for disk
files you obviously don't care - they are async anyway, so you're only
interested in pipes/sockets/etc.  And those sorts of things support
non-blocking I/O in all sorts of ways.
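
A variant of the O_NONBLOCK approach that puts the descriptor back the way
it found it can be sketched like this. The name nb_write is hypothetical,
and the sketch assumes a POSIX system. Note that O_NONBLOCK belongs to the
open file description, so while the call is in progress the flag is also
visible through any duplicate of the descriptor.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical name. Writes as much as fits without blocking, then
   restores the descriptor's original status flags. */
ssize_t nb_write(int fd, const void *buf, size_t n)
{
    int flags = fcntl(fd, F_GETFL, 0);
    ssize_t w;
    int saved_errno;

    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    w = write(fd, buf, n);              /* short count, or -1 with EAGAIN */
    saved_errno = errno;

    (void) fcntl(fd, F_SETFL, flags);   /* put the original flags back */
    errno = saved_errno;
    return w;
}
```

On a pipe with no reader, the first large nb_write() should return a short
count, and a second one should fail with EAGAIN (or EWOULDBLOCK) instead of
blocking.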

So... My point is basically the same.  This is *not* an implementation of
awrite() by any reasonable definition - it fails to send all the data
through.  I'm not saying that this has no use, but I am saying (a) this is
not an asynchronous write, and (b) you can get the same behavior in a much
cleaner way by using O_NONBLOCK.

peter@ficc.uu.net (Peter da Silva) (02/13/90)

In article <131606@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
> So let's consider your stuff again.  You claim that you've implemented
> awrite().  So what's awrite()?

awrite(fd, buf, n) initiates a write of n bytes from buf, returning
immediately. It returns some token that you can use to tell when the
write is complete and get the number of bytes actually written. You also
need to be able to wait on completion, and a close would implicitly
perform such a wait.
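
The POSIX asynchronous I/O interface, standardized after this thread, comes
close to this description, with struct aiocb serving as the token. The
following is a sketch under stated assumptions, not a definitive
implementation: the wrapper names awrite_start, awrite_done and awrite_wait
are hypothetical, and the implicit wait on close is not modeled.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Start writing n bytes from buf at the given file offset. The caller
   owns cb (the token) and must keep cb and buf alive until completion. */
int awrite_start(struct aiocb *cb, int fd, const void *buf, size_t n,
                 off_t offset)
{
    memset(cb, 0, sizeof *cb);
    cb->aio_fildes = fd;
    cb->aio_buf = (void *)buf;
    cb->aio_nbytes = n;
    cb->aio_offset = offset;
    return aio_write(cb);          /* 0 if queued, -1 on error */
}

/* Poll: nonzero once the write has finished (successfully or not). */
int awrite_done(const struct aiocb *cb)
{
    return aio_error(cb) != EINPROGRESS;
}

/* Block until the write finishes; return the byte count, or -1 with
   errno set. Call at most once per started write. */
ssize_t awrite_wait(struct aiocb *cb)
{
    const struct aiocb *list[1];
    int err;

    list[0] = cb;
    while ((err = aio_error(cb)) == EINPROGRESS) {
        if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
            return -1;
    }
    if (err != 0) {
        (void) aio_return(cb);     /* still collect the final status */
        errno = err;
        return -1;
    }
    return aio_return(cb);
}
```

A caller would run awrite_start(&cb, fd, buf, n, 0), do other work while
polling awrite_done(&cb), and finally call awrite_wait(&cb) to collect the
count. Older glibc versions may need -lrt at link time.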
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'