smh (04/19/83)
What's wrong with synchronous IO? Usually nothing. But remember, some people on some machines are not backing up gigabyte disks on 125IPS 6250BPI drives. Rather, they are backing up disks of several x 10^7 bytes on 45IPS 800BPI tapes. It makes a difference...
guy (04/20/83)
Actually, synchronous I/O with system-provided read-ahead isn't always what you want; see "Operating System Support for Database Management" by Michael Stonebraker, CACM, July 1981, Vol. 24 No. 7, pp 412-418. A DBMS might want to have O_NOCACHE and O_ASYNC bits in the second argument to "open", so that it can open or create a file, bypass the UNIX cache entirely, and be able to start an I/O operation and not wait immediately for it to finish.

(That article does make a misleading statement about UNIX byte-stream vs. RSX-11M record-oriented I/O; the RSX-11M kernel does NOT provide record-oriented I/O. It provides block I/O, and a user-mode subroutine library (FCS or RMS) provides the record structure.)

Guy Harris
RLG Corporation
{seismo,mcnc,we13}!rlgvax!guy
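As an illustration of "start an I/O operation and not wait immediately for it to finish", here is a minimal sketch in Python. It is an assumption-laden stand-in, not the interface proposed above: a worker thread plays the part of the kernel, the names start_read and _read_at are hypothetical, and nothing here bypasses the system buffer cache the way the proposed O_NOCACHE bit would.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# One worker thread stands in for kernel-level asynchronous I/O (an assumption
# made for this sketch; the post asks for support in "open" itself).
_io_pool = ThreadPoolExecutor(max_workers=1)


def _read_at(path, offset, length):
    # An ordinary blocking read, but it blocks the worker, not the caller.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def start_read(path, offset, length):
    """Queue a read and return at once; call .result() on the handle to wait."""
    return _io_pool.submit(_read_at, path, offset, length)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789abcdef")
    pending = start_read(tmp.name, 4, 6)   # returns without waiting for the data
    # ... the caller is free to do other work here ...
    data = pending.result()                # wait only when the data is needed
    os.unlink(tmp.name)
    print(data)
```

The point of the split is that the wait moves from the call that starts the transfer to the point where the data is actually consumed, which is the property a DBMS would use to keep several transfers in flight.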
tjt (04/21/83)
It's certainly true that some form of read-ahead/write-behind buffering is required for the I/O devices described by John Gilmore (e.g. streaming tape drives), and I agree that making asynchronous I/O available to user processes is probably the best way to accomplish this. I think the biggest potential disadvantage of asynchronous I/O is that the burden of maintaining adequate buffering for a device is placed on the user program. The epitome of this is VMS (at least the early versions -- I have not had to use VMS since the release 1.6 days) where most of the file I/O buffering appears in RMS, part of each user process. Although it was possible to use RMS to get very high throughput by using lots of read-ahead/write-behind buffers, the system defaults were pretty bad. Also, the technique of making system I/O asynchronous and using user libraries to make it appear synchronous (while doing the necessary buffering) is very expensive unless you also have the ability (like VMS) to share libraries between programs.
swatt (04/21/83)
For about 95+% of the cases, synchronous I/O is the method of choice; UNIX makes this commendably easy. For the remaining uses, it is a bloody pain and you spend infinite energy in various kludges to get around it. I disagree that there is anything terribly ugly or clumsy about asynchronous I/O; I've used it under RSX-11. If you approach it with coroutines, it is not much more complex than synchronous I/O.

We have a DEC TU78 and using George Goble's "dbuf" program (a double-buffered version of "dd"), I can copy /dev/rhp1g (141545 1024-byte blocks) to tape in 7 minutes flat (VAX780). However, to "dump" two filesystems of this size, plus one root partition takes over 4 hours elapsed time in single-user mode. That could be cut at least in half with overlapped disk and tape I/O.

Programs like "dump" and "tar" would be significantly faster with asynchronous I/O. Programs like "cu", "tip", and communications tasks in general would be infinitely cleaner as well.

- Alan S. Watt
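The idea of overlapping the disk read with the tape write can be sketched as a reader thread feeding a writer through a small bounded queue. This is only an illustration in Python of the general technique; it is not Goble's "dbuf", whose internals are not described here, and the function name, block size, and queue depth are all assumptions.

```python
import io
import queue
import threading


def double_buffered_copy(src, dst, block_size=20 * 512, depth=2):
    """Copy src to dst, letting reads run ahead of writes by up to `depth` blocks."""
    blocks = queue.Queue(maxsize=depth)    # bounded, so the reader cannot race far ahead
    errors = []

    def reader():
        try:
            while True:
                block = src.read(block_size)
                blocks.put(block)          # waits here when the writer falls behind
                if not block:              # an empty read marks end of input
                    return
        except Exception as exc:           # hand read errors to the writer side
            errors.append(exc)
            blocks.put(b"")                # wake the writer so it can stop

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    total = 0
    while True:
        block = blocks.get()               # waits here when the reader falls behind
        if not block:
            break
        dst.write(block)                   # the reader keeps filling the queue meanwhile
        total += len(block)

    worker.join()
    if errors:
        raise errors[0]
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 100_000)
    sink = io.BytesIO()
    print(double_buffered_copy(source, sink), "bytes copied")
```

With real devices the gain comes from the two blocking calls, read and write, being outstanding at the same time; with in-memory objects as in the demo there is nothing to overlap, so it only shows that the data arrives intact.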
smh (04/22/83)
Enough talk about the possible benefits of asynchronous IO with tapes, pipes, etc. Did it occur to anyone that some aspects of the question can be trivially tested? On an unloaded system:

    tar cvfb /dev/rmt0 20 LOTSA.FILES

and:

    tar cvf - LOTSA.FILES | dd ibs=1b obs=20b of=/dev/rmt0

where LOTSA.FILES is some substantial directory tree. A simple timing would tell a lot.

I am willing to try this with an 11/45 and TU10 (800BPI 45IPS) if someone else would volunteer to try it on hardware at the other end of the spectrum, i.e., a 780 with a TU78. Of course, the 11 will certainly suffer from contention for scarce kernel IO buffers for the pipe and all the tar file reading. Perhaps a blocking factor of 10 could be tried as well. I will post results when I get them.

Steve Haflich
genrad!mit-eddie!smh
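The "simple timing" proposed above could be wrapped in a small harness along these lines. It is a sketch in Python, not something from the thread: the helper name time_command is hypothetical, and it measures elapsed wall-clock time only, which is the figure that matters for a tape that must be kept streaming.

```python
import subprocess
import sys
import time


def time_command(command, shell=False):
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, shell=shell, check=True)   # raises if the command fails
    return time.perf_counter() - start


if __name__ == "__main__":
    # Each argument is one complete shell command line, quoted as a single word.
    for command_line in sys.argv[1:]:
        elapsed = time_command(command_line, shell=True)
        print(f"{elapsed:8.2f}s  {command_line}")
```

Passing the two command lines from the post as two quoted arguments would print one elapsed time per variant; running each several times on an otherwise idle machine would make the comparison more trustworthy.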