peterson@crash.cts.com (John Peterson) (12/19/89)
I am developing code to LU factor large symmetric matrices on an FPS 522
(running BSD 4.3) which are too large to fit in RAM. The mechanics of
"out of core" algorithms are pretty straightforward, but to be efficient,
the I/O should be done asynchronously. There seems to be no truly painless
way to do this under UNIX.

My colleagues and I have worked out a rough sketch of a way of doing
asynchronous I/O. One would fork off a copy of the process; the child
would 'nap' until an I/O request came from the parent. Upon receipt of
an I/O request, the child would go off and issue a synchronous I/O request
as one ordinarily does, and then set a flag of some sort when the I/O
has completed. The data to be moved would be stored in memory accessible
to both the parent and child processes, probably using System V shared
memory.

Does anyone have a better scheme for doing this? Has any general
implementation of this been posted to any of the other newsgroups?
E-mail responses would be the best way to reply.

Thanks in advance, John P.
--
+--------------------------------------+
|          John C. Peterson            |
|  UUCP: { nosc ucsd }!crash!peterson  |
|  ARPA: crash!peterson@nosc.mil       |
+--------------------------------------+
m5@lynx.uucp (Mike McNally) (12/20/89)
peterson@crash.cts.com (John Peterson) writes:
> My colleagues and I have worked out a rough sketch of a way of doing
>asynchronous I/O. One would fork off a copy of your process, the child
>would 'nap' until an I/O request came from the parent. Upon receipt of
>an I/O request, the child goes off and issues a synchronous I/O request
>like one ordinarily does, and then set a flag of some sort when the I/O
>has completed. The data to be moved would be stored in memory accessible
>to the parent and child processes, probably using System V shared memory.

Sounds OK to me, if you're willing to swallow the cost of starting a new
process for each I/O transaction. Of course, when the world gets "threads"
or "lightweight tasks" or whatever the current buzzword is, this gets much
cheaper; in fact, if I had threads, I don't think I'd want or need a
separate kernel-supported async I/O mechanism.
--
Mike McNally                                    Lynx Real-Time Systems
uucp: {voder,athsys}!lynx!m5                    phone: 408 370 2233

            Where equal mind and contest equal, go.
martin@mwtech.UUCP (Martin Weitzel) (12/20/89)
In article <932@crash.cts.com> peterson@crash.cts.com (John C. Peterson) writes:
>
>I am developing a code to LU factor large symmetric matrices on an FPS
>522 ( running BSD 4.3 ) which are too large to fit in RAM memory. The
>mechanics of "out of core" algorithms are pretty straight forward, but
>to be efficient, the I/O should be done asynchronously. There seems to
>be no real painless way to do this under UNIX.
[rest deleted]

By its very nature, I/O to disk under UNIX is often more 'asynchronous'
than it appears to be:

- A write system call returns once the blocks have been placed in the
  disk cache (Rochkind's book "Advanced UNIX Programming" has a very
  nice little section about this on page 29 - read and enjoy). The
  physical I/O may well overlap with CPU activity.

- A read system call starts some machinery in the kernel which tries to
  decide whether sequential reads are being made and, if this seems to
  be true, carries out read-aheads, which often yields the same effect
  as asynchronous reads.

Of course, sometimes a program can decide more efficiently than the
kernel, especially if reads are not done sequentially, but it often
seems that UNIX programmers want to take too much responsibility for
efficient I/O. (This is more a general remark than a criticism of the
ideas of the poster.)

KEEP IN MIND: These are no longer the days of 'single job batch
processing', where it was the programmer's responsibility to keep the
CPU and the disk equally loaded. The UNIX kernel does quite a good job
in this respect - maybe not always the best, but far better than many
programmers think!
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
jas@postgres.uucp (James Shankland) (12/22/89)
In article <6679@lynx.UUCP> m5@lynx.uucp (Mike McNally) writes:
>peterson@crash.cts.com (John Peterson) writes:
>... in fact, if I had threads, I don't think I'd
>want or need a separate kernel-supported async I/O mechanism.

Yes! I strongly agree, and it sure is nice to hear someone else taking
this position. I consider asynchronous I/O a hack to compensate for the
absence of sufficiently lightweight processes. Of course, one person's
spartan elegance is another's semantic impoverishment, but I consider
it a cleaner, clearer programming model.

jas
archer@elysium.esd.sgi.com (Archer Sully) (12/28/89)
In article <20880@pasteur.Berkeley.EDU>, jas@postgres.uucp (James Shankland) writes:
> In article <6679@lynx.UUCP> m5@lynx.uucp (Mike McNally) writes:
> >peterson@crash.cts.com (John Peterson) writes:
> >
> >... in fact, if I had threads, I don't think I'd
> >want or need a separate kernel-supported async I/O mechanism.
>
> Yes! I strongly agree, and it sure is nice to hear someone else taking
> this position. I consider asynchronous I/O a hack to compensate for the
> absence of sufficiently lightweight processes. Of course, one person's
> spartan elegance is another's semantic impoverishment, but I consider
> it a cleaner, clearer programming model.
>
> jas

I've done this on IRIX (Silicon Graphics hacks....er IMPROVEMENTS to
SYSV :-), and it works OK. The benefit turns out to be highly dependent
on exactly what you are doing, and there are, of course, lots of funky
limitations, but it does work. If anyone wants it I may even be able to
dig it up.

Archer Sully        | A Mind is a Terrible thing to Taste
(archer@sgi.com)    |        - Ministry
lm@snafu.Sun.COM (Larry McVoy) (12/29/89)
peterson@crash.cts.com (John Peterson) writes:
> My colleagues and I have worked out a rough sketch of a way of doing
>asynchronous I/O. One would fork off a copy of your process, the child
>would 'nap' until an I/O request came from the parent. Upon receipt of
>an I/O request, the child goes off and issues a synchronous I/O request
>like one ordinarily does, and then set a flag of some sort when the I/O
>has completed. The data to be moved would be stored in memory accessible
>to the parent and child processes, probably using System V shared memory.

Yeah, this will work. A couple things to note:

(1) This is a bad idea for writes, especially under SunOS 4.x. See (2),
    (3), (4) below. It's a great idea for reads, especially if you do it
    right. I would keep a pool of processes around - i.e., don't do a
    fork per read; do a fork only if you haven't got someone hanging
    around (forks are not cheap, contrary to popular opinion). Also, let
    read-ahead work for you. Oh, yeah, do yourself a favor and valloc()
    your buffers rather than allocating space off the stack. It won't
    help you now, but I'm looking at ways of making I/O go fast, and one
    game I can play will only work if you give me a page-aligned buffer.
    And use mmap() if you can. It's much nicer than Sys V shm and it's
    in 5.4.

(2) Writes are already async, especially so on SunOS 4.x. I think it is
    limited by segmap, which is around 4 megs. On buffer-cache Unixes,
    you'll be limited to the size of the buffer cache (no kidding),
    which is fairly small - around 10-20% of memory.

(3) Having lots of outstanding writes doesn't buy you very much. In
    fact, it can really lead to weird behavior. Everyone should know
    that (on simple controllers, at least) writes go through disk sort,
    including synchronous writes (NFS is a heavy user of sync writes).
    Well, given that you go through disk sort, you won't ever get to
    starvation (i.e., a buffer will get written out), but you can get to
    something I call being very hungry.

    Suppose you have a disk queue that starts out with requests for cyl
    0 and 100. Then suppose you do a series of writes onto cyls >= 0 but
    < 100. The buffer waiting for cyl 100 will wait until all of those
    I/Os (that came in after it did) complete. That buffer waiting for
    100 is in the "hungry" state. Fortunately, this doesn't happen very
    often. Traces I've taken indicate that disk requests (due to the BSD
    fs) are nicely grouped. You have to have lots of busy processes
    doing unrelated I/O to get into this state. I suspect the async I/O
    could hit this problem.

(4) Those outstanding writes cost memory. You have to grab the user's
    data before saying "I'm done". SunOS 4.x claims this is a feature -
    "Our writes finish faster than your writes, especially for big ones"
    seems to be the party line. Well, for what I do this is a waste of
    memory, so I run a hacked version of ufs that limits outstanding
    writes (mail me if you have src and want to try this - it's trivial
    to implement and tunable. I'd be interested in outside comments).

(5) Reads could work really well.

What I say is my opinion. I am not paid to speak for Sun.

Larry McVoy, Sun Microsystems     ...!sun!lm or lm@sun.com
rec@dg.dg.com (Robert Cousins) (12/29/89)
I don't understand what all of the difficulty is. Several UNIX and
similar operating systems support both sync and async file I/O. In fact,
for those of you who are interested, DG/UX does support both synchronous
and asynchronous I/O. Furthermore, the semantics are such that it is
straightforward to use these features without many of the shared
sync/async gotchas.

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for myself alone.