henry@utzoo.uucp (Henry Spencer) (09/21/88)
In article <20981@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>[new magtape] devices use a very large buffer, and in many cases
>the tapes don't even start to move until the write-end-of-file
>command is issued by the device driver in the close. If anything
>goes wrong and the data isn't written correctly, the close()
>function returns an error status but everything simply ignores it.
>
>If you are writing (or buying) software that is going to write
>to these devices, I strongly suggest you make sure that it
>checks the return value of close().

The same comment, actually, is much more broadly applicable. It's not
at all inconceivable for devices that use the buffer cache to report an
error in asynchronous I/O by returning an error from close(). One should
always check the result from close().

The same goes, double, for fclose(). There it's even stronger, because
fclose() has a high probability of doing buffer flushes that involve
actual I/O.
--
NASA is into artificial        | Henry Spencer at U of Toronto Zoology
stupidity. - Jerry Pournelle   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
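
The advice in the post above can be sketched in C roughly as follows. This is only an illustration, not code from the thread: the helper names write_file_fd() and write_file_stdio() are made up here, and the sketch assumes a POSIX-style system. The point is simply that the results of close() and fclose() are tested like those of any other I/O call.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of buf to path through a raw
 * descriptor.  Returns 0 on success and -1 on any failure, including
 * a failure that is reported only by close(). */
int write_file_fd(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the write */
            int saved = errno;
            close(fd);              /* already failing; keep first errno */
            errno = saved;
            return -1;
        }
        buf += n;                   /* short write: advance and loop */
        len -= (size_t)n;
    }

    /* A deferred write error may be reported only here. */
    if (close(fd) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: the same idea with stdio.  fclose() flushes the
 * stdio buffer, so it is often where the real I/O (and its error)
 * happens. */
int write_file_stdio(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF) {
        int saved = errno;
        fclose(fp);
        errno = saved;
        return -1;
    }

    if (fclose(fp) == EOF)
        return -1;
    return 0;
}
```

On Linux, pointing write_file_stdio() at /dev/full is a quick way to see the second case: the short string sits in the stdio buffer, so fputs() succeeds and the failure surfaces only at fclose().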
mike@turing.unm.edu (Michael I. Bushnell) (09/22/88)
In article <1988Sep20.230150.7574@utzoo.uucp>, henry@utzoo (Henry Spencer) writes:
>In article <20981@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>>If you are writing (or buying) software that is going to write
>>to these devices, I strongly suggest you make sure that it
>>checks the return value of close().

>The same comment, actually, is much more broadly applicable. It's not
>at all inconceivable for devices that use the buffer cache to report an
>error in asynchronous I/O by returning an error from close(). One should
>always check the result from close().

Sigh. If only. The problem here is that I have *never* seen a man
page for close which returns errors like EQUOT or EIO. The 4.3+NFS
man page says close can only fail with EBADF, and SunOS 4.0 says that
close will only fail with EBADF or EINTR. If close returns errors
like EQUOT or EIO, then the man page needs to be rewritten.

UNIX does *not* guarantee that hardware related errors
will get reflected on write. This is one of its deficiencies, but is
unavoidable given the implementation of the filesystem. The actual
disk write may take place hours after the write(2) system call
(assuming update isn't running). Do we resurrect the process to
return an error from the close(2)? What about processes that don't
explicitly close their file descriptors? That has always been
acceptable practice, but now we are told we are supposed to close
everything before we exit so we can check for undocumented errors.

The quota problem is one that NFS did not implement very well, alas.
The EIO problem is a different story... Since applications should not
count on write(2) working, even if the call returned success, in the
event that hardware goes bad, they should not complain about EIO being
returned late. At least it was returned sometime. And if the kernel
gives me EIO on a close, and the man page says nothing about it, they
should NOT return that error.
-- -- N u m q u a m G l o r i a D e o \ Michael I. Bushnell \ HASA - "A" division /\ mike@turing.unm.edu / \ {ucbvax,gatech}!unmvax!turing.unm.edu!mike
rb@ist.CO.UK (News reading a/c for rb) (09/23/88)
From article <1988Sep20.230150.7574@utzoo.uucp>, by henry@utzoo.uucp (Henry Spencer):
> It's not at all inconceivable for devices that use the buffer
> cache to report an error in asynchronous I/O by returning an
> error from close().

NFS is a case in point - there are all sorts of cases where close()
can fail after other operations have succeeded.
bzs@xenna (Barry Shein) (09/25/88)
For years I've suggested from time to time that there should be a
signal assigned for I/O errors which is by default OFF but can be
enabled by calling signal(). It should call the signal handler with
the fd that caused the signal and the errcode it would have returned
to the call that generated it (possibly some indication of the system
call, tho that might be hard.)

The advantage is that you don't have to wrap all your I/O's with
checks for errors (although I suppose there's still the grey area of
short read/writes, I think semantics can be worked out for that with
a little work.)

Another advantage (similar) is that I can simply add a few lines to
an existing program (like cat) and now check errors without hunting
down every place it does I/O. More importantly, I don't need the
sources to a library to add error checking (assuming I can return
sanely.)

The disadvantages (and implementation difficulties) are several, the
worst being derived from the fact that the error will not be detected
until all I/O physically completes. This means that I could, if
nothing is done to prevent it, close a file and later receive a signal
on that fd that there was an I/O error even tho by now I have it
opened to a different file, very confusing.

I suppose one semantic requirement might be that if the signal is
enabled then a close() automatically implies an fsync() first. At any
rate I don't think it's too much of a rat's nest, though I didn't say
the change was trivial.

I dunno, if it's useful for SIGFPE it seems similarly useful for I/O.

P.S. Yes, I am fully aware of what a SYNAD is.

	-Barry Shein, ||Encore||
daryl@ihlpe.ATT.COM (Daryl Monge) (09/25/88)
In article <1213@unmvax.unm.edu> mike@turing.unm.edu (Michael I. Bushnell) writes:
>UNIX does *not* guarantee that hardware related errors
>will get reflected on write. This is one of its deficiencies, but is
>unavoidable given the implementation of the filesystem. The actual
>disk write may take place hours after the write(2) system call
>(assuming update isn't running).

So true. I would like it if close(2) would insure all blocks were
successfully written to disk before it returned.
(Possibly by an fcntl(2) option if every one doesn't want this?)

Daryl Monge                     UUCP:   ...!att!ihcae!daryl
AT&T                            CIS:    72717,65
Bell Labs, Naperville, Ill      AT&T    312-979-3603
jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/26/88)
In article <3542@ihlpe.ATT.COM> daryl@ihlpe.UUCP (Daryl Monge) writes:
>In article <1213@unmvax.unm.edu> mike@turing.unm.edu (Michael I. Bushnell) writes:
>>UNIX does *not* guarantee that hardware related errors
>>will get reflected on write. ...
>> ... The actual
>>disk write may take place hours after the write(2) system call
>
>So true. I would like it if close(2) would insure all blocks were
>successfully written to disk before it returned.
>(Possibly by an fcntl(2) option if every one doesn't want this?)

this can be handled by open(...,|O_SYNCW); the following routine will
cause the given file descriptor to have the O_SYNCW bit set:

#include <fcntl.h>

/* request synchronous writes on an already-open descriptor;
 * returns -1 on failure, as fcntl(2) does */
int
setsync (fd)
int	fd;
{
	int	flags;

	/* fetch the current file status flags */
	if ((flags = fcntl (fd, F_GETFL, 0)) == -1)
		return (-1);

	flags |= O_SYNCW;	/* add the synchronous-write bit */
	return (fcntl (fd, F_SETFL, flags));
}
--
John F. Haugh II (jfh@rpp386.Dallas.TX.US)          HASA, "S" Division

"Why waste negative entropy on comments, when you could use the same
entropy to create bugs instead?" -- Steve Elias
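
O_SYNCW is the flag name on the poster's system. As an assumption rather than anything stated in the thread, the comparable flag on POSIX-style systems is O_SYNC, and the sketch below requests it at open() time instead of through F_SETFL, since some kernels (Linux among them) do not allow the synchronous-write flag to be changed with fcntl() after the file is open. The names open_sync() and is_sync() are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path for writing with synchronous writes
 * requested from the start.  Returns the descriptor, or -1 on error. */
int open_sync(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
}

/* Hypothetical helper: report whether fd has O_SYNC set.
 * Returns 1 if set, 0 if not, -1 if the flags cannot be read. */
int is_sync(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    /* Compare against the whole mask: O_SYNC may be more than one bit. */
    return (flags & O_SYNC) == O_SYNC;
}
```

With the flag set, each write() should not return until the data has reached the device, so an I/O error would normally be reported by write() itself rather than being deferred to close(). The return value of close() is still worth checking.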