scs@adam.mit.edu (Steve Summit) (10/25/90)
Jon Kamens mentioned a good example of why checking the return values of ALL system calls (e.g. close), even when you think they "can't fail," is a good idea. A few people have disputed this, choosing to label filesystems (e.g. remote ones) which can return an error on close "broken." Evidently the wish is to retain the notion that close "can't fail." As Boyd Roberts and others have correctly pointed out, the best policy is to check for errors from all routines which can return them, system calls and otherwise. An interface specification is a contract between implementer and caller. If it says "this routine returns -1 and sets errno on error," then the most robust code will dutifully check for the error condition, even if the programmer "knows" that the implementation currently in use "never fails." Those words on the specification mean that some later reimplementor is perfectly within rights to add new error returns. (If, on the other hand, the routine were declared as returning void, or without any other error indication, a later reimplementor would not have this option, and the calling code would be justified in not checking.) Remember that binary compatibility means that the operating system can change out from under your program without your even recompiling or relinking, so it is particularly important to anticipate changes in system call returns. You can't even blame the man page for not being updated to reflect new errno values (e.g. EDQUOT for close when remote filesystems were introduced). Looking at a man page and deciding that you don't care about any of the errno values listed there is as bad as looking at the kernel source code and deciding that you don't care about any of the errno values you find there. Both are liable to change (and the man page is likely to be incorrect in the first place, since gleaning syscall errno values from the kernel and keeping man pages up-to-date is drudgework that is not likely to be done reliably). (Remember, too, as Kristoffer Eriksson has pointed out, that close() could conceivably fail, with EIO, under the good, old-fashioned Unix filesystem, on a local disk, due to buffering and write-behind.) If there are errno values you truly don't care about, check for them explicitly, but complain about the ones you don't not care about (and the ones that haven't been invented yet): openfile(name) char *name; { int fd; if((fd = open(name, 1)) < 0) { switch(errno) { case ENOENT: /* doesn't exist, so create: */ fd = creat(name, 0666); if(fd >= 0) return fd; /* else fall through: */ default: fprintf(stderr, "%s: can't open %s: ", progname, name); perror(""); exit(1); } } return fd; } In earlier and simpler days, I never checked the return value of close. Then, when I started messing around with distributed filesystems, I discovered that it was reasonable for close() to fail. I didn't complain that the distributed filesystem was "broken" (in fact, I was writing it, so I could have "fixed" it if it was). I shook my finger at myself, and chastised myself for having been disobeying the "always check error returns" rule. I've been checking the return value from close (at least, for fd's open for writing :-( ) ever since. The call I've still been ignoring the return value of (shake, shake, naughty, naughty) is fstat. Anybody know when that can fail, as long as the file descriptor and stat buf pointer are valid? Actually, don't answer that. I'm going to listen to my own advice and start checking that value too, even if it "can't fail" today. Note that all of this stuff is relative. You're not a heathen atheist scum if you don't check the return value of a call that you are pretty sure can't fail. (There are even good reasons for not doing so, such as on systems with limited code or error message space.) Above I used words like "the best policy is..." and "...the most robust code will dutifully check...". The more care you put into things like rigorous error return checking and other specification details, the more robust, forward-looking, and widely portable your code will be. If you are not writing production-quality code, and you leave out some checking, your code may need tweaking later. (Putting /* XXX */ as a reminder in soft spots in the code is a common device.) By the way, I've included comp.unix.programmer in the distribution, since checking return values is something all programmers (Unix and otherwise) need to know about. (I missed out on the great comp.unix.wizards shakedown and all the brouhaha surrounding it. What a lot of squalling! Not that some of the new groups aren't reasonable, but what was so wrong with unix-wizards that it had to be removed? Answer by mail only, if at all, please -- I don't want to start more flames.) Steve Summit scs@adam.mit.edu
ronald@robobar.co.uk (Ronald S H Khoo) (10/25/90)
[ wizard readers: skip now. followups redirected. ] scs@adam.mit.edu (Steve Summit) writes: > default: > fprintf(stderr, "%s: can't open %s: ", > progname, name); > perror(""); usenet: can't open scs@adam.mit.edu: not a tty. (Steve, I'm shocked! *you* of all people :-) For those of you still puzzled: perror() must be called *immediately* after the error returns, or the internal error state may be modified by calls to system calls in the intervening situation. It's annoying, but perror(name); fprintf(stderr, "%s: error opening %s (see error message above)\n", progname, name); is more correct. -- ronald@robobar.co.uk +44 81 991 1142 (O) +44 71 229 7741 (H)
boyd@necisa.ho.necisa.oz (Boyd Roberts) (10/26/90)
In article <1990Oct24.193712.8693@athena.mit.edu> scs@adam.mit.edu (Steve Summit) writes: >The call I've still been ignoring the return value of (shake, >shake, naughty, naughty) is fstat. Anybody know when that can >fail, as long as the file descriptor and stat buf pointer are >valid? Actually, don't answer that. I'm going to listen to my >own advice and start checking that value too, even if it "can't >fail" today. Good idea. The fstat (or stat) may fail when it goes to write/read the inode times on the file-system. The file-system could go bad. Boyd Roberts boyd@necisa.ho.necisa.oz.au ``When the going gets wierd, the weird turn pro...''