scs@adam.mit.edu (Steve Summit) (10/25/90)
Jon Kamens mentioned a good example of why checking the return values of ALL system calls (e.g. close), even when you think they "can't fail," is a good idea. A few people have disputed this, choosing to label filesystems (e.g. remote ones) which can return an error on close "broken." Evidently the wish is to retain the notion that close "can't fail." As Boyd Roberts and others have correctly pointed out, the best policy is to check for errors from all routines which can return them, system calls and otherwise. An interface specification is a contract between implementer and caller. If it says "this routine returns -1 and sets errno on error," then the most robust code will dutifully check for the error condition, even if the programmer "knows" that the implementation currently in use "never fails." Those words on the specification mean that some later reimplementor is perfectly within rights to add new error returns. (If, on the other hand, the routine were declared as returning void, or without any other error indication, a later reimplementor would not have this option, and the calling code would be justified in not checking.) Remember that binary compatibility means that the operating system can change out from under your program without your even recompiling or relinking, so it is particularly important to anticipate changes in system call returns. You can't even blame the man page for not being updated to reflect new errno values (e.g. EDQUOT for close when remote filesystems were introduced). Looking at a man page and deciding that you don't care about any of the errno values listed there is as bad as looking at the kernel source code and deciding that you don't care about any of the errno values you find there. Both are liable to change (and the man page is likely to be incorrect in the first place, since gleaning syscall errno values from the kernel and keeping man pages up-to-date is drudgework that is not likely to be done reliably). (Remember, too, as Kristoffer Eriksson has pointed out, that close() could conceivably fail, with EIO, under the good, old-fashioned Unix filesystem, on a local disk, due to buffering and write-behind.) If there are errno values you truly don't care about, check for them explicitly, but complain about the ones you don't not care about (and the ones that haven't been invented yet): openfile(name) char *name; { int fd; if((fd = open(name, 1)) < 0) { switch(errno) { case ENOENT: /* doesn't exist, so create: */ fd = creat(name, 0666); if(fd >= 0) return fd; /* else fall through: */ default: fprintf(stderr, "%s: can't open %s: ", progname, name); perror(""); exit(1); } } return fd; } In earlier and simpler days, I never checked the return value of close. Then, when I started messing around with distributed filesystems, I discovered that it was reasonable for close() to fail. I didn't complain that the distributed filesystem was "broken" (in fact, I was writing it, so I could have "fixed" it if it was). I shook my finger at myself, and chastised myself for having been disobeying the "always check error returns" rule. I've been checking the return value from close (at least, for fd's open for writing :-( ) ever since. The call I've still been ignoring the return value of (shake, shake, naughty, naughty) is fstat. Anybody know when that can fail, as long as the file descriptor and stat buf pointer are valid? Actually, don't answer that. I'm going to listen to my own advice and start checking that value too, even if it "can't fail" today. Note that all of this stuff is relative. You're not a heathen atheist scum if you don't check the return value of a call that you are pretty sure can't fail. (There are even good reasons for not doing so, such as on systems with limited code or error message space.) Above I used words like "the best policy is..." and "...the most robust code will dutifully check...". The more care you put into things like rigorous error return checking and other specification details, the more robust, forward-looking, and widely portable your code will be. If you are not writing production-quality code, and you leave out some checking, your code may need tweaking later. (Putting /* XXX */ as a reminder in soft spots in the code is a common device.) By the way, I've included comp.unix.programmer in the distribution, since checking return values is something all programmers (Unix and otherwise) need to know about. (I missed out on the great comp.unix.wizards shakedown and all the brouhaha surrounding it. What a lot of squalling! Not that some of the new groups aren't reasonable, but what was so wrong with unix-wizards that it had to be removed? Answer by mail only, if at all, please -- I don't want to start more flames.) Steve Summit scs@adam.mit.edu
ronald@robobar.co.uk (Ronald S H Khoo) (10/25/90)
[ wizard readers: skip now. followups redirected. ] scs@adam.mit.edu (Steve Summit) writes: > default: > fprintf(stderr, "%s: can't open %s: ", > progname, name); > perror(""); usenet: can't open scs@adam.mit.edu: not a tty. (Steve, I'm shocked! *you* of all people :-) For those of you still puzzled: perror() must be called *immediately* after the error returns, or the internal error state may be modified by calls to system calls in the intervening situation. It's annoying, but perror(name); fprintf(stderr, "%s: error opening %s (see error message above)\n", progname, name); is more correct. -- ronald@robobar.co.uk +44 81 991 1142 (O) +44 71 229 7741 (H)
richard@aiai.ed.ac.uk (Richard Tobin) (10/26/90)
In article <1990Oct25.075856.4923@robobar.co.uk> ronald@robobar.co.uk (Ronald S H Khoo) writes: >For those of you still puzzled: perror() must be called *immediately* after >the error returns Or else you must save and restore errno. Is this guaranteed to work? -- Richard -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/26/90)
In article <1990Oct25.075856.4923@robobar.co.uk> ronald@robobar.co.uk (Ronald S H Khoo) writes: > perror(name); > fprintf(stderr, "%s: error opening %s (see error message above)\n", > progname, name); Correct but ugly. As long as you have an array sys_errlist[] with the messages, you can just copy errno into a variable and use that. To skip coding most of the common idioms you might snarf err.c from the pty package. ---Dan
boyd@necisa.ho.necisa.oz (Boyd Roberts) (10/26/90)
In article <1990Oct24.193712.8693@athena.mit.edu> scs@adam.mit.edu (Steve Summit) writes: >The call I've still been ignoring the return value of (shake, >shake, naughty, naughty) is fstat. Anybody know when that can >fail, as long as the file descriptor and stat buf pointer are >valid? Actually, don't answer that. I'm going to listen to my >own advice and start checking that value too, even if it "can't >fail" today. Good idea. The fstat (or stat) may fail when it goes to write/read the inode times on the file-system. The file-system could go bad. Boyd Roberts boyd@necisa.ho.necisa.oz.au ``When the going gets wierd, the weird turn pro...''
ronald@robobar.co.uk (Ronald S H Khoo) (10/26/90)
In article <3648@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes: > Or else you must save and restore errno. Is this guaranteed to work? I had thought not, but apparently the answer is "yes" because 4.1.3 says that errno expands to a "modifiable lvalue" and 4.9.10.4 says that perror() "maps the error number in the integer expression errno to an error message" So I guess you're right. OK. In <8215:Oct2521:30:3890@kramden.acf.nyu.edu>, Dan Bernstein <brnstnd@kramden.acf.nyu.edu> suggested using sys_errlist[] which is not standard. I suppose you can always provide a strerror() written in terms of sys_errlist[] if you don't already have strerror() [4.11.6.2]. Many people don't have strerror() which is why I did not mention it previously. (I have it, though) -- ronald@robobar.co.uk +44 81 991 1142 (O) +44 71 229 7741 (H)
chris@mimsy.umd.edu (Chris Torek) (10/26/90)
>In article <1990Oct25.075856.4923@robobar.co.uk> ronald@robobar.co.uk >(Ronald S H Khoo) writes: >> perror(name); >> fprintf(stderr, "%s: error opening %s (see error message above)\n", >> progname, name); In article <8215:Oct2521:30:3890@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >Correct but ugly. As long as you have an array sys_errlist[] ... But you do not---not on all systems, anyway. (It was not documented originally and a number of vendors changed things.) Instead of inventing Yet Another Way To Print Errors, why not use the one mandated by ANSI C, called `strerror'? char *strerror(int err); /* returns the system error string associated with system error number `err' */ fprintf(stderr, "%s: error opening %s: %s\n", progname, name, strerror(errno)); strerror is to be declared in <string.h> (`errno' is declared in <errno.h>) and found in the standard C library. If it does not exist on your system, add it (edit <string.h> if necessary; be sure to save the original, and to note that the change may need to be reapplied on every new release until your vendor catches on to the fact that the ANSI C standard exists). A suitable implementation (which may be compiled and merged into libc.a if necessary) appears below. (I could delete the copyright, since I wrote the thing, but I am feeling perverse. :-) ) /* * Copyright (c) 1989 The Regents of the University of California. * All rights reserved. * * Redistribution and use in source and binary forms are permitted * provided that the above copyright notice and this paragraph are * duplicated in all such forms and that any documentation, * advertising materials, and other materials related to such * distribution and use acknowledge that the software was developed * by the University of California, Berkeley. The name of the * University may not be used to endorse or promote products derived * from this software without specific prior written permission. * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. */ #include <string.h> /* * Return the error message corresponding to some error number. */ char * strerror(e) int e; { extern int sys_nerr; extern char *sys_errlist[]; static char unknown[30]; if ((unsigned)e < sys_nerr) return (sys_errlist[e]); (void) sprintf(unknown, "Unknown error: %d", e); return (unknown); } -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750) Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
sarima@tdatirv.UUCP (Stanley Friesen) (10/26/90)
In article <1990Oct25.075856.4923@robobar.co.uk> ronald@robobar.co.uk (Ronald S H Khoo) writes: >scs@adam.mit.edu (Steve Summit) writes: >> default: >> fprintf(stderr, "%s: can't open %s: ", >> progname, name); >> perror(""); >For those of you still puzzled: perror() must be called *immediately* after the >error returns, or the internal error state may be modified by calls >to system calls in the intervening situation. My solution, when I feel I *really* must place something in between the error and the call to perror(), is to save and restore errno. Thus: default: old_err = errno; fprintf(stderr, "%s: can't open %s: ", progname, name); errno = old_err; perror(""); Actually, I am likely to make the fprintf() above into an sprintf and pass the resulting string to perror() - it gets the same results, and doesn't involve passing an empty string to perror(). had enough? -- --------------- uunet!tdatirv!sarima (Stanley Friesen)
jim@segue.segue.com (Jim Balter) (10/31/90)
In article <27201@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes: [ stuff about strerror, including an implementation ] If you use strerror to implement perror, beware of a subtle trap: #include <errno.h> #include <stdio.h> #include <string.h> int main(void) { char * p; p = strerror(31415); /* presumably unknown error */ errno = 27182; /* ditto */ perror(""); printf("%s\n", p); } A naive implementation will print 27182 twice, which is wrong, because ANSI says that the string returned by strerror may be overwritten by a subsequent call of strerror, but it doesn't say that perror may overwrite it or that perror may call strerror. This and many other gotchas were known to the committee, but unfortunately they didn't provide an implementers' guide enumerating them.