thomson@uthub.UUCP (Brian Thomson) (02/13/84)
Index: sys/kern_descrip.c 4.2BSD Fix
Description:
It is possible to close a file descriptor more than once,
or otherwise use it after it has been closed, and
possibly after another process has reallocated it or
reallocated the in-core inode it points to.
Repeat-By:
If you have installed Jeff Mogul's firewall panic in
closef() [ref. unix-wizards <14552@sri-arpa.UUCP> Dec 12 83]
you may have already seen this. If not, then PUT IT IN
FIRST!!! like so:
kern_descrip.c, closef(), before
(*fp->f_ops->fo_close)(fp);
insert
if(fp->f_count < 1)
panic("closef: f_count < 1");
Then run the following program with the shell 'exec' command,
such that it is the only process that has your terminal open:
#include <sys/types.h>
#include <setjmp.h>
#include <signal.h>
#include <sys/ioctl.h>
jmp_buf jb;
int zero;
gorp() {
longjmp(jb, 0);
}
main() {
int i;
for(i = 0; i < 20; i++)
if(i != 1) close(i);
setjmp(jb);
ioctl(1, TIOCSTART, 0);
ioctl(1, TIOCFLUSH, &zero);
ioctl(1, TIOCSTOP, 0);
write(1, "a", 1);
signal(SIGALRM, gorp);
alarm(1);
close(1);
}
If all went well (so to speak) a single 'a' will print on your
terminal, and your 4.2 system will have paniced.
Fix:
What is happening here is that the process is sleeping at
interruptible priority in sys/tty.c routine ttywait()
for the output queue to drain. Routine close() in sys/kern_descrip.c
doesn't clear u.u_ofile[1] until the close is complete,
even though the reference counts in the file table entry
and in the inode are decremented BEFORE the close completes,
so after the signal wakes us up we have a pointer to a
freed file table entry which may also point to a freed
in-core inode.
Note that the same scenario holds if the process takes
a fatal signal during the close, since normal exit() handling
involves closing all open file descriptors, and f.d. #1
still looks open.
4.1 used to clear the u.u_ofile[] entry in close() before
calling closef(), and doing so would indeed correct the above
instance of the problem. But closef() is called from several
other places in the code, none of which expect to be interrupted:
copen() in sys/ufs_syscalls.c
setregs() in sys/kern_exec.c
unp_discard() in sys/uipc_usrreq.c
The fix I have adopted is suggested by the comment 'XXX Should catch'
in sys/sys_inode.c's ino_close(). In that routine, add the
declaration
label_t oqsave;
and replace the last line
(*cfunc)(dev, flag, fp);
by the lines
oqsave = u.u_qsave;
if(setjmp(&u.u_qsave) == 0)
(*cfunc)(dev, flag, fp); /* last arg is invalid!! */
u.u_qsave = oqsave;
There is still the minor problem that the tty doesn't
quite get properly closed if a signal is taken, but
that is preferable to the mashed file systems people have
been reporting. A better, but harder, fix would be
to rework ttyclose() so that, instead of waiting, the
tty structure gets marked TS_CLOSING and the close
completes asynchronously at interrupt time when the
outq has drained.
I think the lesson to be learned here is that UNIX has become
too big to let this setjmp/longjmp-on-signal nonsense remain
much longer.
--
Brian Thomson, CSRG Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!thomsonjbray@bbncca.ARPA (James Bray) (02/24/84)
Yes, that's cute, isn't it? [sys/sys2.c]close(), in my kernel, v7, and sys5, all do something like u.u_ofile[up->fdes] = NULL; closef(fp); 4.2bsd [sys/kern_descrip.c] has, if I may quote, closef(fp); /* WHAT IF u.u_error ? */ u.u_ofile[uap->i] = NULL; Well, what if, indeed. I wonder why they felt they had to reverse these lines. I don't know why one would want to keep a pointer to a structure that one was trying to get rid of, unless one was to change one's mind (which is unfortunately what seems to be happening)... It would certainly be better to lose a pointer to something you still have than to have a pointer to /dev/rdisk00... I wonder if it wasn't just an editing slip. I've always had a morbid fear of those myself... Your editor is in a bad mood, or your fd's get crossed, and somewhere, deep in your kernel, where you don't look very often but don't get me wrong, I'm not trying to put down say acct.c or something, but just say it was in there, and a "==" got changed to a "=" (which used to be my favorite little error, but I am wary of it now), and every six months, like clockwork, it would cause a wild jump into the disk start routine with just the right junk on the stack to start up spiral writes on all your drives... I dunno, maybe lint would catch it... If you were lucky... (diabolical laughter, SCTV 3D noises, and then "pani~]s$.7"...) --Jim Bray