pc (03/02/83)
We run 4.1BSD and have installed several device drivers to support our Cambridge ring. Typically, a /dev entry is mapped in a dynamic way onto a printer (say) on our network. Most of the time the device driver functions perfectly, but there seem to be occasions (about once every two or three days, when the ring interface is in continual use) where the close routine in the driver isn't called.

The program which drives the printer opens it as standard output and then forks to pr to actually do the printing. So the file table entry for the device has a reference count of two - but there is only one inode entry in memory. Now, sometime during the closedown of this pair of processes, a user sends an interrupt signal to the parent. The parent traps this (using sigset) and passes on an interrupt to the child. I suspect (but can't prove) that this signal causes some form of screw-up in the file table mechanism, such that the code to call the close routine is never run - or what...?

There are two places (sys1.c, sys3.c) where the call to closef is not of the form

    fp = u.u_ofile[i];
    u.u_ofile[i] = NULL;
    closef(fp);

Is this the problem?

At the end of the day, the file table entry and the inode referenced by it have gone - but the device is locked because its internal tables say that it is busy. The evidence is that the close routine has never been called.

Has anyone out there in uucp-land experienced this sort of problem? Please don't just dismiss it and say OH it must be the device driver, because I have looked and looked and looked and looked. If anyone has had this problem and can give me a fix (or just wants to say they might have had it), please reply to

    lime!ukc!pc (best)
or  philabs!mcvax!ukc!pc
or  decvax!mcvax!ukc!pc

Peter Collinson, University of Kent UK
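For readers following the reference-count argument, here is a minimal user-space sketch of the mechanism the post describes: one shared file table entry with a count of two after the fork, and a device close that should fire only when the last reference is dropped. It is not the 4.1BSD source. The struct layouts, the NOFILE value, the dev_close_calls counter and the helpers close_slot and run_scenario are all assumptions made up for illustration; only the names closef, u_ofile and the three-line idiom come from the post.

```c
#include <assert.h>
#include <stddef.h>

#define NOFILE 20               /* descriptor slots per process; value assumed */

/* Simplified stand-ins for the kernel structures (hypothetical layouts). */
struct inode {
    int i_count;                /* in-core references to the inode */
    int dev_close_calls;        /* counts calls to the "driver close" */
};

struct file {
    int f_count;                /* descriptor slots sharing this entry */
    struct inode *f_inode;
};

struct user {
    struct file *u_ofile[NOFILE];
};

/* Drop one reference; run the device close only on the last one. */
static void closef(struct file *fp)
{
    if (fp == NULL)
        return;
    assert(fp->f_count > 0);
    if (--fp->f_count > 0)
        return;                         /* someone else still holds it */
    fp->f_inode->dev_close_calls++;     /* stand-in for the driver's close */
    fp->f_inode->i_count--;
}

/* The idiom quoted in the post: clear the slot first, then call closef. */
static void close_slot(struct user *u, int i)
{
    struct file *fp = u->u_ofile[i];

    u->u_ofile[i] = NULL;
    closef(fp);
}

/*
 * Parent opens the device on descriptor 1, forks, then child and parent
 * both close.  Stores the close count seen after the first close and
 * returns the final count.
 */
static int run_scenario(int *calls_after_first_close)
{
    struct inode ip = { 1, 0 };
    struct file f = { 1, &ip };
    struct user parent = { { NULL } };
    struct user child;

    parent.u_ofile[1] = &f;     /* open as standard output */
    child = parent;             /* fork copies the descriptor table */
    f.f_count++;                /* and the shared entry gains a reference */

    close_slot(&child, 1);
    *calls_after_first_close = ip.dev_close_calls;
    close_slot(&parent, 1);
    return ip.dev_close_calls;
}
```

In this model the first close leaves the device open and the second one triggers the driver close exactly once. Any path that clears a slot or frees the entry without going through closef, or that is interrupted between the two steps, would leave the device marked busy, which matches the symptom reported. Whether that is what actually happens in sys1.c or sys3.c is not established here.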
obrien@Rand-Unix (03/30/83)
This "close not called" bug has been in UNIX since at least research Version 6. I've had several stabs myself at finding it and have never managed the trick. I'm hoping the new signal stuff in 4.2 will make it go away. Berkeley's aware of the problem but they haven't found it either.
greep@Su-Dsn (03/31/83)
I've also had problems with close routines apparently not being called, especially when signals were being used. (This was on a driver for the Arpanet.) I think this is a known (but not too well known) bug in 4.1BSD. I don't know of any fix for it. Also, I once saw a tty left with the exclusive-open bit on, even though the only process talking to it had been closed. This may also have been caused by the close routine not being called. This was with the standard driver (DZ), so no non-Berkeley code was involved.