joe@fluke.UUCP (Joe Kelsey) (08/02/84)
vhangup() doesn't do ALL of what I want it to do. What it DOES do quite well is turn off I/O to the particular tty in question when someone logs out with background processes still running. Unfortunately, the background process still has the tty open, so the various reference count fields associated with the terminal never drop to zero, and if the tty happens to be set HUPCLS, the tty close routine never gets called to actually hang up the line. This is a problem especially if you have your terminals connected to a port selector (like a Micom 600), since the line won't be automatically disconnected at logout.

I've been staring at kernel source all morning and I can't really see any easy way to fix vhangup(). One possibility would be to actually call closef() from within the inner loop of forceclose(), but this might cause problems when the process in question eventually issues its own close(). I can't really tell if it WOULD cause a problem, although I suspect that there are a lot of race conditions possible. If I could somehow substitute a reference to /dev/null, that would be best, but I'm not sure exactly how I would do all this from the kernel without causing lots of trouble. I can think of lots of potential panic()s that could arise here.

Does anyone have any good ideas about the best way to approach modifying vhangup()'s behavior? I guess the practical hack is to modify the shells to toggle DTR just before they exit at logout. This is NOT what I really want - I'm not even sure if I can do that to /bin/sh - I'm sure another hack couldn't hurt csh any!

Well, I guess I'll just go back and stare at the kernel code some more. I'll either get a sudden inspiration or maybe just catch up on my beauty sleep!

/Joe
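The reference-count problem described in this post can be illustrated with a toy model in C. This is only a sketch, not kernel code: struct toy_tty and the toy_* functions are invented for illustration, and the real tty structures and close path are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one terminal line (hypothetical, not the kernel's struct tty). */
struct toy_tty {
    int  open_count;  /* number of processes still holding the tty open */
    bool hupcls;      /* hang up the line on last close */
    bool io_enabled;  /* cleared when vhangup revokes access */
    bool dtr;         /* true while the line is held up */
};

void toy_open(struct toy_tty *tp)
{
    tp->open_count++;
    tp->io_enabled = true;
    tp->dtr = true;
}

/* The hangup only happens when the LAST reference goes away. */
void toy_close(struct toy_tty *tp)
{
    assert(tp->open_count > 0);
    if (--tp->open_count == 0 && tp->hupcls)
        tp->dtr = false;
}

/* Models only the part of vhangup that works: I/O is turned off. */
void toy_vhangup(struct toy_tty *tp)
{
    tp->io_enabled = false;
}
```

With two opens (the login shell and a background job), toy_vhangup() followed by a single toy_close() leaves dtr set; it only drops when the background job finally closes the line, which is the behavior the port selector never sees.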
joe@fluke.UUCP (Joe Kelsey) (08/03/84)
I have been staring at the code more, and am on the verge of inserting
some new code. What I am thinking of doing is calling the device close
routine from vhangup() if it detects any processes still running. Now,
these processes may or may not terminate when they receive the SIGHUP
that vhangup() sends. If they do, fine. If they ignore or otherwise
catch SIGHUP, then my original problem occurs. What I propose to do is
change the vhangup code as follows:
#ifdef FLUKE
	i = forceclose (u.u_ttyd);
	if ((u.u_ttyp -> t_state) & TS_ISOPEN) {
		gsignal (u.u_ttyp -> t_pgrp, SIGHUP);
		/* Insert call to device close routine here */
		printf ("vhangup: dev %d/%d forceclose returns %d\n",
			major (u.u_ttyd), minor (u.u_ttyd), i);
	}
#else
	forceclose (u.u_ttyd);
	if ((u.u_ttyp -> t_state) & TS_ISOPEN)
		gsignal (u.u_ttyp -> t_pgrp, SIGHUP);
#endif /* FLUKE */
As you can see, it also involves a trivial change to forceclose() to
return the number of file descriptors that it found associated with the
device. I am going to run with the printf statement for a while before
I decide whether or not I need to use the number returned in a
conditional call to the device close or whether the test for the device
being open is sufficient. It seemed to me that this was the solution
least likely to cause problems with file and inode counts, but still
have the effect of dropping DTR on the line and hanging up the
connection.
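The "trivial change to forceclose()" might look roughly like the following toy version. This is a sketch under assumptions: the real forceclose() source is not shown in this post, so struct toy_file, TOY_NFILE and toy_forceclose() are hypothetical stand-ins for the kernel's file table and its scan.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NFILE 8  /* hypothetical size of the toy file table */

/* Toy file-table entry, far simpler than the kernel's struct file. */
struct toy_file {
    int in_use;  /* nonzero if this slot is allocated */
    int dev;     /* device number the entry refers to */
    int can_io;  /* cleared once access has been revoked */
};

/*
 * Revoke I/O on every entry that refers to dev and return how many
 * such entries were found, so the caller can log or test the count.
 */
int toy_forceclose(struct toy_file *tab, size_t n, int dev)
{
    int found = 0;

    assert(tab != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].in_use && tab[i].dev == dev) {
            tab[i].can_io = 0;
            found++;
        }
    }
    return found;
}
```

A caller could then print the returned count, as the printf in the FLUKE branch does, or use a nonzero count as the condition for calling the device close routine.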
Any comments about this proposal? After I run with the printf in the
kernel for a week or two I'll be able to tell more about how often the
problem really occurs.
/Joe
pc@ukc.UUCP (R.P.A.Collinson) (08/11/84)
The vhangup problem discussed by Joe Kelsey and the getty/dialup problem are related. One fundamental problem with the vhangup call is that it takes a file descriptor and not a device name. This means that the terminal has to be opened, vhangup'ed and then closed, and the act of opening the device may foul up reference counts on inodes and so on. I also suspect that there is a fundamental problem in the fact that inodes for terminals are never locked, so that UNIX can scribble when you are inputting; there are occasional race problems with the close code.

The terminals on our systems are connected via a network and we want all terminals to detach cleanly from a machine on logout. The problem is to do this in a benign way, not killing everything which a user has carefully left running in the background.

I have had various attempts at this, and the current scheme is as follows:

1) Invent a new system call (yes, YET ANOTHER one) called
       hangtty(dev) char *dev;
   The dev argument is used to obtain a pointer to the incore inode for the terminal.
2) Set a flag, IDEAD, in this inode (UCB thoughtfully left a few spare bits) and call the terminal close routine. We must now make sure that any further action on the inode does not get through to the device.
3) A write/read to an IDEAD inode results in a SIGKILL being sent to the offending process. The user has no business leaving a background process running which writes or reads when the user has gone away.
4) The call to get a new inode for the terminal (iget) ignores any inodes with IDEAD bits. So for short periods, there is more than one inode referring to the terminal. One is live, all the others have the IDEAD bit set.

This scheme appears to work, and is in use at mcvax. I believe that certain ex-UCB people don't like it because it cuts across the layering in the kernel. But still.....
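Steps 2 to 4 of the scheme can be sketched as a toy inode table in C. This is an illustration only and not the code in use at ukc or mcvax: struct toy_inode, the TOY_IDEAD bit value and all toy_* functions are hypothetical, the call to the terminal close routine is omitted, and the SIGKILL is represented by a return value rather than an actual signal.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IDEAD 0x1  /* hypothetical flag bit: inode is detached */

struct toy_inode {
    int in_use;  /* nonzero if this slot is allocated */
    int dev;     /* terminal device the inode refers to */
    int flags;
};

enum toy_io_result { TOY_IO_OK, TOY_IO_SIGKILL };

/* Step 2: mark every incore inode for dev as dead. */
void toy_hangtty(struct toy_inode *tab, size_t n, int dev)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].in_use && tab[i].dev == dev)
            tab[i].flags |= TOY_IDEAD;
}

/* Step 3: I/O through a dead inode would get the caller killed. */
enum toy_io_result toy_rdwr(const struct toy_inode *ip)
{
    return (ip->flags & TOY_IDEAD) ? TOY_IO_SIGKILL : TOY_IO_OK;
}

/*
 * Step 4: look up dev, ignoring dead inodes. If no live inode exists,
 * take a free slot, so a dead and a live inode can coexist for a while.
 * Returns NULL if the table is full.
 */
struct toy_inode *toy_iget(struct toy_inode *tab, size_t n, int dev)
{
    struct toy_inode *free_slot = NULL;

    assert(tab != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!tab[i].in_use) {
            if (free_slot == NULL)
                free_slot = &tab[i];
        } else if (tab[i].dev == dev && !(tab[i].flags & TOY_IDEAD)) {
            return &tab[i];
        }
    }
    if (free_slot != NULL) {
        free_slot->in_use = 1;
        free_slot->dev = dev;
        free_slot->flags = 0;
    }
    return free_slot;
}
```

After toy_hangtty(), a process still holding the old inode gets TOY_IO_SIGKILL from toy_rdwr(), while the next toy_iget() for the same device hands back a fresh, live inode.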
(On the UCB/System V discussion, Bill Joy did us all a disservice in csh when he allowed people to leave background jobs running without having said nohup.)

Peter Collinson
University of Kent, UK
vax135!ukc!pc
mcvax!ukc!pc
joe@fluke.UUCP (Joe Kelsey) (08/14/84)
>From: pc@ukc.UUCP (R.P.A.Collinson)
>
>3) A write/read to an IDEAD inode results in a SIGKILL being sent
>   to the offending process.
>   The user has no business leaving a background process running
>   which writes or reads when the user has gone away.

I object strenuously to any solution which generates a SIGKILL! This is completely bogus! I can't write software which ignores this signal! What you want is to quietly replace the inode referring to the terminal with one referring to /dev/null and also send a SIGHUP. The current implementation correctly sends the SIGHUP, but for processes which are either nohup or choose to ignore SIGHUP, this doesn't work. Going around killing processes for no reason is really an extremely bad practice and should not be placed in any kernel code!

>(On the UCB/System V discussion, Bill Joy did us all a disservice in csh when
>he allowed people to leave background jobs running without having said
>nohup).

I see absolutely no disservice here. You can do exactly the same thing in System V. The problem is NOT csh - the problem is a process which CHOOSES to ignore SIGHUP! Let's stop this UCB bad-mouthing unless you really know what you are talking about.

BTW - I have heard from jwp@sdchema that there is an implementation of a new system call called chfile() which will substitute one file for another and does EXACTLY what I (and chris@umcp-cs) want. I am anxiously awaiting more details about the code so I can install it here. I could write it myself, but why re-invent the wheel?

/Joe
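The details of chfile() are not given in this thread, so the following is not that call. It is only a user-level analogue of the idea of substituting /dev/null for a descriptor, using the standard open() and dup2() calls. The helper name redirect_to_devnull() is made up, and unlike a kernel-side substitution it has to be called by the process that owns the descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make fd refer to /dev/null instead of whatever it referred to before.
 * Returns 0 on success, -1 on failure (errno is set by open or dup2).
 */
int redirect_to_devnull(int fd)
{
    int nullfd;
    int rc;

    assert(fd >= 0);
    nullfd = open("/dev/null", O_RDWR);
    if (nullfd < 0)
        return -1;
    if (nullfd == fd)  /* fd was not open, and open() reused its number */
        return 0;
    rc = dup2(nullfd, fd);  /* atomically closes the old file on fd */
    close(nullfd);
    return rc < 0 ? -1 : 0;
}
```

After the call, writes to the descriptor succeed and are discarded, and reads return end-of-file, so a background job that ignores SIGHUP keeps running without ever touching the terminal again.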