bin@rhesus.primate.wisc.edu (Brain in Neutral) (11/16/88)
System: either a VAXstation w/Ultrix 2.2 or a VAX 8200 w/Ultrix 1.2.

I have an rsh-like program: it opens socket connections to a remote port,
tells the remote machine to execute a program, then forks. The child reads
local input and sends it to the remote command. The parent reads the remote
stdout and stderr. Signals to the parent get sent to the remote command on
a socket as well.

I get the following weird behavior on occasion, apparently only IF the
remote machine is the same as the local one, and if the remote command does
a lot of fast writing back to local: some of the data gets stuck in the
network (Send-Q for the "remote" end has a non-zero count in netstat
output), and the local parent never sees it. The parent is waiting for it,
because gcore gives a dump that shows it's in a select call.

What I don't understand is that the child (which is done writing to the
remote command and is ready to exit) is shown as a zombie by ps, AND the ps
flags for the child include SSEL (=400000), which indicates that *it* is
selecting! How can this be? gcore on the child fails (gcore says "Zombie").

1) Why does ps show the child and not the parent as selecting when the
   child is exiting and the parent is selecting?

2) Why does the parent's select fail when there is actually something to
   read (or why is the output stuck in the network)?

Now the ugly part of the above scenario. The remote command has finished;
it's just that the local parent is hung up waiting to receive the rest of
the output. Ok, fine, says I, I'll just ^C it. That's supposed to send a
signal into the socket. Of course, that socket goes nowhere. The system
dies with a segmentation violation. This does not seem friendly to me. Why
does it occur? Alternatively, how do I tell that I'd better not write into
that socket?

Paul DuBois
dubois@primate.wisc.edu
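
For readers trying to picture the parent side of the program described above, here is a rough sketch of a select loop that reads two descriptors (standing in for the remote stdout and stderr sockets) until both reach end of file. It is not the poster's code: the names drain_two and write_all are hypothetical, and it is written against modern POSIX headers rather than the Ultrix headers of the time. It only illustrates the structure and does not reproduce or explain the hang.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes to fd, retrying on EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical sketch of the parent's loop: copy out_fd to out_dst and
 * err_fd to err_dst until both sources report end of file.
 * Returns the total number of bytes copied, or -1 on error. */
long drain_two(int out_fd, int err_fd, int out_dst, int err_dst)
{
    int src[2], dst[2], is_open[2] = {1, 1};
    long total = 0;
    char buf[4096];
    int i;

    assert(out_fd >= 0 && err_fd >= 0);   /* FD_SET needs valid descriptors */
    src[0] = out_fd;  src[1] = err_fd;
    dst[0] = out_dst; dst[1] = err_dst;

    while (is_open[0] || is_open[1]) {
        fd_set rfds;
        int maxfd = -1;

        /* Rebuild the read set each pass; select overwrites it. */
        FD_ZERO(&rfds);
        for (i = 0; i < 2; i++) {
            if (is_open[i]) {
                FD_SET(src[i], &rfds);
                if (src[i] > maxfd)
                    maxfd = src[i];
            }
        }

        /* Block until at least one source is readable (no timeout). */
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        for (i = 0; i < 2; i++) {
            ssize_t n;

            if (!is_open[i] || !FD_ISSET(src[i], &rfds))
                continue;
            n = read(src[i], buf, sizeof buf);
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            if (n == 0) {           /* end of file on this source */
                is_open[i] = 0;
                continue;
            }
            if (write_all(dst[i], buf, (size_t)n) < 0)
                return -1;
            total += n;
        }
    }
    return total;
}
```

In a program like the one described, the two sources would be the sockets carrying the remote stdout and stderr, and the destinations would be the local descriptors 1 and 2.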
bin@primate.wisc.edu (Brain in Neutral) (11/18/88)
> Now the ugly part of the above scenario. The remote command has
> finished, it's just that the local parent is hung up waiting to receive
> the rest of the output. Ok, fine, says I, I'll just ^C it. That's
> supposed to send a signal into the socket. Of course, that socket goes
> nowhere. The system dies with a segmentation violation. This does not
> seem friendly to me. Why does it occur? Alternatively, how do I tell
> that I'd better not write into that socket?

My problem seems to be fixed by making sure that the client socket has
keepalive and linger turned on. Apparently a socket was getting shut down
too early on one end and writing something into it from the other end did
nasty things.

I am still of the opinion that it would be more reasonable for the system
to report an error in this case than to panic and crash, however.

Paul DuBois
dubois@primate.wisc.edu
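
For anyone wanting to try the same workaround, turning on keepalive and linger for a socket might look roughly like the sketch below. The helper name enable_keepalive_and_linger and the linger_secs parameter are hypothetical, and the code assumes the modern POSIX form of setsockopt and struct linger; the Ultrix releases mentioned in this thread may declare these differently, and the post does not say what linger interval was used.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_KEEPALIVE and SO_LINGER for fd.
 * linger_secs is an illustrative value chosen by the caller.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int enable_keepalive_and_linger(int fd, int linger_secs)
{
    int on = 1;
    struct linger lg;

    assert(fd >= 0);    /* caller must pass an open socket */

    /* Ask the system to probe an idle connection so that a dead peer
     * is eventually noticed. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;

    /* On close, wait up to linger_secs for unsent data to be delivered
     * instead of discarding the connection state immediately. */
    lg.l_onoff = 1;
    lg.l_linger = linger_secs;
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) < 0)
        return -1;

    return 0;
}
```

A client would call this once on the socket, right after socket() or connect(), for example enable_keepalive_and_linger(s, 5), and check the return value.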