gordon@prls.UUCP (Gordon Vickers) (10/26/87)
I am currently running 1.2, though I've had this problem since 1.0. I
have a very simple interface to the killpg(2) call, but it fails if
the parent process was started in rc.local.

Specifically, my application program xxx must run 24 hrs/day, so I
have an appropriate entry in my /etc/rc.local to ensure it gets
started. xxx is owned by root and has the setuid bit on. Once started,
it spawns some number of child processes and then sleeps until all of
the children have terminated.

The killpg(2) call has worked flawlessly every time except for those
occasions when xxx was started from /etc/rc.local. When the call
fails, the error indicates that there is no such process. I have, of
course, been the super user while using the call.

Any help in resolving this would be most appreciated, since killpg()
provides the only easy-to-use way of shutting down my application.

Gordon P. Vickers, (408) 991-5370,  =======================
Signetics Corp.                     || Ultrix-32 ver 1.2 ||
PO Box 3409 M/S 69                  || VAX 11/750        ||
Sunnyvale, California, USA 94086    =======================
{pyramid, philabs}!prls!gordon
-----
ALL DISCLAIMERS APPLY. In fact I have sent this whole mess by mistake.
mouse@uunet.UU.NET (der Mouse) (11/18/87)
In article <10364@felix.UUCP>, gordon@prls.UUCP (Gordon Vickers) writes:
> I am currently running 1.2 though I've had this problem since 1.0.  I
> have a very simple interface to the killpg(2) call but it fails if
> the parent process was started in rc.local.
> [...]
> The killpg(2) call has worked flawlessly every time except for those
> occasions when xxx was started from /etc/rc.local.

This is not peculiar to Ultrix.

I have been checking mtXinu 4.3+NFS source on this subject; I expect
that this code is nearly identical in all Berkeley derivatives. The
pronouncements below about what is and isn't done are thus based on the
mtXinu 4.3+NFS code.

Where do you get the process group you send the signal to? Do you
simply assume it is identical to the process ID of the parent process,
or do you use getpgrp()? If the former, bad boy - fix it, 'cause
otherwise it'll bite you eventually (like when you use this technique
on a process started as part of a pipeline). If the latter, note the
returned process group (it's 0 for rc.local processes, right?).

Note also that /bin/sh, which interprets rc.local, does not grok
process groups. All children of a sh inherit the sh's process group,
which was given to it by its parent. In the case of rc.local, this
runs all the way back to init. Init's process group is never set and
is therefore zero. Sending a signal to process group 0 with killpg(2)
is interpreted as a request to send to the process group of the
sending process. Thus, you wind up sending the signal to the process
supposedly doing the sending instead of to the process you wanted to
send it to.

So how do you reliably kill the whole tree?

Recommendation one: At startup, have the parent process check whether
it was started from rc.local and, if so, set its process group to
match its process ID. To check for rc.local, you can check for a
process group ID of zero, or for a missing control terminal (try to
open /dev/tty), or both, or something else if it occurs to you.

Recommendation two: Have your killing program know a little about the
internals of the kernel, a la ps, and have it run through the process
tree finding and killing the child processes.

Recommendation three: Have the child processes periodically check
whether the parent process has died (do getppid() at startup and
periodically try sending it signal 0), and make them go away if so.

Recommendation four: Have the parent process trap the signal and kill
all the children before dying.

If none of the above are satisfactory, send me mail explaining in
greater detail and I'll see if I can come up with any better ideas.

der Mouse (mouse@mcgill-power:56 you
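Recommendation three could be sketched roughly as below. This is only
an illustration, not code from the thread: process_exists(),
worker_loop(), and reaped_child_is_gone() are hypothetical names, and
the sketch assumes a POSIX system, where kill(pid, 0) performs the
existence and permission checks without delivering any signal.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a process with this pid exists, else 0.
 * pid must be positive; zero and negative values address process
 * groups, which is not what is wanted here. */
int process_exists(pid_t pid)
{
    assert(pid > 0);
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;  /* it exists, but we may not signal it */
}

/* Hypothetical child body: 'parent' is the value getppid() returned
 * at startup. Do one unit of work per pass and leave as soon as that
 * parent is gone. */
void worker_loop(pid_t parent, unsigned interval, void (*work)(void))
{
    while (process_exists(parent)) {
        if (work)
            work();
        sleep(interval);
    }
    _exit(0);
}

/* Usage sketch: a child that has exited and been reaped should no
 * longer be reported as existing. Returns 1 on success, 0 if the
 * child is still reported, -1 on a fork/wait error. */
int reaped_child_is_gone(void)
{
    pid_t c = fork();
    if (c < 0)
        return -1;
    if (c == 0)
        _exit(0);
    if (waitpid(c, NULL, 0) != c)
        return -1;
    return !process_exists(c);
}
```

A child would call worker_loop(getppid(), 5, do_work) right after the
fork, where do_work and the 5-second interval are placeholders.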
chuck@felix.UUCP (12/03/87)
-In an article <recently>, mouse&uunet (der Mouse) writes:
->In article <10364@felix.UUCP>, gordon@prls.UUCP (Gordon Vickers) writes:
->> I am currently running 1.2 though I've had this problem since 1.0.  I
->> have a very simple interface to the killpg(2) call but it fails if
->> the parent process was started in rc.local.
->> [...]
->> The killpg(2) call has worked flawlessly every time except for those
->> occasions when xxx was started from /etc/rc.local.
-
-This is not peculiar to Ultrix.
-I have been checking mtXinu 4.3+NFS source on this subject; I expect
-that this code is nearly identical in all Berkeley derivatives. The
-pronouncements below about what is and isn't done are thus based on the
-mtXinu 4.3+NFS code.

Three or four weeks ago, someone sent me the appropriate fix to my
problem. Since it has been a while, I no longer have the message, so I
don't know who deserves the credit (Sorry Mr/Mrs/Ms However).

One line of code fixed the problem. Had I been more familiar with the
available library routines, perhaps I would not have had the
difficulty. The line that cured it all:

    setpgrp(0, getpid());

Pretty easy, huh?

BTW: Thanks to ALL who responded; even the inappropriate suggestions
contained interesting ideas and useful knowledge.

Gordon P. Vickers, (408) 991-5370,  =======================
Signetics Corp.                     || Ultrix-32 ver 1.2 ||
PO Box 3409 M/S 69                  || VAX 11/750        ||
Sunnyvale, California, USA 94086    =======================
{pyramid, philabs}!prls!gordon
-----
ALL DISCLAIMERS APPLY. In fact I have sent this whole mess by mistake.
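For readers who want to see where that one line sits, here is a rough
sketch of a parent that makes itself a process group leader before
forking, so that killpg() can later reach the whole family. It is not
Gordon's program: start_group() and stop_group() are hypothetical
names, the idle workers stand in for the real children, and the sketch
uses the POSIX call setpgid(0, 0) in place of the 4.3BSD form
setpgrp(0, getpid()) so that it compiles on current systems.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for "xxx": fork a parent that becomes its own
 * process group leader, then forks nworkers idle children. Returns
 * the leader's pid (which is also the group id), or -1 on failure. */
pid_t start_group(int nworkers)
{
    int ready[2];
    char c;
    ssize_t n;
    pid_t pid;

    assert(nworkers >= 0);
    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        /* The fix: own group first, so every child inherits it. */
        if (setpgid(0, 0) < 0)
            _exit(1);
        for (int i = 0; i < nworkers; i++) {
            if (fork() == 0) {
                close(ready[1]);
                for (;;)
                    pause();        /* idle worker */
            }
        }
        /* Tell the caller the group is fully set up. */
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        while (wait(NULL) > 0)
            ;                       /* sleep until children end */
        _exit(0);
    }
    close(ready[1]);
    n = read(ready[0], &c, 1);
    close(ready[0]);
    if (n != 1) {
        waitpid(pid, NULL, 0);      /* reap the failed leader */
        return -1;
    }
    return pid;
}

/* Signal the whole group and reap the leader. Returns 0 on success. */
int stop_group(pid_t leader)
{
    if (killpg(leader, SIGTERM) < 0)
        return -1;
    return waitpid(leader, NULL, 0) == leader ? 0 : -1;
}
```

The pipe is only there so the caller does not signal the group before
the leader has finished setting it up.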