gwoho@nntp-server.caltech.edu (g liu) (10/26/90)
suppose someone ran the following program:
main()
{
int i;
for (i=0; i<32; i++)
signal(i,(char *)1);
while (fork()!=-1)
;
}
then ran it. how would i kill it? i could kill one of them, but
then the others multiply until the limit is reached. then one fails,
terminates, and a new one appears, etc. so i have 1000 processes
with constantly changing pids that have to be stopped simultaneously.
any ideas for stopping it?
gwoho.
rickert@mp.cs.niu.edu (Neil Rickert) (10/26/90)
In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>
>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?

1000 processes. Wow. I keep my limit to 50 processes per user.

The power on/off switch is a very effective method of simultaneously
stopping processes.

Short of that, and assuming that the offending processes are not running
as 'root', you can try my 'csh' script. Since this depends on 'csh' you
will need to temporarily change that user's login shell to be /bin/csh
for this script to work.

If the PIDs are constantly changing you will need to run the script
several times to eliminate all of the runaway processes. You must do it
carefully so as to not kill any of the additional processes created by
the script.

I call the script 'force'. If the user's program is called 'xyzzy' and
the user's name is 'baduser' I would, as root, use the following command:

    ps uax | grep ^baduser | grep xyzzy | force

and I would repeat that command as many times as needed until the script
prints out its help message to indicate that no more processes are left
(i.e. it received no input in the pipe).

The reason it needs 'csh' is that it su's to the user before killing a
process. If you are at the process limit you can't use /bin/kill, so you
need a shell with a built-in kill command.

The basic strategy is blocking. Before killing a process, you create a
new, more harmless process for the user. This keeps the number of
processes for the user at the limit, so that it cannot breed.
Eventually they will all die by themselves or be killed by repeated
applications of the script, since the new harmless processes prevent
creation of more. Then finally you sit back and wait for the harmless
processes (sleeps) to naturally die.

Hope this helps.

#!/bin/csh -fx
# This C-shell script is designed to forcibly kill self propagating
# processes. If a process is in a "fork" loop, it will continue to
# propagate (as will its children) until it reaches the limit of the
# number of processes per user. It is often impossible to kill these
# processes, since as soon as one is killed, it makes room for another
# to propagate.
# In order to force them off, it is necessary to block their reproduction
# by creating additional processes under the same userid. This can be done
# with the "su" command. Thus we go into a loop of su, kill a process, and
# repeat this until all processes are killed. We must then terminate the
# newly created blocking processes.
#
# To use this command, just run it (as root) with the input redirected to come
# from the output of the "ps -ut..| grep user" command, where .. refers
# to the terminal from which these processes were started.

set nonomatch
set count=0
set process=( x x x )
while ($#process > 2)
	set noglob
	eval set process=\($<\)
	unset noglob
	if ($#process < 3) break
	set u=$process[1]
	set p=$process[2]
	su $u -cf "kill -KILL $p ; exec sleep 1000" &
	@ count++
end
if ($count == 0) then
	echo "Usage:"
	echo "ps -uta31 | grep ^xyzzy | $0 "
	echo " will (we hope) kill off all propagating processes of"
	echo " user xyzzy from terminal ttya31"
	echo "The killed processes are replaced by relatively harmless"
	echo "sleeping processes (sleeping for 17 minutes approx.)"
	echo " the user MUST NOT be root, but only root can issue this command"
	echo " "
	echo " The user being forced off MUST be using /bin/csh as login shell"
	echo " "
	echo " Before using this command, it is a good idea to 'renice'"
	echo " all of the user's commands to +19, to essentially stop them."
	exit
endif

--
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
Neil W. Rickert, Computer Science <rickert@cs.niu.edu>
Northern Illinois Univ.
DeKalb, IL 60115. +1-815-753-6940
pfalstad@rise.Princeton.EDU (Paul John Falstad) (10/27/90)
In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?

Yes. If YOU ran this, you could kill it with 'kill -9 %%', which would
killpg() the whole process group. If someone else ran it, you could use
a short C program that would just killpg() the process group of all 1000
processes.

A smart troublemaker would setpgrp(0,getpid()) after each fork().

Line fodder.

--
Paul Falstad, pfalstad@phoenix.princeton.edu PLink:HYPNOS GEnie:P.FALSTAD
"Your attention please. Would the owner of the Baader-Meinhof shoulder-bag
which has just exploded outside the terminal please pick up the white
courtesy phone."
gsh7w@astsun.astro.Virginia.EDU (Greg Hennessy) (10/27/90)
g liu writes:
#i have 1000 processes
#with constantly changing pids that have to be stopped simultaneously.
#any ideas for stopping it?
/etc/reboot.
--
-Greg Hennessy, University of Virginia
USPS Mail: Astronomy Department, Charlottesville, VA 22903-2475 USA
Internet: gsh7w@virginia.edu
UUCP: ...!uunet!virginia!gsh7w
pierrot@opal.cs.tu-berlin.de (Tatjana Heuser) (10/29/90)
gwoho@nntp-server.caltech.edu (g liu) writes:
>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?

the fastest way that comes to my mind would be 'kill -9 -1',
sending a SIGKILL to every process attached.
-not smart but effective-

-tatjana
--
Pierrot le fou   | UUCP: pierrot@tubopal.UUCP (pierrot@opal.cs.tu-berlin.de)
Tatjana Heuser   | ...!unido!tub!opal!pierrot (Europe)
D-1000 Berlin 30 | ...!pyramid!tub!opal!pierrot (World)
Ettaler Str.2    | BITNET: pierrot%tubopal@DB0TUI11.BITNET (saves $$$)
news@brian386.uucp (News Administrator) (10/30/90)
In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>
>then ran it. how would i kill it? i could kill one of them, but

If you have superuser privileges, you could:

    kill -9 -1

(at least on SysV). This has the "unfortunate" side effect of killing
darn near everything on the system, but it should take care of the
problem. Then you edit /etc/passwd, and change the user's password to
NONE ;-).

brian
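
Why 'kill -9 -1' takes out "darn near everything" follows from how
kill() interprets its pid argument. The small classifier below merely
restates those rules as generally documented for POSIX systems; the enum
and function names are invented for this sketch, and the exact set of
processes exempted in the -1 case differs between systems.

```c
#include <sys/types.h>

/* Which processes a given pid argument to kill() addresses. */
enum kill_scope {
    SCOPE_ONE_PROCESS,     /* pid > 0  : exactly that process          */
    SCOPE_CALLERS_GROUP,   /* pid == 0 : the caller's process group    */
    SCOPE_ALL_PERMITTED,   /* pid == -1: every process the caller may  */
                           /*            signal (nearly all, as root)  */
    SCOPE_NAMED_GROUP      /* pid < -1 : process group number -pid     */
};

/* Hypothetical helper: classify a pid the way kill() would treat it. */
enum kill_scope kill_scope_for(pid_t pid)
{
    if (pid > 0)
        return SCOPE_ONE_PROCESS;
    if (pid == 0)
        return SCOPE_CALLERS_GROUP;
    if (pid == -1)
        return SCOPE_ALL_PERMITTED;
    return SCOPE_NAMED_GROUP;
}
```

Run by root, the -1 case is what hits daemons and other users' logins
along with the runaway program.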
GEustace@massey.ac.nz (Glen Eustace) (11/01/90)
We recently had the exact situation described in the previous
posting. There was a little more code involved but the net effect
was the same. All attempts to clear out the system failed as there
was no spare CPU available to allow remedial action to be taken. The
problem was cured by a reboot.

Following our problem, the perpetrator posted to comp.unix.questions
to find out what we could have done. We received various replies
including the 'kill -9 -1' variety.

I have recently had the opportunity to repeat the exercise on a
dedicated 3 processor Pyramid MISServer. After many experiments I was
unable to come up with a satisfactory solution.

My experimental situation had CPU to spare and the user had consumed
all 40 of their available process table slots. I still could not
clear things up. NB. The first thing I did was to renice the user to
19 so that all current processes and any new ones would run as slowly
as possible.

It would appear that kill doesn't lock the process table while killing
all of the process group, they still kept replicating.

I would appreciate any further comments by anyone who has a technique
that will actually solve the situation in practice not theory.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Glen Eustace, Software Manager, Computer Centre, Massey University,
Palmerston North, New Zealand. EMail: G.Eustace@massey.ac.nz
Phone: +64 63 69099 x7440, Fax: +64 63 505 607, Timezone: GMT-12
jdarcy@encore.com (Jeff d'Arcy) (11/01/90)
GEustace@massey.ac.nz (Glen Eustace) writes:
>I would appreciate any further comments by anyone who has a technique
>that will actually solve the situation in practice not theory.

It's nice to know you have such respect for your peers. The solutions I
suggested have actually been used successfully on a variety of machines
and flavors of U*X from workstations to multiprocessor superminis, 4.3
to SysV to Mach. Just because something doesn't work for you is no
reason to get obnoxious.

--
Jeff d'Arcy, Generic Software Engineer - jdarcy@encore.com
Ask me if I care. . .maybe I do
mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (11/01/90)
In article <1119@massey.ac.nz> GEustace@massey.ac.nz (Glen Eustace) writes:
>It would appear that kill doesn't lock the process table while killing
>all of the process group, they still kept replicating.
>I would appreciate any further comments by anyone who has a technique
>that will actually solve the situation in practice not theory.

You should "kill -STOP -1" several times.

Having stopped all self-replicating processes, you can kill all of them.

If your UNIX does not have STOP, you are out of luck.

Masataka Ohta
rickert@mp.cs.niu.edu (Neil Rickert) (11/02/90)
In article <1119@massey.ac.nz> GEustace@massey.ac.nz (Glen Eustace) writes:
>We recently had the exact situation described in the previous
>posting. There was a little more code involved but the net effect
>was the same. All attempts to clear out the system failed as there
>was no spare CPU available to allow remedial action to be taken. The
>problem was cured by a reboot.
>
>Following our problem, the perpetrator posted to comp.unix.questions
>to find out what we could have done. We received various replies
>including the 'kill -9 -1' variety.
>

We have 10 processors. Simple killing of replicating processes never
works, because more are created as fast as old ones are killed off.

I regularly see students who inadvertently create the problem, and
finish up running out of processes (the local per-user limit is 50).
I have NEVER had to reboot to resolve this problem.

My experience is with a BSD system, so may not apply to SysV.

Here are three simple approaches to try:

(1) The simple-minded approach.

    Look for a file which the programs depend on. Try removing or
    renaming that file. In particular, if the replicating process seems
    to be a shell script, look for a shell script in the user's
    directory named 'test'.

(2) The slow and tedious method.

    This is a method I sometimes ask the student and/or his instructor
    to use. It is somewhat slow, as it requires killing all the
    processes individually. It usually works.

    Step 1. Find a list of the bad processes. If the student is doing
    this himself, he can ask a friend on a different account to do a
    'ps uax|grep user' for this purpose. Failing that, he should be
    able to login, and then use 'exec ps ug'. This will give the list
    of processes, but log him out again.

    Step 2. Armed with a list of process IDs, start killing them with
    the STOP signal.

        exec /bin/kill -STOP pid pid pid ...

    The idea is to prevent further replication, but keep the processes
    in place so that you are always at the limit.
    This step, and Step 1, may have to be repeated several times to
    stop them all.

    Step 3. Start killing the STOPPED processes. To do this you will
    need the output of 'ps l'. You must not kill a child before killing
    the parent. Killing the child may cause the parent to wake up, and
    go back to its errant ways of replicating itself. Most of the time
    when you see this some of the processes have process 1 as the
    parent ID. The procedure is to kill all of the errant processes
    whose PPID is 1. Keep repeating this step till they are all gone.
    Usually this becomes easier as you proceed, for you stop getting
    the 'out of processes' message after killing a few, and no longer
    need to 'exec /bin/kill' and relogin after every try.

(3) The brute force method.

    I posted a script to do this recently. It was posted as article
    <1990Oct26.140851.11707@mp.cs.niu.edu>. Read that article for full
    information. It requires that you be root to execute it, and it
    requires that the perpetrator's login shell be 'csh' (because
    'kill' is then builtin and doesn't require a new process).

    The basic idea is 'blocking'. You keep the number of processes at
    the limit, so as to prevent further replication. The script does
    the following:

        for each errant process
            create a new process (/bin/csh) for the user.
            kill the errant process
            the new process exec's to 'sleep 1000' (about 17 minutes)
            so as to be relatively harmless.

    If the processes are dying as well as replicating, my script may
    need to be rerun a few times. But, regardless, it soon creates
    enough sleeps under the userid that further replication of all
    errant processes is impossible, so they either all die out
    naturally, or sit around long enough to be killed.

I have thought of rewriting the script as a C-program. It would be
SUID, so that anyone could use it. Basically it would allow a user to
type 'exec superkill' to kill all of his processes. I have never
bothered to do this because the problem does not seem to crop up often
enough to go to the trouble.
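
The ordering rule in Step 3 of method (2) (kill parents before their
children) can be made mechanical. The sketch below is an illustration
and not part of the post: struct proc_entry and pick_parents_first are
invented names, and the pid/ppid pairs are assumed to have been copied
by hand from 'ps l' output. It selects the stopped processes whose
parent is not itself in the stopped set, which in the case described
above is the same as "PPID is 1".

```c
#include <stddef.h>
#include <sys/types.h>

/* One row of interest from 'ps l': a stopped process and its parent. */
struct proc_entry {
    pid_t pid;
    pid_t ppid;
};

/*
 * Copy into `out` the pids that are safe to kill in this pass: those
 * whose parent is not among the listed processes. `out` must have room
 * for `n` entries. Returns how many pids were written.
 */
size_t pick_parents_first(const struct proc_entry *procs, size_t n,
                          pid_t *out)
{
    size_t i, j, found = 0;

    for (i = 0; i < n; i++) {
        int parent_listed = 0;

        for (j = 0; j < n; j++) {
            if (procs[j].pid == procs[i].ppid) {
                parent_listed = 1;
                break;
            }
        }
        if (!parent_listed)
            out[found++] = procs[i].pid;
    }
    return found;
}
```

After killing the selected pids, the ps listing is taken again and the
selection repeated, matching "Keep repeating this step till they are
all gone."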
--
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
Neil W. Rickert, Computer Science <rickert@cs.niu.edu>
Northern Illinois Univ.
DeKalb, IL 60115. +1-815-753-6940
boyd@necisa.ho.necisa.oz (Boyd Roberts) (11/05/90)
In article <1119@massey.ac.nz> GEustace@massey.ac.nz (Glen Eustace) writes:
>
>It would appear that kill doesn't lock the process table while killing
>all of the process group, they still kept replicating.
>

On a uni-processor the process table is effectively locked because
no other process can run while the kill() system call is running.
System calls only relinquish the CPU when they want to, and kill()
is not one of those that do.

If it's a uni-processor and they have the same process group it's easy.
A multi-processor or something calling setpgrp() is another problem.

Your process spawning may vary.

Boyd Roberts boyd@necisa.ho.necisa.oz.au
``November spawned a monster...'' - Morrissey