[comp.unix.misc] killing a process gone bad.

gwoho@nntp-server.caltech.edu (g liu) (10/26/90)

suppose someone ran the following program:
main()
{
int i;
for (i=0; i<32; i++)
signal(i,(char *)1);
while (fork()!=-1)
;
}

then ran it. how would i kill it? i could kill one of them, but
then the others multiply until the limit is reached. then one fails,
terminates, and a new one appears, etc. so i have 1000 processes
with constantly changing pids that have to be stopped simultaneously.
any ideas for stopping it?
gwoho.

rickert@mp.cs.niu.edu (Neil Rickert) (10/26/90)

In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>
>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?

 1000 processes.  Wow.  I keep my limit to 50 processes per user.

 The power on/off switch is a very effective method of simultaneously
stopping processes.  Short of that, and assuming that the offending
processes are not running as 'root', you can try my 'csh' script.  Since
this depends on 'csh' you will need to temporarily change that user's
login shell to be /bin/csh for this script to work.

 If the PIDs are constantly changing you will need to run the script
several times to eliminate all of the runaway processes.  You must
do it carefully so as to not kill any of the additional processes
created by the script.

 I call the script 'force'.  If the user's program is called 'xyzzy'
and the user's name is 'baduser' I would, as root, use the following
command:

  ps uax | grep ^baduser | grep xyzzy | force

 and I would repeat that command as many times as needed until the
script prints out its help message to indicate that no more processes
are left (i.e. it received no input in the pipe).

 The reason it needs 'csh' is that it su's to the user before killing
a process.  If you are at the process limit you can't use /bin/kill
(running it would itself need a new process), so you need a shell with
a built-in kill command.

 The basic strategy is blocking.  Before killing a process, you create
a new, more harmless process for the user.  This keeps the number of
processes for the user at the limit, so that it cannot breed.  Eventually
they will all die by themselves or be killed by repeated applications of
the script, since the new harmless processes prevent creation of more.
Then finally you sit back and wait for the harmless processes (sleeps)
to naturally die.

 Hope this helps.

#!/bin/csh -fx
# This C-shell script is designed to forcibly kill self-propagating
# processes.  If a process is in a "fork" loop, it will continue to
# propagate (as will its children) until it reaches the limit of the
# number of processes per user.  It is often impossible to kill these
# processes, since as soon as one is killed, it makes room for another
# to propagate.
#  In order to force them off, it is necessary to block their reproduction
# by creating additional processes under the same userid.  This can be done
# with the "su" command.  Thus we go into a loop of su, kill a process, and
# repeat this until all processes are killed.  We must then terminate the
# newly created blocking processes.
#
#  To use this command, just run it (as root) with the input redirected to come
# from the output of the "ps -ut..| grep user" command, where .. refers
# to the terminal from which these processes were started.
set nonomatch
set count=0
set process=( x x x )
while ($#process > 2)
	set noglob
	eval set process=\($<\)
	unset noglob
	if ($#process < 3) break
	set u=$process[1]
	set p=$process[2]
	su $u -cf "kill -KILL $p ; exec sleep 1000" &
	@ count++
end
if ($count == 0) then
	echo "Usage:"
	echo "ps -uta31 | grep ^xyzzy | $0 "
	echo " will (we hope) kill off all propagating processes of"
	echo " user  xyzzy  from terminal  ttya31"
	echo "The killed processes are replaced by relatively harmless"
	echo "sleeping processes (sleeping for 17 minutes approx.)"
	echo "  the user MUST NOT be root, but only root can issue this command"
	echo " "
	echo " The user being forced off MUST be using /bin/csh as login shell"
	echo " "
	echo " Before using this command, it is a good idea to 'renice'"
	echo " all of the user's commands to +19, to essentially stop them."
	exit
endif
-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115.                                  +1-815-753-6940

pfalstad@rise.Princeton.EDU (Paul John Falstad) (10/27/90)

In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?

Yes.  If YOU ran this, you could kill it with 'kill -9 %%', which would
killpg() the whole process group.  If someone else ran it, you could
use a short C program that would just killpg() the process group of all
1000 processes.  A smart troublemaker would setpgrp(0,getpid()) after
each fork().

--
Paul Falstad, pfalstad@phoenix.princeton.edu PLink:HYPNOS GEnie:P.FALSTAD
"Your attention please.  Would the owner of the Baader-Meinhof shoulder-bag
which has just exploded outside the terminal please pick up the white
courtesy phone."

gsh7w@astsun.astro.Virginia.EDU (Greg Hennessy) (10/27/90)

g liu writes:
#i have 1000 processes
#with constantly changing pids that have to be stopped simultaneously.
#any ideas for stopping it?

/etc/reboot.

--
-Greg Hennessy, University of Virginia
 USPS Mail:     Astronomy Department, Charlottesville, VA 22903-2475 USA
 Internet:      gsh7w@virginia.edu  
 UUCP:		...!uunet!virginia!gsh7w

pierrot@opal.cs.tu-berlin.de (Tatjana Heuser) (10/29/90)

gwoho@nntp-server.caltech.edu (g liu) writes:

>then ran it. how would i kill it? i could kill one of them, but
>then the others multiply until the limit is reached. then one fails,
>terminates, and a new one appears, etc. so i have 1000 processes
>with constantly changing pids that have to be stopped simultaneously.
>any ideas for stopping it?
the fastest way that comes to my mind would be 'kill -9 -1',
sending a SIGKILL to every process attached. -not smart but effective-
-tatjana
--
Pierrot le fou      | UUCP: pierrot@tubopal.UUCP (pierrot@opal.cs.tu-berlin.de)
Tatjana Heuser      |         ...!unido!tub!opal!pierrot (Europe) 
D-1000 Berlin 30    |         ...!pyramid!tub!opal!pierrot (World)
Ettaler Str.2       | BITNET: pierrot%tubopal@DB0TUI11.BITNET (saves $$$)

news@brian386.uucp (News Administrator) (10/30/90)

In article <1990Oct25.185822.11838@nntp-server.caltech.edu> gwoho@nntp-server.caltech.edu (g liu) writes:
>suppose someone ran the following program:
>main()
>{
>int i;
>for (i=0; i<32; i++)
>signal(i,(char *)1);
>while (fork()!=-1)
>;
>}
>
>then ran it. how would i kill it? i could kill one of them, but

If you have superuser privileges, you could:

	kill -9 -1

(at least on SysV).  This has the "unfortunate" side effect of killing darn
near everything on the system, but it should take care of the problem.  Then
you edit /etc/passwd, and change the user's password to NONE ;-).

	brian

GEustace@massey.ac.nz (Glen Eustace) (11/01/90)

We recently had the exact situation described in the previous
posting.  There was a little more code involved but the net effect
was the same.  All attempts to clear out the system failed as there
was no spare CPU available to allow remedial action to be taken.  The
problem was cured by a reboot.

Following our problem, the perpetrator posted to comp.unix.questions
to find out what we could have done.  We received various replies
including the 'kill -9 -1' variety.

I have recently had the opportunity to repeat the exercise on a
dedicated 3 processor Pyramid MISServer.  After many experiments I
was unable to come up with a satisfactory solution.  My experimental
situation had CPU to spare and the user had consumed all 40 of their
available process table slots.  I still could not clear things up.

NB. The first thing I did was to renice the user to 19 so that all
current processes and any new ones would run as slowly as possible.

It would appear that kill doesn't lock the process table while killing
all of the process group; they still kept replicating.

I would appreciate any further comments by anyone who has a technique
that will actually solve the situation in practice not theory.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Glen Eustace, Software Manager, Computer Centre, Massey University,
Palmerston North, New Zealand.        EMail: G.Eustace@massey.ac.nz
Phone: +64 63 69099 x7440, Fax: +64 63 505 607,    Timezone: GMT-12

jdarcy@encore.com (Jeff d'Arcy) (11/01/90)

GEustace@massey.ac.nz (Glen Eustace) writes:
>I would appreciate any further comments by anyone who has a technique
>that will actually solve the situation in practice not theory.

It's nice to know you have such respect for your peers.  The solutions I
suggested have actually been used successfully on a variety of machines
and flavors of U*X from workstations to multiprocessor superminis, 4.3 to
SysV to Mach.  Just because something doesn't work for you is no reason
to get obnoxious.
--

Jeff d'Arcy, Generic Software Engineer - jdarcy@encore.com
             Ask me if I care. . .maybe I do

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (11/01/90)

In article <1119@massey.ac.nz>
	GEustace@massey.ac.nz (Glen Eustace) writes:

>It would appear that kill doesn't lock the process table while killing
>all of the process group; they still kept replicating.

>I would appreciate any further comments by anyone who has a technique
>that will actually solve the situation in practice not theory.

You should "kill -STOP -1" several times.

Having stopped all self-replicating processes, you can kill all of them.

If your UNIX does not have STOP, you are out of luck.

					Masataka Ohta

rickert@mp.cs.niu.edu (Neil Rickert) (11/02/90)

In article <1119@massey.ac.nz> GEustace@massey.ac.nz (Glen Eustace) writes:
>We recently had the exact situation described in the previous
>posting.  There was a little more code involved but the net effect
>was the same.  All attempts to clear out the system failed as there
>was no spare CPU available to allow remedial action to be taken.  The
>problem was cured by a reboot.
>
>Following our problem, the perpetrator posted to comp.unix.questions
>to find out what we could have done.  We received various replies
>including the 'kill -9 -1' variety.
>
  We have 10 processors.  Simple killing of replicating processes never
works, because more are created as fast as old ones are killed off.  I
regularly see students who inadvertently create the problem, and finish
up running out of processes (the local per-user limit is 50).

  I have NEVER had to reboot to resolve this problem.  My experience is
with a BSD system, so may not apply to SysV.

  Here are three simple approaches to try:

  (1)	The simple-minded approach.
	Look for a file which the programs depend on.  Try removing or
	renaming that file.  In particular, if the replicating process
	seems to be a shell script, look for a shell script in the user's
	directory named 'test'.

  (2)	The slow and tedious method.
	This is a method I sometimes ask the student and/or his instructor
	to use.  It is somewhat slow, as it requires killing all the processes
	individually.  It usually works.

	Step 1.  Find a list of the bad processes.  If the student is doing
	this himself, he can ask a friend on a different account to do a
	'ps uax|grep user' for this purpose.  Failing that, he should be
	able to log in, and then use 'exec ps ug'.  This will give the list
	of processes, but log him out again.

	Step 2.  Armed with a list of process IDs, start killing them with
	the STOP signal.
		exec /bin/kill -STOP pid pid pid ...
	The idea is to prevent further replication, but keep the processes
	in place so that you are always at the limit.  This step, and Step 1
	may have to be repeated several times to stop them all.

	Step 3.  Start killing the STOPPED processes.  To do this you
	will need the output of 'ps l'.  You must not kill a child before
	killing the parent.  Killing the child may cause the parent to
	wake up, and go back to its errant ways of replicating itself.
	Most of the time when you see this some of the processes have
	process 1 as the parent ID.  The procedure is to kill all of the
	errant processes whose PPID is 1.  Keep repeating this step till
	they are all gone.  Usually this becomes easier as you proceed,
	for you stop getting the 'out of processes' message after killing
	a few, and no longer need to 'exec /bin/kill' and log in again after
	every try.

  (3)	The brute force method.
	I posted a script to do this recently.  It was posted as article
	<1990Oct26.140851.11707@mp.cs.niu.edu>.  Read that article for
	full information.  It requires that you be root to execute it,
	and it requires that the perpetrator's login shell be 'csh'
	(because 'kill' is then builtin and doesn't require a new process).

	The basic idea is 'blocking'.  You keep the number of processes at
	the limit, so as to prevent further replication.  The script does
	the following:
		for each errant process
			create a new process (/bin/csh) for the user.
			kill the errant process
			the new process exec's to 'sleep 1000' (about 17
			 minutes) so as to be relatively harmless.
	If the processes are dying as well as replicating, my script may
	need to be rerun a few times.  But, regardless, it soon creates
	enough sleeps under the userid that further replication of all
	errant processes is impossible, so they either all die out
	naturally, or sit around long enough to be killed.

 I have thought of rewriting the script as a C-program.  It would be SUID,
so that anyone could use it.  Basically it would allow a user to type
'exec superkill' to kill all of his processes.  I have never bothered to do
this because the problem does not seem to crop up often enough to go to
the trouble.

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115.                                  +1-815-753-6940

boyd@necisa.ho.necisa.oz (Boyd Roberts) (11/05/90)

>In article <1119@massey.ac.nz>
>	GEustace@massey.ac.nz (Glen Eustace) writes:
>
>It would appear that kill doesn't lock the process table while killing
>all of the process group; they still kept replicating.
>

On a uni-processor the process table is effectively locked because
no other process can run while the kill() system call is running.
System calls only relinquish the CPU when they want to.  kill() is
not one of them.

If it's a uni-processor and they have the same process group it's easy.
A multi-processor or something calling setpgrp() is another problem.
Your process spawning may vary.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``November spawned a monster...'' - Morrissey