[net.unix-wizards] fork timing hole and runaway children

trt@rti-sel.UUCP (Tom Truscott) (08/19/85)

By the way, the 'fork() loses signals' problem hit our site
about a month ago, due to a rampantly forking program.
Someone here created a program which forked infinitely:
	while ((pid = fork()) != THEONEIWANT)
		;
	/* Okay, now I have the pid I want ... */
This is one of those classic annoying UNIX problems
that seems to lack a general, simple, portable solution.
If I am wrong, someone please post it along with a man page.
Sorry this article is so long; think of it as either
(a) verbose evidence that fork() has a bug, or
(b) a suggestion that this subject be reopened.

I was busy reading news at the time, so we let the problem
go until people started complaining that troff was slow
(the load average was over 25).
Then we came up with the following incorrect solution:
	killpg(getpgrp(atoi(argv[1])), SIGKILL);  /* 'killpgrp'?  'getpg'? */
We used 'ps' (well, actually 'top') to get a pid
and gave it to the above program
in an attempt to kill all of the rampant monsters.
There were several flaws in the above program:
0) In general the monsters are in different process groups (we were lucky).
1) Since fork() shields a partial child (fetus?) from SIGKILL,
	we could not kill them all, and the ones that were left
	immediately regenerated.  This *really* slowed things down.
2) After a few rounds of this we fed a bad pid to the above program,
	so getpgrp returned -1, and we did a killpg(-1, SIGKILL).
	I am not sure why, but we found a reboot quite necessary.
	(It solved the problem!)

Over lunch Mike Shaddock decided that if we had done a SIGSTOP
rather than a SIGKILL we might have avoided the embarrassment.
But Tim Seaver (mcnc -- Microelectronics Center of North Carolina)
had a more general solution.  Run (as root) a program
which gobbles MAXUPRC process slots:
	for (i = 0; i < 25; i++)
	    if (fork() == 0) {
		setuid(rampantuid);
		sleep(5*60);	/* you have five minutes to clean up */
		exit(0);	/* else the child falls back into the loop and forks */
	    }
These methods require manual zapping of the monsters,
but at least they (probably) work.
	Tom Truscott

kwlalonde@watmath.UUCP (Ken Lalonde) (08/21/85)

Every term, at least one student writes one of those charming programs
like "for (;;) fork();".  Shooting them manually is tiresome and
sometimes impossible if the cancer has spread, so we added a new system
call:
	zonk(uid, signal)
Sends the given signal to all processes owned by uid, returning a count
of the number of processes found.  You can type "zonk fool" to send
SIGKILL to all of fool's processes.

If there is interest, I'll post the code.
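
The kernel code for zonk was not posted in this thread. As a rough user-space approximation of the interface described above, the sketch below walks /proc and signals every process whose entry is owned by the given uid. It assumes a Linux-style /proc in which each /proc/<pid> directory is owned by the process's effective uid; that is an assumption of this sketch, not something the 1985 systems offered. Unlike a real system call it is not atomic, so a fast forker can still outrun it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * User-space stand-in for zonk(uid, signal): send `sig` to every process
 * owned by `uid`.  Returns the number of processes successfully signalled,
 * or -1 if /proc cannot be read.  Assumes a Linux-style /proc.
 */
int zonk(uid_t uid, int sig)
{
    DIR *proc = opendir("/proc");
    struct dirent *ent;
    int count = 0;

    if (proc == NULL)
        return -1;
    while ((ent = readdir(proc)) != NULL) {
        char path[64];
        char *end;
        struct stat st;
        long pid = strtol(ent->d_name, &end, 10);

        if (end == ent->d_name || *end != '\0' || pid <= 0)
            continue;               /* not a pid directory */
        snprintf(path, sizeof path, "/proc/%ld", pid);
        if (stat(path, &st) != 0 || st.st_uid != uid)
            continue;               /* already gone, or someone else's */
        if (kill((pid_t)pid, sig) == 0)
            count++;
    }
    closedir(proc);
    return count;
}
```

Calling zonk(uid, 0) sends no signal and simply counts the processes that could be signalled, which makes a harmless dry run before zonk(uid, SIGKILL).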

dbr@foxvax5.UUCP (D. B. Robinson) (08/22/85)

----
You could try to use 'renice' if you are running a 4.2 BSD system.  This
is the way I fixed a similar situation.  Find the username of the user who
started the runaway, and then type (as root):

      renice +20 -u username

This should cause all of the little beasties (including the user's
shell) to go to sleep on most active systems.  You then can do the
killing at a little less hectic pace.

smk@axiom.UUCP (Steven M. Kramer) (08/25/85)

Rather than a system call named zonk, why not a simple program
(because the function is really too specific for a system call)?  It would:

gather the output of ps to find the uid's processes,
do
{
	send SIGSTOP to all of the processes,
	gather the output of ps to find the uid's processes,
} while (ps's view of the user's processes keeps changing && the limit is not reached)

send SIGKILL to all of the processes.
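
The outline above might be sketched in C roughly as follows. Several things here are assumptions rather than part of the proposal: it reads a Linux-style /proc instead of parsing ps output, it compares only the count of processes between rounds rather than the full list, it skips the calling process, and the names signal_all and stop_then_kill are invented.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Send `sig` to every process owned by `uid` except the caller.
 * Returns how many accepted the signal, or -1 if /proc is unreadable. */
static int signal_all(uid_t uid, int sig)
{
    DIR *proc = opendir("/proc");
    struct dirent *ent;
    int count = 0;

    if (proc == NULL)
        return -1;
    while ((ent = readdir(proc)) != NULL) {
        char path[64], *end;
        struct stat st;
        long pid = strtol(ent->d_name, &end, 10);

        if (end == ent->d_name || *end != '\0' || pid <= 0 || pid == getpid())
            continue;               /* not a pid entry, or ourselves */
        snprintf(path, sizeof path, "/proc/%ld", pid);
        if (stat(path, &st) == 0 && st.st_uid == uid &&
            kill((pid_t)pid, sig) == 0)
            count++;
    }
    closedir(proc);
    return count;
}

/* Freeze the user's processes with SIGSTOP until two successive rounds
 * see the same number of them (or max_rounds is hit), then SIGKILL them.
 * Returns the number killed, or -1 on error. */
int stop_then_kill(uid_t uid, int max_rounds)
{
    int prev = -1, seen, round;

    for (round = 0; round < max_rounds; round++) {
        seen = signal_all(uid, SIGSTOP);
        if (seen < 0)
            return -1;
        if (seen == prev)
            break;                  /* nothing new appeared: all frozen */
        prev = seen;
    }
    return signal_all(uid, SIGKILL);
}
```

Stopping first matters because a stopped process cannot fork, so once the count settles the final SIGKILL pass has a fixed target.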
-- 
	--steve kramer
	{allegra,genrad,ihnp4,utzoo,philabs,uw-beaver}!linus!axiom!smk	(UUCP)
	linus!axiom!smk@mitre-bedford					(MIL)

david@wisc-rsch.arpa (David Parter) (08/27/85)

summary: a program to gather all the processes running away, stop them then
	kill them...

we have a program, which i believe came from the net, called 'zap.'
It sends any signal, or changes the 'niceness' of a process or collection
of processes.

It is useful for this exact problem, which we seem to have a lot of
from one particular programming class every semester.

david
-- 
david parter
UWisc Systems Lab

uucp:	...!{allegra,harvard,ihnp4,seismo, topaz}!uwvax!david
arpa now:	david@wisc-rsch.arpa
arpa soon:	david@wisc-rsch.WISCONSIN.EDU or something like that