[net.unix] Temporarily suspending a job

rbp@investor.UUCP (Bob Peirce) (10/10/86)

Here's an interesting problem I am trying to solve.  We have certain
jobs which, once started, should not be stopped until they finish. 
However, we sometimes get too much of a load and wish these jobs
weren't running.  I am trying to use traps in my shell scripts to put
these jobs to sleep for a while and start them up again later.

I have a solution, but it is not very neat.  I break the script into
sections which can be killed and restarted without problem.  I surround
each section with the following:

	trap "sleep 3600; T=0" 16	# on signal 16: sleep an hour, then flag a rerun
	T=0
	while [ $T -eq 0 ]		# repeat until the section runs uninterrupted
	do
		T=1
		this_section
	done

Actually, one trap at the top suffices except I put messages in the trap
to log what is happening.
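
A minimal sketch of that wrapper with the log messages included might
look like the following.  The names on_suspend, run_section, LOG, NAP
and SECTION are invented for illustration, and it assumes a shell that
has functions; the real scripts may well differ.

```shell
#!/bin/sh
# Sketch only: LOG, NAP, SECTION, on_suspend and run_section are
# hypothetical names, not taken from the original scripts.
LOG=${LOG:-job.log}     # where the trap writes its messages
NAP=${NAP:-3600}        # seconds to sleep when told to back off
SECTION=none            # name of the section currently running

# Trap handler: log, sleep, log again, then flag the section for a rerun.
on_suspend() {
    echo "`date`: suspended in $SECTION, sleeping $NAP seconds" >>"$LOG"
    sleep "$NAP"
    echo "`date`: restarting $SECTION" >>"$LOG"
    T=0
}

# 16 is the signal number used above; which signal that is varies by system.
trap on_suspend 16

# run_section NAME COMMAND...: rerun COMMAND until it finishes untrapped.
run_section() {
    SECTION=$1
    shift
    T=0
    while [ "$T" -eq 0 ]
    do
        T=1
        "$@"
    done
}
```

Each restartable piece would then be invoked as, say,
"run_section sort_phase this_section", so one trap at the top serves
every section while the log still says which one was interrupted.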

To drive this I have a "suspend" script which is where things get messy.
The problem is that if I send kill -16 to the program while this_section
is another program, the signal is not acted on until this_section completes.  To
get around that I run the output of ps through an awk script to
determine the pid of this_section.  I send a kill -16 to the program
followed by a kill -15 to this_section.  That works.
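
In outline, the suspend script amounts to something like the sketch
below.  It is not the exact script: the function names pick_children
and suspend_job are made up here, and it assumes a ps that accepts
"-e -o pid= -o ppid=" and an awk that supports -v, neither of which
every system has.

```shell
#!/bin/sh
# Sketch only: pick_children and suspend_job are hypothetical names.

# pick_children PARENT
# Reads "pid ppid" pairs on stdin and prints each pid whose ppid is PARENT.
pick_children() {
    awk -v parent="$1" '$2 == parent { print $1 }'
}

# suspend_job PID
# Signals the job script first, then terminates the section it is
# waiting on so that the pending trap can run.
suspend_job() {
    job=$1
    # Assumption: this ps prints bare pid and ppid columns.
    kids=`ps -e -o pid= -o ppid= | pick_children "$job"`
    kill -16 "$job" || return 1
    for pid in $kids
    do
        kill -15 "$pid"
    done
}
```

It would be called as "suspend_job 1234", where 1234 is the pid of the
job script.  Note that it only finds direct children, and it still
depends on picking pids out of ps output.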

What I would like is some way of sending a signal to all of the children
of a process without knowing their pid.  At the same time, I DON'T want
to send the signal to the parent.  The manuals talk about process groups
and that seemed like a possibility, but I haven't figured out how to
create one, let alone whether that is really the solution.

Any ideas would be appreciated.
-- 

	 	    Bob Peirce, Pittsburgh, PA
	    uucp: ...!{allegra, bellcore, cadre, idis}
	  	     !pitt!darth!investor!rbp
			    412-471-5320

	    NOTE:  Mail must be < 30K  bytes/message