[comp.sys.sgi] batch control

rbriber@POLY1.NIST.GOV (02/03/91)

Some time ago there was a discussion of a batch process control utility that
allowed the user more control over batch jobs.  I can't remember the name of
this utility or whether anyone had successfully built it under Irix.  We 
were wondering if it would solve our problem.  Our users would like to be  
able to start a job in the background and then logoff (or close the window
that started the job on the graphics console) and then at some later time,
log back in and reset the priorities of that job (for example, stop it 
temporarily and then start it back up).  It seems that once you close the shell
that started a given job the only communication you can have with that job
(from another session) is to kill it.  We talked to the hotline about being
able to temporarily stop the job from another window and then restart it
(sort of like ^z and then %% from the parent window) but it doesn't seem
possible.  We have a 4D80GT running 3.3.1.

Anyone have any suggestions?


 ------------------------------------------------------------------------
| Adios Amebas,         | "I've tried and tried and I'm still mystified, |
| Rob Briber            |  I can't do it anymore and I'm not satisfied"  |
| NIST 224/B210         |                           -Elvis               |
| Gaithersburg, MD 20899|  rbriber@poly1.nist.gov   (Internet)           |
| (301) 975-6775 (voice)|  rbriber@enh.nist.gov     (Internet)           |
| (301) 975-2128 (fax)  |  rbriber@nbsenh.bitnet    (Bitnet)             |
 ------------------------------------------------------------------------

doelz@urz.unibas.ch (02/05/91)

In article <9102031532.AA15085@poly1.nist.gov>, rbriber@POLY1.NIST.GOV writes:
> 
> were wondering if it would solve our problem.  Our users would like to be  
> able to start a job in the background and then logoff (or close the window

NQS is a product which will do that. Ask your SGI sales rep about it.

> that started the job on the graphics console) and then at some later time,
> log back in and reset the priorities of that job (for example, stop it 
> temporarily and then start it back up).  It seems that once you close the shell

Regardless of the mechanism you used to push it to the background, you can send
kill -STOP to the pid of interest and kill -CONT to continue it again.
If it's not your process, you must be root.
Don't forget to start the process with nohup(1) if you don't use NQS; otherwise
it might die when you close the window.
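
A minimal sketch of that sequence, assuming a Bourne-compatible shell.  Here
"sleep 60" stands in for the real batch program and the pid file name is a
hypothetical choice; neither comes from the original discussion.

```shell
#!/bin/sh
# "sleep 60" is a placeholder job; the pid file name is hypothetical.
pidfile=${TMPDIR:-/tmp}/batchjob.$$.pid

# Start the job immune to hangups, in the background, and record its pid.
nohup sleep 60 > /dev/null 2>&1 &
echo $! > "$pidfile"

# Later, from any session owned by the same user (or by root):
pid=`cat "$pidfile"`
kill -STOP "$pid"                 # suspend the job
kill -CONT "$pid"                 # let it run again

# kill -0 sends no signal; it only tests that the process still exists.
if kill -0 "$pid" 2>/dev/null; then job_alive=yes; else job_alive=no; fi
echo "job $pid alive after STOP/CONT: $job_alive"

# Demo cleanup only; a real job would be left to finish.
kill "$pid"
rm -f "$pidfile"
```

Recording the pid in a file is just one way to find the job again after
logging back in; looking it up with ps(1) works as well.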

> that started a given job the only communication you can have with that job
> (from another session) is to kill it.  We talked to the hotline about being
> able to temporarily stop the job from another window and then restart it
> (sort of like ^z and then %% from the parent window) but it doesn't seem
> possible.  We have a 4D80GT running 3.3.1.
> 
> Anyone have any suggestions?
> 

You could also use a process prioritizer run from root's crontab.
This one will npri the job to a non-degrading priority beyond eden. However, if
you have a very large job, keep in mind that its memory stays allocated.
You might need to increase your swap space to succeed.

Maybe this helps, 

Regards, 
Reinhard 

dhinds@elaine24.stanford.edu (David Hinds) (02/06/91)

In article <1991Feb4.191058.1353@urz.unibas.ch> doelz@urz.unibas.ch writes:
>In article <9102031532.AA15085@poly1.nist.gov>, rbriber@POLY1.NIST.GOV writes:
>
>Regardless of the mechanism you used to push it to the background, you can send
>kill -STOP to the pid of interest and kill -CONT to continue it again.
>If it's not your process, you must be root.
>Don't forget to start the process with nohup(1) if you don't use NQS; otherwise
>it might die when you close the window.
>
>> that started a given job the only communication you can have with that job
>> (from another session) is to kill it.  We talked to the hotline about being
>> able to temporarily stop the job from another window and then restart it
>> (sort of like ^z and then %% from the parent window) but it doesn't seem
>> possible.  We have a 4D80GT running 3.3.1.
>> 
    There is also an easy way out.  It seems that if a process's parent id
is 1 (because its parent has terminated), then it will not accept kill -STOP
and kill -CONT from the same user.  But, if you run your batch job via a
shell script, you will lose job control on the script when you log off, but
you will still be able to use STOP/CONT on your batch job, because its
immediate parent (the script) is still around.  All our users have some
sort of script to start a batch job for this reason.  Mine creates a scratch
directory for temporary links to data files, saves standard output and
standard error to uniquely named files, and sends me mail with the error
output when the job finishes.
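
A wrapper along those lines might look roughly like the sketch below.  It is
an illustration only, not the script described above: the function name
run_batch, the BATCH_MAILTO variable, the link name "input" and the file
naming scheme are all hypothetical, and the mail step assumes a mailx(1)
that accepts -s.

```shell
#!/bin/sh
# Hypothetical wrapper: run_batch DATAFILE COMMAND [ARGS...]
# DATAFILE should be an absolute path, since it is linked into the scratch dir.
run_batch() {
    datafile=$1; shift
    stamp=`date +%Y%m%d%H%M%S`.$$           # unique suffix for this run
    scratch=${TMPDIR:-/tmp}/batch.$stamp     # scratch directory for links
    out=$scratch.out                         # standard output of the job
    err=$scratch.err                         # standard error of the job

    mkdir "$scratch" || return 1
    ln -s "$datafile" "$scratch/input"       # temporary link to the data file

    # The job runs as a descendant of this script, which waits for it.
    ( cd "$scratch" && "$@" ) > "$out" 2> "$err"
    status=$?

    # Mail the error output if a recipient is set and mailx is available.
    if [ -n "${BATCH_MAILTO:-}" ] && command -v mailx >/dev/null 2>&1; then
        mailx -s "batch job finished, status $status" "$BATCH_MAILTO" < "$err"
    fi

    rm -rf "$scratch"                        # drop the links, keep out/err
    return $status
}
```

In a script file, a last line of run_batch "$@" would let the whole thing be
started with nohup and a trailing &, so that the script stays around as the
parent while the job itself can still be sent STOP and CONT.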

 -David Hinds
  dhinds@cb-iris.stanford.edu