[comp.sys.sgi] Batch job systems

srp@babar.mmwb.ucsf.edu (Scott R. Presnell) (12/04/90)

doelz@urz.unibas.ch writes:

>In article <4146@network.ucsd.edu>, slamont@network.ucsd.edu (Steve Lamont) writes:
>> In article <1990Dec2.181344.20040@cunixf.cc.columbia.edu> shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
>>>Finally, another way to go would be to implement a batch queue, as Convex
>>>has done.
>> 
>> Small point of information.  The batch queue system (NQS) was developed, to the
>> best of my knowledge, at NASA Ames Research Center Numerical Aerodynamics
>> Simulation Facility for their Crays.  Convex has taken that system, which is
>> pretty nifty, by the way, and packaged it up for use on their systems.  NQS is
>> also available on Crays from Cray Research.  There is also a PD version, I
>> believe, that you can get from either NASA or COSMIC -- I don't recall which.

>Also available on SGIs - ask your sales rep for the Network Queuing System
>option. Bad news first: It won't talk to Convex's CXbatch. Good news: Works

Also available is a batch job package that I wrote.  It handles multiple
hosts, remote queueing, load-balanced queueing (between the multiple
hosts), variable methods of job priority (nice and nd-priority), and
multiprocessor control (you can select the CPU on which to run the job).
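
As a rough illustration of load-balanced queueing between hosts, here is a
minimal Python sketch.  It is not taken from the package: the post does not
say how load is measured, so the use of a load-average figure per host, the
function name pick_host, and the host names are all assumptions.

```python
def pick_host(load_by_host):
    """Return the name of the least-loaded host.

    load_by_host maps a host name to a load figure (assumed here to be
    something like the 1-minute load average).  Ties go to the host
    whose name sorts first, so the choice is deterministic.
    """
    if not load_by_host:
        raise ValueError("no hosts available")
    # min() returns the first minimal item, and sorted() fixes the order.
    return min(sorted(load_by_host), key=lambda host: load_by_host[host])


if __name__ == "__main__":
    # Hypothetical hosts and loads, for illustration only.
    print(pick_host({"iris1": 2.5, "iris2": 0.4, "iris3": 1.1}))
```

A real scheduler would also have to collect the load figures from the remote
hosts, which this sketch leaves out.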

You can set it up to use rlimits (both hard and soft limits are
accessible), or you can use a lower-level kernel-reading routine that I
implemented to get the total CPU time used so far by all the processes
started under a job.  (Rlimits apply to a single process; the second
scheme covers every process belonging to a specific job.)
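
To make the per-process versus per-job distinction concrete, here is a
minimal Python sketch.  It is not the package's code.  The first function
sets a soft and a hard RLIMIT_CPU on a single child process.  The second is
only a stand-in for the kernel-reading routine described above: it uses
getrusage(RUSAGE_CHILDREN), which reports CPU time for children that have
already exited, rather than reading live totals from the kernel.  The
function names are made up for the example.

```python
import resource
import subprocess


def run_with_cpu_limit(argv, soft_seconds, hard_seconds):
    """Run argv with a per-process CPU limit and return its exit status.

    The kernel sends SIGXCPU when the soft limit is reached and SIGKILL
    at the hard limit.  A negative return value means the child was
    killed by that signal number.
    """
    def apply_limits():
        # Runs in the child between fork and exec, so only the job's
        # process is limited, not the caller.
        resource.setrlimit(resource.RLIMIT_CPU, (soft_seconds, hard_seconds))

    proc = subprocess.Popen(argv, preexec_fn=apply_limits)
    return proc.wait()


def children_cpu_seconds():
    """Total user plus system CPU time of all waited-for children.

    This includes descendants that those children waited for, which is
    the closest portable approximation to a per-job total.
    """
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_utime + usage.ru_stime
```

Note that the rlimit is inherited by anything the child forks, but each of
those processes gets its own separate allowance, which is why a job-wide
total needs a different mechanism.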

The system also dumps accounting information to specified files.

It's set up like the lpr/lpd system in that it uses a batchcap file -
analogous to printcap.  I realized after the fact that this was similar to
the Convex scheme.  We use it regularly at our site so I can say with some
confidence that it works.  It's configurable with a "Configure" script a la
perl and rn.
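
For readers who have not looked inside a printcap file, here is a minimal
Python sketch of parsing one printcap-style entry.  The post does not list
the actual batchcap fields, so the capability names in the sample (sd, ni,
lb) and the queue name are hypothetical; only the general syntax
(names separated by "|", fields separated by ":", "=" for strings, "#" for
numbers, bare names for booleans, backslash line continuation) follows the
printcap convention.

```python
def parse_cap_entry(text):
    """Parse one printcap-style entry into (names, capabilities).

    Escaped colons inside string values are not handled in this sketch.
    """
    # Join backslash-continued lines into a single logical line.
    joined = "".join(line.strip().rstrip("\\") for line in text.splitlines())
    fields = [field.strip() for field in joined.split(":")]
    names = fields[0].split("|")
    caps = {}
    for field in fields[1:]:
        if not field:
            continue  # empty field between adjacent colons
        if "=" in field:
            key, value = field.split("=", 1)
            caps[key] = value          # string capability
        elif "#" in field:
            key, value = field.split("#", 1)
            caps[key] = int(value)     # numeric capability
        else:
            caps[field] = True         # boolean capability
    return names, caps


# Hypothetical entry; these field names are NOT from the real batchcap.
SAMPLE = """\
fast|short jobs:\\
    :sd=/usr/spool/batch/fast:\\
    :ni#10:\\
    :lb:
"""

if __name__ == "__main__":
    print(parse_cap_entry(SAMPLE))
```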

I'm at the stage where I'm trying to get it running on some different
platforms (right now it compiles under SunOS 4.1, and it seems to work on a
Stardent).

I guess the things this system has over NQS for the SGI crowd are:

	This was developed on an SGI 4D for an SGI 4D.

	I think it's somewhat more compact than the NQS system, both in the
source tree and in the resources the daemon requires.  It also seems less
cumbersome to me.

	You get the source, so you can hack on it.  If you want.

	- Scott
--
Scott Presnell				        +1 (415) 476-9890
Pharm. Chem., S-926				Internet: srp@cgl.ucsf.edu
University of California			UUCP: ...ucbvax!ucsfcgl!srp
San Francisco, CA. 94143-0446			Bitnet: srp@ucsfcgl.bitnet