jmleonar@CRDC.ARPA ("Dr. Joseph M. Leonard") (05/21/86)
Increased system load has led me to consider splitting the batch queue into a fast and a slow queue. Before I do this, I want some kind of detached process that can (a) determine which queue a batch job is in, (b) determine the amount of CPU time it has consumed, and (c) notify me of jobs that use more than a preset time limit. This would enable me to "enforce" the distinction between the two queues. If my terminology has not given me away, I'm running VMS 4.3 (with 4.4 expected in the near future). Please reply to me directly if you have an idea or two, and I'll summarize if there is a lot of response (or a lot of different responses).

Thanks in advance,
Joe Leonard <jmleonar@crdc.arpa>
JMS@ARIZMIS.BITNET.UUCP (05/24/86)
In re: the fellow that wanted to have two batch queues, with some sort of administrative system where the system tells him who the hogs are, and he can squash them.

It all depends on your system philosophy. In order to provide a minimum of friction and let the system 'run itself,' the University of Arizona has a time limit on batch queues. Thus, whatever image is executing when the time limit expires aborts, and the whole batch job goes down the drain. There's a nice informative message in the batch log for jobs that hit the wall time-limit-wise. I believe that it will only take once for batch queue users to get this message before they begin to use the slower queue for long jobs.

The beauty of letting VMS do the work is that the system does the correction and not the 'system manager.' Our experience is that the less a 'system manager' tells the people who own the VAX what to do with their resources, the happier the people who own the VAX (often called 'my users' by system managers) will be.

On a related note, it's good to put some reasonable time limit on batch queues ANYWAY; in a University environment without heavy OR people, 24 hours is as good a limit as any. Then, someone who lets loose a batch job with an infinite loop in it (and doesn't know how to kill the job, or doesn't realize it is looping) is self-corrected. Typically, we have established a DEFAULT maximum CPU time of 24 hours, and an ABSOLUTE MAXIMUM CPU time of 1 week. Thus, unless you KNOW your job is going to run long, you don't pay attention to CPULIMIT switches, and the system protects itself and you.

jms

Joel M Snyder
University of Arizona Department of MIS
Tucson, Arizona 85721
(602) 621-2748
JMS@ARIZMIS.BITNET
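
For anyone who wants to try the limits described above, here is a rough sketch of what the queue setup might look like in DCL. The queue name LONG$BATCH and the file BIGJOB.COM are made up for illustration, and the times are simply the 24-hour default and 1-week maximum mentioned in this message. Check the qualifiers against the DCL Dictionary for your VMS version before relying on them.

```dcl
$! Sketch only: LONG$BATCH is a hypothetical queue name.
$! /CPUDEFAULT is the CPU limit a job gets when the user asks for none;
$! /CPUMAXIMUM is the most any job in the queue may request.
$! Times are delta times in the form days-hh:mm:ss.
$ INITIALIZE/QUEUE/BATCH -
        /CPUDEFAULT=1-00:00:00 -
        /CPUMAXIMUM=7-00:00:00 -
        LONG$BATCH
$ START/QUEUE LONG$BATCH
$!
$! A user who KNOWS the job runs long asks for more time at submit time
$! (BIGJOB.COM is a hypothetical command procedure):
$ SUBMIT/QUEUE=LONG$BATCH/CPUTIME=3-00:00:00 BIGJOB.COM
```

A fast queue would presumably be a second queue set up the same way with a much smaller /CPUMAXIMUM, so that long jobs have to go to the slow one. What that smaller value should be is a local decision and is not something suggested here.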