buck@nrl-cmf.UUCP (Loren Buchanan) (06/08/91)
We are in the process of planning an upgrade of our Cray to Unicos (I
know, it's about time), and were wondering about using NQS to submit
jobs to the Cray from SGI, Sun, VAX, IBM, Stardent, and other
computers. Are there reasons not to use NQS? Is there something else
we should be using? Are there particular headaches that can be avoided
by knowing of past mistakes? Are there public domain versions of the
software available, or should we tell management to come up with the
money to buy it? Are there third party vendors, and if so, who are
they? Any parting shots...er...comments?
Thanks & B Cing U
Buck
P.S. Email responses will be collected, munged, and posted.
--
Loren Buchanan (buck@caligula.nrl.navy.mil) | #include <standard.disclaimer>
NRL Code 5842, 4555 Overlook Ave. | #include <computer.graphics>
Washington, DC 20375 (202) 767-3884 | #include <electronic.music>
Phone tag, America's fastest growing business sport.
buck@nrl-cmf.UUCP (Loren Buchanan) (06/15/91)
This is the response document to the questions I posed about NQS
last week. I have filtered out most of the noise (and in one case
most of the meat). It appears as though we will start with the
code from COSMIC, 382 East Broad St., Athens GA 30602, or if you
want to call, John A. Gibson, Director, (404) 542-3265. Does
anyone have any experience with the Sterling or General Atomics
versions they would care to share with the rest of us?
From: bernhold@qtp.ufl.edu
>Are there reasons to not use NQS?
Most of the vendors you listed don't (to my knowledge) offer NQS with
their systems. You'll have to get it elsewhere. It is not PD. It
was developed on contract from NASA with public funds. It is sold to
try to recover costs via NASA's COSMIC distribution center. The cost
of the original version of NQS (_not_ what you'll get from Cray!) last
time I checked was $6000. Don't know if you'd get some kind of deal
for being "family".
The current commercial versions of NQS are very nice -- much advanced
over the one to be had from COSMIC (that may have changed by now -- see
below), but either should probably be workable. The one thing you'll
probably need which isn't in the older version is the ability to
specify a remote username to run under (verified with .rhosts, etc.).
Otherwise, there is no facility (in the original NQS) for one userid
to submit the job to run under another userid on the remote machine.
Given a knowledge of the communication protocol between NQS daemons,
this shouldn't be hard to implement in the old code (I say that
without having looked at the old code!). Caveat: We are running the
original NQS only and haven't yet tried to speak to a machine running
a more current commercial version -- who knows what may have changed in
the protocols!
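To make the ".rhosts" idea concrete, here is a minimal sketch of the
kind of check a receiving daemon could make before running a request
under a different userid. This is not taken from any NQS source; the
function names are hypothetical, and only the plain "host [user]" line
format of .rhosts is handled (no "+" wildcards, no netgroups).

```python
def parse_rhosts(text):
    """Return (host, user_or_None) pairs from plain 'host [user]' lines."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0].lower()
        user = fields[1] if len(fields) > 1 else None
        entries.append((host, user))
    return entries


def may_run_as(rhosts_text, local_user, remote_host, remote_user):
    """True if remote_user@remote_host is trusted to run as local_user.

    rhosts_text is assumed to be the contents of local_user's .rhosts.
    """
    for host, user in parse_rhosts(rhosts_text):
        if host != remote_host.lower():
            continue
        # An entry without a user name trusts only the same-named user.
        trusted = user if user is not None else local_user
        if trusted == remote_user:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical host and user names, for illustration only.
    rhosts = "sun1.example.org buck\ncray.example.org\n"
    print(may_run_as(rhosts, "lbuchanan", "sun1.example.org", "buck"))
```

The point is only that the mapping decision is made on the receiving
side, from a file the target user controls; how a given NQS version
actually carries the requested username in its protocol is not
something this sketch claims to know.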
About the different versions: The original is definitely available
from COSMIC. With some work, we got it to run on our Suns and FPS.
When I asked for information on NQS a while ago, I was told that a)
the original is being upgraded -- bugs fixed, perhaps _some_ enhanced
capabilities and this may be at COSMIC by now; b) there is a brand new
development, NQS II beginning, which is to rewrite the whole thing
from scratch to address needs which didn't exist when NQS (I) was
designed -- mostly distributed computing, I think. Since NQS is going
to be a POSIX standard too, I imagine, but don't know for sure, that
NQS II will become POSIX-compliant. I think NQS II is expected to be
available from COSMIC also, but I don't know the time frame.
I don't know the legality of it, but there used to be a copy of the
original NQS available from the Convex User Group archive on
permac.space.swri.edu. I checked on it a while after its existence
had been widely announced on the net, and it was still there -- so
either no one who cares heard about it or no one cares or someone is
being stubborn in not removing it. Take it as you will.
I would like to hear any more up-to-date information -- particularly
on (a) vendors planning to support NQS and (b) updated versions of NQS
and where to obtain them.
From: jones@hermes.chpc.utexas.edu
You should run NQS on the Cray.
You can get a version of NQS from COSMIC (for a one-time price) that
you can run on your SGI, Sun, VAX, and Stardent. You may have to do
some porting. It's not hard once you understand the source (I ported
it to AIX in about two days), but it will take at least a month of
work to get to the point where you can do this. The nice thing about
the COSMIC version is you can do what you want with it so long as you
don't give it to a foreigner. (You will also have to modify it to
understand Cray's tape conventions.)
Sterling Software also sells NQS. They sell it per CPU, and they also
do maintenance on NQS. I don't know yet if they support the Cray tape
conventions. They have ported it to AIX.
You can also check out RQS from Cray. It allows you to submit jobs to
the Cray NQS and get the output files back.
Bill Jones
From: nash@ucselx.sdsu.edu (Ron Nash)
Here in San Diego, the Cray runs EZBATCH. Here is the manual:
[[[with large chunks of the manual deleted]]]
EZBATCH
Scope          EZBATCH discusses the basics of using the Network
               Queuing System (NQS), the UNICOS batch facility.
Last Revision  May 30, 1991
Documentation  To view this document at your terminal, use the
               interactive SDSC utility doc:
                    doc view ezbatch
               For a list of other doc options, including
               printing your documents, enter
                    doc
               and respond to the prompts, or see the doc man
               page.
Consulting     For questions about or problems with any SDSC
               hardware, software, or facilities, please call the
               SDSC consultants at
                    (619)534-5100
               between 0800 and 1700 Pacific time. To send your
               questions online, enter the following and respond
               to the prompts:
                    mailx consult
               or use your local mail utility to send your
               question via Internet mail to the following
               Internet address:
                    consult@y1.sdsc.edu
(c) 1991 General Atomics.
General Atomics gives authorized users
of the San Diego Supercomputer Center
(SDSC) permission to make copies of this
document. Authorized users include
academic, industrial, and government
researchers with SDSC accounts as well
as officials of the National Science
Foundation and the University of
California. This material may not be
used for commercial purposes.
Permission for any other use of this
material and by any other party must be
obtained from General Atomics.
Table of Contents
Documentation Conventions
Introduction
     NQS Requests
     NQS Output
NQS Queues
     Batch Queues
          Standard Queues
          Queues for Large Disk Requirements
          Test Queue
          Queues for High or Low Priority
          Table of Batch Queues
     Pipe Queues
          Table of Pipe Queues
Choosing a Queue
     Choosing a Priority
     Determining Your Job's Memory Requirements
     Determining Your Job's Local Disk Requirements
NQS Commands
     The qsub command
          Submitting Scripts with Command Options
          Useful qsub Options
          Example qsub Command Line
          Specifying qsub Options in the Shell Script
          Submitting Shells Interactively
          Message after Successful Submission
          Submission Example
          Useful Shell Flags
     The qsmart Utility
          The qsmart Command Line
          Example qsmart Command Line with Options
          Interactive qsmart Example
     The qstat Command
          The qstat Command Line
          Default qstat Display
          Using qstat to Examine Your Jobs
          Using qstat to Examine the Queue Complexes
     The qdel Command
     The qlimit Command
     The qmsg Command
     The qrank Command
          Ranking by Time Submitted and Priority
          The qrank Command Line
          Default qrank Display
          Displaying a Single Request
          Displaying Queues and Primary Complexes
Revision History
INTRODUCTION
You can run jobs under UNICOS on the Cray Y-MP in three
different ways: interactively in the foreground, interactively
in the background, and in batch. The Network Queueing System
(NQS) is the UNICOS batch facility, which will help you make the
best use of SDSC system resources. By submitting your jobs to
the batch queue, you allow NQS to schedule your job according to
the resources requested and to run it when those resources are
available. By redistributing the load on the system over a 24-hour
period, this scheduling of jobs balances the load during the
day and prevents the machine from idling late at night when the
number of interactive jobs reaches a minimum. NQS also lets you
o Stretch your allocation. When you run jobs
interactively in the foreground or background, you are
charged two times the amount of CPU time you use. By
running in batch, you can reduce the amount you are
charged for each job.
o Checkpoint your program. Jobs run in NQS are
automatically checkpointed. After a system shutdown (or
crash), checkpointed jobs continue to run from the last
checkpoint rather than from the beginning, which can
save you from excessive charges and time delays caused
by rerunning your entire job.
o Run jobs that are too large or too small to be run
interactively. Interactive jobs are limited to 6
Mwords of memory, 20 CPU minutes, and 60 Mwords of disk
space. By using NQS, you can run jobs that require up
to 6000 CPU minutes, 32 Mwords of memory, and 1000
Mwords of disk space.
o Continue running your jobs after you log out.
Interactive jobs, including those run in the background,
terminate when you log out (unless you specify nohup on
the command line).
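The limits quoted in the list above can be turned into a quick check
of whether a job has to go through NQS. The following is only a
sketch built from the numbers in this excerpt (20 CPU minutes, 6
Mwords of memory, and 60 Mwords of disk interactively; 6000, 32, and
1000 in batch); the names are made up for illustration and are not
part of NQS or of any SDSC utility. The excerpt gives only the
interactive charging factor (two times the CPU time used), so no
batch charge is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    cpu_minutes: int
    memory_mwords: int
    disk_mwords: int


# Figures as quoted in the EZBATCH introduction (May 30, 1991 revision).
INTERACTIVE = Limits(cpu_minutes=20, memory_mwords=6, disk_mwords=60)
BATCH = Limits(cpu_minutes=6000, memory_mwords=32, disk_mwords=1000)


def fits(limits, cpu_minutes, memory_mwords, disk_mwords):
    """True if the request is within every one of the given limits."""
    return (cpu_minutes <= limits.cpu_minutes
            and memory_mwords <= limits.memory_mwords
            and disk_mwords <= limits.disk_mwords)


def where_to_run(cpu_minutes, memory_mwords, disk_mwords):
    """Classify a job by the smallest set of limits it fits within."""
    if fits(INTERACTIVE, cpu_minutes, memory_mwords, disk_mwords):
        return "interactive or batch"
    if fits(BATCH, cpu_minutes, memory_mwords, disk_mwords):
        return "batch only"
    return "exceeds batch limits"


def interactive_charge(cpu_minutes):
    """CPU minutes charged for an interactive run (twice the time used)."""
    return 2 * cpu_minutes


if __name__ == "__main__":
    print(where_to_run(cpu_minutes=45, memory_mwords=8, disk_mwords=100))
    print(interactive_charge(15))
```

A job that needs 45 CPU minutes and 8 Mwords of memory is over two of
the interactive limits, so it lands in the "batch only" case.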
Thus endeth the summary of responses (thanks to all who responded, even
if none of your message ended up in this one).
B Cing U
Buck
--
Loren Buchanan (buck@caligula.nrl.navy.mil) | #include <standard.disclaimer>
NRL Code 5842, 4555 Overlook Ave. | #include <computer.graphics>
Washington, DC 20375 (202) 767-3884 | #include <electronic.music>
Phone tag, America's fastest growing business sport.
sean@ms.uky.edu (Sean Casey) (06/17/91)
I wonder how it compares with MDQS?
--
** Sean Casey <sean@s.ms.uky.edu>
benseb@grumpy.sdsc.edu (Booker Bense) (06/17/91)
>|> From: nash@ucselx.sdsu.edu (Ron Nash)
>|>
>|> Here in San Diego, the Cray runs EZBATCH. Here is the manual:
>|>
- Umm, I don't mean to be picky, but we actually run the NQS available
from Cray Research supplemented with local bug fixes and
modifications. These mods are available for public use, as are our
accounting, queued file transport, and other Unicos modifications. A
qrank utility is now also completed and will be available shortly.
- You can get qsmart as well, but I wouldn't recommend it. It's a very
ugly csh script I wrote in a fit of pique about NQS's lack of graceful
shutdown options (then, not now). I would be astonished if it ran
anywhere else. BTW, ezbatch is the name of the document. We have other
ones like ezmath, ezc, ezdebug, ....
- The convention is that an ez-doc contains enough to get you started.
More complete documentation is also available.
- Booker C. Bense
preferred: benseb@grumpy.sdsc.edu   "I think it's GOOD that everyone
NeXT Mail: benseb@next.sdsc.edu      becomes food" - Hobbes