[net.sources] load control system

muller@sdcc3.UUCP (Keith Muller) (02/10/85)

This is the first shar file of 8 for the load control system as described
in net.unix-wizards and net.sources.

NOTE: This shar file MUST be unpacked berfore any of the others. It creates
      several subdirectories which the seven other shar files require. Make
      a subdirectory for the load control system source and unpack all eight
      shar files in it.

	Keith Muller
	ucbvax!sdcsvax!muller

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:40:15 PST 1985
# Contents:  client/ control/ h/ scripts/ server/ man/ README NOTICE Makefile
#	man/Makefile man/ldc.8 man/ldd.8 man/ldq.1 man/ldrm.1
 
echo x - README
sed 's/^@//' > "README" <<'@//E*O*F README//'
TO INSTALL: (you MUST be root) (January 24, 1985 version)

1) Select a group id for load control system to use. No user should be in this
   group. Add this group to /etc/groups and call it lddgrp.
   ** By default the group id 25 is used. **

2) Look at the file h/common.h. Make sure that LDDGID is defined to be the
   same group id as you selected in step 1.

3) cd to the scripts directory. Inspect the paths used in the file makedirs.
   The script makedirs creates the required directories with the proper modes
   groups and owners. The .code directories are where the real executable
   files are hidden, protected by group access (the directory is protected
   from all "other" access). Each directory which contains programs that you
   want load controlled must have a .code subdirectory.

   NOTE: You really do not have to change makedirs at all except to ADD
   any additional directories you want controlled. It is perfectly safe to
   just run this system on any 4.2 system without ANY path changes (this
   includes sun, vax and pyramid versions).

4) If you alter or add any pathnames in makedirs, you might have to adjust
   the makefiles. For each subdirectory (client, server, control) adjust
   or add the paths in the Makefiles. 

5) If you alter any pathname in makedirs you will have to check all the h
   files in the directory h. Change any paths as required. 

6) run makedirs (if you have an older release of ldd: You should shut down
   the ldd server and remove the old status and errlog file. Then run 
   makedirs.) Makedirs can be run any number of times without harm. It will
   reset the owners and groups of all directories to the correct state.

7) In the top level directory (The same directory as this README file is in),
   run make. then make install. All the binaries are now in place.

8) Start the ldd server:
	/etc/ldd [-T cycle] [-L load]

   The server will detach itself and wait for requests. You should get no
   messages from the server. The two flags are optional. The -L flag
   specifies the number of seconds between each load average check. The
   -L flag specifies the load average queueing starts. If neither are
   specified the defaults are used. (see the manual page for ldd). You
   can change the defaults by editing h/server.h. ALRMTIME is the cycle
   time, and MAXLOAD is the load average.

   The following are good values to start with:

   machine		cycle 			load
   ----------------------------------------------------------
   pyramid 90x		25			10.0
   pyramid 90mx		15			15.0
   vax 780		50			9.0
   vax 750		60			7.5
   vax 730		60			6.0
   sun 2		60			6.5

9) add the following lines to /etc/rc.local (change path and add any ldd
   arguements as selected from the above table). See the man page on ldd
   for more info.

if [ -f /etc/ldd ]; then
	/bin/rm -f /usr/spool/ldd/sr/errors
	/etc/ldd & echo -n ' ldd'			>/dev/console
fi

10) for each directory to be controlled select those programs you want under
    the load control system. The programs you select should be jobs that 
    usually do not require user interaction, though nasty systems like macsyma
    might be load controlled anyway. Never load control things that have time
    response requirements. The jobs you select will determine the overall
    usefullness of the load control system. For the load control system to
    be completely effective, all the programs that cause any significant load
    on the system should be placed under load control. For example the cc
    command is a very typical of a program that should be load controlled.
    When run, cc uses large amount of resources which increases as the size
    of the program being compiled increases. When there are many cc's running
    simultaneously the machine gets quite overloaded and your system thrashes.
    A poor choice would be a command like cat. Sure cat can do a lot of i/o,
    but even ten cat's reading very large files do not impact the system
    very much. Troff is a very good command to load control. It is not very 
    interactive, and a lot of them running would bring even slow a cray.
    Watching your system when it is overoaded with ps au should tell you which
    programs on your system need to be load controlled.

    The following is a list of programs I have under load control:

    /bin/cc /bin/make /bin/passwd /usr/bin/pc /usr/bin/pix /usr/bin/liszt
    /usr/bin/lisp /usr/bin/vgrind /usr/ucb/f77 /usr/ucb/lint /usr/ucb/nroff
    /usr/ucb/spell /usr/ucb/troff /usr/ucb/yacc

    The following is the list of places to look for other candidates for load
    control:
	a) /bin
	b) /usr/bin
	c) /usr/ucb
	d) /usr/new
   	e) /usr/local
	f) /usr/games

    i)  some programs use argv[0] to pass data (so far only the ucb pi
	does this when called by pix). These programs must be treated
	differently (since they mangle argv[0], it cannot be used to
	determine which binary to execute). A special client called
	.NAMEclient where NAME is the actual name of the program must be
	created. These special programs must be specified in the 
	client/Makefile.  See the sample for $(SPEC1) which is for a program
	called test in /tmp. Run the script onetime/saddldd for these programs.

    ii) run the script scripts/addldd with each program to be load controlled
	that requires a STATUS MESSAGE ("Queued waiting to run.") as an
	arguement (i.e. addldd /bin cc make)

    iii)run the script scripts/qaddldd with each program to be load controlled
	that DOES NOT require a STATUS MESSAGE as an arguement
	(i.e. qaddldd /usr/bin nroff)

    addldd/qaddldd/saddldd moves the real binary into the .code file and
    replaces it with a "symbolic link" to either .client (for addldd and
    qaddldd) or a .NAMEclient (for saddldd) So the command:
	addldd /bin cc
    moves cc to /bin/.code/cc and creates the symbolic link /bin/cc
    to /bin/.client.

11) any changes to any file in the load control system from now on
    will be correctly handled by a make install from the top level directory.

12) the script script/rmldd can be used to remove programs from the ldd system.

13) Compilers like cc and pc should have all the intermediate passes protected.
    Each pass must be in group lddgrp and have the others access turned off
    For example:
	chmod 0750 /lib/c2
	chgrp lddgrp /lib/c2

14) When the system is running you might have to adjust the operating 
    parameters of ldd for the job mix and the capacity of your machine.
    Use ldc to adjust these parameters while the load control system is
    running and watch what happens. The .h files as supplied use values that
    will safely work on any machine, but might not be best values for your
    specific needs. In the vast majority of cases, only the load point
    and cycle time need to be changed and these can be set with arguements to
    ldd when it is first invoked.  Be careful as radical changes to
    the defaults might make defeat the purpose of ldd. If things ever get
    really screwed up, you can just kill -9 the server (or from ldc: abort
    server) and things will run just like the load control doesn't exsist.
    (Note the pid of the currently running ldd is always stored in the lock
    file "spool/ldd/sr/lock"). (See the man page on ldd for more).

15) If load control does not stop the system load to no more than the load
    limit + 2.5 then there are programs that are loading down the machine
    which are not under load control. Find out what they are and load control
    them. 

16) To increase the response of the system you can lower the load threshold.
    Of course if the threshold gets too low the system can end up with long
    wait times for running. Long wait times are usually around 3000 seconds
    for super loaded vaxes. On the very fast pyramids, 500 seconds (48 users
    and as many large cc as the students can get running) seems the longest
    delay I have seen. You can also play with the times between checks. This
    has some effect on vaxes but 50 - 60 seconds seems optimal. On pyramids
    it is quite different. Since the throughput is so very much greater
    than vaxes (four times greater at the very least), the load needs to be
    checked at least every 25 seconds. If this check time is too long you
    risk having the machine go idle for a number of seconds. Since the whole
    point is to squeeze out every last cpu cycle out of the machine, idle
    time must be avoided. Watching the machine with vmstat or the mon program
    is useful for this. Try to keep the user percentage of the cpu as high
    as possible. Try to have enough jobs runnable so the machine doesn't
    go idle do to a lack of jobs (yes this can happen with lots of disk io).

17) If you want/need more info on the inner workings of the ldd system, you
    can read the comments in the .h files and the source files. If you have
    problems drop me a line. I will be happy to answer any questions.

    Keith Muller
    University of California, San Diego
    Mail Code C-010
    La Jolla, CA  92093
    ucbvax!sdcsvax!muller
    (619) 452-6090
@//E*O*F README//
chmod u=r,g=r,o=r README
 
echo x - NOTICE
sed 's/^@//' > "NOTICE" <<'@//E*O*F NOTICE//'
DISCLAIMER
  "Although each program has been tested by its author, no warranty,
  express or implied, is made by the author as to the accuracy and
  functioning of the program and related program material, nor shall
  the fact of distribution constitute any such warranty, and no
  responsibility is assumed by the author in connection herewith."
  
  This program cannot be sold, distributed or copied for profit, without
  prior permission from the author. You are free to use it as long the
  author is properly credited with it's design and implementation.

  Keith Muller
  Janaury 15, 1985 
  San Diego, CA
@//E*O*F NOTICE//
chmod u=r,g=r,o=r NOTICE
 
echo x - Makefile
sed 's/^@//' > "Makefile" <<'@//E*O*F Makefile//'
#
#	Makefile for ldd server and client 
#
#

all:
	cd server; make ${MFLAGS}
	cd client;  make ${MFLAGS}
	cd control;  make ${MFLAGS}

lint: 
	cd server; make ${MFLAGS} lint
	cd client;  make ${MFLAGS} lint
	cd control;  make ${MFLAGS} lint

install: 
	cd server; make ${MFLAGS} install
	cd client;  make ${MFLAGS} install
	cd control;  make ${MFLAGS} install
	cd man; make ${MFLAGS} install

clean:
	cd server; make ${MFLAGS} clean
	cd client;  make ${MFLAGS} clean
	cd control;  make ${MFLAGS} clean
@//E*O*F Makefile//
chmod u=r,g=r,o=r Makefile
 
echo mkdir - client
mkdir client
chmod u=rwx,g=rx,o=rx client
 
echo mkdir - control
mkdir control
chmod u=rwx,g=rx,o=rx control
 
echo mkdir - h
mkdir h
chmod u=rwx,g=rx,o=rx h
 
echo mkdir - scripts
mkdir scripts
chmod u=rwx,g=rx,o=rx scripts
 
echo mkdir - server
mkdir server
chmod u=rwx,g=rx,o=rx server
 
echo mkdir - man
mkdir man
chmod u=rwx,g=rx,o=rx man
 
echo x - man/Makefile
sed 's/^@//' > "man/Makefile" <<'@//E*O*F man/Makefile//'

#
# Makefile for ldd manual pages
#

DEST=	/usr/man

TARG=	$(DEST)/man8/ldd.8 $(DEST)/man8/ldc.8 $(DEST)/man1/ldrm.1 \
	$(DEST)/man1/ldq.1

all:

install: $(TARG)

$(DEST)/man8/ldd.8: ldd.8
	install -c -o root ldd.8 $(DEST)/man8

$(DEST)/man8/ldc.8: ldc.8
	install -c -o root ldc.8 $(DEST)/man8

$(DEST)/man1/ldrm.1: ldrm.1
	install -c -o root ldrm.1 $(DEST)/man1

$(DEST)/man1/ldq.1: ldq.1
	install -c -o root ldq.1 $(DEST)/man1

clean:
@//E*O*F man/Makefile//
chmod u=r,g=r,o=r man/Makefile
 
echo x - man/ldc.8
sed 's/^@//' > "man/ldc.8" <<'@//E*O*F man/ldc.8//'
@.TH LDC 8 "24 January 1985"
@.UC 4
@.ad
@.SH NAME
ldc \- load system control program
@.SH SYNOPSIS
@.B /etc/ldc
[ command [ argument ... ] ]
@.SH DESCRIPTION
@.I Ldc
is used by the system administrator to control the
operation of the load control system, by sending commands to
@.I ldd
(the load control server daemon).
@.I Ldc
may be used to:
@.IP \(bu
list all the queued jobs owned by a single user,
@.IP \(bu
list all the jobs in the queue,
@.IP \(bu
list the current settings of changeable load control server parameters,
@.IP \(bu
abort the load control server,
@.IP \(bu
delete a job from the queue (specified by pid or by user name),
@.IP \(bu
purge the queue of all jobs,
@.IP \(bu
rearrange the order of queued jobs,
@.IP \(bu
run a job regardless of the system load (specified by pid or user name),
@.IP \(bu
change the load average at which jobs will be queued,
@.IP \(bu
change the limit on the number of jobs in queue,
@.IP \(bu
change the number of seconds between each check on the load average,
@.IP \(bu
print the contents of the servers error logging file,
@.IP \(bu
change the maximum time limit that a job can be queued.
@.PP
Without any arguments,
@.I ldc
will prompt for commands from the standard input.
If arguments are supplied,
@.IR ldc
interprets the first argument as a command and the remaining
arguments as parameters to the command.  The standard input
may be redirected causing
@.I ldc
to read commands from a file.
Commands may be abbreviated, as any unique prefix of a command will be
accepted.
The following is the list of recognized commands.
@.TP
? [ command ... ]
@.TP
help [ command ... ]
@.br
Print a short description of each command specified in the argument list,
or, if no arguments are given, a list of the recognized commands.
@.TP
abort server
@.br
Terminate the load control server.
This does 
@.I not
terminate currently queued jobs, which will run when they
next poll the server (usually every 10 minutes).
If the server is restarted these jobs will be inserted into the queue ordered
by the time at which the job was started.
Jobs will 
@.I not
be lost by aborting the server.
Both words "abort server" must by typed (or a unique prefix) as a safety
measure.
Only root can execute this command.
@.TP
delete [\f2pids\f1] [-u \f2users\f1]
@.br
This command has two modes. It will delete jobs listed by pid, or with the
@.B \-u
option delete all the jobs owned by the listed users.
Job that are removed from the queue will exit returning status 1 (they
do not run).
Users can only delete jobs they own from the queue, while root can delete any
job.
@.TP
errors
@.br
Print the contents of the load control server error logging file.
@.TP
list [\f2user\f1]
@.br
This will list the contents of the queue, showing each jobs rank, pid,
owner, time in queue, and a abbreviated line of the command to be executed
for the specified user. If no user is specifies, it defaults to be the
user running the command. (Same as the ldq command).
@.TP
loadlimit \f2value\f1
@.br
Changes the load average at which the load control system begins
to queue jobs to \f2value\f1.
Only root can execute this command.
@.TP
longlist
@.br
Same as list except prints ALL the jobs in the queue. This is expensive to
execute. (Same as the ldq -a command).
@.TP
move \f2pid rank\f1
@.br
Moves the process specified by process id 
@.I pid
to position 
@.I rank
in the queue.
Only root can execute this command.
@.TP
purge all
@.br
Removes ALL the jobs from the queue. Removed jobs terminate returning a
status of 1.
As a safety measure both the words "purge all" (or a prefix of) must be typed.
Only root can execute this command.
@.TP
quit
@.br
Exit from ldc.
@.TP
run [\f2pids\f1] [-u \f2users\f1]
@.br
Forces the jobs with the listed 
@.I pids
to be run 
@.I regardless 
of the system load.
The
@.B \-u
option forces all jobs owned by the listed users to be run regardless
of the system load.
Only root can execute this command.
@.TP
sizeset \f2size\f1
@.br
Sets the limit on the number of jobs that can be in the queue to be
@.I size.
This prevents the unix system process table from running out of slots if
the system is extremely overloaded. All job requests that are made while
the queue is at the limit are rejected and told to try again later.
The default value is 150 jobs.
Only root can execute this command.
@.TP
status
@.br
Prints the current settings of internal load control server variables.
This includes the number of jobs in queue, the load average above which
jobs are queued, the limit on the size of the queue, the time in seconds between
load average checks by the server, the maximum time in seconds a job can be
queued, and the number of recoverable errors detected by the server.
@.TP
timerset \f2time\f1
@.br
Sets the number of seconds that the server waits between system load average
checks to
@.I time.
(Every 
@.I time
seconds the server reads the current load average and if it below the load
average limit (see 
@.I loadlimit
) the jobs are removed from the front of the queue and told to run).
Only root can execute this command.
@.TP
waitset \f2time\f1
@.br
Sets the maximum number of seconds that a job can be queued regardless
of the system load to 
@.I time
seconds.
This will prevent the load control system from backing up with jobs that never
run do to some kind of degenerate condition.
@.SH EXAMPLES
To list the jobs owned by user joe:
@.sp
list joe
@.sp
To move process 45 to position 6 in the queue:
@.sp
move 45 6
@.sp
To delete all the jobs owned by users sam and joe:
@.sp
delete -u sam joe
@.sp
To run jobs with pids 1121, 1177, and 43:
@.sp
run 1121 1177 43
@.SH FILES
@.nf
/usr/spool/ldd/*	spool directory where sockets are bound
@.fi
@.SH "SEE ALSO"
ldd(8),
ldrm(1),
ldq(1)
@.SH DIAGNOSTICS
@.nf
@.ta \w'?Ambiguous command      'u
?Ambiguous command	abbreviation matches more than one command
?Invalid command	no match was found
?Privileged command	command can be executed by only by root
@.fi
@//E*O*F man/ldc.8//
chmod u=r,g=r,o=r man/ldc.8
 
echo x - man/ldd.8
sed 's/^@//' > "man/ldd.8" <<'@//E*O*F man/ldd.8//'
@.TH LDD 8 "24 January 1985"
@.UC 4
@.ad
@.SH NAME
ldd \- load system server (daemon)
@.SH SYNOPSIS
@.B /etc/ldd
[ 
@.B \-L 
@.I load
] [ 
@.B \-T
@.I alarm 
]
@.SH DESCRIPTION
@.TP
@.B \-L
changes the load average threshold to
@.I load
instead of the default (usually 10).
@.TP
@.B \-T
changes the time (in seconds) 
between load average checks to 
@.I alarm
seconds instead of the default (usually 60 seconds).
@.PP
@.I Ldd
is the load control server (daemon) and is normally invoked
at boot time from the
@.IR rc.local (8)
file.
The
@.I ldd
server attempts to maintain the system load average
below a preset value so interactive programs like
@.IR vi (1)
remain responsive.
@.I Ldd
works by preventing the system from thrashing
(i.e. excessive paging and high rates of context switches) and decreasing the
systems throughput by limiting the number runnable processes in the system
at a given moment.
When the system load average 
is above the threshold,
@.I ldd
will block specific cpu intensive processes from running and place
them in a queue.
These blocked jobs are not runnable and therefore do not 
contribute to the system load. When the load average drops below the threshold,
@.I ldd
will remove jobs from the queue and allow them to continue execution.
The system administration determines which programs are 
considered cpu intensive and places control of their execution under the
@.I ldd
server.
The system load average is the number of runnable processes,
and is measured by the 1 minute 
@.IR uptime (1)
statistics.
@.PP
A front end client program replaces each program controlled by the
@.I ldd
server.
Each time a user requests execution of a controlled program, the
client enters the request state,
sends a "request to run" datagram to the server and waits for a response. The
waiting client is blocked, waiting for the response from the
@.I ldd
server.
If the client does not receive an answer to a request after a certain
period of time has elapsed (usually 90 seconds), the request is resent.
If the request is resent a number of times (usually 3) 
without response from the server, the requested program is executed. 
This prevents the process from being blocked forever if the
@.I ldd
server fails.
@.PP
The
@.I ldd
server can send one of five different messages to the client.
A "queued message" indicates that the client has
been entered into the queue and should wait.
A "poll message" indicates that the server did not receive a message,
so the client should resend the message.
A "terminate message" indicates that the request cannot be honored
and the client should exit abnormally.
A "run message" indicates the requested program should be run.
A "full message" indicates that the ldd queue is full and this request cannot
be accepted. This limit is to prevent the Unix kernel process table from
running out of slots, since queued processes 
still use system process slots.
@.PP
When the server receives a "request to run",
it determines whether the job should run immediately, be rejected, 
or be queued.
If the queue is full, the job is rejected and the client exits.
If the queue is not empty, the request is added to the queue,
and the client is sent a "queued message" 
The client then enters the queued state
and waits for another command from the server.
If no further commands are received from the server after a preset time 
has elapsed (usually 10 minutes),
the client re-enters the request state and resends the request
to the server to ensure that the server has not terminated or
failed since the time the client was queued.
@.PP
If the queue is empty, the server checks the current load average, and
if it is below the threshold, the client is sent a "run message".
Otherwise the server queues the request, sends the client a "queued message",
and starts the interval timer.
The interval timer is bound to a handler that checks the system load every
few seconds (usually 60 seconds). 
If the handler finds the current load average is below the threshold,
jobs are removed from the head of the queue and sent a "run message".
The number of jobs sent "run messages" depends on how much the current 
load average has dropped below the limit.
If the load average is above the threshold, the handler checks
how long the oldest process has been waiting to run.
If that time is greater than a preset limit (usually 4 hours), the job is 
removed from the queue and allowed to run regardless of the load.
This prevents jobs from being blocked forever due to load averages that
remain above the threshold for long periods of time.
If the queue becomes empty, the handler will shut off the interval timer. 
@.PP
The
@.I ldd
server logs all recoverable and unrecoverable errors in a logfile. Advisory
locks are used to prevent more than one executing server at a time.
When the
@.I ldd
server first begins execution, it scans the spool directory for clients that
might have been queued from a previous
@.I ldd
server and sends them a "poll request". 
Waiting clients will resend their "request to run" message to the new
server, and re-enter the request state.
The
@.I ldd
server will rebuild the queue of waiting tasks 
ordered by the time each client began execution.
This allows the
@.I ldd
server to be terminated and be re-started without
loss or blockage of any waiting clients.
@.PP
The environment variable LOAD can be set to "quiet", which will
surpress the output to stderr of the status strings "queued" 
and "running" for commands which have been set up to display status.
@.PP
Commands can be sent to the server with the
@.IR ldc (8)
control program. These commands can manipulate the queue and change the
values of the various preset limits used by the server.
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/sr/msgsock           'u
/usr/spool/ldd	ldd spool directory
/usr/spool/ldd/sr/msgsock	name of server datagram socket
/usr/spool/ldd/sr/cnsock	name do server socket or control messages
/usr/spool/ldd/sr/list		list of queued jobs (not always up to date)
/usr/spool/ldd/sr/lock	lock file (contains pid of server)
/usr/spool/ldd/sr/errors	log file of server errors
@.fi
@.SH "SEE ALSO"
ldc(8),
ldq(1),
ldrm(1).
@//E*O*F man/ldd.8//
chmod u=r,g=r,o=r man/ldd.8
 
echo x - man/ldq.1
sed 's/^@//' > "man/ldq.1" <<'@//E*O*F man/ldq.1//'
@.TH LDQ 1 "24 January 1985"
@.UC 4
@.SH NAME
ldq \- load system queue listing program
@.SH SYNOPSIS
@.B ldq
[
@.I user
] [
@.B \-a
]
@.SH DESCRIPTION
@.I Ldq
is used to print the contents of the queue maintained by the
@.IR ldd (8)
server.
For each job selected by
@.I ldq
to be printed, the rank (position) in the queue, the process id, the owner of
the job, the number of seconds the job has been waiting to run, and the
command line of the job (truncated in length to the first 16 characters)
is printed.
@.PP
With no arguments,
@.I ldq
will print out the status of the jobs in the queue owned by the user running
@.I ldq.
Another users jobs can be printed if that user is specified as an argument
to
@.I ldq.
The
@.B \-a
option will print all the jobs in the queue.
Of course the
@.B \-a
option is much more expensive to run.
@.PP
Users can delete any job they own by using either the
@.IR ldrm (1)
or
@.IR ldc (8)
commands.
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/cl/*            'u
/usr/spool/ldd/cl/*	the spool area where sockets are bound
@.fi
@.SH "SEE ALSO"
ldrm(1),
ldc(8),
ldd(8)
@.SH DIAGNOSTICS
This command will fail if the
@.I ldd
server is not executing.
@//E*O*F man/ldq.1//
chmod u=r,g=r,o=r man/ldq.1
 
echo x - man/ldrm.1
sed 's/^@//' > "man/ldrm.1" <<'@//E*O*F man/ldrm.1//'
@.TH LDRM 1 "24 January 1985"
@.UC 4
@.SH NAME
ldrm \- remove jobs from the load system queue
@.SH SYNOPSIS
@.B ldrm
[
@.I pids
] [
@.B \-u
@.I users
]
@.SH DESCRIPTION
@.I Ldrm
will remove a job, or jobs, from the load control queue.
Since the server is protected, this and
@.IR ldc (8)
are the only ways users can remove jobs from the load control spool (other
than killing the waiting process directly).
When a job is removed, it will terminate returning status 1.
This method is preferred over sending a kill -KILL to the process as the
job will be removed from the queue, and will no longer appear in
lists produced by
@.IR ldq (1)
or
@.IR ldc (8).
@.PP
@.I Ldrm
can remove jobs specified either by pid or by user name.
With the
@.B \-u
flag,
@.I ldrm
expects a list of users who will have all their jobs removed from the
load control queue.
When given a list of pid's,
@.I ldrm
will remove those jobs from the queue.
A user can only remove jobs they own, while root can remove any job.
@.SH EXAMPLES
To remove the two jobs with pids 8144 and 47:
@.sp
ldrm 8144 47
@.sp
To remove all the jobs owned by the users joe and sam:
@.sp
ldrm -u joe sam
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/cl/*   'u
/usr/spool/ldd/cl/*	directory where sockets are bound
@.fi
@.SH "SEE ALSO"
ldq(1),
ldc(8),
ldd(8)
@.SH DIAGNOSTICS
``Permission denied" if the user tries to remove files other than his
own.
@//E*O*F man/ldrm.1//
chmod u=r,g=r,o=r man/ldrm.1
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     182    1518    9101 README
      14      96     613 NOTICE
      25      76     502 Makefile
      27      52     439 Makefile
     215    1075    5877 ldc.8
     168    1045    6106 ldd.8
      55     221    1145 ldq.1
      59     261    1362 ldrm.1
     745    4344   25145 total
!!!
wc  README NOTICE Makefile man/Makefile man/ldc.8 man/ldd.8 man/ldq.1 man/ldrm.1 | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

muller@sdcc3.UUCP (Keith Muller) (02/12/85)

This is part 6 of the the load control system. Part 1 must be unpacked before
any other part.
	Keith Muller
	ucbvax!sdcsvax!muller


# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:56:47 PST 1985
# Contents:  server/Makefile server/data.c server/globals.c server/main.c
 
echo x - server/Makefile
sed 's/^@//' > "server/Makefile" <<'@//E*O*F server/Makefile//'
#
# Makefile for batch server
#

CFLAGS= -O

BGID=	lddgrp

DEST=	/etc

HDR=	../h/common.h ../h/server.h

SRC=	main.c data.c globals.c setup.c commands.c

OBJ=	main.o data.o globals.o setup.o commands.o

all:	ldd

ldd:    $(OBJ)
	cc -o ldd $(OBJ)

$(OBJ):	 $(HDR)

install: $(DEST)/ldd

$(DEST)/ldd: ldd
	install -c -m 700 -o root -g $(BGID) ldd $(DEST)

clean:
	rm -f $(OBJ) core ldd

lint:
	lint -abchx $(SRC)
@//E*O*F server/Makefile//
chmod u=r,g=r,o=r server/Makefile
 
echo x - server/data.c
sed 's/^@//' > "server/data.c" <<'@//E*O*F server/data.c//'

/*-------------------------------------------------------------------------
 * data.c - server
 *
 * routines that deal with the data structures maintained by the server.
 * the server uses a double linked list with qhead pointing at the head
 * and qtail pointing at the tail. if the queue is not empty then
 * qhead->back is always QNIL and qtail->fow is always QNIL. Insertions
 * also require that the time field increase (older to younger) from qhead
 * to qtail. 
 *
 * NOTE: that when nodes are added to the free list only the fow
 * link is altered so procedures that search through the list with the
 * intention of calling rmqueue must search from qtail to qhead because
 * rmqueue will destroy the nodes fow link.
 *-------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/server.h"

extern struct qnode *qhead;
extern struct qnode *qtail;
extern struct qnode *freequeue;
extern int qcount;
extern int newlist;
extern int newstatus;

/*------------------------------------------------------------------------
 * rmqueue
 *
 * remove the node pointed at by work from the double linked list.
 *------------------------------------------------------------------------
 */
rmqueue(work)
struct qnode *work;
{
	/*
	 * set flags to indicate the list and status files are out of date
	 */
	newlist = 1;
	newstatus = 1;
	qcount--;

	/*
	 * splice the job out of the queue
	 */
	if (work->back == QNIL)
		qhead = work->fow;
	if (work->fow == QNIL)
		qtail = work->back;
	if (work->fow != QNIL)
		(work->fow)->back = work->back;
	if (work->back != QNIL)
		(work->back)->fow = work->fow;
	work->fow = freequeue;
	freequeue = work;
}

/*-------------------------------------------------------------------------
 * addqueue
 *
 * add a node to the queue if it is not already in it.
 * note that when clients poll the server to see if it is still alive they
 * send another "queue" command. This is why addqueue must 
 * check if the job is still queued.
 *-------------------------------------------------------------------------
 */
addqueue(work)
struct request *work;
{
	register struct qnode *spot;
	register struct qnode *spot2;
	register struct qnode *ptr;
	extern int full;
	extern char *malloc();
	extern char *strcpy();

	/*
	 * find the place in the queue for this request. The
	 * time field is used for this oldest requests belong closer
	 * to the head of the queue.
	 */
	for (spot = qtail; spot != QNIL; spot = spot->back){
		/*
		 * it might be already in the queue as a client
		 * is just polling the server to see if the server is
		 * still alive
		 */
		if (spot->pid == work->pid)
			return(1);

		/*
		 * check to see if this job is older
		 */
		if (work->time > spot->time)
			break;
	}

	/*
	 * At this point, job is not in the queue at the correct point.
	 * either is a new job or a client checking to see if server is
	 * alive. If this is a check, look for job higher up in the queue.
	 */
	if (work->type != POLLCMD){
		/*
	 	 * at this point the node is a new one, reject if the
	 	 * queue is full.
	 	 */
		if (qcount >= full)
			return(-2);
	}else if (spot != QNIL){
		/*
		 * this job is just checking up to see if it is still
		 * queued.
		 */
		for (spot2 = spot->back; spot2 != QNIL; spot2 = spot2->back){
			/*
			 * job must have been moved
			 */
			if (spot2->pid == work->pid)
				return(1);
		}

		/*
		 * at this point the job is missing. it should have
		 * been in the queue. so put it back.
		 */
	}

	/*
	 * allocate space for qnode, check freelist first
	 */
	if (freequeue == QNIL)
		ptr = (struct qnode *)malloc(sizeof(struct qnode));
	else{
		ptr = freequeue;
		freequeue = ptr->fow;
	}
	if (ptr == QNIL){
		errlog("no space for a qnode");
		return(-1);
	}

	/*
	 * copy in the data from the datagram
	 */
	ptr->pid = work->pid;
	ptr->uid = work->uid;
	ptr->time = work->time;
	(void)strcpy(ptr->com, work->com);

	/*
	 * special case if queue was empty
	 */
	if (qcount == 0){
		if ((qhead != QNIL) || (qtail != QNIL)){
			errlog("Addqueue: qcount should not be 0");
			cleanup();
		}
		qhead = qtail = ptr;
		ptr->fow = ptr->back = QNIL;
		newlist = 1;
		newstatus = 1;
		qcount = 1;
		return(0);
	}
	/*
	 * do two integrity checks, yes we are paranoid
	 */
	if (qhead == QNIL){
		errlog("Addqueue: qhead should not be QNIL");
		cleanup();
	}
	if (qtail == QNIL){
		errlog("Addqueue: qtail should not be QNIL");
		cleanup();
	}

	/*
	 * if spot == qhead, belongs at very beginning of queue
	 */
	if (spot == QNIL){
		qhead->back = ptr;
		ptr->fow = qhead;
		ptr->back = QNIL;
		qhead = ptr;
	}else{
		/*
		 * insert into the queue
		 */
		ptr->fow = spot->fow;
		ptr->back = spot;
		if (spot->fow != QNIL)
			(spot->fow)->back = ptr;
		else
			qtail = ptr;
		spot->fow = ptr;
	}
	/*
	 * change newlist to show queue has changed
	 */
	newlist = 1;
	newstatus = 1;
	qcount++;
	return(1);
}


/*-------------------------------------------------------------------------
 * movequeue
 *
 * move the job pid to posistion pos in the queue. Note to maintain
 * insertion date requirements, the time field in the moved job is
 * altered.
 *-------------------------------------------------------------------------
 */
movequeue(pos,pid)
u_long pos;
u_long pid;
{
	register struct qnode *ptr;
	register struct qnode *work;
	extern int qcount;

	work = QNIL;
	for (ptr = qhead; ptr != QNIL; ptr = ptr->fow){
		/*
		 * look for the requested node, set work to point
		 */
		if (ptr->pid == pid){
			work = ptr;
			break;
		}
	}

	/*
	 * if not found return -1 as no such pid, or return 0
	 * if only one job queued
	 */
	if (work == QNIL)
		return(-1);
	if (qcount == 1)
		return(0);

	/*
	 * set ptr to point a position to move work to
	 * note: first position in queue is 1 (not 0).
	 */
	for (ptr = qhead; ((ptr != QNIL) && (pos > 1)); ptr = ptr->fow){
		if (ptr != work)
			/*
			 * must be moving the job to a lower position
			 * in the queue. So cannot count self.
			 */
			pos--;
	}

	/*
	 * if it is already at the requested position, or the pos is
	 * after the last node and the pid IS the last node, return
	 */
	if ((ptr == work) || ((ptr == QNIL) && (qtail == work)))
		return(0);
	
	newlist = 1;
	/*
	 * splice the node out of the queue
	 */
	if (work->fow != QNIL)
		(work->fow)->back = work->back;
	if (work->back != QNIL)
		(work->back)->fow = work->fow;
	if (qtail == work)
		qtail = work->back;
	if (qhead == work)
		qhead = work->fow;
	/*
	 * splice the node into the new position.
	 */
	if (ptr == QNIL){
		/*
		 * put at the end of the queue
		 */
		work->back = qtail;
		work->fow = QNIL;
		work->time = qtail->time + 1;
		qtail->fow = work;
		qtail = work;
	}else{
		/*
		 * belongs in the queue as ptr points at a node
		 */
		work->fow = ptr;
		work->back = ptr->back;
		/*
		 * see if the pid is being put at the head of the list
		 */
		if (ptr->back != QNIL){
			(ptr->back)->fow = work;
			work->time = ptr->time-((ptr->time-(ptr->back)->time)/2);
		}else{
			qhead = work;
			work->time = ptr->time - 1;
		}
		ptr->back = work;
	}
	return(0);
}
@//E*O*F server/data.c//
chmod u=r,g=r,o=r server/data.c
 
echo x - server/globals.c
sed 's/^@//' > "server/globals.c" <<'@//E*O*F server/globals.c//'

/*-------------------------------------------------------------------------
 * globals.c - server
 *
 * allocation of the variables that are global to the server.
 *-------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/server.h"
#include <sys/uio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/time.h>
#include <stdio.h>

int kmem = -1;				/* file desc for kmem to get load */
int cntrlsock = -1;			/* socket desc for control messages*/
int msgsock = -1;			/* socket for queue requests */
int qcount = 0;				/* count job in the queue */
int newlist = 1;			/* 1 when queue is new than last list*/
int newstatus = 1;			/* 1 when status variable are changed*/
int errorcount = 0;			/* count of number of recovered error*/
int timerstop = 1;			/* when when timer stopped, 0 runs */
u_long mqtime = MAXQTIME;		/* max time a job can be in queue */
int descsize = 0;			/* desc table size for select */
long loadaddr = 0;			/* address of load aver in kmem */
int alrmmask = 0;			/* mask for blocking SIGALRM */
int full = MAXINQUEUE;			/* max number of jobs waiting to run */
FILE *errfile;				/* file where errors are logged */
struct qnode *qhead = QNIL;		/* points at queue head */
struct qnode *qtail = QNIL;		/* points at queue tail */
struct qnode *freequeue = QNIL;		/* pointer to local freelist of qnode*/
struct itimerval startalrm = {{ALRMTIME,0},{ALRMTIME,0}}; /* alrm time */
struct itimerval stopalrm = {{0,0},{0,0}}; /* value used to stop timer */
struct timeval polltime = {WAITTIME,0};    /* wait time during poll */

#ifdef sun
long loadlevel = (long)(MAXLOAD*256);	/* load at which queueing starts */
#else
double loadlevel = MAXLOAD;		/* load at which queueing starts */
#endif
@//E*O*F server/globals.c//
chmod u=r,g=r,o=r server/globals.c
 
echo x - server/main.c
sed 's/^@//' > "server/main.c" <<'@//E*O*F server/main.c//'

/*-------------------------------------------------------------------------
 * main.c - server
 *
 * The server takes requests from client processes and the control
 * program, and performs various operations. The servers major task is
 * to attempt to maintain the systems load average close to a set limit
 * loadlevel. Client processes are kept in a queue and are waiting for a
 * command from the server (to run or abort). The server reads /dev/kmem
 * every ALRMTIME seconds checking to see if the load level has dropped
 * below the required loadlevel. If the queue is empty the timer is turned
 * off. While the timer is off, the server will only read /dev/kmem at the
 * receipt of a request to run from a client program.
 *
 * The server was designed to be as fault tolerant as possible and maintains
 * an errorfile of detectable errors. The server can safely be aborted and
 * restarted without deadlocking the clients. The server when restarted
 * will rebuild the queue of waiting processes to the state that exsisted
 * before the prvious server exited. The entire system was designed to allow
 * execution of user programs (even those under load control) even if the
 * server is not functioning properly! (user jobs will ALWAYS run, the system
 * will never hang).
 *
 * The effectiveness of the system depends on what fraction of the programs
 * that are causing the system overload are maintained under this system.
 * Processes can only remain in queue a maximium of "mqtime" seconds 
 * REGARDLESS of the loadlevel setting. This was done in case the programs 
 * that are keeping the systems loadlevel above the threshold are not
 * controlled by the server! So eventually all jobs will run.
 *
 * The control program allows users to remove their jobs from the queue and
 * allows root to adjust the operating parameters of the server while the
 * server is running.
 * 
 * All the programs and routines are commented and warnings about certain
 * sections of code are given when the code might be vague.
 * 
 * This system has ONLY BEEN RUN ON 4.2 UNIX (sun, vax and pyramid) and uses
 * datagrams in the AF_UNIX domain. (which seems to be extremely reliable).
 *
 * Author: Keith Muller
 *         University of California, San Diego
 *         Academic Computer Center C - 010
 *	   La Jolla, Ca 92093
 *	   (ucbvax!sdcsvax!sdcc3!muller)
 *	   (619) 452-6090
 *-------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/server.h"
#include <sys/time.h>
#include <sys/file.h>
#include <stdio.h>
#include <errno.h>

/*--------------------------------------------------------------------------
 * main
 *
 *--------------------------------------------------------------------------
 */
main(argc, argv)
int argc;
char **argv;
{
	register int msgmask;
	register int cntrlmask;
	int numfds;
	int readfds;
	int readmask;
	extern int msgsock;
	extern int cntrlsock;
	extern int descsize;
	extern int errno;

	/*
	 * check the command line args
	 */
	doargs(argc, argv);

	/*
	 * setup the server
	 */
	setup();

	/*
	 * create all the sockets
	 */
	crsock();

	/*
	 * scan the spool for waiting clients and send them a POLLCMD
	 */
	scanspool();

	/*
	 * create the bit mask used by select to determine which descriptors
	 * are checked for available input ( datagrams).
	 */
	msgmask = 1 << msgsock;
	cntrlmask = 1 << cntrlsock;
	readmask = msgmask | cntrlmask;

	/*
	 * do this forever
	 */
	for(;;){
		readfds = readmask;

		/*
		 * wait for a datagram to arrive
		 */
		numfds = select(descsize,&readfds,(int *)0,(int *)0,(struct timeval *)0);
		if ((numfds < 0) && (errno != EINTR)){
			errlog("select error");
			cleanup();
		}

		/*
		 * if the interval timer interrupted us, go back to the select
		 */
		if (numfds <= 0)
			continue;
		/*
		 * WARNING! note that BOTH SOCKETS are always checked 
		 * when the select indicates at least one datagram is waiting.
		 * This was done to prevent a situation where one socket
		 * "locks" out the other if it is subject to high traffic!
		 */

		/*
		 * first check to see if there is a control message
		 */
		if (readfds & cntrlmask)
			cntrldis();

		/*
		 * now see if there is a queue message
		 */
		if (readfds & msgmask)
			msgdis();
	}

}


/*--------------------------------------------------------------------------
 * onalrm
 *
 * handler for the SIGALRM sent by the interval timer. This routine checks
 * the queue to see if there is any jobs that can be run. The two conditions
 * for running a job is that the load on the machine is below loadlimit or
 * the oldest job in the queue has exceed the maximium queue time and should
 * be run regardless of the load.
 *--------------------------------------------------------------------------
 */
onalrm()
{
	register int count;
	struct timezone zone;
	struct timeval now;
	struct itimerval oldalrm;
	extern struct itimerval stopalrm;
	extern struct qnode *qhead;
	extern u_long mqtime;
	extern int qcount;
	extern int timerstop;
	extern int newstatus;

	/*
	 * if the load average is below the limit run as many jobs as
	 * possable to bring the load up to the loadlimit.
	 * this could cause an overshoot of the loadlimit, but in most
	 * cases this overshoot will be small. This prevents excessive
	 * waiting of jobs due to momentary load peaks.
	 */
	if ((count = getrun()) != 0){
		while ((count > 0) && (qcount > 0)){
			/*
			 * only decrement count if there was really
			 * a waiting client (the client could be dead)
			 */
			if (outmsg(qhead->pid, RUNCMD) == 0)
				count--;
			rmqueue(qhead);
		}
	}else if (qcount > 0){
		/*
		 * load is too high to run a job, check if oldest can be run
		 */
		if (gettimeofday(&now, &zone) < 0){
			errlog("onalrm cannot get time");
			return;
		}
		while ((qcount>0)&&(((u_long)now.tv_sec - qhead->time)>mqtime)){
			/*
			 * determined oldest job can run. if job is
			 * dead try next one
			 */
			if (outmsg(qhead->pid, RUNCMD) == 0){
				rmqueue(qhead);
				break;
			}else
				rmqueue(qhead);
		}
	}

	/*
	 * if the queue is not empty or the interval timer is stopped
	 * then return
	 */
	if ((qcount != 0) || (timerstop == 1))
		return;

	/*
	 * otherwise stop the timer
	 */
	if (setitimer(ITIMER_REAL,&stopalrm, &oldalrm) < 0)
		errlog("stop timer error");
	else{
		timerstop = 1;
		newstatus = 1;
	}
}


/*-------------------------------------------------------------------------
 * getrun
 *
 * determines how many jobs can be run after obtaining current 1 minute
 * load average. since the load obtained from kmeme is an average, this
 * should provide some hysteresis so the server doesn't thrash around
 *-------------------------------------------------------------------------
 */
getrun()
{
	extern int qcount;
	extern int kmem;
	extern long loadaddr;
#ifdef sun
	long load;
	long run;
	extern long loadlevel;
#else
	double load;
	double run;
	extern double loadlevel;
#endif sun
	extern long lseek();

	/*
	 * seek out into kmem (yuck!!!)
	 */
	if (lseek(kmem, loadaddr, L_SET) == -1){
		errlog("lseek error");
		cleanup();
	}

	/*
	 * read the load
	 */
	if (read(kmem, (char *)&load, sizeof(load)) < 0){
		errlog("kmem read error");
		cleanup();
	}

	/*
	 * calculate the number of jobs that can run
	 * (will always overshoot by the fraction)
	 */
	if ((run = loadlevel - load) > 0){
#ifdef sun
		/*
	 	 * sun encodes the load average in a long. It is the
	 	 * load average * 256
	 	 */
		return(1 + (int)(run >> 8));
#else
		return(1 + (int)run);
#endif
	}else
		return(0);
}


/*------------------------------------------------------------------------
 * errlog
 *
 * log the erros into a log. should be small number (hopefully zero!!)
 *------------------------------------------------------------------------
 */
errlog (mess)
char *mess;

{
	struct timeval now;
	struct timezone zone;
	extern char *ctime();
	extern int errorcount;
	extern int errno;
	extern int sys_nerr;
	extern char *sys_errlist[];
	extern FILE *errfile;

	/*
	 * increase the errorcount
	 */
	errorcount = errorcount + 1;

	/*
	 * if called with an arg, print it first
	 */
	if (mess != (char *)0)
		fprintf(errfile,"%s: ", mess);
	/*
	 * if a valid error print the human message
	 */
	if ((errno > 0) && (errno < sys_nerr))
		fprintf(errfile," %s ", sys_errlist[errno]);
	/*
	 * stamp the time of occurance
	 */
	if (gettimeofday(&now, &zone) < 0)
		fprintf(errfile,"errlog cannot get time of day\n");
	else
		fprintf(errfile,"%s", ctime(&(now.tv_sec)));
	(void)fflush(errfile);
}


/*-------------------------------------------------------------------------
 * cleanup
 *
 * the whole system fell apart. close down the sockets log the server
 * termination and exit.
 *-------------------------------------------------------------------------
 */
cleanup()
{
	extern int msgsock;
	extern int cntrlsock;
	extern int errno;
	extern FILE *errfile;

	(void)close(msgsock);
	(void)close(cntrlsock);
	(void)unlink(MSGPATH);
	(void)unlink(CNTRLPATH);
	errno = 0;
	errlog("Server aborting at");
	(void)fclose(errfile);
	exit(1);
}
@//E*O*F server/main.c//
chmod u=r,g=r,o=r server/main.c
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
      33      62     411 Makefile
     311    1144    7097 data.c
      44     288    1782 globals.c
     355    1341    9080 main.c
     743    2835   18370 total
!!!
wc  server/Makefile server/data.c server/globals.c server/main.c | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

muller@sdcc3.UUCP (Keith Muller) (02/12/85)

This is part 7 of the load control system. Part 1 must be unpacked before any
other part.
	Keith Muller
	ucbvax!sdcsvax!muller


# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:58:16 PST 1985
# Contents:  server/commands.c
 
echo x - server/commands.c
sed 's/^@//' > "server/commands.c" <<'@//E*O*F server/commands.c//'

/*------------------------------------------------------------------------
 * commands.c - server
 *
 * Commands that can be executed by the server in response to client
 * datagrams
 *------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/server.h"
#include <sys/file.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/time.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>

/*-----------------------------------------------------------------------
 * cntrldis
 *
 * cntrldis reads the datagram on control socket port. Then call the
 * appropriate routine as encoded in the datagrams type field. 
 * NOTE:
 *      The control program that sent the datagram ALWAYS waits for an
 *      indication that the datagram was processed. Each routine is 
 *      required to send an indication to the control program that the
 *      request was processed.
 *-----------------------------------------------------------------------
 */

cntrldis()
{
	struct request work;	/* datagram space */
	int oldmask;		/* old value of signal mask  */
	int fromlen = 0;
	extern int cntrlsock;
	extern int alrmmask;
	extern int newstatus;
#ifdef sun
	extern long loadlevel;
#else
	extern double loadlevel;
#endif sun
	extern int full;
	extern int errno;
	extern u_long mqtime;
	extern struct qnode *qhead;

	/*
	 * BLOCK OFF SIGALRM as the called routines modify the 
	 * internal data structures and cannot be interrupted
	 * by the interval timer. That would corrupt the linked
	 * lists.
	 */
	oldmask = sigblock(alrmmask);

	if (recvfrom(cntrlsock,&work,sizeof(struct request),0,(struct sockaddr *)0,&fromlen) <= 0){
		if (errno != EINTR)
			errlog("error in cntrldis recv");
		(void)sigsetmask(oldmask);
		return;
	}

	/*
	 * dispatch on type of request
	 */
	switch(work.type){
		case RJOBCMD:
			/*
			 * run a job by pid in the queue
			 * (privledged command)
			 */
			runjob(&work);
			break;
		case RUSRCMD:
			/*
			 * remove all a users jobs in the queue
			 * (privledged command)
			 */
			runusr(&work);
			break;
		case PJOBCMD:
			/*
			 * a request to remove a job from the queue
			 * (also can be called from msgdis())
			 */
			prjob(&work, cntrlsock);
			break;
		case PUSRCMD:
			/*
			 * remove all a users jobs in the queue
			 */
			prusr(&work);
			break;
		case PALLCMD:
			/*
			 * purge the ENTIRE queue
			 * (privledged command)
			 */
			prall(&work);
			break;
		case ABORTCMD:
			/*
	 		 * make sure socket is owned by root, otherwise
	 		 * reject it
	 		 */
			if (chksock(CNTRLPRE, work.pid, 0) == 0){
				(void)outcntrl(work.pid, STOPCMD);
				return;
			}

			/*
			 * tell the server to terminate
			 * (privledged command)
			 */
			(void)outcntrl(work.pid, RUNCMD);
			cleanup();
			break;
		case MOVECMD:
			/*
	 		 * make sure socket is owned by root, otherwise
	 		 * reject it
	 		 */
			if (chksock(CNTRLPRE, work.pid, 0) == 0){
				(void)outcntrl(work.pid, STOPCMD);
				return;
			}

			/*
			 * move a process in the queue
			 * (privledged command)
			 */
			if (movequeue(work.time, work.uid) == 0)
				(void)outcntrl(work.pid, RUNCMD);
			else
				(void)outcntrl(work.pid, STOPCMD);
			break;
		case LOADLIMCMD:
			/*
	 		 * make sure socket is owned by root, otherwise
	 		 * reject it
	 		 */
			if (chksock(CNTRLPRE, work.pid, 0) == 0){
				(void)outcntrl(work.pid, STOPCMD);
				return;
			}

			/*
			 * change the load level queueing starts
			 * (privledged command)
			 */
#ifdef sun
			loadlevel = (long)work.time;
#else
			loadlevel = ((double)work.time)/256.0;
#endif sun
			newstatus = 1;
			(void)outcntrl(work.pid, RUNCMD);
			break;
		case STATUSCMD:
			/*
			 * update the status file if necessary
			 */
			status(&work);
			break;
		case LISTCMD:
			/*
			 * update the queue list file if necessary 
			 */
			list(&work);
			break;
		case MQTIMECMD:
			/*
	 		 * make sure socket is owned by root, otherwise
	 		 * reject it
	 		 */
			if (chksock(CNTRLPRE, work.pid, 0) == 0){
				(void)outcntrl(work.pid, STOPCMD);
				return;
			}

			/*
			 * change the maximium time a job can wait
			 * (privledged command)
			 */
			mqtime = work.time;
			newstatus = 1;
			(void)outcntrl(work.pid, RUNCMD);
			break;
		case QUEUESIZE:
			/*
	 		 * make sure socket is owned by root, otherwise
	 		 * reject it
	 		 */
			if (chksock(CNTRLPRE, work.pid, 0) == 0){
				(void)outcntrl(work.pid, STOPCMD);
				return;
			}

			/*
			 * change the maximium size limit on the
			 * queue of waiting jobs
			 * (privledged command)
			 */
			full = (int)work.time;
			newstatus = 1;
			(void)outcntrl(work.pid, RUNCMD);
			break;
		case CHTIMER:
			/*
			 * change interval when load level checked 
			 * (privledged command)
			 */
			chtimer(&work);
			break;
		default:
			errno = 0;
			errlog("cntrldis bad command");
			(void)outcntrl(work.pid, STOPCMD);
			break;
	}

	/*
	 * UNBLOCK SIGALRM so interval timer can check load
	 * to dispatch a job.
	 */

	(void)sigsetmask(oldmask);
}

/*-----------------------------------------------------------------------
 * msgdis
 *
 * msgdis reads the datagram on msg socket port. Then calls the
 * appropriate routine as encoded in the datagrams type field. 
 * NOTE:
 *      The client that sent the datagram ALWAYS waits for an indication
 *      that the datagram was processed. Each routine is required to send
 *      an indication to the client that the request has been processed.
 *-----------------------------------------------------------------------
 */

msgdis()
{
	struct request work;	/* datagram space */
	int oldmask;		/* old value of signal mask  */
	int fromlen = 0;
	extern int msgsock;
	extern int alrmmask;
	extern int errno;

	/*
	 * BLOCK OFF SIGALRM as the called routines modify the 
	 * internal data structures and cannot be interrupted
	 * by the iterval timer. That would corrupt the linked
	 * lists.
	 */
	oldmask = sigblock(alrmmask);

	if (recvfrom(msgsock,&work,sizeof(struct request),0,(struct sockaddr *)0,&fromlen) <= 0){
		if (errno != EINTR)
			errlog("error in msgdis recv");
		(void)sigsetmask(oldmask);
		return;
	}

	/*
	 * dispatch on type of request
	 */
	if (work.type == POLLCMD){
		/*
		 * a client making sure he is in the queue
		 * same as a QCMD, but addjob handles them differently.
		 */
		addjob(&work);
	}else if (work.type == QCMD){
		/*
		 * a request to queue a process
		 */
		addjob(&work);
	}else if (work.type == PJOBCMD){
		/*
		 * a request to remove a job from the queue
		 * should only be from a terminating client
		 * (also called from cntrldis())
		 */
		prjob(&work, msgsock);
	}else{
		errno = 0;
		errlog("msgdis bad command");
		(void)outmsg(work.pid, STOPCMD);
	}

	/*
	 * UNBLOCK SIGALRM so interval timer can check load
	 * to dispatch a job.
	 */
	(void)sigsetmask(oldmask);
}

/*-------------------------------------------------------------------------
 * addjob 
 *
 * check a job request to be queued. the request is in datagram work.
 * jobs are only added if the load is above the set loadlimit threshold,
 * otherwise they are told to run.
 * If the queue is full, then the job is rejected.
 *-------------------------------------------------------------------------
 */

addjob(work)
struct request *work;
{
	struct itimerval oldalrm;
	extern int full;
	extern int qcount;
	extern struct itimerval startalrm;
	extern int addqueue();
	extern int timerstop;
	extern struct qnode *qhead;

	/*
	 * if the queue is empty and the load is below the
	 * limit, just run the job.
	 */
	if ((qcount == 0) && (getrun() > 0)){
		(void)outmsg(work->pid, RUNCMD);
		return;
	}

	switch (addqueue(work)){
		case 0:
			/*
		 	 * queue was empty, turn the timer back on
		 	 */
			if (setitimer(ITIMER_REAL,&startalrm, &oldalrm)<0){
				errlog("start timer error");
				exit(1);
			}
			timerstop = 0;
			/*
			 * fall through to case 1 below, and send queued
			 * message
			 */
		case 1:
			/*
			 * job is in queue, all is ok
			 */
			 (void)outmsg(work->pid, QCMD);
			 break;
		case -1:
			/*
		 	 * addqueue failed see if we can free up a space by
		 	 * telling oldest job to run.
		 	 */
			if (qcount > 0){
				(void)outmsg(qhead->pid, RUNCMD);
				(void)rmqueue(qhead);
				(void)addqueue(work);
				(void)outmsg(work->pid, QCMD);
			}else{
				(void)outmsg(work->pid, RUNCMD);
			}
			break;
		case -2:
			/*
			 * this is a new job and the queue is full.
			 * Reject the job.
			 */
			(void)outmsg(work->pid, FULLQUEUE);
			break;
		default:
			/*
			 * bad return from addqueue()
			 */
			errlog("addqueue returned bad value");
			exit(1);
			break;
	}
}

/*-------------------------------------------------------------------------
 * chksock
 *
 * make sure that the bound socket is owned by the proper person. This is only
 * checked for control messages (which are very infrequent).
 *-------------------------------------------------------------------------
 */

chksock(prefix, jpid, juid)
char *prefix;
u_long jpid;
int juid;
{
	char namebuf[64];
	struct stat statbuf;
	extern char *sprintf();

	(void)sprintf(namebuf, "%s%u", prefix, jpid);
	if (stat(namebuf, &statbuf) != 0)
		return(0);
	if ((unsigned int)statbuf.st_uid != juid)
		return(0);
	return(1);
}

/*-------------------------------------------------------------------------
 * chtimer
 *
 * change the interval timer. The interval timer is used to force the server
 * to check the queue every n seconds to see if the load is low enough to
 * let some jobs run.
 *-------------------------------------------------------------------------
 */

chtimer(work)
struct request *work;
{
	struct itimerval oldalrm;
	extern struct itimerval startalrm;
	extern int timerstop;
	extern int newstatus;
	extern int outmsg();

	/*
	 * make sure that this is from a socket owned by root, otherwise
	 * reject it
	 */
	if (chksock(CNTRLPRE, work->pid, 0) == 0){
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	startalrm.it_interval.tv_sec = work->time;
	startalrm.it_value.tv_sec = work->time;
	newstatus = 1;

	/*
	 * if the timer is already stopped, just leave.
	 */
	if (timerstop == 1){
		(void)outcntrl(work->pid, RUNCMD);
		return;
	}

	/*
	 * restart the timer with the new interval
	 */
	if (setitimer(ITIMER_REAL,&startalrm, &oldalrm) < 0){
		errlog("start timer error");
		(void)outcntrl(work->pid, STOPCMD);
		cleanup();
	}

	/*
	 * tell the client the command was sucessful
	 */
	(void)outcntrl(work->pid, RUNCMD);
}

/*------------------------------------------------------------------------
 * list
 *
 * if necessary update the list file with the current queue status
 * then tell the client that the list file is up to date.
 * The data is stored in a file to avoid any chance the server could block
 * writing to a stopped control program.
 *
 * NOTE:
 * The users uids are NOT looked up it the passwd file. That must be done
 * by the programs that read the list file. Looking things up in the passwd
 * file are real expensive (even with dbm hashing) and this cannot be
 * afforded.
 *------------------------------------------------------------------------
 */
list(work)
struct request *work;
{
	register struct qnode *ptr;	/* pointer to walk through queue */
	FILE *out;			/* file where list will be written */
	extern int newlist;
	extern int qcount;
	extern struct qnode *qhead;
	extern FILE *fopen();

	/*
	 * if the queue is the same as the last time it was listed in
	 * the file, just tell the client to read the file.
	 */
	if (newlist == 0){
		(void)outcntrl(work->pid, RUNCMD);
		return;
	}

	if ((out = fopen(LISTFILE, "w")) == NULL){
		errlog("list cannot open LISTFILE");
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	/*
	 * write out the number of waiting clients
	 */
	fprintf(out, "%d\n", qcount);

	/*
	 * write each queue entry
	 */
	for (ptr = qhead; ptr != QNIL; ptr = ptr->fow)
		fprintf(out,"%u %u %u %s\n",ptr->uid,ptr->pid,ptr->time,ptr->com);
	(void)fclose(out);

	/*
	 * set the flag to indicate that the list was updated.
	 */
	newlist = 0;

	/*
	 * tell the client to read the file
	 */
	(void)outcntrl(work->pid, RUNCMD);
}

/*-----------------------------------------------------------------------
 * outcntrl
 *
 * send the indicated message to the waiting control program who pid is
 * "pid". control sockets are always the CNTRLPRE followed by the pid.
 *-----------------------------------------------------------------------
 */
outcntrl(pid, cmd)
u_long pid;
char cmd;
{
	int len;                 /* the size of the datagram header */
	struct sockaddr_un name; /* datagram recipient */
	extern int cntrlsock;
	extern char *sprintf();
	extern int errno;

	/*
	 * set up the address of the target of the message
	 */
	name.sun_family = AF_UNIX;
	(void)sprintf(name.sun_path, "%s%u", CNTRLPRE, pid);
	len = strlen(name.sun_path) + sizeof(name.sun_family) + 1;

	if (sendto(cntrlsock, &cmd, sizeof(cmd), 0, &name, len) >= 0)
		return(0);

	/*
	 * If this point is reached:
	 *
	 * The sendto FAILED, either control died and left the old socket
	 * entry in the filesystem (so remove it) or terminated and
	 * cleaned up the old socket entry.
	 */
	if ((errno == ENOTSOCK) || (errno == ECONNREFUSED) || (errno == ENOENT)
	     || (errno == EPROTOTYPE))
		(void)unlink(name.sun_path);
	else
		errlog("outcntrl sendto failed");
	return(1);
}

/*-----------------------------------------------------------------------
 * outmsg
 *
 * send the indicated message to the waiting client who pid is "pid".
 * clients sockets are always the CLIENTPRE followed by the clients pid.
 *-----------------------------------------------------------------------
 */
outmsg(pid, cmd)
u_long pid;
char cmd;
{
	int len;                 /* the size of the datagram header */
	struct sockaddr_un name; /* datagram recipient */
	extern int msgsock;
	extern char *sprintf();
	extern int errno;

	/*
	 * set up the address of the target of the message
	 */
	name.sun_family = AF_UNIX;
	(void)sprintf(name.sun_path, "%s%u",CLIENTPRE,pid);
	len = strlen(name.sun_path) + sizeof(name.sun_family) + 1;

	if (sendto(msgsock, &cmd, sizeof(cmd), 0, &name, len) >= 0)
		return(0);

	/*
	 * If this point is reached:
	 *
	 * The sendto FAILED, either client died and left the old socket
	 * entry in the filesystem (so remove it) or terminated and
	 * cleaned up the old socket entry.
	 */
	if ((errno == ENOTSOCK) || (errno == ECONNREFUSED) || (errno == ENOENT)
	     || (errno == EPROTOTYPE))
		(void)unlink(name.sun_path);
	else
		errlog("outmsg sendto failed");
	return(1);
}

/*------------------------------------------------------------------------
 * prall
 *
 * remove ALL the waiting tasks in the queue. The jobs are told to 
 * terminate.
 *------------------------------------------------------------------------
 */
prall(work)
struct request *work;
{
	register struct qnode *ptr;
	extern struct qnode *qtail;

	/*
	 * make sure control socket is owned by root
	 * otherwise reject it
	 */
	if (chksock(CNTRLPRE, work->pid, 0) == 0){
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	for (ptr = qtail; ptr != QNIL; ptr= ptr->back){
		(void)outmsg(ptr->pid, STOPCMD);
		rmqueue(ptr);
	}

	/*
	 * tell the client the control program the queue is purged
	 */
	(void)outcntrl(work->pid, RUNCMD);
}

/*--------------------------------------------------------------------------
 * prjob
 *
 * remove a job (specified by its pid) from the queue. The job is told to
 * terminate. If the job is not found, tell the requesting client.
 *--------------------------------------------------------------------------
 */
prjob(work, port)
register struct request *work;
int port;
{
	register struct qnode *ptr;
	extern struct qnode *qtail;
	extern int cntrlsock;
	extern int msgsock;

	/*
	 * check to see if this is a control program request or
	 * a client request. check the validity of the bound
	 * socket in either case
	 */
	if (port == cntrlsock){
		if (chksock(CNTRLPRE, work->pid, 0) == 0){
			(void)outcntrl(work->pid, STOPCMD);
			return;
		}
	}else if (chksock(CLIENTPRE, work->pid, (int)work->uid) == 0){
		(void)outmsg(work->pid, STOPCMD);
		return;
	}

	for (ptr = qtail; ptr != QNIL; ptr = ptr->back){
		if (ptr->pid != work->time)
			continue;
		/*
		 * found the job, ONLY remove if the requester owns
		 * the job, if the requester is root, or this is a
		 * client that is terminating from a signal and is
		 * sending it's "last breath".
		 */
		if (work->pid == work->time){
			/*
			 * clients "last breath". just remove from
			 * queue as by now the client is dead.
			 */
			rmqueue(ptr);
			return;
		}
		if ((work->uid == 0) || (ptr->uid == work->uid)){
			(void)outmsg(ptr->pid, STOPCMD);
			rmqueue(ptr);
			if (port == cntrlsock)
				(void)outcntrl(work->pid, RUNCMD);
			else
				(void)outmsg(work->pid, RUNCMD);
			return;
		}else
			break;
	}
	/*
	 * command failed, tell the process that sent the datagram
	 * only if this is not a "last breath" message (should really
	 * never happen!)
	 */
	if (port == cntrlsock)
		(void)outcntrl(work->pid, STOPCMD);
	else if (work->pid != work->time)
		(void)outmsg(work->pid, STOPCMD);
}

/*-------------------------------------------------------------------------
 * prusr
 *
 * remove all the jobs queued that belong to a specified user. Only root
 * or the user can request his jobs to be removed.
 * (check for user field must be done in the control program).
 *-------------------------------------------------------------------------
 */
prusr(work)
register struct request *work;
{
	register struct qnode *ptr;
	int found = 0;
	extern struct qnode *qtail;

	/*
	 * check to see if this is a valid control program request
	 */
	if (chksock(CNTRLPRE, work->pid, 0) == 0){
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	for (ptr = qtail; ptr != QNIL; ptr = ptr->back){
		/*
		 * found a job owned by that user.
		 */
		if (ptr->uid == work->uid){
			(void)outmsg(ptr->pid, STOPCMD);
			rmqueue(ptr);
			found = 1;
		}
	}
	if (found == 1)
		(void)outcntrl(work->pid, RUNCMD);
	else
		(void)outcntrl(work->pid, STOPCMD);
}

/*-------------------------------------------------------------------------
 * runjob
 *
 * run a specified job (by pid) REGARDLESS of the load.
 *-------------------------------------------------------------------------
 */
runjob(work)
register struct request *work;
{
	register struct qnode *ptr;
	extern struct qnode *qtail;

	/*
	 * check to see if this is a control program request
	 */
	if (chksock(CNTRLPRE, work->pid, 0) == 0){
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	for (ptr = qtail; ptr != QNIL; ptr = ptr->back){
		if (ptr->pid == work->time){
			/*
		 	 * found the job
		 	 */
			(void)outmsg(ptr->pid, RUNCMD);
			rmqueue(ptr);
			(void)outcntrl(work->pid, RUNCMD);
			return;
		}
	}
	(void)outcntrl(work->pid, STOPCMD);
}

/*-------------------------------------------------------------------------
 * runusr
 *
 * run all jobs owned by a user REGARDLES of the load
 *-------------------------------------------------------------------------
 */
runusr(work)
register struct request *work;
{
	register struct qnode *ptr;
	int found = 0;
	extern struct qnode *qtail;

	/*
	 * check to see if this is a control program request
	 */
	if (chksock(CNTRLPRE, work->pid, 0) == 0){
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

	for (ptr = qtail; ptr != QNIL; ptr = ptr->back){
		if (ptr->uid == work->uid){
			/*
		 	 * found a job owned by that user
		 	 */
			(void)outmsg(ptr->pid, RUNCMD);
			rmqueue(ptr);
			found = 1;
		}
	}

	if (found == 1)
		(void)outcntrl(work->pid, RUNCMD);
	else
		(void)outcntrl(work->pid, STOPCMD);
}

/*-------------------------------------------------------------------------
 * status
 *
 * update the status file. the status file contains the current setting
 * of server paramters which can be changed by the control program.
 *-------------------------------------------------------------------------
 */
status(work)
struct request *work;
{
	FILE *out;
	extern int errorcount;
	extern int newstatus;
	extern int qcount;
	extern int full;
	extern int timerstop;
#ifdef sun
	extern long loadlevel;
#else
	extern double loadlevel;
#endif sun
	extern u_long mqtime;
	extern struct itimerval startalrm;
	extern FILE *fopen();

	/*
	 * status is the same since the last request.
	 */
	if (newstatus == 0){
		(void)outcntrl(work->pid, RUNCMD);
		return;
	}

	if ((out = fopen(STATUSFILE, "w")) == NULL){
		errlog("status cannot open STATUSFILE");
		(void)outcntrl(work->pid, STOPCMD);
		return;
	}

#ifdef sun
	fprintf(out,"%d %d %d %ld %u %d %ld\n",qcount,full,timerstop,
#else
	fprintf(out,"%d %d %d %ld %u %d %lf\n",qcount,full,timerstop,
#endif
			startalrm.it_value.tv_sec,mqtime,errorcount,loadlevel);

	(void)fclose(out);
	newstatus = 0;
	(void)outcntrl(work->pid, RUNCMD);
}
@//E*O*F server/commands.c//
chmod u=r,g=r,o=r server/commands.c
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     878    2737   20872 commands.c
!!!
wc  server/commands.c | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

muller@sdcc3.UUCP (Keith Muller) (02/12/85)

This is part 8 (last one!) of the load control system. Part 1 must be unpacked
before any other part.

	Keith Muller
	ucbvax!sdcsvax!muller


# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:58:45 PST 1985
# Contents:  server/setup.c
 
echo x - server/setup.c
sed 's/^@//' > "server/setup.c" <<'@//E*O*F server/setup.c//'

/*-------------------------------------------------------------------------
 * setup.c - server
 *
 * routines needed to start up the server.
 *-------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/server.h"
#include <stdio.h>
#include <sys/time.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/dir.h>
#include <sys/resource.h>
#include <nlist.h>
#include <signal.h>
#include <errno.h>

/*-------------------------------------------------------------------------
 * doargs
 *
 * check the command line arguement list and set up the global parameters.
 *
 * Note that both -X value and -Xvalue format for any flag X is accepted
 *-------------------------------------------------------------------------
 */

doargs(argc, argv)
int argc;
char **argv;
{
	register int i;
	register char *ptr;
	int lasti;
	int badarg;
#ifdef sun
	extern long loadlevel;
#else
	extern double loadlevel;
#endif sun
	extern u_long mqtime;
	extern struct itimerval startalrm;
	extern long atol();
	extern int atoi();
	extern double atof();

	badarg = 0;
	for (i = 1; i < argc; i++){
		if (argv[i][0] != '-'){
			fprintf(stderr,"bad arg: %s\n", argv[i]);
			badarg = 1;
			break;
		}
		lasti = i;
		/*
		 * set ptr to point at start of flags VALUE.
		 * if strlen > 2 must be -Xvalue format
		 * otherwise set ptr to point at next argv
		 */
		if (strlen(argv[i]) > 2)
			ptr = &(argv[i][2]);
		else if ((i+1) < argc)
			ptr = argv[++i];
		else{
			fprintf(stderr,"bad arg: %s\n", argv[i]);
			badarg = 1;
			break;
		}

		switch(argv[lasti][1]){
			case 'L':
				/*
				 * load level to queue at
				 */
#ifdef sun
				if ((loadlevel = (long)(atof(ptr)*256)) <= 0){
					fprintf(stderr,"bad loadlevel: %ld\n",atof(ptr));
#else
				if ((loadlevel = atof(ptr)) <= 0){
					fprintf(stderr,"bad loadlevel: %lf\n",loadlevel);
#endif
					badarg = 1;
				}
				break;
			case 'T':
				/*
				 * timer cycle time for load checks
				 */
				if ((startalrm.it_value.tv_sec = atol(ptr))<1){
					fprintf(stderr,"bad alarmtime: %ld\n",atol(ptr));
					badarg = 1;
				}
				break;
			default:
				fprintf(stderr,"unknown arg: %s\n",argv[lasti]);
				badarg = 1;
				break;
		}
		if (badarg == 1)
			break;
	}
	if (badarg == 1){
		fprintf(stderr,"Useage: %s [-L load] [-T alarm]\n",argv[0]);
		exit(1);
	}
}

/*--------------------------------------------------------------------------
 * setup
 *
 * a collection of code need at startup to set up the server. such as 
 * checking to make sure only one server runs, detatch server from control
 * terminal etc
 *--------------------------------------------------------------------------
 */
setup()
{

	register int i;
	int lockfile;
	int temp;
	char line[20];
	static struct nlist avenrun[] = { {"_avenrun"}, {""}};
	extern int alrmmask;
	extern long loadaddr;
	extern int descsize;
	extern int kmem;
	extern int errno;
	extern int errorcount;
	extern FILE *errfile;
	extern FILE *fopen();
	extern int getpid();
	extern int onalrm();
	extern int getpid();
	extern int setpriority();
	extern char *sprintf();

	if (getuid() != 0){
		fprintf(stderr, "must run as root\n");
		exit(1);
	}

	/*
	 * see if the spool dir where the client sockets are bound exsists
	 */
	if (access(SPOOLDIR, F_OK) == -1){
		fprintf(stderr,"No directory: %s\n", SPOOLDIR);
		exit(1);
	}

	/*
	 * see if the spool dir where the server sockets are bound exsists
	 */
	if (access(SERVERDIR, F_OK) == -1){
		fprintf(stderr,"No directory: %s\n", SERVERDIR);
		exit(1);
	}

	/*
	 * detach from foreground
	 */
	if (fork() != 0)
		exit(0);

	/*
	 * close down all open descriptors 
	 * so the server is no longer attached to any tty
	 */
	descsize = getdtablesize();
	for (i = 0; i < descsize; i++)
		(void)close(i);
	(void)open("/dev/null",O_RDONLY);
	(void)open("/dev/null", O_WRONLY);
	(void)open("/dev/null", O_WRONLY);

	/*
	 * do an ioctl to /dev/tty to detach server from ttys
	 */
	if ((i = open("/dev/tty", O_RDWR)) > 0){
		(void)ioctl(i, TIOCNOTTY, (char *)0);
		(void)close(i);
	}

	/*
	 * set umask to remove all others permissions
	 */
	(void)umask(027);

	/*
	 * open the error logging file
	 */
	errfile = fopen(ERRORPATH,"a+");
	if (errfile < 0)
		exit(1);

	/*
	 * check lockfile for other servers already running uses advisory
	 * locking
	 */
	lockfile = open (LOCK, O_WRONLY|O_CREAT, 0640);
	if (lockfile < 0){
		errlog("cannot create lockfile");
		exit(1);
	}

	if (flock(lockfile, LOCK_EX|LOCK_NB) < 0){
		if (errno == EWOULDBLOCK)
			exit(0);
		errlog("cannot lock lockfile");
		exit(1);
	}

	/*
	 * write the pid of this server in the lock file in case you
	 * need to blow the server away. (not currently used).
	 */
	i = getpid();
	(void)ftruncate(lockfile, 0);
	(void)sprintf(line, "%d\n", i);
	temp = strlen(line);
	if (write(lockfile, line, temp) != temp)
		errlog("cannot write server pid");

	/*
	 * mark the logfile that a new server is starting
	 */
	(void)fprintf(errfile,"server pid: %d ",i);
	errno = 0;
	errlog("started at");
	errorcount = 0;

	/*
	 * lower the server priority so that under heavy load the server
	 * can get the machine cycles when it needs them. The server
	 * uses very small amounts of cpu, so this is not going to impact
	 * the system.
	 */
	if (setpriority(0, i, PRIO) < 0 )
		errlog("cannot lower priority");

	/*
	 * open kmem where the load average will be read from
	 */
	if ((kmem = open("/dev/kmem", O_RDONLY)) < 0){
		errlog("cannot open kmem");
		cleanup();
	}

	/*
	 * get the address in this vmunix where the load average is 
	 * loacated in the kernel data space
	 */
	nlist("/vmunix", avenrun);
	if (avenrun[0].n_value == 0){
		errlog("cannot find _avenrun");
		cleanup();
	}
	loadaddr = (long)avenrun[0].n_value;

	/*
	 * bind the signal handlers now
	 */
	(void)signal(SIGALRM, onalrm);

	/*
	 * mask used to block off sigalrm when the data structures
	 * are being changed and must not service a timer interrupt
	 * to check the load average
	 */
	alrmmask = 1 << (SIGALRM - 1);
}

/*---------------------------------------------------------------------------
 * scanspool
 *
 * when the server is restarted it could be right after an older server
 * just terminated and left a lot of jobs queued up. since the queue is
 * kept in memory for speed, no record exsists anymore about the queued
 * clients. Since all the client sockets are bound in the same spool 
 * directory simply search the directory for a client socket and send it
 * a POLLCMD. the client will respond to the POLLCMD by resubmitting its
 * work request datagram. The addqueue routine rebuilds the queue by time
 * so the queue will be in the proper order.  Any dead cleints whose bound
 * sockets are still in the spool will be removed (the bound sockets).
 *---------------------------------------------------------------------------
 */
scanspool()
{

	register int i;
	int numfds;
	int readfds;
	u_long pid;
	int cnprlen;
	int clprlen;
	int tag;
	int msgmask;
	struct direct *dp;
	struct stat statbuf;
	DIR *dirp;
	extern struct timeval polltime;
	extern int msgsock;
	extern int descsize;
	extern long atol();

	/*
	 * open a directory for scanning
	 */
	if ((dirp = opendir(SPOOLDIR)) == NULL){
		errlog("cannot open spool directory");
		cleanup();
	}

	/*
	 * cd to the directory, this allows short names for binding and
	 * has a place to look for core dumps if they might occur
	 */
	if (chdir(SPOOLDIR) == -1){
		errlog("cannot cd to spool");
		cleanup();
	}

	/*
	 * clprlen is length of the prefix of client socket. 
	 * cnprlen is length of prefix of control program socket.
	 * needed to extract the uid
	 */
	clprlen = strlen(CLIENTPRE);
	cnprlen = strlen(CNTRLPRE);
	msgmask = 1 << msgsock;
	for (dp = readdir(dirp); dp != NULL; dp = readdir(dirp)){
		/*
		 * if not a possable client, go to next entry
		 */
		if (dp->d_ino == 0)
			continue;
	    	if (strncmp(CLIENTPRE, dp->d_name, clprlen) == 0)
			tag = 1;
		else if (strncmp(CNTRLPRE, dp->d_name, cnprlen) == 0)
			tag = 0;
		else
			continue;
		if (stat(dp->d_name, &statbuf) != 0){
			errlog("stat on spool file failed");
			continue;
		}

		/*
		 * file has client or cntrol like name but is not a socket,
		 * remove as could cause problems later
		 */
		if ((statbuf.st_mode & S_IFSOCK) == 0){
			(void)unlink(dp->d_name);
			continue;
		}

		if (tag == 0){
			/*
		 	 * send a message to waiting control program. outcntrl
		 	 * will remove if this is an old socket.
		 	 */
			pid = (u_long)atol(dp->d_name + cnprlen);
			(void)outcntrl(pid, POLLCMD);
			continue;
		}

		/*
		 * is a client socket must force a resubmit of the job. If it
		 * is an old socket outmsg will remove.
		 */
		pid = (u_long)atol(dp->d_name + clprlen);

		/*
		 * throw a couple of POLLCMDS at the client to see if
		 * he is still alive. If the system is loaded could take
		 * a while to swap back in so give him time.
		 */
		for (i = 0; i < MAXPOLLS; i++){
			if (outmsg(pid, POLLCMD) != 0)
				break;
			readfds = msgmask;
			numfds = select(descsize,&readfds,(int*)0,(int*)0,&polltime);
			if ((numfds < 0) && (errno != EINTR)){
				errlog("select error in scanspool");
				cleanup();
			}
			/*
			 * time in select expired and no answer from client
			 * try again
			 */
			if (numfds <= 0){
				continue;
			}
			/*
			 * got a datagram, figure it out
			 */
			msgdis();
		}
	}
}


/*-------------------------------------------------------------------------
 * crsock
 *
 * create all the sockets used by the server
 *-------------------------------------------------------------------------
 */
crsock()
{
	int len;
	struct sockaddr_un name;
	extern int msgsock;
	extern int cntrlsock;
	extern char *strcpy();

	/*
	 * create the msgsocket where queue requests appear
	 */
	name.sun_family = AF_UNIX;
	msgsock = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (msgsock < 0){
		errlog("cannot create msgsock");
		cleanup();
	}
	/*
	 * remove any entry in the file system for this name else the bind
	 * will fail. We are sure from the locking that this is ok to do.
	 */
	(void)unlink(MSGPATH);
	(void)strcpy(name.sun_path, MSGPATH);
	len = strlen(name.sun_path) + sizeof(name.sun_family) + 1;
	if (bind(msgsock, &name, len) < 0){
		errlog("cannot bind msgsock");
		cleanup();
	}

	/*
	 * create the control socket for high priority control commands
	 */
	cntrlsock = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (cntrlsock < 0){
		errlog("cannot create cntrlsock");
		cleanup();
	}
	(void)unlink(CNTRLPATH);
	(void)strcpy(name.sun_path, CNTRLPATH);
	len = strlen(name.sun_path) + sizeof(name.sun_family) + 1;
	if (bind(cntrlsock, &name, len) < 0){
		errlog("cannot bind cntrlsock");
		cleanup();
	}
}
@//E*O*F server/setup.c//
chmod u=r,g=r,o=r server/setup.c
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     461    1512   10759 setup.c
!!!
wc  server/setup.c | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

muller@sdcc3.UUCP (Keith Muller) (02/21/85)

Unpack part 1 before this part.

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:44:50 PST 1985
# Contents:  client/Makefile client/main.c scripts/addldd scripts/makedirs
#	scripts/qaddldd scripts/rmldd scripts/saddldd
 
echo x - client/Makefile
sed 's/^@//' > "client/Makefile" <<'@//E*O*F client/Makefile//'
#
# Makefile for batch client
#

CFLAGS= -O

HDR=	../h/common.h ../h/client.h

SRC=	main.c

DEST1=	/bin
TARG1=	binclient
TARG1Q=	qbinclient
DEST2=	/usr/bin
TARG2=	usrbinclient
TARG2Q=	qusrbinclient
DEST3=	/usr/local
TARG3=	usrlocclient
TARG3Q=	qusrlocclient
DEST4=	/usr/ucb
TARG4=	usrucbclient
TARG4Q=	qusrucbclient
DEST5=	/usr/games
TARG5=	gamesclient
TARG5Q=	qgamesclient
DEST6=	/usr/new
TARG6=	usrnewclient
TARG6Q=	qusrnewclient
#
#the spec macros have the name of the program
#SPEC1=	test
#DESTSPEC1= /tmp

all:	$(TARG1) $(TARG1Q) $(TARG2) $(TARG2Q) $(TARG3) $(TARG3Q) $(TARG4) \
	$(TARG4Q) $(TARG5) $(TARG5Q) $(TARG6) $(TARG6Q)

clean:
	rm -f core *client 

lint:
	lint -abchx  main.c

install: $(DEST1)/.client $(DEST1)/.qclient \
	 $(DEST2)/.client $(DEST2)/.qclient \
	 $(DEST3)/.client $(DEST3)/.qclient \
	 $(DEST4)/.client $(DEST4)/.qclient \
	 $(DEST5)/.client $(DEST5)/.qclient
#	 $(DEST5)/.client $(DEST5)/.qclient \
#	 $(DEST6)/.client $(DEST6)/.qclient 
####################################################################
#	 Have the two commented lines replace the last line of the
#	 dependency list if your machine has /usr/new.
#
#        The following line is a sample for $(SPEC1) which would
#	 have to be added into the install dependency for each
#	 SPEC defined.
#        $(DESTSPEC1)/.$(SPEC1)client
####################################################################

####################################################################
# $(SPEC1) is a sample of a special client for controlling
# a single binary (like pi)
# NOARGV should be the full path name of where the real binary is
# is store (usually a .code directory)
####################################################################

$(SPEC1)client: main.c $(HDR)
	cc $(CFLAGS) -DNOARGV=\"$(DESTSPEC1)/.code/$(SPEC1)\" main.c -o $(SPEC1)client

$(DESTSPEC1)/.$(SPEC1)client: $(SPEC1)client
	install -c -m 4711 -o root $(SPEC1)client $(DESTSPEC1)/.$(SPEC1)client

####################################################################

$(TARG1): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST1)/.code/\" main.c -o $(TARG1)

$(TARG1Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST1)/.code/\" -DQUIET main.c -o $(TARG1Q)

$(TARG2): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST2)/.code/\" main.c -o $(TARG2)

$(TARG2Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST2)/.code/\" -DQUIET main.c -o $(TARG2Q)

$(TARG3): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST3)/.code/\" main.c -o $(TARG3)

$(TARG3Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST3)/.code/\" -DQUIET main.c -o $(TARG3Q)

$(TARG4): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST4)/.code/\" main.c -o $(TARG4)

$(TARG4Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST4)/.code/\" -DQUIET main.c -o $(TARG4Q)

$(TARG5): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST5)/.code/\" main.c -o $(TARG5)

$(TARG5Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST5)/.code/\" -DQUIET main.c -o $(TARG5Q)

$(TARG6): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST6)/.code/\" main.c -o $(TARG6)

$(TARG6Q): main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"$(DEST6)/.code/\" -DQUIET main.c -o $(TARG6Q)

$(DEST1)/.client: $(TARG1)
	install -c -m 4711 -o root $(TARG1) $(DEST1)/.client

$(DEST1)/.qclient: $(TARG1Q)
	install -c -m 4711 -o root $(TARG1Q) $(DEST1)/.qclient

$(DEST2)/.client: $(TARG2)
	install -c -m 4711 -o root $(TARG2) $(DEST2)/.client

$(DEST2)/.qclient: $(TARG2Q)
	install -c -m 4711 -o root $(TARG2Q) $(DEST2)/.qclient

$(DEST3)/.client: $(TARG3)
	install -c -m 4711 -o root $(TARG3) $(DEST3)/.client

$(DEST3)/.qclient: $(TARG3Q)
	install -c -m 4711 -o root $(TARG3Q) $(DEST3)/.qclient

$(DEST4)/.client: $(TARG4)
	install -c -m 4711 -o root $(TARG4) $(DEST4)/.client

$(DEST4)/.qclient: $(TARG4Q)
	install -c -m 4711 -o root $(TARG4Q) $(DEST4)/.qclient

$(DEST5)/.client: $(TARG5)
	install -c -m 4711 -o root $(TARG5) $(DEST5)/.client

$(DEST5)/.qclient: $(TARG5Q)
	install -c -m 4711 -o root $(TARG5Q) $(DEST5)/.qclient

$(DEST6)/.client: $(TARG6)
	install -c -m 4711 -o root $(TARG6) $(DEST6)/.client

$(DEST6)/.qclient: $(TARG6Q)
	install -c -m 4711 -o root $(TARG6Q) $(DEST6)/.qclient

####################################################################
# tmpclient is used for testing the ldc system with a /tmp/.code
# directory. This is not normally used.
####################################################################

tmpclient: main.c $(HDR)
	cc $(CFLAGS) -DCODEPATH=\"/tmp/.code/\" main.c -o tmpclient
	install -c -m 4711 -o root tmpclient /tmp/.client
@//E*O*F client/Makefile//
chmod u=r,g=r,o=r client/Makefile
 
echo x - client/main.c
sed 's/^@//' > "client/main.c" <<'@//E*O*F client/main.c//'

/*--------------------------------------------------------------------------
 * main.c - client
 *
 * Front end program that communicates with the ldd server. This front
 * end replaces the program to be controlled. The controlled binary is
 * hidden in a directory that is only accessable through group privledges.
 * Only one client executable is needed for each protected binary directory.
 * The real name of the program to be executed is extracted from argv[0]
 * unless NOARGV is defined. When defined, NOARGV has the name of the program
 * to be exec'ed wired in. The NOARGV option is necessary for programs like
 * pi and px which use argv[0] to pass data to them (YUCK!!!) when they are
 * called from pix. Usually all the front ends can just be links (hard or soft)
 * to the same code file.
 * The QUIET option allows suppression of the client status messages (this
 * is good for nroff). The ONETIME option exempts all child processes from
 * being queued once the parent process has passed through load control
 * once. (Good for queueing individual passes of a compiler, make, etc).
 * If for any reason the server is dead or not responding, this program will
 * simply exec the proper code file. This allows the load control system to
 * be quickly disabled by killing off the ldd server program. 
 * The front end checks every QUEUETIME seconds to see if the server is
 * still running and has this process queued up. If this poll fails the
 * control program is exec'ed. This protects against the system locking up due
 * to server death. The system WILL NOT be overloaded from a rash of executing
 * jobs as each job will expire relative to the time it was queued (which will
 * be spread out over time).
 *
 * Author : Keith Muller
 *          University of California, San Diego
 *	    Academic Computer Center C - 010
 *	    La Jolla Ca 92093
 *	    ucbvax!sdcsvax!sdcc3!muller
 *	    (619) 452-6090
 *---------------------------------------------------------------------------
 */

/* $Log$ */

#include "../h/common.h"
#include "../h/client.h"
#include <sys/uio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/time.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>

int queued = 0;				 /* 1 if queued */
int msgsock = -1;			 /* desciptor of socket */
int len = 0;				 /* used to store address len */
char *ptr;				 /* ptr for pulling apart argv[0] */
char clientpath[255];			 /* buffer for socket name */
char binary[255];			 /* buffer for real binary's path */
struct request job;			 /* datagram for the server */
struct sockaddr_un name;		 /* socket address of server */
struct timeval polltime = {WAITTIME, 0}; /* waitime to check server */
extern int onint();			 /* interrupt handler */

/*-----------------------------------------------------------------------
 * main
 *
 *-----------------------------------------------------------------------
 */
main(argc, argv, envp)
int argc;
char **argv, **envp;
{
	register int i;			/* general counter */
	int msgmask;			/* mask for select */
	int readfds;			/* mask for desc to select on */
	int numfds;			/* number of desc select ret */
	int egid;			/* effective group id */
	int rgid;			/* real group id */
	int uid;			/* real user id */
	int pollcount;			/* number of polls to server */
	int descsize;			/* size of desc table */
	int sigmask;			/* signal mask before block */
	char msg;			/* answer from server */
	struct timeval now;		/* time (secs) value */
	struct timezone zone;		/* timezone value (unused) */
	int fromlen = 0;
#ifndef QUIET
	int announce;			/* limits "queued" messages */
	char *eptr;
	extern char *getenv();
	extern int strcmp();
#endif QUIET
	extern char *strcpy();
	extern char *strncpy();
	extern char *strcat();
	extern int getpid();
	extern int getegid();
	extern int getgid();
	extern int getuid();
	extern int sigblock();
	extern int sigsetmask();
	extern int errno;
	extern char *sprintf();
	extern char *rindex();

	/*
	 * the client front end runs ONLY setuid to root. so get real user
	 * and both effective and real gids.
	 */
	egid = getegid();
	rgid = getgid();
	uid = getuid();

	/*
	 * set the users real and effective uid (no limits on root). also set
	 * the group id to LDDGID so a socket can be bound in the spool
	 * directory and a datagram can be sent to the server. (the spool
	 * directory MUST BE in group LDDGID and mode 0730 only!
	 * NO OTHER PRIVLEDGES AT ALL!!!!!)
	 */
	(void)setregid(rgid, LDDGID);
	(void)setreuid(uid, uid);

	/*
	 * If NOARGV is defined, then this is a special client which
	 * will only exec a SINGLE program. This is to get around things
	 * like pi which can use argv[0] to pass data. Otherwise we must
	 * find the base name of the requested program. Since argv[0]
	 * can be a long ugly path name, ugly looking code is needed
	 * to strip off the path.
	 */
#ifdef NOARGV
	/*
	 * NOARGV is set in the makefile to have the FULL path of where the
	 * real binary lives: for example /usr/bin/.code/yuck
	 */
	(void)strcpy(binary, NOARGV);
	if ((ptr = rindex(binary, '/')) == (char *)0)
		ptr = binary;
	else
		ptr++;
#else
	/*
	 * must pull the path out of the argv[0]
	 */
	if ((ptr = rindex(argv[0], '/')) == (char *)0)
		ptr = argv[0];
	else
		ptr++;
	(void)sprintf(binary, "%s%s", CODEPATH, ptr);
#endif
	/*
	 * If ONETIME is defined, then all child processes of this job are
	 * EXEMPT from being queued. This is useful for things like pi which
	 * can be called both by a user and from pix.
	 * This works because if the effective gid of the process
	 * is the group LDDGID this process must be a decendent of a process
	 * that has already passed through the load control system. This
	 * mechanism will work only if this program is setuid and IS NOT
	 * setgid.
	 *
	 * root is always exempt!
	 *
	 * NOTE: ptr will be used later to build up the command line buffer
	 * in the datagram request packet sent to the server.
	 */

#ifdef ONETIME
	if ((egid == LDDGID) || (uid == 0))
		run(argv, envp);
#else
	if (uid == 0)
		run(argv, envp);
#endif ONETIME

	/*
	 * create the socket and the datagram. if anything fails
	 * just run. cannot afford to have this process HANG!
	 */
	msgsock = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (msgsock < 0)
		run(argv, envp);

	/*
	 * bind the handler to clean up
	 */
	(void)signal(SIGINT, onint);
	(void)signal(SIGHUP, onint);
	(void)signal(SIGQUIT, onint);
	(void)signal(SIGTERM, onint);

	/*
	 * make the datagram up
	 */
	job.pid = (u_long)getpid();
	(void)sprintf(clientpath,"%s/%s%u",SPOOLDIR,CLIENTPRE,job.pid);
	(void)strcpy(name.sun_path, clientpath);
	name.sun_family = AF_UNIX;
	len = strlen(name.sun_path) + 1 + sizeof(name.sun_family);

	/*
	 * block off interrupt and control z until we get datagram
	 * sent
	 */
	sigmask = sigblock((1<<(SIGINT-1)) | (1<<(SIGTSTP-1)) | (1<<(SIGHUP-1)) |
			   (1<<(SIGQUIT-1)) | (1<<(SIGTERM-1)));

	/*
	 * bind the socket, if it fails just run
	 */
	(void)unlink(name.sun_path);

	if (bind(msgsock, &name, len) < 0){
		(void)sigsetmask(sigmask);
		run(argv, envp);
	}

	/*
	 * build up the command line that will be displayed in the
	 * when the user interrogates the queue. This helps in identifying
	 * which job is which.
	 */
	(void)strncpy(job.com, ptr, COMLEN - 1);
	i = 1;
	len = strlen(job.com) + 1;
	while((i < argc) && ((len = len + strlen(argv[i]) + 1) <= COMLEN)){
		(void)strcat(job.com, " ");
		(void)strcat(job.com, argv[i]);
		i++;
	}

	/*
	 * put the path name of the servers datagram in the sockaddr struct
	 */
	(void)strcpy(name.sun_path, MSGPATH);
	len = strlen(name.sun_path) + 1 + sizeof(name.sun_family);

	/*
	 * time stamp the datagram, and place my pid in it (identfies the name
	 * of this clients bound socket
	 */
	if (gettimeofday(&now, &zone) < 0)
		run(argv, envp);
	job.time = now.tv_sec;
	job.type = QCMD;
	job.uid = (u_long)uid;

	/*
	 * send the request to the server
	 */
	if (sendto(msgsock, &job, sizeof(struct request), 0, &name, len) < 0){
		(void)sigsetmask(sigmask);
		run(argv, envp);
	}

	descsize = getdtablesize();
	msgmask = 1 << msgsock;
	pollcount = 0;

	/*
	 * unblock the signals
	 */
	 (void)sigsetmask(sigmask);

#ifndef QUIET
	/*
	 * set announce to 0 to get at least one status message
	 * if user has ENVNAME = quiet in his environment, no messages!
	 */
	if (((eptr=getenv(ENVNAME))==(char *)0)||(strcmp(eptr,"quiet")!=0))
		announce = 0;
	else
		announce = -1;
#endif QUIET

	/*
	 * wait for the word from the server!
	 */
	for(;;){
		readfds = msgmask;
		numfds = select(descsize,&readfds,(int *)0,(int *)0,&polltime);

		/*
		 * if there is a screwup here just run
		 */
		if (((numfds<0)&&(errno!=EINTR))||((numfds<=0)&&(pollcount>MAXPOLLS))){
#ifndef QUIET
			if (announce == 1)
				fprintf(stderr,"running\n");
#endif QUIET
			run(argv, envp);
		}
		
		/*
		 * we waitied polltime seconds and no word from the server
		 * so send the datagram again in case the system lost it
		 * OR else we got a garbage answer!
		 */
		if ((numfds<=0)||(recvfrom(msgsock,&msg,sizeof(msg),0,(struct sockaddr *)0,&fromlen)<=0)){
			pollcount++;
			/*
			 * oh server where are you?
			 */
			if (sendto(msgsock,&job,sizeof(struct request),0,&name,len)<0){
#ifndef QUIET
				if (announce == 1)
					fprintf(stderr,"running\n");
#endif QUIET
				run(argv, envp);
			}else{
				/*
				 * the datagram was sent so switch to WAITTIME
				 * for an answer. WAITTIME is much shorter
				 * than QUEUETIME as we want to be queued.
				 */
				polltime.tv_sec = WAITTIME;
				continue;
			}
		}

		/*
		 * we got the word see what to do
		 */
		switch(msg){
			case RUNCMD:
				/*
				 * we can run
				 */
#ifndef QUIET
			 	if (announce == 1)
					fprintf(stderr,"running\n");
#endif QUIET
				run(argv, envp);
				break;
			case STOPCMD:
				/*
				 * bye bye
				 */
#ifndef QUIET
			 	if (announce == 1)
					fprintf(stderr,"stopped\n");
#endif QUIET
				(void)close(msgsock);
				(void)unlink(clientpath);
				exit(1);
			case QCMD:
				/*
				 * we have been queued 
				 * switch to QUEUETIME so to check later
				 * that the server is still around (so we
				 * don't wait forever!
				 * Switch to POLLCMD so that the server knows
				 * we have been in the queue at least once.
				 */
				polltime.tv_sec = QUEUETIME;
				queued = 1;
				pollcount = 0;
				job.type = POLLCMD;
#ifndef QUIET
				/*
				 * tell user he is being queued
				 */
				if (announce == 0){
					fprintf(stderr,"Queued, waiting to run....");
					announce = 1;
				}
#endif QUIET
				break;
			case FULLQUEUE:
				/*
				 * The queue is full, this job cannot be
				 * accepted. This prevents the system
				 * from running out of slots in the
				 * process table.
				 */
			 	fprintf(stderr,"Cannot run, the system is overloaded. Try again later.\n");
				(void)close(msgsock);
				(void)unlink(clientpath);
				exit(1);
			case POLLCMD:
				/*
				 * server wants the data again
				 * The only way we can get this is from
				 * a new server during startup.
				 * So reset the datagram to a QCMD.
				 * (fall through to default below),
				 */
				job.type = QCMD;
			default:
				/*
				 * or got garbage
				 */
				if (sendto(msgsock,&job,sizeof(struct request),0,&name,len)<0){
#ifndef QUIET
			 		if (announce == 1)
						fprintf(stderr,"running\n");
#endif QUIET
					run(argv, envp);
				}

				/*
				 * switch back to WAITTIME to be a pest until
				 * we get queued
				 */
				polltime.tv_sec = WAITTIME;
				queued = 0;
				pollcount = 0;
				break;
		}
	}
}

/*-----------------------------------------------------------------------------
 * onint
 *
 * what to do when the user wants out
 *-----------------------------------------------------------------------------
 */
onint()
{

	/*
	 * if we are already queued say goodbye to the server
	 */
	if (queued == 1){
		/*
		 * Send a message to the server we are quitting so the server
		 * can remove us from the queue.
		 */
		job.type = PJOBCMD;
		job.time = job.pid;
		(void)sendto(msgsock,&job,sizeof(struct request),0,&name,len);
	}
	(void)close(msgsock);
	(void)unlink(clientpath);
	exit(0);
}

/*-----------------------------------------------------------------------------
 * run
 *
 * routine that execs the real program after getting the ok 
 *-----------------------------------------------------------------------------
 */

run(argv, envp)
char **argv, **envp;
{
	extern int msgsock;
	extern char binary[];
	extern char clientpath[];

	/*
	 * shut down the socket and remove it from the spool
	 */
	if (msgsock != -1)
		(void)close(msgsock);
	(void)unlink(clientpath);

	/*
	 * all is set try to run!
	 * this works because the directory where the REAL code file is
	 * is mode 0730 and the group MUST BE LDDGID!
	 * and we are now running with an effective gid of LDDGID
	 */
	(void)execve(binary, argv, envp);

	/*
	 * from now on something screwed up! print the error and
	 * hope the user reports it!
	 */
	perror(binary);
	exit(0);
}
@//E*O*F client/main.c//
chmod u=r,g=r,o=r client/main.c
 
echo x - scripts/addldd
sed 's/^@//' > "scripts/addldd" <<'@//E*O*F scripts/addldd//'
#! /bin/csh -f
#
# script to add file1 through filen located in directory to the processes
# controlled by the ldd system.
#
# THIS IS FOR COMMANDS THAT SHOULD HAVE STATUS ANNOUNCEMENTS

umask 022
if ($#argv < 2) then
	echo "usage: addldd directory file1 file2 .. filen"
else
	echo "cd $1"
	cd $1
	shift
	while($#argv)
		if (-e .client == 0) then
			echo "there is no .client front end. Do a make install."
			break
		else if (-e .code/$1) then
			echo "$1 is already load controlled"
		else if (-e $1) then
			echo "putting $1 under load control"
			echo "mv $1 .code/$1"
			/bin/mv $1 .code/$1
			echo "ln -s .client $1"
			/bin/ln -s .client $1
		else
			echo "$1 does not exsist"
		endif
		shift
	end
endif
@//E*O*F scripts/addldd//
chmod u=rx,g=rx,o=rx scripts/addldd
 
echo x - scripts/makedirs
sed 's/^@//' > "scripts/makedirs" <<'@//E*O*F scripts/makedirs//'
#! /bin/csh -f
# make all the directories needed by the load control system
# NOTE: lddgrp must be defined as the same group id as in h/client.h
#
echo "making the directories for load control"

foreach i (/bin /usr/bin /usr/local /usr/ucb /usr/new /usr/games)
	if (-e $i && -e $i/.code == 0) then
		echo "mkdir $i/.code"
		/bin/mkdir $i/.code
	endif
	if (-e $i/.code) then
		echo "chown root $i/.code"
		/etc/chown root $i/.code
		echo "chgrp lddgrp $i/.code"
		/bin/chgrp lddgrp $i/.code
		echo "chmod 0710 $i/.code"
		/bin/chmod 0710 $i/.code
	endif
end
#
if (-e /usr/spool/ldd == 0) then
	echo "mkdir /usr/spool/ldd"
	/bin/mkdir /usr/spool/ldd
endif
if (-e /usr/spool/ldd/sr == 0) then
	echo "mkdir /usr/spool/ldd/sr"
	/bin/mkdir /usr/spool/ldd/sr
endif
if (-e /usr/spool/ldd/cl == 0) then
	echo "mkdir /usr/spool/ldd/cl"
	/bin/mkdir /usr/spool/ldd/cl
endif
#
# spool/ldd/cl MUST BE GROUP writeable
# all others should NOT be GROUP writeable
#
echo "chown root /usr/spool/ldd"
/etc/chown root /usr/spool/ldd

echo "chgrp lddgrp /usr/spool/ldd"
/bin/chgrp lddgrp /usr/spool/ldd

echo "chmod 0710 /usr/spool/ldd"
/bin/chmod 0710 /usr/spool/ldd

echo "chown root /usr/spool/ldd/cl"
/etc/chown root /usr/spool/ldd/cl

echo "chgrp lddgrp /usr/spool/ldd/cl"
/bin/chgrp lddgrp /usr/spool/ldd/cl

echo "chmod 0730 /usr/spool/ldd/cl"
/bin/chmod 0730 /usr/spool/ldd/cl

echo "chown root /usr/spool/ldd/sr"
/etc/chown root /usr/spool/ldd/sr

echo "chgrp lddgrp /usr/spool/ldd/sr"
/bin/chgrp lddgrp /usr/spool/ldd/sr

echo "chmod 0710 /usr/spool/ldd/sr"
/bin/chmod 0710 /usr/spool/ldd/sr
@//E*O*F scripts/makedirs//
chmod u=rx,g=rx,o=rx scripts/makedirs
 
echo x - scripts/qaddldd
sed 's/^@//' > "scripts/qaddldd" <<'@//E*O*F scripts/qaddldd//'
#! /bin/csh -f
#
# script to add file1 through filen located in directory to the processes
# controlled by the ldd system.
#
# THIS IS FOR COMMANDS THAT *DO* *NOT* HAVE STATUS ANNOUNCEMENTS

umask 022
if ($#argv < 2) then
	echo "usage: qaddldd directory file1 file2 .. filen"
else
	echo "cd $1"
	cd $1
	shift
	while($#argv)
		if (-e .qclient == 0) then
			echo "there is no .qclient front end. Do a make install."
			break
		else if (-e .code/$1) then
			echo "$1 is already load controlled"
		else if (-e $1) then
			echo "putting $1 under load control (quiet mode)"
			echo "mv $1 .code/$1"
			/bin/mv $1 .code/$1
			echo "ln -s .qclient $1"
			/bin/ln -s .qclient $1
		else
			echo "$1 does not exsist"
		endif
		shift
	end
endif
@//E*O*F scripts/qaddldd//
chmod u=rx,g=rx,o=rx scripts/qaddldd
 
echo x - scripts/rmldd
sed 's/^@//' > "scripts/rmldd" <<'@//E*O*F scripts/rmldd//'
#! /bin/csh -f
#
# script to remove process file1 through filen in directory from the
# load control system. 

umask 022
if ($#argv < 2) then
	echo "usage: rmldd directory file1 file2 .. filen"
else
	echo "cd $1"
	cd $1
	shift
	while($#argv)
		if (-e .code/$1) then
			echo "removing $1 from load control"
			echo "rm $1"
			/bin/rm $1
			echo "mv .code/$1 $1"
			/bin/mv .code/$1 $1
		else
			echo "$1 is not load controlled"
		endif
		shift
	end
endif
@//E*O*F scripts/rmldd//
chmod u=rx,g=rx,o=rx scripts/rmldd
 
echo x - scripts/saddldd
sed 's/^@//' > "scripts/saddldd" <<'@//E*O*F scripts/saddldd//'
#! /bin/csh -f
#
# script to add file file1 as a SPECIAL client in directory directory.
#
# THIS IS FOR COMMANDS THAT REQUIRE SPECIAL PRIVATE CLIENTS

umask 022
if ($#argv != 2) then
	echo "usage: saddldd directory file"
else
	echo "cd $1"
	cd $1
	shift
	if (-e .$1client == 0) then
		echo "there is no .$1client front end. Do a make install."
		break
	else if (-e .code/$1) then
		echo "$1 is already load controlled"
	else if (-e $1) then
		echo "putting $1 under load control (special client)"
		echo "mv $1 .code/$1"
		/bin/mv $1 .code/$1
		echo "ln -s .$1client $1"
		/bin/ln -s .$1client $1
	else
		echo "$1 does not exsist"
	endif
endif
@//E*O*F scripts/saddldd//
chmod u=rx,g=rx,o=rx scripts/saddldd
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     154     491    4545 Makefile
     471    1929   12916 main.c
      32     123     711 addldd
      63     202    1580 makedirs
      32     126     733 qaddldd
      25      76     454 rmldd
      28     113     644 saddldd
     805    3060   21583 total
!!!
wc  client/Makefile client/main.c scripts/addldd scripts/makedirs scripts/qaddldd scripts/rmldd scripts/saddldd | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

muller@sdcc3.UUCP (Keith Muller) (02/21/85)

This is part 1 of the load control system. This part MUST be unpacked
BEFORE any other part.


# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by sdcc3!muller on Sat Feb  9 13:40:15 PST 1985
# Contents:  client/ control/ h/ scripts/ server/ man/ README NOTICE Makefile
#	man/Makefile man/ldc.8 man/ldd.8 man/ldq.1 man/ldrm.1
 
echo x - README
sed 's/^@//' > "README" <<'@//E*O*F README//'
TO INSTALL: (you MUST be root) (January 24, 1985 version)

1) Select a group id for load control system to use. No user should be in this
   group. Add this group to /etc/groups and call it lddgrp.
   ** By default the group id 25 is used. **

2) Look at the file h/common.h. Make sure that LDDGID is defined to be the
   same group id as you selected in step 1.

3) cd to the scripts directory. Inspect the paths used in the file makedirs.
   The script makedirs creates the required directories with the proper modes
   groups and owners. The .code directories are where the real executable
   files are hidden, protected by group access (the directory is protected
   from all "other" access). Each directory which contains programs that you
   want load controlled must have a .code subdirectory.

   NOTE: You really do not have to change makedirs at all except to ADD
   any additional directories you want controlled. It is perfectly safe to
   just run this system on any 4.2 system without ANY path changes (this
   includes sun, vax and pyramid versions).

4) If you alter or add any pathnames in makedirs, you might have to adjust
   the makefiles. For each subdirectory (client, server, control) adjust
   or add the paths in the Makefiles. 

5) If you alter any pathname in makedirs you will have to check all the h
   files in the directory h. Change any paths as required. 

6) run makedirs (if you have an older release of ldd: You should shut down
   the ldd server and remove the old status and errlog file. Then run 
   makedirs.) Makedirs can be run any number of times without harm. It will
   reset the owners and groups of all directories to the correct state.

7) In the top level directory (The same directory as this README file is in),
   run make. then make install. All the binaries are now in place.

8) Start the ldd server:
	/etc/ldd [-T cycle] [-L load]

   The server will detach itself and wait for requests. You should get no
   messages from the server. The two flags are optional. The -L flag
   specifies the number of seconds between each load average check. The
   -L flag specifies the load average queueing starts. If neither are
   specified the defaults are used. (see the manual page for ldd). You
   can change the defaults by editing h/server.h. ALRMTIME is the cycle
   time, and MAXLOAD is the load average.

   The following are good values to start with:

   machine		cycle 			load
   ----------------------------------------------------------
   pyramid 90x		25			10.0
   pyramid 90mx		15			15.0
   vax 780		50			9.0
   vax 750		60			7.5
   vax 730		60			6.0
   sun 2		60			6.5

9) add the following lines to /etc/rc.local (change path and add any ldd
   arguements as selected from the above table). See the man page on ldd
   for more info.

if [ -f /etc/ldd ]; then
	/bin/rm -f /usr/spool/ldd/sr/errors
	/etc/ldd & echo -n ' ldd'			>/dev/console
fi

10) for each directory to be controlled select those programs you want under
    the load control system. The programs you select should be jobs that 
    usually do not require user interaction, though nasty systems like macsyma
    might be load controlled anyway. Never load control things that have time
    response requirements. The jobs you select will determine the overall
    usefullness of the load control system. For the load control system to
    be completely effective, all the programs that cause any significant load
    on the system should be placed under load control. For example the cc
    command is a very typical of a program that should be load controlled.
    When run, cc uses large amount of resources which increases as the size
    of the program being compiled increases. When there are many cc's running
    simultaneously the machine gets quite overloaded and your system thrashes.
    A poor choice would be a command like cat. Sure cat can do a lot of i/o,
    but even ten cat's reading very large files do not impact the system
    very much. Troff is a very good command to load control. It is not very 
    interactive, and a lot of them running would bring even slow a cray.
    Watching your system when it is overoaded with ps au should tell you which
    programs on your system need to be load controlled.

    The following is a list of programs I have under load control:

    /bin/cc /bin/make /bin/passwd /usr/bin/pc /usr/bin/pix /usr/bin/liszt
    /usr/bin/lisp /usr/bin/vgrind /usr/ucb/f77 /usr/ucb/lint /usr/ucb/nroff
    /usr/ucb/spell /usr/ucb/troff /usr/ucb/yacc

    The following is the list of places to look for other candidates for load
    control:
	a) /bin
	b) /usr/bin
	c) /usr/ucb
	d) /usr/new
   	e) /usr/local
	f) /usr/games

    i)  some programs use argv[0] to pass data (so far only the ucb pi
	does this when called by pix). These programs must be treated
	differently (since they mangle argv[0], it cannot be used to
	determine which binary to execute). A special client called
	.NAMEclient where NAME is the actual name of the program must be
	created. These special programs must be specified in the 
	client/Makefile.  See the sample for $(SPEC1) which is for a program
	called test in /tmp. Run the script onetime/saddldd for these programs.

    ii) run the script scripts/addldd with each program to be load controlled
	that requires a STATUS MESSAGE ("Queued waiting to run.") as an
	arguement (i.e. addldd /bin cc make)

    iii)run the script scripts/qaddldd with each program to be load controlled
	that DOES NOT require a STATUS MESSAGE as an arguement
	(i.e. qaddldd /usr/bin nroff)

    addldd/qaddldd/saddldd moves the real binary into the .code file and
    replaces it with a "symbolic link" to either .client (for addldd and
    qaddldd) or a .NAMEclient (for saddldd) So the command:
	addldd /bin cc
    moves cc to /bin/.code/cc and creates the symbolic link /bin/cc
    to /bin/.client.

11) any changes to any file in the load control system from now on
    will be correctly handled by a make install from the top level directory.

12) the script script/rmldd can be used to remove programs from the ldd system.

13) Compilers like cc and pc should have all the intermediate passes protected.
    Each pass must be in group lddgrp and have the others access turned off
    For example:
	chmod 0750 /lib/c2
	chgrp lddgrp /lib/c2

14) When the system is running you might have to adjust the operating 
    parameters of ldd for the job mix and the capacity of your machine.
    Use ldc to adjust these parameters while the load control system is
    running and watch what happens. The .h files as supplied use values that
    will safely work on any machine, but might not be best values for your
    specific needs. In the vast majority of cases, only the load point
    and cycle time need to be changed and these can be set with arguements to
    ldd when it is first invoked.  Be careful as radical changes to
    the defaults might make defeat the purpose of ldd. If things ever get
    really screwed up, you can just kill -9 the server (or from ldc: abort
    server) and things will run just like the load control doesn't exsist.
    (Note the pid of the currently running ldd is always stored in the lock
    file "spool/ldd/sr/lock"). (See the man page on ldd for more).

15) If load control does not stop the system load to no more than the load
    limit + 2.5 then there are programs that are loading down the machine
    which are not under load control. Find out what they are and load control
    them. 

16) To increase the response of the system you can lower the load threshold.
    Of course if the threshold gets too low the system can end up with long
    wait times for running. Long wait times are usually around 3000 seconds
    for super loaded vaxes. On the very fast pyramids, 500 seconds (48 users
    and as many large cc as the students can get running) seems the longest
    delay I have seen. You can also play with the times between checks. This
    has some effect on vaxes but 50 - 60 seconds seems optimal. On pyramids
    it is quite different. Since the throughput is so very much greater
    than vaxes (four times greater at the very least), the load needs to be
    checked at least every 25 seconds. If this check time is too long you
    risk having the machine go idle for a number of seconds. Since the whole
    point is to squeeze out every last cpu cycle out of the machine, idle
    time must be avoided. Watching the machine with vmstat or the mon program
    is useful for this. Try to keep the user percentage of the cpu as high
    as possible. Try to have enough jobs runnable so the machine doesn't
    go idle do to a lack of jobs (yes this can happen with lots of disk io).

17) If you want/need more info on the inner workings of the ldd system, you
    can read the comments in the .h files and the source files. If you have
    problems drop me a line. I will be happy to answer any questions.

    Keith Muller
    University of California, San Diego
    Mail Code C-010
    La Jolla, CA  92093
    ucbvax!sdcsvax!muller
    (619) 452-6090
@//E*O*F README//
chmod u=r,g=r,o=r README
 
echo x - NOTICE
sed 's/^@//' > "NOTICE" <<'@//E*O*F NOTICE//'
DISCLAIMER
  "Although each program has been tested by its author, no warranty,
  express or implied, is made by the author as to the accuracy and
  functioning of the program and related program material, nor shall
  the fact of distribution constitute any such warranty, and no
  responsibility is assumed by the author in connection herewith."
  
  This program cannot be sold, distributed or copied for profit, without
  prior permission from the author. You are free to use it as long the
  author is properly credited with it's design and implementation.

  Keith Muller
  Janaury 15, 1985 
  San Diego, CA
@//E*O*F NOTICE//
chmod u=r,g=r,o=r NOTICE
 
echo x - Makefile
sed 's/^@//' > "Makefile" <<'@//E*O*F Makefile//'
#
#	Makefile for ldd server and client 
#
#

all:
	cd server; make ${MFLAGS}
	cd client;  make ${MFLAGS}
	cd control;  make ${MFLAGS}

lint: 
	cd server; make ${MFLAGS} lint
	cd client;  make ${MFLAGS} lint
	cd control;  make ${MFLAGS} lint

install: 
	cd server; make ${MFLAGS} install
	cd client;  make ${MFLAGS} install
	cd control;  make ${MFLAGS} install
	cd man; make ${MFLAGS} install

clean:
	cd server; make ${MFLAGS} clean
	cd client;  make ${MFLAGS} clean
	cd control;  make ${MFLAGS} clean
@//E*O*F Makefile//
chmod u=r,g=r,o=r Makefile
 
echo mkdir - client
mkdir client
chmod u=rwx,g=rx,o=rx client
 
echo mkdir - control
mkdir control
chmod u=rwx,g=rx,o=rx control
 
echo mkdir - h
mkdir h
chmod u=rwx,g=rx,o=rx h
 
echo mkdir - scripts
mkdir scripts
chmod u=rwx,g=rx,o=rx scripts
 
echo mkdir - server
mkdir server
chmod u=rwx,g=rx,o=rx server
 
echo mkdir - man
mkdir man
chmod u=rwx,g=rx,o=rx man
 
echo x - man/Makefile
sed 's/^@//' > "man/Makefile" <<'@//E*O*F man/Makefile//'

#
# Makefile for ldd manual pages
#

DEST=	/usr/man

TARG=	$(DEST)/man8/ldd.8 $(DEST)/man8/ldc.8 $(DEST)/man1/ldrm.1 \
	$(DEST)/man1/ldq.1

all:

install: $(TARG)

$(DEST)/man8/ldd.8: ldd.8
	install -c -o root ldd.8 $(DEST)/man8

$(DEST)/man8/ldc.8: ldc.8
	install -c -o root ldc.8 $(DEST)/man8

$(DEST)/man1/ldrm.1: ldrm.1
	install -c -o root ldrm.1 $(DEST)/man1

$(DEST)/man1/ldq.1: ldq.1
	install -c -o root ldq.1 $(DEST)/man1

clean:
@//E*O*F man/Makefile//
chmod u=r,g=r,o=r man/Makefile
 
echo x - man/ldc.8
sed 's/^@//' > "man/ldc.8" <<'@//E*O*F man/ldc.8//'
@.TH LDC 8 "24 January 1985"
@.UC 4
@.ad
@.SH NAME
ldc \- load system control program
@.SH SYNOPSIS
@.B /etc/ldc
[ command [ argument ... ] ]
@.SH DESCRIPTION
@.I Ldc
is used by the system administrator to control the
operation of the load control system, by sending commands to
@.I ldd
(the load control server daemon).
@.I Ldc
may be used to:
@.IP \(bu
list all the queued jobs owned by a single user,
@.IP \(bu
list all the jobs in the queue,
@.IP \(bu
list the current settings of changeable load control server parameters,
@.IP \(bu
abort the load control server,
@.IP \(bu
delete a job from the queue (specified by pid or by user name),
@.IP \(bu
purge the queue of all jobs,
@.IP \(bu
rearrange the order of queued jobs,
@.IP \(bu
run a job regardless of the system load (specified by pid or user name),
@.IP \(bu
change the load average at which jobs will be queued,
@.IP \(bu
change the limit on the number of jobs in queue,
@.IP \(bu
change the number of seconds between each check on the load average,
@.IP \(bu
print the contents of the servers error logging file,
@.IP \(bu
change the maximum time limit that a job can be queued.
@.PP
Without any arguments,
@.I ldc
will prompt for commands from the standard input.
If arguments are supplied,
@.IR ldc
interprets the first argument as a command and the remaining
arguments as parameters to the command.  The standard input
may be redirected causing
@.I ldc
to read commands from a file.
Commands may be abbreviated, as any unique prefix of a command will be
accepted.
The following is the list of recognized commands.
@.TP
? [ command ... ]
@.TP
help [ command ... ]
@.br
Print a short description of each command specified in the argument list,
or, if no arguments are given, a list of the recognized commands.
@.TP
abort server
@.br
Terminate the load control server.
This does 
@.I not
terminate currently queued jobs, which will run when they
next poll the server (usually every 10 minutes).
If the server is restarted these jobs will be inserted into the queue ordered
by the time at which the job was started.
Jobs will 
@.I not
be lost by aborting the server.
Both words "abort server" must by typed (or a unique prefix) as a safety
measure.
Only root can execute this command.
@.TP
delete [\f2pids\f1] [-u \f2users\f1]
@.br
This command has two modes. It will delete jobs listed by pid, or with the
@.B \-u
option delete all the jobs owned by the listed users.
Job that are removed from the queue will exit returning status 1 (they
do not run).
Users can only delete jobs they own from the queue, while root can delete any
job.
@.TP
errors
@.br
Print the contents of the load control server error logging file.
@.TP
list [\f2user\f1]
@.br
This will list the contents of the queue, showing each jobs rank, pid,
owner, time in queue, and a abbreviated line of the command to be executed
for the specified user. If no user is specifies, it defaults to be the
user running the command. (Same as the ldq command).
@.TP
loadlimit \f2value\f1
@.br
Changes the load average at which the load control system begins
to queue jobs to \f2value\f1.
Only root can execute this command.
@.TP
longlist
@.br
Same as list except prints ALL the jobs in the queue. This is expensive to
execute. (Same as the ldq -a command).
@.TP
move \f2pid rank\f1
@.br
Moves the process specified by process id 
@.I pid
to position 
@.I rank
in the queue.
Only root can execute this command.
@.TP
purge all
@.br
Removes ALL the jobs from the queue. Removed jobs terminate returning a
status of 1.
As a safety measure both the words "purge all" (or a prefix of) must be typed.
Only root can execute this command.
@.TP
quit
@.br
Exit from ldc.
@.TP
run [\f2pids\f1] [-u \f2users\f1]
@.br
Forces the jobs with the listed 
@.I pids
to be run 
@.I regardless 
of the system load.
The
@.B \-u
option forces all jobs owned by the listed users to be run regardless
of the system load.
Only root can execute this command.
@.TP
sizeset \f2size\f1
@.br
Sets the limit on the number of jobs that can be in the queue to be
@.I size.
This prevents the unix system process table from running out of slots if
the system is extremely overloaded. All job requests that are made while
the queue is at the limit are rejected and told to try again later.
The default value is 150 jobs.
Only root can execute this command.
@.TP
status
@.br
Prints the current settings of internal load control server variables.
This includes the number of jobs in queue, the load average above which
jobs are queued, the limit on the size of the queue, the time in seconds between
load average checks by the server, the maximum time in seconds a job can be
queued, and the number of recoverable errors detected by the server.
@.TP
timerset \f2time\f1
@.br
Sets the number of seconds that the server waits between system load average
checks to
@.I time.
(Every 
@.I time
seconds the server reads the current load average and if it below the load
average limit (see 
@.I loadlimit
) the jobs are removed from the front of the queue and told to run).
Only root can execute this command.
@.TP
waitset \f2time\f1
@.br
Sets the maximum number of seconds that a job can be queued regardless
of the system load to 
@.I time
seconds.
This will prevent the load control system from backing up with jobs that never
run do to some kind of degenerate condition.
@.SH EXAMPLES
To list the jobs owned by user joe:
@.sp
list joe
@.sp
To move process 45 to position 6 in the queue:
@.sp
move 45 6
@.sp
To delete all the jobs owned by users sam and joe:
@.sp
delete -u sam joe
@.sp
To run jobs with pids 1121, 1177, and 43:
@.sp
run 1121 1177 43
@.SH FILES
@.nf
/usr/spool/ldd/*	spool directory where sockets are bound
@.fi
@.SH "SEE ALSO"
ldd(8),
ldrm(1),
ldq(1)
@.SH DIAGNOSTICS
@.nf
@.ta \w'?Ambiguous command      'u
?Ambiguous command	abbreviation matches more than one command
?Invalid command	no match was found
?Privileged command	command can be executed by only by root
@.fi
@//E*O*F man/ldc.8//
chmod u=r,g=r,o=r man/ldc.8
 
echo x - man/ldd.8
sed 's/^@//' > "man/ldd.8" <<'@//E*O*F man/ldd.8//'
@.TH LDD 8 "24 January 1985"
@.UC 4
@.ad
@.SH NAME
ldd \- load system server (daemon)
@.SH SYNOPSIS
@.B /etc/ldd
[ 
@.B \-L 
@.I load
] [ 
@.B \-T
@.I alarm 
]
@.SH DESCRIPTION
@.TP
@.B \-L
changes the load average threshold to
@.I load
instead of the default (usually 10).
@.TP
@.B \-T
changes the time (in seconds) 
between load average checks to 
@.I alarm
seconds instead of the default (usually 60 seconds).
@.PP
@.I Ldd
is the load control server (daemon) and is normally invoked
at boot time from the
@.IR rc.local (8)
file.
The
@.I ldd
server attempts to maintain the system load average
below a preset value so interactive programs like
@.IR vi (1)
remain responsive.
@.I Ldd
works by preventing the system from thrashing
(i.e. excessive paging and high rates of context switches) and decreasing the
systems throughput by limiting the number runnable processes in the system
at a given moment.
When the system load average 
is above the threshold,
@.I ldd
will block specific cpu intensive processes from running and place
them in a queue.
These blocked jobs are not runnable and therefore do not 
contribute to the system load. When the load average drops below the threshold,
@.I ldd
will remove jobs from the queue and allow them to continue execution.
The system administration determines which programs are 
considered cpu intensive and places control of their execution under the
@.I ldd
server.
The system load average is the number of runnable processes,
and is measured by the 1 minute 
@.IR uptime (1)
statistics.
@.PP
A front end client program replaces each program controlled by the
@.I ldd
server.
Each time a user requests execution of a controlled program, the
client enters the request state,
sends a "request to run" datagram to the server and waits for a response. The
waiting client is blocked, waiting for the response from the
@.I ldd
server.
If the client does not receive an answer to a request after a certain
period of time has elapsed (usually 90 seconds), the request is resent.
If the request is resent a number of times (usually 3) 
without response from the server, the requested program is executed. 
This prevents the process from being blocked forever if the
@.I ldd
server fails.
@.PP
The
@.I ldd
server can send one of five different messages to the client.
A "queued message" indicates that the client has
been entered into the queue and should wait.
A "poll message" indicates that the server did not receive a message,
so the client should resend the message.
A "terminate message" indicates that the request cannot be honored
and the client should exit abnormally.
A "run message" indicates the requested program should be run.
A "full message" indicates that the ldd queue is full and this request cannot
be accepted. This limit is to prevent the Unix kernel process table from
running out of slots, since queued processes 
still use system process slots.
@.PP
When the server receives a "request to run",
it determines whether the job should run immediately, be rejected, 
or be queued.
If the queue is full, the job is rejected and the client exits.
If the queue is not empty, the request is added to the queue,
and the client is sent a "queued message" 
The client then enters the queued state
and waits for another command from the server.
If no further commands are received from the server after a preset time 
has elapsed (usually 10 minutes),
the client re-enters the request state and resends the request
to the server to ensure that the server has not terminated or
failed since the time the client was queued.
@.PP
If the queue is empty, the server checks the current load average, and
if it is below the threshold, the client is sent a "run message".
Otherwise the server queues the request, sends the client a "queued message",
and starts the interval timer.
The interval timer is bound to a handler that checks the system load every
few seconds (usually 60 seconds). 
If the handler finds the current load average is below the threshold,
jobs are removed from the head of the queue and sent a "run message".
The number of jobs sent "run messages" depends on how much the current 
load average has dropped below the limit.
If the load average is above the threshold, the handler checks
how long the oldest process has been waiting to run.
If that time is greater than a preset limit (usually 4 hours), the job is 
removed from the queue and allowed to run regardless of the load.
This prevents jobs from being blocked forever due to load averages that
remain above the threshold for long periods of time.
If the queue becomes empty, the handler will shut off the interval timer. 
@.PP
The
@.I ldd
server logs all recoverable and unrecoverable errors in a logfile. Advisory
locks are used to prevent more than one executing server at a time.
When the
@.I ldd
server first begins execution, it scans the spool directory for clients that
might have been queued from a previous
@.I ldd
server and sends them a "poll request". 
Waiting clients will resend their "request to run" message to the new
server, and re-enter the request state.
The
@.I ldd
server will rebuild the queue of waiting tasks 
ordered by the time each client began execution.
This allows the
@.I ldd
server to be terminated and be re-started without
loss or blockage of any waiting clients.
@.PP
The environment variable LOAD can be set to "quiet", which will
surpress the output to stderr of the status strings "queued" 
and "running" for commands which have been set up to display status.
@.PP
Commands can be sent to the server with the
@.IR ldc (8)
control program. These commands can manipulate the queue and change the
values of the various preset limits used by the server.
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/sr/msgsock           'u
/usr/spool/ldd	ldd spool directory
/usr/spool/ldd/sr/msgsock	name of server datagram socket
/usr/spool/ldd/sr/cnsock	name do server socket or control messages
/usr/spool/ldd/sr/list		list of queued jobs (not always up to date)
/usr/spool/ldd/sr/lock	lock file (contains pid of server)
/usr/spool/ldd/sr/errors	log file of server errors
@.fi
@.SH "SEE ALSO"
ldc(8),
ldq(1),
ldrm(1).
@//E*O*F man/ldd.8//
chmod u=r,g=r,o=r man/ldd.8
 
echo x - man/ldq.1
sed 's/^@//' > "man/ldq.1" <<'@//E*O*F man/ldq.1//'
@.TH LDQ 1 "24 January 1985"
@.UC 4
@.SH NAME
ldq \- load system queue listing program
@.SH SYNOPSIS
@.B ldq
[
@.I user
] [
@.B \-a
]
@.SH DESCRIPTION
@.I Ldq
is used to print the contents of the queue maintained by the
@.IR ldd (8)
server.
For each job selected by
@.I ldq
to be printed, the rank (position) in the queue, the process id, the owner of
the job, the number of seconds the job has been waiting to run, and the
command line of the job (truncated in length to the first 16 characters)
is printed.
@.PP
With no arguments,
@.I ldq
will print out the status of the jobs in the queue owned by the user running
@.I ldq.
Another users jobs can be printed if that user is specified as an argument
to
@.I ldq.
The
@.B \-a
option will print all the jobs in the queue.
Of course the
@.B \-a
option is much more expensive to run.
@.PP
Users can delete any job they own by using either the
@.IR ldrm (1)
or
@.IR ldc (8)
commands.
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/cl/*            'u
/usr/spool/ldd/cl/*	the spool area where sockets are bound
@.fi
@.SH "SEE ALSO"
ldrm(1),
ldc(8),
ldd(8)
@.SH DIAGNOSTICS
This command will fail if the
@.I ldd
server is not executing.
@//E*O*F man/ldq.1//
chmod u=r,g=r,o=r man/ldq.1
 
echo x - man/ldrm.1
sed 's/^@//' > "man/ldrm.1" <<'@//E*O*F man/ldrm.1//'
@.TH LDRM 1 "24 January 1985"
@.UC 4
@.SH NAME
ldrm \- remove jobs from the load system queue
@.SH SYNOPSIS
@.B ldrm
[
@.I pids
] [
@.B \-u
@.I users
]
@.SH DESCRIPTION
@.I Ldrm
will remove a job, or jobs, from the load control queue.
Since the server is protected, this and
@.IR ldc (8)
are the only ways users can remove jobs from the load control spool (other
than killing the waiting process directly).
When a job is removed, it will terminate returning status 1.
This method is preferred over sending a kill -KILL to the process as the
job will be removed from the queue, and will no longer appear in
lists produced by
@.IR ldq (1)
or
@.IR ldc (8).
@.PP
@.I Ldrm
can remove jobs specified either by pid or by user name.
With the
@.B \-u
flag,
@.I ldrm
expects a list of users who will have all their jobs removed from the
load control queue.
When given a list of pid's,
@.I ldrm
will remove those jobs from the queue.
A user can only remove jobs they own, while root can remove any job.
@.SH EXAMPLES
To remove the two jobs with pids 8144 and 47:
@.sp
ldrm 8144 47
@.sp
To remove all the jobs owned by the users joe and sam:
@.sp
ldrm -u joe sam
@.SH FILES
@.nf
@.ta \w'/usr/spool/ldd/cl/*   'u
/usr/spool/ldd/cl/*	directory where sockets are bound
@.fi
@.SH "SEE ALSO"
ldq(1),
ldc(8),
ldd(8)
@.SH DIAGNOSTICS
``Permission denied" if the user tries to remove files other than his
own.
@//E*O*F man/ldrm.1//
chmod u=r,g=r,o=r man/ldrm.1
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     182    1518    9101 README
      14      96     613 NOTICE
      25      76     502 Makefile
      27      52     439 Makefile
     215    1075    5877 ldc.8
     168    1045    6106 ldd.8
      55     221    1145 ldq.1
      59     261    1362 ldrm.1
     745    4344   25145 total
!!!
wc  README NOTICE Makefile man/Makefile man/ldc.8 man/ldd.8 man/ldq.1 man/ldrm.1 | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0