idallen@watmath.UUCP (10/08/84)
When our students goof an operating systems assignment, they sometimes create many cpu-grinder processes that pummel our poor VAX/UNIX into the ground and make it miserable for the rest of the students.

Any thoughts on implementing a CPU scheduler that assigns time slices on a per-UID basis, so that every process the student generates has to share the time slice given to that UID?  Thus the student may have one process or twenty; she/he has the same (small) impact on the other students.

My random thoughts:

Since the idea is to limit the time slice to one unit per logged-on user, using the UID isn't fair to those logged on twice.  Perhaps the controlling terminal would be a better way to determine the time-slice allocation.

Perhaps, most conveniently, this feature could be done with a system call that creates a CPU-process-group (CPUPG).  All members of the same CPUPG share the same CPU time slice.  A related system call turns automatic CPUPG creation on or off.  With it turned on, every new process gets its own unique CPUPG, and the net result is that every process is scheduled on its own, as under the present scheduler.  With automatic CPUPG creation disabled, every new process is added to the CPUPG of its parent and has to share the parent's time slice.  Perhaps set-uid processes should be exempt from this kind of inheritance.  Details to be worked out...
--
-IAN!  (Ian! D. Allen)   University of Waterloo
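A rough user-level sketch of the arithmetic behind the CPUPG proposal; the quantum, group ids, and process mix here are all invented for illustration, and this is a demonstration of the sharing rule, not a kernel implementation:

    #include <stdio.h>

    #define NGROUPS 16     /* arbitrary limit for this sketch     */
    #define QUANTUM 100    /* ticks per CPUPG per round (made up) */

    struct proc {
        int pid;
        int cpupg;         /* CPU-process-group id, one per user  */
    };

    int
    main(void)
    {
        /* Example mix: user A runs one editor, user B six grinders. */
        struct proc ptab[] = {
            { 101, 0 },
            { 201, 1 }, { 202, 1 }, { 203, 1 },
            { 204, 1 }, { 205, 1 }, { 206, 1 },
        };
        int nproc = sizeof ptab / sizeof ptab[0];
        int nmembers[NGROUPS] = { 0 };
        int i;

        for (i = 0; i < nproc; i++)
            nmembers[ptab[i].cpupg]++;

        /* A process's slice is its group's quantum split among the
         * group's members, so twenty grinders cost what one does. */
        for (i = 0; i < nproc; i++)
            printf("pid %d (cpupg %d) gets %d ticks\n", ptab[i].pid,
                ptab[i].cpupg, QUANTUM / nmembers[ptab[i].cpupg]);
        return 0;
    }

Under this mix the lone process in group 0 keeps its whole quantum while the six grinders in group 1 split theirs six ways, which is exactly the one-process-or-twenty property described above.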
sdyer@bbncca.ARPA (Steve Dyer) (10/08/84)
Actually, any operating systems course which runs on a general-purpose student timesharing system has little business having its students run multiple UNIX processes to emulate OS processes.  As you point out, a bug (or even the lack of bugs) will quickly degrade response for all the other users of the machine.  Rather than spending time hacking the UNIX scheduler, the time would be better spent building a baby-OS emulator which runs in a single UNIX process yet simulates multi-tasking.  Each student would link his own routines into the emulator, and a runaway student project would merely be a single runaway process, a situation more commonly seen and handled.
--
/Steve Dyer
{decvax,linus,ima,ihnp4}!bbncca!sdyer
sdyer@bbncca.ARPA
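One way such an emulator could be structured; the cooperative step-function design and all the names below are assumptions for illustration, not any particular course's emulator:

    #include <stdio.h>

    #define MAXTASK 8

    struct task {
        int alive;
        int state;                     /* private per-task state     */
        int (*step)(struct task *);   /* one time slice; 0 = exit   */
    };

    static struct task tasks[MAXTASK];

    /* Example "student job": counts down, one step per time slice. */
    static int
    counter(struct task *t)
    {
        printf("task %d: %d steps left\n", (int)(t - tasks), t->state);
        return --t->state > 0;
    }

    static void
    spawn(int (*step)(struct task *), int state)
    {
        int i;

        for (i = 0; i < MAXTASK; i++)
            if (!tasks[i].alive) {
                tasks[i].alive = 1;
                tasks[i].state = state;
                tasks[i].step = step;
                return;
            }
    }

    int
    main(void)
    {
        int i, running;

        spawn(counter, 3);
        spawn(counter, 5);

        /* Round-robin dispatcher: one "slice" per live task per pass. */
        do {
            running = 0;
            for (i = 0; i < MAXTASK; i++)
                if (tasks[i].alive) {
                    tasks[i].alive = tasks[i].step(&tasks[i]);
                    running += tasks[i].alive;
                }
        } while (running);
        return 0;
    }

Each student links his own step functions into the dispatcher in place of counter(); a buggy project then costs the timesharing system exactly one UNIX process, whatever the simulated task count.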
muller@sdcc3.UUCP (Keith Muller) (10/09/84)
The same problem exists here at UCSD, with large computer science classes beating the life out of any kind of machine thrown at them.  The biggest problem with running machines for student use has been that students are very competitive and think nothing of running as many compiles in the background as they feel they need.

The problem is that UNIX will attempt to time slice among all the runnable programs at any given moment.  As the number of runnable jobs increases, the amount of virtual memory in use increases, causing the system to page and then eventually swap (on 4.2 systems).  The more the system pages, the less of the cpu is available for running user processes.  This increasing rate of system overhead to support the larger number of runnable jobs slowly erodes the throughput of the system, to the point where no real work is being done other than supporting the context switches and their subsequent page faults.  The actual point at which the throughput begins to rapidly erode depends on the job mix at the time, but you can roughly relate it to the load average.

However, all is not as bad as it seems.  It turns out that you can group the job mix into two kinds of jobs: background jobs and interactive jobs.  I use the term background jobs for those jobs that do not require user input after the job has started; this means background jobs include compiles, nroff, etc.  Interactive jobs are things like vi.  If you look at how the two kinds of jobs affect the load on the machine, you quickly realize that the long-term load average is mainly due to the background jobs soaking up all the cpu cycles.  Interactive jobs like vi are for the most part (averaged over time) blocked on i/o.

What you want to do is limit the number of runnable background jobs at any given time, to prevent the virtual memory system from thrashing.  This also means that when an interactive job becomes runnable it will quickly get its required time slices.  This quick response to interactive jobs is really the single factor a user will use to classify a system as fast or slow, and it seems to far outweigh the effects of queueing background-class jobs.

The solution used on the UCSD Academic Computer Center machines (VAXes, Suns, and Pyramids, all running 4.2) is a mechanism called load control.  Load control consists of a server and client "front end" programs.  Each background-class job is replaced by a client front end, with the actual binary hidden in a place users are prevented from accessing.  When the user invokes a background-class job he is really running the front-end client program.  The client sends a run request packet to the server, and the server responds with either a run message or a queue message, depending on the current load situation: if the load average of the machine is above a certain point, the job is queued.  When the client gets a queue message, it blocks waiting for a run message.  By blocking it does not contribute to the load, so in effect the load on the machine is limited to approximately the threshold point set in the server for queueing jobs.  While the server is queueing, it checks the load at specific time intervals to see if the load has dropped below the threshold.  If it has, the server starts sending run commands to queued background jobs (the actual number of run commands sent depends on the current load average and the queueing threshold).  When the client gets a run message, it execs the protected binary it replaced (group access restrictions protect the actual binary files).

The load control mechanism is quite successful.  Before load control, the system's load would swing from long periods of very high load to periods where the machine went idle.  One machine, for example, had 52 users with 48 vi's and 25 C compiles running at once, for a load of 28, before load control.  After load control, with the same mix of jobs, the load was reduced to around 10 (the threshold for queueing).  The average time a job was queued was around 90 seconds (the machine was a small Pyramid 90x; a large 780 with the same job mix has an average queue time of around 400 seconds, so the Pyramid clearly has several times the throughput).

What load control effectively does is clip the high load peaks and fill in the low load "valleys" with the queued background-class jobs.  This has increased throughput tremendously and prevents the mess created by long periods of very high load averages.  The actual mechanism is much more sophisticated than described above, but the basic idea is the same.

The real advantage of this approach is that kernel-based approaches cannot easily distinguish between a vi and a compile, causing interactive jobs to become unusable.  If users cannot get their vi's to complete, they never get off the system (they never finish entering the file they want to compile, nroff, etc.).

Keith Muller
Academic Computer Center
University of California, San Diego
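A minimal sketch of what such a client front end might look like on a 4.2 system with UNIX-domain sockets; the socket path, the one-byte 'R'/'Q'/'G' protocol, and the hidden binary's location are invented here for illustration, not UCSD's actual implementation:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SERVER_SOCK "/etc/loadctl"           /* assumed rendezvous  */
    #define REAL_BINARY "/usr/local/hidden/cc"   /* assumed hidden path */

    int
    main(int argc, char **argv)
    {
        struct sockaddr_un addr;
        char reply;
        int s;

        (void)argc;

        if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
            perror("socket");
            exit(1);
        }
        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, SERVER_SOCK, sizeof addr.sun_path - 1);
        if (connect(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            exit(1);
        }

        /* Ask permission to run; 'R' is an invented protocol byte.  */
        write(s, "R", 1);

        /* The server answers 'G' (run) or 'Q' (queued).  On 'Q' the
         * client just stays blocked in read(), adding no load.      */
        for (;;) {
            if (read(s, &reply, 1) != 1) {
                fprintf(stderr, "load control: lost the server\n");
                exit(1);
            }
            if (reply == 'G')
                break;
        }
        close(s);

        /* Run the real job.  Group access restrictions on the hidden
         * binary keep users from skipping the front end.            */
        execv(REAL_BINARY, argv);
        perror("execv");
        exit(1);
    }

The point of the design shows up in the read() loop: a queued client is asleep on i/o rather than runnable, so it contributes nothing to the load average while it waits, exactly like the blocked vi's described above.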
yee@ucbvax.ARPA (Peter E. Yee) (10/10/84)
Here at Berkeley, we do use a single-process simulation of an OS, called the Toy Operating System.  Multiple processes are simulated, including simulation of system resources (drum, reader, copier, etc.).  This is for our intro OS course, CS162, which I have yet to take, so I can't give more detailed information.

-Peter Yee
..ucbvax!yee
yee@Berkeley.ARPA