idallen@watmath.UUCP (10/08/84)
When our students goof an operating systems assignment, they sometimes create many cpu-grinder processes that pummel our poor VAX/UNIX into the ground and make it miserable for the rest of the students.

Any thoughts on implementing a CPU scheduler that assigns time slices on a per-UID basis, so that every process the student generates has to share the time slice given to that UID?  Thus the student may have one process or twenty; she/he has the same (small) impact on the other students.

My random thoughts:

Since the idea is to limit the time slice to one unit per logged-on user, using the UID isn't fair to those logged on twice.  Perhaps the controlling terminal would be a better way to determine the time-slice allocation.

Perhaps, most conveniently, this feature could be done with a system call that creates a CPU-process-group (CPUPG).  All members of the same CPUPG share the same CPU time slice.  A related system call turns automatic CPUPG creation on or off.  With it turned on, every new process gets its own unique CPUPG, and the net result is that every process is scheduled on its own, as under the present scheduler.  With automatic CPUPG creation disabled, every new process is added to the CPUPG of its parent and has to share the parent's time slice.  Perhaps set-uid processes should be exempt from this kind of inheritance.  Details to be worked out...
--
-IAN!  (Ian! D. Allen)   University of Waterloo
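A rough user-level sketch of the arithmetic behind the CPUPG proposal; the quantum, group ids, and process mix here are all invented for illustration, and this is a demonstration of the sharing rule, not a kernel implementation:

    #include <stdio.h>

    #define NGROUPS 16     /* arbitrary limit for this sketch     */
    #define QUANTUM 100    /* ticks per CPUPG per round (made up) */

    struct proc {
        int pid;
        int cpupg;         /* CPU-process-group id, one per user  */
    };

    int
    main(void)
    {
        /* Example mix: user A runs one editor, user B six grinders. */
        struct proc ptab[] = {
            { 101, 0 },
            { 201, 1 }, { 202, 1 }, { 203, 1 },
            { 204, 1 }, { 205, 1 }, { 206, 1 },
        };
        int nproc = sizeof ptab / sizeof ptab[0];
        int nmembers[NGROUPS] = { 0 };
        int i;

        for (i = 0; i < nproc; i++)
            nmembers[ptab[i].cpupg]++;

        /* A process's slice is its group's quantum split among the
         * group's members, so twenty grinders cost what one does. */
        for (i = 0; i < nproc; i++)
            printf("pid %d (cpupg %d) gets %d ticks\n", ptab[i].pid,
                ptab[i].cpupg, QUANTUM / nmembers[ptab[i].cpupg]);
        return 0;
    }

Under this mix the lone process in group 0 keeps its whole quantum while the six grinders in group 1 split theirs six ways, which is exactly the one-process-or-twenty property described above.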
sdyer@bbncca.ARPA (Steve Dyer) (10/08/84)
Actually, any operating systems course which runs on a general-purpose student timesharing system has little business having its students run multiple UNIX processes to emulate OS processes.  As you point out, a bug (or even the lack of bugs) will quickly degrade response for all the other users of the machine.  Rather than spending time hacking the UNIX scheduler, the time would be better spent building a baby-OS emulator which runs in a single UNIX process yet simulates multi-tasking.  Each student would link his own routines into the emulator, and a runaway student project would merely be a single runaway process, a situation more commonly seen and handled.
--
/Steve Dyer
{decvax,linus,ima,ihnp4}!bbncca!sdyer
sdyer@bbncca.ARPA
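One way such an emulator could be structured; the cooperative step-function design and all the names below are assumptions for illustration, not any particular course's emulator:

    #include <stdio.h>

    #define MAXTASK 8

    struct task {
        int alive;
        int state;                     /* private per-task state     */
        int (*step)(struct task *);   /* one time slice; 0 = exit   */
    };

    static struct task tasks[MAXTASK];

    /* Example "student job": counts down, one step per time slice. */
    static int
    counter(struct task *t)
    {
        printf("task %d: %d steps left\n", (int)(t - tasks), t->state);
        return --t->state > 0;
    }

    static void
    spawn(int (*step)(struct task *), int state)
    {
        int i;

        for (i = 0; i < MAXTASK; i++)
            if (!tasks[i].alive) {
                tasks[i].alive = 1;
                tasks[i].state = state;
                tasks[i].step = step;
                return;
            }
    }

    int
    main(void)
    {
        int i, running;

        spawn(counter, 3);
        spawn(counter, 5);

        /* Round-robin dispatcher: one "slice" per live task per pass. */
        do {
            running = 0;
            for (i = 0; i < MAXTASK; i++)
                if (tasks[i].alive) {
                    tasks[i].alive = tasks[i].step(&tasks[i]);
                    running += tasks[i].alive;
                }
        } while (running);
        return 0;
    }

Each student links his own step functions into the dispatcher in place of counter(); a buggy project then costs the timesharing system exactly one UNIX process, whatever the simulated task count.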
muller@sdcc3.UUCP (Keith Muller) (10/09/84)
The same problem exists here at UCSD, with large computer science classes beating the life out of any kind of machine thrown at them.  The biggest problem with running machines for student use has been that students are very competitive and think nothing of running as many compiles in the background as they feel they need.

The problem is that UNIX will attempt to time slice among all the runnable programs at any given moment.  As the number of runnable jobs increases, the amount of virtual memory in use increases, causing the system to page and then eventually swap (on 4.2 systems).  The more the system pages, the less of the cpu is available for running user processes.  This increasing rate of system overhead to support the larger number of runnable jobs slowly erodes the throughput of the system, to the point where no real work is being done other than supporting the context switches and their subsequent page faults.  The actual point at which the throughput begins to rapidly erode depends on the job mix at the time, but you can roughly relate it to the load average.

However, all is not as bad as it seems.  It turns out that you can group the job mix into two kinds of jobs: background jobs and interactive jobs.  I use the term background jobs for those jobs that do not require user input after the job has started; this means background jobs include compiles, nroff, etc.  Interactive jobs are things like vi.  If you look at how the two kinds of jobs affect the load on the machine, you quickly realize that the long-term load average is mainly due to the background jobs soaking up all the cpu cycles.  Interactive jobs like vi are for the most part (averaged over time) blocked on i/o.

What you want to do is limit the number of runnable background jobs at any given time, to prevent the virtual memory system from thrashing.  This also means that when an interactive job becomes runnable it will quickly get its required time slices.  This quick response to interactive jobs is really the single factor a user will use to classify a system as fast or slow, and it seems to far outweigh the effects of queueing background-class jobs.

The solution used on the UCSD Academic Computer Center machines (VAXes, Suns, and Pyramids, all running 4.2) is a mechanism called load control.  Load control consists of a server and client "front end" programs.  Each background-class job is replaced by a client front end, with the actual binary hidden in a place users are prevented from accessing.  When the user invokes a background-class job he is really running the front-end client program.  The client sends a run request packet to the server, and the server responds with either a run message or a queue message, depending on the current load situation: if the load average of the machine is above a certain point, the job is queued.  When the client gets a queue message, it blocks waiting for a run message.  By blocking it does not contribute to the load, so in effect the load on the machine is limited to approximately the threshold point set in the server for queueing jobs.  While the server is queueing, it checks the load at specific time intervals to see if the load has dropped below the threshold.  If it has, the server starts sending run commands to queued background jobs (the actual number of run commands sent depends on the current load average and the queueing threshold).  When the client gets a run message, it execs the protected binary it replaced (group access restrictions protect the actual binary files).

The load control mechanism is quite successful.  Before load control, the system's load would swing from long periods of very high load to periods where the machine went idle.  One machine, for example, had 52 users with 48 vi's and 25 C compiles running at once, for a load of 28, before load control.  After load control, with the same mix of jobs, the load was reduced to around 10 (the threshold for queueing).  The average time a job was queued was around 90 seconds (the machine was a small Pyramid 90x; a large 780 with the same job mix has an average queue time of around 400 seconds, so the Pyramid clearly has several times the throughput).

What load control effectively does is clip the high load peaks and fill in the low load "valleys" with the queued background-class jobs.  This has increased throughput tremendously and prevents the mess created by long periods of very high load averages.  The actual mechanism is much more sophisticated than described above, but the basic idea is the same.

The real advantage of this approach is that kernel-based approaches cannot easily distinguish between a vi and a compile, causing interactive jobs to become unusable.  If users cannot get their vi's to complete, they never get off the system (they never finish entering the file they want to compile, nroff, etc.).

Keith Muller
Academic Computer Center
University of California, San Diego
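A minimal sketch of what such a client front end might look like on a 4.2 system with UNIX-domain sockets; the socket path, the one-byte 'R'/'Q'/'G' protocol, and the hidden binary's location are invented here for illustration, not UCSD's actual implementation:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SERVER_SOCK "/etc/loadctl"           /* assumed rendezvous  */
    #define REAL_BINARY "/usr/local/hidden/cc"   /* assumed hidden path */

    int
    main(int argc, char **argv)
    {
        struct sockaddr_un addr;
        char reply;
        int s;

        (void)argc;

        if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
            perror("socket");
            exit(1);
        }
        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, SERVER_SOCK, sizeof addr.sun_path - 1);
        if (connect(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            exit(1);
        }

        /* Ask permission to run; 'R' is an invented protocol byte.  */
        write(s, "R", 1);

        /* The server answers 'G' (run) or 'Q' (queued).  On 'Q' the
         * client just stays blocked in read(), adding no load.      */
        for (;;) {
            if (read(s, &reply, 1) != 1) {
                fprintf(stderr, "load control: lost the server\n");
                exit(1);
            }
            if (reply == 'G')
                break;
        }
        close(s);

        /* Run the real job.  Group access restrictions on the hidden
         * binary keep users from skipping the front end.            */
        execv(REAL_BINARY, argv);
        perror("execv");
        exit(1);
    }

The point of the design shows up in the read() loop: a queued client is asleep on i/o rather than runnable, so it contributes nothing to the load average while it waits, exactly like the blocked vi's described above.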
yee@ucbvax.ARPA (Peter E. Yee) (10/10/84)
Here at Berkeley, we do use a single-process simulation of an OS, called the Toy Operating System.  Multiple processes are simulated, including simulation of system resources (drum, reader, copier, etc.).  This is for our intro OS course, CS162, which I have yet to take, so I can't give more detailed information.

-Peter Yee
..ucbvax!yee
yee@Berkeley.ARPA