[comp.sys.apollo] process priorities

thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (10/22/90)

I like Apollos, I really do....  In fact, I want to buy one for myself, BUT...

This problem is specifically occurring on our 2-cpu DN10000, but the problem
exists on all other machines as well.  We have 4 or 5 groups that have work
to do on our DN10000.  All of them would like to have top priority, but as
things go, only 2 groups have been given the "top priority" stamp from upper
management.  These groups are allowed to raise their jobs' low-priority #
up to 5.

The problem is that most of the jobs are cpu-intensive.  Very rapidly, 
everyone drops down to their lowest priority. As soon as we get >= NUM_CPUs
high priority jobs running, everyone else can kiss their cpu-time goodbye.  
There is no mechanism for processes that are swapped out to have their
priorities rise over time, until they are allowed to swap back in.
Since the high-priority jobs don't have disk, execute, memory, I/O, etc
faults to wait on, they run merrily along.

What we would like is a mechanism that gives out cpu time to all intensive 
jobs, not just the high-priority ones.  If process priorities aged while
they were INVOLUNTARILY swapped out, we could probably find a set of
priorities that gave the approximate percentages.

Am I mistaken, or do _real_ unix priorities age the way I want DomainOS
priorities to?

John Thompson (jt)
Honeywell, SSEC
Plymouth, MN  55441
thompson@pan.ssec.honeywell.com

As ever, my opinions do not necessarily agree with Honeywell's or reality's.
(Honeywell's do not necessarily agree with mine or reality's, either)

krowitz@RICHTER.MIT.EDU (David Krowitz) (10/22/90)

On a real, honest to God, BSD or SYS V system, the "nice" command can be used
to give relative percentages of the available CPU resources to different jobs.
On our Alliant FX/40, for example, we found that jobs in the "nice +10" category
got about 40% of the CPU resources while the remaining jobs ("nice 0") got about
60% of the CPU resources. Unfortunately, "nice" on the Apollos seems to simply
alter your Aegis upper/lower priority bounds.


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/22/90)

In article <9010220501.AA09206@pan.ssec.honeywell.com> thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) writes:
>What we would like is a mechanism that gives out cpu time to all intensive 
>jobs, not just the high-priority ones.  If process priorities aged while
>they were INVOLUNTARILY swapped out, we could probably find a set of
>priorities that gave the approximate percentages.
>
>Am I mistaken, or do _real_ unix priorities age the way I want DomainOS
>priorities to?

Well, semantically, there is no _real_ unix, only various flavors
of frozen-yogurt unix...

BSD4.3, according to "The Design and Implementation of the 4.3BSD
UNIX(R) Operating System" by S. J. Leffler, et al, computes process
priority thusly:

	p_usrpri = PUSER + p_cpu/4 + 2*p_nice

where
	PUSER	is the offset between system and user priority as
		compiled into the kernel
	p_cpu	is the CPU utilization:  the time, in clock-ticks
		(1 per millisecond, usually), that the process
		has been in the RUN state
	p_nice	is the nice value as assigned (in your case,
		either 0 or -5)

p_cpu does "age"; it decays according to the following formula
which is computed once per second for each process in the queues:

	p_cpu = p_nice + p_cpu * (2*load)/(1 + 2*load)

where
	load	is an average of the run-queue length over the
		past minute

That's what "real" unix does.  I don't even know enough about
my Apollos yet to find the priority, so I can't compare the
computations, myself...

				--Blair
				  "All's I need's a good binary editor
				   and a brand new copy of cc..."

STEVE_LLOYD@RMC.CA (10/24/90)

Thanks John - I've never really understood why that happens.  The effect
is even more dramatic on a single board unit.

In researching solutions to the problem I came across a product called NQS
(Network Queueing System, I think).  This is a batching facility for Unix
machines that has been ported to the DN10k. It has very elegant partitioning
and priorities.  It won't get rid of the non-aging problem but will allow
you to keep the number of jobs running to a low enough number that the
machine won't do its nose dive to 2%.  With user cooperation you could
keep 1 queue running with several non swapping jobs and have a different
queue where the swappers would be forced into one at a time.  It certainly
isn't the best solution since the software costs real money, but it's
better than having a DN10k performing like a DN3000.

Unfortunately I can't find the advertising bumf on the product -
It's lost in a very disorganized pile of stuff.
I know that it was developed by NASA and is marketed by one of
their subcontractors - that narrows it down to about half the contractors
in the US.  If you're interested I'll look it up.  Perhaps someone else
out there has the info at their finger tips?

etb@milton.u.washington.edu (Eric Bushnell) (03/09/91)

Would someone be so kind as to explain how
Domain process priorities work? I have a 
sudden interest, following a strange situation
that occurred yesterday.

An ordinary, unprivileged user wasn't happy
with the priority of his batch job, which he
had started with /usr/bin/nohup. So he used
/etc/renice to change his priority to -20,
the highest priority in BSD unix. Only the
superuser can do this, right? Apparently not.

Unix priority -20 seems to map to a priority
value of 1 in Domain. That was higher than
everything else, so it must have blocked
all other processes. He couldn't stop or
kill it, because the shell was blocked. He
couldn't log out. I tried to get on as root
from another node--no go. Rlogin, telnet, and
crp were all blocked.

Lower level processes seemed to be running.
I was able to use ps //remote_node to see the
process list, but that's about it. I let
it run overnight, in case it finished and
exited normally. It was still going this morning
so I rebooted the node.

I've tried to recreate this in a more controlled
way. So far, all I can tell is that /etc/renice
works strangely.

I remember an earlier gripe about nice. Is this
a known bug? A known feature?

Eric Bushnell
UW Civil Engr
etb@zeus.ce.washington.edu

fridman@cpsc.ucalgary.ca (fridman) (03/09/91)

In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:

>   An ordinary, unprivileged user wasn't happy
>   with the priority of his batch job, which he
>   had started with /usr/bin/nohup. So he used
>   /etc/renice to change his priority to -20,
>   the highest priority in BSD unix. Only the
>   superuser can do this, right? Apparently not.

I just tried it.  One CAN increase the priority without being root.

	RF.

smv@apollo.HP.COM (Steve Valentine) (03/09/91)

In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
>Would someone be so kind as to explain how
>Domain process priorities work?

You're running into one of those areas where Domain/OS maps a square UNIX peg
into a round Aegis hole.  As you probably already know, UNIX nice values range
from -20 to +19.  The nice value is fixed for a given process, (i.e. not changed
by the OS on its own) and a priority is adjusted, based on the nice value.
The scheduler then does fair-share scheduling based on the various priorities
of the various processes.  A process running CPU bound at nice -20 will get most
of the CPU time, but not all of it.

In Domain/OS, UNIX nice values are converted into a high and low Aegis priority.
The range of Aegis priorities is 1 to 16, and the default range (nice 0)
is 3 to 14.  The kernel adjusts priority of a process within this range
according to how much CPU time it is taking.
The upshot of all this is that nice doesn't work the way you think it does.
As you move further away from nice 0, one end of the range quickly hits the
absolute limit, 1 or 16, and the other end approaches the same limit;
that is, the range shrinks.

For example: (this is close to right, but not guaranteed correct)
(Keep in mind here that higher numbers are higher priority in Aegis.)
Nice:	Aegis Priority range:
-20	16-16
-15	11-16
-10	09-16
 -5	05-16
 -2	03-16
 -1	03-15
  0	03-14
  1	02-14
  2	01-14
  3	01-13
  5	01-12
 10	01-08
 15	01-04
 19	01-01

The kernel preemptively schedules the process with the highest priority,
and splits all the CPU time among the highest priority processes.
That is, if you set a process to nice -20, you peg it at priority 16. 
The kernel will split the CPU time among those processes at priority 16,
and give no time to any processes at priority 15 or lower, until all of the
processes at priority 16 block.  This is a feature of Aegis/Domain/OS,
and we're pretty sure that changing it would tick off more people than it
would help.  We are looking into expanding the range of Aegis priorities
so that the nice values can all map to distinct overlapping subranges of
equal size.  If this is done, it should appear in sr10.4.
#include <limited resources sob story>
#include <disclaimers>

>An ordinary, unprivileged user wasn't happy
>with the priority of his batch job, which he
>had started with /usr/bin/nohup. So he used
>/etc/renice to change his priority to -20,
>the highest priority in BSD unix. Only the
>superuser can do this, right? Apparently not.

It has long been the Apollo position that Domain Nodes are single user machines.
We try very hard to make the node as useful as possible to its one user.
An ordinary, unprivileged user in this environment may very well have reason to
adjust the priority of some process on his or her node, and can only hurt
themselves by doing so.  When nodes are used in shared environments,
their users must learn that it is socially unacceptable to take advantage of
some of the features available.  If we required root access to renice processes,
we would be depriving our users of a feature that they have become accustomed to
and which we feel they can benefit from.
Steve Valentine - smv@apollo.hp.com
Hewlett-Packard Company, Apollo Systems Division, Chelmsford, MA
Hermits have no peer pressure. -Steven Wright

dpassage@sandstorm.Berkeley.EDU (David G. Paschich) (03/10/91)

In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.
>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so.  When nodes are used in shared environments,
>their users must learn that it is socially unacceptable to take advantage of
>some of the features available.  If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

From the man page for renice:
>     Users other than the super-user may only alter the priority of processes
>     they own, and can only monotonically increase their "nice value" within
>     the range 0 to PRIO_MAX (20).

It would be nice if Apollo documentation were true.

Actually, just because Apollo lets users renice things doesn't mean you have
to:

sandstorm [100] su
Password:
# chown root //*/etc/renice
# chmod 700 //*/etc/renice

Drastic, but on an open cluster like ours, it's definitely necessary.  This
still isn't complete; nice, which is built into csh, will still let anyone set
negative priorities on processes when you start them.

David G. Paschich
dpassage@ocf.berkeley.edu
Just say no to huge .sigs!

rees@pisa.citi.umich.edu (Jim Rees) (03/11/91)

In article <1991Mar10.121848.8362@agate.berkeley.edu>, dpassage@sandstorm.Berkeley.EDU (David G. Paschich) writes:

  Actually, just because Apollo lets users renice things doesn't mean you have
  to:
  
  sandstorm [100] su
  Password:
  # chown root //*/etc/renice
  # chmod 700 //*/etc/renice

That doesn't help much unless you also close up /usr/apollo/lib/cc*.

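#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Set the nice value of the process whose pid is given in av[1] to -20. */
int main(int ac, char *av[])
{
	if (ac < 2) {
		fprintf(stderr, "usage: %s pid\n", av[0]);
		return 1;
	}
	if (setpriority(PRIO_PROCESS, atoi(av[1]), -20) < 0) {
		perror("setpriority");
		return 1;
	}
	return 0;
}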

dpassage@monsoon.Berkeley.EDU (David G. Paschich) (03/11/91)

In article <5049bf48.1bc5b@pisa.citi.umich.edu> rees@citi.umich.edu (Jim Rees) writes:
>In article <1991Mar10.121848.8362@agate.berkeley.edu>, dpassage@sandstorm.Berkeley.EDU (David G. Paschich) writes:
>
>  Actually, just because Apollo lets users renice things doesn't mean you have
>  to:
   [stuff I wrote deleted]
>
>That doesn't help much unless you also close up /usr/apollo/lib/cc*.
>
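>#include <stdio.h>
>#include <stdlib.h>
>#include <sys/time.h>
>#include <sys/resource.h>
>
>/* Set the nice value of the process whose pid is given in av[1] to -20. */
>int main(int ac, char *av[])
>{
>	if (ac < 2) {
>		fprintf(stderr, "usage: %s pid\n", av[0]);
>		return 1;
>	}
>	if (setpriority(PRIO_PROCESS, atoi(av[1]), -20) < 0) {
>		perror("setpriority");
>		return 1;
>	}
>	return 0;
>}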

Good point.  Yet another example of Apollo saying, "it's not a bug, it's a 
feature!"
David G. Paschich
dpassage@ocf.berkeley.edu
Just say no to huge .sigs!

etb@milton.u.washington.edu (Eric Bushnell) (03/12/91)

Thanks, Steve Valentine, for the helpful explanation.

What I'm planning to do is restrict execution rights
on /etc/renice, and then write a wrapper for it
that will not allow negative priorities.

That won't stop determined hackers, as someone else
pointed out, but it's a start. And it's a hint
to cpu-greedy users.

What I'd like now is a way to control such processes
once they've started. If the priority is high enough,
it blocks crp and rlogin, so that I can't renice it
after the fact. Are there any low-level calls
that could be used to signal (kill) and adjust
priorities from another node? It's an odd feature,
of course, but if anybody can renice anything, who
knows what else is possible?

And thanks, all, for the stimulating discussion.


Eric Bushnell                  etb@milton.u.washington.edu
				etb@zeus.ce.washington.edu
University of Washington
Civil Engineering

wjw@ebe.eb.ele.tue.nl (Willem Jan Withagen) (03/13/91)

In article <FRIDMAN.91Mar8141257@aa.cpsc.ucalgary.ca> fridman@cpsc.ucalgary.ca (fridman) writes:
=>In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
=>
=>
=>>   An ordinary, unprivileged user wasn't happy
=>>   with the priority of his batch job, which he
=>>   had started with /usr/bin/nohup. So he used
=>>   /etc/renice to change his priority to -20,
=>>   the highest priority in BSD unix. Only the
=>>   superuser can do this, right? Apparently not.

I complained about this feature to this forum and to HPollo about half a year
ago, and indeed their remark was:

	It's not a bug, it's a feature.

Sadly, that is again the story I read here about Apollos being single-user
stations. (PCs are a lot cheaper, Suns are cheaper and faster, ...)

Another saddening 'finding' is that the manual says something totally
different from reality. 

Am I wrong to assume that most of the BSD4.3 stuff was just compiled off the
tape, and that if it didn't work it got hacked until it sort of did?
( I know I'm being a pig to all those hard-working engineers at Apollo.
  I do like Apollos. Only once in a while I'm happy that the windows don't
  open far enough to fit a system case.)

sob, sob, sob
		Willem Jan Withagen
Eindhoven University of Technology   DomainName:  wjw@eb.ele.tue.nl    
Digital Systems Group, Room EH 10.10 
P.O. 513                             Tel: +31-40-473401
5600 MB Eindhoven                    The Netherlands

mike@vlsivie.tuwien.ac.at (Michael K. Gschwind) (03/14/91)

In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.
This is the old Apollo comment trying to sell bugs as features...

>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so.  When nodes are used in shared environments,

What happens if the user renices disk servers? The registry daemon? 
The glbd? There are vital services running which need to be available to
the net as a whole! (TCP gateways, servers for diskless computers) 

>their users must learn that it is socially unacceptable to take advantage of
>some of the features available.  If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

There are many simple ways to accomplish renicing other users' processes
if this is policy at some site:

1.) You can set the suid bit on /etc/renice.
2.) There could be some configuration file indicating whether every 
	user should be allowed to renice processes.

				bye,
					mike

Michael K. Gschwind, Dept. of VLSI-Design, Vienna University of Technology
mike@vlsivie.tuwien.ac.at	1-2-3-4 kick the lawsuits out the door 
mike@vlsivie.uucp		5-6-7-8 innovate don't litigate         
e182202@awituw01.bitnet		9-A-B-C interfaces should be free
Voice: (++43).1.58801 8144	D-E-F-O look and feel has got to go!
Fax:   (++43).1.569697

system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (03/20/91)

In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
>>Would someone be so kind as to explain how
>>Domain process priorities work?
>
>You're running into one of those areas where Domain/OS maps a square UNIX peg
>into a round Aegis hole.
> <... niceness<->Aegis priority mapping deleted>

This table is certainly wrong for nice values of 2 through 20 inclusive,
all of which result in Aegis priority range of 3-16, negating much of
the usefulness of nice/renice.

>>An ordinary, unprivileged user wasn't happy
>>with the priority of his batch job, which he
>>had started with /usr/bin/nohup. So he used
>>/etc/renice to change his priority to -20,
>>the highest priority in BSD unix. Only the
>>superuser can do this, right? Apparently not.
>
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.
>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so.  When nodes are used in shared environments,
>their users must learn that it is socially unacceptable to take advantage of
>some of the features available.  If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

Our nodes were sold to us as multi-user workstations (especially the
DN10000); being able to renice any other user's process, including kernel
level processes, and to be able to raise any process priority is a gross
violation of UNIX. Apollo UNIX users should never have been given this
"feature" in the first place, which required hacking on the source code.

I agree completely with another posting which said that IBM and Apollo
shouldn't be calling their products "UNIX". Gratuitous changes are a
pain in the <insert favourite region here>.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775

thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (03/20/91)

> > An ordinary, unprivileged user in this environment may very well have reason to
> > adjust the priority of some process on his or her node, and can only hurt
> > themselves by doing so.  When nodes are used in shared environments,
> What happens if the user renices disk servers? The registry daemon? 
> The glbd? There are vital services running which need to be available to
> the net as a whole! (TCP gateways, servers for diskless computers) 
Things go real belly up.  (I assume it was a rhetorical question, but....)
We've had users raise their priorities through the roof, and drop other people's
priorities through the basement.  We've had the tcpd time out all connections,
because somebody 'needed to run fast', so changed their lower priority to 12
(Aegis-land - Unix equiv is renice -gawdawfullow).  IMHO, HP/Apollo is copping
out with their "that's a feature" argument.  Aborting processes was a feature, 
and at sr10 they came up with an extension (`node_data/node_owners).  The same
thing should hold true for priorities.  Each user may have his own system (though
that's arguable too), but they can certainly hurt other people with their antics.

> >their users must learn that it is socially unacceptable to take advantage of
> >some of the features available.  If we required root access to renice processes,
> >we would be depriving our users of a feature that they have become accustomed to
> >and which we feel they can benefit from.
That's why something like `node_data/node_owners would be good.  People who think
that shotguns are valuable computing tools should be able to fire at will.  Those
of us who like to preserve systems should be able to limit the people who can 
pull the trigger.
 
-- jt --
John Thompson
Honeywell, SSEC
Plymouth, MN  55441
thompson@pan.ssec.honeywell.com

Me?  Represent Honeywell?  You've GOT to be kidding!!!

nazgul@alphalpha.com (Kee Hinckley) (03/20/91)

In article <1991Mar20.062300.25980@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>Our nodes were sold to us as multi-user workstations (especially the
They aren't.

>level processes, and to be able to raise any process priority is a gross
>violation of UNIX. Apollo UNIX users should never have been given this
>"feature" in the first place, which required hacking on the source code.
nice required source code hacking just to work.  We aren't talking a Unix
kernel here.  Aegis already had a process priority mechanism that didn't
require permissions to use.  Unix process priorities had to be mapped
onto that without breaking people depending on the previous behavior.

>shouldn't be calling their products "UNIX". Gratuitous changes are a
>pain in the <insert favourite region here>.
There isn't a gratuitous change there.  Someone had the following choices:

    1)  Be incompatible with older Aegis-based software which depends
	on being able to change process priorities.
    2)  Lull the Unix user into a false sense of security by not allowing
	Unix commands to change the priority, but still allowing Aegis
	commands to.
    3)  Relax the Unix protections.

#1 doesn't fit in with Apollo's stated standards (sure, they don't always
maintain compatibility, but they try).  #2 is clearly wrong - people get
really pissed when they discover that.  #3 fits in with the model (which
R&D certainly had, if not marketing) that an Apollo workstation is a single-
user workstation.
-- 
Alfalfa Software, Inc.          |       Poste:  The EMail for Unix
nazgul@alfalfa.com              |       Send Anything... Anywhere
617/646-7703 (voice/fax)        |       info@alfalfa.com

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.

rees@pisa.citi.umich.edu (Jim Rees) (03/21/91)

In article <1991Mar20.062300.25980@alchemy.chem.utoronto.ca>, system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:

  I agree completely with another posting which said that IBM and Apollo
  shouldn't be calling their products "UNIX". Gratuitous changes are a
  pain in the <insert favourite region here>.

Why not?  AT&T still calls their product "Unix," and look at all the things
they broke in system V.  And what about Berkeley?  Their version of Unix
bears little resemblance to anything that ever came out of Bell Labs.  You're
annoyed because Apollo Unix doesn't look like anyone else's, but in fact no
one's Unix looks anything like the original, so who's to say which one is
"real Unix?"

Besides, AT&T (who owns the name "Unix") likes to define Unix as anything
that passes the SVVS.  By this measure, Domain/OS passes, Berkeley doesn't.
But I'm not going to go out and buy a computer just because it's SVID
compliant.  In fact, that's a mark against it for me.  I'm going to buy the
computer that gets the job done the way I want it done.

thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (03/23/91)

<<forwarded message>>
> >Our nodes were sold to us as multi-user workstations (especially the
> They aren't.
But if they were _sold_ as such, HP/Apollo at least has an obligation to
support them this way.  Also, I'd have a hard time considering the dn10000
(that now-obsolete machine) a single-user system.

> >shouldn't be calling their products "UNIX". Gratuitous changes are a
> >pain in the <insert favourite region here>.
> There isn't a gratuitous change there.  Someone had the following choices:
> 
>     1)  Be incompatible with older Aegis-based software which depends
> 	on being able to change process priorities.
>     2)  Lull the Unix user into a false sense of security by not allowing
> 	Unix commands to change the priority, but still allowing Aegis
> 	commands to.
>     3)  Relax the Unix protections.
> #1 doesn't fit in with Apollo's stated standards (sure, they don't always
> maintain compatibility, but they try).  #2 is clearly wrong - people get
> really pissed when they discover that.  #3 fits in with the model (which
> R&D certainly had, if not marketing) that an Apollo workstation is a single-
> user workstation.

You missed an option.
    4) Keep Unix protections tight, and create a permissions file for the 
       Aegis users. 
#4 was done for the sigp/kill pair!  In Unix, you still need to be the owner 
of a process (or root) to kill it.  In Aegis, you need to be the owner of a 
process, _OR_ be listed in the `node_data/node_owners ACL.  Now, that might
leave them (HP/Apollo) a little open, but if they (A) put a comment in the
release notes and/or (B) installed a locked-up ppri_owners file if Aegis
wasn't loaded on a system, they should be reasonably safe.  They'd also have
fewer problems, IMHO.


-- jt --
John Thompson
Honeywell, SSEC
Plymouth, MN  55441
thompson@pan.ssec.honeywell.com

Me?  Represent Honeywell?  You've GOT to be kidding!!!