thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (10/22/90)
I like Apollos, I really do.... In fact, I want to buy one for myself, BUT...

This problem is specifically occurring on our 2-cpu DN10000, but the problem exists on all other machines as well.

We have 4 or 5 groups that have work to do on our DN10000. All of them would like to have top priority, but as things go, only 2 groups have been given the "top priority" stamp from upper management. These groups are allowed to raise their jobs' low-priority # up to 5.

The problem is that most of the jobs are cpu-intensive. Very rapidly, everyone drops down to their lowest priority. As soon as we get >= NUM_CPUs high priority jobs running, everyone else can kiss their cpu-time goodbye. There is no mechanism for processes that are swapped out to have their priorities rise over time, until they are allowed to swap back in. Since the high-priority jobs don't have disk, execute, memory, I/O, etc faults to wait on, they run merrily along.

What we would like is a mechanism that gives out cpu time to all intensive jobs, not just the high-priority ones. If process priorities aged while they were INVOLUNTARILY swapped out, we could probably find a set of priorities that gave the approximate percentages.

Am I mistaken, or do _real_ unix priorities age the way I want DomainOS priorities to?

John Thompson (jt)
Honeywell, SSEC
Plymouth, MN 55441
thompson@pan.ssec.honeywell.com

As ever, my opinions do not necessarily agree with Honeywell's or reality's.
(Honeywell's do not necessarily agree with mine or reality's, either)
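The aging behaviour asked for above can be illustrated with a toy single-CPU simulation. This is only a sketch under stated assumptions: the `struct job` layout, the one-point-per-tick aging rule, and the function names are all hypothetical and are not taken from Domain/OS or any real kernel. Higher numbers mean higher priority, as in Aegis.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names and constants are illustrative, not Domain/OS internals. */
struct job {
    int  base;   /* assigned priority; higher runs first */
    int  boost;  /* aging credit earned while passed over */
    long ticks;  /* CPU ticks received so far */
};

/* Pick the job with the highest effective priority (ties go to the lowest index). */
static size_t pick(const struct job *jobs, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (jobs[i].base + jobs[i].boost > jobs[best].base + jobs[best].boost)
            best = i;
    return best;
}

/*
 * Make `ticks` scheduling decisions. When `aging` is nonzero, every job that
 * was passed over gains one point; the job that ran loses its credit.
 */
static void simulate(struct job *jobs, size_t n, long ticks, int aging)
{
    assert(n > 0);
    for (long t = 0; t < ticks; t++) {
        size_t r = pick(jobs, n);
        jobs[r].ticks++;
        for (size_t i = 0; i < n; i++) {
            if (i == r)
                jobs[i].boost = 0;
            else if (aging)
                jobs[i].boost++;
        }
    }
}
```

With aging switched off, a CPU-bound job at base priority 10 starves one at base priority 5 completely. With aging switched on, the lower job gets a small but nonzero share, and the gap between the two base priorities controls how small that share is.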
krowitz@RICHTER.MIT.EDU (David Krowitz) (10/22/90)
On a real, honest to God, BSD or SYS V system, the "nice" command can be used to give relative percentages of the available CPU resources to different jobs. On our Alliant FX/40, for example, we found that jobs in the "nice +10" category got about 40% of the CPU resources while the remaining jobs ("nice 0") got about 60% of the CPU resources. Unfortunately, "nice" on the Apollos seems to simply alter your Aegis upper/lower priority bounds.

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/22/90)
In article <9010220501.AA09206@pan.ssec.honeywell.com> thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) writes:
>What we would like is a mechanism that gives out cpu time to all intensive
>jobs, not just the high-priority ones. If process priorities aged while
>they were INVOLUNTARILY swapped out, we could probably find a set of
>priorities that gave the approximate percentages.
>
>Am I mistaken, or do _real_ unix priorities age the way I want DomainOS
>priorities to?

Well, semantically, there is no _real_ unix, only various flavors of frozen-yogurt unix...

BSD4.3, according to "The Design and Implementation of the 4.3BSD UNIX(R) Operating System" by S. J. Leffler, et al, computes process priority thusly:

    p_usrpri = PUSER + p_cpu/4 + 2*p_nice

where

    PUSER   is the offset between system and user priority as compiled into the kernel
    p_cpu   is the CPU utilization: the time, in clock-ticks (1 per millisecond, usually), that the process has been in the RUN state
    p_nice  is the nice value as assigned (in your case, either 0 or -5)

p_cpu does "age"; it decays according to the following formula, which is computed once per second for each process in the queues:

    p_cpu = p_nice + p_cpu * (2*load)/(1 + 2*load)

where

    load    is an average of the run-queue length over the past minute

That's what "real" unix does. I don't even know enough about my Apollos yet to find the priority, so I can't compare the computations, myself...

--Blair
"All's I need's a good binary editor and a brand new copy of cc..."
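The two formulas quoted above can be written out as a small C sketch. The value of `PUSER` below is an assumption made for illustration (the post only says it is compiled into the kernel), the function names are hypothetical, and details a real kernel would need, such as clamping the result to the valid priority range, are omitted. Larger `p_usrpri` values mean worse priority.

```c
#include <assert.h>

#define PUSER 50  /* assumed base user priority; the real value is kernel-specific */

/* p_usrpri = PUSER + p_cpu/4 + 2*p_nice */
static int usrpri(int p_cpu, int p_nice)
{
    return PUSER + p_cpu / 4 + 2 * p_nice;
}

/* Once-per-second decay: p_cpu = p_nice + p_cpu * (2*load)/(1 + 2*load) */
static double decay_cpu(double p_cpu, double load, int p_nice)
{
    assert(load >= 0.0);
    return p_nice + p_cpu * (2.0 * load) / (1.0 + 2.0 * load);
}
```

The decay factor is what makes priorities "age": on a lightly loaded machine (small `load`) accumulated CPU usage is forgotten quickly, while on a busy machine it is forgotten slowly, so a process that has been kept off the CPU gradually regains a better priority.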
STEVE_LLOYD@RMC.CA (10/24/90)
Thanks John - I've never really understood why that happens. The effect is even more dramatic on a single board unit.

In researching solutions to the problem I came across a product called NQS (Network Queueing System, I think). This is a batching facility for Unix machines that has been ported to the DN10k. It has very elegant partitioning and priorities. It won't get rid of the non-aging problem but will allow you to keep the number of jobs running to a low enough number that the machine won't do its nose dive to 2%. With user cooperation you could keep 1 queue running with several non swapping jobs and have a different queue where the swappers would be forced into one at a time. It certainly isn't the best solution since the software costs real money, but it's better than having a DN10k performing like a DN3000.

Unfortunately I can't find the advertising bumf on the product - It's lost in a very disorganized pile of stuff. I know that it was developed by NASA and is marketed by one of their subcontractors - that narrows it down to about half the contractors in the US. If you're interested I'll look it up. Perhaps someone else out there has the info at their finger tips?
etb@milton.u.washington.edu (Eric Bushnell) (03/09/91)
Would someone be so kind as to explain how Domain process priorities work? I have a sudden interest, following a strange situation that occurred yesterday.

An ordinary, unprivileged user wasn't happy with the priority of his batch job, which he had started with /usr/bin/nohup. So he used /etc/renice to change his priority to -20, the highest priority in BSD unix. Only the superuser can do this, right? Apparently not.

Unix priority -20 seems to map to a priority value of 1 in Domain. That was higher than everything else, so it must have blocked all other processes. He couldn't stop or kill it, because the shell was blocked. He couldn't log out. I tried to get on as root from another node--no go. Rlogin, telnet, and crp were all blocked. Lower level processes seemed to be running. I was able to use ps //remote_node to see the process list, but that's about it.

I let it run overnight, in case it finished and exited normally. It was still going this morning so I rebooted the node.

I've tried to recreate this in a more controlled way. So far, all I can tell is that /etc/renice works strangely. I remember an earlier gripe about nice. Is this a known bug? A known feature?

Eric Bushnell
UW Civil Engr
etb@zeus.ce.washington.edu
fridman@cpsc.ucalgary.ca (fridman) (03/09/91)
In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
> An ordinary, unprivileged user wasn't happy
> with the priority of his batch job, which he
> had started with /usr/bin/nohup. So he used
> /etc/renice to change his priority to -20,
> the highest priority in BSD unix. Only the
> superuser can do this, right? Apparently not.

I just tried it. One CAN increase the priority without being root.

RF.
smv@apollo.HP.COM (Steve Valentine) (03/09/91)
In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
>Would someone be so kind as to explain how
>Domain process priorities work?

You're running into one of those areas where Domain/OS maps a square UNIX peg into a round Aegis hole.

As you probably already know, UNIX nice values range from -20 to +19. The nice value is fixed for a given process (i.e. not changed by the OS on its own), and a priority is adjusted, based on the nice value. The scheduler then does fair-share scheduling based on the various priorities of the various processes. A process running CPU bound at nice -20 will get most of the CPU time, but not all of it.

In Domain/OS, UNIX nice values are converted into a high and low Aegis priority. The range of Aegis priorities is 1 to 16, and the default range (nice 0) is 3 to 14. The kernel adjusts the priority of a process within this range according to how much CPU time it is taking.

The upshot of all this is that nice doesn't work the way you think it does. As you move further away from nice 0, one end of the range quickly hits the absolute limit, 1 or 16, and the other end approaches the same limit; that is, the range shrinks.

For example (this is close to right, but not guaranteed correct; keep in mind here that higher numbers are higher priority in Aegis):

    Nice:   Aegis priority range:
    -20     16-16
    -15     11-16
    -10     09-16
     -5     05-16
     -2     03-16
     -1     03-15
      0     03-14
      1     02-14
      2     01-14
      3     01-13
      5     01-12
     10     01-08
     15     01-04
     19     01-01

The kernel preemptively schedules the process with the highest priority, and splits all the CPU time among the highest priority processes. That is, if you set a process to nice -20, you peg it at priority 16. The kernel will split the CPU time among those processes at priority 16, and give no time to any processes at priority 15 or lower, until all of the processes at priority 16 block. This is a feature of Aegis/Domain/OS, and we're pretty sure that changing it would tick off more people than it would help.

We are looking into expanding the range of Aegis priorities so that the nice values can all map to distinct overlapping subranges of equal size. If this is done, it should appear in sr10.4.

#include <limited resources sob story>
#include <disclaimers>

>An ordinary, unprivileged user wasn't happy
>with the priority of his batch job, which he
>had started with /usr/bin/nohup. So he used
>/etc/renice to change his priority to -20,
>the highest priority in BSD unix. Only the
>superuser can do this, right? Apparently not.

It has long been the Apollo position that Domain Nodes are single user machines. We try very hard to make the node as useful as possible to its one user. An ordinary, unprivileged user in this environment may very well have reason to adjust the priority of some process on his or her node, and can only hurt themselves by doing so. When nodes are used in shared environments, their users must learn that it is socially unacceptable to take advantage of some of the features available. If we required root access to renice processes, we would be depriving our users of a feature that they have become accustomed to and which we feel they can benefit from.

Steve Valentine - smv@apollo.hp.com
Hewlett-Packard Company, Apollo Systems Division, Chelmsford, MA
Hermits have no peer pressure. -Steven Wright
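The nice-to-Aegis mapping and the strict "highest priority wins" rule described in this post can be modelled in a few lines of C. This is a minimal sketch, not Domain/OS code: the table rows are copied from the post (which itself calls them approximate), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct range { int nice, lo, hi; };  /* Aegis priority bounds for one nice value */

/* Rows as posted; the post describes them as close to right, not guaranteed. */
static const struct range table[] = {
    {-20, 16, 16}, {-15, 11, 16}, {-10, 9, 16}, {-5, 5, 16}, {-2, 3, 16},
    {-1, 3, 15},   {0, 3, 14},    {1, 2, 14},   {2, 1, 14},  {3, 1, 13},
    {5, 1, 12},    {10, 1, 8},    {15, 1, 4},   {19, 1, 1}
};

/* Return the row for a nice value that appears in the table, or NULL. */
static const struct range *lookup(int nice)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].nice == nice)
            return &table[i];
    return NULL;
}

/*
 * Strict preemptive choice: index of the runnable process with the highest
 * current priority, or -1 if nothing is runnable. Lower-priority processes
 * are never chosen while a higher one stays runnable.
 */
static int pick_highest(const int *prio, const int *runnable, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (runnable[i] && (best < 0 || prio[i] > prio[best]))
            best = (int)i;
    return best;
}
```

Because nice -20 collapses the range to 16-16, a CPU-bound process pegged there never drops below anything else, and `pick_highest` keeps returning it until it blocks. That is the starvation the thread is complaining about.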
dpassage@sandstorm.Berkeley.EDU (David G. Paschich) (03/10/91)
In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.
>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so. When nodes are used in shared environments,
>their users must learn that it is socially unacceptable to take advantage of
>some of the features available. If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

From the man page for renice:

> Users other than the super-user may only alter the priority of processes
> they own, and can only monotonically increase their "nice value" within
> the range 0 to PRIO_MAX (20).

It would be nice if Apollo documentation were true.

Actually, just because Apollo lets users renice things doesn't mean you have to:

    sandstorm [100] su
    Password:
    # chown root //*/etc/renice
    # chmod 700 //*/etc/renice

Drastic, but on an open cluster like ours, it's definitely necessary. This still isn't complete; nice, which is built into csh, will still let anyone set negative priorities on processes when you start them.

David G. Paschich
dpassage@ocf.berkeley.edu
Just say not to huge .sigs!
rees@pisa.citi.umich.edu (Jim Rees) (03/11/91)
In article <1991Mar10.121848.8362@agate.berkeley.edu>, dpassage@sandstorm.Berkeley.EDU (David G. Paschich) writes:

    Actually, just because Apollo lets users renice things doesn't mean you have to:

    sandstorm [100] su
    Password:
    # chown root //*/etc/renice
    # chmod 700 //*/etc/renice

That doesn't help much unless you also close up /usr/apollo/lib/cc*.

    #include <stdlib.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    /* Set the process whose pid is given as the first argument to nice -20. */
    int main(int ac, char *av[])
    {
        if (ac < 2)
            return 1;
        return setpriority(PRIO_PROCESS, atoi(av[1]), -20) != 0;
    }
dpassage@monsoon.Berkeley.EDU (David G. Paschich) (03/11/91)
In article <5049bf48.1bc5b@pisa.citi.umich.edu> rees@citi.umich.edu (Jim Rees) writes:
>In article <1991Mar10.121848.8362@agate.berkeley.edu>, dpassage@sandstorm.Berkeley.EDU (David G. Paschich) writes:
>
>  Actually, just because Apollo lets users renice things doesn't mean you have
>  to:

[stuff I wrote deleted]

>
>That doesn't help much unless you also close up /usr/apollo/lib/cc*.
>
>#include <stdlib.h>
>#include <sys/time.h>
>#include <sys/resource.h>
>
>/* Set the process whose pid is given as the first argument to nice -20. */
>int main(int ac, char *av[])
>{
>    if (ac < 2)
>        return 1;
>    return setpriority(PRIO_PROCESS, atoi(av[1]), -20) != 0;
>}

Good point. Yet another example of Apollo saying, "it's not a bug, it's a feature!"

David G. Paschich
dpassage@ocf.berkeley.edu
Just say not to huge .sigs!
etb@milton.u.washington.edu (Eric Bushnell) (03/12/91)
Thanks, Steve Valentine, for the helpful explanation.

What I'm planning to do is restrict execution rights on /etc/renice, and then write a wrapper for it that will not allow negative priorities. That won't stop determined hackers, as someone else pointed out, but it's a start. And it's a hint to cpu-greedy users.

What I'd like now is a way to control such processes once they've started. If the priority is high enough, it blocks crp and rlogin, so that I can't renice it after the fact. Are there any low-level calls that could be used to signal (kill) and adjust priorities from another node? It's an odd feature, of course, but if anybody can renice anything, who knows what else is possible?

And thanks, all, for the stimulating discussion.

Eric Bushnell
etb@milton.u.washington.edu
etb@zeus.ce.washington.edu
University of Washington Civil Engineering
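A wrapper of the kind described above might look roughly like the following C sketch. It is an illustration under assumptions, not a tested Domain/OS tool: `REAL_RENICE` is a hypothetical path for the relocated, restricted binary, the function names are made up, and the wrapper assumes the BSD argument order in which the priority is the first argument. The wrapper itself would need whatever privilege the restricted binary requires, and, as noted earlier in the thread, it does nothing against a program that calls setpriority() directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define REAL_RENICE "/etc/renice.real"  /* hypothetical location of the restricted binary */

/*
 * Parse a renice priority argument. Returns 0 and stores the value if it is
 * a plain integer in 0..20; returns -1 for negative, out-of-range, or
 * malformed input.
 */
static int parse_allowed_priority(const char *arg, int *out)
{
    char *end;
    long v;

    assert(arg != NULL && out != NULL);
    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;
    if (v < 0 || v > 20)
        return -1;
    *out = (int)v;
    return 0;
}

/* Body of the wrapper; a real program would call this from main(argc, argv). */
static int run_wrapper(int argc, char **argv)
{
    int prio;

    if (argc < 3 || parse_allowed_priority(argv[1], &prio) != 0) {
        fprintf(stderr, "usage: renice priority(0..20) pid ...\n");
        return 1;
    }
    execv(REAL_RENICE, argv);  /* replaces this process on success */
    perror(REAL_RENICE);       /* reached only if execv failed */
    return 1;
}
```

Rejecting anything that is not a clean non-negative integer, rather than only checking for a leading minus sign, avoids surprises with inputs such as an empty string or trailing junk.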
wjw@ebe.eb.ele.tue.nl (Willem Jan Withagen) (03/13/91)
In article <FRIDMAN.91Mar8141257@aa.cpsc.ucalgary.ca> fridman@cpsc.ucalgary.ca (fridman) writes:
=>In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
=>
=>
=>> An ordinary, unprivileged user wasn't happy
=>> with the priority of his batch job, which he
=>> had started with /usr/bin/nohup. So he used
=>> /etc/renice to change his priority to -20,
=>> the highest priority in BSD unix. Only the
=>> superuser can do this, right? Apparently not.

I complained about this feature to this forum and to HPollo about half a year ago, and indeed their reply was: It's not a bug, it's a feature.

Sadly, that is again the story I am reading now: that Apollos are single user stations. (PCs are a lot cheaper, SUNs are cheaper and faster, .... )

Another saddening 'finding' is that the manual says something which is totally different from reality. Am I wrong to assume that most of the BSD4.3 stuff was just compiled off the tape, and if it didn't work it got hacked until it sort of did?

( I know I'm being a pig to all those hard working engineers at Apollo. I do like Apollos. Only once in a while I'm happy that the windows don't open far enough to fit a system-case.)

sob, sob, sob

Willem Jan Withagen
Eindhoven University of Technology
DomainName: wjw@eb.ele.tue.nl
Digital Systems Group, Room EH 10.10
P.O. 513
Tel: +31-40-473401
5600 MB Eindhoven
The Netherlands
mike@vlsivie.tuwien.ac.at (Michael K. Gschwind) (03/14/91)
In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.

This is the old Apollo comment trying to sell bugs as features...

>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so. When nodes are used in shared environments,

What happens if the user renices disk servers? The registry daemon? The glbd? There are vital services running which need to be available to the net as a whole! (TCP gateways, servers for diskless computers)

>their users must learn that it is socially unacceptable to take advantage of
>some of the features available. If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

There are many simple ways to accomplish renicing other users' processes if this is policy at some site:

1.) You can set the suid bit on /etc/renice.
2.) There could be some configuration file indicating whether every user should be allowed to renice processes.

bye,
mike

Michael K. Gschwind, Dept. of VLSI-Design, Vienna University of Technology
mike@vlsivie.tuwien.ac.at     1-2-3-4 kick the lawsuits out the door
mike@vlsivie.uucp             5-6-7-8 innovate don't litigate
e182202@awituw01.bitnet       9-A-B-C interfaces should be free
Voice: (++43).1.58801 8144    D-E-F-O look and feel has got to go!
Fax: (++43).1.569697
system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (03/20/91)
In article <504080ad.20b6d@apollo.HP.COM> smv@apollo.HP.COM (Steve Valentine) writes:
>In article <18030@milton.u.washington.edu> etb@milton.u.washington.edu (Eric Bushnell) writes:
>>Would someone be so kind as to explain how
>>Domain process priorities work?
>
>You're running into one of those areas where Domain/OS maps a square UNIX peg
>into a round Aegis hole.
>
<... niceness<->Aegis priority mapping deleted>

This table is certainly wrong for nice values of 2 through 20 inclusive, all of which result in an Aegis priority range of 3-16, negating much of the usefulness of nice/renice.

>>An ordinary, unprivileged user wasn't happy
>>with the priority of his batch job, which he
>>had started with /usr/bin/nohup. So he used
>>/etc/renice to change his priority to -20,
>>the highest priority in BSD unix. Only the
>>superuser can do this, right? Apparently not.
>
>It has long been the Apollo position that Domain Nodes are single user machines.
>We try very hard to make the node as useful as possible to its one user.
>An ordinary, unprivileged user in this environment may very well have reason to
>adjust the priority of some process on his or her node, and can only hurt
>themselves by doing so. When nodes are used in shared environments,
>their users must learn that it is socially unacceptable to take advantage of
>some of the features available. If we required root access to renice processes,
>we would be depriving our users of a feature that they have become accustomed to
>and which we feel they can benefit from.

Our nodes were sold to us as multi-user workstations (especially the DN10000); being able to renice any other user's process, including kernel level processes, and being able to raise any process priority is a gross violation of UNIX. Apollo UNIX users should never have been given this "feature" in the first place, which required hacking on the source code.

I agree completely with another posting which said that IBM and Apollo shouldn't be calling their products "UNIX". Gratuitous changes are a pain in the <insert favourite region here>.

--
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094
Fax: (416) 978-8775
thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (03/20/91)
> > An ordinary, unprivileged user in this environment may very well have reason to
> > adjust the priority of some process on his or her node, and can only hurt
> > themselves by doing so. When nodes are used in shared environments,

> What happens if the user renices disk servers? The registry daemon?
> The glbd? There are vital services running which need to be available to
> the net as a whole! (TCP gateways, servers for diskless computers)

Things go real belly up. (I assume it was a rhetorical question, but....)

We've had users raise their priorities through the roof, and drop other people's priorities through the basement. We've had the tcpd time out all connections, because somebody 'needed to run fast', so changed their lower priority to 12 (Aegis-land - Unix equiv is renice -gawdawfullow).

IMHO, HP/Apollo is copping out with their "that's a feature" argument. Aborting processes was a feature, and at sr10 they came up with an extension (`node_data/node_owners). The same thing should hold true for priorities. Each user may have his own system (though that's arguable too), but they can certainly hurt other people with their antics.

> >their users must learn that it is socially unacceptable to take advantage of
> >some of the features available. If we required root access to renice processes,
> >we would be depriving our users of a feature that they have become accustomed to
> >and which we feel they can benefit from.

That's why something like `node_data/node_owners would be good. People who think that shotguns are valuable computing tools should be able to fire at will. Those of us who like to preserve systems should be able to limit the people who can pull the trigger.

-- jt --
John Thompson
Honeywell, SSEC
Plymouth, MN 55441
thompson@pan.ssec.honeywell.com

Me? Represent Honeywell? You've GOT to be kidding!!!
nazgul@alphalpha.com (Kee Hinckley) (03/20/91)
In article <1991Mar20.062300.25980@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>Our nodes were sold to us as multi-user workstations (especially the

They aren't.

>level processes, and to be able to raise any process priority is a gross
>violation of UNIX. Apollo UNIX users should never have been given this
>"feature" in the first place, which required hacking on the source code.

nice required source code hacking just to work. We aren't talking a Unix kernel here. Aegis already had a process priority mechanism that didn't require permissions to use. Unix process priorities had to be mapped onto that without breaking people depending on the previous behavior.

>shouldn't be calling their products "UNIX". Gratuitous changes are a
>pain in the <insert favourite region here>.

There isn't a gratuitous change there. Someone had the following choices:

    1) Be incompatible with older Aegis-based software which depends on being able to change process priorities.
    2) Lull the Unix user into a false sense of security by not allowing Unix commands to change the priority, but still allowing Aegis commands to.
    3) Relax the Unix protections.

#1 doesn't fit in with Apollo's stated standards (sure, they don't always maintain compatibility, but they try). #2 is clearly wrong - people get really pissed when they discover that. #3 fits in with the model (which R&D certainly had, if not marketing) that an Apollo workstation is a single-user workstation.

--
Alfalfa Software, Inc.   | Poste: The EMail for Unix
nazgul@alfalfa.com       | Send Anything... Anywhere
617/646-7703 (voice/fax) | info@alfalfa.com

I'm not sure which upsets me more: that people are so unwilling to accept responsibility for their own actions, or that they are so eager to regulate everyone else's.
rees@pisa.citi.umich.edu (Jim Rees) (03/21/91)
In article <1991Mar20.062300.25980@alchemy.chem.utoronto.ca>, system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
    I agree completely with another posting which said that IBM and Apollo
    shouldn't be calling their products "UNIX". Gratuitous changes are a
    pain in the <insert favourite region here>.

Why not? AT&T still calls their product "Unix," and look at all the things
they broke in system V. And what about Berkeley? Their version of Unix
bears little resemblance to anything that ever came out of Bell Labs. You're
annoyed because Apollo Unix doesn't look like anyone else's, but in fact no
one's Unix looks anything like the original, so who's to say which one is
"real Unix?"
Besides, AT&T (who owns the name "Unix") likes to define Unix as anything
that passes the SVVS. By this measure, Domain/OS passes, Berkeley doesn't.
But I'm not going to go out and buy a computer just because it's SVID
compliant. In fact, that's a mark against it for me. I'm going to buy the
computer that gets the job done the way I want it done.
thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (03/23/91)
<<forwarded message>>
> >Our nodes were sold to us as multi-user workstations (especially the
> They aren't.

But if they were _sold_ as such, HP/Apollo at least has an obligation to support them this way. Also, I'd have a hard time considering the dn10000 (that now-obsolete machine) a single-user system.

> >shouldn't be calling their products "UNIX". Gratuitous changes are a
> >pain in the <insert favourite region here>.
> There isn't a gratuitous change there. Someone had the following choices:
>
> 1) Be incompatible with older Aegis-based software which depends
>    on being able to change process priorities.
> 2) Lull the Unix user into a false sense of security by not allowing
>    Unix commands to change the priority, but still allowing Aegis
>    commands to.
> 3) Relax the Unix protections.
> #1 doesn't fit in with Apollo's stated standards (sure, they don't always
> maintain compatibility, but they try). #2 is clearly wrong - people get
> really pissed when they discover that. #3 fits in with the model (which
> R&D certainly had, if not marketing) that an Apollo workstation is a single-
> user workstation.

You missed an option.

    4) Keep Unix protections tight, and create a permissions file for the Aegis users.

#4 was done for the sigp/kill pair! In Unix, you still need to be the owner of a process (or root) to kill it. In Aegis, you need to be the owner of a process, _OR_ be listed in the `node_data/node_owners ACL.

Now, that might leave them (HP/Apollo) a little open, but if they (A) put a comment in the release notes and/or (B) installed a locked-up ppri_owners file if Aegis wasn't loaded on a system, they should be reasonably safe. They'd also have fewer problems, IMHO.

-- jt --
John Thompson
Honeywell, SSEC
Plymouth, MN 55441
thompson@pan.ssec.honeywell.com

Me? Represent Honeywell? You've GOT to be kidding!!!
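Option 4 above amounts to a small permission check. The C sketch below shows one way such a rule could be expressed; it is purely hypothetical. No ppri_owners mechanism is known to exist, the in-memory owners list stands in for whatever file or ACL a real system would consult, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `user` appears in the owners list, else 0. */
static int listed(const char *user, const char *const *owners, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(user, owners[i]) == 0)
            return 1;
    return 0;
}

/*
 * Hypothetical policy: root and listed owners may change any priority.
 * Everyone else may only lower the priority of processes they own,
 * which matches the rule quoted from the renice man page.
 */
static int may_change_priority(const char *user, const char *proc_owner,
                               int raising,
                               const char *const *owners, size_t n)
{
    assert(user != NULL && proc_owner != NULL);
    if (strcmp(user, "root") == 0 || listed(user, owners, n))
        return 1;
    return !raising && strcmp(user, proc_owner) == 0;
}
```

A site that wants the old single-user behaviour would list its users as owners, while a shared cluster would leave the list nearly empty, so both camps in this thread could be served by the same mechanism.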