[comp.unix.wizards] Changing the nice

mikej@lilink.UUCP (Michael R. Johnston) (01/20/89)

Today on a whim I did something on my system which quite interestingly
enough improved performance measurably. Perhaps someone can explain in
technical terms why this works. 

One of the systems I administer at work is an Altos 386/2000 running SVR3
Unix. This system is a 64 port box. Typically, we have about 30 to 35
users on simultaneously running our database applications software.

Before I made my change today, the system usually became just a tad sluggish
after about 25 to 30 users were online. Not enough to complain about but just
enough so the cursor would sometimes hesitate for a second or two. The system
doesn't swap since we have 16 meg of ram and sar reports several hundred
free blocks even on the heaviest of days.

When I ran sar I usually came out with an "idle%" figure of less than 10 
during mid-day loads. What I did was to change the invocation of the
program from:

program

to:

nice -10 program.

By golly, suddenly the cursor stopped hesitating and when I re-ran sar I
got figures that were in the 50 to 60 percent range. After some playing with
the nice() value I was able to get a final result in the range of 60 to 70
percent most of the time. I was ecstatic! (as only a sysadmin can be...)

I don't know the technical reasons behind this but I can surmise the following.
By lowering the nice() value across the board equally we devote less time
to idle processes which are probably in the majority at any given moment. 
Remember that this is an online database that is only queried on demand.
Having done this, it then has more time to service the important requests 
for work. Am I completely off base?

Now that I have discovered this I have had another thought. Could I 
automatically alter the nice() value of one of these running processes based 
upon the time it has been sitting idle via a daemon? I.E. could I check the
idle time for the tty every minute or so and adjust nice() for it lower and
lower until it finally came back into use? This sounds like the perfect
performance tuning tool. It could also have a preset threshold that would just
kill the user altogether once idle time had been exceeded.
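
A minimal sketch of the core of such a daemon, assuming idle time can be
read from the last-access time of the tty device and that setpriority(2)
is available (a 1989 SVR3 system may not have it). The five-minute step,
the cap of 19, and the names tty_idle_seconds, nice_for_idle and
renice_pid are all hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/resource.h>
#include <time.h>

/* Hypothetical policy: one extra nice step per 5 idle minutes, capped at 19. */
#define IDLE_STEP_SECS 300
#define NICE_CAP 19

/* Seconds since the tty was last accessed, or -1 if stat() fails. */
long tty_idle_seconds(const char *tty_path)
{
    struct stat st;
    time_t now = time(NULL);

    if (stat(tty_path, &st) != 0)
        return -1;
    if (st.st_atime > now)      /* clock skew: treat as just used */
        return 0;
    return (long)(now - st.st_atime);
}

/* Map idle seconds to a nice value: 0 when active, rising with idle time. */
int nice_for_idle(long idle_secs)
{
    long n;

    if (idle_secs < 0)
        return 0;
    n = idle_secs / IDLE_STEP_SECS;
    return n > NICE_CAP ? NICE_CAP : (int)n;
}

/* Set the absolute nice value of one process; returns 0 on success. */
int renice_pid(pid_t pid, int nice_value)
{
    return setpriority(PRIO_PROCESS, (id_t)pid, nice_value);
}
```

A daemon would call tty_idle_seconds() for each logged-in tty once a
minute and pass the result through nice_for_idle() to renice_pid().
Dropping the value again when the user returns needs superuser
privilege, so the daemon would have to run as root.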

Anyone?
---
Michael R. Johnston
System Administrator   {..!philabs!mergvax! | ..!uunet!ispi!} lilink!mikej
LILINK - Public Access Xenix   (516) 872-2137  1200/2400 8N1  Login: new

dlm@cuuxb.ATT.COM (Dennis L. Mumaugh) (01/22/89)

In article <363@lilink.UUCP> mikej@lilink.UUCP (Michael R.
Johnston) writes: about using nice to change process priorities
and make things work.

In a job a long time ago in a place far far away ....

I set the nice value of login to be 2:  I.e. just before login
does the setuid stuff it does nice(2).

Hence the default level of all users was 2.  Then the cc system
and nroff/troff had a nice of 3 set (actually one should nice to
max(3, current nice)).  And all editors had a nice of 1.  Ps ran
with a nice of -10.
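
The max(3, current nice) rule can be sketched roughly as below, assuming
getpriority(2) is available to read the current value, since nice(2)
itself only takes an increment. The names nice_increment_to and
nice_at_least are hypothetical.

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/* Increment that takes `current` up to at least `floor_value`; never negative. */
int nice_increment_to(int current, int floor_value)
{
    return current < floor_value ? floor_value - current : 0;
}

/*
 * Make sure the calling process runs at nice >= floor_value without
 * ever making an already nicer process less nice.
 * Returns 0 on success, -1 on error.
 */
int nice_at_least(int floor_value)
{
    int current, incr;

    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)    /* -1 is also a legal nice value */
        return -1;

    incr = nice_increment_to(current, floor_value);
    if (incr > 0) {
        errno = 0;
        if (nice(incr) == -1 && errno != 0)
            return -1;
    }
    return 0;
}
```

A compiler driver or formatter would call nice_at_least(3) once at
startup; a job that a user had already niced to 10 is left alone.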

The result was that interactive users got good response and
nobody suffered.  Even with the new UNIX schedulers this scheme
ought to be adopted by all.
-- 
=Dennis L. Mumaugh
 Lisle, IL       ...!{att,lll-crg}!cuuxb!dlm  OR cuuxb!dlm@arpa.att.com

debra@alice.UUCP (Paul De Bra) (01/23/89)

In article <363@lilink.UUCP> mikej@lilink.UUCP (Michael R. Johnston) writes:
}Today on a whim I did something on my system which quite interestingly
}enough improved performance measurably. Perhaps someone can explain in
}technical terms why this works. 
}...
}Before I made my change today, the system usually became just a tad sluggish
}after about 25 to 30 users were online. Not enough to complain about but just
}enough so the cursor would sometimes hesitate for a second or two. The system
}doesn't swap since we have 16 meg of ram and sar reports several hundred
}free blocks even on the heaviest of days.
}
}When I ran sar I usually came out with an "idle%" figure of less than 10 
}during mid-day loads. What I did was to change the invocation of the
}program from:
}
}program
}
}to:
}
}nice -10 program.
}
}By golly, suddenly the cursor stopped hesitating and when I re-ran sar I
}got figures that were in the 50 to 60 percent range. After some playing with
}the nice() value I was able to get a final result in the range of 60 to 70
}percent most of the time. I was ecstatic! (as only a sysadmin can be...)
}
}I don't know the technical reasons behind this but I can surmise the following.
}By lowering the nice() value across the board equally we devote less time
}to idle processes which are probably in the majority at any given moment. 
}Remember that this is an online database that is only queried on demand.
}Having done this, it then has more time to service the important requests 
}for work. Am I completely off base?

Yes.

If all processes run at the same lowered priority the scheduling should
remain the same. What you did however had 2 effects:
1) The database program runs at a lower priority than your user interface
   which means that the user interface will respond better. This explains
   why the cursor is no longer hesitating.
2) It looks like you are running the program from a shell script. The
   "nice" command is not built in (in the bourne shell), which means you
   are now invoking twice as many processes, which aren't doing much, but
   the nice process sits waiting for the database program to finish. The
   system may start thrashing more, resulting in more idle time because
   the system is waiting for page faults. On a heavily loaded system like
   yours the cpu-usage should be close to 100%, and the higher the better.
   Lower values mean your system is thrashing.

To find out whether your apparent improvement is real you should measure
the time needed to solve queries. Cursor response is deceiving.

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

dwc@homxc.ATT.COM (Malaclypse the Elder) (01/24/89)

In article <8818@alice.UUCP>, debra@alice.UUCP (Paul De Bra) writes:
> 
> To find out whether your apparent improvement is real you should measure
> the time needed to solve queries. Cursor response is deceiving.
> 
it is arguable that cursor response is the most important
measure since it is EXPECTED to be almost instantaneous while
database queries are EXPECTED to take a while.  but i agree
that the original poster is probably penalizing the database
backend process in favor of the interactive front end processes.

the problem will come if/when the cpu requirements of the
terminal processes ALONE start to saturate the cpu.  the
nice value will have a tendency to lock out the database process.
THAT IS, THE EFFECT OF NICE IS STATE DEPENDENT.  on lightly
loaded systems, the recent cpu used will dominate in differentiating
between processes.  on heavily loaded systems, the nice value
will tend to dominate (because of the fast rate at which the recent
cpu measure is decayed).
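
a toy model of that state dependence, assuming the textbook System V
style recalculation (priority = PUSER + cpu/2 + nice - NZERO, with the
recent-cpu count halved once a second).  the constants and the 100 tick
clock are assumptions, not values read from any particular kernel, and
the function names are hypothetical.

```c
#define PUSER 60    /* assumed base user priority */
#define NZERO 20    /* assumed default nice value */

/* Once-a-second decay of the recent-cpu count. */
int decay_cpu(int cpu)
{
    return cpu / 2;
}

/* Scheduling priority; a HIGHER number means the process runs LATER. */
int user_priority(int cpu, int nice_value)
{
    return PUSER + cpu / 2 + nice_value - NZERO;
}

/*
 * Recent-cpu count a process settles at if it is charged
 * ticks_per_second clock ticks every second and decayed each second.
 */
int steady_cpu(int ticks_per_second)
{
    int cpu = 0;
    int i;

    for (i = 0; i < 20; i++)
        cpu = decay_cpu(cpu + ticks_per_second);
    return cpu;
}
```

in this model one process with the cpu to itself (100 ticks a second)
carries a cpu term of about 49, which swamps a nice increment of 10.
with thirty processes sharing the cpu at about 3 ticks each, the cpu
term shrinks to 1 and the same nice increment decides the ordering.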

danny chen
att!homxc!dwc

willcox@urbana.mcd.mot.com (01/26/89)

>2) It looks like you are running the program from a shell script. The
>   "nice" command is not built in (in the bourne shell), which means you
>   are now invoking twice as many processes, which aren't doing much, but
>   the nice process sits waiting for the database program to finish. The
>   system may start thrashing more, resulting in more idle time because
>   the system is waiting for page faults. On a heavily loaded system like
>   yours the cpu-usage should be close to 100%, and the higher the better.
>   Lower values mean your system is thrashing.

Might I correct a couple of misconceptions here? 

 a) Using the nice command does NOT require any more processes.  "nice"
    does not run your command as a subprocess, but rather execs your
    command.  Leaving out many details, it does essentially:

	nice (nice_value);		/* adjust our own priority first */
	execvp (argv[0], argv);		/* argv assumed already shifted past nice's own arguments */

 b) In any reasonable system, a process that's been sitting idle will
    get swapped or paged out, leaving the memory available for active
    processes.  Idle processes won't contribute to thrashing, and in
    particular won't increase the number of page faults significantly.
    (If you were constantly creating extra processes, that would
    increase overhead, but that didn't appear to be the situation here.)

 c) In general, I would agree that higher CPU usage is better.  However,
    that's only true if the CPU is spending the time doing USEFUL
    work.  If you can reduce overhead and thereby increase idle time,
    that's good.  In this case, they apparently are getting the same
    amount of work done, are getting better interactive response, and
    have more idle time.  How could you argue that this is bad?  I'd
    guess that by adjusting priorities the number of process switches
    has been reduced, eliminating that unnecessary overhead.

David A. Willcox
Motorola Urbana Design Center
1101 E. University Ave.
Urbana, IL 61801
217-384-8500
UUCP: ...!uiucuxc!mcdurb!willcox
ARPA:	willcox@xenurus.Gould.COM (temporary mail address)

leo@philmds.UUCP (Leo de Wit) (01/26/89)

In article <8818@alice.UUCP> debra@alice.UUCP () writes:
    []
|If all processes run at the same lowered priority the scheduling should
|remain the same. What you did however had 2 effects:
|1) The database program runs at a lower priority than your user interface
|   which means that the user interface will respond better. This explains
|   why the cursor is no longer hesitating.
|2) It looks like you are running the program from a shell script. The
|   "nice" command is not built in (in the bourne shell), which means you
|   are now invoking twice as many processes, which aren't doing much, but
|   the nice process sits waiting for the database program to finish.

If nice was implemented to wait for a child to terminate, this would be
the case. But this wouldn't be a nice implementation of nice (8-), since
all it has to do is lower the priority, then execve() its own (shifted)
arguments. On Ultrix:

$ nice -10 ps gx
  PID TT STAT  TIME COMMAND
 4766 ia TW    0:43 rn
 5914 ia R N   0:01 ps gx
17932 ia S     0:18 esh
$

No sign of a waiting nice process...

	 Leo.

aglew@urbana.mcd.mot.com (01/28/89)

>To find out whether your apparent improvement is real you should measure
>the time needed to solve queries. Cursor response is deceiving.
>
>Paul.

Cursor response may be deceiving, but it sure can be satisfying.
May reduce user frustration level, even if queries take a bit
longer.

On a slightly different level, at last year's SIGMETRICS,
Wisconsin's DeWitt explained how he was speeding up complex
queries. Somebody from the audience said "But complex queries
are only a minuscule part of your workload". DeWitt responded
that simple queries didn't need to be sped up any more...

I'm oversimplifying, but the point is that frequency of use
doesn't necessarily correspond to importance (and I'm sure that
somebody can turn this around wrt. cursors and queries).