[pe.cust.sources] uucp speedup, part 1 - kernel changes

srd@peora.UUCP (Steve Davies) (04/15/85)

When uucp is receiving files, it takes up a goodly number of the
available cpu cycles.  The reasons for this are discussed in
<342@down.FUN>.  The first part of that article is presented here:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

> Under certain circumstances, you may find that when 2 or 3 uucicos
> are running on your system, they are eating up all the CPU time,
> and system performance suffers horribly.  If this is your problem,
> you can do a "vmstat 5" and watch the system calls and context switches
> counters.  If they are both very high whenever 2 or more uucicos
> are running (100-200 system calls/second, over 100 context switches),
> chances are that the problem is as follows:

> When another system is sending you a file, your uucico reads characters
> from the line.  The read returns whatever is there waiting, or if
> nothing is waiting, waits for one character and returns.  Since uucico
> usually wants 64 characters at a time, at 1200 baud it's quite common
> to read these in 1 or 2 character pieces.  Each uucico will read 1 or
> 2 characters, wake up the user process, go back for more, there won't
> be any, so it hangs and gives up the CPU.  A very short time later,
> (often within the same clock tick) there will be a character available,
> the process will wake up, read one character, and try again.
> 
> This modification is very simple.  If the first read returned fewer
> characters than requested, before doing another read, the process
> will sleep for one second.  Then, when it wakes up, there will probably
> be as many characters waiting as it needs.

> This modification makes a big difference when you are RECEIVING a file
> from another system.  It won't make much difference when you are
> SENDING a file, because the user process doesn't usually have to hang
> to write to the line, and when it does, the high/low water mark
> mechanism in the tty driver keeps it from waking up too often.
> This change is intended for a V7 or 4BSD system.  It may not
> help much on System V, because uucp uses a USG tty driver feature
> to make it wake up only every 6 characters.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

A version of this modification suitable for Perkin-Elmer Edition7 2.4
systems follows.  It requires a kernel change to add a nap() call to
allow a high resolution sleep, and some simple changes to two uucp source
files.  This article gives the kernel changes; the next, the uucp changes.
This modification really does work: uucp no longer hogs all the cpu time,
and the throughput rate even seems to have increased.
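
To make the idea concrete, here is a minimal sketch of the receive loop
the quoted article describes: read, and after a short read pause once
before reading again.  It is not the actual uucp change (that is in the
next article).  The names read_with_pause, pause_fn, demo_pause, and
pauses_taken are made up for illustration, and the pause is passed in as
a hook so the sketch runs anywhere; on the system described here the hook
would call nap().

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical pause hook; on Edition7 it would wrap the nap() call. */
typedef void (*pause_fn)(void);

/* Example hook for trying the sketch out: it only counts the pauses. */
int pauses_taken = 0;

void demo_pause(void)
{
	pauses_taken++;
}

/*
 * Read exactly `want` bytes from fd.  After a short read, pause once
 * before reading again so that more characters can pile up in the
 * input queue.  Returns the byte count, or -1 on error or end of file.
 */
int read_with_pause(int fd, char *buf, int want, pause_fn pause_hook)
{
	int got = 0;

	while (got < want) {
		int n = (int)read(fd, buf + got, (size_t)(want - got));

		if (n <= 0)
			return -1;
		got += n;
		if (got < want)
			pause_hook();	/* short read: let the line fill up */
	}
	return got;
}
```

A caller wanting a 64-character packet would call
read_with_pause(fd, buf, 64, hook).  If the first read already returns
all 64 characters, the hook is never called.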

Steve Davies {decvax!ucf-cs  |  ihnp4!pesnta  |  vax135!petsd}!peora!srd
Perkin-Elmer SDC/2486 Sand Lake Road/Orlando, Fl. 32809/	(305)850-1033


These are the steps to add the nap system call:

1.  add the nap library interface routine to libc.a
2.  change to directory /usr/src/sys/sys
3.  add the file nap.c to the /usr/src/sys/sys directory
4.  modify file sysent.c to add system call number 58.
5.  update the makefile
6.  make
7.  move file LIB1 to /usr/sys/sys
8.  change to directory /usr/sys/conf and remake /edition7
    (possibly, NCALL in para.c might need to be increased;
     refer to page 8 of Regenerating System Software in Vol 2)
9.  (optional)  There is a #define in /usr/include/sys/tty.h named TTYHOG.
    It is defined as 256.  The tty driver will throw away all the characters
    that have been input on a tty line if more than TTYHOG characters have
    been received and no process has read them.  Also, after TTYHOG/2
    characters have been received, a STOP character is sent out on the tty
    line.  Since uucico is now going to be napping and not continually trying
    to read from the tty line, it seems that this situation could occur
    frequently.  Anyway, increasing TTYHOG to something large (we made it
    2048) may be advisable.  It's possible that if this is done, the number of
    CLISTS may need to be increased (Regenerating System Software has the
    details; we didn't find it necessary).

    BTW, increasing TTYHOG did help when using cu to call other computers. At
    1200 baud, it takes only about a second to receive 256/2 characters.  When
    TTYHOG was set to 256, there were a lot of STOP and START characters being
    sent to the other computer.  This would cause output to be quite irregular
    when the system was loaded.
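
The TTYHOG rule and the timing in step 9 can be checked with a little
arithmetic.  The sketch below is a simplified model, not the driver code.
It assumes 10 bits per character on the line (start bit, 8 data bits,
stop bit) and that the thresholds are strict "greater than" tests.  The
names tty_input_action and fill_time_ms are made up for illustration.

```c
enum tty_action { TTY_ACCEPT, TTY_SEND_STOP, TTY_FLUSH };

/*
 * Simplified model of the input-queue rule from step 9: `queued` is the
 * number of unread characters, `ttyhog` the TTYHOG value in force.
 * The exact boundary tests are an assumption.
 */
enum tty_action tty_input_action(int queued, int ttyhog)
{
	if (queued > ttyhog)
		return TTY_FLUSH;	/* everything queued is thrown away */
	if (queued > ttyhog / 2)
		return TTY_SEND_STOP;	/* ask the other end to pause */
	return TTY_ACCEPT;
}

/*
 * Milliseconds for nchars characters to arrive at the given baud rate,
 * assuming 10 bits per character.
 */
long fill_time_ms(long baud, long nchars)
{
	return nchars * 10L * 1000L / baud;
}
```

With these assumptions, fill_time_ms(1200, 128) comes to a little over
one second, and fill_time_ms(1200, 1024) to about eight and a half
seconds, which is the headroom gained by raising TTYHOG to 2048.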

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
assemble this routine and add it to libc.a
	as -o nap.o nap.s
	ar r libc.a nap.o
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
nap	title	unix c svc library -- nap

.nap	equ	58

	entry	nap
r0	equ	0
r1	equ	1
r2	equ	2
rf	equ	15
sp	equ	7
*
	pure
nap	equ	*
	l	r0,0(sp)
	bnpr	15
	svc	0,.nap
	br	rf
	align	2*adc
	end

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
	This is file /usr/src/sys/sys/nap.c
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
static char sccsid[] = "@(#)nap.c	1.0";

#include "sys/param.h"
#include "sys/systm.h"
#include "sys/buf.h"
#include "sys/filsys.h"
#include "sys/mount.h"
#include "sys/dir.h"
#include "sys/proc.h"
#include "sys/file.h"
#include "sys/user.h"

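#define MAX_NAPPING (15)	/* most processes allowed to nap at once */
#define PNAP (PZERO+1)		/* sleep priority for a napping process */
static int number_napping=0;	/* processes currently napping */

/* called through timeout() when a nap expires */
napwakeup(arg) caddr_t arg; {
	wakeup(arg);
	number_napping--;
}

nap() { /* #58 */

	register struct a {
		int naptime;
	} *uap;

	uap = (struct a *)u.u_ap;
	spl6();		/* raise priority while the count and timeout are set up */
	if (number_napping < MAX_NAPPING) {
		number_napping++;
		timeout(napwakeup, u.u_procp, uap->naptime);
		sleep(u.u_procp, PNAP);
	} else
		u.u_error = EINVAL;	/* too many processes napping already */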

	spl0();
}
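
Two things a caller of nap() may want to keep in mind, going by the code
above: the argument is handed straight to timeout(), which on V7 counts
clock ticks, and the call fails with EINVAL once MAX_NAPPING processes
are already napping.  The user-space sketch below models both.  It is
not kernel code; nap_enter, nap_leave, and ms_to_ticks are hypothetical
names, and the tick rate is a parameter because this article does not
state it.

```c
#include <errno.h>

#define MAX_NAPPING 15		/* same limit as in nap.c */

static int number_napping = 0;

/* Model of the admission check: 0 if a nap may start, else EINVAL. */
int nap_enter(void)
{
	if (number_napping < MAX_NAPPING) {
		number_napping++;
		return 0;
	}
	return EINVAL;
}

/* Model of napwakeup(): the nap is over, so release the slot. */
void nap_leave(void)
{
	number_napping--;
}

/*
 * Convert milliseconds to clock ticks, rounding up.  `hz` is the tick
 * rate of the machine, an assumption the caller must supply.
 */
long ms_to_ticks(long ms, long hz)
{
	return (ms * hz + 999) / 1000;
}
```

Since nap() can fail when all the slots are taken, a caller would do
well to check for the error and fall back to an ordinary sleep(1).
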
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
	this is a diff of /usr/src/sys/sys/sysent.c
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

*** sysent.c.v7	Tue Sep 27 13:33:06 1983
--- sysent.c	Fri Mar  8 12:53:34 1985
***************
*** 78,83
  #ifdef SVC6
  int	getswit();
  #endif
  
  struct sysent sysent[64] =
  {

--- 76,86 -----
  #ifdef SVC6
  int	getswit();
  #endif
+ int	nap();
  
  struct sysent sysent[64] =
  {
***************
*** 146,152
  	3, 0, ioctl,			/* 54 = ioctl */
  	0, 0, nosys,			/* 55 = readwrite (in abeyance) */
  	4, 0, mpxchan,			/* 56 = creat mpx comm channel */
  	0, 0, nosys,			/* 57 = reserved for USG */
! 	0, 0, nosys,			/* 58 = reserved for USG */
  	3, 0, exece,			/* 59 = exece */
  	1, 0, umask,			/* 60 = umask */

--- 149,155 -----
  	3, 0, ioctl,			/* 54 = ioctl */
  	0, 0, nosys,			/* 55 = readwrite (in abeyance) */
  	4, 0, mpxchan,			/* 56 = creat mpx comm channel */
  	0, 0, nosys,			/* 57 = reserved for USG */
! 	1, 1, nap,			/* 58 = nap (high resolution sleep) */
  	3, 0, exece,			/* 59 = exece */
  	1, 0, umask,			/* 60 = umask */

***************
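
For anyone unfamiliar with sysent.c, the table is indexed by system call
number, and each entry names the handler for that call.  The sketch below
is a stand-alone model of that lookup, not the kernel's dispatch code.
All the names in it are hypothetical, and the meaning given to the middle
field (arguments passed in registers) is assumed from the V7 convention.

```c
#define NSYSENT 64		/* table size, as in sysent[64] */
#define SYS_NAP 58		/* slot the diff assigns to nap */

struct sysent_model {
	int narg;			/* total number of arguments */
	int nrarg;			/* assumed: arguments passed in registers */
	int (*call)(const int *);	/* handler */
};

/* Stand-in for nosys: an unassigned slot just fails. */
static int nosys_model(const int *args)
{
	(void)args;
	return -1;
}

/* Stand-in for nap: reports the tick count it was asked to nap for. */
static int nap_model(const int *args)
{
	return args[0];
}

static struct sysent_model table[NSYSENT];

/* Fill every slot with nosys, then install nap in slot 58. */
void init_table(void)
{
	int i;

	for (i = 0; i < NSYSENT; i++) {
		table[i].narg = 0;
		table[i].nrarg = 0;
		table[i].call = nosys_model;
	}
	table[SYS_NAP].narg = 1;
	table[SYS_NAP].nrarg = 1;
	table[SYS_NAP].call = nap_model;
}

/* Look the call number up in the table and run its handler. */
int dispatch(int num, const int *args)
{
	if (num < 0 || num >= NSYSENT)
		return -1;
	return table[num].call(args);
}
```

After init_table(), dispatching call 58 reaches the nap stand-in, while
slot 57, still "reserved for USG", fails the way nosys would.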

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
	diff of /usr/src/sys/sys/makefile
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

*** makefile.v7	Wed Sep  7 12:33:14 1983
--- makefile	Tue Nov 13 23:33:45 1984
***************
*** 4,10
  CFILES  = \
  	acct.c alloc.c clock.c fakemx.c fio.c iget.c machdep.c main.c \
  	malloc.c nami.c pipe.c prf.c prim.c rdwri.c sig.c slp.c subr.c \
!         sys1.c sys2.c sys3.c sys4.c sysent.c text.c trap.c ureg.c lockf.c
  # OFILES must be in order
  OFILES  = \
  	main.o trap.o sig.o iget.o prf.o slp.o subr.o rdwri.o clock.o fio.o \

--- 4,11 -----
  CFILES  = \
  	acct.c alloc.c clock.c fakemx.c fio.c iget.c machdep.c main.c \
  	malloc.c nami.c pipe.c prf.c prim.c rdwri.c sig.c slp.c subr.c \
!         sys1.c sys2.c sys3.c sys4.c sysent.c text.c trap.c ureg.c lockf.c \
! 	nap.c
  # OFILES must be in order
  OFILES  = \
  	main.o trap.o sig.o iget.o prf.o slp.o subr.o rdwri.o clock.o fio.o \
***************
*** 9,15
  OFILES  = \
  	main.o trap.o sig.o iget.o prf.o slp.o subr.o rdwri.o clock.o fio.o \
  	malloc.o alloc.o machdep.o nami.o pipe.o prim.o fakemx.o sysent.o \
!         sys3.o sys1.o sys4.o sys2.o acct.o text.o ureg.o lockf.o
  
  LIB1: $(CFILES)
  	$(CC) $(CFLAGS) $?

--- 10,17 -----
  OFILES  = \
  	main.o trap.o sig.o iget.o prf.o slp.o subr.o rdwri.o clock.o fio.o \
  	malloc.o alloc.o machdep.o nami.o pipe.o prim.o fakemx.o sysent.o \
!         sys3.o sys1.o sys4.o sys2.o acct.o text.o ureg.o lockf.o \
! 	nap.o
  
  LIB1: $(CFILES)
  	$(CC) $(CFLAGS) $?
***************

dave@lsuc.UUCP (David Sherman) (04/16/85)

There's another way to implement nap which also works;
that is to implement a new device to which you ioctl to
get the appropriate number of naps. I put it in our
11/23 v7 system two years ago after getting it off the
net, and transferred it to our 3220 last fall. If anyone
wants it, let me know. All it requires is a cdevsw entry,
a /usr/sys/dev/nap.c, and a mknod of /dev/nap. A /usr/src/libc/gen/nap.c
routine calls the ioctl for you as nap(3), or you can do it
by hand.
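
As a rough idea of what the library side of the device approach might
look like, here is a sketch of a wrapper that opens the device and issues
the ioctl.  It is not the code mentioned above.  The request code
NAPIOC_NAP, the argument layout, and the name nap_via_device are all
hypothetical; only the /dev/nap name comes from the text.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define NAP_DEVICE "/dev/nap"	/* device name from the text */
#define NAPIOC_NAP 1		/* hypothetical ioctl request code */

/*
 * Open the nap device at `path`, ask it to nap for `count` units, and
 * close it again.  Returns 0 on success, -1 if the device cannot be
 * opened or the ioctl fails.
 */
int nap_via_device(const char *path, int count)
{
	int fd, rc;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	rc = ioctl(fd, NAPIOC_NAP, &count);
	close(fd);
	return rc < 0 ? -1 : 0;
}
```

A nap(3) routine along these lines would simply call
nap_via_device(NAP_DEVICE, count) and hand back the result.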

Since the system-call approach has already been posted, you may
want to use that one instead. I don't think there's much technical
reason to prefer either one, though the code for the one I use is
all in C, so it is more portable than the posted assembler source.

Dave Sherman
-- 
{utzoo pesnta nrcaero utcs hcr}!lsuc!dave
{allegra decvax ihnp4 linus}!utcsri!lsuc!dave