[net.bugs.2bsd] 2.9 BSD uucico slowness

scw@cepu.UUCP (Stephen C. Woods) (09/28/84)

There seems to be a major problem with the 2.9BSD implementation of uucico.
On 1200 baud lines talking to 2 different 2.9BSD systems, we're only getting:

With ucla-an (09/27-15:25) total data 294648 bytes 12924 secs = 22.80 CPS
With ucla-an (09/27-15:25) received data 0 bytes 0 secs 
With ucla-an (09/27-15:25) sent data 294648 bytes 12924 secs = 22.80 CPS

With ucla-ci (09/27-15:25) total data 1874341 bytes 85608 secs = 21.89 CPS
With ucla-ci (09/27-15:25) received data 4422 bytes 279 secs = 15.85 CPS
With ucla-ci (09/27-15:25) sent data 1869919 bytes 85329 secs = 21.91 CPS

Typically with other systems (V7, sysIII, 4.xBSD, and Locus) that I
talk to at 1200 baud, the gross transfer rate is ~100 CPS.  Even the V7
system that I talk to at 300 baud is faster (~27 CPS).
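(For scale: at 10 bits per character, 1200 baud tops out at 120 CPS,
so ~22 CPS is under a fifth of the line rate; ~27 CPS at 300 baud is
about 90% of the 30 CPS ceiling.)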

Has anyone else seen this problem?  If so, are there any fixes available?
Please respond directly to me; I'll post a summary (and any fixes that we
discover) to the net.
Thanks in advance.
-- 
Stephen C. Woods (VA Wadsworth Med Ctr./UCLA Dept. of Neurology)
uucp:	{ {ihnp4, uiucdcs}!bradley, hao, trwrb, sdcrdcf}!cepu!scw
ARPA: cepu!scw@ucla-cs location: N 34 3' 9.1" W 118 27' 4.3"

jerry@oliveb.UUCP (Jerry Aguirre) (10/02/84)

I ran into the same problem of uucico slowness when I installed Unix
2.9BSD.  Comparing the sources to the version we were running before,
I came to the conclusion that the problem was the sleep(1) call in
the file pk1.c:
	for (nchars = 0; nchars < n; nchars += ret) {
		if (nchars > 0)
			sleep(1);	/* the culprit: a full second before every read after the first */
		ret = read(fn, b, n - nchars);
		if (ret == 0) {		/* EOF: the line dropped */
			alarm(0);	/* cancel the read timeout */
			return(-1);
		}
		PKASSERT(ret > 0, "PKCGET READ", "", ret);
		b += ret;
	}
Apparently this call was added to improve efficiency.  The idea is that
if it takes more than one read to receive a packet then you are wasting
time and should sleep to allow some other process to run.  Meanwhile
the line will buffer up characters so you can read a whole packet in
one read instead of a single character.

This sounds good until you analyze what is really happening!  The
condition of receiving only a few (often just one) characters per read
only happens when your system is lightly loaded, so you are trying to
improve response time when it is good and not doing a damn thing when
it is bad.  The upshot of this is that during peak load your uucp will
run at full speed, and during the wee small hours of the night it will
limp along.  Also, the faster the line speed, the slower the data
transfers.

This sleep is probably ok at 300 baud (Yech! primitive) where it would
indeed cut down on the overhead.  I noticed little degradation on my
system at 1200 baud.  But on our internal 9600 baud links the result
was terrible!  Monitoring the line showed:

    blink(packet sent) [delay 1 second] blink(ack sent) [delay 1 second]

The result is a packet sent once every 2 seconds.
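Assuming the usual 64-byte g-protocol data segment, that works out to
roughly 32 CPS on a line capable of carrying 960.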

There is a modification to this code that checks the line speed in
order to determine whether to do a sleep.  The test is whether the
speed is less than 4800 baud, so I doubt it will do anything for you.
The better fix is to "nap" for a period less than a second.  This
requires the "nap" system call, which is nonstandard.  Another
alternative is to compile the pk(2) routines into the kernel.
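
For what it's worth, here is an untested sketch of what a sub-second
nap version might look like.  select() with a timeout stands in for
the nonstandard nap() call; select() is a 4.2BSD facility, so treat
this as an illustration rather than a drop-in 2.9 fix (the function
name and the 100 ms figure are mine):

	#include <sys/select.h>
	#include <sys/time.h>
	#include <unistd.h>

	/*
	 * Untested sketch: fill b with n characters, napping briefly
	 * only after a short read so more input can accumulate.
	 */
	int
	pkcget_nap(int fn, char *b, int n)
	{
		int nchars, ret;
		struct timeval tv;

		for (nchars = 0; nchars < n; nchars += ret) {
			ret = read(fn, b, n - nchars);
			if (ret <= 0)
				return(-1);	/* EOF or error */
			b += ret;
			if (nchars + ret < n) {	/* short read */
				tv.tv_sec = 0;
				tv.tv_usec = 100000;	/* nap ~100 ms */
				(void) select(0, (fd_set *)0,
				    (fd_set *)0, (fd_set *)0, &tv);
			}
		}
		return(0);
	}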

I would suggest that you just comment out the sleep or use your 2.8
uucp source.  In comparing our version to the 2.9 I decided that our
"old" source was more up to date.

scw@cepu.UUCP (10/03/84)

I'm reposting this, as it appears that it didn't get out of my system 
(or even into my system).

Problem	    2.9BSD uucico is VERY slow.
	    No, let me rephrase that: 2.9BSD uucico is *UNBELIEVABLY* slow.

Repeat-by   tail -20 /usr/spool/uucp/SYSLOG, divide the number of bytes
	    transferred by the time it took.  Typical rates for a 1200
	    baud line will be ~22 bytes/sec.

Fix-by	    Remove the sleep(1) at line 375 in pk1.c

Cause	    It appears to be an incomplete removal of the cpu.speedup patch
	    from the news.2.10 dist.

	    A full reinstallation of the cpu.speedup patch has been posted to
	    net.sources.

Thanks to the following, who all pointed out the same error:
ihnp4!ut-sally!harvard!wjh12!sob
trwrb!wlbr!jm
ihnp4!harvard!tardis!ddl
and to ...cepu!ucla-cime!kyle and ...cepu!ucla-an!stan for letting me use
their systems for testing and implementation of the patch.

-- 
Stephen C. Woods (VA Wadsworth Med Ctr./UCLA Dept. of Neurology)
uucp:	{ {ihnp4, uiucdcs}!bradley, hao, trwrb, sdcrdcf}!cepu!scw
ARPA: cepu!scw@ucla-cs location: N 34 3' 9.1" W 118 27' 4.3"

mark@cbosgd.UUCP (Mark Horton) (10/04/84)

The sleep seems to be a (broken) variation of an idea I posted long ago.
Unfortunately, the code you posted is clearly wrong.  The right way to
do it is to sleep after a read has returned short.  The code posted
unconditionally sleeps before trying the first read.

Measurements showed that at 1200 baud it cut way down on system load
(and this really makes a difference - a uucico at 1200 baud only adds
about .2 to our load average instead of 1 like it used to) with almost
no effect on throughput rates.  At 9600 baud, though, it hurts
throughput drastically (1 second is far too long to sleep at 9600
baud) and is a bad idea.  At 4800 it's a close call, and a local
decision should be made.
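
Translated into code, the rule is roughly this (a hypothetical
fragment patterned on the loop Jerry posted; linebaudrate is a
stand-in for wherever your uucico keeps the line speed):

	for (nchars = 0; nchars < n; nchars += ret) {
		ret = read(fn, b, n - nchars);
		if (ret <= 0)
			return(-1);
		b += ret;
		/* sleep only after a short read, and only on a slow line */
		if (nchars + ret < n && linebaudrate < 4800)
			sleep(1);
	}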

larry@decvax.UUCP (Larry Cohen) (10/07/84)

Here at decvax we will do anything to speed up uucico.
We also noticed that in the pk routines, several "reads"
were necessary to pull a packet off the network.  On the
average there were about 3 context switches per packet on
a 1200 baud line.  Of course this varied with the time of
day and speed of the line.  One experiment we tried was a modified
Berknet line discipline.  This line discipline would read a specified
number of characters and not return until they had all arrived.
Internally it used circular queues rather than clists.
The improvement was pretty good.  I don't have the exact figures with
me (I'm at home 4 months after I ran the experiments), but it was
something in the ballpark of a 20% improvement in overall throughput
and a 33% reduction in context switches.  Less time was spent in the
system, and user time was down as well.  There was no reduction in
performance on 9600 baud lines.  I hope to try running it on decvax
before long.
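
The same read-a-whole-packet behavior can be had without a custom line
discipline on systems with non-canonical tty input.  A hypothetical
sketch (termios postdates this thread, so this is an analogue of the
experiment, not the Berknet code itself):

	#include <termios.h>

	/*
	 * Illustrative only: make read() on fd block until a full
	 * packet of pktsize characters (pktsize <= 255) has arrived,
	 * so one system call and one wakeup fetch the whole packet.
	 */
	int
	set_packet_read(int fd, int pktsize)
	{
		struct termios t;

		if (tcgetattr(fd, &t) < 0)
			return(-1);
		t.c_lflag &= ~(ICANON | ECHO);	/* raw input */
		t.c_cc[VMIN] = pktsize;	/* wait for a full packet */
		t.c_cc[VTIME] = 10;	/* or a 1 second interbyte gap */
		return(tcsetattr(fd, TCSANOW, &t));
	}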
						-Larry Cohen

jerry@oliveb.UUCP (Jerry Aguirre) (10/09/84)

What happened to the idea of putting the pk routines into the
kernel?  The man pages still advertise the kernel version of the
pk routines, but I know of no system that uses them.  Does anybody
have them working?  Is there a reason not to put them in the
kernel?

					    Jerry Aguirre
{hplabs|fortune|idi|ihnp4|ios|tolerant|allegra|tymix}!oliveb!jerry