[net.bugs.uucp] Discrepancies in UUCP SYSLOG timings

fair@dual.UUCP (Erik E. Fair) (08/09/84)

I have been trying to use the information kept in the UUCP SYSLOG for a
generalized report of UUCP traffic. I total up the number of transfers,
number of bytes in both directions and the total time spent
transferring data with awk, and mail myself a report once a week. All
well and good.

Then I modified the awk script to give me average characters per second
for receive and transmit of data, and a big ugly split appeared. We
received data from 10 to 30 cps slower than we transmitted it to the
same site.  I thought there was something wrong with our system until I
set up ucbvax to keep the same statistics. Guess what? They show the
same split for all the sites that they talk to. When ucbvax sends to
the remote, it sends at around 110 cps. When it receives from the
remote, it receives at 80 or 90 cps.

This tended to imply something strange going on in the timing in UUCP
itself. To show you what I mean, here are four lines from the SYSLOGs
of dual and ucbvax:

==> DUAL-SYSLOG <==
ucbvax!daemon M (8/7-9:18:33) (C,7367,1) (0:8:12)  <- 26578 / 245 secs
ucbvax!daemon M (8/7-9:18:38) (C,7367,2) (0:8:17)  <- 68 / 2 secs
ucbvax!uucp M (8/7-17:33:33) (C,12681,1) (0:38:5)  -> 1689 / 15 secs
ucbvax!uucp M (8/7-17:33:39) (C,12681,2) (0:38:11)  -> 57 / 0 secs

==> UCBVAX-SYSLOG <==
daemon dual (8/7-9:19) (460743571) sent data 26578 bytes 243 secs
daemon dual (8/7-9:19) (460743576) sent data 68 bytes 0 secs
uucp dual (8/7-17:34) (460773283) received data 1689 bytes 17 secs
uucp dual (8/7-17:34) (460773288) received data 57 bytes 2 secs

As you can see, there is a two second difference in the times on either
end. The receiving end says it takes 2 seconds longer than the transmit
end claims. For the folks wondering about configurations, DUAL is a
System V R 1 10Mhz 68000, and UCBVAX is a 4.2 BSD DEC VAX-11/750.

I looked over the timing code (it's in each of the protocol modules;
gio.c for this example) and it seems to be reasonable. The only thing
that caught my attention is that there seems to be a lot more packet
code to wade through for receive than transmit. Is that all this is?
Excessive overhead in receive?

Has anyone out there ever profiled uucico for transmitting and
receiving data at 1200 baud?

	discussion of the problem is encouraged,

	Erik E. Fair	ucbvax!fair	fair@ucb-arpa.ARPA

	dual!fair@BERKELEY.ARPA
	{ihnp4,ucbvax,hplabs,decwrl,cbosgd,sun,nsc,apple,pyramid}!dual!fair
	Dual Systems Corporation, Berkeley, California

chris@umcp-cs.UUCP (08/10/84)

It's probably due to all the context switching that goes on when UUCP
is receiving data at low baud rates.  The ``4.2'' UUCP has a hack in it
to do sleep()s to cut down on the overhead, but this raises the overall
time since a sleep(1) is on the average 1 second too long for a 64 byte
packet at 1200 baud (if you assume that a sleep(1) takes between .5 and
1.5 seconds; I haven't figured out what sleep(1) really does under 4.2,
what with alarm being a subroutine that calls setitimer which...).

I stuck some code into pk1.c to use the select system call for
fractional timeouts based on how much data is expected.  It seems to
works pretty well.  Here's the modified pkcget() routine.  (Note that
it needs to #include <sys/types.h> and <sys/time.h>; also note that
it is blatantly 4.2-and-int-is-32-bits-dependent.)

/***
 *	pkcget(fn, b, n)	get n characters from input
 *	char *b;		- buffer for characters
 *	int fn;			- file descriptor
 *	int n;			- requested number of characters
 *
 *	return codes:
 *		0  - ok
 *		-1 - timeout or EOF
 */

jmp_buf Getjbuf;
cgalarm() { longjmp(Getjbuf, 1); }

pkcget (fn, b, n)
int fn;
register int n;
register char *b;
{
    register int r;
    struct timeval tv;
    register int itime = 100000;/* as a guess, it's been 1/10th sec
				   since we last read the line */
    extern int linebaudrate;

    if (setjmp (Getjbuf)) {
	Ntimeout++;
	PKDEBUG (4, "alarm %d\n", Ntimeout);
	return -1;
    }
    signal (SIGALRM, cgalarm);

    alarm (PKTIME);
    while (n > 0) {
	if (linebaudrate > 0) {
	    r = n * 100000;
	    r = r / linebaudrate;
	    r = (r * 100) - itime;
	    itime = 0;
	    if (r > 20000) {	/* we predict that more than 1/50th of a
				   second will go by before the read will
				   give back all that we want. */
		tv.tv_sec = r / 1000000;
		tv.tv_usec = r % 1000000;
		PKDEBUG (9, "PKCGET stall for %d", tv.tv_sec);
		PKDEBUG (9, ".%06d sec\n", tv.tv_usec);
		(void) select (0, (int *)0, (int *)0, (int *)0, &tv);
	    }
	}
	r = read (fn, b, n);
	if (r == 0) {
	    alarm (0);
	    return -1;
	}
	PKASSERT (r > 0, "PKCGET READ", "", r);
	b += r;
	n -= r;
    }
    alarm (0);
    return 0;
}
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

henry@utzoo.UUCP (Henry Spencer) (08/12/84)

Speaking without having looked at the code (I can't face it just now),
perhaps the discrepancy in times is a question of timing different things.
E.g., if some of the handshaking gets included when timing one way and not
when timing the other, this might explain the discrepancy.

The discrepancy may or may not be "real", but it would definitely appear
to be a result of some of the more recent twiddling with uucp.  Utzoo's
uucp -- stock V7 with a lot of bug fixes -- shows no such phenomenon, at
least on most of our links.  (Traffic on some of them is too unidirectional
for me to be sure, and I've only got a few days' data handy.)  Links on
which the phenomenon does *not* show up include utzoo-watmath, which has
a 4.2 on the other end, presumably including a 4.2 uucp.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry