[news.software.nntp] Batched headers

amanda@intercon.uu.net (Amanda Walker) (06/29/89)

Well, as you may remember, a couple weeks ago I experimented with a version
of the "HEAD" command that sent a whole range of headers instead of just
one, and concluded that it wasn't very useful.

I take it back :-).

It turns out not to be very useful under ideal conditions, but to be quite
a marked improvement in actual use.  My initial performance comparison was
done with a quick little program that ran on our UNIX box and talked to
the NNTP server.  With some tweaking, I was able to get this to run almost as
fast with the normal "HEAD" command as with my version, which led me to
believe that my patch wasn't going to be very useful.  Unfortunately,
bandwidth isn't quite free, and packing the server's TCP input queue with
HEAD commands isn't all that wonderful on a real live network, for two main
reasons:

 - it generates a lot of network traffic, since there is no buffering
   going on in the server between articles; this means that both sides
   end up sending bunches of little packets at each other.  This is not
   a problem on the loopback interface :-), but is pretty unfriendly on
   a real net, and actual performance is pretty poor, even on an otherwise
   quiescent Ethernet.

 - the technique is not easily retrofitted to existing newsreaders, such
   as 'rn', 'vnews', or TMNN's readers, since you can't use the "we'll
   just pretend we're reading files" technique any more--you have to
   talk to the server asynchronously.

With this in mind, I'd like to propose the following patch to NNTP,
either as a replacement for the XHDR command or an extension of the HEAD
command.  The following shell archive contains the file "xhead.c", which
implements the command, as well as two short patches to help.c and serve.c.

Since the output is not flushed until the end of the last header, the
network traffic is a lot better, and since there is only one request per
batch of headers, the load on both the server and the client ends up
lighter.  For modern newsreaders and filtering programs, this is a
very good thing.
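
As a rough illustration of the "one request per batch" point, here is a
minimal sketch in C of what a client writes to the server in each case.
The helper names format_xhead_request() and format_head_requests() are
made up for this example and are not part of NNTP or of the patch; the
sketch uses ANSI C and snprintf() rather than the K&R style of the server
code, and the XHEAD syntax is the "low-high" range form described below.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a single XHEAD request covering low..high.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int format_xhead_request(char *buf, size_t size, long low, long high)
{
	int n = snprintf(buf, size, "XHEAD %ld-%ld\r\n", low, high);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Hypothetical helper: format one HEAD request per article, back to
 * back, which is what the "pack the input queue" approach sends.
 * Returns the total number of bytes written, or -1 if buf is too small.
 */
int format_head_requests(char *buf, size_t size, long low, long high)
{
	size_t used = 0;
	long art;

	if (size > 0)
		buf[0] = '\0';		/* empty ranges yield an empty string */

	for (art = low; art <= high; art++) {
		int n = snprintf(buf + used, size - used, "HEAD %ld\r\n", art);

		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

For the four-article range in the example session, the first helper
produces one command line and the second produces four, each of which the
server answers separately.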

The XHEAD command takes a range of article numbers and returns a batch of
headers in the following format:

	<article number> CRLF
	<article header>
	CRLF

for each header, terminated by a line consisting only of a period followed
by CRLF.  For example (server responses are indented):

telnet intercon 119
	Trying...
	Connected to intercon.
	Escape character is '~'.
	200 intercon NNTP server version 1.5 (26 Feb 88) ready at ...
group comp.sys.mac
	211 393 16645 17351 comp.sys.mac
xhead 17348-17351
	221 Article headers follow
	17348
	Path: intercon!uunet!snjsn1!sjs!ahsanullah
	From: ahsanullah@sjs.sj.ate.slb.com (OHM R)
	Newsgroups: comp.sys.mac
	Subject: Re: SIMM prices (American Micro)
	Message-ID: <491@sjs.sj.ate.slb.com>
	Date: 26 Jun 89 09:28:14 GMT
	References: <1271@sunset.MATH.UCLA.EDU>
	Organization: Schlumberger Technologies Inc., ATE Division
	Lines: 17
	
	17349
	Path: intercon!uunet!mcvax!ukc!icdoc!syma!stevedc
	From: stevedc@syma.sussex.ac.uk (Stephen D Carter)
	Newsgroups: news.misc,news.sysadmin,comp.sys.mac,comp.sys.mac.programmer
	Subject: Re: Official Legal Announcement regarding Apple's Source Code
	Message-ID: <1117@syma.sussex.ac.uk>
	Date: 27 Jun 89 07:19:40 GMT
	References: <7367@cs.Buffalo.EDU>
	Organization: University of Sussex
	Lines: 32
	
	17350
	Path: intercon!uunet!dino!sharkey!mailrus!uflorida!usfvax2!pollock
	From: pollock@usfvax2.EDU (Wayne Pollock)
	Newsgroups: comp.sys.mac
	Subject: Re: RedRyder
	Summary: RR10.3 binary kermit broken
	Message-ID: <1276@usfvax2.EDU>
	Date: 27 Jun 89 16:00:19 GMT
	References: <337@tricord.tricord.MN.ORG>
	Reply-To: pollock@usfvax2.csee.usf.edu.UUCP (Wayne Pollock)
	Organization: University of South Florida at Tampa
	Lines: 17
	
	17351
	Path: intercon!uunet!lll-winken!ncis.tis.llnl.gov!helios.ee.lbl.gov!pasteur!ucbm
	From: nghiem@walt.cc.utexas.edu (Alex Nghiem)
	Newsgroups: comp.sys.mac
	Subject: Re: Kermit question
	Message-ID: <14586@ut-emx.UUCP>
	Date: 28 Jun 89 05:04:12 GMT
	References: <320@umabco.UUCP> <7233@cg-atla.UUCP> <32533@apple.Apple.COM> <3769>
	Sender: news@ut-emx.UUCP
	Reply-To: nghiem@walt.cc.utexas.edu (Alex Nghiem)
	Organization: The University of Texas at Austin, Austin, Texas
	Lines: 9
	Posted: Wed Jun 28 00:04:12 1989
	
	.
quit
	205 intercon closing connection.  Goodbye.
	Connection closed.
	Connection closed by foreign host.
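
A client has to split a response like the one above back into individual
headers.  The following is a minimal sketch of that in C, not code from
the patch: xhead_article_numbers() is a hypothetical name, it works on a
response body already read into memory (everything after the "221" status
line), and it assumes numeric labels, i.e. an article range rather than a
message-id request.  It only treats "." as the terminator when it appears
where a label is expected, since the server code does not escape header
lines that happen to begin with a period.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch: walk the body of an XHEAD response and collect the article
 * numbers it contains.  At most `max' numbers are stored in `numbers'.
 * Returns the number of entries seen (which may exceed `max'), or -1
 * if the input ends before the terminating "." line.
 * Accepts CRLF or bare-LF line endings.
 */
int xhead_article_numbers(const char *body, long *numbers, int max)
{
	int count = 0;
	int in_header = 0;	/* 0: expecting a label or ".", 1: in a header */

	while (*body != '\0') {
		const char *eol = strchr(body, '\n');
		size_t len = eol ? (size_t)(eol - body) : strlen(body);
		const char *next = eol ? eol + 1 : body + len;

		if (len > 0 && body[len - 1] == '\r')
			len--;			/* ignore the CR of a CRLF pair */

		if (in_header) {
			if (len == 0)
				in_header = 0;	/* blank line ends this header */
		} else if (len == 1 && body[0] == '.') {
			return count;		/* end of the whole batch */
		} else if (len > 0) {
			if (count < max)
				numbers[count] = strtol(body, NULL, 10);
			count++;
			in_header = 1;		/* header lines follow the label */
		}
		body = next;
	}
	return -1;	/* ran out of input before the "." terminator */
}
```

A real newsreader would keep the header lines instead of skipping them,
but the same two-state loop (label, then header lines up to a blank line)
applies.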

Here's the shar file:
-------------------------CUT HERE------------------------------
#!/bin/sh
# shar:	Shell Archiver
#	Run the following text with /bin/sh to create:
#	serve.diff
#	help.diff
#	xhead.c
sed 's/^X//' << 'SHAR_EOF' > serve.diff
X25c25
X< extern	int	slave(), stat(), xhdr();
X---
X> extern	int	slave(), stat(), xhdr(), xhead();
X47a48,50
X> #ifdef XHEAD
X> 	"xhead",	xhead,
X> #endif XHEAD
SHAR_EOF
sed 's/^X//' << 'SHAR_EOF' > help.diff
X24c24,25
X< 	printf("\r\nAdditionally, the following extention is supported:\r\n\r\n");
X---
X> #ifdef XHDR
X> 	printf("\r\nAdditionally, the following extensions are supported:\r\n\r\n");
X25a27,28
X> 	printf("XHEAD       Retrieve headers from a range of articles.\r\n");
X> #endif
SHAR_EOF
sed 's/^X//' << 'SHAR_EOF' > xhead.c
X#ifndef lint
Xstatic char	*sccsid = "@(#)xhead.c	1.0	(InterCon) 6/8/89";
X#endif
X
X#include "common.h"
X
X#ifdef XHEAD
X
X/*
X * XHEAD [<messageid>|articlerange]
X *
X * articlerange is one of:
X *	an article number
X *	an article number followed by a dash to indicate all following
X *	an article number followed by a dash followed by another
X *		article number.
X * e.g.,
X * XHEAD		retrieve header of current article
X * XHEAD 5589-6325	retrieve headers of arts 5589 to 6325
X * XHEAD 5589-		retrieve headers of arts 5589 and up
X * XHEAD 5589		retrieve header of art 5589 only
X * XHEAD <123@ucbvax>	retrieve header of art <123@ucbvax>
X *
X * This command is an extension, and not included in RFC 977.
X */
X
Xxhead(argc, argv)
X	int		argc;
X	char		*argv[];
X{
X	char		buf[MAXPATHLEN];
X	register int	artptr;
X	register int	artnum;
X	register int	low, high;
X	register FILE	*fp;
X	register char	*cp;
X
X	if (argc > 2) {
X		printf("%d Usage: XHEAD [artrange|<message-id>]\r\n",
X			ERR_CMDSYN);
X		(void) fflush(stdout);
X		return;
X	}
X
X	if (!canread) {
X		printf("%d You only have permission to transfer, sorry.\r\n",
X			ERR_ACCESS);
X		(void) fflush(stdout);
X		return;
X	}
X
X	/* Handle message-id requests */
X
X	if (argc == 2 && *argv[1] == '<') {	/* Message ID */
X		fp = openartbyid(argv[1]);
X		if (fp == NULL) {
X			printf("%d No article by message-id %s, sorry.\r\n",
X				ERR_NOART, argv[1]);
X			(void) fflush(stdout);
X			return;
X		}
X		printf("%d 0 header of article %s.\r\n",
X			OK_HEAD, argv[1]);
X		print_xheader(fp, argv[1]);
X		(void) fclose(fp);
X
X		putchar('.');
X		putchar('\r');
X		putchar('\n');
X		(void) fflush(stdout);
X		return;
X	}
X
X	/*
X	 * It must be a range of articles, which means that we need
X	 * to be in a newsgroup already.
X	 */
X
X	if (!ingroup) {
X		printf("%d You are not currently in a newsgroup.\r\n",
X			ERR_NCING);
X		(void) fflush(stdout);
X		return;
X	}
X
X	artptr = 0;
X
X	if (argc == 1) {
X		if (art_ptr < 0 || art_ptr >= num_arts) {
X			printf("%d No article is currently selected.\r\n",
X				ERR_NOCRNT);
X			(void) fflush(stdout);
X			return;
X		}
X		high = low = art_array[art_ptr];
X		artptr = art_ptr;
X	} else {
X		cp = index(argv[1], '-');
X		if (cp == NULL)
X			low = high = atoi(argv[1]);
X		else {
X			*cp = '\0';
X			low = atoi(argv[1]);
X			cp++;
X			high = atoi(cp);
X			if (high < low)
X				high = art_array[num_arts-1];
X		}
X	}
X
X	printf("%d Article headers follow\r\n", OK_HEAD);
X
X	/* Stop at the end of art_array even if high is past the last article. */
X	for (; artptr < num_arts; artptr++) {
X		if ((artnum = art_array[artptr]) < low)
X			continue;
X		if (artnum > high)
X			break;
X
X		(void) sprintf(buf, "%d", artnum);
X		fp = fopen(buf, "r");
X		if (fp == NULL)
X			continue;
X		print_xheader(fp, buf);
X		(void) fclose(fp);
X	}
X
X	putchar('.');
X	putchar('\r');
X	putchar('\n');
X	(void) fflush(stdout);
X}
X
X
Xprint_xheader(fp, artname)
X	register FILE	*fp;
X	register char	*artname;
X{
X	char		line[NNTP_STRLEN];
X	register char	*cp;
X
X	printf("%s\r\n", artname);
X	while (fgets(line, sizeof (line), fp) != NULL) {
X		if (*line == '\n' || *line == '\0') {
X			putchar('\r');
X			putchar('\n');
X			return;
X		}
X		for (cp = line; *cp != '\0' && *cp != '\n'; cp++)
X			putchar(*cp);
X		putchar('\r');
X		putchar('\n');
X	}
X}
X
X#else not XHEAD
X
X/* Kludge to get around Greenhills C compiler */
X
Xxhead_greenkluydge()
X{
X}
X
X#endif not XHEAD
SHAR_EOF
exit
--
Amanda Walker  <amanda@intercon.uu.net>
InterCon Systems Corporation
--
"Those preachers are right--there's more in these songs
than meets the eye..."  --Arlo Guthrie