[net.news.b] Hunting the wild truncation bug

donn@sdchema.UUCP (03/09/84)

After months of accumulating annoyance, I finally decided to try to
find the nearby site which had been chopping the fronts off of
articles.  In the course of sending out inordinate numbers of test
messages, I managed to accuse just about everyone within six hops of
our site of harboring the bug, and I probably need to apologize or at
least plead extenuating circumstances.  For future reference:

  +	The truncation bug appears in B news releases 2.10 and 2.10.1.
	Sites running 2.9 news or A news have different bugs.  The B
	news that comes with the 4.2 BSD distribution has the bug.

  +	The bug affects only articles whose bodies begin with white
	space, such as a tab or blanks.  The bug does not affect
	articles which have more than one blank line in front of the
	body.  Some kind sites (notably decvax) insert an extra blank
	line at the top of a choppable article when they get one.
	'inews -h' also seems to do this (the two facts may be
	related).

  +	The bug does not affect articles which are processed by
	directly running 'rnews' from uucp.  Only articles which arrive
	on a batch route are subject to chopping.

  +	Not every site has someone who reads mail sent to 'usenet' or
	'root'.  If you don't know the system manager at a suspect site
	personally, look their site up in the netnews map and use the
	address or phone number.
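
The "choppable" condition in the second item can be checked mechanically
before posting a test message.  What follows is only a rough sketch in C:
looks_choppable() is a made-up name, not part of B news, and it assumes the
whole article (header, blank line, body) is already in memory as one
NUL-terminated string with plain '\n' line endings.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of B news.
 * Returns 1 if the first line after the header's terminating blank
 * line starts with a blank or tab, which is the shape of article the
 * bug chops.  Returns 0 otherwise, including when extra blank lines
 * precede the body or when no blank line is found at all.
 */
int
looks_choppable(const char *article)
{
	const char *sep;

	assert(article != NULL);

	/* The header ends at the first empty line, i.e. at "\n\n". */
	sep = strstr(article, "\n\n");
	if (sep == NULL)
		return 0;

	/* sep[2] is the first character of the line after the blank line. */
	return sep[2] == ' ' || sep[2] == '\t';
}
```

An article with a second blank line in front of the body has '\n' in that
position, so the check comes out 0, which matches the observation that such
articles survive.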

The person I most need to apologize to is Sid Stuart at linus, whom I
rather thoughtlessly fingered in a message seen by the entire network.
Sorry, Sid.  The site that was in fact responsible for chopping up poor
<725@linus.UUCP> was (drum roll please)...

	decvax!

Decvax is a major backbone node on the network (thanks, Armando) and
eliminating the bug there will probably drop the incidence of chopped
articles by a considerable amount.  Thanks should go to Jim McGinness
at decvax for installing the fix in a timely fashion.  With any luck this
will mean far fewer crippled articles stumbling around the net.

Thanks to everyone who mailed mutilated test messages back to me; it
really was a help.  For those of you who may still suffer from the
ravages of the bug, I will append the bug fix at the end of this
article (bug fix courtesy of Guy Riddle at Bell Piscataway).

Gotta run, my pet tape is crying out for more cats,

Donn Seeley    UCSD Chemistry Dept.       ucbvax!sdcsvax!sdchema!donn

---------------------------------------------------------------------------

Path: sdchema!sdcsvax!philabs!seismo!harpo!eagle!allegra!alice!rabbit!pyuxmm!ggr
Newsgroups: net.news.b
Subject: Infamous "truncated article bug" fixed
Date: Tue, 6-Sep-83 18:17:10 PDT
Organization: Bell Labs, Piscataway

Here's a fix to the bug that was causing "truncated articles" where
pieces of the *beginnings* of articles were lost.
The problem would occur if the first line of the submitted article
began with a blank or tab, the size of the header was just under
512 bytes, and rnews was invoked with stdin not a file (for example,
a pipe (in batching)).  (How's that for a combination of circumstances!)

What was happening was that hfgets(), in header.c, checks to see if each
header line might be continued with a succeeding line beginning with
blank or tab.  But it even checks this for the blank line ending the header,
and will eat up the first line of the article if it begins with a blank
or tab.  Later rnews notices this and tries to put things straight
by fseek'ing back to before the article itself begins.  This will
work if stdin is a file, but fails if stdin is something you can't
seek on, a pipe or in my case, a local network connection.
If stdio has just crossed a block boundary it can't put things right,
and about 1024 bytes are lost.
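
The file-versus-pipe difference is easy to see in isolation.  Here is a
small sketch, assuming a POSIX system; the function names are invented for
illustration and none of this is taken from rnews itself.  It asks stdio to
seek back to the start of a stream, once on a temporary file and once on
the read end of a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if fp can be repositioned to its beginning, 0 if not. */
int
can_seek_back(FILE *fp)
{
	assert(fp != NULL);
	return fseek(fp, 0L, SEEK_SET) == 0;
}

/* Hypothetical demo: 1 if a temporary file is seekable, -1 on setup error. */
int
file_is_seekable(void)
{
	FILE *fp = tmpfile();
	int ok;

	if (fp == NULL)
		return -1;
	ok = can_seek_back(fp);
	fclose(fp);
	return ok;
}

/* Hypothetical demo: 0 if a pipe refuses the seek, -1 on setup error. */
int
pipe_is_seekable(void)
{
	int fds[2];
	FILE *fp;
	int ok;

	if (pipe(fds) != 0)
		return -1;
	fp = fdopen(fds[0], "r");	/* read end, as rnews would see stdin */
	if (fp == NULL) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	ok = can_seek_back(fp);		/* fails; errno is set to ESPIPE */
	fclose(fp);			/* also closes fds[0] */
	close(fds[1]);
	return ok;
}
```

On a pipe the fseek() simply fails, so any recovery that depends on it
cannot work, and whatever was read past the header is gone.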

The following version of hfgets fixes this by not checking for continuation
of the blank line that ends the header.

				=== Guy Riddle == BTL Piscataway ===

/*
 * hfgets is like fgets, but deals with continuation lines.
 * It also ensures that even if a line that is too long is
 * received, the remainder of the line is thrown away
 * instead of treated like a second line.
 */
char *
hfgets(buf, len, fp)
char *buf;
int len;
FILE *fp;
{
	register int c;
	register char *cp, *tp;

	cp = fgets(buf, len, fp);
	if (cp == NULL)
		return NULL;

	tp = cp + strlen(cp);
	if (tp[-1] != '\n') {
		/* Line too long - the part read did not end in a newline */
		while ((c = getc(fp)) != '\n' && c != EOF)
			;
	} else if(tp == (cp+1))
		return(cp);	/* Don't look for continuation of blank lines */
	else
		*--tp = '\0';	/* clobber newline */

	while ((c = getc(fp)) == ' ' || c == '\t') {	/* for each cont line */
		/* Continuation line. */
		while ((c = getc(fp)) == ' ' || c == 't')	/* skip white space */
			;
		if (tp-cp < len) {*tp++ = ' '; *tp++ = c;}
		while ((c = getc(fp)) != '\n' && c != EOF)
			if (tp-cp < len) *tp++ = c;
	}
	*tp++ = '\n';
	*tp++ = '\0';
	if (c != EOF)
		ungetc(c, fp);	/* push back first char of next header */
	return cp;
}

rees@apollo.uucp (Jim Rees) (03/12/84)

Oh no!  The bug fix you posted has a bug, too.  The line

		while ((c = getc(fp)) == ' ' || c == 't')	/* skip white space */

should be

		while ((c = getc(fp)) == ' ' || c == '\t')	/* skip white space */

I think this bug is pretty innocuous.  I haven't tried, but a header
continuation line that looks like this would probably trigger it:

Bug-Infested-Header: doesn't matter what goes here
    this line must start with 't'
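
For anyone who wants both fixes in one place, here is a sketch that folds
Jim's '\t' correction into Guy's routine, rewritten with an ANSI
prototype.  It is not the code shipped with any B news release.  The name
hfgets_sketch, the requirement that len be at least 3, and the extra bounds
checks are my own assumptions: as I read the posted version, it can store a
byte or two past the end of buf when a line is very long, so this sketch
always keeps room for the final newline and NUL.  It also treats a
continuation line containing only white space as empty, which the posted
code does not.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Like fgets, but joins header continuation lines (lines starting
 * with blank or tab) onto the current line, separated by one space.
 * The blank line that ends the header is returned as-is and is never
 * checked for continuation, so the article body is left untouched.
 */
char *
hfgets_sketch(char *buf, int len, FILE *fp)
{
	int c;
	char *cp, *tp, *limit;

	assert(buf != NULL && fp != NULL && len >= 3);

	cp = fgets(buf, len, fp);
	if (cp == NULL)
		return NULL;

	limit = buf + len - 2;		/* reserve room for "\n\0" */
	tp = cp + strlen(cp);
	if (tp == cp || tp[-1] != '\n') {
		/* Line too long (or odd input): discard the rest of it. */
		while ((c = getc(fp)) != '\n' && c != EOF)
			;
		if (tp > limit)
			tp = limit;
	} else if (tp == cp + 1) {
		return cp;		/* blank line: end of header */
	} else {
		tp--;			/* drop newline, re-added below */
	}

	while ((c = getc(fp)) == ' ' || c == '\t') {	/* each cont line */
		while ((c = getc(fp)) == ' ' || c == '\t')
			;				/* skip white space */
		if (c == EOF)
			break;
		if (c == '\n')
			continue;	/* white-space-only line adds nothing */
		if (tp < limit)
			*tp++ = ' ';
		if (tp < limit)
			*tp++ = (char)c;
		while ((c = getc(fp)) != '\n' && c != EOF)
			if (tp < limit)
				*tp++ = (char)c;
		if (c == EOF)
			break;
	}
	*tp++ = '\n';
	*tp = '\0';
	if (c != EOF)
		ungetc(c, fp);	/* push back first char of next line */
	return cp;
}
```

Fed Jim's example header, this keeps the leading 't' of the continuation
line, and a body whose first line starts with a tab is still there after
the blank line has been read.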