[net.news.b] Cause and fix of "Inbound news is garbled"

pag (03/09/83)

I noticed a recent rash of "Inbound news is garbled" diagnostics in our
news log file, and set up a method to catch the offending articles.
The problem is a bug caused by a combination of undersized arrays, not
reading enough of a header line, and a changed address format.  Here is
a sample header from an offending article:

From: seismo!harpo!eagle!mhuxt!mhuxa!mhuxh!mhuxm!mhuxv!burl!sb1!ll1!otuxa!we13!lime!orion!ariel!hou5f!hou5b!hou5c!hou5e!hou5a!hou5d!houxz!ihnp4!iheds!otto (George V.E. Otto)
Newsgroups: net.women,net.singles,net.books
Title: Women's hair in the Descent of Women
Article-I.D.: iheds.218
Posted: Sat Mar  5 19:53:15 1983
Reply-To: otto@iheds.UUCP (George V.E. Otto)

What happens is that "frmread()" in header.c barfs on the From line
because a) a declaration is not long enough, and b) the fgets() call
doesn't read enough bytes.  Also, the header parsing was never programmed
to handle a name in parentheses.  Here are some code fragments that
decode the "From" line (from version 2.8):

in hread():

	if (((fgets(bfr, LBUFLEN, fp) != NULL && *bfr >= 'A' && *bfr <= 'Z') && index(bfr, ':')) || !strncmp(bfr, "From ", 5))
		if (frmread(fp, hp))
			goto strip;

In our version, BUFLEN was left unchanged from what was distributed (128).
One fix is to make BUFLEN 512 (or larger), but that is wasteful, since
every buffer sized by BUFLEN grows with it.  A cleaner fix is to use
LBUFLEN instead of BUFLEN as the length in the fgets() call above.  Also,
the declaration
	char bfr[BUFLEN];
in iparams.h and params.h should be changed to
	char bfr[LBUFLEN];
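
To see why the short read matters, here is a minimal sketch (not taken
from the news source; the helper name chunks_for_first_line is made up
for illustration, and it is written in present-day C).  It counts how
many fgets() calls are needed to get through one line for a given buffer
length.  Anything over 1 means the tail of the line comes back from the
next fgets() as if it were a header line of its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the fgets() calls needed to consume the
 * first line of fp when each call may store at most buflen bytes
 * (buflen - 1 characters plus the terminating '\0').
 */
int chunks_for_first_line(FILE *fp, int buflen)
{
	char buf[1024];
	int n = 0;

	assert(fp != NULL);
	assert(buflen >= 2 && buflen <= (int)sizeof buf);

	while (fgets(buf, buflen, fp) != NULL) {
		n++;
		if (strchr(buf, '\n') != NULL)
			break;		/* reached the end of the line */
	}
	return n;
}
```

With a 170-character line, a buffer length of 128 takes two calls and a
length of 512 takes one.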

in frmread():
	char wordfrom[100], uname[100], at[100], site[100];
			.
			.
			.
			.

			case FROM:
				if (!fromflag) {
					sscanf(bfr, "%s %s %s %s", wordfrom, uname, at, site);

Obviously the "uname[100]" declaration is insufficient: the path in the
sample header above runs well past 100 characters.  Fix: make it
uname[PATHLEN].  Also, the sscanf() format is not designed for an address
that has a name in parentheses at the end.  Sites running news software
that inserts users' names or gcos fields after the address may break
other sites' news, and their articles may never get transmitted beyond
the site that chokes on them.
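
For what it's worth, here is one way the From line could be picked apart
so that a long path cannot overrun the arrays and a trailing name in
parentheses is accepted.  This is only a sketch in present-day C, not
the 2.8 code and not a tested patch to frmread(): the function name
parse_from and the sizes ADDRLEN and NAMELEN are assumptions (ADDRLEN
stands in for PATHLEN, whose value isn't given here), and it handles
only the "address (name)" form shown in the sample header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ADDRLEN 512	/* assumed size, standing in for PATHLEN */
#define NAMELEN 128	/* assumed size for the full name */

/*
 * Hypothetical helper: split "From: address (Full Name)" into addr and
 * name.  addr must hold ADDRLEN bytes and name NAMELEN bytes.
 * Returns the number of fields found: 0, 1 (address only) or 2.
 */
int parse_from(const char *line, char *addr, char *name)
{
	int n;

	assert(line != NULL && addr != NULL && name != NULL);
	addr[0] = name[0] = '\0';

	/* Field widths are one less than the array sizes, leaving room
	 * for the terminating '\0', so sscanf() cannot overrun them. */
	n = sscanf(line, "From: %511s (%127[^)])", addr, name);
	return n < 0 ? 0 : n;
}
```

The explicit field widths are the point: an over-long address gets
truncated instead of scribbling past the end of the array.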

Comments, Mark?

--peter gross