sra@lcs.mit.edu (Rob Austein) (07/31/89)
[I tried sending this to info-gnus-english via mail, but it bounced. Just as well, since I had forgotten to say that this is GNUS 3.12, Emacs 18.54. I'll send a separate note to the ohio-state people documenting the bounce.]

Symptom:

The GNUS newsreader doesn't like messages from Vint Cerf. It gets confused while trying to retrieve their headers from the NNTP server, merges the headers with those of the preceding article, constructs a bogus article header vector for the preceding article such that it looks like the preceding article is from Vint, and forgets about Vint's article (i.e., attempting to display the article with the bogus header vector displays the one preceding Vint's article, and there is no way to display Vint's article at all). This is distressing, since, on those occasions when Vint does post to the net, it's worth reading.

Diagnosis:

There are two bugs here. One is in the mail composition program Vint is using. This is a known problem and the ISI people are working on fixing it by decommissioning the offending program, which dates from pre-RFC822 days. It generates Message-ID: headers in the old format that has been illegal since RFC822 came out. E.g.:

    Message-ID: <[A.ISI.EDU]29-Jul-89.05:18:51.CERF>

There is also a bug in GNUS, specifically in the function "nntp-retrieve-headers" (nntp.el). The code

    ;; Skip invalid field (ex. Subject:abc)
    (if (looking-at "^[^:]*:[^ \t]")
        (forward-line 1))

makes the invalid assumption that a Message-ID cannot have a ":" character in it (it is legal, by RFC822 anyway, if it's enclosed in double quote characters). If there is a ":" character in a Message-ID, the cited code will skip over the line containing it, effectively merging the headers of the current article with those of the next. I'm not sure why this code is thought to be necessary at all, since a delete-non-matching-lines had already been run over the buffer to flush anything except the desired header lines.
Commenting out the code fixes the problem without any known bad effects.

In case the description is unclear, here is an example. If the contents of the nntp-server-buffer are [Exhibit A], the resulting GNUS *Subject* buffer will be [Exhibit B]. I've removed all the trailing <CR> characters from [Exhibit A], for ease of reading, and the NNTP server seems to dislike lines with just a dot on them, so I've replaced all occurrences of "\n.\n" with "\n{.}\n" in [Exhibit A].

[Exhibit A]
================================================================
221 1159 <447@warlock.UUCP> Article retrieved; head follows.
Path: mintaka!bloom-beacon!tut.cis.ohio-state.edu!cs.utexas.edu!uunet!dowjone!gregb
From: gregb@dowjone.UUCP (Gregory S. Baber)
Newsgroups: comp.dcom.lans,comp.protocols.tcp-ip,comp.unix.questions
Subject: help with WANs
Message-ID: <447@warlock.UUCP>
Date: 28 Jul 89 12:49:34 GMT
Date-Received: 29 Jul 89 06:43:17 GMT
Reply-To: gregb@dowjone.UUCP (Gregory S. Baber)
Distribution: world
Organization: Dow Jones, Inc. Princeton, NJ
Xref: mintaka comp.dcom.lans:534 comp.protocols.tcp-ip:1159 comp.unix.questions:2570
Lines: 27
{.}
221 1160 <623@dtix.ARPA> Article retrieved; head follows.
Path: mintaka!bloom-beacon!tut.cis.ohio-state.edu!cica!iuvax!uxc.cso.uiuc.edu!tank!mimsy!dtix!curt
From: curt@dtix.ARPA (Curt Welch)
Newsgroups: comp.dcom.lans,comp.protocols.tcp-ip
Subject: "dead ports" on a Bridge LS/1 terminal server
Message-ID: <623@dtix.ARPA>
Date: 28 Jul 89 17:54:43 GMT
Date-Received: 29 Jul 89 11:21:04 GMT
Reply-To: curt@dtix.arpa
Followup-To: comp.dcom.lans
Distribution: usa
Organization: David Taylor Research Center, Bethesda, MD
Xref: mintaka comp.dcom.lans:536 comp.protocols.tcp-ip:1160
Lines: 26
{.}
221 1161 <[A.ISI.EDU]29-Jul-89.05:18:51.CERF> Article retrieved; head follows.
Path: mintaka!bloom-beacon!tut.cis.ohio-state.edu!ucbvax!A.ISI.EDU!CERF
From: CERF@A.ISI.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Announcing a little board-room shakeup
Message-ID: <[A.ISI.EDU]29-Jul-89.05:18:51.CERF>
Date: 29 Jul 89 09:18:00 GMT
Date-Received: 29 Jul 89 11:51:28 GMT
References: <Jul.26.17.48.59.1989.12721@hardees.rutgers.edu>
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: world
Organization: The Internet
Lines: 16
{.}
================================================================

[Exhibit B]
================================================================
1159: [ 27:gregb@dowjone] help with WANs
1160: [ 16:CERF@A.ISI.ED] Re: Announcing a little board-room shakeup
================================================================

Fix:

In the immortal words of the DEC Software Dispatch: "Please!" I would think that just flushing the offending code would work, but it would be nice to know why it's there before doing so.

--Rob Austein, MIT Lab for Computer Science
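
The behavior of the "Skip invalid field" check can be reproduced outside Emacs. What follows is a minimal sketch in Python rather than Emacs Lisp, assuming that Python's `re` treats this particular pattern the same way Emacs does. The names `SKIP_INVALID_FIELD` and `is_skipped` are invented for illustration and do not appear in nntp.el. Run against lines from the example headers, it suggests that the line the check skips for Vint's article is the "221" status line (which echoes the Message-ID), not the Message-ID: header itself, since in the header the first colon is followed by a space. If the status line is what separates one article from the next, which is presumably the case, losing it would account for the merged headers.

```python
import re

# Python transliteration of the Emacs regexp "^[^:]*:[^ \t]" from nntp.el.
# Assumption: for this simple pattern the two regexp dialects agree.
SKIP_INVALID_FIELD = re.compile(r"^[^:]*:[^ \t]")


def is_skipped(line: str) -> bool:
    """Return True if the quoted check would skip this line."""
    return SKIP_INVALID_FIELD.match(line) is not None


# Lines copied from the example NNTP server buffer.
STATUS_1160 = "221 1160 <623@dtix.ARPA> Article retrieved; head follows."
STATUS_1161 = (
    "221 1161 <[A.ISI.EDU]29-Jul-89.05:18:51.CERF> "
    "Article retrieved; head follows."
)
MESSAGE_ID_1161 = "Message-ID: <[A.ISI.EDU]29-Jul-89.05:18:51.CERF>"

if __name__ == "__main__":
    for line in (STATUS_1160, STATUS_1161, MESSAGE_ID_1161, "Subject:abc"):
        print(is_skipped(line), line)
```

Only the status line for article 1161 and the deliberately malformed "Subject:abc" are reported as skipped; the status line for 1160, which contains no colon, is not.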
sra@lcs.mit.edu (Rob Austein) (07/31/89)
An afterthought. The (delete-non-matching-lines) in (nntp-retrieve-headers) is a little risky, since it doesn't take into account the possibility of RFC822 line continuation within the headers. Adding the two lines

    (goto-char (point-min))
    (replace-regexp "[ \t]*\\(\n[ \t]+\\)+" " ")

just after the comment

    ;; First, delete unnecessary lines.

fixes this.

--Rob
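
To show what the unfolding regexp does to a folded header, here is a small sketch in Python rather than Emacs Lisp. The pattern is a transliteration of the one given for replace-regexp. The names `CONTINUATION` and `unfold_headers` and the sample header text are hypothetical and are not taken from GNUS or from a real article.

```python
import re

# Python transliteration of the Emacs regexp "[ \t]*\\(\n[ \t]+\\)+".
# A non-capturing group is used because the group is never referenced.
CONTINUATION = re.compile(r"[ \t]*(?:\n[ \t]+)+")


def unfold_headers(text: str) -> str:
    """Join RFC822 continuation lines onto the header line they continue."""
    return CONTINUATION.sub(" ", text)


# Hypothetical folded header, made up for illustration.
FOLDED = (
    "Subject: a subject long enough\n"
    "\tto be folded\n"
    "Lines: 16\n"
)

if __name__ == "__main__":
    print(unfold_headers(FOLDED), end="")
```

A newline followed by a space or tab marks a continuation, so the folded Subject: becomes a single line, while the newline before "Lines:" is left alone. Once unfolded, a filter that keeps only lines beginning with a wanted header name no longer discards the tail of a folded header.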