lamy@ai.toronto.edu (Jean-Francois Lamy) (06/23/88)
The machine that injected the message into the news system does indeed run C news. utcsri runs an archaic B news -- the machine is being phased out, and I have not heard of problems there.

The news posting script is my own, heavily modified, version of the C news postnews script. The news machine supports about 80 news "clients" from at least 5 different subdomains of toronto.edu, and the simplest way to achieve centralized posting was to use mail and reuse as much of the mail header as possible (at least that way the return addresses are sane and marginally harder to spoof). Yes, we know about NNTP.

I don't think the tab is inserted on purpose. It may have been inserted by a mailer, could easily be filtered out, and in any case *I* certainly have no religious feelings about it.

Jean-Francois Lamy    lamy@ai.utoronto.ca, uunet!ai.utoronto.ca!lamy
AI Group, Department of Computer Science, University of Toronto, Canada M5S 1A4
clewis@spectrix.UUCP (Chris Lewis (It's loose again!)) (06/23/88)
I've been having our B News (2.11 patch 14) hang occasionally while trying to parse batched news - goes into some sort of loop when it decides that an article is garbled. I'm still trying to fix the hang problem (the "#!rnews" piping business appears to hang in the write), but I've noticed that this usually comes from an article from utcsri (often from one of the upcoming events articles). This is part of an article:

>#! rnews 6703
>Newsgroups: ut.na
>Path: tmsoft!utgpu!jarvis.csri.toronto.edu!csri.toronto.edu!krj
>From: krj@csri.toronto.edu (Ken Jackson)
>Subject: NA Digest Volume 88 : Issue 24
>Message-ID: <8806201610.AA16854@gerrard.csri.toronto.edu>
>Organization: University of Toronto, CSRI
>Distribution: ut
>Date: Mon, 20 Jun 88 10:50:36 EDT
> ... rest of article deleted ...

The article is considered garbled in this case because B-news was unable to parse the date. Please look at the date shown above - after the ":" there's a tab instead of a space. B-news does the parse by means of a heavily optimized macro/function combination that effectively does this:

    if (strncmp(artline, "Date: ", strlen("Date: ")) == 0) ...
                               ^--- space

Obviously, the comparison will fail on a tab, and such an article will always be considered garbled by B-news.

I've sent off a copy of this mail item to some of the local SAs (including Henry Spencer, who hasn't replied yet). B-news doesn't appear to like tabs, but C-news (according to some of the local C-newsing SAs, e.g. Dave Mason at tmsoft, or Rayan at utai) does accept tabs, and this is supposedly legal according to the USENET RFCs (of which I don't presently have a copy). However, the "standards.mn" document (that comes with 2.11) says:

    A message consists of several header lines, followed by a blank
    line, followed by the body of the message. The header lines
    consist of a keyword, a colon, a blank, and some additional
    information. This is a subset of the ARPANET standard, simplified
    to allow simpler software to handle it.

What do we do now? Should B-news be hacked to accept tabs here? It's very easy to do. However, as more people convert to C-news (What does B-news 3.0 do?) this will represent a bigger and bigger problem to people who can't or won't upgrade past their current B-news.

Comments? Thanks,
--
Chris Lewis, Spectrix Microsystems Inc, Phone: (416)-474-1955
UUCP: {uunet!mnetor, utcsri!utzoo, lsuc, yunexus}!spectrix!clewis
Moderator of the Ferret Mailing List (ferret-list,ferret-request@spectrix)
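
As a rough illustration of how small a tab-tolerant check could be, here is a minimal sketch in C. It is not the B news macro/function pair, which the post above only paraphrases. The function name `header_value` and its interface are assumptions made for this example, and it compares case-sensitively, like the paraphrased `strncmp` call.

```c
#include <string.h>

/*
 * Hypothetical helper (not from B news): if `line` starts with the header
 * keyword `name`, a colon, and one blank (space OR tab), return a pointer
 * to the text after that blank.  Otherwise return NULL.
 * `name` is passed without the colon, e.g. "Date".
 */
const char *header_value(const char *line, const char *name)
{
    size_t n = strlen(name);

    /* Keyword must match exactly and be followed by a colon. */
    if (strncmp(line, name, n) != 0 || line[n] != ':')
        return NULL;

    /* Accept either kind of blank after the colon. */
    if (line[n + 1] != ' ' && line[n + 1] != '\t')
        return NULL;

    return line + n + 2;
}
```

With this check, both "Date: Mon, ..." and a "Date:" followed by a tab yield the same value string, while "Date:Mon" (no blank at all) is still rejected.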
clewis@spectrix.UUCP (Chris Lewis (It's loose again!)) (06/24/88)
In article <674@spectrix.UUCP>, clewis@spectrix.UUCP (Chris Lewis (It's loose again!)) writes:
> I've been having our B News (2.11 patch 14) hang occasionally while trying
> to parse batched news - goes into some sort of loop when it decides
> that an article is garbled. I'm still trying to fix the hang problem
> (the "#!rnews" piping business appears to hang in the write),
> but I've noticed that this usually comes from an article from utcsri
> (often from one of the upcoming events articles).

[synopsis: when an incoming batch contains an article with a tab instead of a space after the "Date:" token, rnews will report that the article is garbled and then sometimes hang.]

I seem to have resolved the hanging problem.

SYNOPSIS:
When the "rnews -S" invocation is parsing batches and tossing "#! rnews nnn"-prefixed chunks to separate forks of itself, if the forked rnews determines that the article is garbled, it will exit and the "rnews -S" will hang in the write to the pipe - potentially forever. This happens regardless of whether the batch is compressed or how many other articles are in the batch.

DISCUSSION:
The "rnews -S" personality decides whether to hand one article to a forked copy of itself through a pipe or through a temporary file. If the article is smaller than the buffer (CPBFSZ in ifuncs.c) it is piped; otherwise it is dumped to a "/tmp/unb*" temporary. Either way, rnews forks itself with the pipe or the temporary file as standard input. If the forked rnews determines that the message is garbled, it exits immediately.

HOWEVER, if the article is bigger than the in-core kernel pipe buffer size (PIPEMAX in some System V implementations - see SVID, or pipe(2)), but smaller than CPBFSZ, the write will not return immediately. It has written the first pipe-buffer-full, but has to wait for the destination to read the buffer before it can send the rest of the write buffer, and that never happens because the child rnews has exited.
This only appears to be a problem when CPBFSZ is bigger than the in-core kernel pipe buffer size and the article's size falls between the two (bigger than the pipe buffer, smaller than CPBFSZ). On Xenix 2.1.3 and NCR Tower SVR2 (and many other systems, e.g. V7 - we only have documentation for the two specific ones) the pipe buffer size is 5120. And CPBFSZ is 8192 (which I *think* is the BSD pipe buffer size).

REPEAT BY:
Create a batch file (with "#! rnews" header) which has an article about 6K long and garble the "Date: " header (e.g. change the space to a tab). Then issue:

    rnews -S < batchfile

It will say something like:

    inews: : Inbound news is garbled.

Then hang. If it doesn't hang, then congrats, you're probably on a machine with a bigger pipe buffer, and you don't need to do anything. Otherwise, you might want to do the fix given below. The fix is relatively harmless even if you don't *have* to do it - there'll simply be a few more articles using temporary files rather than pipes - only a slight performance hit.

FIX:
In ifuncs.c, find the "#define" for "CPBFSZ" and change it to be the same as or smaller than your pipe buffer size. We chose 4096. Rebuild, reinstall and voila.

NOTE:
This is a fragment of the piping code in ifuncs.c:

    /* parent of fork */
    if (rc == asize) {
        /* article fits in buffer */
        wc = write(piped[1], buf, rc);
        if (wc != rc) {
            fprintf(stderr, "write of %d to pipe returned %d", rc, wc);
            perror("rnews: write");
            exit(1);
        }
        (void) close(piped[0]);
        (void) close(piped[1]);
    }

We figured that part of the reason why the write didn't terminate was that the "close(piped[0])" comes *after* the write, not before. Both sides of the fork therefore have the input side of the pipe open, so there is no broken pipe when the child exits. I tried moving the close (before changing CPBFSZ - which is the size of buf in the write), and the test shown above under "REPEAT BY" would print one "garbled" message, a blank line, and ignore the next article in the batch.
--
Chris Lewis, Spectrix Microsystems Inc, Phone: (416)-474-1955
UUCP: {uunet!mnetor, utcsri!utzoo, lsuc, yunexus}!spectrix!clewis
Moderator of the Ferret Mailing List (ferret-list,ferret-request@spectrix)
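
The size condition described in the post above can be written down as a small predicate. This is only a sketch of the reasoning, not code from ifuncs.c: the names `choose_delivery` and `write_needs_reader` are invented for illustration, the pipe capacity is passed in as a parameter because it varies by system, and treating "fits in the buffer" as "less than or equal to CPBFSZ" is an assumption.

```c
#include <stddef.h>

/* How the parent "rnews -S" hands one article to its forked child. */
enum delivery { VIA_PIPE, VIA_TEMPFILE };

/*
 * Hypothetical model of the decision described in the post: an article
 * that fits in the CPBFSZ-sized buffer is piped, anything larger goes
 * through a temporary file.  (The "<=" boundary is an assumption.)
 */
enum delivery choose_delivery(size_t article_size, size_t cpbfsz)
{
    return article_size <= cpbfsz ? VIA_PIPE : VIA_TEMPFILE;
}

/*
 * Nonzero when the article is piped AND is larger than the kernel pipe
 * buffer, i.e. when a single blocking write() cannot complete unless the
 * child reads.  If the child has exited without reading and the parent
 * still holds the read end open, that write never returns.
 */
int write_needs_reader(size_t article_size, size_t cpbfsz,
                       size_t pipe_capacity)
{
    return choose_delivery(article_size, cpbfsz) == VIA_PIPE
        && article_size > pipe_capacity;
}
```

Using the numbers from the thread, a 6703-byte article with CPBFSZ at 8192 and a 5120-byte pipe buffer satisfies `write_needs_reader`. With CPBFSZ lowered to 4096 the same article is routed to a temporary file and the condition can no longer arise.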
eric@snark.UUCP (Eric S. Raymond) (06/24/88)
In article <674@spectrix.uucp>, Chris Lewis (It's loose again!) writes:
>What do we do now? Should B-news be hacked to accept tabs here? It's
>very easy to do. However, as more people convert to C-news (What
>does B-news 3.0 do?) this will represent a bigger and bigger problem to
>people who can't or won't upgrade past their current B-news.

The Right Thing in these situations is to be liberal about what you accept and conservative about what you send. Accordingly, B3.0 won't barf on tabs immediately after the colon, but (for header fields over which it has format control at header write time) it always generates a space there. I say unto the C news people: go thou and do likewise.
--
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: {{uunet,rutgers,ihnp4}!cbmvax,rutgers!vu-vlsi,att}!snark!eric
Post: 22 South Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
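
The "conservative about what you send" half of that rule might look something like the following C sketch. It is not taken from B 3.0 or any other news software. The function name `normalize_header` is hypothetical, and collapsing a whole run of blanks after the colon (rather than a single tab) is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy header line `in` into `out` (capacity `outsz`)
 * so that the keyword's colon is followed by exactly one space, whatever
 * mix of spaces and tabs the sender used.
 * Returns 0 on success, -1 if there is no colon or `out` is too small.
 */
int normalize_header(const char *in, char *out, size_t outsz)
{
    const char *colon = strchr(in, ':');
    const char *value;
    int n;

    if (colon == NULL)
        return -1;

    /* Skip whatever blanks follow the first colon. */
    value = colon + 1;
    while (*value == ' ' || *value == '\t')
        value++;

    /* Re-emit as "Keyword: value" with a single space. */
    n = snprintf(out, outsz, "%.*s: %s", (int)(colon - in), in, value);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Only the first colon is treated as the keyword separator, so a value such as "Re: tabs" in a Subject line passes through unchanged.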
clewis@spectrix.UUCP (Chris Lewis (It's loose again!)) (06/29/88)
In article <dSsed#3HOKJG=eric@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>In article <674@spectrix.uucp>, Chris Lewis (It's loose again!) writes:
>>What do we do now? Should B-news be hacked to accept tabs here? It's
>>very easy to do. However, as more people convert to C-news (What
>>does B-news 3.0 do?) this will represent a bigger and bigger problem to
>>people who can't or won't upgrade past their current B-news.
>The Right Thing in these situations is to be liberal about what you accept
>and conservative about what you send. Accordingly, B3.0 won't barf on tabs
>immediately after the colon, but (for header fields over which it has format
>control at header write time) it always generates a space there. I say unto the
>C news people: go thou and do likewise.

So, I went and did likewise for B news.

For the B news 2.11 people, 2.11 *almost* does likewise. It has "charmap" - a character mapping array that is used for comparing strings when you don't want to worry about case. To fix the "tab problem":

In file "funcs.c", look for the initialization of "charmap". In the second line of initialization, the second char is '\011' (map tab to tab). Change that to '\040' (map tab to space). Rebuild. Voila.

This means that, for the purposes of string comparison, a tab is treated as a space. This is in line with charmap's normal usage: a case translation table to be used in header field comparisons (so MESSAGE-ID is the same as Message-ID etc). Charmap is used elsewhere, but I don't think that this will cause other problems.

Further, when 2.11 regenerates the article to stuff into the spool area (and hence, of course, into outgoing batches), the thing after the colon is forced to be a space. Exactly the "Right Thing". [Thanks Eric for the phone call and the consult...]

This is only necessary if one of your neighboring feeds is a C-news site, or someone's generating articles thru some non-standard approach.
So, for those C news sites generating headers with tabs (utcsri, uthub etc) - well, if your stuff goes thru us, we'll fix it for you so that the rest of the world can see it. (For the next couple of days at least...)
--
Chris Lewis, These addresses will last only til June 30: (clewis@lsuc afterwards)
UUCP: {uunet!mnetor, utcsri!utzoo, lsuc, yunexus}!spectrix!clewis
Moderator of the Ferret Mailing List (ferret-list,ferret-request@spectrix)
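
The idea behind the charmap change can be shown with a small stand-alone table. This is a sketch of the technique only, not the contents of funcs.c: the table `fold`, its initializer `init_fold`, and the comparison routine `prefix_eq_folded` are all hypothetical names, and building the table at run time with `tolower` is an assumption (the post describes a statically initialized array).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical translation table in the spirit of "charmap". */
static unsigned char fold[256];

/* Map every character to lower case, then make tab compare as space. */
void init_fold(void)
{
    int i;

    for (i = 0; i < 256; i++)
        fold[i] = (unsigned char)tolower(i);
    fold['\t'] = ' ';   /* the one-entry change: tab maps to space */
}

/*
 * Return nonzero if `line` begins with `prefix` when both are viewed
 * through the table, so case differences and tab-versus-space are ignored.
 */
int prefix_eq_folded(const char *line, const char *prefix)
{
    for (; *prefix != '\0'; line++, prefix++) {
        if (fold[(unsigned char)*line] != fold[(unsigned char)*prefix])
            return 0;   /* also catches `line` ending early */
    }
    return 1;
}
```

After `init_fold()`, a line beginning "Date:" plus a tab matches the prefix "Date: ", and "MESSAGE-ID: " matches "Message-ID: ", while "Date:Mon" still fails because nothing maps to the missing blank.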