coolidge@brutus.cs.uiuc.edu (John Coolidge) (11/20/89)
There's an obvious problem that many people have remarked upon
involving the contradiction between the C News batching code in
nntpd vs. the continuous transmissions of nntplink. Since the
batching code relies on only writing articles every so often, lots
of articles are received when nntplink is run but aren't passed on
to relaynews until later, while nntpxmit-style transfers send the
article later but get it processed later. The end result is slower
article propagation and lots more dups.

I'm wondering what the various C News sites running NNTP have done to
get around this problem. My solution has been to force nntpd to write
each article as a separate batch and change the article-naming code to
avoid collisions. There are some other solutions being worked on, I
know (in fact, I'm working on one of them), and I'm interested in
seeing what people are doing.

Another side of the coin is this: if you're going to run nntplink to
someone, make sure they're not doing C News-style batching. Otherwise
you're not helping either yourself or your connection any. I've had
this problem with a few connections, since I've now switched over to
using nntplink for just about all my connections (with varied sleep
times in nntplink --- this, perhaps, should be a flag). As a
workaround, I've implemented a flag to nntplink that says "Always
break connection". I _could_ just run nntpxmit, but I'd rather keep
things consistently one way or the other.

--John
--------------------------------------------------------------------------
John L. Coolidge    Internet:coolidge@cs.uiuc.edu    UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1989 John L. Coolidge. Copying allowed if (and only if) attributed.
You may redistribute this article if and only if your recipients may as well.
New NNTP connections always available! Send mail if you're interested.
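
[Illustration] The workaround described in this post (each article
written as its own batch, with names chosen to avoid collisions) could
be sketched roughly as follows. This is not the actual nntpd change,
which is C code and is not shown in the thread. The function name, the
"nntp." file-name prefix and the name layout are all assumptions made
for the example.

```python
import itertools
import os
import time

# Per-process counter so two articles accepted in the same second by the
# same process still get different names (hypothetical naming scheme).
_sequence = itertools.count()


def write_article_as_batch(spool_dir: str, article: bytes) -> str:
    """Write one article as its own batch file and return the path."""
    while True:
        name = "nntp.%d.%d.%d" % (int(time.time()), os.getpid(), next(_sequence))
        path = os.path.join(spool_dir, name)
        try:
            # O_EXCL makes creation fail if the name is already taken,
            # so a collision can never overwrite another batch.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            continue  # name taken by someone else: try the next one
        with os.fdopen(fd, "wb") as f:
            f.write(article)
        return path
```

The sketch ignores one thing a real server has to handle: the reader of
the spool directory must not pick up a file that is still being written.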
lamy@ai.utoronto.ca (Jean-Francois Lamy) (11/20/89)
Obviously, running nntplink to leaf nodes is one case where batching
in nntpd does not get in the way. Also, by picking nntp feeds that
don't talk to each other you can reduce the likelihood of duplicates
(i.e. there is a point of diminishing returns in any flooding
algorithm, that where you start getting... flooded).

Even if no duplicate article was ever transmitted, NNTP feed mania can
still lead to a case where all you do all day is say no to IHAVE
requests...

I'd say: pick your feeds carefully, and weed out the less useful ones,
independently of any technical fix you may come up with for
inews/rnews/relaynews.

Jean-Francois Lamy      lamy@ai.utoronto.ca, uunet!ai.utoronto.ca!lamy
AI Group, Department of Computer Science, University of Toronto, Canada M5S 1A4
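
[Illustration] To make the "say no to IHAVE all day" point concrete,
here is a toy model of a server answering IHAVE offers from several
feeds. The response codes 335 (send the article) and 435 (article not
wanted) are the ones NNTP defines for IHAVE. Everything else, including
the function names and the use of an in-memory set as the history, is
an assumption for the example. A real server records an article in its
history only after it has been received and processed.

```python
SEND_IT = 335      # IHAVE: send article to be transferred
NOT_WANTED = 435   # IHAVE: article not wanted, do not send it


def handle_ihave(history: set, message_id: str) -> int:
    """Answer one IHAVE offer against the set of known message-IDs."""
    if message_id in history:
        return NOT_WANTED
    # Simplification: mark the article as seen as soon as we ask for it.
    history.add(message_id)
    return SEND_IT


def count_ihave_responses(feeds):
    """Return (accepted, refused) after every feed offers its articles."""
    history = set()
    accepted = refused = 0
    for offers in feeds:
        for message_id in offers:
            if handle_ihave(history, message_id) == SEND_IT:
                accepted += 1
            else:
                refused += 1
    return accepted, refused
```

With feeds that carry largely the same articles, most offers after the
first feed are refused, which is the overhead the post warns about even
when no duplicate article is ever transferred.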
coolidge@brutus.cs.uiuc.edu (John Coolidge) (11/20/89)
wesommer@athena.mit.edu (William Sommerfeld) writes:
>Funny you should notice. As it turns out, nntplink doesn't have to
>change; only nntpd need change. Patches aren't available yet (and
>might not be; I'm really busy; don't even ask for them), but the
>changes are simple enough to describe:

Yup, this is my opinion too. Really, there's nothing that nntplink
_could_ do differently to fix the problem (except close the
connection, but then what's the point :-) ).

>- I made nntpd aware of the NEWSCTL/LOCKinput lock file. If relaynews
>is running, this lock file exists. I rearranged the code in batch.c
>to queue the batch into NEWSARTS/in.coming *first*, and only fork/exec
>newsrun if relaynews isn't running.

I dispensed with nntpd forking off newsrun a long, long time ago. It
turned out to be moderately costly, and (much worse) crashed our
machine on a regular basis (something about having big things forking
off from inetd-invoked code doesn't make SunOS 4.0.3 very happy;
portmap died on a regular basis...). Of course, I've got something to
take the place of newsrun, and most people don't (if I had the time to
get things stable, it would help...).

>If articles are flowing in a continuous stream (less than a
>five-second delay between articles), they get batched using the
>existing rules (five minutes or 300KB, whichever comes first).

This is what I'm trying to avoid, however. My optimal situation is to
have each article received back on the outgoing wire with no delay at
all. That's impossible, but sub-30 seconds is a very reasonable
approximation of the goal. This requires things to happen very
quickly, which sort of blows batching away. It's good, though, for
people (most of them, I suspect) who consider rapid propagation a
secondary goal.

--John
wesommer@athena.mit.edu (William Sommerfeld) (11/20/89)
In article <1989Nov20.002159.26404@brutus.cs.uiuc.edu> coolidge@brutus.cs.uiuc.edu (John Coolidge) writes:
>There's an obvious problem that many people have remarked upon
>involving the contradiction between the C News batching code in
>nntpd vs. the continuous transmissions of nntplink. Since the
>batching code relies on only writing articles every so often, lots
>of articles are received when nntplink is run but aren't passed on
>to relaynews until later, while nntpxmit-style transfers send the
>article later but get it processed later. The end result is slower
>article propagation and lots more dups.

Funny you should notice. As it turns out, nntplink doesn't have to
change; only nntpd need change. Patches aren't available yet (and
might not be; I'm really busy; don't even ask for them), but the
changes are simple enough to describe:

- I made nntpd aware of the NEWSCTL/LOCKinput lock file. If relaynews
is running, this lock file exists. I rearranged the code in batch.c
to queue the batch into NEWSARTS/in.coming *first*, and only fork/exec
newsrun if relaynews isn't running.

- I rearranged the loop in serve.c to make the alarm timeout and
handler come from a global variable instead of a compiled-in constant.
At the top of the loop, the timeout and alarm handler variables are
reset to the default values.

- The function which implements the ihave command sets the timeout to
five seconds, and the alarm handler to a function which, if
NEWSCTL/LOCKinput doesn't exist, terminates the batch.
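
[Illustration] The first change in the list above (queue the batch
first, start newsrun only when the lock file is absent) might look
something like this. The real change lives in nntpd's batch.c and is
not shown in the thread, so this is only a sketch of the control flow.
The function name, the batch_name parameter and the start_newsrun
callable (standing in for the fork/exec of newsrun) are hypothetical.

```python
import os


def queue_batch(batch: bytes, batch_name: str, newsarts: str,
                newsctl: str, start_newsrun) -> bool:
    """Queue a batch, then start newsrun unless relaynews is running.

    Returns True if start_newsrun was called.
    """
    incoming = os.path.join(newsarts, "in.coming")
    os.makedirs(incoming, exist_ok=True)

    # Step 1: always queue the batch first, whatever happens next.
    with open(os.path.join(incoming, batch_name), "wb") as f:
        f.write(batch)

    # Step 2: per the post, LOCKinput exists while relaynews is running.
    # In that case the batch is simply left queued and nothing is started.
    if os.path.exists(os.path.join(newsctl, "LOCKinput")):
        return False

    # Step 3: nothing is running, so kick off processing ourselves.
    # (The check above is not atomic; this sketch ignores that race.)
    start_newsrun()
    return True
```

Queueing before checking the lock means a batch is never lost just
because the processing side happens to be busy.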

The effect is that: if articles are coming in one at a time and the
machine isn't backlogged, they get processed one at a time.

If articles are flowing in a continuous stream (less than a
five-second delay between articles), they get batched using the
existing rules (five minutes or 300KB, whichever comes first).

If the machine is backlogged (relaynews is running), the articles get
processed in batches.
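
[Illustration] The three cases above can be checked with a small
simulation. It is a model of the rules as stated in the post, not the
nntpd code: arrival times are supplied as data instead of being driven
by an alarm handler, the five-minute rule is only evaluated when the
next article arrives, relaynews is either running or not for the whole
run, and "300KB" is taken to mean 300 * 1024 bytes. All names are made
up for the example.

```python
IDLE_TIMEOUT = 5.0        # seconds of silence that end a batch
MAX_AGE = 300.0           # existing rule: five minutes
MAX_BYTES = 300 * 1024    # existing rule: 300KB (assumed 1024-byte KB)


def split_into_batches(arrivals, relaynews_running=False):
    """Group (time_seconds, size_bytes) arrivals into batches."""
    batches = []
    current = []
    start = 0.0
    size = 0
    last = 0.0
    for t, nbytes in arrivals:
        if current:
            # The idle timer only ends the batch if relaynews is NOT running.
            idle = (t - last) >= IDLE_TIMEOUT and not relaynews_running
            aged = (t - start) >= MAX_AGE
            if idle or aged:
                batches.append(current)
                current, size = [], 0
        if not current:
            start = t
        current.append((t, nbytes))
        size += nbytes
        last = t
        if size >= MAX_BYTES:
            batches.append(current)
            current, size = [], 0
    if current:
        batches.append(current)
    return batches
```

For example, three 1000-byte articles arriving ten seconds apart give
three one-article batches on an idle machine, but a single batch when
relaynews_running is True. The same three articles arriving one second
apart give a single batch in both cases.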

We've only been running this way for a couple of days now on
bloom-beacon.mit.edu and snorkelwacker.mit.edu, and it *seems* to be
working well, but it still hasn't been exposed to a full volume
during-the-week feed, so I don't know if it will break down.

Given this kind of code in nntpd, it would make sense for nntplink to
*not* close the connection after every 20 articles... given average
article sizes, every 100 articles would be more like it; that way, if
the machine is backed up, you get large batches which allow C news to
run at full blast.
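
[Illustration] The "20 vs. 100 articles" remark can be read against the
300KB batch limit mentioned earlier. The post gives no figure for the
average article size, so the 3KB value used below is purely an
assumption chosen to show the arithmetic, as is reading "300KB" as
300 * 1024 bytes.

```python
BATCH_LIMIT_BYTES = 300 * 1024  # the 300KB rule, assuming 1024-byte KB


def articles_to_fill_batch(avg_article_bytes: int,
                           limit: int = BATCH_LIMIT_BYTES) -> int:
    """Number of average-sized articles needed to reach the size limit."""
    # Ceiling division: the batch is cut once the limit is reached.
    return -(-limit // avg_article_bytes)


def batch_fill_fraction(articles: int, avg_article_bytes: int,
                        limit: int = BATCH_LIMIT_BYTES) -> float:
    """Fraction of the size limit used by a run of `articles` articles."""
    return articles * avg_article_bytes / limit
```

Under that assumption, articles_to_fill_batch(3 * 1024) is 100, while a
connection closed after 20 articles fills only a fifth of a batch
(batch_fill_fraction(20, 3 * 1024) is 0.2).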

We're running C news/NNTP on slow machines with slow disks, and it
seems to be keeping up; B news was running at the edge (bloom-beacon's
load was continuously over 10 with B news; these days, it seems to be
hovering around 1-2..).

The five second delay seems fairly short, but 30 seconds wasn't enough
to avoid lots of dups.
--
Henry Spencer is so much of a | Bill Sommerfeld at MIT/Project Athena
minimalist that I often forget | sommerfeld@mit.edu
he's there - anonymous |