[news.admin] how to lower news-transmission overhead

eric@snark.UUCP (EricS.Raymond) (12/14/88)

In article <incredibly-long-and-bogus-ID:-)> Karl Kleinpaste writes:
<subscription-matching is a major chunk of rnews overhead>

I believe Karl is correct, and have done something about it. Henry Spencer and
I discussed this at the '86 Worldcon. He told me at the time that his profiles
showed ngmatch() eating up 15% or more of the elapsed time in rnews runs, and
passed along a nifty idea for reducing the bite that he said was going to be
implemented in C news.

I used it. If you compile 3.0 with FEEDBITS on, per-newsgroup subscription
bits are computed en masse *once* at the start of each rnews run. If you
compile with CACHEBITS, this information is stashed away in half-compiled form
the first time it's computed, and the high-overhead part of the computation
is only done when the sys-file-equivalent changes.

Result? On most runs you pay only for N sscanf(buf, "%x", ....) calls, where N
is the number of currently active groups. This is a significant win.
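
A minimal sketch of the caching idea, not the actual 3.0 code. It assumes each active group's subscription bits fit in one unsigned int (bit i set means feed site i gets the group) and that the cache holds one "name hexbits" line per group. The struct, the function names, and the line format are all hypothetical; only the use of sscanf() with "%x" comes from the post.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define MAXNAME 80

/* Hypothetical cache record: one per active newsgroup. */
struct groupbits {
    char name[MAXNAME];
    unsigned int feeds;     /* bit i set: feed site i subscribes */
};

/* Write one cache line, "name hexbits\n". Returns 0, or -1 if buf is too small. */
static int format_entry(char *buf, size_t bufsz, const struct groupbits *g)
{
    int n = snprintf(buf, bufsz, "%s %x\n", g->name, g->feeds);
    return (n < 0 || (size_t)n >= bufsz) ? -1 : 0;
}

/* Reload one cache line. This is the cheap path: a single sscanf per group,
 * with no subscription-pattern matching at all. Returns 0 on success. */
static int parse_entry(const char *line, struct groupbits *g)
{
    return sscanf(line, "%79s %x", g->name, &g->feeds) == 2 ? 0 : -1;
}

/* Does feed site number `site` get this group? */
static int feed_wants(const struct groupbits *g, int site)
{
    assert(site >= 0 && site < (int)(sizeof g->feeds * CHAR_BIT));
    return (int)((g->feeds >> site) & 1u);
}
```

The expensive step, running the matcher for every group against every feed's subscription list, would be needed only to regenerate these lines when the sys-file-equivalent changes.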

Another helpful feature I've implemented for shortening subscription lines
is a {foo,bar,baz,...} alternation syntax like that of csh. My ngmatch() does
the right recursive tricks so that combinations of ! and {...} work and nesting
of alternations works. Time overhead for this new feature is insignificant, and
it allows subscriptions to be expressed much more concisely. Consider

	comp.sys.{all,!ibm.pc,ibm.pc.rt,!mac,!amiga,!next,!sgi,!sun}

This is exactly equivalent to:

	comp.sys.all,!comp.sys.ibm.pc,comp.sys.ibm.pc.rt,!comp.sys.mac,
		!comp.sys.amiga,!comp.sys.next,!comp.sys.sgi,!comp.sys.sun

Which would *you* rather read? :-) And because the computation cost of
ngmatch() is mostly a function of subscription list length, the top version
also resolves more quickly.
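
The equivalence above can be checked with a small expander. This is only a sketch under assumptions: it flattens one subscription item into the long form, whereas the post describes ngmatch() handling the braces recursively during matching. The names emit(), expand(), expand_item() and the MAXPAT limit are hypothetical, and a leading ! on an alternative is assumed to negate the whole expanded pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXPAT 256

/* Append one flat pattern to the comma-separated list in out.
 * Returns 0 on success, -1 if out would overflow. */
static int emit(char *out, size_t outsz, const char *pat, int negated)
{
    size_t used = strlen(out);
    int n = snprintf(out + used, outsz - used, "%s%s%s",
                     used ? "," : "", negated ? "!" : "", pat);
    return (n < 0 || (size_t)n >= outsz - used) ? -1 : 0;
}

/* Expand the first {...} group in pat, then recurse on each result so that
 * nested and later groups are expanded too. */
static int expand(const char *pat, int negated, char *out, size_t outsz)
{
    const char *lb = strchr(pat, '{');
    const char *rb, *p, *alt;
    int depth = 0;

    if (lb == NULL)
        return emit(out, outsz, pat, negated);

    /* Find the brace that closes lb, skipping nested pairs. */
    for (rb = lb; *rb != '\0'; rb++) {
        if (*rb == '{')
            depth++;
        else if (*rb == '}' && --depth == 0)
            break;
    }
    if (*rb == '\0')
        return -1;                      /* unbalanced braces */

    /* Split the group body on commas that are not inside a nested group. */
    depth = 0;
    alt = lb + 1;
    for (p = lb + 1; p <= rb; p++) {
        if (*p == '{')
            depth++;
        else if (*p == '}' && p != rb)
            depth--;
        if ((*p == ',' && depth == 0) || p == rb) {
            char next[MAXPAT];
            int neg = negated;
            int n;

            if (*alt == '!') {          /* !foo negates the whole pattern */
                neg = 1;
                alt++;
            }
            /* prefix + this alternative + everything after the group */
            n = snprintf(next, sizeof next, "%.*s%.*s%s",
                         (int)(lb - pat), pat, (int)(p - alt), alt, rb + 1);
            if (n < 0 || (size_t)n >= sizeof next)
                return -1;
            if (expand(next, neg, out, outsz) != 0)
                return -1;
            alt = p + 1;
        }
    }
    return 0;
}

/* Expand a single subscription item (no top-level commas) into out. */
static int expand_item(const char *item, char *out, size_t outsz)
{
    int negated = 0;

    assert(outsz > 0);
    out[0] = '\0';
    if (*item == '!') {
        negated = 1;
        item++;
    }
    return expand(item, negated, out, outsz);
}
```

Calling expand_item() on the brace form above produces the eight-entry long form, in the same order. Flattening like this shows only that the two forms mean the same thing; it says nothing about the speed of matching the brace form directly.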

Finally, I should note that Shane McCarron's UUCP multicast code is
also incorporated in 3.0. Yes, somebody *has* been listening...and I will
continue to put a high priority on supporting features that reduce and help
control the costs of feeding other sites.
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      Email: eric@snark.uu.net                       CompuServe: [72037,2306]
      Post: 22 S. Warren Avenue, Malvern, PA 19355      Phone: (215)-296-5718

david@ms.uky.edu (David Herron -- One of the vertebrae) (12/14/88)

ok, let me throw you a small curve and see what you do with it.

Our news system is over on one machine, s.ms.uky.edu, a Sequent Symmetry
with 22 processors, 40 megs of memory and enough CPU horsepower that it's
starting to edge into Mainframe & SuperComputer territory.  (Please,
I know that I'm being slightly loose with the terminology.)  We have oodles
of NNTP feeds running from that machine (and if you're reading this and
are on SURAnet, or are otherwise fairly close to SURAnet, contact me for
a feed).

*BUT* our other, non-NNTP, feeds don't run off that machine.  Instead
they run from g.ms.uky.edu which is a poor little uVaxII with only
13 megs of memory, 1 processor, and enough CPU horsepower to sneeze
at random intervals.  The hardest part of the configuration around
here is that nobody else in the world, that I know of, runs a system
with news files on one system and all the UUCP connections on another.
(Not to mention the BITNET connections ... we do feed & receive news
over BITNET).  

So of course all the shell scripts and such assume that to send a batch 
out you call uux.  On our system we call a ".cmd" file to save the
batch away into a file in the batching directory, then other shell
scripts pick up these files and compress & uux them over to the
remote machine...

Can *that* be handled more cleanly?


-- 
<-- David Herron; an MMDF guy                              <david@ms.uky.edu>
<-- ska: David le casse\*'      {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<--
<-- By Michelle betrayed!

henry@utzoo.uucp (Henry Spencer) (12/16/88)

In article <eTIoh#QDAT7=eric@snark.UUCP> eric@snark.UUCP (EricS.Raymond) writes:
>[HS] passed along a nifty idea for reducing the bite that he said was going
>to be implemented in C news...

Ironically, although we'd thought it out in detail, we ended up dealing
with this problem mostly by lubricating existing code, and never did
implement precompiled matching in C News!
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu