[news.software.b] Cnews performance?

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/23/89)

Here are some times for cnews vs. 2.11.14 to process a single 
compressed batch (~50K) (all the way from rnews to the spool 
directory): 

2.11.14:

real:  12.1 
user:  2.2 
sys:   1.8 

cnews:

real:  28.1 
user:  2.1 
sys:   6.8

This was a quick and dirty test - can anyone confirm these results?  
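
If anyone wants to duplicate the test, it amounts to something like this
(paths and the batch file name are illustrative; under cnews the work isn't
done until newsrun has run, so both steps get timed together):

    # 2.11: rnews unbatches straight into the spool
    time rnews < batch.Z

    # cnews: rnews only queues the batch, newsrun does the rest
    time sh -c 'rnews < batch.Z; /usr/lib/newsbin/input/newsrun'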

I see roughly 16 processes starting up for each batch that cnews
processes.  Does anyone else find this surprising?

-- 
  Jon Zeeff			zeeff@b-tech.ann-arbor.mi.us
  Ann Arbor, MI			sharkey!b-tech!zeeff

lamy@ai.utoronto.ca (Jean-Francois Lamy) (06/23/89)

I think that a lot of things are done in the input subsystem in a
general, but inefficient way.  For example, all your UUCP feeds may
come in in the same format; testing each batch to see what format
it is in is clearly wasted, especially if the test involves starting
a process like compress only to see it fail...
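
Even when the test is kept, there is no need to fork compress to make it;
the compress magic number (octal 037 235) can be checked directly.  A sketch,
with file names and the unbatcher invocation purely illustrative:

    magic=`dd if=$batch bs=2 count=1 2>/dev/null | od -b | awk 'NR==1 {print $2, $3}'`
    case "$magic" in
    "037 235")  compress -d < $batch | relaynews ;;   # compressed batch
    *)          relaynews < $batch ;;                 # uncompressed batch
    esac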

We just bypass the input subsystem; we use an lpr queue that calls relaynews.
Our newsrun just runs lpr on the files, and only nntpd ever calls it.  The
"rnews" called by UUCP just does an lpr.  Our feeds from mailing lists just
do minimal header munging and call lpr too, bypassing inews.
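
The glue amounts to something like this (printer name, spool directory, and
paths are illustrative, not our exact setup):

    # /etc/printcap: a "printer" whose input filter is relaynews
    news:lp=/dev/null:sd=/usr/spool/newsq:sh:\
        :if=/usr/lib/news/bin/relaynews-filter:

    # /usr/lib/news/bin/relaynews-filter: lpd hands it the job on stdin
    #!/bin/sh
    exec /usr/lib/newsbin/relay/relaynews

    # rnews as invoked by uuxqt: just queue the batch
    #!/bin/sh
    exec lpr -Pnews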

The gains in running C news are as follows, I think:
- fast relaynews
- faster expire
- easily customizable inews (though some parts should be done in C -- the
  search for moderators comes to mind).

The trick is to look at what your input batches are like and to get them
to relaynews asap.  In other words, trim your sails.

Jean-Francois Lamy               lamy@ai.utoronto.ca, uunet!ai.utoronto.ca!lamy
AI Group, Department of Computer Science, University of Toronto, Canada M5S 1A4

rkh@mtune.ATT.COM (Robert Halloran) (06/23/89)

In article <9476@b-tech.ann-arbor.mi.us> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>Here are some times for cnews vs. 2.11.14 to process a single 
>compressed batch (~50K) (all the way from rnews to the spool 
>directory): 

>2.11.14: real:  12.1 user:  2.2 sys:   1.8 
>
>cnews: real:  28.1 user:  2.1 sys:   6.8
>
>This was a quick and dirty test - can anyone confirm these results?  

On a pair of 3b2/700's, one with 2.11 and the other with Cnews, on a
common feed from 'att', I ran the 'acctcom' routine through an awk script
to sum up processes and CPU time for 2.11 rnews vs. Cnews' relaynews, on the 
grounds that these were the routines actually unbatching news.  The results I got were:
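
The awk end of it amounts to something like this (not the exact script; the
CPU-seconds column position in particular varies with acctcom options and
release):

    acctcom | awk '$1 == "relaynews" { n++; cpu += $7 }
        END { printf "%d relaynews processes, %.2f CPU-secs total\n", n, cpu }'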

For five hours (0445 - 0945) of 2.11:  591 rnews processes, 330.03 CPU-secs total

For 5.5 hours (0400 - 0930) of Cnews:  27 relaynews processes, 52.99 CPU-secs total.


						Bob Halloran
=========================================================================
UUCP: att!mtune!rkh				Internet: rkh@mtune.ATT.COM
USPS: 17 Lakeland Dr, Port Monmouth NJ 07758	DDD: 201-495-6621 eve ET
Disclaimer: If you think AT&T would have ME as a spokesman, you're crazed.
Quote: "Will I dream?" - HAL, "2010"

chip@ateng.com (Chip Salzenberg) (06/23/89)

According to lamy@ai.utoronto.ca (Jean-Francois Lamy):
>[C News unbatching is inefficient because] all your UUCP feeds may
>come in in the same format; testing each batch to see what format
>it is in is clearly wasted, especially if the test involves starting
>a process like compress only to see it fail...

I hardly think that running compress only to see it fail is a waste.  In the
future, long after you've forgotten what you did to your news subsystem, you
may make a new connection with a site that compresses its batches.  Or your
neighbor(s) may start sending compressed batches without telling you.

>We just bypass the input subsystem; we use an lpr queue that calls relaynews.

This hack has got to be the most imaginative use of lpr I've ever seen!  You
get a 10.0 for originality.  Of course, if you accidentally specify the
wrong "device", things could get ugly... :-)

>The trick is to look at what your input batches are like and to get them
>to relaynews asap.  In other words, trim your sails.

That's the ticket, laddie.  Or, to paraphrase Strunk & White:

    Omit needless code!  Omit needless code!  Omit needless code!

-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg         |       <chip@ateng.com> or <uunet!ateng!chip>
A T Engineering         |       Me?  Speak for my company?  Surely you jest!

henry@utzoo.uucp (Henry Spencer) (06/24/89)

In article <89Jun23.003815edt.11717@neat.ai.toronto.edu> lamy@ai.utoronto.ca (Jean-Francois Lamy) writes:
>I think that a lot of things are done in the input subsystem in a
>general, but inefficient way...

The just-released patch streamlines newsrun considerably; it had got kind
of flabby and baroque.  However, Jean-Francois is correct in saying that
it will usually be a performance win to exploit special knowledge of the
nature (e.g. all compressed, none compressed) of your feeds.
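
For example, a site whose feeds are all compressed could use a stripped-down
newsrun along these lines (paths and the queue-file handling are illustrative
only, not what the distributed script does):

    cd /usr/spool/news/in.coming || exit 1
    for f in *                       # whatever rnews has queued
    do
        [ -f "$f" ] || continue
        compress -d < "$f" | /usr/lib/newsbin/relay/relaynews && rm -f "$f"
    done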
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (06/24/89)

In article <9476@b-tech.ann-arbor.mi.us> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>Here are some times for cnews vs. 2.11.14 to process a single 
>compressed batch (~50K) ...

This surprised us considerably.  In some ways it is not a fair test, since
newsrun (if you run it once an hour as we suggest) will normally amortize
setup and teardown overhead over a considerable number of batches, not just
one.  However, it's also true that it had gotten a bit flabby.  The just-
released patch streamlines it quite a bit.
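
For reference, "once an hour" here just means a cron entry for the news
owner along these lines (the path is a typical default, not necessarily
yours):

    0 * * * * /usr/lib/newsbin/input/newsrun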
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/24/89)

>common feed from 'att', I ran the 'acctcom' routine through an awk script
>to sum up processes and CPU time for 2.11 rnews vs. Cnews' relaynews, on the 
>grounds that these were the routines actually unbatching news.  The results I got were:
>

You need to look at the whole cnews system, not just the program that 
does the unbatching.  Unless you have modified things, I maintain that 
my figures are more accurate.  

What we need is a 1 process rnews/relaynews.  I'm convinced it can be done.

-- 
  Jon Zeeff			zeeff@b-tech.ann-arbor.mi.us
  Ann Arbor, MI			sharkey!b-tech!zeeff

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (06/24/89)

>>Here are some times for cnews vs. 2.11.14 to process a single 
>>compressed batch (~50K) ...
>
>This surprised us considerably.  In some ways it is not a fair test, since
>newsrun (if you run it once an hour as we suggest) will normally amortize
>setup and teardown overhead over a considerable number of batches, not just
>one.  

If you don't use rnews.immed, I agree.  I've subtracted off the times
to run newsrun when there is nothing to be done (i.e., the fixed per-run
overhead at the beginning of newsrun).
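
That baseline is easy enough to measure: just time newsrun with nothing
queued (the path here is only a typical default):

    time /usr/lib/newsbin/input/newsrun     # in.coming empty, so this is
                                            # pure startup overhead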

So now I get:

cnews  real: 23.6  user: 1.9  sys: 5.5    

2.11   real: 12.1  user: 2.2  sys: 1.8

Still not the results I want to see (especially since cnews did much 
more disk i/o, more of a problem here than cpu).  I'll have to try the 
latest version to see what difference that makes.  


BTW, it seems to me that a numeric patch tracking mechanism would make it
easier to know what patches you need.

-- 
  Jon Zeeff			zeeff@b-tech.ann-arbor.mi.us
  Ann Arbor, MI			sharkey!b-tech!zeeff