[news.software.anu-news] A New Topic -- Tuning News

TLIMONCE@DRUNIVAC.BITNET (Tom Limoncelli@ Drew Univ.) (10/29/89)

I would like to start a discussion on how to tune news so that it is
as fast as possible when starting up.  Currently it takes about 3-4
minutes to start.  When we get 90-100 users on-line, it goes up to
8-10 minutes.  Neither is reasonable.
     
I understand that eventually it will be brought up to almost no time
at all, but while we're waiting there must be a bunch of "do"s and
"don't"s.
     
Let me tell you about my system.
VAX 6330
VMS 5.1-1
ANU NEWS v5.9A
     
We have about 400 newsgroups.  About 300 are very inactive.  77 of
them are fed from Bitnet mailing lists.  The rest are empty and
unused (we have one for each course being offered, whether requested
or not).  We are about to start receiving a full Usenet newsfeed,
which will add at least another 400 newsgroups.  (Yea DECUS UUCP!)
     
All news is stored on a fast drive (not sure of model number) which
basically has no other activity.  We run a defragment program on the
disk.  I can certainly feel it when the defrag isn't running:
performance drops.  So, Tom's first tip: get a defragmenter program.
We've initialized the disk with a cluster-size of 1.
     
Does anyone have any advice?  What's the general feeling on how this
is set up?
     
A while ago someone posted tips for parameters that Geoff should use
when opening files, etc., but I don't know if they were implemented.
Any news for 5.9B?
     
-Tom
---
 Tom Limoncelli -- tlimonce@drunivac.Bitnet -- limonce@pilot.njin.net
       Drew University -- Box 1060, Madison, NJ -- 201-408-5389
:)   Standard Disclaimer: I am not the mouth-piece of Drew University
(:  "DEC's All-In-1 isn't completely useless, but it's a nice attempt."

gih900@UUNET.UU.NET (Geoff Huston) (11/01/89)

Tom Limoncelli writes:...
	I would like to start a discussion on how to tune news so that it is
	as fast as possible when starting up.  Currently it takes about 3-4
	minutes to start.  When we get 90-100 users on-line, it goes up to
	8-10 minutes.  Neither is reasonable.
     
	I understand that eventually it will be brought up to almost no time
	at all, but while we're waiting there must be a bunch of "do"s and
	"don't"s.
     
I have done almost as much as I can without getting into rewriting slabs of
code.
     
I have implemented as many of the speedup suggestions as are feasible, but the
one about opening the indexed files in read only mode will not produce any
speed up because you have to allow concurrent writers, and therefore the RMS
lock overheads are always with you.
     
Apart from setting up global buffers for the two indexed files there is little
else that I can suggest off-hand. The required directions are:
a) make the newsrc file an indexed file rather than streamlf text.
b) make the creation of the internal newsgroup data structure more direct by
using the create and map section call on a memory image
c) store the newsgroup directory screens in a file which can also be easily
picked up on startup.
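
Direction (b) amounts to building the newsgroup table once and mapping the
finished image at startup, rather than rebuilding it record by record. The
sketch below illustrates that idea in Python with the standard mmap module.
It is an analogy only, not ANU NEWS code and not the VMS create and map
section service; the record layout, file name and function names are all
hypothetical.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: 32-byte group name, item count, highest item number.
RECORD = struct.Struct("<32sII")


def write_image(path, groups):
    """Build the memory image once (for example, after news has been added)."""
    with open(path, "wb") as f:
        for name, count, highest in groups:
            f.write(RECORD.pack(name.encode("ascii"), count, highest))


def map_image(path):
    """Map the prebuilt image and decode each record straight from the mapping."""
    result = []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as image:
            for offset in range(0, len(image), RECORD.size):
                name, count, highest = RECORD.unpack_from(image, offset)
                result.append((name.rstrip(b"\0").decode("ascii"), count, highest))
    return result


# Hypothetical sample data.
groups = [("news.software.anu-news", 12, 340), ("comp.os.vms", 57, 9021)]
path = os.path.join(tempfile.mkdtemp(), "groups.img")
write_image(path, groups)
loaded = map_image(path)
print(loaded)
```

The point of the design is that startup cost becomes one mapping call plus
fixed-offset reads, with no keyed lookups per newsgroup.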
     
All these things require time and resources, both of which from a personal
viewpoint are in very scarce supply. As with many other aspects of NEWS development
this newsgroup is an appropriate forum for others to contribute code to NEWS. I
would, as always, welcome any contributions of code.
     
cheers,
Geoff Huston

rand@merrimack.edu (01/26/90)

In article <8910310355.AA11260@uunet.uu.net>, munnari!csc.anu.oz.au!gih900@UUNET.UU.NET (Geoff Huston) writes:
> Tom Limoncelli writes:...
> 	I would like to start a discussion on how to tune news so that it is
>      
> Apart from setting up global buffers for the two indexed files there is little

Well, I got around to doing this the other day and it helped, a bit.
I dug up some DECUS symposium notes and followed Ken Henerson's notes on
'turbo charging' RMS files. I've wanted to post something more cohesive
than this but work is getting in the way ;-)

A short cookbook approach for those who don't know how to set Global buffers
on files:

First of all, if you want the quick way out and your site resembles mine
(I keep about 450-500 newsgroups [news.groups] and for most groups
Item_hold=1 [news.items]. About 30-40 are set to 4-7 days.) issue the
following 2 commands (assuming you have a few extra GBLSECTIONS and
GBLPAGES hanging around)

$ SET FILE/GLOBAL_BUFFER=3 NEWS_ROOT:NEWS.GROUPS
$ SET FILE/GLOBAL_BUFFER=44 NEWS_ROOT:NEWS.ITEMS

Where did these two magic numbers come from? Here is how to work them out
for your own site.

To do this you need to have exclusive control of news.groups and news.items.
So, stop news_batch and kick everyone off news.

Do $ ANAL/RMS/FDL NEWS.GROUPS

This produces a file NEWS.FDL (grrr).

Do $ EDIT/FDL NEWS.FDL

This drops you into the FDL Editor. Select, in order, INVOKE, OPTIMIZE,
and LINE. Press RETURN a bunch of times (unless you know what you're
doing) until a graph is displayed. Select FD. Press return at the FDL Title
section. In the following display note: buckets in index, maximum bucket
size, and pages required to cache index. Press returns through all of the
KEY 1 stuff (technical, eh?) until you get back to the main menu. Select
EXIT.

Buckets in index should be pages required to cache index divided by
bucket size. Edit news.fdl and add the line GLOBAL_BUFFER_COUNT n in
the FILE section of the FDL where n is the number of pages required to
cache index.
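
The cross-check in the paragraph above is simple arithmetic, sketched here in
Python. The readings are hypothetical, not taken from any real EDIT/FDL
display. RMS is generally described as allocating one global buffer per
bucket, so it may be worth comparing both the page figure and the bucket
figure against the count you end up setting.

```python
def buckets_in_index(pages_to_cache_index, bucket_size_pages):
    """Cross-check from the text: buckets = pages to cache index / bucket size."""
    buckets, leftover = divmod(pages_to_cache_index, bucket_size_pages)
    if leftover:
        # The display figures should divide evenly; if not, re-read them.
        raise ValueError("pages is not a whole multiple of the bucket size")
    return buckets


# Hypothetical readings from the EDIT/FDL display.
pages = 24
bucket_size = 4
buckets = buckets_in_index(pages, bucket_size)
print(buckets)
```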

If you want to be more productive, look at the compression stats in
NEWS.FDL;-0 in the Analysis of Key sections. If a key or record is compressed
(look in KEY description at top of file) and the stats are lousy (<50%)
turn compression off by editing news.fdl.0.

The FDL is now optimized. To apply it to news.groups do (whilst no one is
using NEWS):
$ CONVERT/FDL=NEWS.FDL NEWS.GROUPS NEWS.GROUPS

Now (huff, pant) do the same for news.items (this takes a looooong time).

Each file you slap global_buffers on needs one GBLSECTION. You may also need
to boost GBLPAGES, GBLPAGFIL, and RMS_GBLBUFQUO.

Since global buffers count against a process' working set, make sure your
news users can handle a hit of (max bucket size*pages required to cache index)
pages for each of news.groups and news.items.
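
As a rough worked example of that rule of thumb, here is the same
multiplication in Python. The bucket sizes and page counts are hypothetical
placeholders; substitute the figures EDIT/FDL showed for your own files.

```python
def working_set_hit(max_bucket_size, pages_to_cache_index):
    """Pages charged per file, using the rule of thumb from the text."""
    return max_bucket_size * pages_to_cache_index


# Hypothetical figures for the two indexed files.
files = {
    "NEWS.GROUPS": {"max_bucket_size": 2, "pages_to_cache_index": 6},
    "NEWS.ITEMS": {"max_bucket_size": 3, "pages_to_cache_index": 132},
}

hits = {name: working_set_hit(**figures) for name, figures in files.items()}
total = sum(hits.values())
print(hits, total)
```

The total is the extra working set each news user would need to absorb under
these assumptions.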

Disclaimer: I know squat about rms file tuning. This improved my
performance, yours may vary.


Rand P. Hall                    UUCP: {uunet,wang,ulowell}!samsung!hubdub!rand
Merrimack College               CSNET: rand@merrimack.edu
N. Andover, MA 508.683.7111     Dukakis = 15% tax hike + $1.3 billion deficit

jeh@simpact.com (02/06/90)

The thing I'm most unhappy about wrt NEWS performance is the startup time,
ie the time from when I type NEWS at the DCL prompt until I see my first
unread item or newsgroup directory or whatever.  Once it's started I think
the performance is acceptable, except for....

...the constant returning to the news item directory when it's unnecessary 
(from my point of view, anyway). 

	--- Jamie Hanrahan, Simpact Associates, San Diego CA
Chair, VMSnet [DECUS uucp] and Internals Working Groups, DECUS VAX Systems SIG 
Internet:  jeh@simpact.com, or if that fails, jeh@crash.cts.com
Uucp:  ...{crash,scubed,decwrl}!simpact!jeh

tinkelman@ccavax.camb.com (02/06/90)

In article <903.25cdb672@simpact.com>, Jamie Hanrahan, Simpact Associates, 
San Diego CA <jeh@simpact.com> writes:

> The thing I'm most unhappy about wrt NEWS performance is the startup time,
> ie the time from when I type NEWS at the DCL prompt until I see my first
> unread item or newsgroup directory or whatever.  

This is not a solution, but what I did (to reduce the number of times that
I had to wait) was to run NEWS as one of my `kept processes'.  F17 puts me
into MAIL, F18 into NEWS, etc.  I just have to remember to type UPDATE 
every once in a while.  UPDATE takes a while, but less, it seems, than starting
NEWS, and it's more under my control.
-- 
Bob Tinkelman, Cambridge Computer Associates, Inc., 212-425-5830              
bob@ccavax.camb.com  or ...!uunet!ccavax!bob      

morgand@putz.uucp (Dave Morgan) (02/10/90)

Regarding NEWS performance,

In article <903.25cdb672@simpact.com>, jeh@simpact.com writes:
> The thing I'm most unhappy about wrt NEWS performance is the startup time,
> ie the time from when I type NEWS at the DCL prompt until I see my first
> unread item or newsgroup directory or whatever.  Once it's started I think
> the performance is acceptable, except for....
> 

  I agree with this -- especially in light of the fact that NEWS gets
invoked for *each* batch of news that needs to be added.  I usually
receive on the order of 100-150 news batches per day which amounts
to lots of wasted CPU time just in invoking NEWS.

> ...the constant returning to the news item directory when it's unnecessary 
> (from my point of view, anyway). 
> 

  This is another biggie on my "hit" list.  This is especially
frustrating when reading news from a 2400 baud line.

-- 

	Dave Morgan			503-643-7401 (H)
	putz!morgand			503-642-6311 (W)

spain@mdcbbs.com (02/11/90)

> In article <903.25cdb672@simpact.com>, jeh@simpact.com writes:
> The thing I'm most unhappy about wrt NEWS performance is the startup time,
> ie the time from when I type NEWS at the DCL prompt until I see my first
> unread item or newsgroup directory or whatever.  Once it's started I think
> the performance is acceptable, except for....

At least we don't have to wait for the UPDATE (like VaxNotes).  Perhaps NEWS is
spending time checking out the unseen newsitems...

-- 
  =============================================================
 | Harrison M. Spain III |    Voice: (714) 952-6114            |
 | Sr. Section Manager   |      Fax: (714) 952-5371            |
 | McDonnell Douglas M&E | Internet: spain@mdcbbs.com          |
 | 5701 Katella Ave.     |     UUCP: uunet!mdcbbs.com!spain    |
 | Cypress, CA  90630    |      PSI: PSI%31060099980019::SPAIN |
  =============================================================

jeh@simpact.com (02/12/90)

We just got our CDC Wren V disks (ESDI interface, Andromeda ESDC caching
controller) back and moved the NEWS directory tree onto one of them
(it had been on a DEC RA82).  

I note that NEWS now takes just about half the time to start up (ie from 
typing NEWS at DCL to getting the "DIR/NEW" screen displayed) that it did
while the stuff was on the RA82.  

This purely unscientific test suggests that NEWS startup time is dominated
by disk I/O.  

	--- Jamie Hanrahan, Simpact Associates, San Diego CA
Chair, VMSnet [DECUS uucp] and Internals Working Groups, DECUS VAX Systems SIG 
Internet:  jeh@simpact.com, or if that fails, jeh@crash.cts.com
Uucp:  ...{crash,scubed,decwrl}!simpact!jeh

gih900@CSC1.ANU.OZ.AU (Geoff Huston) (02/13/90)

>In article <903.25cdb672@simpact.com>, jeh@simpact.com writes:
>> The thing I'm most unhappy about wrt NEWS performance is the startup time,
>> ie the time from when I type NEWS at the DCL prompt until I see my first
>> unread item or newsgroup directory or whatever.  Once it's started I think
>> the performance is acceptable, except for....
>>
>
>  I agree with this -- especially in light of the fact that NEWS gets
>invoked for *each* batch of news that needs to be added.  I usually
>receive on the order of 100-150 news batches per day which amounts
>to lots of wasted CPU time just in invoking NEWS.
     
You can cut the startup time by tuning NEWS.ITEMS as frequently as possible.
     
Apart from that I'd welcome any code suggestions which can offer tangible
improvements in speed.
     
>> ...the constant returning to the news item directory when it's unnecessary
>> (from my point of view, anyway).
>>
>
>  This is another biggie on my "hit" list.  This is especially
>frustrating when reading news from a 2400 baud line.
     
Again I'd welcome code suggestions here. I have posted on this before, and to
date have received no real suggestions on where improvements could be made.
     
Geoff Huston
gih900@csc.anu.oz.au

gih900@UUNET.UU.NET (Geoff Huston) (02/18/90)

>controller) back and moved the NEWS directory tree onto one of them
>(it had been on a DEC RA82).
>
>I note that NEWS now takes just about half the time to start up (ie from
>typing NEWS at DCL to getting the "DIR/NEW" screen displayed) that it did
>while the stuff was on the RA82.
>
>This purely unscientific test suggests that NEWS startup time is dominated
>by disk I/O.
     
yep - and tuning the file NEWS.ITEMS is the one to concentrate on - the file is
big and very dynamic in terms of record insertions and deletions.
     
Geoff

spain@mdcbbs.com (02/20/90)

> yep - and tuning the file NEWS.ITEMS is the one to concentrate on - the file is
> big and very dynamic in terms of record insertions and deletions.

I'm sure there is a more efficient method but I found that this command file
running in batch tends to find windows when it can CONVERT the news files.

$!------------------------------------------------------------------'f$verify(0)
$! Written by Harrison Spain to convert NEWS files.
$!------------------------------------------------------------------------------
$ set noon
$ say    := write sys$output
$ submit := submit
$ if f$mode() .eqs. "BATCH" then goto convert_news
$ submit /after=tomorrow /name="Convert-News" /notify -
	/nolog /que=news$batch -
	news_manager:convert_news /noprint -
	/user=system
$ exit
$convert_news:
$ submit /after=tomorrow /name="Convert-News" /notify -
	/nolog /que=news$batch -
	news_manager:convert_news /noprint -
	/user=system
$!
$ set on	! error checking must be enabled for the ON ERROR handler below to fire
$ on error then goto wait_a_bit
$ set def uucp_disk:[uucp.news]
$!
$ n = 0
$ error_label := news_items
$news_items:
$ if n .lt. 32 then convert /reclaim /stat news.items
$ if n .ge. 32 then gosub mail_moderator
$!
$ n = 0
$ error_label := news_groups
$news_groups:
$ if n .lt. 32 then convert /reclaim /stat news.groups
$ if n .ge. 32 then gosub mail_moderator
$!
$ n = 0
$ error_label := news_history
$news_history:
$ if n .lt. 32 then convert /reclaim /stat history.v60
$ if n .ge. 32 then gosub mail_moderator
$!
$ exit
$wait_a_bit:
$ write sys$output "''error_label': ''n'"
$ n = n + 1
$ on error then goto wait_a_bit
$ wait 00:15:00
$ goto 'error_label'
$mail_moderator:
$ mail /sub="Convert of ''error_label' NEWS file failed!" sys$input moderator
Apparently it required more than 32 passes to convert one of the NEWS files.
$ return
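
For readers who do not speak DCL, the retry logic of the command file can be
restated as a short Python sketch: up to 32 passes per file, a 15 minute wait
after each failure, and a message to the moderator if the limit is reached.
The convert, notify and sleep callables are hypothetical stand-ins supplied
by the caller; this restates the control flow and does not replace the
command file.

```python
import time

MAX_PASSES = 32          # same limit as the command file
WAIT_SECONDS = 15 * 60   # WAIT 00:15:00


def convert_with_retry(name, convert, notify, sleep=time.sleep):
    """Try convert(name) up to MAX_PASSES times, sleeping after each failure.

    convert is assumed to raise OSError while the file is still in use.
    Returns True on success, False once the limit is hit and notify was called.
    """
    for _ in range(MAX_PASSES):
        try:
            convert(name)
            return True
        except OSError:
            sleep(WAIT_SECONDS)
    notify(f"Convert of {name} NEWS file failed!")
    return False


# Demonstration with a fake convert that fails twice, then succeeds.
attempts = []


def flaky_convert(name):
    attempts.append(name)
    if len(attempts) < 3:
        raise OSError("file locked by another user")


waits = []
messages = []
ok = convert_with_retry("NEWS.ITEMS", flaky_convert, messages.append, sleep=waits.append)
print(ok, len(attempts), waits, messages)
```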

-- 
  =============================================================
 | Harrison M. Spain III |    Voice: (714) 952-6114            |
 | Sr. Section Manager   |      Fax: (714) 952-5371            |
 | McDonnell Douglas M&E | Internet: spain@mdcbbs.com          |
 | 5701 Katella Ave.     |     UUCP: uunet!mdcbbs.com!spain    |
 | Cypress, CA  90630    |      PSI: PSI%31060099980019::SPAIN |
  =============================================================

brent@uwovax.uwo.ca (Brent Sterner) (02/27/90)

In article <9002190935.AA25552@uunet.uu.net>, munnari!csc.anu.oz.au!gih900@
					UUNET.UU.NET (Geoff Huston) writes:
>>This purely unscientific test suggests that NEWS startup time is dominated
>>by disk I/O.
>      
> yep - and tuning the file NEWS.ITEMS is the one to concentrate on - the file
> is big and very dynamic in terms of record insertions and deletions.
>      
> Geoff

   Pardon my ignorance, but I'm new to this group (but not to ANU NEWS).  I'm
the past system manager for our site, and I've observed some inefficiencies in
ANU NEWS when I run it.  NEWS.ITEMS is one.  One thing I'd really like is a
list of known problems and work-arounds you use in the field.  Specifically,
I've noted the following issues (probably old hat?):

a) NEWS.ITEMS is very big and sparse (at our site):
	Directory BIGDISK:[NEWS]
	NEWS.ITEMS;47                   6342/77205   21-FEB-1990

   The news manager does periodic convert/reclaim processing, but as far as
I know, that's all.  The file is phenomenally fragmented (cathedral windows),
but our defragger won't touch it because it is always busy.  I suspect that
if the file were less empty (eg 6342/10000 blocks) and defragmented, our site
would see a really big performance improvement.
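
To put numbers on "big and sparse": reading the directory listing above as
blocks used / blocks allocated, the fill ratio works out as in this small
Python sketch. The 10000-block figure is the example allocation suggested in
the paragraph above, not a measured value.

```python
def fill_ratio(used_blocks, allocated_blocks):
    """Fraction of the allocated blocks that are actually in use."""
    return used_blocks / allocated_blocks


current = fill_ratio(6342, 77205)   # figures from the directory listing
target = fill_ratio(6342, 10000)    # the example allocation from the text
print(f"{current:.1%} now, {target:.1%} at 10000 blocks allocated")
```

So roughly one block in twelve of the current allocation holds data.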

b) NEWSRC for every user seems to be similarly fragmented, although *much*
smaller.  (It also gets rewritten every time I exit news, whether or not I
read anything.)  Might it be a candidate for writing more contiguously (ie
FOPEN the file with a preliminary guess about file size)?

   Any suggestions?  I'm aware (Geoff) that doing this stuff up *right* will
probably require recoding and a lot of effort.  For the time being, I'd be
quite happy with any periodic work-around that made news startup faster for
our users (especially news.items fragmentation and blocks allocated).

   If this stuff is old and boring to you folk, please send me email rather
than wasting a lot of bandwidth to this group.  Thanks.  b.
--
Brent Sterner                        Technical Support Manager, Academic Systems
Network    <BRENT@uwo.ca>            <BRENT@UWOVAX.BITNET>
           <129.100.2.13>            Telephone  (519)661-2151 x6036
Last Gasp  Computing & Communications Services, Natural Sciences Building
           The University of Western Ontario, London, Ontario, Canada  N6A 5B7