duane@anasaz.UUCP (Duane Morse) (05/25/89)
Our NCR Tower 32/600 (Unix System 5.2) spends hours unbatching news. We're running with patch level 17, spooling stuff to .rnews (SPOOLNEWS is defined), and running rnews -U late at night.

Looking at the code, I find that a history file is scanned for each message in a batch of messages (looking for duplicate postings). Even though we keep only about seven days' worth of news, the history files (0 through 9) are each about 200K long; that's a lot of disk to scan on a per-message basis!

How do big System V sites deal with this? Does rnews -U run around the clock there?

I'm considering modifying rnews and expire to use 101 files (00 through 99, plus XX for the bizarre message IDs) instead of the basic 10. Does this seem reasonable? Anybody have a better idea?
--
Duane Morse   ...{asuvax or mcdphx}!anasaz!duane   (602) 861-7609
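Duane's 101-file idea in miniature: pick the history bucket by hashing the message-ID, so each duplicate check scans roughly a hundredth of the history instead of a tenth. This is a hedged sketch only; Python stands in for the C changes to rnews and expire, and the hash function and "XX" fallback shown are illustrative assumptions, not B News's actual scheme.

```python
# Sketch (not B News's actual code): map a message-ID to one of 101
# history buckets -- "00".."99" via a toy hash, or "XX" for message-IDs
# that can't be parsed at all. The hash choice is an assumption.
def history_bucket(message_id: str) -> str:
    if not (message_id.startswith("<") and message_id.endswith(">")):
        return "XX"  # the "bizarre" message IDs
    h = sum(ord(c) for c in message_id) % 100  # toy hash; B News differs
    return "%02d" % h

# Each incoming article now probes one ~20K file instead of one ~200K file.
bucket = history_bucket("<26@anasaz.UUCP>")
```

The win is linear in the bucket count, so it helps but doesn't change the per-message scan into a constant-time lookup the way dbm does.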
karl@cheops.cis.ohio-state.edu (Karl Kleinpaste) (05/25/89)
duane@anasaz.UUCP writes:
How do big System V sites deal with this? Does rnews -U run around
the clock there?
... Anybody have a better idea?
A far better solution is to convince your SysV-compiled system to cope
with DBM files. To do so, pick up copies of a dbm library from, e.g.,
the X distributions, or the once-posted mdbm library (which was my
solution). Build yourself a /usr/lib/libdbm.a, add #define DBM to
defs.h and -DDBM and -ldbm to your Makefile, and remake the Known
Universe. Re-install the result, and run expire -r to rebuild your
history files in DBM format. I got at least a 4x throughput increase
in [ir]news by doing so, on a (now defunct) 3B2/400 running SysVRel3.0.
The ancient cruft of SysV's /usr/lib/news/history.d is a crime against
Man, Nature, and Computer Science.
--Karl
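Karl's suggestion in miniature: a dbm history is a hashed key/value file, so a duplicate check is a single probe rather than a file scan. The sketch below uses Python's stdlib dbm module as a stand-in for the C libdbm the news code links against; the path and record format are made up for illustration.

```python
# Minimal sketch of a dbm-backed history lookup, using Python's stdlib
# dbm module in place of C libdbm. Paths and record format are invented.
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "history")
with dbm.open(path, "c") as db:
    # inews/rnews would record each accepted article something like this:
    db[b"<26@anasaz.UUCP>"] = b"890525\tcomp.unix/123"
    # ...and the duplicate check becomes one hashed probe:
    seen = b"<26@anasaz.UUCP>" in db
    new = b"<9999@nowhere.UUCP>" in db
print(seen, new)  # True False
```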
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (05/26/89)
In article <26@anasaz.UUCP> duane@anasaz.UUCP (Duane Morse) writes:
>
>Looking at the code, I find that a history file is scanned for each
>message in a batch of messages (looking for duplicate postings). Even
>How do big System V sites deal with this? Does rnews -U run around
>the clock there?

Sure, use dbm(). If you don't have it, get dbz() (from me if you don't have any other source).
--
Jon Zeeff   zeeff@b-tech.ann-arbor.mi.us
Ann Arbor, MI   sharkey!b-tech!zeeff
larry@focsys.UUCP (Larry Williamson) (05/27/89)
In article <9389@b-tech.ann-arbor.mi.us> Jon Zeeff writes:
>Sure, use dbm(). If you don't have it, get dbz() (from me if you don't have
>any other source).

Does anyone local or near to Waterloo have a copy of either of these? I scanned the archives on watmath, but I could not find any reference to either package. If it is too large to mail, then I can call your site directly.

-larry
--
Larry Williamson -- Focus Systems -- Waterloo, Ontario
watmath!focsys!larry   (519) 746-4918
jerry@olivey.olivetti.com (Jerry Aguirre) (05/31/89)
In article <KARL.89May25110233@cheops.cis.ohio-state.edu> karl@cheops.cis.ohio-state.edu (Karl Kleinpaste) writes:
>solution). Build yourself a /usr/lib/libdbm.a, add #define DBM to
>defs.h and -DDBM and -ldbm to your Makefile, and remake the Known
>Universe. Re-install the result, and run expire -r to rebuild your
>history files in DBM format. I got at least a 4x throughput increase
>in [ir]news by doing so, on a (now defunct) 3B2/400 running SysVRel3.0.

Actually you should only have to run expire -R (upper case). This will just rebuild the dbm copy of the history file from the text version and is a lot faster. The -r option will have to scan the entire set of news articles, parse each file, and sort the output by date. Depending on the amount of news you keep, this can take many hours. And, of course, you can delete the 0-9 files.
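A rough sketch of the distinction Jerry draws: expire -R rebuilds the dbm index in one pass over the text history file alone, never touching the article tree, which is why it is so much faster than -r. Python's stdlib dbm stands in for the news code here, and the one-field-per-tab line format is a simplified stand-in for the real history format.

```python
# Sketch of an expire -R style rebuild: read the text history file,
# write each message-ID into a fresh dbm index. Paths and the history
# line format are simplified assumptions for illustration.
import dbm
import os
import tempfile

d = tempfile.mkdtemp()
text_history = os.path.join(d, "history")
with open(text_history, "w") as f:
    f.write("<26@anasaz.UUCP>\t890525\tcomp.unix/123\n")
    f.write("<9389@b-tech.ann-arbor.mi.us>\t890526\tnews.software/456\n")

# One pass over the text file; no article is ever opened or parsed.
with dbm.open(os.path.join(d, "history.db"), "n") as db:
    for line in open(text_history):
        msgid, rest = line.rstrip("\n").split("\t", 1)
        db[msgid.encode()] = rest.encode()

with dbm.open(os.path.join(d, "history.db")) as db:
    rebuilt = len(db.keys())  # 2 entries, matching the text history
```

By contrast, -r would have to walk every article under the spool, parse each header, and sort by date before it could write anything.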